Support providing root client ID via env. variables when bootstrapping #422

dimas-b · 2024-11-01T22:03:09Z

Allow the root client ID and secrets to be provided via environment variables when bootstrapping.

For "in memory" use cases, this means the main Polaris server will read user-provided root secrets from the env., if provided.

For "persistent" use cases, the bootstrap command will read user-provided root secrets from the env., if provided.

The env. variables are:

POLARIS_BOOTSTRAP_<REALM>_ROOT_CLIENT_ID
POLARIS_BOOTSTRAP_<REALM>_ROOT_CLIENT_SECRET

If these variables are not provided, random values are generated as before.
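
A minimal sketch of the lookup described above, not the code from this PR. Only the two variable names come from the description; the class names `RootSecretsSketch` and `BootstrapEnvSketch`, the upper-casing of the realm name, the rule that both variables must be set for the override to apply, and the format of the random fallback are all assumptions made for illustration.

```java
import java.security.SecureRandom;
import java.util.Locale;
import java.util.function.Function;

/** Hypothetical holder for a client ID / secret pair (not a Polaris type). */
final class RootSecretsSketch {
  final String clientId;
  final String clientSecret;

  RootSecretsSketch(String clientId, String clientSecret) {
    this.clientId = clientId;
    this.clientSecret = clientSecret;
  }
}

/** Illustrative only: resolves root credentials for a realm from a config lookup. */
final class BootstrapEnvSketch {
  private static final SecureRandom RANDOM = new SecureRandom();

  private BootstrapEnvSketch() {}

  /** Reads the overrides from the process environment. */
  static RootSecretsSketch resolve(String realm) {
    return resolve(realm, System.getenv()::get);
  }

  /** Same lookup with an injectable config source, so it can be tested without env vars. */
  static RootSecretsSketch resolve(String realm, Function<String, String> config) {
    // Assumption: the realm name is upper-cased before being substituted for <REALM>.
    String prefix = "POLARIS_BOOTSTRAP_" + realm.toUpperCase(Locale.ROOT) + "_ROOT_CLIENT_";
    String id = config.apply(prefix + "ID");
    String secret = config.apply(prefix + "SECRET");
    // Assumption: the override applies only when both variables are present.
    if (id != null && secret != null) {
      return new RootSecretsSketch(id, secret);
    }
    // Otherwise fall back to random values.
    return new RootSecretsSketch(randomHex(8), randomHex(16));
  }

  /** Returns numBytes random bytes encoded as lower-case hex. */
  private static String randomHex(int numBytes) {
    byte[] bytes = new byte[numBytes];
    RANDOM.nextBytes(bytes);
    StringBuilder sb = new StringBuilder(numBytes * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }
}
```

Passing the lookup as a `Function<String, String>` lets a test supply a `Map::get` in place of the real environment.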

How Has This Been Tested?

Manual smoke tests.

eric-maynard · 2024-11-03T07:02:06Z

I recall that @collado-mike and I discussed this as one option when bootstrapping, but I can't remember why we went with the current approach.

For my part, I would ideally like to see as little a difference in the UX of the different metastores as possible.

dimas-b · 2024-11-04T16:07:39Z

@eric-maynard :

I would ideally like to see as little a difference in the UX of the different metastores as possible.

Do you mean supporting the same overrides under the bootstrap command? I can certainly add that. WDYT?

adutra · 2024-11-04T20:20:04Z

Taking a step back, and thinking about real-life scenarios: how is an operator supposed to get hold of the root credentials after installing and bootstrapping Polaris? I read the Polaris documentation about production setups. It covers bootstrapping, but not this detail in particular, so I guess the exact procedure depends on the metastore being used? Wouldn't it be possible to standardize that?

I'm also raising this point because while working on the Quarkus port, I realized that it is not possible to create a true integration test currently, that is, a test that spawns Polaris as a black box, and only interacts with it through its external APIs. This is because once started and bootstrapped, and regardless of the metastore used, the test is unable to infer which credentials to use to communicate with Polaris.

All in all, I wonder if we aren't missing a simple and standard way to pass the root credentials to the bootstrapping process, even in production setups – or alternatively, a way to retrieve those, that would also be standardized across metastores.

dimas-b · 2024-11-04T20:25:25Z

[...] I wonder if we aren't missing a simple and standard way to pass the root credentials to the bootstrapping process.

My personal take on that is that the bootstrap command should either take root credentials as parameters, or output the uniquely generated root credentials. I think we can support both options (e.g. as CLI arguments) and I'm willing to add that to this PR if other developers agree.

eric-maynard · 2024-11-05T08:40:33Z

A few thoughts here.

For testing, I've found it generally sufficient just to use e.g. --access-token 'principal:root;realm:default-realm', so you shouldn't actually need root credentials in many cases unless you are testing a context resolver.
In general, I feel nervous about allowing the bootstrap process to use non-random credentials (even when supplied by the user). I am not a security expert but this seems insecure.
With respect to the bootstrap command printing the credentials to stdout, this was thought to be insecure. I remember now that this explicitly came up in my previous conversations with @collado-mike

jbonofre

I guess we can integrate this in the TestEnvironmentExtension recently added (waiting for Quarkus 😄).

dimas-b · 2024-11-05T14:25:14Z

@eric-maynard : thanks for sharing your views on this matter. Re: security of bootstrapping, I did not mean printing credentials to STDOUT. The "output" can be a file at a user-specified location. I think this is quite similar to downloading ssh certificates for a VM from a cloud vendor.

I wonder what options end users currently have for bootstrapping, though 🤔 How will a user be able to discover a generated root credential? (Apologies if I missed this somewhere in the docs).

adutra · 2024-11-05T14:26:29Z

I agree that we need to proceed carefully since it's a security-related issue.

I'd note, fwiw, that Keycloak allows bootstrapping the root credentials via environment variables.

I don't think we can suspect Keycloak of taking security lightly :-)

In another place of Keycloak docs, we can learn more about how the environment variables are processed:

If the initial administrator already exists and the environment variables are still present at startup, an error message stating the failed creation of the initial administrator is shown in the logs. Keycloak ignores the values and starts up correctly.

Couldn't we do something similar to that?

adutra · 2024-11-05T14:28:00Z

The "output" can be a file at a user-specified location.

If we go down this route, maybe check that the file permissions are as restrictive as possible.
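
One way the file-permission idea could look, as a sketch only: nothing in this PR writes credentials to a file, and the class `CredentialFileSketch`, its method names, and the `id:secret` line format are hypothetical. It assumes a POSIX file system; on other file systems the `java.nio` calls below throw `UnsupportedOperationException`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileAttribute;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.EnumSet;
import java.util.Set;

/** Illustrative only: writes credentials to a file that only the owner can access. */
final class CredentialFileSketch {
  private CredentialFileSketch() {}

  /** Creates a new owner-only (rw-------) file and writes the credentials into it. */
  static void write(Path target, String clientId, String clientSecret) throws IOException {
    Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rw-------");
    FileAttribute<Set<PosixFilePermission>> attr =
        PosixFilePermissions.asFileAttribute(ownerOnly);
    // createFile fails if the file already exists, so a pre-existing file with
    // looser permissions is never reused.
    Files.createFile(target, attr);
    // Hypothetical format: a single "id:secret" line.
    String line = clientId + ":" + clientSecret + "\n";
    Files.write(target, line.getBytes(StandardCharsets.UTF_8));
  }

  /** Returns true when neither group nor others have any access to the file. */
  static boolean isOwnerOnly(Path file) throws IOException {
    Set<PosixFilePermission> allowed =
        EnumSet.of(
            PosixFilePermission.OWNER_READ,
            PosixFilePermission.OWNER_WRITE,
            PosixFilePermission.OWNER_EXECUTE);
    return allowed.containsAll(Files.getPosixFilePermissions(file));
  }
}
```

Setting the permissions atomically at creation time avoids a window in which the file exists with default permissions.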

eric-maynard · 2024-11-05T18:37:40Z

@dimas-b

I wonder what options end users currently have for bootstrapping, though 🤔 How will a user be able to discover a generated root credential? (Apologies if I missed this somewhere in the docs).

So if you're just doing testing like this PR would address, you can just use passwordless auth. You normally don't need the root credentials.

However, if you're asking more generally how to retrieve the root credentials, it is metastore-dependent. Ideally your metastore is set up with your auth provider and allows you to do something nice like set/retrieve the credentials using SSO. Or perhaps they are exposed through some API which is already secured. But in the simplest case where you're just using a Postgres metastore, you can retrieve them through something like:

SELECT principalclientid, mainsecret FROM principal_secrets;

dimas-b · 2024-11-05T19:47:41Z

@eric-maynard :

So if you're just doing testing like this PR would address, you can just use passwordless auth. You normally don't need the root credentials.

I could use principal:root;realm:default-realm as an access token, but I happen to need to use the client credentials auth flow, which currently uses random values for the root user. Discovering the random credentials currently involves scanning Polaris STDOUT, which is inconvenient. I'd like to have control over the inputs I provide to Polaris.

So this PR proposes to make this an option for the user to define the root credential if the user so chooses. I think it could be convenient for other people too.

As for the general bootstrapping case, the above discussion is interesting, but maybe we can continue that on the dev list or separate PR. As for this PR, do you think the idea of allowing user overrides for root credentials is reasonable given that it only applies to the "test" authentication implementation, which is already not "secret" given the fixed root token?

eric-maynard · 2024-11-05T22:15:17Z

I see. If you want to test the auth flow and you can't call bootstrapRealms then I can see how the current bootstrap process is not great.

Could you call rotatePrincipalSecrets using the passwordless auth to receive new credentials?

Barring that, I am open to allowing the user to set the credentials using an env variable for the in-memory metastore.

collado-mike · 2024-11-08T05:50:43Z

It is a fairly common and accepted practice to generate random secrets during bootstrapping - e.g., terraform has built-in support for this - https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/password . Allowing Terraform to randomly generate a password, bootstrap Polaris, and stick the secrets in Vault or k8s secrets is a very secure pattern and something I think we ought to support.

adutra · 2024-11-08T09:33:20Z

It is a fairly common and accepted practice to generate random secrets during bootstrapping [..]

And I'd note that that is the usual pattern that most Helm charts adopt when bootstrapping things like databases.

dimas-b · 2024-11-08T14:16:12Z

SGTM, based on the above discussion, I think I'll extend the env. variables support to the normal bootstrap command too.

collado-mike · 2024-11-14T17:26:23Z

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

-  public synchronized Map<String, PolarisMetaStoreManager.PrincipalSecretsResult> bootstrapRealms(
-      List<String> realms) {
+  public Map<String, PolarisMetaStoreManager.PrincipalSecretsResult> bootstrapRealms(
+      List<String> realms, Function<String, PrincipalSecretsGenerator> rootSecretsPerRealm) {


I like the PrincipalSecretsGenerator. Is it necessary to have it per realm, though? Can it take the realm as a method argument instead?

I'd probably have to add realm to PolarisCallContext.

From my POV, I wanted to avoid exposing realm to lower-level code for better abstraction.

In practice only one realm is normally bootstrapped at a time, so we can probably drop the realm parameter completely and use env. vars. like POLARIS_BOOTSTRAP_ROOT_CLIENT_ID. WDYT?

I've reworked the generator code a bit. Hope it's clearer wrt realms now.

collado-mike · 2024-11-14T17:29:44Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisMetaStoreManagerImpl.java

+  public @NotNull BaseResult bootstrapPolarisService(
+      @NotNull PolarisCallContext callCtx, PrincipalSecretsGenerator rootSecretsGenerator) {


Rather than changing the public interface, can we make the PrincipalSecretsGenerator a constructor param? I don't think callers need to know anything about this

Different generators are used during bootstrapping and (REST) API-driven principal creation.

reworked to avoid modifying public interfaces.

collado-mike · 2024-11-14T17:31:58Z

...re/src/main/java/org/apache/polaris/core/persistence/PolarisTreeMapMetaStoreSessionImpl.java

  public @NotNull PolarisPrincipalSecrets generateNewPrincipalSecrets(
-      @NotNull PolarisCallContext callCtx, @NotNull String principalName, long principalId) {
+      @NotNull PolarisCallContext callCtx,
+      @NotNull String principalName,
+      long principalId,
+      PrincipalSecretsGenerator generator) {


Same here - rather than passing in the generator as an argument, I think it should be a constructor argument

reworked to avoid modifying public interfaces.
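
The constructor-argument approach requested in this review could look roughly like the sketch below. It does not reproduce the merged code: `SecretsSketch`, `SecretsGeneratorSketch`, and `SessionSketch` are hypothetical stand-ins, and the `produceSecrets(String, long)` signature is an assumption based on the method name mentioned elsewhere in this thread.

```java
import java.util.Objects;

/** Hypothetical value type holding generated credentials (not a Polaris class). */
final class SecretsSketch {
  final String clientId;
  final String clientSecret;

  SecretsSketch(String clientId, String clientSecret) {
    this.clientId = clientId;
    this.clientSecret = clientSecret;
  }
}

/** Stand-in for the PR's functional interface; this method signature is assumed. */
@FunctionalInterface
interface SecretsGeneratorSketch {
  SecretsSketch produceSecrets(String principalName, long principalId);
}

/** Hypothetical session that receives its generator once, at construction time. */
final class SessionSketch {
  private final SecretsGeneratorSketch generator;

  SessionSketch(SecretsGeneratorSketch generator) {
    // Fail fast on null, since later calls do not re-check the field.
    this.generator = Objects.requireNonNull(generator, "generator");
  }

  /** The public method keeps its original parameter list; the generator is a field. */
  SecretsSketch generateNewPrincipalSecrets(String principalName, long principalId) {
    return generator.produceSecrets(principalName, principalId);
  }
}
```

With this shape, a factory can hand a bootstrap-time generator to one session and a purely random generator to another, and callers of `generateNewPrincipalSecrets` see no difference.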

collado-mike

Change looks good to me - some javadocs would be helpful, though

collado-mike · 2024-11-19T17:19:36Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

+import org.jetbrains.annotations.NotNull;
+
+@FunctionalInterface
+public interface PrincipalSecretsGenerator {


Short javadoc here

collado-mike · 2024-11-19T17:19:49Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

+  static Realms bootstrap() {
+    return bootstrap(System.getenv()::get);
+  }
+
+  static Realms bootstrap(Function<String, String> config) {
+    return new Realms(config);
+  }
+
+  class Realms {


Some javadocs here would be useful

eric-maynard · 2024-11-19T19:30:10Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

+ * from services that actually manage principal objects (create, remove, rotate secrets, etc.)
+ *
+ * <p>The implementation statically available from {@link #bootstrap()} allows one-time client ID
+ * and secret overrides via environment variables, which can be useful for bootstrapping new realms.


Can we call out exactly what environment variables are used, here and preferably in the docs?

eric-maynard · 2024-11-19T19:31:09Z

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

@@ -58,11 +60,17 @@ public abstract class LocalPolarisMetaStoreManagerFactory<StoreType>
  private static final Logger LOGGER =
      LoggerFactory.getLogger(LocalPolarisMetaStoreManagerFactory.class);

+  private final PrincipalSecretsGenerator.Realms secretsGenerator = bootstrap();


This looks a bit odd, why is a secretsGenerator of the type Realms?

I see, it's like a secretsGeneratorGenerator. I wonder if we can just get rid of the thin Realms type and push this into PrincipalSecretsGenerator

I believe this will require PolarisMetaStoreSession implementations to have "realm" as a field. Would that be ok from your POV?.. Currently sessions are not strongly bound to realms. Realm-specific behaviour is relevant only to the bootstrap command and factories.

I see, the places calling produceSecrets don't necessarily have access to the realm. Maybe we can just move this into the constructor of PrincipalSecretsGenerator itself

I reworked the impl. a bit... hopefully the bootstrapping logic is clearer now.

@eric-maynard: Would you still prefer to have a non-lambda class for PrincipalSecretsGenerator implementations?

I think it's fine as it is now, maybe in the future if there are more implementations we can refine it further. Thanks for iterating on this!

eric-maynard · 2024-11-19T22:20:55Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

+  static PrincipalSecretsGenerator bootstrap(String realmName) {
+    return bootstrap(realmName, System.getenv()::get);
+  }
+
+  static PrincipalSecretsGenerator bootstrap(String realmName, Function<String, String> config) {


This looks a lot better, thanks!

dimas-b · 2024-11-20T19:49:20Z

I believe that after #438 but without this PR it is not really possible to bootstrap Polaris with EclipseLink... I do not see a way to discover the auto-generated root secret 🤔 ;)

eric-maynard · 2024-11-20T20:26:05Z

+1 @dimas-b, I think this is a partial fix for that problem. However we probably shouldn't allow bootstrapping without specifying credentials if the intent is that users can retrieve secrets from the metastore but we no longer put them in the metastore.

I still think we can merge this as-is and follow up with that potential restriction.

Other possibilities for resolving this issue:

1. Make EclipseLink bootstrap print secrets like in-memory bootstrap does
2. Make it so that we persist plaintext secrets for primary secrets but not secondary secrets
3. Hack "Do not persist plaintext secrets in the metastore" (#438) so that (2) applies only to root's first-time secret, and enforce rotation on root

Any preferences @dimas-b / @collado-mike ?

dimas-b · 2024-11-20T21:27:56Z

My preference would be to keep the env. variables for root password bootstrapping, plus add CLI options to the bootstrap command to print the auto-generated password to STDOUT, but only at user's request.

By the way, I'm rebasing this PR to resolve conflicts... will probably squash too.

eric-maynard · 2024-11-21T02:49:05Z

plus add CLI options to the bootstrap command to print the auto-generated password to STDOUT, but only at user's request.

I like this idea, and I can do this -- but do you think we should fail the bootstrap if

1. The env variables are not set
2. Credential printing is disabled?

In this case the metastore is pretty much bricked, so I think we should not proceed with the bootstrap.

Edit: See my sketch of this idea here, which would be rebased on this PR.

dimas-b · 2024-11-21T18:33:24Z

but do you think we should fail the bootstrap if

1. The env variables are not set
2. Credential printing is disabled? [...]

SGTM 👍
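
The rule agreed on above amounts to a simple precondition. A sketch, with a hypothetical class `BootstrapPreconditionSketch` and hypothetical parameter names (the actual follow-up change is not shown in this thread):

```java
/** Illustrative only: refuses to bootstrap when the root credentials would be unrecoverable. */
final class BootstrapPreconditionSketch {
  private BootstrapPreconditionSketch() {}

  /**
   * @param credentialsSupplied true when the root client ID and secret came from env variables
   * @param printCredentials true when the user asked for generated credentials to be printed
   */
  static void check(boolean credentialsSupplied, boolean printCredentials) {
    if (!credentialsSupplied && !printCredentials) {
      // Random secrets would be generated and never shown to anyone.
      throw new IllegalStateException(
          "Refusing to bootstrap: root credentials were not supplied and printing is disabled");
    }
  }
}
```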

dimas-b · 2024-11-22T14:49:41Z

@collado-mike : are you ok to merge?

collado-mike · 2024-11-22T20:59:46Z

...e/polaris/extension/persistence/impl/eclipselink/PolarisEclipseLinkMetaStoreSessionImpl.java

@@ -113,7 +115,8 @@ public PolarisEclipseLinkMetaStoreSessionImpl(
      @NotNull PolarisStorageIntegrationProvider storageIntegrationProvider,
      @NotNull RealmContext realmContext,
      @Nullable String confFile,
-      @Nullable String persistenceUnitName) {
+      @Nullable String persistenceUnitName,
+      PrincipalSecretsGenerator secretsGenerator) {


@NotNull annotation, since we don't do any null checking below

Introduce a `PrincipalSecretsGenerator` interface to isolate secrets generation from principal management code. Update meta store factories to allow the user to define the root client ID and secret via environment variables during bootstrapping.

dimas-b requested review from jbonofre, ashvina, RussellSpitzer, snazy, vvcephei, takidau, jackye1995, flyrain, eric-maynard, collado-mike and ebyhr as code owners November 1, 2024 22:03

jbonofre reviewed Nov 5, 2024

dimas-b marked this pull request as draft November 8, 2024 14:16

dimas-b force-pushed the root-id-override-for-test branch from aff2823 to 3d5640f Compare November 8, 2024 18:05

dimas-b marked this pull request as ready for review November 8, 2024 18:05

dimas-b requested a review from adutra as a code owner November 8, 2024 18:05

collado-mike reviewed Nov 14, 2024

dimas-b force-pushed the root-id-override-for-test branch from 2a6321f to 24f2e54 Compare November 19, 2024 05:00

collado-mike approved these changes Nov 19, 2024

eric-maynard reviewed Nov 19, 2024

eric-maynard approved these changes Nov 19, 2024

dimas-b force-pushed the root-id-override-for-test branch 2 times, most recently from d88d74a to 5fdf9b8 Compare November 20, 2024 21:53

eric-maynard mentioned this pull request Nov 21, 2024

Do not persist plaintext secrets in the metastore #438

Merged

10 tasks

eric-maynard mentioned this pull request Nov 21, 2024

Prevent the bootstrap command from leaving root credentials unrecoverable #461

Merged

5 tasks

adutra approved these changes Nov 22, 2024

collado-mike approved these changes Nov 22, 2024

dimas-b force-pushed the root-id-override-for-test branch from 5fdf9b8 to 25e0cb7 Compare November 22, 2024 22:55

dimas-b added 2 commits November 25, 2024 11:24

review: add @NotNull

778706e

dimas-b force-pushed the root-id-override-for-test branch from 25e0cb7 to 778706e Compare November 25, 2024 16:24

Merge branch 'main' into root-id-override-for-test

4bdb425

eric-maynard approved these changes Nov 25, 2024

eric-maynard enabled auto-merge (squash) November 25, 2024 18:14

eric-maynard merged commit 29a9828 into apache:main Nov 25, 2024
5 checks passed

dimas-b deleted the root-id-override-for-test branch November 25, 2024 18:44

MonkeyCanCode mentioned this pull request Jan 2, 2025

Helm: add bootstrapExtraEnv for bootstrapping job #601

Merged

dimas-b mentioned this pull request Jan 9, 2025

Convert main integration tests into reusable blackbox tests #590

Merged
