Skip to content

Prevent the bootstrap command from leaving root credentials unrecoverable #461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 3, 2025

Conversation

eric-maynard
Copy link
Contributor

@eric-maynard eric-maynard commented Nov 21, 2024

Description

This alters the bootstrap command to require either explicit credentials or a new flag --print-credentials.

Fixes #219

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Credentials are now printed during bootstrap when it's enabled:

realm: default-realm root principal credentials: 2b98107557bcce20:f74281319ac8519ef30cbced6563223b

@eric-maynard
Copy link
Contributor Author

eric-maynard commented Nov 25, 2024

Hey @dimas-b, do you mind taking a look now that #422 has merged?

I think the integration is easy enough with some slight refactoring to PrincipalSecretsGenerator.

I left the current behavior wrt. using env variables even when printing is enabled, since if that's what the user decides to explicitly configure we can respect it. In the worst case we are just echoing env variables.

Comment on lines 214 to 224
boolean environmentVariableCredentials =
PrincipalSecretsGenerator.hasCredentialVariables(
realmContext.getRealmIdentifier(), PolarisEntityConstants.getRootPrincipalName());
if (!this.printCredentials(polarisContext) && !environmentVariableCredentials) {
String failureMessage =
String.format(
"It appears that environment variables were not provided for root credentials, and that printing "
+ "the root credentials is disabled via %s. If bootstrapping were to proceed, there would be no way "
+ "to recover the root credentials",
PolarisConfiguration.BOOTSTRAP_PRINT_CREDENTIALS.key);
LOGGER.error("\n\n {} \n\n", failureMessage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here - why is the metastore aware of whether the secrets were provided by environment variables? What if there are other impls of secrets generators that don't rely on env variables? E.g., we could have one that calls AWS SecretsManager to dynamically generate and store the secrets without any env variables. Should this code throw an exception?

@eric-maynard eric-maynard requested a review from flyrain December 6, 2024 21:33
@flyrain
Copy link
Contributor

flyrain commented Dec 9, 2024

It seems odd that Polaris determines whether bootstrapping has failed based on a configuration controlling whether credentials are printed. IIUC, #438 removed plain text secrets from the metastore, meaning these secrets cannot be retrieved unless they are printed in the console. Would it be more reasonable to always print the credentials if they are generated by Polaris? This ensures the secrets remain accessible when needed without relying on an external configuration.

@eric-maynard
Copy link
Contributor Author

eric-maynard commented Dec 9, 2024

It seems odd that Polaris determines whether bootstrapping has failed based on a configuration controlling whether credentials are printed.

The issue at hand is that currently credentials are unrecoverable after bootstrapping, which needs to be fixed ASAP.

IIUC, #438 removed plain text secrets from the metastore, meaning these secrets cannot be retrieved unless they are printed in the console. Would it be more reasonable to always print the credentials if they are generated by Polaris? This ensures the secrets remain accessible when needed without relying on an external configuration.

@collado-mike expressed concern about an approach like this some time ago. I think a configuration, or perhaps better a CLI argument to the bootstrap command, is a good compromise in that it allows a secure behavior by default (e.g. no secrets to stdout) but also gives people an "out" in case they want to use polaris-generated credentials with a metastore that doesn't support retrieving credentials.

This last point is also very important to consider: some metastore implementations could allow secrets to be retrieved, in which case it's okay to bootstrap without printing credentials. The problem is that after #438 EclipseLink does not allow this.

@dimas-b
Copy link
Contributor

dimas-b commented Dec 9, 2024

I think it is generally not a good idea to store retrievable secrets in the metastore. If we want that functionality it would probably be preferable to integrate with well-known secret manages (e.g. k8s secrets, cloud-specific secret managers, Vault, etc.).

@snazy
Copy link
Member

snazy commented Dec 9, 2024

I think it is generally not a good idea to store retrievable secrets in the metastore.

Completely agree with this. I'd extend this even to not put any credentials into a server log at all, because that information is not just "ephemeral on a console window", but logs can easily go into 3rd party systems, which would then make those clear text credentials easily accessible.

@dimas-b
Copy link
Contributor

dimas-b commented Jan 23, 2025

@eric-maynard : GH still shows a conflict on this PR... Would you mind resolving it before the next review round?

@eric-maynard eric-maynard changed the title Add a flag to control whether credentials are printed during bootstrapping Prevent the bootstrap command from leaving root credentials unrecoverable Jan 23, 2025
@eric-maynard
Copy link
Contributor Author

Bulk-resolved some comments as this is a rewrite. PTAL @dimas-b, @collado-mike, @flyrain

@eric-maynard eric-maynard requested a review from dimas-b January 23, 2025 20:37
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions fixing #450, but it does not look like it does anything about the JDBC schema... Is #450 still related? 🤔 If not, I think it is totally fine, I just wanted to make sure the description is accurate.

@@ -38,20 +39,37 @@ public class BootstrapCommand extends BaseCommand {
List<String> realms;

@CommandLine.Option(
names = {"-c", "--credential"},
paramLabel = "<realm,clientId,clientSecret>",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while I agree that supporting well-formed JSON is valuable, I believe the value is realized mostly in automated use case. This old option, however, is easier to use for humans, IMHO... Why not keep both options available for users to choose what it convenient for them?

For example: --json for JSON, and --client-id, --client-secret for plain strings?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This old option, however, is easier to use for humans, IMHO...

I really do not think that is the case for the current format.

As for supporting both the old and the new format (and maybe field-specific args like you suggest), I think we're best off just having one way to do the same thing here.

Copy link
Contributor

@dimas-b dimas-b Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not mean the old "format", but the old approach of using plain text for ID/secret :) That's what I suggest: add new options (--client-id, --client-secret) to allow the old way of using plain text values to bootstrap.

... and keep the new --json option too, of course.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But with --client-id etc., how does multi-realm support work?

Since bootstrap is a one-time operation, I feel OK with a "heavier" invocation so long as it is very human-readable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm suggesting to delegate the choice of which way to go to the user.

Option 1:
bootstrap --realm r1 --client-id c1 --client-secret s1
[... do something else ...]
bootstrap --realm r2 --client-id c2 --client-secret s2
etc....

Option 2:
bootstrap --json '{...}' - here all realm IDs and client ID/secret pairs are in the JSON.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify: Option 1 only bootstraps one realm at a time, correct? I think I can live with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I'm working on #878... I'm leaning towards a file in the following syntax:

{
  "credentials" : {
    "realm1" : {
      "client-id" : "client1",
      "client-secret" : "secret1"
    },
    "realm2" : {
      "client-id": "client2",
      "client-secret": "secret2"
    }
  }
}

YAML will also be supported:

credentials:
  realm1:
    client-id: client1
    client-secret: secret1
  realm2:
    client-id: client2
    client-secret: secret2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One last comment, sorry... I think the admin tool should require the user to enter credentials for each realm to bootstrap. In real life, it makes little sense to bootstrap with random credentials. Wdyt? At the least the file syntax for now requires the credentials to be specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bootstrap --json feels a little unclear to me; e.g. should the --print-credentials option also go inside that JSON? It seems to imply that we are bootstrapping in JSON "mode" or providing all options into a JSON. While in fact, the JSON is just a set of credentials.

Since we call it credentials, I think having a second way to do the exact same thing could be confusing, i.e. having both a credentials option and a client-id option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--print-credentials is just a CLI option. I do no think it should be inside JSON.

As for --json being unclean, how about:

Option 1: bootstrap --realm ... [ --client-id ... --client-secret ...] (as above)
Option 2: bootstrap --realm-json <json>, where <json> is from from @adutra 's example
Option 3: bootstrap --realm-yaml ... - same as JSON, but in YAML format

description =
"Root principal credentials to bootstrap. Must be of the form 'realm,clientId,clientSecret'.")
List<String> credentials;
"Print root credentials to stdout")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option implies that credentials are generated, right? Could you mention that in the description?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you add a test case for this to BootstrapCommandTest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it also prints user-provided credentials

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I guess the intended use case is to print generated credentials (mostly). It would be nice to inform users about that. It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.

Copy link
Contributor Author

@eric-maynard eric-maynard Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.

Oh, this is a good point. I'll add a note on this, though I'm not sure the print-credentials flag is the right place for that.

Ultimately, I want to keep the behavior of the flags "simple", so having print-credentials just "print credentials" with no qualifiers appeals to me. Also, FWIW, it does not print out exactly the same clientId:clientSecret tuple you pass in. See the note in BootstrapCommandTest.

public void testPrintCredentials(LaunchResult result) {
String output = result.getOutput();
assertThat(output).contains("Bootstrap completed successfully.");
assertThat(output).contains("root:");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like:

realm: realm1 root principal credentials: root:c15196133e4348d64c7f478dca57e99e

value = {
"bootstrap",
"-c",
"[{\"realm\":\"realm1\",\"principal\":\"root\",\"clientId\":\"root\",\"clientSecret\":\"s3cr3t\"}]",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think users will get the JSON syntax on the command line right in every circumstance. The previous approach at least allowed to use quotes around arguments, if even necessary, w/o the need to think about the JSON syntax.

Automated tooling would have to deal with escaping of special characters.
" and \ and * are possible in secrets - but escaping those for JSON and then for the the command line is quite tricky to get right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new JSON format exactly matches the format used in the -f argument, so hopefully that makes this less treacherous for users

@eric-maynard
Copy link
Contributor Author

@adutra is it correct that after #605 we no longer have the ability to bootstrap to the default realm like before?

That is, realm always must be specified now?

@adutra
Copy link
Contributor

adutra commented Feb 18, 2025

@adutra is it correct that after #605 we no longer have the ability to bootstrap to the default realm like before?

That is, realm always must be specified now?

As of today:

  1. If using the in-memory metastore, all realms listed in polaris.realm-context.realms are bootstrapped automatically.
    a. The environment variable POLARIS_BOOTSTRAP_CREDENTIALS or the system property polaris.bootstrap.credentials can be used to provide root credentials.
  2. If using other metastores, no realm is automatically bootstrapped.
    a. To bootstrap realms in this case, the only option right now is to use the admin tool.
    b. With the admin tool, root credentials can be provided as command arguments, or in a json/yaml file.

@eric-maynard
Copy link
Contributor Author

eric-maynard commented Feb 18, 2025

@adutra so on this note:

a. To bootstrap realms in this case, the only option right now is to use the admin tool.
b. With the admin tool, root credentials can be provided as command arguments

This was true before (insofar as you call the Dropwizard bootstrap command the "admin tool"), but you could also just call bootstrap with no args. Where did that behavior go? Is it gone? Should it be gone?

@dimas-b dimas-b self-requested a review February 21, 2025 04:42

@CommandLine.Option(
names = {"-c", "--credential"},
paramLabel = "<realm,clientId,clientSecret>",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for removing the option for the user to specify plain text credentials as command line args?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refer to my comments and -1 on #633 for more context, but essentially I don't think this format captures the full range of ways you might wish to bootstrap. I also don't think we really want two syntaxes for bootstrapping.

With that being said, I'd like to understand why the pre-Quarkus bootstrap behavior is now gone, since being able to bootstrap a realm without specifying credentials is a core part of why this heavier syntax makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the existing -c option is adequate for bootstrapping current Polaris Server and it is convenient for users.

I think we should keep it, but I'm open to altering its format as long as it is human-writable in shells without too much quoting.

Comment on lines 80 to 82
* Parse a JSON object of credentials. Example: { "realm1": { "client-id": "client1",
* "client-secret": "secret1" }, "realm2": { "client-id": "client2", "client-secret": "secret2",
* "extra-field": "extra-value" } }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fromUrl() is already able to parse this kind of JSON... Why add a new parsing method? Why not reuse that code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but I wanted to showcase the behavior around principal that was discussed above; it's not clear to me how to add that logic into fromUrl. If there's an easy way, happy to just make that change now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about extracting a private shared method from fromUrl that would take a JsonFactory and an InputStream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that a shared method would be ideal, I don't see how we can keep this principal-check logic in fromJson but remove it from fromUrl while sharing a meaningful amount of code across the two methods. If it's okay with you though, I'm happy just to push this logic down into fromUrl and then share the code that way.

Otherwise, we can also remove this check altogether if you think it shouldn't be there, or we can keep the divergent parsing logic. I don't have a strong opinion here.

@adutra
Copy link
Contributor

adutra commented Feb 25, 2025

bootstrap with no args. Where did that behavior go? Is it gone? Should it be gone?

I missed that I confess. What the bootstrap command without args used to do? Bootstrap the configured default realm with random credentials?

If so, we might have a small problem because the notion of "default realm" is tied to the actual implementation of RealmContextResolver. If someone is using a different resolver, there might not be any "default realm".

@eric-maynard
Copy link
Contributor Author

I missed that I confess. What the bootstrap command without args used to do? Bootstrap the configured default realm with random credentials?

Yeah that is exactly right. I suppose if there is no default realm, failing to bootstrap with no realm might be correct.

The silver lining is, with this behavior gone this PR is less urgent. It is still possible to bootstrap a realm with random credentials though, and we should still make sure those credentials are either printed or recoverable.

@eric-maynard
Copy link
Contributor Author

For now, let me refactor out the controversial syntax changes, so we can look at just the changes necessary to prevent bricking the metastore.

Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

value = {"bootstrap", "-r", "realm1", "-c", "realm1,client1d,s3cr3t", "--print-credentials"})
public void testPrintCredentials(LaunchResult result) {
assertThat(result.getOutput()).contains("Bootstrap completed successfully.");
assertThat(result.getOutput()).contains("realm: realm1 root principal credentials: client1d:");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not assert that s3cr3t is printed?

Copy link
Contributor Author

@eric-maynard eric-maynard Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's actually not printed -- maybe it should be. Because of #801 the secret supplied becomes the secondary secret while the primary secret is printed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather we just solve #801 rather than add special logic around printing the secondary secret here

Copy link
Contributor

@dimas-b dimas-b Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL. I think this PR is good to merge then, and we'll follow-up on #801

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Feb 28, 2025
Copy link
Contributor

@collado-mike collado-mike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this. It's so simple

@eric-maynard eric-maynard merged commit 1a03108 into apache:main Mar 3, 2025
5 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Mar 3, 2025
gh-yzou pushed a commit to gh-yzou/polaris that referenced this pull request Mar 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants