-
Notifications
You must be signed in to change notification settings - Fork 265
Prevent the bootstrap command from leaving root credentials unrecoverable #461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java
Outdated
Show resolved
Hide resolved
Hey @dimas-b, do you mind taking a look now that #422 has merged? I think the integration is easy enough with some slight refactoring to I left the current behavior wrt. using env variables even when printing is enabled, since if that's what the user decides to explicitly configure we can respect it. In the worst case we are just echoing env variables. |
polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java
Outdated
Show resolved
Hide resolved
boolean environmentVariableCredentials = | ||
PrincipalSecretsGenerator.hasCredentialVariables( | ||
realmContext.getRealmIdentifier(), PolarisEntityConstants.getRootPrincipalName()); | ||
if (!this.printCredentials(polarisContext) && !environmentVariableCredentials) { | ||
String failureMessage = | ||
String.format( | ||
"It appears that environment variables were not provided for root credentials, and that printing " | ||
+ "the root credentials is disabled via %s. If bootstrapping were to proceed, there would be no way " | ||
+ "to recover the root credentials", | ||
PolarisConfiguration.BOOTSTRAP_PRINT_CREDENTIALS.key); | ||
LOGGER.error("\n\n {} \n\n", failureMessage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar here - why is the metastore aware of whether the secrets were provided by environment variables? What if there are other impls of secrets generators that don't rely on env variables? E.g., we could have one that calls AWS SecretsManager to dynamically generate and store the secrets without any env variables. Should this code throw an exception?
...ore/src/main/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGenerator.java
Outdated
Show resolved
Hide resolved
...ore/src/main/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGenerator.java
Outdated
Show resolved
Hide resolved
...is-core/src/test/java/org/apache/polaris/core/persistence/PrincipalSecretsGeneratorTest.java
Outdated
Show resolved
Hide resolved
...src/test/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGeneratorTest.java
Outdated
Show resolved
Hide resolved
...src/test/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGeneratorTest.java
Outdated
Show resolved
Hide resolved
...e/polaris/extension/persistence/impl/eclipselink/PolarisEclipseLinkMetaStoreSessionImpl.java
Outdated
Show resolved
Hide resolved
It seems odd that Polaris determines whether bootstrapping has failed based on a configuration controlling whether credentials are printed. IIUC, #438 removed plain text secrets from the metastore, meaning these secrets cannot be retrieved unless they are printed in the console. Would it be more reasonable to always print the credentials if they are generated by Polaris? This ensures the secrets remain accessible when needed without relying on an external configuration. |
The issue at hand is that currently credentials are unrecoverable after bootstrapping, which needs to be fixed ASAP.
@collado-mike expressed concern about an approach like this some time ago. I think a configuration, or perhaps better a CLI argument to the This last point is also very important to consider: some metastore implementations could allow secrets to be retrieved, in which case it's okay to bootstrap without printing credentials. The problem is that after #438 EclipseLink does not allow this. |
I think it is generally not a good idea to store retrievable secrets in the metastore. If we want that functionality it would probably be preferable to integrate with well-known secret manages (e.g. k8s secrets, cloud-specific secret managers, Vault, etc.). |
Completely agree with this. I'd extend this even to not put any credentials into a server log at all, because that information is not just "ephemeral on a console window", but logs can easily go into 3rd party systems, which would then make those clear text credentials easily accessible. |
df1d642
to
6b2486e
Compare
@eric-maynard : GH still shows a conflict on this PR... Would you mind resolving it before the next review round? |
Bulk-resolved some comments as this is a rewrite. PTAL @dimas-b, @collado-mike, @flyrain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java
Outdated
Show resolved
Hide resolved
polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java
Outdated
Show resolved
Hide resolved
polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java
Outdated
Show resolved
Hide resolved
@@ -38,20 +39,37 @@ public class BootstrapCommand extends BaseCommand { | |||
List<String> realms; | |||
|
|||
@CommandLine.Option( | |||
names = {"-c", "--credential"}, | |||
paramLabel = "<realm,clientId,clientSecret>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while I agree that supporting well-formed JSON is valuable, I believe the value is realized mostly in automated use case. This old option, however, is easier to use for humans, IMHO... Why not keep both options available for users to choose what it convenient for them?
For example: --json
for JSON, and --client-id
, --client-secret
for plain strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This old option, however, is easier to use for humans, IMHO...
I really do not think that is the case for the current format.
As for supporting both the old and the new format (and maybe field-specific args like you suggest), I think we're best off just having one way to do the same thing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not mean the old "format", but the old approach of using plain text for ID/secret :) That's what I suggest: add new options (--client-id
, --client-secret
) to allow the old way of using plain text values to bootstrap.
... and keep the new --json
option too, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But with --client-id
etc., how does multi-realm support work?
Since bootstrap is a one-time operation, I feel OK with a "heavier" invocation so long as it is very human-readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm suggesting to delegate the choice of which way to go to the user.
Option 1:
bootstrap --realm r1 --client-id c1 --client-secret s1
[... do something else ...]
bootstrap --realm r2 --client-id c2 --client-secret s2
etc....
Option 2:
bootstrap --json '{...}'
- here all realm IDs and client ID/secret pairs are in the JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: Option 1 only bootstraps one realm at a time, correct? I think I can live with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I'm working on #878... I'm leaning towards a file in the following syntax:
{
"credentials" : {
"realm1" : {
"client-id" : "client1",
"client-secret" : "secret1"
},
"realm2" : {
"client-id": "client2",
"client-secret": "secret2"
}
}
}
YAML will also be supported:
credentials:
realm1:
client-id: client1
client-secret: secret1
realm2:
client-id: client2
client-secret: secret2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One last comment, sorry... I think the admin tool should require the user to enter credentials for each realm to bootstrap. In real life, it makes little sense to bootstrap with random credentials. Wdyt? At the least the file syntax for now requires the credentials to be specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bootstrap --json
feels a little unclear to me; e.g. should the --print-credentials
option also go inside that JSON? It seems to imply that we are bootstrapping in JSON "mode" or providing all options into a JSON. While in fact, the JSON is just a set of credentials
.
Since we call it credentials
, I think having a second way to do the exact same thing could be confusing, i.e. having both a credentials
option and a client-id
option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--print-credentials
is just a CLI option. I do no think it should be inside JSON.
As for --json
being unclean, how about:
Option 1: bootstrap --realm ... [ --client-id ... --client-secret ...]
(as above)
Option 2: bootstrap --realm-json <json>
, where <json>
is from from @adutra 's example
Option 3: bootstrap --realm-yaml ...
- same as JSON, but in YAML format
quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java
Outdated
Show resolved
Hide resolved
description = | ||
"Root principal credentials to bootstrap. Must be of the form 'realm,clientId,clientSecret'.") | ||
List<String> credentials; | ||
"Print root credentials to stdout") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This option implies that credentials are generated, right? Could you mention that in the description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could you add a test case for this to BootstrapCommandTest
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, it also prints user-provided credentials
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but I guess the intended use case is to print generated credentials (mostly). It would be nice to inform users about that. It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.
Oh, this is a good point. I'll add a note on this, though I'm not sure the print-credentials flag is the right place for that.
Ultimately, I want to keep the behavior of the flags "simple", so having print-credentials just "print credentials" with no qualifiers appeals to me. Also, FWIW, it does not print out exactly the same clientId:clientSecret tuple you pass in. See the note in BootstrapCommandTest
.
public void testPrintCredentials(LaunchResult result) { | ||
String output = result.getOutput(); | ||
assertThat(output).contains("Bootstrap completed successfully."); | ||
assertThat(output).contains("root:"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like:
realm: realm1 root principal credentials: root:c15196133e4348d64c7f478dca57e99e
value = { | ||
"bootstrap", | ||
"-c", | ||
"[{\"realm\":\"realm1\",\"principal\":\"root\",\"clientId\":\"root\",\"clientSecret\":\"s3cr3t\"}]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think users will get the JSON syntax on the command line right in every circumstance. The previous approach at least allowed to use quotes around arguments, if even necessary, w/o the need to think about the JSON syntax.
Automated tooling would have to deal with escaping of special characters.
"
and \
and *
are possible in secrets - but escaping those for JSON and then for the the command line is quite tricky to get right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new JSON format exactly matches the format used in the -f
argument, so hopefully that makes this less treacherous for users
polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java
Outdated
Show resolved
Hide resolved
quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java
Outdated
Show resolved
Hide resolved
As of today:
|
@adutra so on this note:
This was true before (insofar as you call the Dropwizard bootstrap command the "admin tool"), but you could also just call |
|
||
@CommandLine.Option( | ||
names = {"-c", "--credential"}, | ||
paramLabel = "<realm,clientId,clientSecret>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for removing the option for the user to specify plain text credentials as command line args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to my comments and -1 on #633 for more context, but essentially I don't think this format captures the full range of ways you might wish to bootstrap. I also don't think we really want two syntaxes for bootstrapping.
With that being said, I'd like to understand why the pre-Quarkus bootstrap behavior is now gone, since being able to bootstrap a realm without specifying credentials is a core part of why this heavier syntax makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the existing -c
option is adequate for bootstrapping current Polaris Server and it is convenient for users.
I think we should keep it, but I'm open to altering its format as long as it is human-writable in shells without too much quoting.
* Parse a JSON object of credentials. Example: { "realm1": { "client-id": "client1", | ||
* "client-secret": "secret1" }, "realm2": { "client-id": "client2", "client-secret": "secret2", | ||
* "extra-field": "extra-value" } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fromUrl()
is already able to parse this kind of JSON... Why add a new parsing method? Why not reuse that code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but I wanted to showcase the behavior around principal
that was discussed above; it's not clear to me how to add that logic into fromUrl
. If there's an easy way, happy to just make that change now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about extracting a private
shared method from fromUrl
that would take a JsonFactory
and an InputStream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree that a shared method would be ideal, I don't see how we can keep this principal-check logic in fromJson
but remove it from fromUrl
while sharing a meaningful amount of code across the two methods. If it's okay with you though, I'm happy just to push this logic down into fromUrl
and then share the code that way.
Otherwise, we can also remove this check altogether if you think it shouldn't be there, or we can keep the divergent parsing logic. I don't have a strong opinion here.
polaris-core/src/main/java/org/apache/polaris/core/entity/PolarisEntityConstants.java
Outdated
Show resolved
Hide resolved
I missed that I confess. What the If so, we might have a small problem because the notion of "default realm" is tied to the actual implementation of |
Yeah that is exactly right. I suppose if there is no default realm, failing to bootstrap with no realm might be correct. The silver lining is, with this behavior gone this PR is less urgent. It is still possible to bootstrap a realm with random credentials though, and we should still make sure those credentials are either printed or recoverable. |
For now, let me refactor out the controversial syntax changes, so we can look at just the changes necessary to prevent bricking the metastore. |
1ea2e90
to
e9e5305
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
value = {"bootstrap", "-r", "realm1", "-c", "realm1,client1d,s3cr3t", "--print-credentials"}) | ||
public void testPrintCredentials(LaunchResult result) { | ||
assertThat(result.getOutput()).contains("Bootstrap completed successfully."); | ||
assertThat(result.getOutput()).contains("realm: realm1 root principal credentials: client1d:"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not assert that s3cr3t
is printed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually not printed -- maybe it should be. Because of #801 the secret supplied becomes the secondary secret while the primary secret is printed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather we just solve #801 rather than add special logic around printing the secondary secret here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL. I think this PR is good to merge then, and we'll follow-up on #801
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this. It's so simple
…able (apache#461) * rebase * autolint
Description
This alters the bootstrap command to require either explicit credentials or a new flag
--print-credentials
.Fixes #219
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Credentials are now printed during bootstrap when it's enabled: