Is your feature request related to a problem? Please describe.
Currently, when submitting pipeline metadata and files to S3 on pipeline submission, if a runtime configuration is present, Elyra's COS logic uses the hard-coded S3 username and password stored verbatim in the configuration JSON file.
In addition, the GUI validator for the runtime configuration currently insists on cos_username and cos_password even when the auth type KUBERNETES_SECRET is selected.
Describe the solution you'd like
- The GUI validator for the runtime configuration should not insist on the cos_username and cos_password fields when the name of a K8S secret is defined.
- If a runtime configuration is present and auth_type KUBERNETES_SECRET is selected (i.e. Elyra running on K8S), then when submitting a pipeline Elyra should take the access_key and secret_key for communicating with S3 not from the configuration file, but from the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
- In other words, use the minio library's EnvAWSProvider not just at runtime in the target runtime environment (bootstrapper), but also in the Elyra COS logic running locally in Elyra (see the sketch after this list).
- Documentation would need to be updated to reflect this new behavior, not only for the user namespace of the target runtime, but also for the JupyterLab K8S namespace Elyra is running in. The envFrom / secretKeyRef section that populates the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variable values is in that case up to the software setting up the workbench / JupyterLab instance in K8S.
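To make the second and third bullets concrete, here is a minimal sketch (not the actual Elyra implementation) of letting the minio client pick its credentials up from the environment; the endpoint value is purely illustrative and the client construction only roughly mirrors elyra/util/cos.py:

```python
import os
from urllib.parse import urlparse

from minio import Minio
from minio.credentials import providers

# Illustrative endpoint; in Elyra this would still come from the runtime configuration.
endpoint = urlparse("https://minio.example.com:9000")

if "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ:
    # EnvAWSProvider reads AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY (and, if set,
    # AWS_SESSION_TOKEN) from the environment instead of taking cos_username /
    # cos_password verbatim from the runtime config JSON.
    client = Minio(
        endpoint.netloc,
        credentials=providers.EnvAWSProvider(),
        secure=endpoint.scheme == "https",
    )
```

This is the same provider the bootstrapper already relies on in the target runtime environment; the proposal is simply to use it in the locally running Elyra COS logic as well.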
Leaving aside for a moment the GUI validator aspects for the two fields COS username and COS password (these will be part of the PR as well), a possible new logic flow from https://github.com/elyra-ai/elyra/blob/main/elyra/util/cos.py#L68 onward could be:
```python
# what to do if a runtime config json is present
else:
    auth_type = config.metadata["cos_auth_type"]
    self.endpoint = urlparse(config.metadata["cos_endpoint"])
    self.bucket = config.metadata["cos_bucket"]
    if auth_type == "USER_CREDENTIALS":
        cred_provider = providers.StaticProvider(
            access_key=config.metadata["cos_username"],
            secret_key=config.metadata["cos_password"],
        )
    elif auth_type == "KUBERNETES_SECRET":
        if "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ:
            cred_provider = providers.EnvAWSProvider()
        else:
            raise RuntimeError(
                "Cannot connect to object storage. No credentials "
                "were provided and environment variables "
                "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not "
                "properly defined."
            )
    elif auth_type == "AWS_IAM_ROLES_FOR_SERVICE_ACCOUNTS":
        ...
```
Describe alternatives you've considered
My main point here is that, for a JupyterLab notebook running on K8S, information like cos_username and cos_password should not come from a runtime JSON config file. People can specify the necessary environment variables either via CI/CD or via the ODH Dashboard workbench env section, which saves them in and references them from a K8S Secret in the background.
There is already logic a little further up in cos.py that falls back to these environment variables when no runtime config is used.
But even there, credentials should always come from environment variables, not from verbatim CosClient arguments taken from the runtime_configuration. Even when running Elyra locally, just as a process on a laptop, it is perfectly possible to set the env vars AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY on the system.
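A rough sketch of what that branch could look like, assuming the os and providers imports already present in cos.py and the access_key / secret_key keyword arguments that CosClient currently accepts (the exact surrounding structure is an assumption on my part):

```python
# Sketch only: even in the "no runtime config" branch, prefer the environment
# variables over credentials passed in verbatim as CosClient arguments.
if "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ:
    cred_provider = providers.EnvAWSProvider()
else:
    # fall back to the explicitly provided access_key / secret_key arguments
    cred_provider = providers.StaticProvider(
        access_key=access_key,
        secret_key=secret_key,
    )
```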
Additional context
https://github.com/elyra-ai/elyra/blob/main/elyra/util/cos.py#L68
As a next step, github_repo_token / the GitHub personal access token should also get the possibility to be read from an env var within Elyra, which could likewise be backed by e.g. a K8S secret (roughly as sketched below).
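A minimal sketch of that idea; the environment variable name GITHUB_TOKEN and the runtime_configuration.metadata lookup are illustrative assumptions, not existing Elyra behavior:

```python
import os

# Sketch only: prefer an environment variable (the name GITHUB_TOKEN is an
# assumption) and fall back to the token stored verbatim in the runtime
# configuration metadata.
github_token = os.environ.get("GITHUB_TOKEN") or runtime_configuration.metadata.get(
    "github_repo_token"
)
```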
But one step at a time.