Create credentials upload plugins for GCP and AWS (#438) #439
dbalabka left a comment
I need to improve the code a bit more. I will get back to it 😀
from distributed import WorkerPlugin

logger = logging.getLogger(__name__)
I need to add some logging so that it is visible when the file has been uploaded.
logger = logging.getLogger(__name__)

class UploadGCPKey(WorkerPlugin):
"""Automatically upload a GCP key to the worker."""
I need to add more context about why we need to upload the key from the laptop.
"""
Initialize the plugin by reading in the data from the given file.
"""
config_path = os.getenv("AWS_CONFIG_FILE", Path.home() / Path(".aws/config"))
Should we upload the complete config?
Overall this seems like a great addition. This definitely needs some testing and documentation (see the Azure spot termination plugin for an example of how we documented that).
I'm marking this as a draft as it looks like you still intend to do more work here. Give me a ping when this is ready for review.
@jacobtomlinson, thanks for the good reference code. I will let you know when it can be reviewed. Linking the Azure plugin PR here: #251
We recently started using Dask on GCP and on our in-house Kubernetes cluster, which is deployed with Rancher. Using Dask locally is relatively straightforward, but deploying it to the cloud or to K8s is harder.
I found that sharing GCP and AWS credentials with remote workers is not easy, and there are many open questions about how to do it the right way. In PR #430, I improved the credentials workflow for GCP by allowing the use of Application Default Credentials (see ticket #429). This PR focuses on improving the DX by providing plugins that automatically share keys with remote workers, so that workers can read and write data on S3/GCS.
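To make the idea concrete, here is a minimal sketch of the general pattern: read a file on the client when the plugin is created, then write it out on every worker in `setup`. This is not the code in this PR. `UploadFileSketch`, `local_path`, and `target_path` are hypothetical names used only for illustration, and the `ImportError` fallback exists only so the sketch can run without Dask installed.

```python
from pathlib import Path

try:
    from distributed import WorkerPlugin
except ImportError:  # assumption: fallback so the sketch runs without Dask
    WorkerPlugin = object


class UploadFileSketch(WorkerPlugin):
    """Hypothetical plugin: copy one client-side file onto each worker."""

    def __init__(self, local_path, target_path):
        # Runs on the client: keep the file contents on the plugin instance
        # so they are serialized and shipped to the workers with it.
        self.data = Path(local_path).read_bytes()
        # Stored as a string; "~" is expanded on the worker, not the client.
        self.target_path = str(target_path)

    def setup(self, worker):
        # Runs on each worker: write the file and restrict its permissions.
        target = Path(self.target_path).expanduser()
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(self.data)
        target.chmod(0o600)
        return target
```

Assuming a connected `Client`, a plugin like this would be registered with `client.register_plugin(...)` (or `client.register_worker_plugin(...)` on older Dask versions). Reading the file in `__init__` rather than in `setup` is the key design choice, because the file only exists on the client machine.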
Todo