Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ec2 client credential init fix for AWS IRSA #29784

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Greyeye
Copy link

@Greyeye Greyeye commented Oct 4, 2024

What does this PR do?

This PR modifies the initialisation of the AWS EC2 client to use the AWS configuration package. This change enables support for built-in credentials such as AWS IAM Roles for Service Accounts (IRSA). The current use of ec2.New() does not work with IRSA, and although not tested, a similar issue likely occurs with EKS Pod Identity as well.

When a pod is assigned an IRSA with the correct permissions to perform DescribeTags, it encounters an error and logs the following message:

(pkg/util/ec2/ec2_tags.go:104 in fetchEc2TagsFromAPI) | unable to get tags using default credentials (falling back to instance role): operation error EC2: DescribeTags, https response error StatusCode: 400, RequestID: 1234-1234-1234, api error MissingParameter: The request must contain the parameter AWSAccessKeyId

fixes issue 29916

Motivation

We are currently running the Datadog Agent pod in AWS EKS and are aiming to clean up the instance role by eliminating unnecessary permissions attached to it. Datadog is one of many services that currently use the instance role for the ec2:DescribeTags permission, and we would like to switch to using the IRSA-assigned role.

After extensive debugging and tracing, we discovered that when the pod is assigned an IRSA token but the instance is running IMDSv2, it fails to use IRSA and always falls back to using the instance role.

Describe how to test/QA your changes

  1. create a new EKS node and instance role that does not allow ec2:describeTags.
  2. create a new AWS IAM Rolewith ec2:describeTags, and attach this to datadog-agent using IRSA
  3. deploy datadog-agent with config.yaml (or using configMap) like this
    collect_ec2_tags: true
    collect_ec2_tags_use_imds: true
    ec2_prefer_imdsv2: true
  4. start the datadog-agent and verify the role gets used, and tags are collected using the IRSA role.

Possible Drawbacks / Trade-offs

none

Additional Notes

none

@Greyeye Greyeye requested a review from a team as a code owner October 4, 2024 04:28
@bits-bot
Copy link
Collaborator

bits-bot commented Oct 4, 2024

CLA assistant check
All committers have signed the CLA.

log.Warnf("unable to get aws configurations: %s", err)
return nil
}
return ec2.NewFromConfig(cfg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ maybe add a log such as: log.Info("AWS configurations successfully fetched")
What do you think ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@louis-cqrl I believe that will populate unnecessary log output. How about log.Debug() instead.

return ec2.NewFromConfig(cfg)
}

func getTagsWithCreds(ctx context.Context, instanceIdentity *EC2Identity, connection ec2ClientInterface) ([]string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ if ec2Cconnection fails then connection.DescribeTags will unreference a nil pointer, isn't an issue ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good find, I've added the check and updated the test

}

// ec2Cconnection creates an ec2 client with the given region and credentials
func ec2Cconnection(ctx context.Context, region string, awsCreds aws.CredentialsProvider) ec2ClientInterface {
Copy link
Contributor

@louis-cqrl louis-cqrl Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Why the function name is ec2Cconnection and not ec2Connection ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been fixed

@Greyeye Greyeye requested a review from a team as a code owner November 4, 2024 23:12
Copy link
Contributor

@drichards-87 drichards-87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some minor feedback from Docs and approved the PR.

releasenotes/notes/ec2-cred-fix-9326728a245381e7.yaml Outdated Show resolved Hide resolved
Greyeye and others added 4 commits November 22, 2024 21:52
current use of ec2.New() does not work with IRSA. (not tested but most like same problem with EKS Pod Identity as well)

```
(pkg/util/ec2/ec2_tags.go:104 in fetchEc2TagsFromAPI) | unable to get tags using default credentials (falling back to instance role): operation error EC2: DescribeTags, https response error StatusCode: 400, RequestID: 1234-1234-1234, api error MissingParameter: The request must contain the parameter AWSAccessKeyId
```

This change is to change to use aws config package to help initialise aws ec2 client with the built-in credentials such as AWS IRSA tokens.
1. rename function name ec2Cconnection to ec2Connection
2. add nil check for connection
3. updated the unit test to cover nil test
4. add debug log print for aws config setup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] datadog-agent pod cannot assume AWS IRSA credentials
4 participants