SageMakerBaseOperator character limit bug #45550

Open
2 tasks done
dirkrkotzeml opened this issue Jan 10, 2025 · 1 comment · May be fixed by #45551
Labels
area:providers kind:bug This is a clearly a bug provider:amazon-aws AWS/Amazon - related issues

Comments

@dirkrkotzeml
Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

8.27.0

Apache Airflow version

2.7.2

Operating System

Amazon Linux

Deployment

Amazon (AWS) MWAA

Deployment details

Standard MWAA deployment - V2.7.2

What happened

SageMakerProcessingJobs have a hard limit of 64 characters for the ProcessingJobName.
In the SageMakerBaseOperator there is a check for uniqueness for the name.
If the name is not unique, the operator appends a timestamp to prevent a collision. However, there is no check to prevent the updated name from exceeding 64 characters, which causes the creation of the SageMakerProcessingJob to fail.
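To make the failure mode concrete, here is a minimal sketch of the length arithmetic. It assumes the appended suffix is a hyphen followed by a millisecond epoch timestamp (13 digits today), which matches the 50-character threshold mentioned under "How to reproduce". `naive_unique_name` is a hypothetical stand-in for illustration, not the operator's actual method.

```python
import time

MAX_JOB_NAME_LENGTH = 64  # ProcessingJobName limit described in this issue


def naive_unique_name(proposed_name: str) -> str:
    """Hypothetical stand-in: append a timestamp with no length check."""
    # Assumption: the suffix is "-" plus milliseconds since the epoch.
    timestamp_ms = time.time_ns() // 1_000_000
    return f"{proposed_name}-{timestamp_ms}"


if __name__ == "__main__":
    for base_length in (50, 51, 60):
        name = naive_unique_name("a" * base_length)
        too_long = len(name) > MAX_JOB_NAME_LENGTH
        print(f"base={base_length} final={len(name)} exceeds_limit={too_long}")
```

Under that assumption, a 50-character base name lands exactly on the limit, and anything longer exceeds it once the suffix is added.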

What you think should happen instead

The SageMaker Pipelines SDK truncates the base name before adding the timestamp, so we recommend taking a similar approach for consistency.
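A rough sketch of the truncate-then-append approach is shown below. The function name, parameters, and the millisecond timestamp suffix are assumptions for illustration only; this is not the SageMaker SDK's code nor the implementation in the linked PR.

```python
import time
from typing import Optional

MAX_JOB_NAME_LENGTH = 64  # ProcessingJobName limit described in this issue


def truncated_unique_name(
    proposed_name: str,
    max_length: int = MAX_JOB_NAME_LENGTH,
    timestamp_ms: Optional[int] = None,
) -> str:
    """Hypothetical helper: trim the base name so base + suffix fits the limit."""
    if timestamp_ms is None:
        # Assumption: the suffix uses milliseconds since the epoch.
        timestamp_ms = time.time_ns() // 1_000_000
    suffix = f"-{timestamp_ms}"
    keep = max_length - len(suffix)
    if keep < 1:
        raise ValueError("max_length leaves no room for the base name")
    # Slicing is a no-op when the base name is already short enough.
    return f"{proposed_name[:keep]}{suffix}"


if __name__ == "__main__":
    long_name = "my-very-long-sagemaker-processing-job-name-for-a-nightly-pipeline"
    unique = truncated_unique_name(long_name)
    print(unique, len(unique))
```

Truncating the base rather than the suffix keeps the full timestamp intact, so uniqueness between runs is preserved; the trade-off is that two long names sharing the same prefix become harder to tell apart.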

How to reproduce

Create a SageMaker Processing Job using the SageMakerProcessingOperator with a name longer than 50 characters, and trigger it more than once. On the second trigger the timestamp is appended, and the Airflow logs show an error stating that the SageMaker Processing Job failed to create because the name exceeds the character limit.

Anything else

The problem occurs on every scheduled run that reuses the same name, after the first run.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@dirkrkotzeml dirkrkotzeml added area:providers kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Jan 10, 2025

boring-cyborg bot commented Jan 10, 2025

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@dirkrkotzeml dirkrkotzeml linked a pull request Jan 10, 2025 that will close this issue
@dosubot dosubot bot added the provider:amazon-aws AWS/Amazon - related issues label Jan 10, 2025
@nathadfield nathadfield removed the needs-triage label for new issues that we didn't triage yet label Jan 10, 2025
2 participants