@adtisdal-ASDC

Summary

Addresses CUMULUS-4471: Coreify PDR Cleanup task

Changes

  • Add ASDC's pdr-cleanup task and adjust it to fit into Cumulus core

PR Checklist

  • Update CHANGELOG
  • Unit tests
  • Ad-hoc testing - Deploy changes and test manually
  • Integration tests

📝 Note:
For most pull requests, please Squash and merge to maintain a clean and readable commit history.

@adtisdal-ASDC force-pushed the adtisdal/CUMULUS-4427-add-pdr-cleanup branch from c5a1bfc to 92fec12 on January 21, 2026 16:17

@reweeden left a comment

Looks pretty good! Just some ideas / food for thought mostly about best practice things that we might want to standardize...

Comment on lines +19 to +24
cp ./src/*/*.py ./dist/
cp -r ./schemas ./dist/

cd ./dist || exit 1

node ../../../bin/zip.js lambda.zip $(ls | grep -v lambda.zip)

I ran into having to make the same modifications to this script. @paulpilone and I talked about possibly moving it to the top level bin folder so it doesn't need to be copy/pasted for every task. The thing I'm concerned with is slight drift between all the different versions over time.

source_code_hash = filebase64sha256("${path.module}/../dist/lambda.zip")
handler = "pdr_cleanup.handler"
role = var.lambda_processing_role_arn
runtime = "python3.13"

I believe our Python version is 3.12. Are we trying to keep all the tasks on the same version?

unit-logs/
test_output.txt
test_output.txt
**/__pycache__/

*.pyc should catch everything

Comment on lines +70 to +91
{
"ErrorEquals": [
"States.ALL"
],
"Next": "WorkflowFailed",
"ResultPath": "$.exception"
}
],
"Retry": [
{
"BackoffRate": 2,
"ErrorEquals": [
"Lambda.ServiceException",
"Lambda.TooManyRequestsException",
"Lambda.AWSLambdaException",
"Lambda.SdkClientException"
],
"IntervalSeconds": 5,
"MaxAttempts": 10
}
]
},

Something weird going on with the indent formatting here

},
"author": "Cumulus Authors",
"license": "Apache-2.0"
} No newline at end of file

GitHub is flagging a number of files for whitespace issues at the end of the file, such as the missing final newline here. I would highly recommend checking your editor settings, as there's usually an option to fix these automatically.
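
For illustration, a check like this can also be scripted, for example as a local pre-commit step. The sketch below is only a rough idea: `find_whitespace_issues` is a hypothetical helper, not something that exists in Cumulus, and it assumes the file contents have already been read into a string.

```python
def find_whitespace_issues(text: str) -> list[str]:
    """Return human-readable whitespace problems found in ``text``."""
    issues = []
    # A non-empty text file is expected to end with exactly one newline.
    if text and not text.endswith("\n"):
        issues.append("no newline at end of file")
    elif text.endswith("\n\n"):
        issues.append("extra blank lines at end of file")
    # Flag spaces or tabs left at the end of any line.
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            issues.append(f"line {number}: trailing whitespace")
    return issues
```

Running it over the contents of each changed file before pushing would surface the same kind of problem GitHub marks with "No newline at end of file".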

Should we be calling this task.py? I think in the best practices example there is a task.py and it makes sense to me for the 'entrypoint' (lambda handler) to always be in a file with the same name for all the tasks, but I'm not sure if that was actually the intention, so just an idea for discussion here.

Comment on lines +19 to +20
event (dict): A lambda event object
context (dict): An AWS Lambda context

I'm also wondering about the docstring convention here. I think I ended up using the :param event: blah... convention (I'm forgetting what it's called). I think it would be nice to pick one and have it be consistent across our Python code.
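
For reference, the `name (type): description` layout in the lines above resembles what is usually called Google style, while the `:param event:` form is the reStructuredText (Sphinx) field-list style. A side-by-side sketch follows; the function names and the placeholder bodies are hypothetical and only there to make the two layouts comparable.

```python
def handler_google_style(event: dict, context: object) -> dict:
    """Lambda handler documented in Google style.

    Args:
        event (dict): A lambda event object
        context (object): An AWS Lambda context

    Returns:
        dict: The event, unchanged (placeholder behaviour).
    """
    return event


def handler_rest_style(event: dict, context: object) -> dict:
    """Lambda handler documented in reStructuredText (Sphinx) style.

    :param event: A lambda event object
    :type event: dict
    :param context: An AWS Lambda context
    :returns: The event, unchanged (placeholder behaviour).
    :rtype: dict
    """
    return event
```

Either layout carries the same information, so the choice is mostly about which one the project's tooling and existing code already favour.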


try:
    s3_client.delete_object(Bucket=provider["host"], Key=src_path)
    logger.info(f"DELETED: {provider['host']}/{src_path}")

How do we feel about using format strings in logger calls? I thought usually the linter complains about this, but I guess it's not in the default lints or the lints we turned on.
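
For context, the lint rules that usually flag this are pylint's `logging-fstring-interpolation` (W1203) and Ruff's `G004`, when they are enabled. A minimal sketch of the deferred-formatting alternative is below; the logger name and the `log_deleted` helper are hypothetical, and the message text mirrors the line under review.

```python
import logging

logger = logging.getLogger("pdr_cleanup_example")  # hypothetical logger name


def log_deleted(bucket: str, key: str) -> None:
    """Log a deletion using %-style placeholders instead of an f-string."""
    # The message is only formatted if the record is actually emitted, and
    # the template and arguments stay separate on the log record.
    logger.info("DELETED: %s/%s", bucket, key)
```

Keeping the template separate from its arguments also lets log aggregation group records by message template.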

provider = successful_event["config"]["provider"]
pdr = successful_event["input"]["pdr"]

with pytest.raises(Exception):

Could add a match just to make sure the exception you're catching is actually the expected one.

Suggested change
- with pytest.raises(Exception):
+ with pytest.raises(Exception, match="Delete failed"):
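
To illustrate why the `match` argument helps, here is a small self-contained sketch. `delete_pdr` is a hypothetical stand-in for the task's delete step, and the "Delete failed" message is assumed from the suggestion above rather than taken from the task's actual code.

```python
import pytest


def delete_pdr(should_fail: bool) -> None:
    """Hypothetical stand-in for the task's delete step."""
    if should_fail:
        raise Exception("Delete failed: access denied")


def test_delete_failure_is_reported() -> None:
    # `match` is a regex searched against str(exception), so a prefix works.
    with pytest.raises(Exception, match="Delete failed"):
        delete_pdr(should_fail=True)


def test_unrelated_error_does_not_satisfy_match() -> None:
    # Without `match`, this KeyError would satisfy pytest.raises(Exception).
    # With `match`, the inner block fails instead of passing silently.
    with pytest.raises(AssertionError):
        with pytest.raises(Exception, match="Delete failed"):
            raise KeyError("provider")
```

Narrowing the exception type as well, where the code under test raises something more specific than `Exception`, would tighten the check further.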
