Archive Cancelled Workflows #939

Merged: 7 commits, Nov 6, 2024

Leahh02 (Collaborator) commented Oct 3, 2024

I was debating whether I should make this call to archive the workflow in wf_actions or wf_update. The calls to archive_workflow for other states are in wf_update, but I put it in wf_actions to keep the full lifecycle of actions related to cancelling the workflow in one place.

pagrubel (Collaborator) left a comment

In general, we have been marking states as Archived/{final_state} when they are archived, so I think we should stick to this format. I'll look to see how this is used. I think it is also used when there is a reset.
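
To illustrate the Archived/{final_state} convention mentioned above, here is a minimal sketch. The helper names `archived_state` and `final_state_of` are hypothetical and are not taken from beeflow; the only thing assumed from the discussion is the "Archived/" prefix format.

```python
ARCHIVE_PREFIX = "Archived"  # assumed prefix, following the Archived/{final_state} format


def archived_state(final_state: str) -> str:
    """Return the archived form of a final workflow state (hypothetical helper)."""
    if final_state.startswith(ARCHIVE_PREFIX):
        # Already archived; avoid producing "Archived/Archived/...".
        return final_state
    return f"{ARCHIVE_PREFIX}/{final_state}"


def final_state_of(state: str) -> str:
    """Recover the final state from an archived state string (hypothetical helper)."""
    prefix, sep, rest = state.partition("/")
    if prefix == ARCHIVE_PREFIX and sep:
        return rest
    return state
```

Keeping the final state after the slash means code that needs to know how a workflow ended can still recover it from the archived state string.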

pagrubel (Collaborator) left a comment

@Leahh02 This actually ended up with the state we need for the workflow.

clamr	be196f	Archived/Cancelled

when I ran it because when a workflow is archived it automatically adds the final state.
I'm not sure what happened to the state you add here.

Also, I'd like to see any dags that were run in the graphml files as well as the final dag in the archive.

Sorry that this wasn't asked for in the issue. I ran some dags before cancelling the workflow. There are none in the archive. I'm thinking someone may have a complex workflow that they cancelled due to something like an upcoming DST and they want to quickly see what parts actually ran.

aquan9 added the WIP (Work in progress) label Oct 15, 2024

Leahh02 (Collaborator, Author) commented Oct 22, 2024

> @Leahh02 This actually ended up with the state we need for the workflow.
>
> clamr	be196f	Archived/Cancelled
>
> when I ran it because when a workflow is archived it automatically adds the final state. I'm not sure what happened to the state you add here.
>
> Also, I'd like to see any dags that were run in the graphml files as well as the final dag in the archive.
>
> Sorry that this wasn't asked for in the issue. I ran some dags before cancelling the workflow. There are none in the archive. I'm thinking someone may have a complex workflow that they cancelled due to something like an upcoming DST and they want to quickly see what parts actually ran.

I think I made the state "Cancelled and Archived" because I thought that would be reflected in the GDB. It's not, so I changed it back to the way it was for simplicity.

Leahh02 (Collaborator, Author) commented Oct 22, 2024

Right now the unit tests aren't testing archiving the workflow after it's been cancelled. I added the line `mocker.patch('beeflow.wf_manager.resources.wf_actions.archive_workflow', return_value=None)` to test_wf_manager to skip this. I was wondering if archiving should be covered in a different test.
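
For readers unfamiliar with this pattern, the sketch below shows how patching a function by its dotted path replaces it with a stub, and how the stub can still be used to check that archiving was requested. It uses the standard library's `unittest.mock.patch`, which pytest-mock's `mocker.patch` wraps. The module `wf_actions_standin` and the bodies of `archive_workflow` and `cancel_workflow` are hypothetical stand-ins, not the real beeflow code.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for beeflow.wf_manager.resources.wf_actions.
wf_actions = types.ModuleType("wf_actions_standin")
sys.modules["wf_actions_standin"] = wf_actions


def archive_workflow(wf_id, final_state):
    # Stand-in for the real archiving step, which a unit test wants to skip.
    raise RuntimeError("real archiving is not available in a unit test")


def cancel_workflow(wf_id):
    # Look up archive_workflow on the module at call time so a patch takes effect.
    wf_actions.archive_workflow(wf_id, final_state="Cancelled")
    return "Cancelled"


wf_actions.archive_workflow = archive_workflow
wf_actions.cancel_workflow = cancel_workflow


def test_cancel_requests_archive():
    # Replace archive_workflow with a stub that returns None for the duration of the test.
    with mock.patch("wf_actions_standin.archive_workflow", return_value=None) as fake_archive:
        assert wf_actions.cancel_workflow("be196f") == "Cancelled"
    # The stub records its calls, so the test can still verify archiving was requested.
    fake_archive.assert_called_once_with("be196f", final_state="Cancelled")
```

One possible split, under these assumptions, is to keep the patched call in the cancel test and exercise the archiving logic itself in a separate test.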

Leahh02 (Collaborator, Author) commented Oct 24, 2024

All of the generated DAGs are now also being saved to the archive.
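
As a rough illustration of what saving the generated DAGs to the archive could look like, here is a minimal sketch that copies every `.graphml` file from a DAG directory into the archive. The function name `copy_dags_to_archive`, the `dags` subdirectory, and the directory layout are assumptions made for this example; they do not describe how beeflow actually does it.

```python
import shutil
from pathlib import Path


def copy_dags_to_archive(dag_dir: Path, archive_dir: Path) -> list[str]:
    """Copy each .graphml file in dag_dir into archive_dir/dags and return the names copied."""
    dest = archive_dir / "dags"  # assumed location inside the archive
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for graphml in sorted(dag_dir.glob("*.graphml")):
        # copy2 preserves timestamps, which helps show when each DAG was generated.
        shutil.copy2(graphml, dest / graphml.name)
        copied.append(graphml.name)
    return copied
```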

Leahh02 removed the WIP (Work in progress) label Oct 28, 2024

pagrubel (Collaborator) commented Nov 6, 2024

Now the workflow is not actually being cancelled; instead, it runs to completion:
$ beeflow query 0a
Archived/Cancelled
clamr--RUNNING
ffmpeg--WAITING

$ beeflow query 0a
Archived/Cancelled
clamr--COMPLETED
ffmpeg--PENDING

$ beeflow query 0a
Archived
clamr--COMPLETED
ffmpeg--COMPLETED

I didn't have time to look, but most likely the wf_manager is looking for the Cancelled state to stop it from running.

pagrubel (Collaborator) commented Nov 6, 2024

@Leahh02 This is not caused by your PR because it also occurs in the develop branch.

Leahh02 (Collaborator, Author) commented Nov 6, 2024

> @Leahh02 This is not caused by your PR because it also occurs in the develop branch.

That's so strange. I noticed it once while I was running a workflow, but I canceled it while tasks were completing, so I figured that's why it finished. That's definitely not the case in the example you gave.

I'll try to figure it out

pagrubel (Collaborator) left a comment

This looks good. Cancelling is not working, but that is a separate issue (#955) to be addressed. This PR addresses #894.

pagrubel merged commit c0117fc into develop Nov 6, 2024
12 checks passed
pagrubel deleted the Issue894/archive-cancelled-workflows branch November 6, 2024 18:16