-
Notifications
You must be signed in to change notification settings - Fork 2
update prefect version #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
kkamdin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me! I'm not very educated (yet!) on writing prefect tests so I may have missed something. I just had one question and a note about building common-use prefect code in in-line comments.
| with prefect_test_harness(): | ||
| # Run flow | ||
| flow_run_id = asyncio.run(run_flow()) | ||
| flow_run_id = run_flow() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just so I understand -- can you explain why you no longer need the asyncio?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments. Previously, in run_flow(), we invoked the async function create_then_begin_flow_run. Since this function is now deprecated, we instead call parent_flow inline. As a result, we no longer need to use asyncio.run when calling run_flow.
| """ | ||
| Run the parent flow inline (no workers/agents) and return the flow run ID. | ||
| """ | ||
| flow_run_id = parent_flow(model_name="model_name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we want to move towards commonly used prefect code, we could try renaming MLexchange-specific parameters like "model_name" to something more generic. Since parent flow is basically a test fixture here (I think?), maybe that's less useful. I think in prefect_utils/core.py there are some candidates, but that can be a job for future us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run_flow() is a helper function created to simplify testing. It runs a flow inline and returns a flow ID, which we can then use to test operations such as delete, cancel, or fetching logs.
In real applications, we do not run flows inline this way. Instead, only schedule_prefect_flow (defined in core.py) is used. This function uses more general parameter names:
def schedule_prefect_flow(
deployment_name: str,
parameters: Optional[dict] = None,
flow_run_name: Optional[str] = None,
tags: Optional[list] = [],
)
It also returns a flow ID.
mlex_utils/test/test_prefect.py
Outdated
| assert len(flow_run_logs) > 0 | ||
| assert isinstance(flow_run_logs[0], str) | ||
| print(f"Parent flow finished with flow_run_logs: {flow_run_logs}") | ||
| # assert len(flow_run_logs) > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assert statement is commented out now. Is there a reason to keep it, or is the next assertion comprehensive enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments.
The assert len(flow_run_logs) > 0 check is important to ensure we successfully retrieve logs. However, in Prefect 3 with pytest, len(flow_run_logs) is always 0.
It looks like pytest changed between Prefect 2 and Prefect 3.
Prefect 2:

Prefect 3:

From the above images, you may notice that Prefect 3 uses a temporary server with no persistent log storage, whereas Prefect 2 saved logs in a local database by default.
When I tested against a real Prefect server, I was able to retrieve logs using the get_flow_runs_logs() API, but this no longer works in pytest. Since I couldn’t find a way to test it reliably, I delete the assertion for now.

Have you run into a similar issue?
| ) | ||
|
|
||
| print(f"Successfully scheduled flow run with ID: {flow_run_id}") | ||
| assert isinstance(flow_run_id, uuid.UUID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the other tests for flow_run_id, it is checking that it is a str. Here, it is checking if flow_run_id is a uuid.UUID. Super nitpicky, but maybe these should be consistent? I am not sure if other code assumes the id to be a str or uuid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good. I’ve updated them to use the same type: uuid.UUID.
davramov
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of comments with minor suggestions; otherwise, it looks good!
Thank you, David. |
I upgrade prefect 2.14.21 to prefect 3.4.2, which is consistent with David's splash project.
I add mlflow utils to this repo.