Open
Labels: bug (Something isn't working)
Description
What happened?
anemoi-training==0.9.0
anemoi-utils==0.4.43
The health_check method does not correctly check for a successful HTTP response:
    if response.text == "OK":  # <-- should be response.status_code == 200
        return
As a result, the sync command fails with the following error:
2026-01-30 02:34:26 INFO ✅ Successfully logged in to MLflow. Happy logging!
Apptainer> anemoi-training mlflow sync --source {mlflow_logs} --destination https://mlflow.ecmwf.int/ --run-id {run_id} --experiment-name {my_experiment} --verbose
30-Jan-26 02:34:37 - INFO - Using default logging config without output log file
Traceback (most recent call last):
  File "/usr/local/bin/anemoi-training", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/anemoi/training/__main__.py", line 23, in main
    cli_main(__version__, __doc__, COMMANDS)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/utils/cli.py", line 266, in cli_main
    cmd.run(args)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/training/commands/mlflow.py", line 261, in run
    health_check(args.destination)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/utils/mlflow/utils.py", line 44, in health_check
    raise ConnectionError(error_msg)
ConnectionError: Could not connect to MLflow server at https://mlflow.ecmwf.int/. The server may require authentication, did you forget to turn it on?
This occurs because the multiurl library's response.text field is the entire HTML document in the response body, which does not equal the literal string "OK".
Proposed Solution
To check for a successful response, we should just check response.status_code == 200.
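A minimal sketch of the difference between the two checks, assuming a response object that exposes `status_code` and `text`. `StubResponse`, `is_ok_by_text`, and `is_ok_by_status` are hypothetical names used only for illustration; they are not part of anemoi-utils or multiurl.

```python
from dataclasses import dataclass


@dataclass
class StubResponse:
    """Hypothetical stand-in for an HTTP response (not the real library class)."""

    status_code: int
    text: str


def is_ok_by_text(response) -> bool:
    # Behaviour described in this issue: compare the response body to "OK".
    return response.text == "OK"


def is_ok_by_status(response) -> bool:
    # Proposed behaviour: rely on the HTTP status code instead of the body.
    return response.status_code == 200


# A 200 response whose body is an HTML page, as reported above.
html_page = StubResponse(status_code=200, text="<!DOCTYPE html><html>...</html>")

print(is_ok_by_text(html_page))    # False: the body is HTML, not "OK"
print(is_ok_by_status(html_page))  # True: the request itself succeeded
```

One caveat with this approach: a server that answers 200 with a login page would also pass the status check, so whether the status code alone is sufficient depends on how the server signals missing authentication.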
What are the steps to reproduce the bug?
- anemoi-training==0.9.0
- successful authentication at mlflow.ecmwf.int via
    anemoi-training mlflow login --url https://mlflow.ecmwf.int/
- then run the sync:
    anemoi-training mlflow sync --source {mlflow_logs} --destination https://mlflow.ecmwf.int/ --run-id {run_id} --experiment-name {my_experiment}
Version
anemoi-training==0.9.0
anemoi-utils==0.4.43
Platform (OS and architecture)
SUSE Linux Enterprise Server 15 SP6
Relevant log output
Traceback (most recent call last):
  File "/usr/local/bin/anemoi-training", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/anemoi/training/__main__.py", line 23, in main
    cli_main(__version__, __doc__, COMMANDS)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/utils/cli.py", line 266, in cli_main
    cmd.run(args)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/training/commands/mlflow.py", line 261, in run
    health_check(args.destination)
  File "/usr/local/lib/python3.12/dist-packages/anemoi/utils/mlflow/utils.py", line 44, in health_check
    raise ConnectionError(error_msg)
ConnectionError: Could not connect to MLflow server at https://mlflow.ecmwf.int/. The server may require authentication, did you forget to turn it on?
Accompanying data
No response
Organisation
No response