dvc exp run --queue: fails with "No such file or directory" on a cache path similar to .dvc/tmp/exps
Description
It appears that dvc exp run --queue only works on DVC pipelines that have previously been committed to git.
The error raised when the pipeline has not been committed does not make this clear.
When running a queued experiment with dvc exp run --queue, the job is queued and can be started with dvc queue start. However, it will fail with an error similar to ERROR: unexpected error - [Errno 2] No such file or directory: '[path to repo]/.dvc/tmp/exps/tmpabc123/....
The same experiment runs successfully with dvc repro and dvc exp run. The queued run starts working once the pipeline is committed with git, which suggests the missing commit is either the cause or related to it, but the error message does not mention this.
Also, once the pipeline is committed and a new uncommitted change is made, it is unclear which version of the pipeline is being run: the committed version, or the "dirty" version in the current directory.
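One way to make the committed-versus-dirty question concrete is to ask git directly before queueing. The following is a minimal sketch, not DVC's own logic: git_state is a hypothetical helper name, and it assumes the pipeline files live at known paths relative to the repo root. It relies only on git cat-file -e and git status --porcelain.

```python
import subprocess
from pathlib import Path


def git_state(repo: Path, rel_path: str) -> str:
    """Classify rel_path as 'uncommitted', 'dirty', or 'clean' relative to HEAD.

    Hypothetical helper for illustration; not part of DVC.
    """
    # `git cat-file -e HEAD:<path>` exits non-zero when the path is not in HEAD
    # (this also covers a repo that has no commits yet).
    in_head = subprocess.run(
        ["git", "cat-file", "-e", f"HEAD:{rel_path}"],
        cwd=repo,
        capture_output=True,
    ).returncode == 0
    if not in_head:
        return "uncommitted"

    # Porcelain output is empty when the file matches both the index and HEAD.
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", rel_path],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return "dirty" if status.strip() else "clean"
```

A wrapper script could call git_state(repo, "pipeline/dvc.yaml") before dvc exp run --queue and stop with an explicit message on "uncommitted", or print a warning on "dirty".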
Other minor issues
These can be separate issues if required.
print statements are not shown in --follow unless explicitly flushed, though this may just be unavoidable celery behaviour
when running dvc queue logs [task] on a task that requires some slow dvc checkout startup, it gives a "no logs available" message, but with --follow it gives ERROR: unexpected error - : [Errno 2] No such file or directory: '/[path to repo]/.dvc/tmp/exps/run/[uuid]/[uuid].json. The same command succeeds later, presumably once the job has actually started.
the UTC timestamps shown by dvc queue status move to MM DD, YYYY format on the next day, which hides helpful time info, especially if you don't work in UTC (i.e. this can happen during the day)
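For the first point in the list above, the usual workaround on the script side is to flush each progress line explicitly. Python block-buffers stdout when it is not attached to a terminal, so output can sit in the buffer until the process exits. A minimal sketch follows; report is a hypothetical helper name, and whether this changes what --follow shows in a given setup is an assumption based on the behaviour described in this report.

```python
import sys
import time


def report(step: int, total: int, stream=sys.stdout) -> None:
    """Write one progress line and flush it so a log follower sees it at once."""
    print(f"step {step}/{total}", file=stream, flush=True)


if __name__ == "__main__":
    for i in range(1, 4):
        report(i, 3)
        time.sleep(0.1)  # stand-in for real work
```

Running the interpreter with python -u, or setting PYTHONUNBUFFERED=1, has a similar effect without touching the script.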
Reproduce
1. Create a new repo: mkdir /tmp/example; cd /tmp/example; git init; dvc init;
2. Create a pipeline: mkdir pipeline and copy the following as files:
   main.py
   dvc.yaml
3. Run dvc repro: pipeline runs successfully
4. Run dvc exp run: experiment cannot be run without an existing git commit (not really a problem in most repos, plus has a good error message)
5. Run git commit -m "Setup repo" to commit the init files, but not the pipeline files, so that the repo has at least one commit
6. Run dvc exp run: experiment runs successfully
7. Run dvc exp run --queue: command runs successfully
8. Run dvc queue start: command runs successfully
9. Run dvc queue logs [task name]: shows "ERROR: unexpected error - [Errno 2] No such file or directory"
10. Run dvc queue status: task shown as "Failed"
11. Commit pipeline files
12. Re-run steps 7 and 8
13. Run dvc queue logs [task name]: no error, task running as expected
14. Run dvc queue status: task eventually shown as success
15. Bonus: Run dvc queue logs [task name] --follow: note that print statements are not shown until end of task, unless sys.stdout.flush() is called
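The contents of main.py and dvc.yaml are not included in this text. For anyone trying to follow the steps, the sketch below writes a hypothetical stand-in pipeline: a slow script that prints progress without flushing, and a single-stage dvc.yaml. The stage name, the output file, and write_pipeline itself are assumptions for illustration and are not the files from the original report.

```python
from pathlib import Path

# Hypothetical stand-in for main.py: prints progress (deliberately without
# flushing, to match the "Bonus" step) and writes one output file.
MAIN_PY = '''\
import time

for i in range(5):
    print(f"iteration {i}")
    time.sleep(1)

with open("output.txt", "w") as f:
    f.write("done\\n")
'''

# Hypothetical stand-in for dvc.yaml: one stage that runs main.py.
DVC_YAML = """\
stages:
  main:
    cmd: python main.py
    deps:
      - main.py
    outs:
      - output.txt
"""


def write_pipeline(root: Path) -> Path:
    """Create <root>/pipeline containing the stand-in main.py and dvc.yaml."""
    pipeline = root / "pipeline"
    pipeline.mkdir(parents=True, exist_ok=True)
    (pipeline / "main.py").write_text(MAIN_PY)
    (pipeline / "dvc.yaml").write_text(DVC_YAML)
    return pipeline
```

Calling write_pipeline(Path("/tmp/example")) after git init and dvc init leaves the repo in the state the pipeline-creation step describes, with the pipeline files present but not yet committed.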
Expected
Either dvc exp run --queue should work without first committing the pipeline, or a clear error message should be shown indicating it needs to be committed first.
If a committed pipeline is required it should be clear whether the committed version or the current "dirty" version of the pipeline is being run.
Environment information