File-based version #360

gpetretto · 2025-09-29T09:14:41Z

gpetretto
Sep 29, 2025
Maintainer

As already discussed in #338, I have been trying to create a version that would allow users to try running workflows without having to install a database. The idea is that such a version should be used only to run a few workflows or to test the software without the burden of a complicated setup. It is unlikely that it would support high-throughput runs or large output databases, so it should be discouraged for production.

At this point I have prepared two versions. In the first it is possible to disable the usage of the MongoDB pipelines, which allows MontyDB to be used as a backend for the queue database. The second entirely replaces the original JobController with one based on SQL and thus can be backed by SQLite.

While I believe that a version based on SQLite has the potential to be more robust and to support larger DB sizes, this test version was mostly implemented through vibe coding and, given that the task was quite involved, I believe that the code needs extensive review. This will likely be a major task, so I suppose that the current version with some minor improvements is the best I can afford to provide.
On the other hand, the version without pipelines gives up on the atomicity of the operations in some cases, on top of the fact that the implementation in MontyDB does not guarantee it either. This version may be more prone to database corruption if multiple actions are performed simultaneously on the DB, but the implementation was much easier and could probably be merged after a standard short review. So I will start from the latter.
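
To illustrate what giving up atomicity can mean, here is a minimal sketch using only Python's standard sqlite3 module. It is not the jobflow-remote or MontyDB implementation; the `jobs` table, the state names and the helper functions are hypothetical. It contrasts a two-step "check, then update" claim with a single conditional UPDATE.

```python
import sqlite3


def make_db():
    """Create an in-memory DB with one hypothetical job in state READY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, state TEXT, owner TEXT)")
    conn.execute("INSERT INTO jobs (id, state) VALUES (1, 'READY')")
    conn.commit()
    return conn


def is_ready(conn, job_id):
    """Step 1 of the two-step claim: only read the current state."""
    row = conn.execute("SELECT state FROM jobs WHERE id = ?", (job_id,)).fetchone()
    return row is not None and row[0] == "READY"


def mark_running(conn, job_id, owner):
    """Step 2 of the two-step claim: write without re-checking the state."""
    conn.execute(
        "UPDATE jobs SET state = 'RUNNING', owner = ? WHERE id = ?", (owner, job_id)
    )
    conn.commit()


def claim_atomic(conn, job_id, owner):
    """Check and update in one statement; succeeds only if the job was READY."""
    cur = conn.execute(
        "UPDATE jobs SET state = 'RUNNING', owner = ? WHERE id = ? AND state = 'READY'",
        (owner, job_id),
    )
    conn.commit()
    return cur.rowcount == 1


def demo_two_step():
    """Simulate two actors that both check before either of them writes."""
    conn = make_db()
    a_ok = is_ready(conn, 1)
    b_ok = is_ready(conn, 1)
    if a_ok:
        mark_running(conn, 1, "A")
    if b_ok:
        mark_running(conn, 1, "B")  # silently overwrites A's claim
    owner = conn.execute("SELECT owner FROM jobs WHERE id = 1").fetchone()[0]
    return a_ok, b_ok, owner


def demo_atomic():
    """The same two actors using the single conditional UPDATE."""
    conn = make_db()
    a_ok = claim_atomic(conn, 1, "A")
    b_ok = claim_atomic(conn, 1, "B")
    owner = conn.execute("SELECT owner FROM jobs WHERE id = 1").fetchone()[0]
    return a_ok, b_ok, owner


if __name__ == "__main__":
    print(demo_two_step())
    print(demo_atomic())
```

In the two-step variant both actors believe they claimed the job and the last write wins, while the conditional UPDATE lets only one of them succeed.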

No pipelines version

To use this version install the code from the gp/pipelines branch: https://github.com/Matgenix/jobflow-remote/tree/gp/pipelines and upgrade MontyDB to the latest version (2.5.5) where I have implemented the find_one_and_update method.
For the configuration of the stores you can set them up like this:

queue:
  store:
    type: MontyStore
    collection_name: jobs
    database_path: /path/to/folder/for/queue/store
  use_mongodb_pipelines: false
jobstore:
  docs_store:
    type: MontyStore
    collection_name: outputs
    database_path: /path/to/folder/for/output/store
  additional_stores:
    data:
      type: MontyStore
      collection_name: data
      database_path: /path/to/folder/for/output/store

The rest of the configuration is the standard one.
Note that here jobflow-remote still forces you to define a data store for atomate2, so maybe switching to a warning as discussed in #331 would be better.
My idea is that this could be set as the default store if queue and jobstore are missing from the configuration, using the standard project folder to save the files. In this way a new user could test this with a minimal configuration.
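
A rough sketch of how such a fallback could look, assuming the configuration is available as a plain dictionary. The function name and the `queue_store` / `output_store` folder names are hypothetical and not part of jobflow-remote; the keys mirror the YAML shown above. In this sketch each of the two sections is filled in independently if it is missing.

```python
from pathlib import Path


def with_default_file_stores(config: dict, project_dir: str) -> dict:
    """Return a copy of `config` with file-based stores filled in where missing.

    Hypothetical helper: existing `queue` / `jobstore` entries are left untouched.
    """
    base = Path(project_dir)
    result = dict(config)

    if "queue" not in result:
        result["queue"] = {
            "store": {
                "type": "MontyStore",
                "collection_name": "jobs",
                # hypothetical folder name inside the project folder
                "database_path": str(base / "queue_store"),
            },
            "use_mongodb_pipelines": False,
        }

    if "jobstore" not in result:
        # hypothetical folder name inside the project folder
        output_path = str(base / "output_store")
        result["jobstore"] = {
            "docs_store": {
                "type": "MontyStore",
                "collection_name": "outputs",
                "database_path": output_path,
            },
            "additional_stores": {
                "data": {
                    "type": "MontyStore",
                    "collection_name": "data",
                    "database_path": output_path,
                }
            },
        }

    return result


if __name__ == "__main__":
    print(with_default_file_stores({"name": "my_project"}, "/path/to/project"))
```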

SQL JobController

To use this version install the code from the gp/sql branch: https://github.com/Matgenix/jobflow-remote/tree/gp/sql
and for the configuration use:

queue:
  store:
    type: sqlite
jobstore:
  docs_store:
    type: MontyStore
    collection_name: outputs
    database_path: /path/to/folder/for/output/store
  additional_stores:
    data:
      type: MontyStore
      collection_name: data
      database_path: /path/to/folder/for/output/store

Optionally, a filepath in queue.store can be set to choose the path of the SQLite db file. By default it is stored in the project folder.

Let me know if you have the chance to test it.
Pinging @JaGeo @Andrew-S-Rosen and @utf as they may be interested.

JaGeo · 2025-10-08T06:40:00Z

JaGeo
Oct 8, 2025

Hi @gpetretto,

I tried to set up the MontyDB version. I ran into the following error:

                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jgeorge/miniconda3/envs/2025_atomate2_workshop/lib/python3.11/site-packages/montydb/storage/__init__.py", line 45, in delegate
    return getattr(delegator, attr)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jgeorge/miniconda3/envs/2025_atomate2_workshop/lib/python3.11/site-packages/montydb/storage/sqlite.py", line 450, in query
    docs = self._conn.read_all(self._col_path, max_scan)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jgeorge/miniconda3/envs/2025_atomate2_workshop/lib/python3.11/site-packages/montydb/storage/sqlite.py", line 178, in read_all
    with self._connect(db_file) as conn:
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jgeorge/miniconda3/envs/2025_atomate2_workshop/lib/python3.11/site-packages/montydb/storage/sqlite.py", line 126, in _connect
    self.__conn.executescript(self.db_pragmas + ";" + wcon_pragmas)
sqlite3.OperationalError: database is locked

I am very sure I installed the gp/pipelines branch and not the one for sqlite. I also updated to the latest montydb. The project check also runs without any errors. However, when I try to start the runner I get the error that the database is locked. When I try to submit a new job, I get the error above.

Do you see a reason why this could happen?

Thanks in advance.

2 replies

gpetretto Oct 8, 2025
Maintainer Author

Hi @JaGeo, thanks for testing this.
The default file storage option of MontyDB in MontyStore is sqlite, which explains why SQLite comes into play. I did not try the "lightning" option, but in any case a file-based version will have to respect the fact that the concurrency of the operations is limited. SQLite itself only allows a single process to write at a time (see the SQLite FAQ).
So, if the runner is updating the status of the DB, I see how you would get this error. I forgot to mention this in the initial post, but to mitigate this problem (and also to avoid concurrency between runner processes) it would be better to start the runner with the --single option. For a file-based version I would probably make this mandatory or at least print a warning.
In general it would probably be good to stop the runner when doing operations that need to write to the DB, although I have tried running the runner while submitting a job and did not encounter this issue.
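
The single-writer behaviour can be reproduced with Python's standard sqlite3 module alone. The following is only a sketch of the underlying SQLite limitation, not of what MontyDB or jobflow-remote do internally; the `jobs` table and the function name are hypothetical. One connection holds a write transaction while a second one, with no busy timeout, tries to write.

```python
import os
import sqlite3
import tempfile


def try_concurrent_write(db_path):
    """Return the error message raised when a second connection writes while
    the first still holds the write lock, or None if the write succeeds."""
    # timeout=0: fail immediately instead of waiting for the lock.
    # isolation_level=None: autocommit mode, so transactions are explicit.
    writer = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    other = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        writer.execute(
            "CREATE TABLE IF NOT EXISTS jobs (id INTEGER PRIMARY KEY, state TEXT)"
        )
        # Take the write lock and keep the transaction open.
        writer.execute("BEGIN IMMEDIATE")
        writer.execute("INSERT INTO jobs (state) VALUES ('READY')")
        try:
            other.execute("INSERT INTO jobs (state) VALUES ('READY')")
            return None
        except sqlite3.OperationalError as exc:
            return str(exc)
    finally:
        if writer.in_transaction:
            writer.execute("ROLLBACK")
        writer.close()
        other.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(try_concurrent_write(os.path.join(tmp, "queue.db")))
```

The second write fails with the same kind of sqlite3.OperationalError seen in the traceback above, which is why limiting the number of processes writing to the file matters.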

However, it is not entirely clear whether you actually managed to start the runner. When you refer to the error of the DB being locked, is it the same error from SQLite? Or is it from jobflow-remote not managing to put its own "lock" on a MongoDB collection?

JaGeo Oct 8, 2025

For the one with sqlite, I am able to start the runner; for the other one, I am not.
I will try the --single option and avoid querying the database while the job runs. However, this might cause some problems in practice. I will report back afterwards.

JaGeo · 2025-10-08T07:34:02Z

JaGeo
Oct 8, 2025

I was able to set up the sqlite version with a local worker and it seems to start simple additions. I had to install sqlite itself, of course, and also "sqlalchemy".

I ran into two errors: I cannot run "jf -p my_project project check --errors". This fails with an error.

And the DOWNLOADED job cannot be added to the MontyStore DB. It looks like the same error as above.

5 replies

gpetretto Oct 8, 2025
Maintainer Author

> I ran into two errors: I cannot run "jf -p my_project project check --errors". This fails with an error.

Is it always the same error with sqlite? Or do some of the checks fail?

> And the DOWNLOADED job cannot be added to the MontyStore DB. It looks like the same error as above.

So maybe even above the problem is with the jobstore DB and not the queue DB? Did you have multiple jobs running? Could it be that this is related to having multiple runner processes? Can you try starting the runner with --single?

Did the job have additional_data? I did not expect this to be an issue, but I think I did not test this case.

JaGeo Oct 8, 2025

@gpetretto The issue is connected to MontyDB, I believe. When using the sqlite version, the document cannot be put in the database. Thus the "Download" step fails.

Do I need a conditional install for montydb? Or is there something else I need to set up?

JaGeo Oct 21, 2025

@gpetretto any further suggestions here? I might have some time this week to further debug

gpetretto Oct 22, 2025
Maintainer Author

Hi @JaGeo, sorry for the delay. I have tried to reproduce this without much luck. I have tried to submit multiple small atomate2 workflows while querying the DB, but they all complete correctly (I have tried the "pipelines" version, but I suppose it will be the same, since this seems to be related to the output DB anyway).

One question would be: where are you running it? I have tried on my laptop. Are you maybe trying to run it on the cluster front-end? If that is the case, maybe the disk is on a network filesystem such as NFS and is noticeably slower, resulting in this kind of issue? In that case I can also try to do that, to check if I can trigger the problem.

Another point is that the MontyDB homepage suggests setting some configuration options for the SQLite backend, which do not seem to be set in the maggma MontyStore:
https://github.com/davidlatwe/montydb/?tab=readme-ov-file#-sqlite
https://github.com/materialsproject/maggma/blob/872cddda808d3d0523d3865cfa50cae90bec4113/src/maggma/stores/mongolike.py#L858
Maybe you can try setting the storage_kwargs and client_kwargs and see if this improves things? (I am actually running without those settings.)
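
As a sketch of why such settings might help, the example below uses only Python's standard sqlite3 module, not the MontyDB or maggma API. It assumes that the relevant options are of the kind SQLite exposes as pragmas, here WAL journal mode and a busy timeout; the `outputs` table and the function name are hypothetical. With a busy timeout set, a second writer waits for the lock instead of failing immediately.

```python
import os
import sqlite3
import tempfile
import threading
import time


def write_with_busy_timeout(db_path, hold_seconds=0.2, busy_timeout_ms=5000):
    """Write from two connections; the second waits for the first to commit.

    Returns the number of rows found in the table at the end.
    """
    setup = sqlite3.connect(db_path)
    setup.execute("PRAGMA journal_mode=WAL")  # readers no longer block the writer
    setup.execute("CREATE TABLE IF NOT EXISTS outputs (id INTEGER PRIMARY KEY, doc TEXT)")
    setup.commit()
    setup.close()

    holding = threading.Event()

    def slow_writer():
        # Stands in for a process holding the write lock for a while.
        conn = sqlite3.connect(db_path, isolation_level=None)
        conn.execute("BEGIN IMMEDIATE")
        conn.execute("INSERT INTO outputs (doc) VALUES ('first writer')")
        holding.set()
        time.sleep(hold_seconds)
        conn.execute("COMMIT")
        conn.close()

    thread = threading.Thread(target=slow_writer)
    thread.start()
    holding.wait()  # make sure the lock is held before the second write

    conn = sqlite3.connect(db_path, isolation_level=None)
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    # Retries internally until the lock is released or the timeout expires.
    conn.execute("INSERT INTO outputs (doc) VALUES ('second writer')")
    thread.join()
    count = conn.execute("SELECT COUNT(*) FROM outputs").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(write_with_busy_timeout(os.path.join(tmp, "outputs.db")))
```

Whether equivalent options passed through storage_kwargs and client_kwargs resolve the reported lock error is untested here; on a filesystem with unreliable file locking they may not be enough.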

JaGeo Oct 22, 2025

@gpetretto Thanks. I will try repeating the installation on a different server and then report back. There might be a problem with our file storage system that I did not anticipate.
