Deserializing a whole OPTIMADE API as a file #471

Open
ml-evs opened this issue Jun 12, 2023 · 2 comments
Labels
status/has-concrete-suggestion This issue has one or more concrete suggestions spelled out that can be brought up for consensus. type/proposal Proposal for addition/removal of features. May need broad discussion to reach consensus.

Comments

@ml-evs
Member

ml-evs commented Jun 12, 2023

At the workshop, we discussed a file format specification that would allow an entire OPTIMADE API to be stored in a single file on disk. This would be useful for many applications (archival + restore, data transfer, local tools/exploration, caching etc.). We (@unkcpz, @eimrek, @giovannipizzi) proposed a draft of this file format that is approximately the following:

  • We use the JSON lines format: every line in the file is a valid JSON object, separated by newlines.
  • First entry must be a dictionary with the key x-optimade-api-version with value e.g., 1.2.0.
  • Then comes any other metadata under meta
  • Then the corresponding info endpoint data for the entry types in the file (see #470, "Entry info endpoints have no id, type, or attributes", for a problem with this)
  • Then each line contains an entry from the corresponding endpoints

optimade_example.jsonl:

{"x-optimade": {...}}
{"meta": {"api_version": "1.1.0"}}
{"data": {"type": "info", "attributes": {"provider": {...}}}}
{"info/structures": {"properties": {"_aiida_cell_volume": {"type": "number", ...}}}}
{"info/references": {...}}
{"data": {"type": "structures", "id": "1234", "attributes": {...}}}
{"data": {"type": "structures", "id": "1235", "attributes": {...}}}
{"data": {"type": "references", "id": "sfdas", "attributes": {...}}}

If there is interest, we will try to write this up as an appendix of the specification.
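
As a rough illustration of how a client might consume such a file, here is a minimal Python sketch. It assumes only what the draft above states: one JSON object per line, the first line acting as a header, and entries appearing as objects with a "data" key. The function name `read_optimade_jsonl`, the grouping of results, and the sample values are hypothetical and not part of any specification.

```python
import io
import json
from collections import defaultdict


def read_optimade_jsonl(stream):
    """Split a draft OPTIMADE JSON Lines stream into header, preamble, and entries.

    Assumption (not from any spec): every line after the header that is not
    an entry (meta, info, info/<type>) is kept as-is in `preamble`.
    """
    header = None
    preamble = []
    entries = defaultdict(list)  # entry type -> list of entry objects
    for lineno, line in enumerate(stream, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {lineno}: expected a JSON object")
        if header is None:
            header = record  # the first object in the file is the header
            continue
        data = record.get("data")
        if isinstance(data, dict) and data.get("type") not in (None, "info"):
            entries[data["type"]].append(data)
        else:
            preamble.append(record)
    return header, preamble, dict(entries)


# Hypothetical sample; attribute contents are placeholders.
sample = io.StringIO(
    '{"x-optimade": {}}\n'
    '{"meta": {"api_version": "1.1.0"}}\n'
    '{"data": {"type": "info", "attributes": {}}}\n'
    '{"data": {"type": "structures", "id": "1234", "attributes": {}}}\n'
    '{"data": {"type": "references", "id": "sfdas", "attributes": {}}}\n'
)
header, preamble, entries = read_optimade_jsonl(sample)
```

Reading line by line keeps memory use proportional to what the caller retains, which is one of the practical attractions of JSON Lines for large databases.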

@ml-evs ml-evs self-assigned this Jun 12, 2023
@ml-evs ml-evs added the type/proposal and status/has-concrete-suggestion labels Jun 12, 2023
@merkys
Member

merkys commented Jun 13, 2023

I had a somewhat similar idea when exploring how to store OPTIMADE responses, or even serve them statically. I was thinking more about a file tree layout than a single file, but that layer could be handled by tar and similar tools.

My principles were the following:

  • Entry listing endpoints become directories (/structures -> structures/)
  • Single entries become JSON files (/structures/123 -> structures/123.json)
  • If needed, the same extends to the introspection (/info) endpoint as well
  • If needed, entry listings can also be stored as JSON files (/structures -> structures.json) alongside the directories with the same name

A very simple server can then be developed to serve these static files.
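
One possible reading of the mapping above, sketched in Python. The function name `endpoint_to_file` is hypothetical, and treating /info the same way as entry endpoints (so that /info/structures becomes info/structures.json) is an assumption rather than something stated in this thread.

```python
from pathlib import PurePosixPath


def endpoint_to_file(url_path: str) -> PurePosixPath:
    """Map an OPTIMADE endpoint path to a relative JSON file path.

    /structures      -> structures.json      (entry listing stored as a file)
    /structures/123  -> structures/123.json  (single entry)
    """
    parts = [p for p in url_path.split("/") if p]
    if not 1 <= len(parts) <= 2:
        raise ValueError(f"unsupported endpoint path: {url_path!r}")
    if any(p in (".", "..") for p in parts):
        # Refuse path traversal, since a static server would use this mapping.
        raise ValueError(f"unsafe endpoint path: {url_path!r}")
    return PurePosixPath(*parts[:-1], parts[-1] + ".json")
```

A static file server would only need to apply this mapping to the request path and return the file contents, if present, as the JSON response.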

@ml-evs
Member Author

ml-evs commented Jul 3, 2024

Just a note that we adopted a very similar format to this in https://github.com/materialscloud-org/optimade-maker, though the OP is slightly out of date.

I can see the benefits of allowing both a single file and a directory structure (suggested above by @merkys). This could be captured in the top line of the root file, indicating whether the "client" should read the following lines of the file to find the data, or look elsewhere in the filesystem.
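
Purely as an illustration of that idea, a client could branch on a field in the header line. The "layout" key and its values "inline" and "directory" are invented for this sketch; nothing in this thread defines them, and the name `data_location` is likewise hypothetical.

```python
import json


def data_location(first_line: str) -> str:
    """Return where the client should look for data, based on the header line.

    Hypothetical convention: {"x-optimade": {"layout": "inline" | "directory"}}.
    A missing "layout" key is treated as "inline" (data follows in the same file).
    """
    header = json.loads(first_line)
    settings = header.get("x-optimade") or {}
    layout = settings.get("layout", "inline")  # hypothetical key, not in any spec
    if layout not in ("inline", "directory"):
        raise ValueError(f"unknown layout: {layout!r}")
    return layout
```

Defaulting to "inline" would keep existing single-file exports readable without changes, which is why the sketch treats the field as optional.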
