Collections endpoint #386
base: develop
Conversation
Initial work preparing Collections endpoint
optimade.rst
Outdated
- The keys should be short strings describing the type of metadata being supplied.
- The values can be any string, which may be human-readable.

aggregated_fields
Maybe properties instead of fields, or am I missing something?
I think I need an example to see what this field is used for, this just aggregates field/property names but not values? Does every entry have to have a value of each field listed here?
I had only said "fields" (or "properties" seems fine too) instead of also including values because:
- I wasn't sure what the best way would be to specify the reduction operations over the values (for example, sometimes you might want to sum the values, other times you might want to build a set from a list of values, etc.)
- I wasn't sure when the values should be reduced. Should the reduction occur before the collection is uploaded, meaning the reduced values wouldn't change even if the linked entries were edited? This wouldn't seem ideal, but I also don't know if it's acceptable (or possible) to specify that the reduction would be performed every time the collection was accessed.
A basic example of this, which we've been using for the OpenKIM/ColabFit project, is to have a "StructuresCollection" that aggregates all of the attributes.elements fields of the linked structures to provide a single set of elements present in the collection. Something like structure1.attributes.elements = ['C', 'Fe'], structure2.attributes.elements = ['Al'], collection.attributes.elements = ['Al', 'C', 'Fe']. Another simple example would be to aggregate attributes.nsites to count the total number of sites in the collection.
Does every entry have to have a value of each field listed here?
Though it's a bit restrictive, I think that I'd say yes, every entry should have a value for each of the aggregated fields. I think that a collection should be assumed to be homogeneous, but perhaps that could use some discussion.
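As a hedged illustration of the aggregation described above (the field names follow the example; the choice of reductions is an assumption for this sketch, not anything standardized):

```python
# Illustrative sketch of aggregating fields over a collection's entries.
# The reductions (set-union for elements, sum for nsites) are assumptions
# for this example, not part of any OPTIMADE standard.
structure1 = {"elements": ["C", "Fe"], "nsites": 8}
structure2 = {"elements": ["Al"], "nsites": 4}

def aggregate(entries):
    """Reduce selected fields over all entries in a collection."""
    elements = sorted(set().union(*(e["elements"] for e in entries)))
    nsites = sum(e["nsites"] for e in entries)
    return {"elements": elements, "nsites": nsites}

collection = aggregate([structure1, structure2])
# collection == {"elements": ["Al", "C", "Fe"], "nsites": 12}
```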
I think you should update the collection every time one of the entries gets updated.
If entries get updated regularly, you could place relationships in each entry pointing to the collections they belong to. If this rarely happens, you could probably query all the collections to check whether a particular entry is in that collection and then update it. We don't have to specify how to update the data belonging to the collections in the specification though, only that it SHOULD be updated.
In some cases it could however be worthwhile to create a new structure rather than to update the existing one. For example, when you want a collection you refer to in an article to stay the same.
Perhaps you could make a dictionary for each OPTIMADE property, which could, depending on the property, hold a set or the minimum, average, and maximum value in the collection.
When making the properties for these collections I think it would be good to think about how you would search for collections.
The number of entries in your collection would probably also be a good property to include.
There is also the info endpoint, where you can specify which properties are shared for each endpoint.
For collections, it would be /info/collections. You therefore do not have to specify which properties are available for the collections. (I do have a field like that in the trajectories endpoint because in that case the fields do not need to be queryable.) If they are queryable you could use the IS KNOWN query to check whether an entry has the particular field.
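The per-property summary dictionary suggested above might look like the following sketch; the split between numeric (min/average/max) and list-valued (set) properties is an assumption about how one could classify them:

```python
# Hypothetical per-property summary for a collection: numeric properties
# get min/average/max, list-valued ones are reduced to a sorted set.
def summarize(values):
    if all(isinstance(v, (int, float)) for v in values):
        return {"min": min(values), "max": max(values),
                "average": sum(values) / len(values)}
    # flatten list-valued properties into a single set
    flat = set()
    for v in values:
        flat.update(v if isinstance(v, list) else [v])
    return {"set": sorted(flat)}

nsites_summary = summarize([8, 4, 12])              # numeric, e.g. nsites
elements_summary = summarize([["C", "Fe"], ["Al"]])  # list-valued, e.g. elements
```

Such summaries would also make collections searchable by, e.g., minimum or maximum values.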
I have difficulty seeing the utility of aggregated_fields being an OPTIMADE standardized property. There is of course no problem for OpenKIM serving, e.g., an _openkim_aggregated_elements that aggregates the values of the elements, etc.; but it just seems the definition means this field anyway needs to be interpreted differently depending on which database is being queried.
Thanks for kicking this off @jvita! I invited you to the Materials-Consortia GitHub organization - lmk if it didn't work.
My comments here echo those in #360 - it would be good to have a general discussion about the preferred way of dealing with relationships to more complicated entry types (like files and collections). I guess the overall aim here is to provide a simple container for any entry types that averts the need to provide all-to-all relationships between a set of related entries, so it might be nice to mock up some example responses that showcase this.
optimade.rst
Outdated
**Collection**
A Collection defines a relationship between a group of Entry resources. A Collection can be used to store metadata that applies to all of the entries in the group, and to aggregate metadata from each entry in the group.
I'm not sure I would define collections here, as the list is primarily external definitions of terminology that we are going to use in OPTIMADE, not terminology we are really defining ourselves (e.g. structure is not in this list).
optimade.rst
Outdated
- **Query**: support for queries on this property is OPTIONAL. If supported, only a subset of the filter features MAY be supported.
- The keys should be short strings describing the type of metadata being supplied.
- The values can be any string, which may be human-readable.
I would prefer to go down the route of some defined (optional) fields for e.g. description, name, then let provider-specific fields do the rest of the work (which can then be described in /info/collections), e.g. _exmpl_dft_parameters if the collection defines a consistent set of DFT calculations.
optimade.rst
Outdated
- The keys should be short strings describing the type of metadata being supplied.
- The values can be any string, which may be human-readable.

aggregated_fields
Thank you for starting this PR!
Sorry for giving you this feedback so late. I did not realize my comments were still pending.
optimade.rst
Outdated
- The keys should be short strings describing the type of metadata being supplied.
- The values can be any string, which may be human-readable.

aggregated_fields
Sorry I haven't responded to this in a while, I've been a bit overwhelmed with other projects and haven't found the time to work on this PR.
I've been struggling with deciding what the best approach to this would be. It seems like it would be difficult and/or expensive to make sure to update everything whenever an entry is changed, especially in the case where you have collections of collections. An alternative could be to not automatically update a collection when one of its entries changes, but to instead provide some kind of functionality to "re-synchronize" the collection to aggregate any desired properties up to the collection. This option would mean the collection isn't guaranteed to be up-to-date unless it's re-synchronized regularly, but it would also avoid running the potentially expensive task of updating everything on every change.
I'll try to add an example soon. We (the OpenKIM team) are still working on ironing out our data structures, so I've been holding off on providing an example until I have one that would be representative of our use case.
I think that it is most important that the database as presented to the outside world is consistent. So I think you could create a backlog with all the changes that need to be made to the database and execute these at an appropriate time. For example when the server load is low. This would allow you to accumulate multiple changes, so you would only have to update a collection once. I am not sure whether it will be that much work to update a collection though. In most cases you will not need to access the other entries in a collection. Therefore, I am not sure if it is necessary to implement a log of all the changes that need to be executed.
I like your idea of creating a backlog of pending changes that can be applied at chosen moments rather than automatically. We have implemented something similar in our software package, where our collections have a boolean flag that tracks whether any of their entries have been changed, so that they can apply those changes before doing any critical operations (like saving to a file, or providing an aggregated statistic).
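A minimal sketch of such a dirty-flag scheme (all class and method names here are hypothetical, not taken from any actual package):

```python
# Hypothetical lazy re-synchronization: the collection only re-aggregates
# when it has been marked dirty by a change to one of its entries.
class Collection:
    def __init__(self, entries):
        self.entries = entries
        self._aggregated = None
        self.dirty = True  # set whenever a member entry changes

    def mark_dirty(self):
        self.dirty = True

    def aggregated_elements(self):
        # re-synchronize lazily, just before a critical operation
        if self.dirty:
            self._aggregated = sorted(
                set().union(*(e["elements"] for e in self.entries)))
            self.dirty = False
        return self._aggregated

c = Collection([{"elements": ["C", "Fe"]}, {"elements": ["Al"]}])
c.aggregated_elements()            # ['Al', 'C', 'Fe']
c.entries.append({"elements": ["O"]})
c.mark_dirty()                     # entry changed; flag the collection
c.aggregated_elements()            # ['Al', 'C', 'Fe', 'O']
```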
I agree with you that updating a single collection with multiple changes might not be that bad, but what I'm also worried about is having to update many collections that all share a common entry (or equivalently, nested collections). In this case you would need to propagate the changes to all collections containing the changed entry, and to all collections containing those collections, and so on and so forth... In this way, for a single changed entry the number of collections that need to be updated may grow exponentially, and it could become a computationally demanding task to keep the database synchronized. Here is a basic example that I think will be useful for these discussions. Perhaps we can edit this example as we go:
Here is a second example if collections are allowed to be non-homogeneous (i.e., not all entries are required to be the same). Again, I think we should discuss if we want this to be allowed:
Another option could be to have “two” databases. Each time you want to do an update on the database, you can perform the update on a copy. Once you finish updating, you switch the database you serve. If you include a version number in the pagination link, you can inform the clients that the database has been updated and the result they get may not be consistent.
It depends on what kind of aggregated properties you have, but for the aggregated properties I can think of, you would not need to visit the members of other collections when you update one of the collections in a super collection. So unless you have more than 10,000 or so collections in which the changed entry occurs, I do not think it would be a problem to update one entry and all the collections it is in.
In the example above, you have placed the data ('A': {'x': 0, 'y': 0}) inside your collection. This looks a bit strange, as normally your collection would contain a reference to the structure. I would discourage not aggregating all the fields, as you do in the second example, since the aggregated fields are then inconsistent: a user could look for a collection with 'y': 3 but would not find it.
I think the collection proposal is sound. I've thumbed up a few comments by others which I agree with (and it would be great if @jvita can address them in jvita:collections_endpoint).
One thing I also want to discuss is, in addition to the proposal to remove additional_metadata and aggregated_fields from the PR, would it make sense to add a category string property to make it easier to think of different types of collections? We're probably going to end up with collections describing very different things, e.g., sets of structures because they are part of one investigation, sets of structures that are brought together because they are structurally similar, sets of calculations done as part of a project, etc. However, I realize that I can achieve the same thing by making collections of collections - but is that ok, or does it get too abstract? And how does a user find all "collection types" there are to ask for?
(One idea is to let the standard propose, at least on SHOULD level, a specialized collection with a specific title of collections of collections that the database wants users to be aware of. This is getting quite abstract, though.)
optimade.rst
Outdated
- The keys should be short strings describing the type of metadata being supplied.
- The values can be any string, which may be human-readable.

aggregated_fields
Thanks again for the work with this. I'm seeing some need of collections in my work and would like this PR to be added. I see that the PR has been idling for a while (and I very much understand being busy with other things). Do you plan to address the concerns eventually, or would you be ok with me pushing commits to address the outstanding issues?
Hi @rartino; please feel free to make any changes that you wish -- I'm not sure that I'll be able to dedicate much time to this right now, unfortunately. Thanks!
As this PR was made from @jvita's fork, perhaps we should remake it in a branch that others have access to and address the initial comments above? I'm happy to make the PR but can't really commit to owning the PR overall right now (maybe in a few weeks).
No need to remake the branch - if someone clones a GitHub repo and then creates a PR against it from a branch, the default is to allow the repo owners of the original repo push-access to that branch for precisely this reason. I've pushed the edits I wanted to do. Please review.
Great! I must have been misremembering why we had to add collaborator access on forks outside of the org in the past, guess it was probably before the PR had been made. |
I think the (draft) proposal presents a sufficiently low-level method to represent groups of OPTIMADE entries. I left a few comments which I would like to see answered before merge.
I seemingly misunderstood the whole issue with relationship naming. I also thought the JSON API had some requirements for them (apparently it does not). So now I envisage three suggestions:
I think we first need to clarify this ambiguity of JSON API relationship naming in general... The most straightforward solution seems to be to edit 5.3.2 Entry Listing JSON Response Schema and, for the definition of the "relationships" field, say that: "Relationships MUST be grouped into a single JSON API relationships object for each entry type and placed in the relationships dictionary with that entry type name as key (e.g., "structures")." If we rather want to preserve the JSON API relationship "name" as a means to group related things - possibly of different types - together, this feature of a 'relationship name' needs to be added to the output-format-agnostic description of relationships in 3.6 Relationships, preferably with some hint in the implementers note about how one can encode these names in more straightforward key-value based output formats (e.g. "type name + '_' + relationship name"?). It is probably also a good idea to clarify in 5.3.2 that the key in this dictionary field represents the relationship name.
The most straightforward solution above means going with this choice.
Some specific comments below, and one general comment here:
It would be good to have examples of the two-step approach of querying collections (following the example of references elsewhere in the spec), e.g., find the collection ID and then do /v1/structures?filter=collections.id = "42".
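A sketch of how a client might build that two-step query (the base URL is hypothetical; the filter string follows the example above):

```python
# Build the step-2 request URL: fetch the structures belonging to a
# collection whose id was found in step 1. Base URL is illustrative.
from urllib.parse import quote

base = "https://example.com/v1"

def members_url(collection_id):
    filt = f'collections.id = "{collection_id}"'
    return f"{base}/structures?filter={quote(filt)}"

members_url("42")
# 'https://example.com/v1/structures?filter=collections.id%20%3D%20%2242%22'
```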
Re: my comment about included below, I drafted an issue (probably years ago now) about our non-compliance with JSON:API relationships (but never got around to posting it). The Fetching relationships part of the JSON:API spec states that we should also serve relationships at e.g., https://example.com/v1/collections/42/relationships/structures. This would then be a normal entry listing endpoint that could be paginated. For such one-to-many relationships this is probably preferable to using included, which is primarily for condensing many-to-one relationships into a single response (e.g., all structures pointing at the same reference).
This is actually part of a bigger non-compliance, as JSON:API mandates in Fetching resources that you should be able to pick out particular attributes of an entry via e.g., https://example.com/v1/structures/123/elements. This would break our allowed IDs (which can have slashes, as I currently do with odbx/1... hence why I never posted the issue 😅). Looking at my draft (which I am happy to post), my suggestion was to just add a note in the specification that this aspect of JSON:API is not supported, but maybe it is now actually required for collections functionality to work properly...
It's also the case that very few implementations are serving relationships at all at the moment... and support for the include query parameter/included is even lower.
The set of entries that belong to the collection is defined using relationships from the collection to each entry (OPTIMADE relationships are defined in `Relationships`_).
A collection can contain other collections.
Furthermore, implementations are suggested to add database-specific properties for additional metadata they want to store about the collections.
An OPTIMADE response representing a collection with all referenced entries included via the JSON API field :field:`included` (or equivalent in other response formats) can be used as a universal format for storage or transfer of a subset of (or all) data in an OPTIMADE database.
I have some concerns about pagination of included values, in the case of e.g., 100,000 structures in the same collection. Do we need to worry about that? included is only an optional field at the moment.
included is mentioned here only as a suggestion of the potential use of a collection as an export format - in which case the whole idea would be to put everything you want to export (e.g., all 100'000 structures) in the same stream. Nothing is said here that indicates mandatory support for included?
I thought we intended for clients to generally just get the list of ids from the relationship and then request entry data by further queries using the endpoint + id format. (Or, for efficiency when there are many, perhaps via filter=id=X OR id=Y OR id=Z OR ...)
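For the id-list fallback, the filter string could be assembled client-side along these lines (a sketch, nothing standardized about it):

```python
# Build an OPTIMADE filter that matches any of the given entry ids,
# as in the filter=id=X OR id=Y OR id=Z pattern mentioned above.
def id_filter(ids):
    return " OR ".join(f'id="{i}"' for i in ids)

id_filter(["X", "Y", "Z"])  # 'id="X" OR id="Y" OR id="Z"'
```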
I understand that this is the intention, but I am a bit nervous that we have a field that can now grow unboundedly in a response if requested (or even if not, you cannot disable relationships from the response, I don't think?). I guess you could argue the same for a 1 million site structure object, but here it feels well within the design that even the list of IDs could be very large.
I think the larger comment now addressed in #420 would be the best mechanism around this (if we are going to support it anyway).
I am a bit nervous that we have a field that can now grow unboundedly in a response if requested (or even if not, you cannot disable relationships from the response, I don't think?)
Are you talking about included or relationships now? Both of them can grow to arbitrary size - although we are talking extremely large collections before relationships becomes unduly large.
Possibly repeating myself a bit, but to be clear: I don't see a problem with included. Implementations probably should avoid it except as recommended in OPTIMADE for references, unless a client somehow explicitly requests it (which we don't have a standard mechanism for yet). If an implementation decides to include included anyway, while simultaneously having "unboundedly large" relationships, it would be silly to not implement a limit on the number of entries included this way.
The situation is more tricky with huge relationships. I think JSON:API silently is built on the assumption that the list of IDs for all relationships of a resource must be small enough to handle in a single request. Sure, one can use the articles/1/relationships/comments syntax to get something paginated, but how does one know in the first place which JSON:API relationship keys to fetch without first fetching the unboundedly large articles/1?
Hence, I think we have to look at this list of IDs as a single "datum" where our default OPTIMADE JSON:API output format isn't equipped to handle arbitrarily large data. This is echoed by our recommendation for other output formats to simply encode relationships alongside other data.
If we are concerned about this limitation, I don't see any other way to address it than to implement an alternative output format that can handle pagination on individual properties, including the relationships.
Possibly repeating myself a bit, but to be clear: I don't see a problem with included. Implementations probably should avoid it except as recommended in OPTIMADE for references, unless a client somehow explicitly requests it (which we don't have a standard mechanism for yet). If an implementation decides to include included anyway, while simultaneously having "unboundedly large" relationships, it would be silly to not implement a limit on the number of entries included this way.
Understood
The situation is more tricky with huge relationships. I think JSON:API silently is built on the assumption that the list of IDs for all relationships of a resource must be small enough to handle in a single request.
I think my concern is that some of the intended use cases for collections might cross this boundary already (would 10,000/100,000 IDs to structures that define a training set break this?). I'm also not sure that relationships can be excluded from the request using response_fields, so you can't even hit /collections to browse/filter them without getting these potentially large responses. I understand that this is already the case with e.g., COD's mythical 100k atom structure, but at least you could choose which fields you wanted to be returned!
Sure, one can use the articles/1/relationships/comments syntax to get something paginated, but how does one know in the first place which JSON:API relationship keys to fetch without first fetching the unboundedly large articles/1?
I'm leaning towards this being the correct approach. The relationships can be included as a self link to articles/1/relationships/comments rather than as a data key, which I think solves your problem. Perhaps we could say something like: "It is RECOMMENDED that implementations use self links instead of explicit relationships for collections with a number of entries significantly in excess of the implementation's page limit."
If we are concerned about this limitation, I don't see any other way to address it than to implement an alternative output format that can handle pagination on individual properties, including the relationships.
Let's see how the discussion in #419 goes...
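A sketch (as a Python dict standing in for the JSON) of what such a self-link-only relationship object could look like; the URLs are illustrative, not prescribed anywhere:

```python
# Hypothetical collections resource whose "structures" relationship carries
# only a link to a paginated relationship endpoint, with no "data" id list.
large_collection = {
    "id": "42",
    "type": "collections",
    "relationships": {
        "structures": {
            "links": {
                # JSON:API relationship self link; this endpoint can paginate
                "self": "https://example.com/v1/collections/42/relationships/structures"
            }
            # no "data" key: the member id list would be unboundedly large
        }
    },
}
```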
The relationships can be included as a self link to articles/1/relationships/comments rather than as a data key, which I think solves your problem. Perhaps we could say something like: "It is RECOMMENDED that implementations use self links instead of explicit relationships for collections with a number of entries significantly in excess of the implementation's page limit."
You seem to be right - excluding the data key if it contains too many entries, and instead relying on a link that returns a paginated result set of the entries, is a straightforward solution. So, if we try to follow the JSON API examples, I think this link should point to, e.g.: /collections/42/structures, and in reference to the ongoing discussion in #420 this would thus be the first "required" use of third-level URLs.
Hmm... Reading Section 1 I'd say that strictly speaking, this construct is indeed required to be supported: "The API specification described in this document builds on top of the JSON API v1.0 specification. In particular, the JSON API specification is assumed to apply wherever it is stricter than what is formulated in this document. Exceptions to this rule are stated explicitly (e.g. non-compliant responses are tolerated if a non-standard response format is explicitly requested)."
This is quite something to drop on this poor PR :-), and IMO a very important issue, but with very little to do with the acceptance or not of this PR, given that we already generally allow relationships - it should just be posted as a general issue. Nevertheless, strictly formally, based on the quote above, I think the implication is that we MUST support this. How your implementation works out the separation between your "/"-based IDs and these fields is up to you, but you can certainly do it by, e.g., relying on your ids having a fixed format so you know the first slash is always part of the id. That said, the realization here is that one probably should avoid "/" in IDs... (Very helpful of us to give examples showing "/"-based ids though.)
Co-authored-by: Matthew Evans <[email protected]>
Indeed, I'll dig out my draft issue and post it. I think I mentioned it at a meeting once but perhaps I did not explain myself well.
I don't think a discussion of #420 is blocking this PR, if we are happy with the format otherwise.
Just another quick thought that there is some enforced directionality to the collections relationship, in that each member of the collection does not have a relationship back to its collection. If it did, you would be able to do
The suggested direction of the mono-directional relationships could also be reversed, i.e. instead of
Just to throw in my two cents... In our work with ColabFit we originally started off only requiring mono-directional links. So to me, the question of whether to use mono-/bi-directional links, and in which direction if you're only doing mono-directional, is just too application-dependent. Each application would want to make a different choice depending on 1) what types of queries they want to support, and 2) what direction information needs to flow if they're doing data aggregation. My suggestion:
Personally, option 1 seems better to me. Though I can't think of an example off the top of my head, I could imagine that there might be some application where it's easier for the data provider to maintain one direction than it is to maintain the other, in which case option 2 would be overly restrictive.
The way I edited the PR, it is meant to say that the direction ... I don't like the idea of exposing clients to the possibility of databases providing only ... With enforced ... Even if your database deep down encodes collections as ...
@ml-evs I was about to say that with #438 merged this should be good to go. But, I realize your concerns about very large collections aren't resolved yet. It seems issue #428, created in response to the need to share very large sets as part of trajectories in #377, should be relevant here; but I don't see how it can be directly applied to JSON:API relationships. What do you think: should we re-edit collections to not use JSON:API
Using a ... it would also solve the included problem. If we used the normal JSON:API ... Instead, the entries in the collection could be retrieved via a query that checks if the id of an entry is in a list of ids supplied by the client. This would use the same paging mechanism that we already use. A disadvantage could however be that there may be a maximum size for the query parameter (depending on the software and settings, it could be as small as 1024 B), so the number of entries per query could be limited. An alternative solution would be to make bidirectional associations and query the individual entries for membership. So in summary, I think it would be better to use a separate variable for storing the members, as this gives more flexibility to handle them.
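The query-parameter size concern could be handled client-side by batching the id list, roughly like this (the 1024-byte budget and id formats are illustrative assumptions):

```python
# Split a long member-id list into several OPTIMADE filter strings, each
# kept under a length budget, so no single request exceeds a server limit.
def batched_filters(ids, max_len=1024):
    batches, current = [], []
    for i in ids:
        candidate = current + [i]
        filt = " OR ".join(f'id="{x}"' for x in candidate)
        if current and len(filt) > max_len:
            # flush the current batch and start a new one with this id
            batches.append(" OR ".join(f'id="{x}"' for x in current))
            current = [i]
        else:
            current = candidate
    if current:
        batches.append(" OR ".join(f'id="{x}"' for x in current))
    return batches

filters = batched_filters([f"entry-{n}" for n in range(200)], max_len=256)
# every filter stays within the budget and together they cover all 200 ids
```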
To try to stay on point, I do not think we need to solve the "included problem" to add collections to the specification. The specification clearly allows an implementation not to ever include anything in a
It is specifically encouraged for references because it is a sensitive issue to include data without sourcing it. That only applies to references. I'd be happy with implementations never using or supporting
I was trying to force MUST level support for POST in OPTIMADE using the exact same semantics as for GET, to avoid issues like this. I'm sad that was rejected. Nevertheless, a client can work around the limitations by splitting up its request, so the issue can be worked around. I think that is good enough to accept the proposed version of collections.
I am very strongly opposed to a MUST level requirement for bi-directional collections (and, if clients cannot rely on it being there, i.e, a MUST, then it is not very useful). In the httk implementation we generally want to view stored data as immutable, and hence adding something existing to a collection would need some kind of side storage / querying tricks to handle the bi-directionality.
Right, so, let's get the solution of #428 implemented and merged, and then use it here.
This is yet another argument in favor of supporting POST (#114). It may be worth revisiting the discussion there.
After discussions: we realize that the partial data format in OPTIMADE may not cover the case of a relationship list too long to fit in a response. So we have to do one of:
Some pros and cons of having the collection members in the data part, from my PoV. Pros:
Cons:
Starting point for discussing a general "Collections" entry