[Feature]: Better reading of RO-Crate metadata files #542

Open
kmexter opened this issue Nov 5, 2024 · 6 comments
Labels
enhancement New feature or request

Comments

@kmexter

kmexter commented Nov 5, 2024

Detailed Description

I recently had a meeting with @huberrob and colleagues, in the context of the FAIR-EASE project, to discuss our use of the FUJI fairness checker on our metadata "records", which are actually RO-Crate metadata JSON files. I ran some files (see list below) through the checker, and judging from the report I got back, they are perhaps not read quite properly by FUJI. I mentioned this at the meeting and Robert asked me to provide these RO-Crate examples for FUJI to check out:

https://github.com/arms-mbon/data_release_001/blob/main/ro-crate-metadata.json
https://github.com/arms-mbon/analysis_release_001/blob/main/ro-crate-metadata.json
https://github.com/arms-mbon/code_release_001/blob/main/ro-crate-metadata.json

We put considerable effort into adding machine-understandable metadata to these RO-Crates, so I think they should give a good assessment: we have identifiers, download URLs, licence, links to related entities, etc.
FYI The domain of these metadata is marine biology.

I know that there are 2 ways that I can specify the URL in the tool in order to get the best reporting, i.e.
https://raw.githubusercontent.com/arms-mbon/data_release_001/main/ro-crate-metadata.json vs https://github.com/arms-mbon/data_release_001/blob/main/ro-crate-metadata.json
It would be nice to have a recommendation as to which is the better approach (since the first one gave a better result, I am guessing it is that one)

Context

If you can modify FUJI to work better on ro-crates, we could trust the results better. We would also value any feedback from you on our ro-crates!

Possible Implementation

@huberrob
Contributor

huberrob commented Nov 7, 2024

Just a quick first comment: only the https://raw.githubusercontent.com style URLs point to JSON content. URLs without the 'raw' subdomain deliver HTML content in which the JSON is displayed in a textarea tag, so F-UJI would not find it.
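
For anyone who only has the "blob" style link at hand, here is a minimal sketch of rewriting it to the raw form before submitting it to the checker. The function name `github_blob_to_raw` is made up for illustration, and the sketch assumes the usual `github.com/<owner>/<repo>/blob/<ref>/<path>` layout; it is not part of F-UJI.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Rewrite a github.com 'blob' URL to its raw.githubusercontent.com form.

    Hypothetical helper: URLs that do not match the blob layout are
    returned unchanged.
    """
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected path layout: <owner>/<repo>/blob/<ref>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        return url
    owner, repo, _blob, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


if __name__ == "__main__":
    print(github_blob_to_raw(
        "https://github.com/arms-mbon/data_release_001/blob/main/ro-crate-metadata.json"
    ))
```

Note that a branch name containing a slash would still be rewritten correctly here, because everything after `blob/` is copied over as-is.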

@mpo-vliz

mpo-vliz commented Nov 7, 2024

I know that there are 2 ways that I can specify the URL in the tool in order to get the best reporting, i.e.
https://raw.githubusercontent.com/arms-mbon/data_release_001/main/ro-crate-metadata.json vs https://github.com/arms-mbon/data_release_001/blob/main/ro-crate-metadata.json
It would be nice to have a recommendation as to which is the better approach (since the first one gave a better result, I am guessing it is that one)

There is a third way, or rather, IMHO, the "correct" way.

My own recommendation is not to use either of those refs. We need to stop using GitHub-based URIs in these cases, mostly because they are incidental to where the data currently happens to be hosted, not at all core to the 'identity' of these data, and definitely not a dependency we want to be tied to. We made an effort to organize the data.arms-mbon.org domain, so the correct way to publicly refer to that RO-Crate is https://data.arms-mbon.org/data_release_001/

From there, possible redirects plus a final HTML-embedded, FAIR-Signposting-conformant link (rel=describedby) should lead one to the metadata.json file in question.

@cedricdcc

Quick side note: the page you are describing here, https://data.arms-mbon.org/data_release_001/, is a collection of versions of an RO-Crate. These don't have LOD yet but will in the future. The page that has embedded FAIR Signposting is https://data.arms-mbon.org/data_release_001/latest/, which is the latest version of the data_release_001 RO-Crate.


@huberrob
Contributor

huberrob commented Nov 8, 2024

True, at least F-UJI would follow the describedby links to locate the metadata.
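
To make the describedby mechanism concrete, here is a rough sketch of how a client could pull such links out of a landing page's HTML. It is not F-UJI's actual implementation; the class and function names and the sample markup are invented for illustration, and it only covers HTML `<link>` elements, not HTTP `Link` headers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class DescribedByFinder(HTMLParser):
    """Collect href targets of <link rel="describedby"> elements."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated relation names
        rels = (attr.get("rel") or "").lower().split()
        if "describedby" in rels and attr.get("href"):
            # Resolve relative hrefs against the landing page URL
            self.targets.append(urljoin(self.base_url, attr["href"]))


def find_describedby(html: str, base_url: str) -> list:
    finder = DescribedByFinder(base_url)
    finder.feed(html)
    return finder.targets


if __name__ == "__main__":
    # Hypothetical landing page markup, not copied from any real site
    sample = (
        '<html><head>'
        '<link rel="describedby" href="./ro-crate-metadata.json" '
        'type="application/ld+json">'
        '</head><body></body></html>'
    )
    print(find_describedby(sample, "https://example.org/crate/latest/"))
```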

@huberrob
Contributor

huberrob commented Nov 8, 2024

Maybe you could replace the fileFormat values with correct mime types?
Not sure how this is handled in the RO-Crate namespace, but as far as I know fileFormat has been replaced in schema.org by encodingFormat.

@mpo-vliz

mpo-vliz commented Nov 8, 2024

I see the RO-Crate context provides for both encodingFormat and fileFormat.

And the spec recommendation is to use encodingFormat to hold the mime types
https://www.researchobject.org/ro-crate/specification/1.2-DRAFT/data-entities.html#encoding-file-paths
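
As a rough illustration of that recommendation, the sketch below fills in encodingFormat on a file entity by guessing a MIME type from the file name in its @id. The helper name `add_encoding_format` and the sample entity are hypothetical, the guess relies on Python's standard `mimetypes` table (so it should be reviewed by hand), and any existing fileFormat value is left untouched.

```python
import mimetypes


def add_encoding_format(entity: dict) -> dict:
    """Return a copy of an entity with encodingFormat guessed from its @id.

    Hypothetical helper: an existing encodingFormat is never overwritten,
    and nothing is added when no MIME type can be guessed.
    """
    updated = dict(entity)
    if "encodingFormat" in updated:
        return updated
    mime, _encoding = mimetypes.guess_type(updated.get("@id", ""))
    if mime:
        updated["encodingFormat"] = mime
    return updated


if __name__ == "__main__":
    # Made-up entity for illustration only
    entity = {"@id": "notes.txt", "@type": "File", "name": "Sampling notes"}
    print(add_encoding_format(entity))
```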


4 participants