Generic Tabular transformation #184

Open
lucasalbertins opened this issue Feb 8, 2023 · 3 comments

Comments

@lucasalbertins

lucasalbertins commented Feb 8, 2023

Hi,

I am trying to build a generic tabular structure in RDF that I could use to load CSV and Excel files. I've created triples to represent rows, columns and cells, and triples to relate them all. I am trying to use YARRRML to map a CSV file to this structure, but for that I would need to keep track of the column indexes. Is there any way to get the index of a given column, or to maintain a variable during the transformation for that purpose? Below is a simplified description of what the RDF for such a CSV would look like:

ex:row-0
        rdf:type ex:Row ;
        ex:tabular#hasCell   ex:cell-00, ex:cell-01, ex:cell-02;
        ex:tabular#hasRowId  0 .
... (several rows increasing index)
ex:col-0
        rdf:type ex:Column ;
        ex:tabular#hasCell ex:cell-00, ex:cell-10, ex:cell-20 ;
        ex:tabular#hasColumnId 0 .
... (several columns increasing index)
ex:cell-00
        rdf:type ex:Cell ;
        ex:tabular#hasRowId 0 ;
        ex:tabular#hasColumnId 0 ;
        ex:tabular#hasValue "0" .
ex:cell-01 
        rdf:type ex:Cell ;
        ex:tabular#hasRowId 0 ;
        ex:tabular#hasColumnId 1 ;
        ex:tabular#hasValue "100" .
... (several cells increasing row and column indexes)
@bjdmeest
Collaborator

bjdmeest commented Feb 9, 2023

I'm afraid this is currently not supported by the RML specification, and thus also not by YARRRML.

We're currently working on new versions of the specifications so I included your question in our process (see the linked issues).

However, given that this is a very specific type of mapping that probably won't ever change, I'm not sure YARRRML/RML is the most efficient way forward here: you probably won't need to update that mapping very often.

Can you explain your use case further, so we can understand why you're doing this and how YARRRML currently helps you reach that goal?

@lucasalbertins
Author

Thank you for your answer. Indeed, the idea is to transform several types of tabular files (CSV, XLS, etc.) into a generalized RDF structure, in order to query and reason over it. So the mapping is not expected to change much. We thought of using RML/YARRRML for that, but implementing our own transformation may be the best way to go.
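For illustration, a custom transformation along these lines could be sketched in plain Python as below. This is only a minimal sketch under several assumptions, not an existing tool: the function name `table_triples` is hypothetical, the predicates are written as `ex:hasCell`, `ex:hasRowId`, and so on rather than the `ex:tabular#...` form used in the simplified description, every line of the file is treated as data (no header row), and cell names use a separator (`ex:cell-0-1`) so that indexes of 10 and above stay unambiguous.

```python
import csv
import io


def table_triples(csv_text):
    """Yield (subject, predicate, object) triples for rows, columns and cells.

    Hypothetical helper: names and predicates are assumptions made for
    this sketch. Row and column indexes come straight from enumerate().
    """
    seen_cols = set()  # columns already declared, so each is typed only once
    for r, row in enumerate(csv.reader(io.StringIO(csv_text))):
        row_iri = f"ex:row-{r}"
        yield (row_iri, "rdf:type", "ex:Row")
        yield (row_iri, "ex:hasRowId", r)
        for c, value in enumerate(row):
            col_iri = f"ex:col-{c}"
            cell_iri = f"ex:cell-{r}-{c}"
            if c not in seen_cols:
                seen_cols.add(c)
                yield (col_iri, "rdf:type", "ex:Column")
                yield (col_iri, "ex:hasColumnId", c)
            # Link the cell to both its row and its column.
            yield (row_iri, "ex:hasCell", cell_iri)
            yield (col_iri, "ex:hasCell", cell_iri)
            yield (cell_iri, "rdf:type", "ex:Cell")
            yield (cell_iri, "ex:hasRowId", r)
            yield (cell_iri, "ex:hasColumnId", c)
            yield (cell_iri, "ex:hasValue", value)


if __name__ == "__main__":
    for triple in table_triples("0,100\n5,7\n"):
        print(triple)
```

The triples are kept as plain tuples here; serializing them to Turtle or loading them into a triple store would be a separate step.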

@namedgraph

@lucasalbertins https://github.com/AtomGraph/CSV2RDF might do what you need
