Home
Mapeathor is a simple spreadsheet parser developed by the Ontology Engineering Group that generates mapping rules in different mapping languages: R2RML, RML (the 2014 and 2023 releases) and YARRRML. It provides a straightforward way to create these mappings by specifying the transformation rules in spreadsheets, with the aim of increasing interoperability between the mapping languages and easing the creation process.
Currently, Mapeathor is being used to generate mappings for city open data publication related to traffic, public bus transport, budget and noise pollution in the context of the Ciudades Abiertas project.
This section shows how to create and specify the transformation rules in the spreadsheet following the designed template, and the different options available to run the tool. You can also check this demo video.
The spreadsheet is organized in five sheets that structure the mapping rules: Prefix, Subject, Source, Predicate_Object and Function.
The Prefix sheet contains the prefixes and namespaces used in the rest of the rules. Any number of prefixes can be added.
Additionally, a base URI can be defined using '@base' as value of the column 'Prefix'.
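As a rough illustration of what the rows of this sheet amount to, the sketch below renders (prefix, namespace) pairs as Turtle declarations. The function name and the example namespaces are hypothetical, and this is only one plausible rendering, not necessarily what Mapeathor emits for every target language.

```python
def render_prefixes(rows):
    """Render (prefix, namespace) pairs as Turtle declarations.

    A row whose prefix is '@base' becomes the base URI declaration.
    """
    lines = []
    for prefix, namespace in rows:
        if prefix == "@base":
            lines.append(f"@base <{namespace}> .")
        else:
            lines.append(f"@prefix {prefix}: <{namespace}> .")
    return "\n".join(lines)


# Hypothetical rows, as they might appear in the Prefix sheet
rows = [
    ("ex", "http://example.com/"),
    ("@base", "http://example.com/base/"),
]
print(render_prefixes(rows))
```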
The Subject sheet defines the subjects, the class(es) to which they belong, their URI and, optionally, their assigned named graph. Each subject has a unique identifier that links it to its corresponding source data (in the Source sheet) and predicate-object properties (in the Predicate_Object sheet). Notice that the URI column contains information between curly brackets {}; this corresponds to a field in the source data. In the example, each URI takes its data from a field called 'ID' in the respective source data file.
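The curly-bracket notation can be pictured as a template that is filled in once per source record. The sketch below is a minimal illustration with a hypothetical helper name, URI and record; it does no IRI-safe encoding of the values, which a real mapping engine would typically also handle.

```python
import re


def expand_uri_template(template: str, record: dict) -> str:
    """Replace each {field} placeholder with the matching value from record."""

    def substitute(match: re.Match) -> str:
        field = match.group(1)
        if field not in record:
            raise KeyError(f"field {field!r} not found in source record")
        return str(record[field])

    return re.sub(r"\{([^{}]+)\}", substitute, template)


# Hypothetical template and source record
print(expand_uri_template("http://example.com/person/{ID}", {"ID": 7, "name": "Ana"}))
```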
The Source sheet defines where the data is retrieved from. Every row must be assigned the ID of its corresponding subject. You can specify the following features: source, format, iterator, table, query and SQLVersion. The examples show four different ways in which the source can be specified: PERSON in the first table as a table from a database with SQL version SQL2008; SPORT in the first table as a CSV file; PERSON in the second table as a JSON file; and SPORT in the second table as a query that extracts the fields 'ID' and 'sport' from a database table. Each source definition has to be coherent with the target mapping language. For example, R2RML accepts only the options that refer to databases (table, query and SQLVersion).
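The coherence requirement can be sketched as a small check run before translation. The function below is a hypothetical helper, not part of Mapeathor; it encodes only the R2RML restriction stated above and assumes, for illustration, that the other languages accept all six features.

```python
# Features that refer to databases, the only ones R2RML accepts
DATABASE_FEATURES = {"table", "query", "SQLVersion"}
# Assumption for this sketch: the other languages accept every feature
ALL_FEATURES = DATABASE_FEATURES | {"source", "format", "iterator"}


def unsupported_features(features, language):
    """Return, sorted, the features that the target language would not accept."""
    allowed = DATABASE_FEATURES if language == "R2RML" else ALL_FEATURES
    return sorted(set(features) - allowed)


# A CSV-style source definition is not coherent with R2RML
print(unsupported_features(["source", "format"], "R2RML"))
# A database source definition is
print(unsupported_features(["table", "SQLVersion"], "R2RML"))
```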
The rest of the example uses the first table as a reference.
The Predicate_Object sheet contains the predicate-object properties linked to their corresponding subject. Optionally, the datatype and language of the objects can be added, as well as data transformation functions (referencing the ones specified in the Function sheet). It also allows specifying joins between different subjects, by giving the ID of the other subject and the data fields that must be equal to perform the linking.
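To make the join idea concrete, the sketch below links each record of one subject to the URIs of another subject's records whose join field has the same value. All names, fields and URIs are hypothetical; it only illustrates the equality condition and is not how Mapeathor or a mapping engine implements joins.

```python
def join_objects(child_rows, parent_rows, child_field, parent_field, parent_template):
    """For each child row, collect the URIs of parent rows whose join field matches."""
    # Index parent URIs by the value of the parent join field
    index = {}
    for parent in parent_rows:
        uri = parent_template.format(**parent)
        index.setdefault(parent[parent_field], []).append(uri)
    # Look up each child row's join value; no match yields an empty list
    return [index.get(child[child_field], []) for child in child_rows]


# Hypothetical records for two subjects joined on the 'sport' field
people = [{"ID": "1", "sport": "Tennis"}, {"ID": "2", "sport": "Chess"}]
sports = [{"ID": "10", "sport": "Tennis"}]
print(join_objects(people, sports, "sport", "sport", "http://example.com/sport/{ID}"))
```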
The Function sheet contains the functions that the mapping may use. As in the Source sheet, the column Feature describes what is specified in the column Value. The function's name corresponds to the feature fno:executes. Parameters are specified in the same way: the name of the parameter is written in the column Feature. A function can be another function's parameter.
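Because a function can be a parameter of another function, the definitions form a tree that is evaluated from the inside out. The sketch below illustrates that nesting with a hypothetical dictionary layout, hypothetical function names (ex:toUpper, ex:trim) and a hypothetical parameter name (value); only fno:executes comes from the sheet description above.

```python
# Hypothetical implementations, keyed by the value of fno:executes
IMPLEMENTATIONS = {
    "ex:toUpper": lambda value: value.upper(),
    "ex:trim": lambda value: value.strip(),
}


def evaluate(function, record):
    """Evaluate a function definition against one source record.

    Each parameter is either a field name of the record or a nested function.
    """
    implementation = IMPLEMENTATIONS[function["fno:executes"]]
    arguments = {}
    for name, value in function["parameters"].items():
        if isinstance(value, dict):
            # Nested function: evaluate it first and pass on its result
            arguments[name] = evaluate(value, record)
        else:
            arguments[name] = record[value]
    return implementation(**arguments)


# ex:toUpper applied to the result of ex:trim applied to the 'name' field
nested = {
    "fno:executes": "ex:toUpper",
    "parameters": {
        "value": {"fno:executes": "ex:trim", "parameters": {"value": "name"}},
    },
}
print(evaluate(nested, {"name": "  ana "}))
```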
The easiest way of running Mapeathor is using the web service and the Swagger instance. For CLI lovers, the service is available as a PyPI package.
With Python:
# Install
$ python3 -m pip install mapeathor
# How to execute it. You can use a local XLSX file or a shared Google Spreadsheet URL
$ python3 -m mapeathor -i [PATH or URL] -l [RML | RML2014 | R2RML | YARRRML] -o [PATH]
# Help Menu
$ python3 -m mapeathor -h
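When the tool is called from a script, the command line above can be assembled programmatically. The sketch below only builds the argument list from the flags and language names shown in the usage line; the helper name and the file names are hypothetical, and actually running the command requires the mapeathor package to be installed.

```python
import sys

# Target languages listed in the usage line above
SUPPORTED_LANGUAGES = ("RML", "RML2014", "R2RML", "YARRRML")


def build_command(input_ref, language, output_path):
    """Build the argument list for running Mapeathor as a module."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported mapping language: {language!r}")
    return [
        sys.executable, "-m", "mapeathor",
        "-i", input_ref,   # local XLSX path or shared Google Spreadsheet URL
        "-l", language,
        "-o", output_path,
    ]


# Hypothetical input and output paths
command = build_command("mapping.xlsx", "RML", "output_mapping")
print(" ".join(command[1:]))
```

The resulting list can be passed to subprocess.run(command, check=True) to execute the translation.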