150 changes: 150 additions & 0 deletions spec/docs/datatypeConversion.md
@@ -0,0 +1,150 @@
# Datatype conversions

For each [=reference formulation=] there may be a set of defined <dfn data-lt="natural mapping">natural RDF mappings</dfn> that are applied to the [=expression evaluation results=] on the [=data source=]. These [=natural mappings=] are defined in the [[RML-IO-Registry]] and are used to convert the values of the [=expression evaluation result=] to the appropriate [=natural RDF literal=] corresponding with the [=reference formulation=].

## Natural mapping of source values

The <dfn>natural RDF literal</dfn> is a [=literal=] that is the result of applying a [=natural mapping=] on a value of a [=data source=], which produces a [=literal=] that is the most appropriate representation of the value in RDF. The [=natural RDF literal=] has a [=natural RDF lexical form=].

The <dfn>natural RDF lexical form</dfn> produces only the [=lexical form=] of the [=literal=] and recommends that implementations SHOULD apply the [=XSD canonical mapping=], making it a [=canonical RDF lexical form=]. It is used in RML when non-string [=expression evaluation results=] are used in a string context, for example when a timestamp is used in a [=template-valued term map=] with [=term type=] [=IRI=].

The <dfn>canonical RDF lexical form</dfn> produces only the [=lexical form=] of the [=literal=] and requires that the [=XSD canonical mapping=] MUST be applied.

<dfn>Cast to string</dfn> is an implementation-dependent function that maps values from [=expression evaluation results=] to equivalent Unicode strings. The specifics of [=cast to string=] per [=reference formulation=] are defined in the [[RML-IO-Registry]].

Additionally, the [=natural mapping=] determines the [=natural RDF datatype=] of the [=literal=].

The <dfn>natural RDF datatype</dfn> is the [=datatype=] corresponding to the [=natural RDF literal=] that is the result of the [=natural mapping=]. The [=natural RDF datatype=] is an [=IRI=] that represents the [=datatype=] of the value in RDF.

## Datatype-override mapping of source values

The <dfn>datatype-override RDF literal</dfn> corresponding to an [=expression evaluation result=] value `v` and a [=datatype IRI=] `dt` is a [=literal=] whose [=lexical form=] is the [=natural RDF lexical form=] corresponding to `v`, and whose [=datatype IRI=] is `dt`. If the [=literal=] is [=ill-typed=], then a [=data error=] is raised.

A [=literal=] is <dfn data-lt="ill-typed literal">ill-typed</dfn> in RML if its [=datatype IRI=] denotes a [=validatable RDF datatype=] and its [=lexical form=] is not in the [=lexical space=] of the [=RDF datatype=] identified by its [=datatype IRI=].

The set of <dfn>validatable RDF datatypes</dfn> includes all [=datatypes=] in the RDF datatype column of [[[#table-lexical-forms]]], as defined in [[XMLSCHEMA11-2]]. This set MAY include implementation-defined additional RDF datatypes.

For example, `"X"^^xsd:boolean` is [=ill-typed=] because `xsd:boolean` is a validatable [=RDF datatype=] in RML, and `"X"` is not in the [=lexical space=] of `xsd:boolean` [[XMLSCHEMA11-2]].
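
The ill-typed check described above could be sketched roughly as follows. This is a minimal illustration, not part of the specification: the names `LEXICAL_SPACE`, `DataError`, `is_ill_typed` and `datatype_override_literal` are hypothetical, and the regular expressions cover only a small subset of the validatable datatypes, following the lexical spaces given in [[XMLSCHEMA11-2]].

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset of validatable RDF datatypes, keyed by datatype IRI.
# Each pattern describes the lexical space of that datatype.
LEXICAL_SPACE = {
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
}


class DataError(ValueError):
    """Hypothetical exception standing in for an RML data error."""


def is_ill_typed(lexical_form, datatype_iri):
    """Return True if the datatype is validatable and the form is outside its lexical space."""
    pattern = LEXICAL_SPACE.get(datatype_iri)
    if pattern is None:
        # Not a validatable datatype here, so the literal is never ill-typed.
        return False
    return pattern.fullmatch(lexical_form) is None


def datatype_override_literal(natural_lexical_form, datatype_iri):
    """Pair the natural lexical form with the overriding datatype IRI, or raise."""
    if is_ill_typed(natural_lexical_form, datatype_iri):
        raise DataError(f'"{natural_lexical_form}"^^<{datatype_iri}> is ill-typed')
    return (natural_lexical_form, datatype_iri)
```

Under these assumptions, `datatype_override_literal("X", XSD + "boolean")` raises `DataError`, while a literal with an unknown datatype IRI passes through unchecked because its datatype is not in the validatable set.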

<section class="informative">
<h2>Summary of XSD Lexical Forms</h2>

The [=natural mappings=] make reference to various [=XSD datatypes=] and require that values from [=expression evaluation results=] be converted to strings that are appropriate as [=lexical forms=] for these [=datatypes=]. This subsection gives examples of these [=lexical forms=] in order to aid implementers of the mappings. This subsection is non-normative; the normative definitions of the [=lexical spaces=] as well as the [=canonical mappings=] are found in [[XMLSCHEMA11-2]].

A general approach that may be used for implementing the natural mappings is as follows:

1. Identify the source datatype of the value of the [=expression evaluation result=] on the [=data source=].
1. Look up its corresponding [=natural RDF datatype=] for the [=reference formulation=] in the [[RML-IO-Registry]].
1. Apply [=cast to string=] to the value.
1. Ensure that the resulting string is in the [=lexical space=] of the target [=RDF datatype=]; that is, it must be in a form such as those listed in either column of [[[#table-lexical-forms]]] below. This may require some transformations of the string, in particular for `xsd:hexBinary`, `xsd:dateTime` and `xsd:boolean`.
1. If the goal is to obtain a [=canonical RDF lexical form=], then further string transformations may be required to obtain a form such as those listed in the Canonical lexical forms column of [[[#table-lexical-forms]]] below.
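
The last two steps above can be sketched as a pair of small functions. This is only an illustration under stated assumptions: `to_lexical_form` and `to_canonical_form` are hypothetical names, the input is assumed to be the output of [=cast to string=] for an SQL-like source, and only the three datatypes singled out in step 4 are handled (with canonicalization shown for two of them).

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def to_lexical_form(cast_string, datatype_iri):
    """Step 4: move the cast-to-string output into the datatype's lexical space."""
    if datatype_iri == XSD + "dateTime":
        # SQL-style timestamps separate date and time with a space; XSD needs "T".
        return cast_string.replace(" ", "T", 1)
    if datatype_iri == XSD + "boolean":
        # Assumption: some sources render booleans in upper case, e.g. "TRUE".
        return cast_string.lower()
    # xsd:hexBinary accepts both upper- and lower-case hex digits as-is.
    return cast_string


def to_canonical_form(lexical_form, datatype_iri):
    """Step 5: map a valid lexical form to its canonical lexical form."""
    if datatype_iri == XSD + "boolean":
        return {"1": "true", "0": "false"}.get(lexical_form, lexical_form)
    if datatype_iri == XSD + "hexBinary":
        return lexical_form.upper()
    # Other datatypes are left untouched in this sketch.
    return lexical_form
```

For example, `to_lexical_form("2011-08-23 22:17:00", XSD + "dateTime")` yields `2011-08-23T22:17:00`, and `to_canonical_form("5232524d4c", XSD + "hexBinary")` yields `5232524D4C`. Neither function validates its input, so a check against the [=lexical space=] would still be needed before emitting the [=literal=].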

<table class="numbered" id="table-lexical-forms">
<caption>Table of canonical and non-canonical lexical forms for some XSD datatypes</caption>
<tbody>
<tr>
<th>RDF datatype</th>
<th>Non-canonical lexical forms</th>
<th>Canonical lexical forms</th>
<th>Comments</th>
</tr>
<tr>
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#hexBinary">xsd:hexBinary</a></code></td>
<td><code>5232524d4c</code></td>
<td><code>5232524D4C</code></td>
<td>Convert from SQL by applying <a href="https://www.w3.org/TR/xmlschema11-2/#hexBinary"><code>xsd:hexBinary</code> lexical mapping</a>.</td>
</tr>
<tr>
<td rowspan="4"><code><a href="https://www.w3.org/TR/xmlschema11-2/#decimal">xsd:decimal</a></code></td>
<td><code>.224</code></td>
<td><code>0.224</code></td>
<td rowspan="4"></td>
</tr>
<tr>
<td><code>+001</code></td>
<td><code>1</code></td>
</tr>
<tr>
<td><code>42.0</code></td>
<td><code>42</code></td>
</tr>
<tr>
<td><code>-5.9000</code></td>
<td><code>-5.9</code></td>
</tr>
<tr>
<td rowspan="3"><code><a href="https://www.w3.org/TR/xmlschema11-2/#integer">xsd:integer</a></code></td>
<td><code>-05</code></td>
<td><code>-5</code></td>
<td rowspan="3"></td>
</tr>
<tr>
<td><code>+333</code></td>
<td><code>333</code></td>
</tr>
<tr>
<td><code>00</code></td>
<td><code>0</code></td>
</tr>
<tr>
<td rowspan="5"><code><a href="https://www.w3.org/TR/xmlschema11-2/#double">xsd:double</a></code></td>
<td><code>-5.90</code></td>
<td><code>-5.9E0</code></td>
<td rowspan="5">Also supports <code>INF</code>, <code>-INF</code>, <code>NaN</code> and <code>-0.0E0</code>,<br>but these do not appear in standard SQL.</td>
</tr>
<tr>
<td><code>+0.00014770215000</code></td>
<td><code>1.4770215E-4</code></td>
</tr>
<tr>
<td><code>+01E+3</code></td>
<td><code>1.0E3</code></td>
</tr>
<tr>
<td><code>100.0</code></td>
<td><code>1.0E2</code></td>
</tr>
<tr>
<td><code>0</code></td>
<td><code>0.0E0</code></td>
</tr>
<tr>
<td rowspan="2"><code><a href="https://www.w3.org/TR/xmlschema11-2/#boolean">xsd:boolean</a></code></td>
<td><code>1</code></td>
<td><code>true</code></td>
<td rowspan="2">Must be lowercase.</td>
</tr>
<tr>
<td><code>0</code></td>
<td><code>false</code></td>
</tr>
<tr>
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#date">xsd:date</a></code></td>
<td></td>
<td><code>2011-08-23</code></td>
<td>Dates in SQL don't have timezone offsets.<br>They are optional in XSD.</td>
</tr>
<tr>
<td rowspan="3"><code><a href="https://www.w3.org/TR/xmlschema11-2/#time">xsd:time</a></code></td>
<td><code>22:17:34.885+00:00</code></td>
<td><code>22:17:34.885Z</code></td>
<td rowspan="3">May or may not have timezone offset.</td>
</tr>
<tr>
<td><code>22:17:34.000</code></td>
<td><code>22:17:34</code></td>
</tr>
<tr>
<td><code>22:17:34.1+01:00</code></td>
<td><code>22:17:34.1+01:00</code></td>
</tr>
<tr>
<td><code><a href="https://www.w3.org/TR/xmlschema11-2/#dateTime">xsd:dateTime</a></code></td>
<td><code>2011-08-23T22:17:00.000+00:00</code></td>
<td><code>2011-08-23T22:17:00Z</code></td>
<td>May or may not have timezone offset.<br>Convert from SQL by replacing space with "<code>T</code>".</td>
</tr>
</tbody>
</table>
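
The `xsd:integer` and `xsd:decimal` rows of the table can be reproduced with plain string manipulation. The following is a rough sketch, assuming the input is already in the lexical space of the datatype; `canonical_integer` and `canonical_decimal` are hypothetical helper names, and `xsd:double` is deliberately left out because its canonical form requires normalizing the exponent as well.

```python
import re


def canonical_integer(lexical_form):
    """Strip a leading '+' and leading zeros, e.g. '-05' becomes '-5'."""
    match = re.fullmatch(r"([+-]?)([0-9]+)", lexical_form)
    if match is None:
        raise ValueError(f"not an xsd:integer lexical form: {lexical_form!r}")
    sign, digits = match.groups()
    digits = digits.lstrip("0") or "0"
    # Zero never carries a sign in canonical form.
    return ("-" if sign == "-" and digits != "0" else "") + digits


def canonical_decimal(lexical_form):
    """Strip redundant zeros and an empty fraction, e.g. '42.0' becomes '42'."""
    match = re.fullmatch(
        r"([+-]?)(?:([0-9]+)(?:\.([0-9]*))?|\.([0-9]+))", lexical_form
    )
    if match is None:
        raise ValueError(f"not an xsd:decimal lexical form: {lexical_form!r}")
    sign = match.group(1)
    int_part = (match.group(2) or "").lstrip("0") or "0"
    frac_part = (match.group(3) or match.group(4) or "").rstrip("0")
    is_zero = int_part == "0" and frac_part == ""
    body = int_part + ("." + frac_part if frac_part else "")
    return ("-" if sign == "-" and not is_zero else "") + body
```

Working on the string rather than on a floating-point value avoids rounding artefacts for decimals that have no exact binary representation.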

</section>
11 changes: 5 additions & 6 deletions spec/docs/expressions.md
@@ -1,6 +1,6 @@
# Generating values with expressions

<dfn>Expressions</dfn> are mapping constructs that can be evaluated on a [=logical iteration=], according to the specified reference formulation, to generate values during the mapping process.
<dfn>Expressions</dfn> are mapping constructs that can be evaluated on a [=logical iteration=], according to the specified [=reference formulation=], to generate values during the mapping process.

## Expression map (`rml:ExpressionMap`)

@@ -10,17 +10,17 @@ An <dfn>expression map</dfn> (`rml:ExpressionMap`) is an abstract class, that is
* 0 or 1 `rml:template`, or
* another property, or properties, defined by a subclass of `rml:ExpressionMap`.

Each of these properties specifies an [=expression=] which, upon evaluation, results in an ordered list of values.
Each of these properties specifies an [=expression=] which, upon evaluation, results in an ordered list of values, called the <dfn>expression evaluation result</dfn>.

The <dfn>reference expression set</dfn> of an [=expression map=] is the set of expressions which are evaluated on a [=logical iteration=].

### Constant expression (`rml:constant`)

A <dfn>constant-valued expression map</dfn> is an [=expression map=] that always generates the same value. A constant-valued expression map is represented by a resource that has exactly one `rml:constant` property, the value of which is called a <dfn>constant expression</dfn>.
A <dfn>constant-valued expression map</dfn> is an [=expression map=] that always generates the same [=expression evaluation result=]. A constant-valued expression map is represented by a resource that has exactly one `rml:constant` property, the value of which is called a <dfn>constant expression</dfn>.

The <dfn>constant value</dfn> is a singleton list containing the [=constant expression=].

The [=reference expressions=] of a [constant-valued expression map=] is an empty list.
The [=reference expressions=] of a [=constant-valued expression map=] is an empty list.

### Reference (`rml:reference`)
A <dfn>reference-valued expression map</dfn> is an [=expression map=] that is represented by a resource that has exactly one `rml:reference` property, the value of which is called a <dfn>reference expression</dfn>.
@@ -29,8 +29,7 @@ The [=reference expression=] MUST be a valid [=expression=] according to the def

The [=reference expression set=] of a [=reference-valued expression map=] is the singleton set containing the [=reference expression=].

The <dfn>reference value</dfn> is an ordered list of values obtained by evaluating the [=reference expression=] against a given [=logical iteration=].
For each value in the ordered list, an expression is created.
The <dfn>reference value</dfn> is the [=expression evaluation result=] obtained by evaluating the [=reference expression=] against a given [=logical iteration=].

### Template (`rml:template`)
A <dfn>template-valued expression map</dfn> is an [=expression map=] that is represented by a resource that has exactly one `rml:template` property, the value of which is called a <dfn>template expression</dfn>. The [=template expression=] MUST be a valid [=string template=].
6 changes: 5 additions & 1 deletion spec/docs/index.html
@@ -106,9 +106,13 @@

<section id="joins" data-include="joinconditions.md" data-include-format="markdown"></section>

<section id="datatype-conversion" data-include="datatypeConversion.md" data-include-format="markdown"></section>

<section id="definitions" data-include="definitions.md" data-include-format="markdown"></section>

<section id="rdfTerminology" class="appendix, informative" data-include="rdfTerminology.md" data-include-format="markdown"></section>
<section id="rdfTerminology" class="informative" data-include="rdfTerminology.md" data-include-format="markdown"></section>

<section id="xsdTerminology" class="informative" data-include="xsdTerminology.md" data-include-format="markdown"></section>

</body>

12 changes: 7 additions & 5 deletions spec/docs/logicalSource.md
@@ -1,15 +1,17 @@
# Defining Logical Sources
# Defining Logical Iterables and Logical Sources

A <dfn>logical source</dfn> is an abstract construct to describe data access and iteration for a [=data source=] such that it can be mapped to [=RDF triples=].
A <dfn>logical iterable</dfn> is an abstract construct to describe data access and iteration for a [=data source=].

A [=logical source=] (`rml:LogicalSource`) MUST have:
A [=logical iterable=] (`rml:LogicalIterable`) MUST have:
* exactly one `rml:referenceFormulation` property, whose value is a <dfn>reference formulation</dfn> which defines how the underlying [=data source=] is to be accessed, and which [=expressions=] can be evaluated on [=logical iterations=],
* zero or one `rml:iterator` property, whose value is a <dfn data-lt="iterator">logical iterator</dfn> that defines a sequence of [=logical iterations=] on the [=data source=]. If no [=iterator=] is provided, a <dfn class="lint-ignore">default iterator</dfn> MUST be associated with the [=reference formulation=].

A <dfn data-lt="iteration">logical iteration</dfn> is an item in the sequence produced by the [=logical source=], on which [=expressions=] can be evaluated.
A <dfn data-lt="iteration">logical iteration</dfn> is an item in the sequence produced by the [=logical iterable=], on which [=expressions=] can be evaluated.

A <dfn>data source</dfn> is an abstract concept that represents a source of data that can be accessed via a [=logical source=]. A [=data source=] can be a file, a database, a web service, or any other source of data.
A <dfn>data source</dfn> is an abstract concept that represents a source of data that can be accessed via a [=logical iterable=]. A [=data source=] can be a file, a database, a web service, or any other source of data, depending on the type of [=logical iterable=].

<aside class="note">
There can be many different types of [=reference formulation=]. The known types, and the details of how a reference formulation is handled and implemented for each data format, are specified in [[RML-IO-Registry]].
</aside>

A <dfn>logical source</dfn> (`rml:LogicalSource`) is a subclass of [=logical iterable=] that can be associated with a [=triples map=] such that a [=data source=] can be mapped to [=RDF triples=].
2 changes: 1 addition & 1 deletion spec/docs/mapping.md
@@ -6,7 +6,7 @@ All [=RDF triples=] generated from one [=logical iteration=] in the [=logical so

A [=triples map=] is represented by a [=resource=] that references the following other [=resources=]:

* It MUST have zero or one [=logical source=] (`rml:logicalSource`) property.
* It MUST have zero or one [=logical source=] (`rml:logicalSource`) property whose value MUST be a [=logical source=] (`rml:LogicalSource`).
* It MUST have exactly one [=subject map=] (`rml:SubjectMap`) that specifies how to generate a subject for each [=iteration=] of the [=logical source=].
It may be specified in two ways:
1. using the subject map `rml:subjectMap` property, whose value MUST be the [=subject map=], or
7 changes: 3 additions & 4 deletions spec/docs/rdfTerminology.md
@@ -1,21 +1,20 @@
# RDF Terminology

This appendix lists some terms normatively defined in other specifications.

The following terms are defined in [[RDF11-CONCEPTS]] and usedin RML:
This section lists some terms normatively defined in [[RDF11-CONCEPTS]] and used in RML:

- <dfn><a data-cite="RDF11-CONCEPTS#dfn-rdf-dataset">RDF dataset</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-rdf-graph">RDF graph</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-rdf-triple">RDF triple</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-iri">IRI</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-blank-node">blank node</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-blank-node-identifier">blank node identifier</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-datatype">datatype</a></dfn>
- <dfn data-lt="RDF datatype"><a data-cite="RDF11-CONCEPTS#dfn-datatype">datatype</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-datatype-iri">datatype IRI</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-default-graph">default graph</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-language-tag">language tag</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-language-tagged-string">language-tagged string</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-lexical-form">lexical form</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-lexical-space">lexical space</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-literal">literal</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-named-graph">named graph</a></dfn>
- <dfn><a data-cite="RDF11-CONCEPTS#dfn-object">object</a></dfn>