44 changes: 36 additions & 8 deletions csvw/config.js
@@ -39,12 +39,12 @@ var respecConfig = {
companyURL: "https://skemu.com"
},
{
name: "Dylan Van Assche",
name: "Sitt Min Oo",
company: "Ghent University – imec – IDLab",
url: "https://dylanvanassche.be",
orcid: "0000-0002-7195-9935",
url: "https://data.knows.idlab.ugent.be/person/minoo/#me",
orcid: "0000-0001-9157-7507",
companyURL: "https://knows.idlab.ugent.be/"
},
}
],
edDraftURI: "https://w3id.org/rml/io-registry/csvw/spec",
editors: [
@@ -62,19 +62,47 @@ var respecConfig = {
orcid: "0009-0000-2598-1894",
companyURL: "https://skemu.com"
},
{
name: "Sitt Min Oo",
company: "Ghent University – imec – IDLab",
url: "https://data.knows.idlab.ugent.be/person/minoo/#me",
orcid: "0000-0001-9157-7507",
companyURL: "https://knows.idlab.ugent.be/"
}
],
formerEditors: [
{
name: "Dylan Van Assche",
company: "Ghent University – imec – IDLab",
url: "https://dylanvanassche.be",
orcid: "0000-0002-7195-9935",
companyURL: "https://knows.idlab.ugent.be/"
},
],
formerEditors: [
}
],
github: "https://github.com/kg-construct/rml-io-registry",
license: "w3c-software-doc",
localBiblio: {
"Turtle": {
title: "RDF 1.1 Turtle",
href: "https://www.w3.org/TR/turtle/",
status: "W3C Recommendation",
publisher: "W3C",
date: "25 February 2014",
},
"CSVW-Namespace": {
title: "CSVW Namespace Vocabulary Terms",
href: "https://www.w3.org/ns/csvw",
status: "W3C Document",
publisher: "W3C",
date: "06 June 2017",
},
"CSV": {
title: "Common Format and MIME Type for Comma-Separated Values (CSV) Files",
href: "https://www.ietf.org/rfc/rfc4180.txt",
status: "Informational",
publisher: "IETF",
date: "October 2005",
},
"RML-Core": {
title: "RML-Core",
href: "https://w3id.org/rml/core/spec",
@@ -91,7 +119,7 @@ var respecConfig = {
},
},
otherLinks: [],
shortName: "RML-Ref-JSON-Path",
shortName: "RML-IO-Registry",
specStatus: "CG-DRAFT",
// W3C config
copyrightStart: "2024",
8 changes: 4 additions & 4 deletions csvw/section/natural-rdf-mapping.md
@@ -3,16 +3,16 @@
CSV does not provide any native data types, so there is no natural RDF mapping of CSV values onto XSD data types.
CSVW allows the data type of each column to be specified in a `csvw:Table`:

```

<aside class="ex-mapping">
<CSVWTable> a csvw:Table;
csvw:tableSchema [
csvw:columns [
csvw:name "Column";
csvw:datatype xsd:integer;
];
];
.
```
].
</aside>

The `csvw:datatype` must be used for the natural mapping of datatypes in RDF from CSV values.
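
As a rough illustration of how an engine might apply a declared `csvw:datatype`, the Python sketch below turns raw CSV cell strings into typed RDF literals in N-Triples syntax. The helper `typed_literal` and the `column_datatypes` dictionary are hypothetical names introduced for this example and are not part of RML or CSVW. The lexical form is not validated, and only backslashes and double quotes are escaped.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def typed_literal(value: str, datatype_iri: str) -> str:
    """Render a CSV cell as an N-Triples typed literal (hypothetical helper)."""
    # Escape backslashes first, then double quotes; other escapes are omitted here.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype_iri}>'


# Datatype declared per column name, mirroring csvw:name / csvw:datatype above.
column_datatypes = {"Column": XSD + "integer"}

# One CSV row, already split into named cells.
row = {"Column": "42"}

literals = {
    name: typed_literal(cell, column_datatypes[name])
    for name, cell in row.items()
}
print(literals["Column"])
```

A column without a declared `csvw:datatype` has no entry in `column_datatypes`; how such a column is handled is left open by this sketch.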

17 changes: 17 additions & 0 deletions csvw/section/overview.md
@@ -0,0 +1,17 @@
## Document Conventions {#document-convention}

The following namespace prefixes are used throughout the rest of the document unless otherwise stated.

| Prefix | Namespace |
| ------- | --------------------------------- |
| `rml:` | http://w3id.org/rml/ |
| `csvw:` | https://www.w3.org/ns/csvw# |
| `xsd:` | http://www.w3.org/2001/XMLSchema# |
| `ex:` | http://example.org/ |
| `:` | http://example.org/ |

The examples are contained in color-coded boxes. We use the Turtle syntax [[Turtle]] to write RDF.

<aside class="ex-mapping">
# This box contains an example mapping
</aside>
84 changes: 75 additions & 9 deletions csvw/section/reference-formulation.md
@@ -6,6 +6,7 @@ A <dfn>CSVW reference formulation</dfn> (`rml:CSVWReferenceFormulation`) is a <a

The default CSVW Reference Formulation is a <a data-cite="RML-Core#dfn-reference-formulation">reference formulation</a> identified with the IRI `rml:CSVW`. It has no specific properties.


## Iterator

The <a data-cite="RML-Core#dfn-iterator">iterator</a> for a <a data-cite="RML-Core#dfn-logical-source">logical source</a> with the [=CSVW reference formulation=] is always row-based over a table.
@@ -20,15 +21,80 @@ An <a data-cite="RML-Core#dfn-expression">expression</a> for <a data-cite="RML-C
An <a data-cite="RML-Core#dfn-expression">expression</a> is evaluated against a <a data-cite="RML-Core#dfn-logical-iteration">logical iteration</a> which is a [=CSV value=].
The result of evaluating the <a data-cite="RML-Core#dfn-expression">expression</a> is a [=CSV row=], which MUST be transformed to a list of [=CSV values=] that forms the <a data-cite="RML-Core#dfn-expression-evaluation-result">expression evaluation result</a>. The order of the [=CSV row=] MUST be preserved in the <a data-cite="RML-Core#dfn-expression-evaluation-result">expression evaluation result</a>.

## CSV derivates

CSVW allows to specify how a CSV should be parsed and read in terms of NULL values, separator, encoding, etc.
These CSVW properties must be used by the engine to correctly read the CSV file.
## CSVW properties
CSVW provides metadata that describes how a CSV file is to be parsed: NULL values, separator, encoding, etc.
The engine must use these CSVW properties to parse the CSV file correctly.
Terms from the [[CSVW-Namespace]] vocabulary can be used to provide further
metadata for parsing the CSV file.


### No headers
CSVW enables parsing CSV files without a header row.
Parsing a CSV file with `N` columns while `csvw:header` is set to `false`
produces a table whose columns are named "1" to "N" respectively.


```
Given the following input CSV file with 3 columns:
<aside class="ex-input">
647,434244.172304,428652.920455
646,434546.276382,428380.451633
6212,434644.819095,428412.411432
651,434758.675879,428527.599874
650,434821.652431,428439.025039
...
</aside>

and the following RML mapping containing the CSVW reference formulation definitions for
the aforementioned CSV file:
<aside class="ex-mapping">
<CSVWTable> a csvw:Table;
csvw:separator ";";
csvw:null "";
csvw:encoding "utf-16";
.
```
csvw:null "";
csvw:separator ";";
csvw:dialect [
csvw:header "false"^^xsd:boolean;
];
</aside>


This is equivalent to working with the following CSV table, where the headers are
named <b>"1", "2", and "3"</b>:
<aside class="ex-input">
1,2,3 #numbered header row
647,434244.172304,428652.920455
646,434546.276382,428380.451633
6212,434644.819095,428412.411432
651,434758.675879,428527.599874
650,434821.652431,428439.025039
...
</aside>
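
The column-naming rule can be sketched in Python with the standard `csv` module. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `read_headerless` is hypothetical, the column count is taken from the first row, and the sample rows shown here have three comma-separated fields each, so the generated names are "1" to "3".

```python
import csv
import io


def read_headerless(text: str, delimiter: str = ","):
    """Parse CSV text without a header row, naming the columns "1".."N"."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    # N is assumed to be the number of fields in the first row.
    names = [str(i) for i in range(1, len(rows[0]) + 1)]
    return [dict(zip(names, row)) for row in rows]


SAMPLE = (
    "647,434244.172304,428652.920455\n"
    "646,434546.276382,428380.451633\n"
)

records = read_headerless(SAMPLE)
print(records[0])
```

A reference such as `"1"` in a mapping would then select the first cell of each row, exactly as a named column would in a file with a header row.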



### Default properties
While the default reference formulation identifier (`rml:CSVW`) specifies the
use of the CSVW reference formulation, it does not by itself describe how to
parse a given CSV file.
To ensure consistent behaviour across implementations, a default set of CSVW
properties and corresponding values is defined.
These defaults provide a minimal, functional CSVW configuration suitable for
parsing standard CSV files when no CSVW properties are explicitly defined.

<aside class="ex-mapping">
<CSVWTable> a csvw:Table;
csvw:null "";
csvw:separator ";";
csvw:dialect [
csvw:commentPrefix "#";
csvw:trim "false"^^xsd:boolean;
csvw:header "true"^^xsd:boolean;
csvw:delimiter ",";
csvw:encoding "utf-8";
csvw:skipBlankRows "false"^^xsd:boolean;
csvw:skipColumns "0"^^xsd:integer;
csvw:skipInitialSpace "false"^^xsd:boolean;
csvw:skipRows "0"^^xsd:integer;
];
</aside>
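
One way an engine could apply such defaults is sketched below in Python. The `DEFAULTS` dictionary copies a subset of the values from the listing above; the dictionary keys and the function name `parse_csv` are this example's own assumptions. For brevity the sketch does not model `csvw:trim`, `csvw:skipBlankRows`, `csvw:skipColumns`, or the cell-level `csvw:separator`, and it detects comments per physical line, so quoted fields spanning several lines are not supported.

```python
import csv

# Subset of the default values listed above (key names are hypothetical).
DEFAULTS = {
    "encoding": "utf-8",
    "delimiter": ",",
    "header": True,
    "commentPrefix": "#",
    "skipRows": 0,
    "skipInitialSpace": False,
    "null": "",
}


def parse_csv(data: bytes, **overrides):
    """Parse CSV bytes, using the defaults unless explicitly overridden."""
    opts = {**DEFAULTS, **overrides}
    lines = data.decode(opts["encoding"]).splitlines()[opts["skipRows"]:]
    prefix = opts["commentPrefix"]
    if prefix:
        # Drop lines that start with the comment prefix.
        lines = [line for line in lines if not line.startswith(prefix)]
    rows = list(csv.reader(
        lines,
        delimiter=opts["delimiter"],
        skipinitialspace=opts["skipInitialSpace"],
    ))
    if not rows:
        return []
    if opts["header"]:
        names, rows = rows[0], rows[1:]
    else:
        # Without a header row the columns are named "1".."N".
        names = [str(i) for i in range(1, len(rows[0]) + 1)]
    null = opts["null"]
    # Cells equal to the null marker become None.
    return [
        {name: (None if cell == null else cell) for name, cell in zip(names, row)}
        for row in rows
    ]


rows = parse_csv(b"# comment\nid,name\n1,Alice\n2,\n")
print(rows)
```

Passing keyword overrides, for example `parse_csv(data, header=False)`, stands in for a mapping that defines some CSVW properties explicitly while the remaining ones fall back to the defaults.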