Bug: iso8859-1 encoded .csv resources create a TableSchemaError  #120

@scammo

Overview

I'm trying to use Open Data from the state of Schleswig-Holstein. This is my first time using frictionless, so please bear that in mind. The resource is: https://opendata.schleswig-holstein.de/data/frictionless/badegewaesser.json

The package includes resources that are declared with the encoding iso8859-1, for example:

"path": "https://efi2.schleswig-holstein.de/bg/opendata/v_badegewaesser_odata.csv",
"encoding": "iso8859-1",
"name": "badegewasser-stammdaten",
"profile": "tabular-data-resource",
"format": "csv",

If I try:

const resource = await datapackage.Package.load('https://opendata.schleswig-holstein.de/data/frictionless/badegewaesser.json', '', false)
await resource.getResource('badegewasser-messungen').read({ keyed: true})

I get the following error:

TableSchemaError: There are 3 type and format mismatch errors (see 'error.errors')
    at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:176:15)
    at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
    at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
    at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
    at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
    at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
    at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
    at Parser.ondata (node:internal/streams/readable:1007:22)
    at Parser.emit (node:events:519:28)
    at addChunk (node:internal/streams/readable:559:12) {
  _errors: [
    TableSchemaError: The value "0178_1" in column "MESSSTELLENID" is not type "integer" and format "default"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:89:17)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 3,
      rowNumber: 1,
      errors: []
    },
    TableSchemaError: The value "beh�rdliche �berwachung" does not conform to the "enum" constraint for column "UEBERWASCHUNGSARTTEXT"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:110:21)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 5,
      rowNumber: 1,
      errors: []
    },
    TableSchemaError: The value "K�stengew�sser" does not conform to the "enum" constraint for column "GEWAESSERKATEGORIE"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:110:21)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 6,
      rowNumber: 1,
      errors: []
    }
  ],
  rowNumber: 1,
  errors: [
    TableSchemaError: The value "0178_1" in column "MESSSTELLENID" is not type "integer" and format "default"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:89:17)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 3,
      rowNumber: 1,
      errors: []
    },
    TableSchemaError: The value "beh�rdliche �berwachung" does not conform to the "enum" constraint for column "UEBERWASCHUNGSARTTEXT"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:110:21)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 5,
      rowNumber: 1,
      errors: []
    },
    TableSchemaError: The value "K�stengew�sser" does not conform to the "enum" constraint for column "GEWAESSERKATEGORIE"
        at Field.castValue (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/field.js:110:21)
        at Schema.castRow (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/schema.js:150:31)
        at DestroyableTransform._transform (/home/scammo/projects/badewasser_frictionless/node_modules/tableschema/lib/table.js:342:44)
        at Transform._read (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:184:10)
        at Transform._write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_transform.js:172:83)
        at doWrite (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:428:64)
        at writeOrBuffer (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:417:5)
        at Writable.write (/home/scammo/projects/badewasser_frictionless/node_modules/readable-stream/lib/_stream_writable.js:334:11)
        at Parser.ondata (node:internal/streams/readable:1007:22)
        at Parser.emit (node:events:519:28) {
      _errors: [],
      columnNumber: 6,
      rowNumber: 1,
      errors: []
    }
  ]
}
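
To make the nested output above easier to scan, the entries in `error.errors` can be flattened to one line per failing cell. This is only a minimal sketch: `summarizeErrors` is a hypothetical helper, not part of tableschema, and it relies solely on the `message`, `rowNumber`, `columnNumber` and `errors` properties visible in the output above. The `example` object is a stand-in for the error caught from `read()`.

```javascript
// Hypothetical helper (not part of tableschema): turns the nested errors of a
// TableSchemaError-like object into one readable line per failing cell.
function summarizeErrors(error) {
  const nested = Array.isArray(error.errors) ? error.errors : [];
  // Errors without nested entries are reported by their own message.
  if (nested.length === 0) return [error.message];
  return nested.map(
    (e) => `row ${e.rowNumber}, column ${e.columnNumber}: ${e.message}`
  );
}

// Stand-in object shaped like the output above; in real use this would be the
// error caught in a try/catch around the read() call.
const example = {
  message: "There are 3 type and format mismatch errors (see 'error.errors')",
  rowNumber: 1,
  errors: [
    {
      rowNumber: 1,
      columnNumber: 3,
      message:
        'The value "0178_1" in column "MESSSTELLENID" is not type "integer" and format "default"',
      errors: [],
    },
  ],
};

console.log(summarizeErrors(example).join('\n'));
```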

It seems to me that the .csv is not being decoded with the encoding declared in the descriptor. I tried .rawRead() and saw the same encoding errors, e.g. for umlauts such as ü.
If I use the .rawRead() method and decode the result with the correct encoding myself, the .csv text comes out correctly, e.g.:

const stammdatenBuffer = await resource.getResource('badegewasser-infrastruktur').rawRead({ keyed: true })
const decoder = new TextDecoder('iso8859-1');
const stammdaten = decoder.decode(stammdatenBuffer);

I tested this with Node v21.6.2 on Ubuntu. With other frictionless packages, this .json seems to be valid.

Thanks for your work!


Please preserve this line to notify @aivuk (lead of this repository)
