Problematic .tsv processing #62

Open
@suciokhan

When trying to ingest from .tsv files using Loader 1.4.1 on Ubuntu 20.04, I receive the following error:

[open_alex_0::5] ERROR com.vaticle.typedb.osi.loader.loader - async-writer-4: [THW07] Invalid Thing Write: Attempted to assign a key ',' of type 'id' that had been taken by another 'researcher'.

However, I've reviewed the .tsv and confirmed there are no comma values in this column; all values are open_alex identifiers, which are URLs starting with https.

In my TypeDB Loader config.json file, the separator is set to tab, and the loader successfully ingests hundreds of thousands of rows with this setting.

"separator": "\t",

Below is a screenshot confirming, using Python and Pandas, that there are no commas in the id column.

image
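
For readers who cannot see the screenshot, a check of this kind can be sketched as follows. This is a minimal illustration that uses Python's standard csv module rather than Pandas. The function name `find_comma_ids`, the column name `id`, and the sample rows are assumptions made up for the example, not the actual script or data.

```python
import csv
import io

def find_comma_ids(tsv_text, id_column="id"):
    """Return (line_number, value) for every row whose id column contains a comma."""
    # QUOTE_NONE treats quote characters as ordinary data,
    # so every physical line is parsed as exactly one row.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    hits = []
    for row in reader:
        value = row.get(id_column) or ""
        if "," in value:
            hits.append((reader.line_num, value))
    return hits

# Hypothetical sample: the name column contains a comma, the id column does not.
sample = (
    "id\tname\n"
    "https://openalex.org/A1\tDoe, Jane\n"
    "https://openalex.org/A2\tRoe\n"
)
print(find_comma_ids(sample))  # an empty list means no id value contains a comma
```

For a real file, the text can be read with `open(path, newline="", encoding="utf-8").read()` and passed to the same function.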

I considered whether the header might be the cause, since the failure occurs on the second .tsv being processed and there is already one record in the database with a comma as its id.
image

However, according to the TypeDB processing updates, it doesn't fail until more than 600,000 rows have been processed.
image
