tap-csv

A Singer tap for extracting data from a CSV file.

Limitations

This is a fairly brittle implementation of a CSV reader, but it gets the job done for tasks where your file structure is highly predictable.

Each input file must be a traditionally delimited CSV (comma-separated columns, newlines to indicate new rows, double-quoted values), as defined by the defaults of the Python csv library.
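To make those defaults concrete, here is a minimal sketch of how the standard library's csv.reader parses such a file with no options set. The sample data is made up for illustration, and this is not necessarily how the tap itself calls the library.

```python
import csv
import io

# Hypothetical sample: comma-separated, one row per line, double-quoted values.
# A doubled quote ("") inside a quoted value stands for one literal quote.
sample = 'Id,name\n1,"Smith, Jane"\n2,"He said ""hi"""\n'

# csv.reader with no arguments uses the default ("excel") dialect.
rows = list(csv.reader(io.StringIO(sample)))

for row in rows:
    print(row)
```

Note that the comma inside the quoted value does not split the column, and every parsed value comes back as a string.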

Paths to local files and the names of their corresponding entities are specified in the config file, and each file must contain a header row including the names of the columns that follow.

Perhaps the greatest limitation of this implementation is that it assumes all incoming data is a string. Future iterations could intelligently identify data types based on a sampling of rows or allow the user to provide that information.
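As a rough illustration of the sampling idea, the sketch below guesses a type per column from the first few rows. It is not part of this tap: the function names infer_type and infer_column_types, the sample_size parameter, and the choice of type names ("integer", "number", "string") are all assumptions made for the example.

```python
import csv
import io


def infer_type(values):
    """Guess a type name for one column from its sampled string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "string"
    # Try the narrowest type first; fall through when any value fails to parse.
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue
    return "string"


def infer_column_types(csv_text, sample_size=100):
    """Sample up to sample_size rows and guess a type for every header column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = {name: [] for name in reader.fieldnames or []}
    for i, row in enumerate(reader):
        if i >= sample_size:
            break
        for name in samples:
            # Missing trailing cells come back as None; treat them as empty.
            samples[name].append(row.get(name) or "")
    return {name: infer_type(vals) for name, vals in samples.items()}


types = infer_column_types("Id,name,score\n1,Ann,3.5\n2,Bob,4\n")
print(types)
```

A sample can mislead: a column that looks numeric in the first rows may contain text later, so a real implementation would need a fallback for values that fail to convert.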

Install

Clone this repository, and then:

python setup.py install

Run

Run the application

python tap_csv.py -c config.json

Here config.json contains an array called files, made up of dictionary objects that each describe one destination table to be passed to Singer. Each entry contains:

entity: The entity name to be passed to Singer (i.e. the table)
path: Local path to the file to be ingested. This may be a directory, in which case all files in that directory and any of its subdirectories are processed recursively
keys: The names of the columns that constitute the unique keys for that entity
columns: Optional mapping of column names to their data types
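The recursive directory handling described above could be sketched as follows. This is an assumption about the approach, not the tap's actual code; list_files is a hypothetical helper name.

```python
import os


def list_files(path):
    """Return the files to ingest: the path itself, or every file beneath a directory."""
    if os.path.isdir(path):
        found = []
        # os.walk descends into every subdirectory of the starting path.
        for root, _dirs, files in os.walk(path):
            for name in files:
                found.append(os.path.join(root, name))
        # Sort so the processing order is the same on every run.
        return sorted(found)
    return [path]
```

Because every file under the directory is picked up, the directory should contain only CSV files that share the entity's header row.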

Example:

{
    "files": [
        {
            "entity": "leads",
            "file": "/path/to/leads.csv",
            "keys": ["Id"]
        },
        {
            "entity": "opportunities",
            "file": "/path/to/opportunities.csv",
            "keys": ["Id"],
            "columns": {
                "Id": "integer",
                "name": "string"
            }
        }
    ]
}
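To show how one such entry might map onto Singer output, here is a hedged sketch that turns a config entry and some CSV text into SCHEMA and RECORD messages. The message fields (type, stream, schema, key_properties, record) follow the Singer specification; the function singer_messages, the sample data, and the casting rule are assumptions for illustration, not this tap's actual code.

```python
import csv
import io
import json


def singer_messages(entry, csv_text):
    """Yield one SCHEMA message, then one RECORD message per CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    declared = entry.get("columns", {})
    # Columns without a declared type fall back to "string".
    types = {name: declared.get(name, "string") for name in reader.fieldnames or []}
    yield {
        "type": "SCHEMA",
        "stream": entry["entity"],
        "schema": {
            "type": "object",
            "properties": {name: {"type": t} for name, t in types.items()},
        },
        "key_properties": entry["keys"],
    }
    for row in reader:
        record = {
            # Only "integer" is cast in this sketch; everything else stays a string.
            name: int(row[name]) if t == "integer" else row[name]
            for name, t in types.items()
        }
        yield {"type": "RECORD", "stream": entry["entity"], "record": record}


entry = {
    "entity": "opportunities",
    "keys": ["Id"],
    "columns": {"Id": "integer", "name": "string"},
}
messages = list(singer_messages(entry, "Id,name\n1,Acme\n2,Globex\n"))
for message in messages:
    # Singer messages are written one JSON object per line.
    print(json.dumps(message))
```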

state.json is a file containing only the value of the last state message. It is moot for this tap, because each file is processed only once.
