Data Import
You can currently provide data to UpSet as a publicly available data file plus a simple description file, which must also be publicly available.
UpSet uses a binary encoding for the sets. Here is a simple example, with Sets A, B and C, and three elements in the rows (R1, R2, R3):
Row;A;B;C
R1;1;0;0
R2;0;1;0
R3;0;0;1
You can download this file here.
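To make the binary encoding concrete, here is a minimal Python sketch that reads the example above and lists the members of each set. It is not part of UpSet: the function name `read_binary_sets` is hypothetical, the data is inlined rather than downloaded, and the sketch assumes the first row is the header and the first column holds the element identifiers.

```python
import csv
import io

# The example table from above, inlined so the sketch is self-contained.
EXAMPLE = """Row;A;B;C
R1;1;0;0
R2;0;1;0
R3;0;0;1
"""

def read_binary_sets(text, separator=";"):
    """Return {set name: [element ids]} for a binary-encoded table.

    Assumes row 0 is the header and column 0 holds the element ids.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    header, body = rows[0], rows[1:]
    members = {name: [] for name in header[1:]}
    for row in body:
        element_id = row[0]
        for name, cell in zip(header[1:], row[1:]):
            if cell.strip() == "1":  # 1 means "element is in this set"
                members[name].append(element_id)
    return members

print(read_binary_sets(EXAMPLE))
```

Each element contributes one row, and a 1 in a set's column marks membership, so an element that belongs to several sets simply has several 1s in its row.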
To make UpSet understand this data format, you have to provide a simple JSON description file. The description file for the above dataset looks like this:
{
  "file": "http://vcg.github.io/upset/data/test/test.csv",
  "name": "Test",
  "header": 0,
  "separator": ";",
  "skip": 0,
  "meta": [
    { "type": "id", "index": 0, "name": "Name" }
  ],
  "sets": [
    { "format": "binary", "start": 1, "end": 3 }
  ]
}
You can download this file here.
The properties in the description file are the following:
- file describes the path to the data file. This path should typically be a globally accessible URL, unless you run UpSet locally, in which case you can use relative paths.
- name is a custom name that you can give to your dataset, as it will appear in UpSet.
- header defines the row in the dataset where your column IDs (the sets and the attributes) are stored.
- separator defines which symbol is used to separate the cells in the matrix. Common separators are semicolon (;), comma (,), and tab (\t).
- skip is currently not in use but will provide the ability to skip rows at the beginning of the file.
- meta is an array of metadata that specifies the id column and attribute columns. The above example defines the first column in the file (the column with index 0) to contain the identifiers for the elements. The name of the identifiers is "Name". meta is also used for attributes, discussed later.
- sets defines the sets in the dataset. It is specified as an array to allow multiple ranges of sets within a file. Here only one range of sets is defined, from the start index 1 (the second column) to the end index 3 (the fourth column).
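The way these properties fit together can be sketched in Python. This is an illustration of the description format as documented here, not UpSet's actual loader: the function names `set_columns` and `apply_description` are hypothetical, the CSV text is inlined instead of being fetched from `file`, and the sketch assumes that all rows after the header row are data and that `end` is inclusive, as in the example above.

```python
import csv
import io
import json

# The description file from above.
CONFIG = json.loads("""
{
  "file": "http://vcg.github.io/upset/data/test/test.csv",
  "name": "Test",
  "header": 0,
  "separator": ";",
  "skip": 0,
  "meta": [{ "type": "id", "index": 0, "name": "Name" }],
  "sets": [{ "format": "binary", "start": 1, "end": 3 }]
}
""")

# Inlined copy of the example data (assumption: normally loaded from "file").
CSV_TEXT = "Row;A;B;C\nR1;1;0;0\nR2;0;1;0\nR3;0;0;1\n"

def set_columns(config):
    """Expand every start/end range in "sets" into column indices (end inclusive)."""
    columns = []
    for block in config["sets"]:
        columns.extend(range(block["start"], block["end"] + 1))
    return columns

def apply_description(config, text):
    """Return {set name: [element ids]} using the description's indices."""
    rows = list(csv.reader(io.StringIO(text), delimiter=config["separator"]))
    header = rows[config["header"]]
    body = rows[config["header"] + 1:]
    # The meta entry of type "id" names the column with the element identifiers.
    id_index = next(m["index"] for m in config["meta"] if m["type"] == "id")
    columns = set_columns(config)
    members = {header[c]: [] for c in columns}
    for row in body:
        for c in columns:
            if row[c] == "1":
                members[header[c]].append(row[id_index])
    return members

print(set_columns(CONFIG))
print(apply_description(CONFIG, CSV_TEXT))
```

Because `sets` is an array, a file whose set columns are interrupted by attribute columns can list several ranges, and `set_columns` simply concatenates them.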
UpSet can visualize attributes in addition to sets. Here is an example of a dataset with both sets and attributes:
Name;ReleaseDate;Action;Adventure;Children;Comedy;Crime;Documentary;Drama;Fantasy;Noir;Horror;Musical;Mystery;Romance;SciFi;Thriller;War;Western;AvgRating;Watches
Toy Story (1995);1995;0;0;1;1;0;0;0;0;0;0;0;0;0;0;0;0;0;4.15;2077
Jumanji (1995);1995;0;1;1;0;0;0;0;1;0;0;0;0;0;0;0;0;0;3.2;701
Grumpier Old Men (1995);1995;0;0;0;1;0;0;0;0;0;0;0;0;1;0;0;0;0;3.02;478
Waiting to Exhale (1995);1995;0;0;0;1;0;0;1;0;0;0;0;0;0;0;0;0;0;2.73;170
Father of the Bride Part II (1995);1995;0;0;0;1;0;0;0;0;0;0;0;0;0;0;0;0;0;3.01;296
{
  "file": "https://dl.dropboxusercontent.com/u/36962787/UpSet/movies.csv",
  "name": "Online Movies Genres ",
  "header": 0,
  "separator": ";",
  "skip": 0,
  "meta": [
    { "type": "id", "index": 0, "name": "Name" },
    { "type": "integer", "index": 1, "name": "Release Date" },
    { "type": "float", "index": 19, "name": "Average Rating", "min": 1, "max": 5 },
    { "type": "integer", "index": 20, "name": "Times Watched" }
  ],
  "sets": [
    { "format": "binary", "start": 2, "end": 18 }
  ]
}
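As a rough illustration of how the `meta` entries map columns to typed attributes, the following Python sketch reads two rows of the movies example. It is not UpSet's own code: `read_attributes` and the `CONVERTERS` table are hypothetical, and the mapping of the type strings "id", "integer", and "float" to Python converters is an assumption. The `min` and `max` fields are carried in the description but not interpreted here.

```python
import csv
import io

# Header and two rows copied from the movies example above.
MOVIES = (
    "Name;ReleaseDate;Action;Adventure;Children;Comedy;Crime;Documentary;Drama;"
    "Fantasy;Noir;Horror;Musical;Mystery;Romance;SciFi;Thriller;War;Western;"
    "AvgRating;Watches\n"
    "Toy Story (1995);1995;"
    "0;0;1;1;0;0;0;0;0;0;0;0;0;0;0;0;0;"   # 17 genre flags (columns 2..18)
    "4.15;2077\n"
    "Jumanji (1995);1995;"
    "0;1;1;0;0;0;0;1;0;0;0;0;0;0;0;0;0;"   # 17 genre flags (columns 2..18)
    "3.2;701\n"
)

# The "meta" array from the description file above.
META = [
    {"type": "id", "index": 0, "name": "Name"},
    {"type": "integer", "index": 1, "name": "Release Date"},
    {"type": "float", "index": 19, "name": "Average Rating", "min": 1, "max": 5},
    {"type": "integer", "index": 20, "name": "Times Watched"},
]

# Assumed mapping from the description's type strings to Python converters.
CONVERTERS = {"id": str, "integer": int, "float": float}

def read_attributes(text, meta, separator=";", header=0):
    """Return one {attribute name: typed value} record per data row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    records = []
    for row in rows[header + 1:]:
        record = {}
        for entry in meta:
            convert = CONVERTERS[entry["type"]]
            record[entry["name"]] = convert(row[entry["index"]])
        records.append(record)
    return records

for record in read_attributes(MOVIES, META):
    print(record)
```

Note that the attribute names shown to the user come from the `name` field of each `meta` entry, not from the CSV header, which is why "AvgRating" appears as "Average Rating".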
Specimen,IO,PG,SUV,PB,RN,PRN,PAG,SPVi,LRN,GRN,IRN,V,III,XII,MRN,ECU,IntAProp,IcgsProp,IntPProp,LNProp,MNProp,PBProp,SUVProp,YProp,LAVProp
Sp1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1.0,0,0,0,0,0,0,0,0
Sp2,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0.84,0.15,0,0.0047,0,0,0,0,0
Sp3,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0.82,0.13,0.011,0,0.0033,0.0050,0.03,0,0
Sp4,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.68,0.13,0.012,0.013,0,0.028,0.077,0.021,0.043
Sp5,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.63,0.16,0.14,0,0.015,0,0.055,0,0
Sp6,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0.59,0.32,0.0095,0.0063,0.016,0.0016,0.057,0.0032,0
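A description file for a comma-separated dataset like this one could be drafted with a small helper. The Python sketch below is only a suggestion under stated assumptions: `build_description` is a hypothetical function, the file path `path/to/specimens.csv` and the dataset name `Specimens` are placeholders, the 16 columns from IO to ECU are taken to be binary sets, and every column after them is assumed to be a float attribute.

```python
import json

# Header row of the dataset above.
HEADER = ("Specimen,IO,PG,SUV,PB,RN,PRN,PAG,SPVi,LRN,GRN,IRN,V,III,XII,MRN,ECU,"
          "IntAProp,IcgsProp,IntPProp,LNProp,MNProp,PBProp,SUVProp,YProp,LAVProp")

def build_description(header_line, separator, first_set, last_set, file, name):
    """Draft a description: column 0 is the id, columns first_set..last_set
    are binary sets, and all later columns are treated as float attributes."""
    columns = header_line.split(separator)
    meta = [{"type": "id", "index": 0, "name": columns[0]}]
    for index in range(last_set + 1, len(columns)):
        meta.append({"type": "float", "index": index, "name": columns[index]})
    return {
        "file": file,
        "name": name,
        "header": 0,
        "separator": separator,
        "skip": 0,
        "meta": meta,
        "sets": [{"format": "binary", "start": first_set, "end": last_set}],
    }

# Placeholder path and name; replace them with the real location of the data.
description = build_description(HEADER, ",", 1, 16, "path/to/specimens.csv", "Specimens")
print(json.dumps(description, indent=2))
```

The printed JSON would still need to be reviewed by hand, for example to give the attributes friendlier names or to change a type where a column is not a float.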