Description
This is a precursor to
and is mentioned throughout the existing issues and PRs, e.g.
and would be required to facilitate
- BIDS 2̶.̶0̶1.0: flex BIDS layout (bids-2-devel/issues/54) #1809 (for Make it possible to specify folders layout to be other than sub-{label}/[ses-{label}/] bids-2-devel#54)
- Formalize "summaries" over BIDS concepts (suffixes, entities, datatypes, ...) on top level bids-2-devel#94
- Formalize "life-cycle"/movement and placement of metadata bids-2-devel#85
- Formalize specification of shape(s) (AKA contour(s)) #2013
but I failed to find a dedicated issue.
Current situation/issue
We have an expanding list of already defined .tsv files summarized below (including `_scans.tsv` which is not per-entity per se but close in spirit):
Table 1: Entity-level TSV files ({entity-plural}.tsv pattern)
| File Pattern | Entity | Location | Column 1 | Column 2 | Column 3 | Column 4 | Issues etc |
|---|---|---|---|---|---|---|---|
| `participants.tsv` | subject (`sub-<label>`) | Dataset root | `participant_id` | `species` (R) | `age` (R) | `sex` (R) | bids-2-devel#14 |
| `samples.tsv` | sample (`sample-<label>`) | Dataset root | `sample_id` | `participant_id` | `sample_type` | `pathology` (R) | Composite index (`sample_id` + `participant_id`) |
| `sub-<label>_sessions.tsv` | session (`ses-<label>`) | Subject folder | `session_id` | `acq_time` (O) | `pathology` (R) | `HED` (O) | |
| `phenotype/<name>.tsv` | subject (per assessment) | `phenotype/` | `participant_id` | `HED` (O) | | | |
| `[sub-<label>_][ses-<label>_]descriptions.tsv` | description (`desc-<label>`) | Derivatives (root/sub/ses) | `desc_id` | `description` | | | #2281 |
| `sub-<label>[_ses-<label>]_scans.tsv` | scan (data files) | Subject/session folder | `filename` | `acq_time` (O) | `HED` (O) | | |

Notes:
- (R) = Recommended, (O) = Optional, unmarked = Required
and those are not to be "conflated" (at the moment at least) with data type files like _electrodes.tsv in iEEG etc
Table 2: Internal construct TSV files (non-entity)
which are somewhat related but already inconsistent, as they
- use `name`, not some `id`
- to avoid a composite index, like we need for bep032, use a composite of some `{location}{index}` within the `name` column
click to expand -- not primary target for this issue
| File Pattern | Describes | Column 1 | Col 1 Example | Column 2 | Column 3 | Column 4 | Issues etc |
|---|---|---|---|---|---|---|---|
| `*_channels.tsv` (EEG) | Recording channels | `name` | VEOG | `type` | `units` | `description` (O) | |
| `*_channels.tsv` (MEG) | Recording channels | `name` | VEOG | `type` | `units` | `description` (O) | |
| `*_channels.tsv` (EMG) | Recording channels | `name` | | `type` | `units` | `description` (O) | |
| `*_channels.tsv` (iEEG) | Recording channels | `name` | LT01 | `type` | `units` | `low_cutoff` | |
| `*_channels.tsv` (NIRS) | Recording channels | `name` | S1-D1 | `type` | `source` | `detector` | |
| `*_channels.tsv` (Motion) | Recording channels | `name` | t1_acc_x | `component` | `type` | `tracked_point` | |
| `*_electrodes.tsv` (EEG) | Electrode positions | `name` | Cz | `x` | `y` | `z` | |
| `*_electrodes.tsv` (iEEG) | Electrode positions | `name` | LT01 | `x` | `y` | `z` | |
| `*_electrodes.tsv` (EMG) | Electrode positions | `name` | | `x` | `y` | `z` (O) | |
| `*_optodes.tsv` (NIRS) | Optode positions | `name` | A1 | `type` | `x` | `y` | |
| `*_events.tsv` | Events/stimuli | `onset` | 1.2 | `duration` | `trial_type` (O) | `response_time` (O) | |
| `*_beh.tsv` | Behavioral data | `trial_type` (O) | congruent | `response_time` (O) | `HED` (O) | `stim_file` (O) | |

Notes:
- (R) = Recommended, (O) = Optional, unmarked = Required
and we lack information in
- BEP Guidelines on the construction of such files generally: that the leading column should be `{entity}_id`, and that they could be added for pretty much any entity at the appropriate level in the hierarchy
  - TODO: file a clarification PR there
- BIDS Common principles, to expect such files for (ATM only some) entities to succinctly provide metadata specific to each `{entity}_id` (e.g. instead of duplicating it in individual data `.json` files)
  - TODO: file a clarification PR against Common principles to describe such .tsv files and their purpose generally in the Tabular files section.
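
To make the "leading column" expectation concrete, here is a minimal sketch of how it could be checked mechanically. It assumes only the leading columns listed in Table 1; the `EXPECTED_INDEX` mapping and the helper names are hypothetical and are not part of BIDS or of any existing validator.

```python
import csv
import io

# Leading (index) column per file-name suffix, copied from Table 1.
# Illustrative only: this mapping is an assumption, not part of the BIDS spec.
EXPECTED_INDEX = {
    "participants.tsv": "participant_id",
    "samples.tsv": "sample_id",
    "sessions.tsv": "session_id",
    "descriptions.tsv": "desc_id",
    "scans.tsv": "filename",
}


def expected_index_column(filename):
    """Return the expected leading column for a file name, or None if unknown."""
    # Drop leading entities, e.g. "sub-01_sessions.tsv" -> "sessions.tsv".
    suffix = filename.rsplit("_", 1)[-1]
    return EXPECTED_INDEX.get(suffix)


def leading_column(tsv_text):
    """Return the name of the first column in the header row of a TSV document."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    return header[0]


def follows_convention(filename, tsv_text):
    """True if the first column of the file matches the expected index column."""
    expected = expected_index_column(filename)
    return expected is not None and leading_column(tsv_text) == expected
```

The mapping is spelled out rather than derived from the entity name because, per Table 1, `participant_id` (entity `sub`) and `filename` (scans) already do not follow a literal `{entity}_id` pattern.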
As a result, BEPs now
- come up with composite indexes (like the one we already got for `samples.tsv`)
- introduce new ad-hoc filenames/approaches (see below on `atlas-<label>_description.json`)
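
To illustrate what a composite index means in practice, here is a small sketch with made-up `samples.tsv` content (the rows and the `duplicated_keys` helper are hypothetical): `sample_id` alone repeats across participants, and only the pair (`sample_id`, `participant_id`) identifies a row.

```python
import csv
import io
from collections import Counter

# Made-up content for illustration; not taken from any real dataset.
SAMPLES_TSV = (
    "sample_id\tparticipant_id\tsample_type\n"
    "sample-01\tsub-01\ttissue\n"
    "sample-01\tsub-02\ttissue\n"
    "sample-02\tsub-01\tcell line\n"
)


def duplicated_keys(tsv_text, key_columns):
    """Return the key tuples that occur more than once in the table."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# The single column is not unique, the composite of both columns is.
print(duplicated_keys(SAMPLES_TSV, ["sample_id"]))                    # [('sample-01',)]
print(duplicated_keys(SAMPLES_TSV, ["sample_id", "participant_id"]))  # []
```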
Alternative/complementary solutions proposed
{entity}_description.json (atlas-<label>_description.json) in BEP-038 Atlases
I guess it was largely motivated by the fact that .tsv is "flat", and embedding nested structures, e.g. an "Authors" list, is tricky and not .tsv-friendly. But I think that overall we might indeed want to formalize some format like
- JSON Lines `.jsonl` - https://jsonlines.org/
- ... or some other, like a `.dict.json`, which would be a simple JSON with keys being what is ATM the index of the .tsv, or a `.list.json`, which would be close to the .jsonl above

without introducing a proliferation of the `_description` suffix, and to be used interchangeably with any .tsv? IMHO worth a dedicated issue/discussion.
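
As a rough sketch of how the three layouts could look for the same records: the record content, the `atlas_id` column name, and the exact shape of `.dict.json` / `.list.json` are assumptions made for illustration, since none of them is defined anywhere yet. Only the JSON Lines layout (one JSON object per line) follows an existing convention.

```python
import json

# Hypothetical records: one per atlas label, each with a nested "Authors"
# list that would not fit into a single flat .tsv cell.
RECORDS = [
    {"atlas_id": "atlas-A", "Authors": ["First Author", "Second Author"]},
    {"atlas_id": "atlas-B", "Authors": ["Third Author"]},
]


def to_jsonl(records):
    """JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records) + "\n"


def to_dict_json(records, index):
    """A single JSON object keyed by the index column (the '.dict.json' idea)."""
    keyed = {
        r[index]: {k: v for k, v in r.items() if k != index} for r in records
    }
    return json.dumps(keyed, indent=2)


def to_list_json(records):
    """A single JSON array of objects (the '.list.json' idea)."""
    return json.dumps(records, indent=2)


print(to_jsonl(RECORDS), end="")
print(to_dict_json(RECORDS, "atlas_id"))
```

In this sketch the `.dict.json` form drops the index column from each value because it already serves as the key, while the `.jsonl` and `.list.json` forms keep it inside every record.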