    "_description": "This section documents how the key-value pairs in this file are organized and how they support creating the files described at https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats as part of the cBioPortal ETL",
    "sections": {
      "merged_x": {
        "_description": "Support information for merged genomic data and supporting meta files.",
        "dir": "Output directory for merged results. Should match the output dir specified by the ETL conversion script at run time",
        "dtypes": {
          "_description": "Data types - cBio-defined data types. _comment has a link to detailed specifics in each section.",
          "ext": "File extension of merged outputs from the ETL script",
          "cbio_name": "cBio output file name - a soft link to the ETL output, created inside the study directory",
          "meta_file_attributes": "Direct key-value pairs used by cBio in a meta_x file, which describes the data_x files"
        }
      },
      "study": "Special meta file used to describe the cBio study",
      "case_x": "cBio case lists - sample lists describing which samples have mutation data, sv data, cnv data, etc. See https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#case-lists for specifics",
      "data_sheets": "Clinical patient and sample data, as well as the gene matrix if panel data is present. Distinct from other data_x files in that it contains sample and patient metadata and not genomic data",
      "database_pulls": {
        "_description": "This section is used to support pulling auto-generated clinical data tables and supporting genomics ETL information from the D3b Data Warehouse",
        "manifests": {
          "<manifest descriptor>": {
            "_description": "The key for this field is meant to be a convenient descriptor of what the sub-study files derive from, as a cBio study may come from many sources",
            "table": "D3b warehouse table name with relevant file info",
            "file_types": "Manifests typically contain all possible harmonization outputs. Specifying file_type(s) limits the pull to relevant outputs. The exception is annotated_public_output: the ETL will pull only the maf, as vcfs are included in that query.",
            "out_file": "Desired output file name"
          }
        },
        "x_head": "Special header file table for data_clinical (sample/patient). cBio data_clinical files have 5 header rows; the x_file table determines which columns are used",
        "x_file": "Sample or patient tables with corresponding metadata at the sample and patient levels",
        "genomics_etl": "A helper file with relevant cBio sample names and individual genomic file names for ETL merging",
        "seq_center": "Only if the project has RNA data, a helper file to fill in missing sequencing center information for the genomics ETL",
        "gene_file": "Only if the study has panel data, the source information for the gene matrix in the data_sheets section"
      }
    }
  },
  "merged_mafs": {
    "_comment": "See https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#mutation-data for detailed specifics",
    "dir": "merged_mafs",
    "dtypes": {
      "mutation": {
        "ext": "maf",
        "cbio_name": "data_mutations_extended.txt",
        "meta_file_attr": {
          "stable_id": "mutations",
          "profile_name": "Mutations",
          "profile_description": "For matched T/N sample: consensus calls from strelka2, mutect2, lancet, and VarDict Java. Two or more callers required to pass, < 0.001 frequency in gnomAD, and min read depth 8 in normal sample, unless in a TERT promoter region or in a hotspot region (see https://www.cancerhotspots.org). For tumor/model-only, mutect2 calls with < 0.001 frequency in gnomAD, unless in a hotspot region",

        "_comment": "see https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#continuous-copy-number-data for detailed specifics",
        "ext": "predicted_cnv.txt",
        "cbio_name": "data_linear_CNA.txt",
        "meta_file_attr": {
          "stable_id": "linear_CNA",
          "profile_name": "copy-number values",
          "profile_description": "Predicted copy number values from WGS (Continuous). Copy number calls obtained using ControlFreeC, filtering calls smaller than 50KB",

        "_comment": "see https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#example-clinical-header for detailed specifics",
        "cbio_name": "data_clinical_patient.txt",
        "meta_file_attr": {
          "genetic_alteration_type": "CLINICAL",
          "datatype": "PATIENT_ATTRIBUTES"
        }
      },
      "sample": {
        "_comment": "see https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#clinical-sample-columns for detailed specifics",
        "cbio_name": "data_clinical_sample.txt",
        "meta_file_attr": {
          "genetic_alteration_type": "CLINICAL",
          "datatype": "SAMPLE_ATTRIBUTES"
        }
      }
    }
  },
  "study": {
    "_comment": "see https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#cancer-study for detailed specifics",
    "description": "Children with Down syndrome (DS), which occurs due to trisomy 21, have a 2000-fold increased risk of atrioventricular septal defects (AVSD) and a 20-fold increased risk of acute lymphoblastic leukemia (ALL), but it is not understood which genetic features of trisomy 21 are responsible for the increased risk. The objectives of this study are to determine the genetic variants underlying AVSD and ALL risk in children with DS, which builds upon our previous work suggesting having an extra copy of chromosome 21 may \"move\" the susceptibility threshold for disease in these children. Insights into the genes that drive DS-AVSD and DS-ALL may have implications for improved genetic counseling, surveillance, clinical management, and treatment strategies for these and other children who may develop AVSD or ALL. For updates, please see here: <a href=\"https://tinyurl.com/55cxz9am\">Release Notes</a>",

    "short_name": "Genomic Analysis of CHD and ALL in Children with Down Syndrome",
    "reference_genome": "hg38",
    "display_name": "Genomic Analysis of Congenital Heart Defects and Acute Lymphoblastic Leukemia in Children with Down Syndrome (Kids First, Provisional)"
  },
  "cases_3way_complete": {
    "stable_id": "3way_complete",
    "case_list_name": "Tumor and model samples with mutation, CNA and mRNA data",
    "case_list_description": "All tumor and model samples with mutation, CNA, and mRNA data",
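The config describes `meta_file_attr` blocks as direct key-value pairs for a meta_x file, and the `cases_*` blocks carry the fields of a case list. A minimal sketch of how such blocks might be turned into cBioPortal-style `key: value` text is shown below. The function names, the `example_study` identifier, and the sample IDs are hypothetical and are not taken from the ETL scripts. Prefixing the case list `stable_id` with the study identifier is an assumption about how the ETL satisfies the cBioPortal naming rule.

```python
def render_key_value_file(fields):
    """Serialize a dict as 'key: value' lines, one per entry, in insertion order."""
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


def render_meta_file(study_id, meta_file_attr, data_filename):
    """Build meta file text from a meta_file_attr block (hypothetical helper).

    The study identifier goes first and the data file name last; everything
    in between is copied straight from the config block.
    """
    fields = {"cancer_study_identifier": study_id}
    fields.update(meta_file_attr)
    fields["data_filename"] = data_filename
    return render_key_value_file(fields)


def render_case_list(study_id, case_cfg, sample_ids):
    """Build case list text from a cases_* block (hypothetical helper).

    Assumption: the stable_id in the config is a suffix, and the study
    identifier plus an underscore is prepended when the file is written.
    """
    fields = {
        "cancer_study_identifier": study_id,
        "stable_id": f"{study_id}_{case_cfg['stable_id']}",
        "case_list_name": case_cfg["case_list_name"],
        "case_list_description": case_cfg["case_list_description"],
        # Sample IDs are written as a single tab-separated value.
        "case_list_ids": "\t".join(sample_ids),
    }
    return render_key_value_file(fields)
```

For example, `render_meta_file("example_study", {"genetic_alteration_type": "CLINICAL", "datatype": "SAMPLE_ATTRIBUTES"}, "data_clinical_sample.txt")` would yield a four-line meta file for the clinical sample sheet. Keeping the serialization in one small function means the meta files and case lists cannot drift apart in format.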