COLLABORATIONS/openPBTA/openpbta_case_meta_config.json
1 addition & 1 deletion
@@ -112,7 +112,7 @@
112
112
}
113
113
},
114
114
"study": {
115
-
"description": "The Open Pediatric Brain Tumor Atlas (OpenPBTA) Project is a global open science initiative led by <a href=\"https://www.ccdatalab.org/\">Alex's Lemonade Stand Childhood Cancer Data Lab (CCDL)</a> and <a href=\"https://www.chop.edu/\">Children's Hospital of Philadelphia's</a> <a href=\"https://d3b.center/\">Center for Data-Driven Discovery</a> to comprehensively define the molecular landscape of tumors of 943 patients from the <a href=\"http://cbtn.org\">Children's Brain Tumor Network</a> and the <a href=\"http://www.pnoc.us/\">Pacific Pediatric Neuro-oncology Consortium</a> through real-time, <a href=\"https://github.com/AlexsLemonade/OpenPBTA-analysis\">collaborative analyses</a> and <a href=\"https://github.com/AlexsLemonade/OpenPBTA-manuscript\"> collaborative manuscript writing</a> on GitHub. The study loaded matches that of v22. For updates, please see here: <a href=\"https://tinyurl.com/55cxz9am\">Release Notes</a>",
115
+
"description": "The Open Pediatric Brain Tumor Atlas (OpenPBTA) Project is a global open science initiative led by <a href=\"https://www.ccdatalab.org/\">Alex's Lemonade Stand Childhood Cancer Data Lab (CCDL)</a> and <a href=\"https://www.chop.edu/\">Children's Hospital of Philadelphia's</a> <a href=\"https://d3b.center/\">Center for Data-Driven Discovery</a> to comprehensively define the molecular landscape of tumors of 943 patients from the <a href=\"http://cbtn.org\">Children's Brain Tumor Network</a> and the <a href=\"http://www.pnoc.us/\">Pacific Pediatric Neuro-oncology Consortium</a> through real-time, <a href=\"https://github.com/AlexsLemonade/OpenPBTA-analysis\">collaborative analyses</a> and <a href=\"https://github.com/AlexsLemonade/OpenPBTA-manuscript\"> collaborative manuscript writing</a> on GitHub. The study loaded matches that of v23. For updates, please see here: <a href=\"https://tinyurl.com/55cxz9am\">Release Notes</a>",
+ `chopaws` https://github.research.chop.edu/devops/aws-auth-cli needed for SAML key generation for S3 upload
77
-
+ access to https://aws-infra-jenkins-service.kf-strides.org to start cbio load into QA and/or prod using the `d3b-center-aws-infra-pedcbioportal-import` task
82
+
+ access to https://github.com/d3b-center/aws-infra-pedcbioportal-import repo. To start a load job:
83
+
+ Create a branch and edit the `import_studies.txt` file with the study name you wish to load. This can be an MSKCC datahub link or a local study name
84
+
+ Push the branch to remote - this will kick off a github action to load into QA
85
+
+ To load into prod, make a PR. On merge, load to prod will kick off
86
+
+ The AWS `stateMachinePedcbioImportservice` Step Functions state machine is used to view and manage running jobs
87
+
+ To repeat a load, click on the ▶️ icon in the git repo to select the job you want to re-run
78
88
+ Access to the `postgres` D3b Warehouse database at `d3b-warehouse-aurora-prd.d3b.io`. Need at least read access to tables with the `bix_workflows` schema
79
89
+[cbioportal git repo](https://github.com/cBioPortal/cbioportal) needed to validate the final study output
80
90
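The branch-based load steps listed above could be scripted roughly as follows. This is a minimal sketch, not the repo's own tooling: it assumes `import_studies.txt` holds one study name or datahub link per line (the README does not state the layout), and `add_study` and `load_job_commands` are hypothetical helper names. The git commands are only built as argument lists, not run.

```python
from pathlib import Path


def add_study(import_file: Path, study: str) -> bool:
    """Append `study` to the import list unless it is already present.

    Assumes one study name (or datahub link) per line; that layout is an
    assumption, not something the README states.
    """
    lines = import_file.read_text().splitlines() if import_file.exists() else []
    if study in (line.strip() for line in lines):
        return False  # already listed, nothing to change
    lines.append(study)
    import_file.write_text("\n".join(lines) + "\n")
    return True


def load_job_commands(branch: str) -> list[list[str]]:
    """Git commands for a QA load: create a branch, commit the edit, push."""
    return [
        ["git", "checkout", "-b", branch],
        ["git", "commit", "-am", "Update import_studies.txt"],
        ["git", "push", "-u", "origin", branch],
    ]
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` from inside a clone of the import repo. Per the steps above, pushing the branch triggers the QA load and merging a PR triggers the prod load.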
@@ -112,6 +122,7 @@ Seemingly redundant, this file contains the file locations, BS IDs, file type, a
112
122
It helps simplify the process to integrate better into the downstream tools.
113
123
This is the file that goes in as the `-t` arg in all the data collating tools
114
124
#### - Sequencing center info resource file
125
+
DEPRECATED and will be removed from future releases
115
126
This is a simple file with BS IDs and sequencing center IDs and locations.
116
127
It is necessary to patch in a required field for the fusion data
117
128
#### - Data gene matrix - *OPTIONAL*
@@ -211,7 +222,7 @@ optional arguments:
211
222
Check the pipeline log output for any errors that might have occurred.
212
223
213
224
## Upload the final packages
214
-
Upload all of the directories named as study short names to `s3://kf-cbioportal-studies/public/`. You may need to set and/or copy aws your saml key before uploading. Next, edit the file in that bucket called `importStudies.txt` located at `s3://kf-cbioportal-studies/public/importStudies.txt`, with the names of all of the studies you wish to updated/upload. Lastly, go to https://jenkins.kids-first.io/job/d3b-center-aws-infra-pedcbioportal-import/job/master/, click on build. At the `Promotion kf-aws-infra-pedcbioportal-import-asg to QA` and `Promotion kf-aws-infra-pedcbioportal-import-asg to PRD`, the process will pause, click on the box below it to affirm that you want these changes deployed to QA and/or PROD respectively. If both, you will have to wait for the QA job to finish first before you get the prompt for PROD.
225
+
Upload all of the directories named as study short names to `s3://kf-cbioportal-studies/public/`. You may need to set and/or copy your AWS SAML key before uploading. Next, edit the file in that bucket called `importStudies.txt`, located at `s3://kf-cbioportal-studies/public/importStudies.txt`, with the names of all of the studies you wish to update/upload. Lastly, follow the directions referenced in [Software Prerequisites](#software-prerequisites) to load the study.
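The upload step can be sketched as a list of AWS CLI calls. This is a hedged illustration only: `upload_commands` is a hypothetical helper, it assumes the study directories and an edited `importStudies.txt` sit in the current working directory, and it assumes valid AWS credentials (for example from the SAML key) are already in place. It builds the commands without running them.

```python
from pathlib import Path

# Bucket prefix taken from the instructions above.
S3_PREFIX = "s3://kf-cbioportal-studies/public/"


def upload_commands(study_dirs: list[str]) -> list[list[str]]:
    """Build `aws s3 cp` calls for each study directory plus the import list."""
    cmds = []
    for study_dir in study_dirs:
        short_name = Path(study_dir).name  # directory name doubles as study short name
        cmds.append(
            ["aws", "s3", "cp", study_dir, f"{S3_PREFIX}{short_name}/", "--recursive"]
        )
    # Assumes importStudies.txt was downloaded and edited locally first.
    cmds.append(
        ["aws", "s3", "cp", "importStudies.txt", f"{S3_PREFIX}importStudies.txt"]
    )
    return cmds
```

Printing the commands first and reviewing them before passing each to `subprocess.run(cmd, check=True)` is a cheap safeguard against uploading to the wrong prefix.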
215
226
## Congratulations, you did it!
216
227
217
228
# Collaborative and Publication Workflows
@@ -220,7 +231,7 @@ These are highly specialized cases in which all or most of the data come from a
220
231
## OpenTargets
221
232
This project is organized much like OpenPBTA in which all genomics data for each assay-type are collated into one giant table.
222
233
In general, this fits cBioPortal well.
223
-
Input files mostly come from a "subdirectory" from within `s3://kf-openaccess-us-east-1-prd-pbta/`, consisting of:
234
+
Input files mostly come from a "subdirectory" from within `s3://d3b-openaccess-us-east-1-prd-pbta/open-targets/`, consisting of:
224
235
- `histologies.tsv`
225
236
- `snv-consensus-plus-hotspots.maf.tsv.gz`
226
237
- `consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz`
@@ -270,7 +281,7 @@ To create the histologies file, recommended method is to:
270
281
1. Run `Rscript --vanilla pedcbio_sample_name_col.R --hist_dir path-to-hist-dir`. The histologies file must be named `histologies.tsv`; rename the file or create a symlink if needed. Results will be in `results` as `histologies-formatted-id-added.tsv`
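If the histologies file has a different name, the symlink mentioned in the step above can be created before calling the script. A minimal sketch, assuming the differently named file lives inside the histologies directory; `ensure_histologies_link` and `rscript_command` are hypothetical helpers, and only the `Rscript` arguments shown in the step are used.

```python
import os
from pathlib import Path


def ensure_histologies_link(hist_dir: Path, source_name: str) -> Path:
    """Make `hist_dir/histologies.tsv` available, linking to `source_name` if absent."""
    target = hist_dir / "histologies.tsv"
    if not (target.exists() or target.is_symlink()):
        # Relative link, resolved against hist_dir where the link lives.
        os.symlink(source_name, target)
    return target


def rscript_command(hist_dir: Path) -> list[str]:
    """Argument list for the formatting script, as given in the step above."""
    return [
        "Rscript", "--vanilla", "pedcbio_sample_name_col.R",
        "--hist_dir", str(hist_dir),
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the directory that contains `pedcbio_sample_name_col.R`.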
271
282
272
283
### Inputs
273
-
Inputs are located in the old Kids First AWS account (`538745987955`) in this general bucket location: `s3://kf-openaccess-us-east-1-prd-pbta/open-targets/`.
284
+
Inputs are located in the old D3b AWS account (`684194535433`) in this general bucket location: `s3://d3b-openaccess-us-east-1-prd-pbta/open-targets/`.
274
285
Clinical data with cBio names are obtained from the `histologies-formatted-id-added.tsv` file, as noted in [Prep Work section](#prep-work).
STUDY_CONFIGS/case_cptac_meta_config.json
4 additions & 4 deletions
@@ -112,12 +112,12 @@
112
112
}
113
113
},
114
114
"study": {
115
-
"_comment": "If a big study being split into many, make cancer_study_identifer blank, dx will be used",
116
-
"description": ["Genomic characterization through proteimics. Samples provided by the <a href=\"http://CBTTC.org\">Children's Brain Tumor Tissue Consortium</a> and its partners via the <a href=\"http://kidsfirstdrc.org\">Gabriella Miller Kids First Data Resource Center</a>. Updated Februrary 1, 2020 from last load, July 2019"],
115
+
"_comment": "see https://docs.cbioportal.org/5.1-data-loading/data-loading/file-formats#cancer-study for detailed specifics",
116
+
"description": "Genomic characterization through proteimics. Samples provided by the <a href=\"http://CBTTC.org\">Children's Brain Tumor Tissue Consortium</a> and its partners via the <a href=\"http://kidsfirstdrc.org\">Gabriella Miller Kids First Data Resource Center</a>. Updated Februrary 1, 2020 from last load, July 2019",
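The `case_cptac_meta_config.json` change above turns `description` from a one-element list into a plain string. A check along these lines could flag the list form before a load is attempted. This is only a sketch: `check_study_description` is a hypothetical helper, and it takes the `study` object directly because the surrounding JSON nesting is not visible in this hunk.

```python
def check_study_description(study: dict) -> list[str]:
    """Return a list of problems found with the study's `description` field."""
    problems = []
    desc = study.get("description")
    if not isinstance(desc, str):
        # Covers both the old list form and a missing key (None).
        problems.append(
            f"description should be a string, got {type(desc).__name__}"
        )
    elif not desc.strip():
        problems.append("description is empty")
    return problems
```

An empty return value means the field has the string form used after this commit.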