Releases: kids-first/kf-cbioportal-etl
🚀 Automated ETL Workflow with Pip Installation Support
This release introduces a fully refactored ETL pipeline that is now installable via `pip`, streamlining the setup and execution of the workflow.
- Users can easily trigger the ETL process via a single command
- Supports a modular approach, allowing users to run specific steps of the pipeline as needed
- Improved code modularity
What's Changed
- ✏️ Minor Config/Template Updates by @migbro in #70
- converted etl to standalone tool and updated readme.md by @wongjessica93 in #72
New Contributors
- @wongjessica93 made their first contribution in #72
Full Changelog: v1.6.0...v2.0.0
📝 OPC V12 Doc Revision
This is a special case: this branch has code and documentation relevant to the OpenPedCan v12 load, broken by later releases. It serves the niche use case in the event that this specific study needs further revision or is simply referenced for a repeat load.
Full Changelog: 0.8.1...v0.8.2
🗻 CBTN Summit + Chordoma Updates
Config and software updates made to help simplify ETL
- Updated configs to reflect CNV changes
- Updated download script to allow for new `cbio_file_name_id.txt` format that now has `file_id` and `s3_path` so that ...
  genomic file manifests eliminated and folded into `cbio_file_name_id.txt` files so that the command is much simpler. See line 21 in README documentation to see the difference
- Updated maf merge to record sym link errors
- Some QOL formatting updates
What's Changed
Full Changelog: v1.5.0...v1.6.0
🛠️ Modify CNV Filtering and Fix Chordoma Configs
- Provisional study loads of CNV data used a minimum CNV length cutoff of 50kb. This was recently found to be too simplistic a cutoff, as some WXS samples had oncogenes filtered out because the region was < 50kb. Since we currently use ControlFreeC inputs for provisional study loads, we have switched to requiring that both the `WilcoxonRankSumTestPvalue` and `KolmogorovSmirnovPvalue` be less than 0.05 to consider genes in that region worthy of loading.
- Also updated the chordoma study data config files to point to the correct resources for CNV gene naming
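The p-value rule described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual filtering code: the two column names and the 0.05 cutoff come from the release note, while the helper name `passes_cnv_filter`, the dict-per-segment layout, the placeholder gene names, and the choice to drop segments with missing or non-numeric p-values are all assumptions.

```python
P_VALUE_CUTOFF = 0.05  # threshold stated in the release note


def passes_cnv_filter(segment, cutoff=P_VALUE_CUTOFF):
    """Return True only when BOTH p-values are below the cutoff.

    `segment` is a hypothetical dict holding one CNV row. Missing or
    non-numeric p-values are treated as failing (an assumption).
    """
    try:
        wilcoxon = float(segment["WilcoxonRankSumTestPvalue"])
        ks = float(segment["KolmogorovSmirnovPvalue"])
    except (KeyError, TypeError, ValueError):
        return False
    return wilcoxon < cutoff and ks < cutoff


# Placeholder rows for illustration only
segments = [
    {"gene": "GENE_A", "WilcoxonRankSumTestPvalue": "0.001", "KolmogorovSmirnovPvalue": "0.02"},
    {"gene": "GENE_B", "WilcoxonRankSumTestPvalue": "0.001", "KolmogorovSmirnovPvalue": "0.30"},
    {"gene": "GENE_C", "WilcoxonRankSumTestPvalue": "NA", "KolmogorovSmirnovPvalue": "0.01"},
]

kept = [s["gene"] for s in segments if passes_cnv_filter(s)]
print(kept)
```

Unlike the earlier 50kb length cutoff, a rule of this shape never looks at segment length, so a short region can be kept as long as both tests are significant.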
What's Changed
Full Changelog: v1.4.0...v1.5.0
🛠️ Study Updates and Bug Fixes
🤩 Added Treatment Data to `pbta_all`
🚀 OpenPedCan v15 and tll_sd_aq9kvn5p_2019 (Teachey) Added
- OpenPedCan v15 Added, with several bug fixes and adjustments to evolving realities of the project
- Added new KF study and refactored another in terms of config file
What's Changed
- 🛠️ refactor/rename Lupo study to new ID by @migbro in #56
- 🚀 Add OpenPedCan v15 by @migbro in #57
- ✏️ add teachey by @migbro in #59
- ✏️ update openpedcan desc by @migbro in #58
Full Changelog: v1.1.0...v1.2.0
✨ Clinical Data Diff Tool
Added a tool to identify and summarize changes slated to be made to an existing study on the portal, based on the cBio-formatted data_clinical files. It does the following for each of the sample and patient views:
- Creates a list, one entry per line, per ID, per attribute, of what would change if the data were loaded
- Outputs a list of IDs that would be removed from the portal, if any
- Outputs a list of IDs that would be added, if any
- Prints to STDOUT a summary of the number of changes for each attribute type
Also contains study updates
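The diff behavior described above can be sketched roughly as follows. This is not the tool's actual implementation: the function name `diff_clinical`, the `{id: {attribute: value}}` input shape, and the sample values are hypothetical, and the sketch assumes both the portal's current data and the incoming data_clinical file have already been parsed into that shape.

```python
from collections import Counter


def diff_clinical(current, incoming):
    """Compare two {id: {attribute: value}} mappings.

    Returns (changes, removed_ids, added_ids, summary), where `changes`
    holds one (id, attribute, old_value, new_value) tuple per difference.
    """
    changes = []
    # Only IDs present on both sides can have attribute-level changes
    for record_id in sorted(current.keys() & incoming.keys()):
        old, new = current[record_id], incoming[record_id]
        for attr in sorted(old.keys() | new.keys()):
            old_val, new_val = old.get(attr, ""), new.get(attr, "")
            if old_val != new_val:
                changes.append((record_id, attr, old_val, new_val))
    removed = sorted(current.keys() - incoming.keys())
    added = sorted(incoming.keys() - current.keys())
    # Number of changes per attribute type
    summary = Counter(attr for _, attr, _, _ in changes)
    return changes, removed, added, summary


# Hypothetical patient-level data
current = {"P1": {"AGE": "5", "SEX": "F"}, "P2": {"AGE": "7", "SEX": "M"}}
incoming = {"P1": {"AGE": "6", "SEX": "F"}, "P3": {"AGE": "2", "SEX": "M"}}

changes, removed, added, summary = diff_clinical(current, incoming)
for record_id, attr, old_val, new_val in changes:
    print(f"{record_id}\t{attr}\t{old_val}\t{new_val}")
print("Removed:", removed)
print("Added:", added)
print("Summary:", dict(summary))
```

The same function could be run once on sample-level data and once on patient-level data, matching the two views mentioned above.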
What's Changed
Full Changelog: v1.0.1...v1.1.0
🔧 Fix Pandas, Numpy Calls
A recent upgrade of the pandas and numpy versions in use has caused some of the functions being called to become deprecated. This PR fixes those calls and will be followed by an accompanying Docker image and software list.
What's Changed
Full Changelog: v1.0.0...v1.0.1
🧹 Cleanup Legacy
- Minor change to PBTA config to accommodate new `file_type` entries in manifests
- Removed many legacy files to clean up the repo
- Will now make this repo public, as most changes are related to config file updates rather than software
What's Changed
Full Changelog: v0.9.1...v1.0.0