Link to project repo: https://github.com/Locrian24/seng474-term-project
This repo contains the report, poster, and source code for an attempted implementation of the Hubble.2D6 tool based on the original paper. This was a term project for the SENG 474 class at the University of Victoria.
IMPORTANT: This is a naive and incomplete implementation of the Hubble.2D6 tool. It was an undergraduate project and is far from a reliable tool. The official version of Hubble.2D6 can be found at this repo.
The majority of the logic for pre-processing and post-processing the data is taken from the original tool (here). The same is true of supplementary data such as the pre-computed embeddings, and of specifics of the deep learning networks' architecture.
Using Anaconda:
conda env create --file cannett_474_env.yml
Using pip with a virtual environment:
python3 -m venv env
source env/bin/activate
pip install -r requirements.txt
Sample file with 3 star alleles:
python3 model/hubble.py -v data/sample.vcf
Star alleles from PharmVar:
python3 model/hubble.py -v step3/data/star_samples.vcf
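Before running either command, it can help to check what a VCF file contains. The following is a minimal sketch, not part of this repo: `vcf_summary` is a hypothetical helper that relies only on the standard VCF layout (meta lines start with `##`, the `#CHROM` header line lists sample names after the nine fixed columns, and every other non-empty line is a variant record). The demo data is made up for illustration.

```python
import io


def vcf_summary(handle):
    """Return (sample_names, record_count) for a VCF text stream.

    Hypothetical helper for illustration; it is not part of hubble.py.
    """
    samples, records = [], 0
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        if line.startswith("#CHROM"):
            # The first 9 columns (CHROM..FORMAT) are fixed; the rest are samples
            samples = line.split("\t")[9:]
        else:
            records += 1  # any remaining line is a variant record
    return samples, records


# Made-up two-sample VCF used only to demonstrate the helper
demo = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample_a\tsample_b\n"
    "chr22\t100\t.\tG\tA\t.\tPASS\t.\tGT\t0/1\t0/0\n"
)
print(vcf_summary(demo))
```

To inspect a real file, open it and pass the handle, for example `with open("data/sample.vcf") as fh: print(vcf_summary(fh))`.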
In addition to the source code for the implementation, Google Colab notebooks are included that show the training process and the generation of evaluation metrics.
Colab notebooks are found in the notebooks directory.