A final project for DS 4440 - Neural Networks
Using the Python library Selene to enable applications of neural networks to genome-wide association studies (GWAS).
You will need to upload this directory to Google Drive, where you can train the models by opening run_training.ipynb in Colab and running each of the cells.
- Data can be downloaded and processed by following the instructions in getting_started_with_selene.ipynb.
- To select which model is used, update the two relevant lines in training_configs.yml.
- Information about the training (loss and precision) is logged periodically in Colab. To get the raw data from the logs, copy them into their own .log file and use scripts/scrape_log_values.py to extract the raw list of values. These values can then be used to visualize the results.
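As a rough illustration of the log-scraping step, the sketch below pulls numeric values out of copied log text. It is not the contents of scripts/scrape_log_values.py: the function name extract_values is hypothetical, and the assumed log format (a label such as "training loss" followed by a colon and a number) may not match what your Colab run prints, so adjust the label or pattern to fit your own logs.

```python
import re

# Matches a plain or scientific-notation number, e.g. 0.6931, -2.5, 1e-3.
_NUMBER = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"


def extract_values(log_text, label):
    """Return, in order, every number that follows `label:` in log_text.

    Assumption: each logged metric appears as "<label>: <number>".
    """
    pattern = re.compile(re.escape(label) + r"\s*:\s*(" + _NUMBER + r")")
    return [float(match.group(1)) for match in pattern.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical log excerpt, made up for this example only.
    sample_log = (
        "2020-04-01 12:00:00 - training loss: 0.6931\n"
        "2020-04-01 12:00:05 - validation loss: 0.7012\n"
        "2020-04-01 12:05:00 - training loss: 0.5412\n"
    )
    print(extract_values(sample_log, "training loss"))
    print(extract_values(sample_log, "validation loss"))
```

Passing the full label ("training loss" rather than "loss") keeps training and validation values in separate lists, which makes them easier to plot against each other later.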
- run_training.ipynb loads data files and runs training in Colab.
- training_configs.yml contains all of the configurations for training, which can be updated to train a different model, change the length of training, or change the way that logs are generated.
- getting_started_with_selene.ipynb contains instructions for preparing data and installing Selene. This file was copied from a tutorial in the Selene GitHub repo.
- visualize_results.ipynb contains visualizations to compare the performance of each model.
View the Colab files and data in Google Drive.