Welcome to our Machine Learning project. We built several linear and non-linear ML models to predict whether an odour is perceived as sweeter or more sour.
The models are based on the following methods:
- Logistic regression
- Random forests
- Neural networks
- SVM
The authors of the project are:
- Alexander Popescu
- Changling Li
The project is contained in the repository BIO322-Classification, where you can find:
- a directory data that contains the data sets used to train the models and predict the outcomes
- a directory src that contains our code (in the form of R scripts) for:
  - Exploration: the code used to explore and visualize the data
  - Linear method: the code used to produce our best results with logistic regression
  - Non-linear method: the code used to produce our best results with neural networks, SVM trees and random forests
- a directory plots that contains all the plots produced during the data exploration
- a .csv file with our best results for the Kaggle competition
- our report as a PDF file, which presents the different models we built and our best results
We used R version 4.0.3. To run our R scripts, set the working directory to the repository BIO322-Classification. The results can be reproduced by running the R scripts in the directory src. The files Linear_method_skewness.R, RF_tuning.R, Bayes_optimisationNN.r and Bayes_optimisationSVM.R tune some hyper-parameters and estimate errors with cross-validation (CV). Note that, depending on the number of iterations specified, these scripts can take a very long time to finish.
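As a minimal sketch of how one of the scripts might be launched from an R session, the snippet below wraps `source()` in a small helper. The helper `run_script` and its arguments are hypothetical and not part of the repository. It assumes the working directory is the root of BIO322-Classification and only warns if the R version differs from 4.0.3.

```r
# Hypothetical helper, not part of the repository: runs one script from src/.
# Assumes the working directory is the root of BIO322-Classification.
run_script <- function(name, src_dir = "src") {
  path <- file.path(src_dir, name)
  if (!file.exists(path)) {
    stop("Script not found: ", path,
         ". Is the working directory the repository root?")
  }
  # The project used R 4.0.3; warn (do not fail) on a different version.
  if (getRversion() != "4.0.3") {
    warning("Project used R 4.0.3; running ", as.character(getRversion()))
  }
  # source() evaluates the script in the global environment by default.
  source(path)
  invisible(path)
}

# Example call (commented out because it needs the repository on disk):
# run_script("RF_tuning.R")
```

The same script could also be started from a shell at the repository root with `Rscript src/RF_tuning.R`. For the tuning scripts, it may be worth lowering the number of iterations first to check that everything runs before starting a long job.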