Reorganized Haoran's scripts in Mutational_Efffect (splitting, deleting, wrapping, etc.).
Pipeline and instructions (modified)
* Input: PPI_sig_network
* Code: net_trans.r
* Output: wang_network.txt, PPI_sig_network_no_selfloop.txt
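
The output name PPI_sig_network_no_selfloop.txt suggests that net_trans.r drops self-interactions from the network. The following is a minimal Python sketch of that one idea, not the R script itself. The edge-list representation and the toy node names are assumptions for illustration; the real PPI_sig_network format may differ.

```python
def remove_self_loops(edges):
    """Return the edges whose source and target differ, preserving order."""
    return [(src, dst) for src, dst in edges if src != dst]


# Hypothetical toy edge list (not taken from PPI_sig_network).
toy_edges = [("A", "B"), ("B", "B"), ("B", "C"), ("C", "C")]
filtered = remove_self_loops(toy_edges)
```

Here `filtered` keeps only the A-B and B-C edges.
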
* Input: wang_network.txt
* Code: find_motif.sh (MFINDER 1.21)
* Output: PPI_sig_network_MEMBERS.txt, PPI_sig_network_OUT.txt
* Input: PPI_sig_network_MEMBERS.txt, PPI_sig_network_OUT.txt
* Code: proc_motif.r
* Output: motif.txt, motif_subclass.txt
* Input: PPI_sig_network_no_selfloop.txt, Census.csv, motif
* Code: stat_motif.r
* Output: hot_motif_stat.txt
* Input: Census.csv, motif_subclass.txt
* Code: gen_fea_matirx_SMOTE.r
* Output: feature_matrix.txt (replaces the original feature_original1.txt for Python and feature_original.txt for R; for use in Python, first process it in bash and save the result as feature_matrix_py.txt)
* Input: feature_matrix_py.txt
* Code: RandomForest.py
* Output: RandomForest.joblib.pkl (RandomForest model)
* Input: feature_matrix.txt
* Code: RF_SGL.r (decision_path.r, discretize.r, Optimization.r, Prediction.r, Recover_Validation.r, ROC.r, Rule_RScore.r)
* Note: should be changed to work on the whole data set instead of just the test data
* Output: RsResult.RData, Prediction_Result.RData, rule_extract.RData
* Input: rule_extract.RData, hot_motif_stat.txt
* Code: overlap_hot_motif.r
* Output: overlap_hot.motif.txt
* Input: motif, rule_extract.RData
* Code: draw_motif_score.r
* Output: motif_score.eps
* Input: X_test.txt, label_test.txt, RandomForest.joblib.pkl
* Code: Plot_PRC_ROC.py
* Output: model_RandomForest.best_estimator.txt, helpout_ROC.eps, helpout_PC.eps
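
For readers unfamiliar with the two curves produced in this step, here is a plain-Python sketch of how ROC and precision-recall points can be derived from held-out labels and predicted scores. It is not the code in Plot_PRC_ROC.py, which is not shown here. The function names and toy data are hypothetical, and the sketch assumes binary 0/1 labels, at least one example of each class, and no tied scores.

```python
def roc_pr_points(labels, scores):
    """Sweep the decision threshold from high to low.

    Returns (roc, pr): roc is a list of (FPR, TPR) points starting at (0, 0),
    pr is a list of (recall, precision) points.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    roc = [(0.0, 0.0)]
    pr = []
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        roc.append((fp / n_neg, tp / n_pos))
        pr.append((tp / n_pos, tp / (tp + fp)))
    return roc, pr


def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data, not taken from X_test.txt or label_test.txt.
toy_labels = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2]
roc, pr = roc_pr_points(toy_labels, toy_scores)
roc_auc = auc(roc)
```

On the toy data, five of the six positive/negative pairs are ranked correctly, so `roc_auc` is 5/6.
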
* Input: PPI_sig_network, feature_label_motif.txt
* Code: gen_fea_cen.r
* Output: feature_label_cen.txt
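
The "cen" in gen_fea_cen.r presumably stands for network centrality; that reading is an assumption, and the script may compute other measures. As a sketch of one such feature, the Python snippet below computes normalized degree centrality from an edge list. It treats the network as undirected, which may not match the actual script, and the toy edges are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree: distinct neighbours divided by (number of nodes - 1).

    Edges are treated as undirected and self-loops are ignored.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    if n <= 1:
        return {node: 0.0 for node in neighbours}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Hypothetical star-shaped toy network centred on node "B".
toy_centrality = degree_centrality([("A", "B"), ("B", "C"), ("B", "D")])
```

In the toy network, node B is connected to every other node, so its centrality is 1.0, while A, C and D each get 1/3.
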
* Input: feature_label_cen.txt
* Code: RandomForest_new.py
* Output: RandomForest_cen.joblib.pkl, X_test_cen.txt
* Input: X_test_motif.txt, X_test_cen.txt, label_test.txt, RandomForest_motif.joblib.pkl, RandomForest_cen.joblib.pkl
* Code: plot_curve_cmp.py
* Output: heldout_ROC_cmp.eps, heldout_PC_cmp.eps
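
Because each step consumes files produced by earlier steps, the run order can be derived from the Input/Output lists. The sketch below does this in Python for the first few steps only, using the file names listed above. The `STEPS` table and `run_order` helper are hypothetical conveniences, not part of the repository, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# script -> (inputs, outputs), copied from the first steps of the pipeline.
STEPS = {
    "gen_fea_matirx_SMOTE.r": (["Census.csv", "motif_subclass.txt"],
                               ["feature_matrix.txt"]),
    "proc_motif.r": (["PPI_sig_network_MEMBERS.txt", "PPI_sig_network_OUT.txt"],
                     ["motif.txt", "motif_subclass.txt"]),
    "find_motif.sh": (["wang_network.txt"],
                      ["PPI_sig_network_MEMBERS.txt", "PPI_sig_network_OUT.txt"]),
    "net_trans.r": (["PPI_sig_network"],
                    ["wang_network.txt", "PPI_sig_network_no_selfloop.txt"]),
}


def run_order(steps):
    """Order scripts so that every script runs after the producers of its inputs."""
    producer = {out: name for name, (_, outs) in steps.items() for out in outs}
    deps = {name: {producer[i] for i in ins if i in producer}
            for name, (ins, _) in steps.items()}
    return list(TopologicalSorter(deps).static_order())


order = run_order(STEPS)
```

Inputs with no producer in the table (for example Census.csv) are treated as files that already exist.
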