gaoqiweng/RediscMol
Benchmark:
chembl_29_ecfp4_0.333.csv/smi is the pretraining dataset. The kinase and GPCR datasets each have their own corresponding pretraining dataset.
cdk contains 10 random splits of the CDK data, each with a 10% fine-tuning set and its target set.
cdk_100 contains 100 random splits of the CDK data, each with a 1% fine-tuning set and its target set.
Scripts:
evaluation.py calculates the metrics. It requires an environment with PyTorch and RDKit.
usage: evaluation.py [-h] [--train_path TRAIN_PATH] [--goal_path GOAL_PATH] [--gen_path GEN_PATH] [--n_jobs N_JOBS]
                     [--metrics_path METRICS_PATH]

optional arguments:
  -h, --help            show this help message and exit
  --train_path TRAIN_PATH
                        Path to fine-tuning molecules csv
  --goal_path GOAL_PATH
                        Path to target molecules csv
  --gen_path GEN_PATH   Path to generated molecules csv
  --n_jobs N_JOBS       Number of threads
  --metrics_path METRICS_PATH
                        Path to output file with metrics
Example:
python evaluation.py --train_path ../Benchmark/kinase/cdk/cdk_1_train.csv --goal_path ../Benchmark/kinase/cdk/cdk_1_goal.csv --gen_path generated_molecules.csv --n_jobs 48 --metrics_path metrics.csv
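To evaluate all 10 CDK splits instead of one, the example command can be generated in a loop. The sketch below is only an illustration. It assumes the split files follow the cdk_<i>_train.csv / cdk_<i>_goal.csv naming seen in the example for i = 1..10. The per-split names generated_<i>.csv and metrics_<i>.csv, and the helpers build_command and run_all, are hypothetical and not part of this repository.

```python
import subprocess


def build_command(split, benchmark_dir="../Benchmark/kinase/cdk", n_jobs=48):
    """Build the evaluation.py argument list for one CDK split.

    Assumption: files are named cdk_<split>_train.csv and cdk_<split>_goal.csv,
    as in the README example. generated_<split>.csv and metrics_<split>.csv are
    hypothetical names chosen for this sketch.
    """
    return [
        "python", "evaluation.py",
        "--train_path", f"{benchmark_dir}/cdk_{split}_train.csv",
        "--goal_path", f"{benchmark_dir}/cdk_{split}_goal.csv",
        "--gen_path", f"generated_{split}.csv",
        "--n_jobs", str(n_jobs),
        "--metrics_path", f"metrics_{split}.csv",
    ]


def run_all(splits=range(1, 11), dry_run=True):
    """Run (or, with dry_run=True, only print) the command for each split."""
    commands = [build_command(split) for split in splits]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first split whose evaluation fails.
            subprocess.run(command, check=True)
    return commands
```

Calling run_all() prints the ten commands without executing them. Calling run_all(dry_run=False) from the Scripts directory would launch the evaluations one after another, provided the assumed file names exist.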
The molecules in generated_molecules.csv must be valid, unique, and novel.
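One way to prepare such a file is to filter the raw model output before evaluation. The following is a minimal sketch, not the procedure used by evaluation.py. It assumes molecules are given as SMILES strings, that "valid" means RDKit can parse the SMILES, that "unique" means distinct canonical SMILES, and that "novel" means absent from the fine-tuning set. The README does not say which set novelty is measured against, so that last point is an assumption. The function names rdkit_canonical and filter_generated are hypothetical.

```python
def rdkit_canonical(smiles):
    """Return the RDKit canonical SMILES, or None if the input is invalid."""
    # Imported lazily so that filter_generated can be used with another
    # canonicalizer when RDKit is not installed.
    from rdkit import Chem

    if not smiles:
        return None  # treat empty strings as invalid
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None


def filter_generated(generated, train, canonicalize=rdkit_canonical):
    """Keep generated molecules that are valid, unique, and novel.

    Assumption: novelty is judged against `train` (the fine-tuning set).
    Returns canonical SMILES in order of first appearance.
    """
    train_set = {c for c in map(canonicalize, train) if c is not None}
    seen = set()
    kept = []
    for smiles in generated:
        canonical = canonicalize(smiles)
        if canonical is None:
            continue  # invalid: could not be parsed
        if canonical in seen:
            continue  # duplicate of an earlier generated molecule
        seen.add(canonical)
        if canonical in train_set:
            continue  # not novel: already in the fine-tuning set
        kept.append(canonical)
    return kept
```

The canonicalize argument exists so that the comparison is done on a normalized form. Two different SMILES strings for the same molecule would otherwise be counted as distinct.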