|
11 | 11 | * generate_unique_lig_poses.py - Script for counter-example generation which computes all of the unique ligand poses in a directory |
12 | 12 | * counterexample_generation_jobs.py - Script which generates a file containing all of the gnina commands to generate new counter-examples |
13 | 13 | * generate_counterexample_typeslines.py - Script which generates a file containing the lines to add to the types file for a pocket. |
| 14 | + * types_extender.py - Script which generates a new types file by adding the counterexample lines to an existing types file. |
14 | 15 |
|
15 | 16 | ## Dependencies |
16 | 17 |
|
@@ -281,16 +282,16 @@ Lastly, we run clustering.py as follows |
281 | 282 | ``` |
282 | 283 | clustering.py --cpickle matrix.pickle --input my_types.types --output my_types_cv_ |
283 | 284 | ``` |
284 | | -## Generating new counterexamples |
285 | | -There are 3 scripts here which form a pipeline to generate new counter-examples for a data directory. |
| 285 | +## Adding new counterexamples to types files |
| 286 | +There are 4 scripts here which form a pipeline to generate new counter-examples for a data directory and add them to the types files. |
286 | 287 |
|
287 | | -The pipeline is as follows: 1) generate_unique_lig_poses.py; 2) counterexample_generation_jobs.py; 3) generate_counterexample_typeslines.py. |
| 288 | +The pipeline is as follows: 1) generate_unique_lig_poses.py; 2) counterexample_generation_jobs.py; 3) generate_counterexample_typeslines.py; 4) types_extender.py. |
288 | 289 |
|
289 | 290 | Global Assumptions: 1) The data directory structure is <ROOT>/<POCKET>/<FILES>, 2) Crystal ligand files are named <PDBid>_<ligname><CRYSTAL SUFFIX>, |
290 | 291 | 3) Receptors are PDB files, 4) output poses are SDF files. |
291 | 292 |
|
292 | 293 | ### Step 1) Generating the unique poses for a Pocket |
293 | | -In order to avoid extra calculations, we need to find the unique poses. |
| 294 | +In order to avoid extra calculations, we need to find the unique poses. NOTE: this step needs to be run exactly once when generating new counterexamples. After a round of counterexamples is generated, script 3 in the pipeline (generate_counterexample_typeslines.py) will generate the updated unique_poses.sdf file. |
294 | 295 |
|
295 | 296 | WARNING -- this script performs an O(n^2) calculation for each unique ligand name in the pocket!! |
296 | 297 |
|
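To illustrate why the cost is quadratic, here is a minimal sketch of a greedy pairwise filter. It is not the code in generate_unique_lig_poses.py: the plain (non-symmetry-corrected) RMSD, the `unique_poses` helper and the 0.25 threshold are all assumptions made for illustration only.

```python
import math


def rmsd(pose_a, pose_b):
    """Plain RMSD between two poses given as equal-length lists of (x, y, z)."""
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))


def unique_poses(poses, threshold=0.25):
    """Keep a pose only if it is farther than `threshold` from every kept pose.

    `threshold` is a hypothetical cutoff. In the worst case (all poses are
    unique) every pose is compared with every earlier one, i.e. n*(n-1)/2
    RMSD evaluations, which is the O(n^2) behaviour.
    """
    kept = []
    for pose in poses:
        if all(rmsd(pose, other) > threshold for other in kept):
            kept.append(pose)
    return kept
```

With many poses per ligand name the number of comparisons grows quickly, which is why the step should only be run once per pocket.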
@@ -466,6 +467,32 @@ The above command will need to be run for each directory in cd2020_pockets.txt. |
466 | 467 |
|
467 | 468 | That text file contains the lines that need to be added to the training/test types files. The default values match what we used for the CrossDocked2020 paper. |
468 | 469 |
|
| 470 | +### Step 4 -- Adding the lines for the counterexamples to the types file |
| 471 | +Now that the lines we need to add are generated for each pocket, we can run types_extender.py on each of the types files that we use for training and testing to generate new types files with these added lines. |
| 472 | +``` |
| 473 | +usage: types_extender.py [-h] -i INPUT -o OUTPUT -n NAME [-r ROOT] |
| 474 | +
|
| 475 | +Add lines to types file and create a new one. Assumes data file structure is |
| 476 | +ROOT/POCKET/FILES. |
| 477 | +
|
| 478 | +optional arguments: |
| 479 | + -h, --help show this help message and exit |
| 480 | + -i INPUT, --input INPUT |
| 481 | + Types file you will be extending. |
| 482 | + -o OUTPUT, --output OUTPUT |
| 483 | + Name of the extended types file. |
| 484 | + -n NAME, --name NAME Name of the file containing the lines to add for a |
| 485 | + given pocket. This is the output of |
| 486 | + generate_counterexample_typeslines.py. |
| 487 | + -r ROOT, --root ROOT Root of the data directory. Defaults to current |
| 488 | + working directory. |
| 489 | +``` |
| 490 | +Continuing our example, after running script 3 there will be an it3_typeslines_toadd.txt file in each pocket directory. We can now generate a new train types file and a new test types file as below: |
| 491 | +``` |
| 492 | +python3 types_extender.py -i my_initial_train.types -o my_new_train.types -n it3_typeslines_toadd.txt -r MYROOT |
| 493 | +python3 types_extender.py -i my_initial_test.types -o my_new_test.types -n it3_typeslines_toadd.txt -r MYROOT |
| 494 | +``` |
| 495 | + |
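Conceptually, this step copies the input types file and appends the per-pocket lines. The sketch below is one way that could look and is not the actual types_extender.py. It assumes each types line contains at least one path of the form POCKET/FILE, and the helpers `pocket_of` and `extend_types` are hypothetical names.

```python
import os


def pocket_of(types_line):
    """Return the first path component of the first token containing '/'."""
    for token in types_line.split():
        if "/" in token:
            return token.split("/")[0]
    return None


def extend_types(input_types, output_types, name, root="."):
    """Copy input_types to output_types, then append ROOT/POCKET/name lines.

    Pockets are discovered from the paths in the input file (an assumption);
    pockets without a `name` file are skipped silently.
    """
    pockets = []
    with open(input_types) as src, open(output_types, "w") as out:
        for line in src:
            # Normalise the line ending so appended lines never get glued on.
            out.write(line.rstrip("\n") + "\n")
            pocket = pocket_of(line)
            if pocket is not None and pocket not in pockets:
                pockets.append(pocket)
        for pocket in pockets:
            extra = os.path.join(root, pocket, name)
            if os.path.isfile(extra):
                with open(extra) as to_add:
                    for line in to_add:
                        out.write(line.rstrip("\n") + "\n")
    return pockets
```

Under these assumptions, `extend_types("my_initial_train.types", "my_new_train.types", "it3_typeslines_toadd.txt", "MYROOT")` would mirror the first command above. Only pockets that already appear in a given types file receive new lines, so the train and test files stay separate.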
469 | 496 | ## Using visualization script |
470 | 497 | There are two scripts to help you visualize how the model scores atoms: 1) simple_grid_visualization.py; 2) grid_visualization.py |
471 | 498 |
|