Voroscoring #867

Draft
wants to merge 40 commits into
base: main
40 commits
05ffc37
first draft voroscoring module
VGPReys Apr 17, 2024
1c8e160
fix writing
VGPReys Apr 18, 2024
c37370d
fix types
VGPReys Apr 18, 2024
ef856c9
upgrade haddock module init
VGPReys Apr 18, 2024
75a69ad
import Any type
VGPReys Apr 18, 2024
8ba5ae7
redefine scoring modules classes
VGPReys Apr 18, 2024
2320357
add output attribute
VGPReys Apr 18, 2024
6757ae9
output Path type
VGPReys Apr 18, 2024
8601649
output var name
VGPReys Apr 18, 2024
e1ab6c8
get base_workdir from class attribute
VGPReys Apr 18, 2024
44b41e0
Path to str for .join() method
VGPReys Apr 18, 2024
1e9d612
voro scoring example
VGPReys Apr 18, 2024
042f484
output tsv file writing from self.output()
VGPReys Apr 18, 2024
319056d
remove import
VGPReys Apr 18, 2024
5c05779
solve error in recombine arguments
VGPReys Apr 18, 2024
0dd02ba
finetunings
VGPReys Apr 18, 2024
b431e6a
tidy types and lints
VGPReys Apr 18, 2024
d8a1322
reversing scores for systematic ascenting sorting
VGPReys Apr 18, 2024
994bea4
adding tests
VGPReys Apr 19, 2024
69245cd
fixed conflict
mgiulini Apr 29, 2024
f832c0c
add header information in generated final output file
VGPReys Apr 29, 2024
a5a3656
make sure line is terminated
VGPReys Apr 30, 2024
ac409ac
fix tuple error
VGPReys May 3, 2024
60c9ee6
Merge remote-tracking branch 'origin' into voroscoring
VGPReys May 3, 2024
795682a
tests
VGPReys May 21, 2024
1e0d578
Merge branch 'main' into voroscoring
VGPReys May 21, 2024
7e8ac73
change
ntxxt May 30, 2024
4370daf
fixes to adapt to new hardware
VGPReys Jun 17, 2024
80e112e
mergeing main and solve conflicts
VGPReys Jun 20, 2024
fbe08c0
Merge branch 'main' into voroscoring
rvhonorato Aug 13, 2024
e7055a8
Merge branch 'main' into voroscoring
VGPReys Sep 10, 2024
4d6253b
check
ntxxt Sep 18, 2024
9791476
Merge branch 'voroscoring' of https://github.com/haddocking/haddock3 …
ntxxt Sep 18, 2024
b4a8048
Merge branch 'main' into voroscoring
VGPReys Mar 19, 2025
7dd1d5f
Update how to obtain defaults.yaml filename
VGPReys Mar 19, 2025
bee2709
Updating the SLURM job
VGPReys Mar 19, 2025
46217f9
Updating tests to reflect the 3 digits outputs
VGPReys Mar 19, 2025
fd08d85
Update INSTALL.md for voroscoring module
VGPReys Mar 19, 2025
e188b74
Removing the chain contatenation parameter as not implemented
VGPReys Mar 19, 2025
e23861a
Merge branch 'main' into voroscoring
VGPReys Apr 22, 2025
21 changes: 21 additions & 0 deletions docs/INSTALL.md
@@ -154,3 +154,24 @@ on your machine.

Please refer to the [official page](http://docs.openmm.org/latest/userguide/)
of the project for a full description of the installation procedure.



## `voroscoring`

The use of the `[voroscoring]` module requires:
- A cluster with SLURM installed
- A conda environment (e.g. `ftdmp`) in which FTDMP will be installed
- A functional installation of [FTDMP](https://github.com/kliment-olechnovic/ftdmp)

Once these three requirements are met, tune the `[voroscoring]` section of the haddock3 configuration file so that it describes how to load the conda environment (`ftdmp`) and where to find the FTDMP scripts and executables:

```TOML
[voroscoring]
# This parameter defines the base directory where conda / miniconda is installed
conda_install_dir = "/absolute/path/to/conda/"
# This parameter defines the name of the conda env that you created and where FTDMP is installed
conda_env_name = "ftdmp"
# This parameter defines where FTDMP scripts / executables can be found
ftdmp_install_dir = "/absolute/path/to/FTDMP/"
```
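
The three parameters above can be sanity-checked before launching a run. The following is only a minimal sketch, not part of this pull request: `check_voroscoring_setup` is a hypothetical helper, and it assumes that named conda environments live under `<conda_install_dir>/envs/`, which is the usual conda layout but may differ on a given installation.

```python
from pathlib import Path


def check_voroscoring_setup(
        conda_install_dir: str,
        conda_env_name: str,
        ftdmp_install_dir: str,
        ) -> list[str]:
    """Return a list of human-readable problems (empty if none found)."""
    problems: list[str] = []
    conda_dir = Path(conda_install_dir)
    # Assumption: named environments live under <conda_install_dir>/envs/
    env_dir = conda_dir / "envs" / conda_env_name
    ftdmp_dir = Path(ftdmp_install_dir)
    for label, path in (
            ("conda_install_dir", conda_dir),
            ("conda_env_name", env_dir),
            ("ftdmp_install_dir", ftdmp_dir),
            ):
        # The documentation above asks for absolute paths
        if not path.is_absolute():
            problems.append(f"{label}: '{path}' is not an absolute path")
        elif not path.is_dir():
            problems.append(f"{label}: directory '{path}' not found")
    return problems
```

An empty list means that all three locations exist. It does not prove that FTDMP itself runs correctly inside the environment.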
33 changes: 33 additions & 0 deletions examples/scoring/voroscoring-test.cfg
@@ -0,0 +1,33 @@
# ====================================================================
# Scoring example

# directory in which the scoring will be done
run_dir = "run1-voroscoring-test"
clean = false

# execution mode
ncores = 3
mode = "local"

# ensemble of different complexes to be scored
molecules = ["data/T161-rescoring-ens.pdb",
             "data/HY3.pdb",
             "data/protein-dna_1w.pdb",
             "data/protein-protein_1w.pdb",
             "data/protein-protein_2w.pdb",
             "data/protein-trimer_1w.pdb"
             ]

# ====================================================================
# Parameters for each stage are defined below

[topoaa]

[voroscoring]

[seletop]
select = 3

[caprieval]

# ====================================================================
26 changes: 20 additions & 6 deletions src/haddock/modules/scoring/__init__.py
@@ -1,7 +1,8 @@
 """HADDOCK3 modules to score models."""
+from os import linesep
 import pandas as pd

-from haddock.core.typing import FilePath, Path, Any
+from haddock.core.typing import FilePath, Path, Any, Optional
 from haddock.modules.base_cns_module import BaseCNSModule
 from haddock.modules import BaseHaddockModule, PDBFile

@@ -14,6 +15,7 @@ def output(
         output_fname: FilePath,
         sep: str = "\t",
         ascending_sort: bool = True,
+        header_comments: Optional[str] = None,
     ) -> None:
         r"""Save the output in comprehensive tables.

@@ -36,11 +38,23 @@ def output(
         df_sc = pd.DataFrame(sc_data, columns=df_columns)
         df_sc_sorted = df_sc.sort_values(by="score", ascending=ascending_sort)
         # writes to disk
-        df_sc_sorted.to_csv(output_fname,
-                            sep=sep,
-                            index=False,
-                            na_rep="None",
-                            float_format="%.3f")
+        with open(output_fname, 'a') as output_file:
+            # Write the header comments, if any
+            if header_comments:
+                # Make sure the comments end with a new line
+                if not header_comments.endswith(linesep):
+                    header_comments += linesep
+                output_file.write(header_comments)
+            # Write the dataframe
+            df_sc_sorted.to_csv(
+                output_file,
+                sep=sep,
+                index=False,
+                na_rep="None",
+                float_format="%.3f",
+                lineterminator=linesep,
+                )
+
         return


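
Because `output()` can now write `#`-prefixed comment lines before the column header, any consumer of the resulting file has to skip them. The sketch below shows one way to do that with the standard library only. `read_scoring_tsv` is a hypothetical helper, and the column names in the example are illustrative rather than taken from this diff. With pandas, passing `comment="#"` to `read_csv` should have a similar effect.

```python
import csv
import io


def read_scoring_tsv(handle, sep="\t"):
    """Parse a scoring table, skipping '#' comment lines."""
    # Drop comment lines so that DictReader sees the column header first
    data_lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(data_lines, delimiter=sep))


# Illustrative file content: one comment line, a header row and two records
example = io.StringIO(
    "# Note: a header comment written before the table\n"
    "structure\tscore\n"
    "model_1.pdb\t-0.512\n"
    "model_2.pdb\tNone\n"
)
rows = read_scoring_tsv(example)
```

All values come back as strings, so the `None` placeholder written through `na_rep="None"` must be handled before converting scores to floats.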
5 changes: 3 additions & 2 deletions src/haddock/modules/scoring/emscoring/__init__.py
@@ -1,7 +1,8 @@
 """EM scoring module.

-This module performs energy minimization and scoring of the models generated in
-the previous step of the workflow. No restraints are applied during this step.
+This module performs energy minimization and scoring of the models generated
+in the previous step of the workflow.
+Note that no restraints (AIRs) are applied during this step.
 """

 from pathlib import Path
3 changes: 2 additions & 1 deletion src/haddock/modules/scoring/mdscoring/__init__.py
@@ -1,7 +1,8 @@
 """MD scoring module.

 This module will perform a short MD simulation on the input models and
-score them. No restraints are applied during this step.
+score them.
+Note that no restraints (AIRs) are applied during this step.
 """

 from pathlib import Path
94 changes: 94 additions & 0 deletions src/haddock/modules/scoring/voroscoring/__init__.py
@@ -0,0 +1,94 @@
"""Voro scoring module.

This module scores input PDB models using the FTDMP voro-mqa-all tool.
For more information, please check: https://github.com/kliment-olechnovic/ftdmp

It is a third-party module and requires the appropriate setup and installation
for it to run without issue.
"""
from os import linesep
from pathlib import Path

from haddock.core.defaults import MODULE_DEFAULT_YAML
from haddock.core.typing import Any, FilePath
from haddock.modules import get_engine
from haddock.modules.scoring import ScoringModule
from haddock.modules.scoring.voroscoring.voroscoring import (
    VoroMQA,
    update_models_with_scores,
    )

RECIPE_PATH = Path(__file__).resolve().parent
DEFAULT_CONFIG = Path(RECIPE_PATH, MODULE_DEFAULT_YAML)


class HaddockModule(ScoringModule):
    """HADDOCK3 module to score models with the FTDMP voro-mqa-all tool."""

    name = RECIPE_PATH.name

    def __init__(
            self,
            order: int,
            path: Path,
            *ignore: Any,
            init_params: FilePath = DEFAULT_CONFIG,
            **everything: Any,
            ) -> None:
        """Initialize class."""
        super().__init__(order, path, init_params)

    @classmethod
    def confirm_installation(cls) -> None:
        """Confirm module is installed."""
        # FIXME ? Check :
        # - if conda env is accessible
        # - if ftdmp is accessible
        return

    def _run(self) -> None:
        """Execute module."""
        # Retrieve previous models
        try:
            models_to_score = self.previous_io.retrieve_models(
                individualize=True
                )
        except Exception as e:
            self.finish_with_error(e)

        # Initiate VoroMQA object
        output_fname = Path("voro_mqa_all.tsv")
        voromqa = VoroMQA(
            models_to_score,
            './',
            self.params,
            output=output_fname,
            )

        # Launch machinery
        jobs: list[VoroMQA] = [voromqa]
        # Run Job(s)
        self.log("Running Voro-mqa scoring")
        Engine = get_engine(self.params['mode'], self.params)
        engine = Engine(jobs)
        engine.run()
        self.log("Voro-mqa scoring finished!")

        # Update score of output models
        try:
            self.output_models = update_models_with_scores(
                output_fname,
                models_to_score,
                metric=self.params["metric"],
                )
        except ValueError as e:
            self.finish_with_error(e)

        # Write output file
        scoring_tsv_fpath = f"{RECIPE_PATH.name}.tsv"
        self.output(
            scoring_tsv_fpath,
            header_comments=f"# Note that negative of the value are reported in the case of non-energetical predictions{linesep}",  # noqa : E501
            )
        # Export to next module
        self.export_io_models()
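
The header comment written above states that non-energetic predictions are reported with their sign flipped, so that an ascending sort always lists the best model first. The actual logic lives in `update_models_with_scores`, which is not shown in this diff. The sketch below only illustrates the idea: `to_sortable_score` is a hypothetical helper, and treating a metric as energy-like whenever its name contains "energy" is an assumption (a metric such as `clash_score` may need separate handling).

```python
def to_sortable_score(metric: str, value: float) -> float:
    """Return a value for which lower always means better."""
    # Assumption: only metrics with "energy" in their name are energy-like
    # (lower is better); all other metrics are treated as higher-is-better.
    if "energy" in metric:
        return value
    return -value


# Illustrative raw values for a higher-is-better metric
raw = {"model_1.pdb": 0.71, "model_2.pdb": 0.93, "model_3.pdb": 0.40}
ranked = sorted(
    raw,
    key=lambda name: to_sortable_score("jury_score", raw[name]),
)
```

With this convention the same `ascending_sort=True` default of `output()` works for every metric, at the cost of showing negated values in the table, which is why the header comment is needed.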
65 changes: 65 additions & 0 deletions src/haddock/modules/scoring/voroscoring/defaults.yaml
@@ -0,0 +1,65 @@
metric:
  default: jury_score
  type: string
  choices:
    - jury_score
    - GNN_sum_score
    - GNN_pcadscore
    - voromqa_dark
    - voromqa_light
    - voromqa_energy
    - gen_voromqa_energy
    - clash_score
    - area
  minchars: 1
  maxchars: 50
  title: VoroMQA metric used to score.
  short: VoroMQA metric used to score.
  long: VoroMQA metric used to score.
  group: analysis
  explevel: easy

conda_install_dir:
  default: "/trinity/login/vreys/miniconda3/"
  type: string
  minchars: 1
  maxchars: 158
  title: Path to conda install directory.
  short: Absolute path to conda install directory.
  long: Absolute path to conda install directory.
  group: execution
  explevel: easy

conda_env_name:
  default: "ftdmp5"
  type: string
  minchars: 1
  maxchars: 100
  title: Name of the ftdmp conda env.
  short: Name of the ftdmp conda env.
  long: Name of the ftdmp conda env.
  group: execution
  explevel: easy

ftdmp_install_dir:
  default: "/trinity/login/vreys/Venclovas/ftdmp/"
  type: string
  minchars: 1
  maxchars: 158
  title: Path to ftdmp install directory.
  short: Absolute path to ftdmp install directory.
  long: Absolute path to ftdmp install directory.
  group: execution
  explevel: easy

nb_gpus:
  default: 1
  type: integer
  min: 1
  max: 420
  title: Number of accessible GPUs on the device.
  short: Number of accessible GPUs on the device.
  long: Number of accessible GPUs on the device.
  group: execution
  explevel: easy
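
The constraints declared in this file (`choices`, `minchars`/`maxchars`, `min`/`max`) bound what a user may put in the `[voroscoring]` section. haddock3 performs its own validation of user parameters, which this diff does not show. The sketch below is therefore only a simplified illustration of what the constraints mean: `validate_param` and the `SPECS` dictionary are hypothetical and mirror two of the entries above.

```python
def validate_param(name, value, spec):
    """Raise ValueError if `value` violates the constraints in `spec`."""
    if spec["type"] == "string":
        if not spec["minchars"] <= len(value) <= spec["maxchars"]:
            raise ValueError(f"{name}: length outside allowed range")
        if "choices" in spec and value not in spec["choices"]:
            raise ValueError(f"{name}: '{value}' is not a valid choice")
    elif spec["type"] == "integer":
        if not spec["min"] <= value <= spec["max"]:
            raise ValueError(
                f"{name}: {value} outside [{spec['min']}, {spec['max']}]"
            )


# Hand-copied subset of the defaults above (illustration only)
SPECS = {
    "metric": {
        "type": "string",
        "minchars": 1,
        "maxchars": 50,
        "choices": [
            "jury_score", "GNN_sum_score", "GNN_pcadscore",
            "voromqa_dark", "voromqa_light", "voromqa_energy",
            "gen_voromqa_energy", "clash_score", "area",
        ],
    },
    "nb_gpus": {"type": "integer", "min": 1, "max": 420},
}

validate_param("metric", "voromqa_energy", SPECS["metric"])
validate_param("nb_gpus", 2, SPECS["nb_gpus"])
```

Both calls pass silently. A misspelled metric name or `nb_gpus = 0` would raise a `ValueError`.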