
Active Site Modeling and Clustering (ASMC)

ASMC combines (i) homology modeling of family members (MODELLER), (ii) ligand-binding pocket search (P2RANK), (iii) structural alignment of the modeled active sites (USalign) and (iv) density-based spatial clustering of the resulting alignments (DBSCAN) in a single command line.

The clustering step can be carried out on either structural or sequence alignment and users can directly analyse their own set of protein 3D structures (e.g. AI-based models) by skipping the homology modeling step.

[Figure: ASMC workflow]

Installation

Installation with conda and pip

Download the latest GitHub release (https://github.com/labgem/ASMC/releases) and extract the code from the archive.

Then, use the following commands from the ASMC/ directory:

conda env create -n asmc -f env.yml
pip install ./

Conda will install all the Python dependencies and two required third-party programs:

  • MODELLER (you still need to request the license key)
  • USalign

The pip command is required to create the asmc command and use ASMC.

It is also possible to run only the pip install ./ command, but this will not install any third-party software.

Third-party software dependencies

P2RANK setup

Download the P2RANK tar.gz archive (e.g. p2rank_2.5.tar.gz) and extract it.

Create a symbolic link to the prank script, e.g.:

ln -s <full_path_to>/p2rank_2.5/prank /usr/bin/prank

Modify the prank script so that it works when called through a symbolic link. At line 22, replace:

THIS_SCRIPT_DIR_REL_PATH=`dirname "${BASH_SOURCE[0]}"`

with

THIS_SCRIPT_DIR_REL_PATH=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")
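To see why this change matters, the following minimal sketch reproduces the situation in a throwaway directory. The names `tool/`, `bin/` and `where.sh` are hypothetical stand-ins and are not part of P2RANK; only `dirname`, `readlink -f` and `BASH_SOURCE` are taken from the two lines above. It assumes bash is installed and that `readlink` supports `-f` (as in GNU coreutils).

```shell
#!/usr/bin/env sh
# Demo: how a script locates its own directory when called through a symlink.
set -eu

# Work in a temporary directory (pwd -P gives its canonical path).
demo=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$demo/tool" "$demo/bin"

# Hypothetical stand-in for the prank launcher: prints both variants.
cat > "$demo/tool/where.sh" <<'EOF'
#!/usr/bin/env bash
echo "plain=$(dirname "${BASH_SOURCE[0]}")"
echo "resolved=$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")"
EOF
chmod +x "$demo/tool/where.sh"

# Link the script into another directory, as is done for /usr/bin/prank.
ln -s "$demo/tool/where.sh" "$demo/bin/where"

# Call the script through the symlink and split its two output lines.
output=$("$demo/bin/where")
plain_dir=$(printf '%s\n' "$output" | sed -n 's/^plain=//p')
resolved_dir=$(printf '%s\n' "$output" | sed -n 's/^resolved=//p')

echo "without readlink: $plain_dir"     # directory that holds the symlink
echo "with readlink -f: $resolved_dir"  # directory that holds the real script
```

Without `readlink -f`, the script sees the directory of the link rather than its own installation directory, so anything it looks up relative to that directory would be searched for in the wrong place.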

Now, ASMC can use P2RANK to detect ligand binding pockets.
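As a quick sanity check after installation, one might confirm that the expected executables are reachable from the shell. The sketch below is only an illustration: the helper `missing_tools` is hypothetical, and the executable names `asmc`, `USalign` and `prank` are assumptions based on the steps above, so adjust them to your setup. MODELLER is not covered here because it is not checked as a plain executable in this sketch.

```shell
#!/usr/bin/env sh
# Print the name of every given executable that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Executable names below are assumptions; adjust them to your installation.
missing=$(missing_tools asmc USalign prank)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All expected executables were found."
fi
```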

Installation with Docker

Follow the instructions in the Docker section of the wiki.

Quick Usage

Run ASMC blind (i.e. with an unknown active site) on a multi-FASTA file. The file should contain at least 100 sequences for the clustering to be meaningful.

asmc run --log run_asmc.log --threads 6 -r reference_file -s sequences.fasta
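Before launching the command above, it can be useful to check how many sequences the input file actually holds. The sketch below counts FASTA header lines (lines starting with `>`); the helper names `count_fasta_records` and `check_min_sequences` are hypothetical and are not part of ASMC.

```shell
#!/usr/bin/env sh
# Count FASTA records: every record starts with a '>' header line.
count_fasta_records() {
    # grep -c exits with status 1 when the count is 0, hence '|| true'.
    grep -c '^>' "$1" || true
}

# Return non-zero when a file holds fewer records than the minimum
# (second argument, defaulting to 100).
check_min_sequences() {
    n=$(count_fasta_records "$1")
    if [ "$n" -lt "${2:-100}" ]; then
        echo "warning: $1 has only $n sequences" >&2
        return 1
    fi
    echo "$1 has $n sequences"
}

# Small self-contained demo on a throwaway file with three records.
tmp=$(mktemp)
printf '>a\nMKV\n>b\nMKL\n>c\nMRT\n' > "$tmp"
count_fasta_records "$tmp"
rm -f "$tmp"
```

With a real input, a call such as `check_min_sequences sequences.fasta` would print a warning if the file has fewer than 100 records.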

reference_file should contain the path(s) to the reference structure(s), one per line, e.g.:

<path>/RefA.pdb
<path>/RefB.pdb
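A file in this layout can be generated rather than typed by hand. The following sketch writes the path of every `.pdb` file found in a directory, one per line. The helper name `write_reference_file` is hypothetical, and writing absolute paths is a precaution assumed here, not a documented ASMC requirement.

```shell
#!/usr/bin/env sh
# Write the absolute path of every .pdb file in a directory, one per line.
# Usage: write_reference_file <directory> <output_file>
write_reference_file() {
    ref_dir=$1
    out_file=$2
    # Resolve the directory to an absolute path (cd runs in a subshell).
    abs_dir=$(cd "$ref_dir" && pwd) || return 1
    : > "$out_file"                  # create or truncate the output file
    for pdb in "$abs_dir"/*.pdb; do
        [ -e "$pdb" ] || continue    # skip the literal pattern when nothing matches
        printf '%s\n' "$pdb" >> "$out_file"
    done
}
```

For example, `write_reference_file ./references reference_file` (where `./references` is a hypothetical directory holding RefA.pdb and RefB.pdb) would produce a two-line file that can be passed to the `-r` option.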

NB: For more details, see the wiki.