Pangebin

Setup Python virtual environment

[For dev] Create the conda environment

```
conda env create -n pangebin-dev -f config/condaenv_311-dev.yml
```

[For dev] Activate the conda environment
```
conda activate pangebin-dev
```
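
Before running the scripts below, it can help to confirm that the environment is really active. The following is a minimal sketch, assuming conda exports the active environment name in `CONDA_DEFAULT_ENV` (which `conda activate` normally does); the helper name `require_conda_env` is hypothetical and not part of Pangebin.

```shell
# require_conda_env NAME (hypothetical helper):
# succeed only if NAME is the currently active conda environment.
require_conda_env() {
  local expected="$1"
  if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
    echo "conda environment '$expected' is not active; run: conda activate $expected" >&2
    return 1
  fi
}
```

For example, `require_conda_env pangebin-dev || exit 1` at the top of a wrapper script stops early when the environment is missing.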

Usage

Example with the dataset SAMN16357463:

```
dataset_dir="test/SAMN16357463"
```

Note: each script below also has a clean command:
```
./script.sh clean $dataset_dir
```

Standardize the GFA assembly graphs
```
./test/std_asm_graph.sh run $dataset_dir
```

Make the pangenome graph with Nextflow (make sure the command required by the Nextflow profile is installed)
```
./test/pangenome.sh run $dataset_dir
```
Make the pan-assembly graph
```
./test/panassembly.sh run $dataset_dir
```
Obtain the GC probability scores of the fragments
```
./test/gc_prob_scores.sh run $dataset_dir
```
Obtain gene density on the fragments
1. Map the genes onto the contigs from the two assemblers
```
./test/gene_mapping.sh blast $dataset_dir
```
2. Filter the gene mappings
```
./test/gene_mapping.sh filter $dataset_dir
```
3. Obtain the gene density on the fragments
```
./test/frag_gene_densities.sh run $dataset_dir
```
Obtain the seeds from positive gene densities
```
./test/fragment_seeds.sh run $dataset_dir
```
Run PlasBin-flow modified for pan-assembly
```
./test/plasbin_panasm.sh run $dataset_dir
```
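
The steps above can be chained in a single wrapper. This is a minimal sketch, not a script shipped with the repository: it assumes it is run from the repository root, that each step must finish before the next one starts, and it uses only the script names and subcommands listed above. The function name `run_pipeline` and the `DRY_RUN` variable are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Steps in the order given above, as "<script> <subcommand>".
PIPELINE_STEPS=(
  "std_asm_graph.sh run"
  "pangenome.sh run"
  "panassembly.sh run"
  "gc_prob_scores.sh run"
  "gene_mapping.sh blast"
  "gene_mapping.sh filter"
  "frag_gene_densities.sh run"
  "fragment_seeds.sh run"
  "plasbin_panasm.sh run"
)

# run_pipeline DATASET_DIR (hypothetical helper):
# print each command, then run it unless DRY_RUN=1 is set.
run_pipeline() {
  local dataset_dir="$1" step script subcmd
  for step in "${PIPELINE_STEPS[@]}"; do
    read -r script subcmd <<< "$step"
    echo "./test/$script $subcmd $dataset_dir"
    if [ "${DRY_RUN:-0}" != "1" ]; then
      "./test/$script" "$subcmd" "$dataset_dir"
    fi
  done
}
```

Calling `DRY_RUN=1 run_pipeline "$dataset_dir"` only lists the commands; without `DRY_RUN`, `set -e` stops the run at the first failing step.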

Pangebin-PlasBin-flow conversion

Use PlasBin-flow inputs to Pangebin

```
# pbf : PlasBin-flow
# pg : Pangebin
# Convert plasmidness
pangebin utils pbf-comp plm pbf_plasmidness.tsv pg_plasmidness.tsv
# Convert seeds
pangebin utils pbf-comp seeds pbf_seeds.tsv pg_seeds.tsv
# Recompute GC contents
pangebin sub gc from-gfa graph.gfa pg_gc_scores.tsv
# Run Pangebin on assembly graph
pangebin sub plasbin asm graph.gfa pg_seeds.tsv pg_gc_scores.tsv pg_plasmidness.tsv --outdir pg_outdir
```

Convert Pangebin outputs to PlasBin-flow output

```
pangebin utils pbf-comp bins pg_outdir pbf_bins.tsv
```
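
The two conversion directions can be combined into one round trip. The sketch below only wraps the commands shown in this section; the function name `pbf_roundtrip`, the work-directory layout, and the `PANGEBIN` override variable are assumptions made for illustration and are not part of the Pangebin CLI.

```shell
set -euo pipefail

# pbf_roundtrip GRAPH PBF_PLASMIDNESS PBF_SEEDS WORKDIR (hypothetical helper):
# convert PlasBin-flow inputs, run Pangebin on the assembly graph,
# then convert the bins back to the PlasBin-flow format.
pbf_roundtrip() {
  local graph="$1" pbf_plm="$2" pbf_seeds="$3" workdir="$4"
  local pangebin="${PANGEBIN:-pangebin}"  # assumed override, e.g. for a dry run
  local input
  # Fail early if an input file is missing.
  for input in "$graph" "$pbf_plm" "$pbf_seeds"; do
    [ -f "$input" ] || { echo "missing input: $input" >&2; return 1; }
  done
  mkdir -p "$workdir"
  "$pangebin" utils pbf-comp plm "$pbf_plm" "$workdir/pg_plasmidness.tsv"
  "$pangebin" utils pbf-comp seeds "$pbf_seeds" "$workdir/pg_seeds.tsv"
  "$pangebin" sub gc from-gfa "$graph" "$workdir/pg_gc_scores.tsv"
  "$pangebin" sub plasbin asm "$graph" "$workdir/pg_seeds.tsv" \
    "$workdir/pg_gc_scores.tsv" "$workdir/pg_plasmidness.tsv" \
    --outdir "$workdir/pg_outdir"
  "$pangebin" utils pbf-comp bins "$workdir/pg_outdir" "$workdir/pbf_bins.tsv"
}
```

For example, `pbf_roundtrip graph.gfa pbf_plasmidness.tsv pbf_seeds.tsv work` would leave the converted bins in `work/pbf_bins.tsv`, assuming each command behaves as described above.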

Going further into the details

Understanding GFA tags system:
