genome_assembly_pipeline

This is an integrated pipeline for eukaryotic genome assembly and gene annotation. It currently supports PacBio HiFi reads and RNA-seq reads, both of which are required as inputs. See this page for details on the expected outputs.

Requirements

Before running the workflow, make sure the following software is installed:

Getting Started

Follow the steps below to set up and run the workflow:

1. Clone the Repository

Clone this repository to your local machine:

git clone https://github.com/mkrg01/genome_assembly_pipeline.git
cd genome_assembly_pipeline

2. Prepare Input Files and Configure Settings

See config/README.md for details on preparing input files and adjusting configuration parameters.

3. Execute the Workflow

Run the workflow from the repository root directory. Replace /path/to/repo with the actual path to your local repository:

cd /path/to/repo
snakemake --sdm conda apptainer --singularity-args "--bind $(pwd)" --cores 64 all
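
Before committing to a full run, it can help to preview which jobs would be scheduled. The sketch below adds Snakemake's standard `--dry-run` (`-n`) flag to the command above. The `workflow/Snakefile` check is an assumption based on the usual Snakemake repository layout, and `dry_run_status` is a hypothetical variable used only for illustration.

```shell
# Preview the planned jobs without executing any rule.
# Assumption: run from the repository root, with inputs and config already prepared.
# Assumption: the Snakefile lives at workflow/Snakefile (the conventional layout).
if command -v snakemake >/dev/null 2>&1 && [ -f workflow/Snakefile ]; then
    snakemake --sdm conda apptainer --singularity-args "--bind $(pwd)" \
        --cores 64 --dry-run all \
        && dry_run_status=ok || dry_run_status=failed
else
    echo "snakemake or workflow/Snakefile not found; skipping dry run" >&2
    dry_run_status=skipped
fi
echo "dry run: ${dry_run_status}"
```

A dry run only lists the jobs Snakemake would execute, so it is a cheap way to catch missing input files or configuration mistakes early.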

Tip

You can run the pipeline in a stepwise manner. Replace all with one of the targets below.

assembly_all: Runs rules up to the generation of the Hifiasm assembly and its associated metrics.
remove_organelle_all: Runs rules up to the organelle removal step and its associated metrics.
remove_contamination_all: Runs rules up to the contamination removal step by FCS and its associated metrics.
softmask_all: Runs rules up to softmasking by RepeatMasker.
gene_prediction_all: Runs rules up to gene prediction and related metrics (equivalent to all).

You do not need to run the earlier targets first. For example, if you run remove_contamination_all first, the rules belonging to assembly_all and remove_organelle_all will be executed automatically.
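
As a minimal sketch of a stepwise run, the snippet below substitutes one of the targets listed above for all and prints the resulting command for review. The `target` and `cmd` variables are hypothetical names used only for illustration; the flags are copied from the command in step 3.

```shell
# Choose one of the stepwise targets from the list above.
target="remove_contamination_all"

# Reject anything that is not a documented target.
case "$target" in
    assembly_all|remove_organelle_all|remove_contamination_all|softmask_all|gene_prediction_all|all)
        cmd="snakemake --sdm conda apptainer --singularity-args \"--bind $(pwd)\" --cores 64 $target"
        ;;
    *)
        echo "unknown target: $target" >&2
        cmd=""
        ;;
esac

# Print the command for review; paste it into the shell to run it.
echo "$cmd"
```

Printing the command rather than executing it keeps the sketch safe to try outside the repository.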

Note

Adjust the --cores value based on your available computational resources.
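
One possible way to choose a value is to derive it from the machine's CPU count. This sketch assumes `nproc` (GNU coreutils) is available. The two-core headroom is an arbitrary choice, not a recommendation from the pipeline authors, and `total_cores` and `cores` are hypothetical variable names.

```shell
# Number of CPUs available to this process, as reported by nproc.
total_cores=$(nproc)

# Leave two cores free for other work, but never go below one.
if [ "$total_cores" -gt 2 ]; then
    cores=$(( total_cores - 2 ))
else
    cores=1
fi

# Print the resulting command for review; paste it into the shell to run it.
echo "snakemake --sdm conda apptainer --singularity-args \"--bind $(pwd)\" --cores ${cores} all"
```

On a shared cluster, the number of cores granted by the scheduler is usually the better guide than the node's total CPU count.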

Note

All rules except those with FCS wrapper scripts (fcs.py, run_fcsadaptor.sh) run in containers. These wrapper scripts internally call the main FCS functions, which are executed inside containers.

The output will be generated in the results directory.
