Commit 6435bf7

committed
updated readme
1 parent c38a790 commit 6435bf7

1 file changed

+24
-11
lines changed

README.md

Lines changed: 24 additions & 11 deletions
Original file line number | Diff line number | Diff line change
@@ -65,6 +65,8 @@ For somatic variant calling using **tumor-only** samples, please try [ClairS-TO]
6565
----
6666

6767
## Latest Updates
68+
*v1.1.1 (May 19, 2025)* : 1. Fixed the malformed VCF header issue that occurred specifically in AWS cloud environments ([#380](https://github.com/HKU-BAL/Clair3/issues/380)). 2. Added a Clair3 R10.4.1 model fine-tuned on 12 [bacterial genomes](https://elifesciences.org/reviewed-preprints/98300) with improved variant calling performance for bacterial samples. Performance benchmarks and detailed results are documented in our note ["fine-tuning_Clair3_with_12_bacteria_samples"](docs/fine-tuning_Clair3_with_12_bacteria_samples.pdf) (co-contributor @[William Shropshire](https://github.com/wshropshire)).
69+
6870
*v1.1.0 (Apr 8, 2025)* : 1. Removed `parallel` version checking ([#377](https://github.com/HKU-BAL/Clair3/issues/377)).
6971

7072
*v1.0.11 (Mar 19, 2025)* : 1. Added the `--enable_variant_calling_at_sequence_head_and_tail` option to enable variant calling at the head and tail 16bp of each sequence. Use with caution because alignments are less reliable in the regions, and there would be insufficient context to be fed to the neural network for reliable calling ([#257
@@ -136,9 +138,10 @@ In a docker installation, models are in `/opt/models/`. In a bioconda installati
136138
| Model name | Platform | Option (`-p/--platform`) | Training samples | Included in the bioconda package | Included in the docker image | Date |
137139
| :----------------------------: | :---------: | :----------------------------------------------------------: | -------------------------------- | :--------------------------: | :------: | :------: |
138140
| r941_prom_sup_g5014 | ONT r9.4.1 | `ont` | HG002,4,5 (Guppy5_sup) | Yes | Yes | 20220112 |
139-
| r941_prom_hac_g360+g422 | ONT r9.4.1 | `ont` | HG001,2,4,5 | Yes | Yes | 20210517 |
141+
| r941_prom_hac_g360+g422 | ONT r9.4.1 | `ont` | HG001,2,4,5 | | | 20210517 |
140142
| r941_prom_hac_g360+g422_1235 | ONT r9.4.1 | `ont` | HG001,2,3,5 | | | 20210517 |
141-
| r941_prom_hac_g238 | ONT r9.4.1 | `ont` | HG001,2,3,4 | | Yes | 20210627 |
143+
| r941_prom_hac_g238 | ONT r9.4.1 | `ont` | HG001,2,3,4 | | | 20210627 |
144+
| r1041_e82_400bps_sup_v430_bacteria_finetuned | ONT r10.4.1 | `ont` | Fine-tuned on 12 [bacterial genomes](https://elifesciences.org/reviewed-preprints/98300) | | Yes | 20250519 |
142145
| ~~r941_prom_sup_g506~~ | ONT r9.4.1 | | Base model: HG001,2,4,5 (Guppy3,4) <br>Fine-tuning data: HG002 (Guppy5_sup) | | | 20210609 |
143146
| hifi_revio | PacBio HiFi Revio | `hifi` | HG002,4 | Yes | Yes | 20230522 |
144147
| hifi_sequel2 | PacBio HiFi Sequel II | `hifi` | HG001,2,4,5 | Yes | Yes | 20210517 |
@@ -148,6 +151,15 @@ In a docker installation, models are in `/opt/models/`. In a bioconda installati
148151

149152
ONT provides models for some latest or specific chemistries and basecallers (including both Guppy and Dorado) through [Rerio](https://github.com/nanoporetech/rerio). These models are tested and supported by the ONT developers.
150153

154+
Since v1.1.1, the latest ONT-provided models are also included in the Docker image and, where marked below, in the Bioconda package:
155+
156+
| Model name | Chemistry | Dorado basecaller model | Option (`-p/--platform`) | Training samples | Included in the bioconda package | Included in the docker image |
157+
| :-----------------------: | :---------------------: | ----------------------- | :----------------------: | -------------------------- | :------------------------------: | :--------------------------: |
158+
| r1041_e82_400bps_sup_v500 | R10.4.1 E8.2 (5kHz)     | v5.0.0 SUP              | `ont`                    | Trained by ONT EPI2ME Lab  | Yes                              | Yes                          |
159+
| r1041_e82_400bps_hac_v500 | R10.4.1 E8.2 (5kHz) | v5.0.0 HAC | `ont` | Trained by ONT EPI2ME Lab | | Yes |
160+
| r1041_e82_400bps_sup_v410 | R10.4.1 E8.2 (4kHz) | v4.1.0 SUP | `ont` | Trained by ONT EPI2ME Lab | Yes | Yes |
161+
| r1041_e82_400bps_hac_v410 | R10.4.1 E8.2 (4kHz) | v4.1.0 HAC | `ont` | Trained by ONT EPI2ME Lab | | Yes |
162+
151163

152164
----
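The tables above list model names, while the README text notes that in a Docker installation models are in `/opt/models/`. A minimal sketch of turning a model name into a model path could look like the following. The helper name `resolve_model_path` is hypothetical and not part of Clair3, and the default directory assumes the Docker layout; pass your own directory for other installations.

```shell
# Hypothetical helper (not part of Clair3): join a model directory and a model name.
# $1 = model name from the tables above
# $2 = model directory (optional; defaults to the Docker location /opt/models)
resolve_model_path() {
  model_name="$1"
  model_dir="${2:-/opt/models}"
  # Strip one trailing slash so the result never contains "//".
  printf '%s/%s\n' "${model_dir%/}" "${model_name}"
}

MODEL_NAME="r1041_e82_400bps_sup_v500"
MODEL_PATH="$(resolve_model_path "${MODEL_NAME}")"

# Check that the model is really there before starting a long run.
if [ -d "${MODEL_PATH}" ]; then
  echo "using model: ${MODEL_PATH}"
else
  echo "model not found: ${MODEL_PATH}" >&2
fi
```

Checking the directory up front is a cheap way to catch a model that is listed in the table but not shipped with your installation (see the "Included in ..." columns).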
153165

@@ -200,7 +212,7 @@ A pre-built docker image is available [here](https://hub.docker.com/r/hkubal/cla
200212
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
201213
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
202214
THREADS="[MAXIMUM_THREADS]" # e.g. 8
203-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
215+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
204216

205217
docker run -it \
206218
-v ${INPUT_DIR}:${INPUT_DIR} \
@@ -225,7 +237,7 @@ Check [Usage](#Usage) for more options.
225237
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
226238
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
227239
THREADS="[MAXIMUM_THREADS]" # e.g. 8
228-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
240+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
229241

230242
conda config --add channels defaults
231243
conda create -n singularity-env -c conda-forge singularity -y
@@ -263,7 +275,7 @@ conda create -n clair3 -c bioconda clair3 python=3.9.0 -y
263275
conda activate clair3
264276

265277
# run clair3 like this afterward
266-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
278+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
267279

268280
run_clair3.sh \
269281
--bam_fn=input.bam \ ## change your bam file name here
@@ -331,7 +343,7 @@ wget http://www.bio8.cs.hku.hk/clair3/clair3_models/clair3_models.tar.gz
331343
tar -zxvf clair3_models.tar.gz -C ./models
332344

333345
# run clair3
334-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
346+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
335347
./run_clair3.sh \
336348
--bam_fn=${INPUT_DIR}/input.bam \ ## change your bam file name here
337349
--ref_fn=${INPUT_DIR}/ref.fa \ ## change your reference file name here
@@ -468,7 +480,7 @@ CONTIGS_LIST="[YOUR_CONTIGS_LIST]" # e.g "chr21" or "chr21,chr22"
468480
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
469481
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
470482
THREADS="[MAXIMUM_THREADS]" # e.g. 8
471-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
483+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
472484

473485
docker run -it \
474486
-v ${INPUT_DIR}:${INPUT_DIR} \
@@ -491,7 +503,7 @@ KNOWN_VARIANTS_VCF="[YOUR_VCF_PATH]" # e.g. /home/user1/known_variants.vcf.gz
491503
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
492504
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
493505
THREADS="[MAXIMUM_THREADS]" # e.g. 8
494-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
506+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
495507

496508
docker run -it \
497509
-v ${INPUT_DIR}:${INPUT_DIR} \
@@ -526,7 +538,7 @@ BED_FILE_PATH="[YOUR_BED_FILE]" # e.g. /home/user1/tmp.bed (absolute path
526538
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
527539
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
528540
THREADS="[MAXIMUM_THREADS]" # e.g. 8
529-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
541+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
530542

531543
docker run -it \
532544
-v ${INPUT_DIR}:${INPUT_DIR} \
@@ -548,7 +560,7 @@ docker run -it \
548560
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /home/user1/input (absolute path needed)
549561
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /home/user1/output (absolute path needed)
550562
THREADS="[MAXIMUM_THREADS]" # e.g. 8
551-
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r941_prom_hac_g360+g422
563+
MODEL_NAME="[YOUR_MODEL_NAME]" # e.g. r1041_e82_400bps_sup_v500
552564

553565
docker run -it \
554566
-v ${INPUT_DIR}:${INPUT_DIR} \
@@ -563,7 +575,8 @@ docker run -it \
563575
--output=${OUTPUT_DIR} \
564576
--no_phasing_for_fa \ ## disable phasing for full-alignment
565577
--include_all_ctgs \ ## call variants on all contigs in the reference fasta
566-
--haploid_precise ## optional(enable --haploid_precise or --haploid_sensitive) for haploid calling
578+
--haploid_precise \ ## optional(enable --haploid_precise or --haploid_sensitive) for haploid calling
579+
      --enable_variant_calling_at_sequence_head_and_tail ## optional (enable variant calling at the head and tail 16bp of each sequence)
567580
```
568581
----
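The `##` annotations in the command above are meant for reading: a comment placed after a line-continuation backslash does not work when pasted into a shell. As a hedged sketch, the same option list can be assembled one option per statement so that each comment is legal shell. The function name `build_clair3_args` and the sample paths are hypothetical, `--platform=ont` is just one possible value, and `--threads` and `--model_path` are not shown in this diff; they are assumed from Clair3's documented usage, so check them against `run_clair3.sh --help`.

```shell
# Hypothetical helper: prints one run_clair3.sh option per line (dry run only).
# Arguments: $1 input dir, $2 output dir, $3 threads, $4 model path.
build_clair3_args() {
  input_dir="$1"
  output_dir="$2"
  threads="$3"
  model_path="$4"

  # Clear this function's positional parameters, then append options one by one.
  set --
  set -- "$@" "--bam_fn=${input_dir}/input.bam"   # change your bam file name here
  set -- "$@" "--ref_fn=${input_dir}/ref.fa"      # change your reference file name here
  set -- "$@" "--threads=${threads}"              # assumed flag, not shown in this diff
  set -- "$@" "--platform=ont"                    # value from the -p/--platform column
  set -- "$@" "--model_path=${model_path}"        # assumed flag, not shown in this diff
  set -- "$@" "--output=${output_dir}"
  set -- "$@" "--no_phasing_for_fa"               # disable phasing for full-alignment
  set -- "$@" "--include_all_ctgs"                # call variants on all contigs
  set -- "$@" "--haploid_precise"                 # or --haploid_sensitive
  set -- "$@" "--enable_variant_calling_at_sequence_head_and_tail"

  printf '%s\n' "$@"
}

# Print the options for review; nothing is executed here.
build_clair3_args "/home/user1/input" "/home/user1/output" 8 \
  "/opt/models/r1041_e82_400bps_sup_v500"
```

Keeping each option in its own statement makes it easy to comment one out, for example to drop `--enable_variant_calling_at_sequence_head_and_tail` when alignments at sequence ends are not trusted.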
569582
