vcf_to_tsv 🧬

Converts a VCF (Variant Call Format) file into a tab-separated values (TSV) file.

Compilation and functionality have been verified on the following operating systems:

macOS 🍏
Linux 🐧

Download and Compilation 💾

>>> git clone https://github.com/alexcoppe/vcf_to_tsv
>>> cd vcf_to_tsv
>>> make

After compilation, move the generated executable vcf_to_tsv to a directory listed in the $PATH variable. You can identify these directories by using the echo $PATH command.

Run the software 🏃‍♂️

This software transforms an uncompressed VCF file to a tab-separated values (tsv) file. It also works with VCFs generated by SnpEff and ANNOVAR.

To run it, you need two arguments: the VCF file and a text file specifying the desired fields. Refer to the table below for guidance on creating this file.

When run on a SnpEff-annotated VCF, the tool currently writes each transcript reported by SnpEff on a separate row.
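SnpEff stores its annotations in the ANN entry of the INFO column, with one comma-separated annotation per transcript. The sketch below shows how such a value could be split into one entry per output row. The function names `split` and `ann_entries` are hypothetical and are not taken from the vcf_to_tsv source.

```cpp
#include <string>
#include <vector>

// Split `text` on `delim`, keeping empty pieces.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type end = text.find(delim, start);
        if (end == std::string::npos) {
            parts.push_back(text.substr(start));
            break;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

// One element per transcript annotation in a SnpEff ANN value
// (the text after "ANN="). Hypothetical helper, for illustration only.
std::vector<std::string> ann_entries(const std::string& ann_value) {
    return split(ann_value, ',');
}
```

A variant annotated on two transcripts yields two entries, which is why it appears on two rows of the output.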

Starting character	What you get
None	a standard field (column) from the VCF
:	a subfield from the INFO annotation added by SnpEff
;	a specific subfield from the INFO field
|	a specific subfield from the Genotype fields
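The prefix rules in the table can be sketched as a small classifier that looks at the first character of each line of the fields file. This is only an illustration of the convention: `FieldKind`, `FieldSpec` and `parse_field_spec` are hypothetical names, and the actual vcf_to_tsv source may organise this differently.

```cpp
#include <string>

// Hypothetical categories, one per row of the table above.
enum class FieldKind { VcfColumn, SnpEffSubfield, InfoSubfield, GenotypeSubfield };

struct FieldSpec {
    FieldKind kind;
    std::string name;  // the line with its leading marker removed
};

// Classify one line of the fields file by its starting character.
FieldSpec parse_field_spec(const std::string& line) {
    if (!line.empty()) {
        switch (line[0]) {
            case ':': return {FieldKind::SnpEffSubfield, line.substr(1)};
            case ';': return {FieldKind::InfoSubfield, line.substr(1)};
            case '|': return {FieldKind::GenotypeSubfield, line.substr(1)};
        }
    }
    // No marker: treat the whole line as a standard VCF field name.
    return {FieldKind::VcfColumn, line};
}
```

For instance, `parse_field_spec(":hgvs_c")` is classified as a SnpEff subfield named `hgvs_c`, while `parse_field_spec("position")` is a plain VCF field.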

Example of a text file specifying the desired fields and subfields:

:hgvs_c
position
;gnomAD_genome_AMR
|AD

Launching the program with the above text file:

vcf_to_tsv a_vcf_file_path.vcf wanted_fields.txt

Output:

n.-3702C>T      157370625       0.0020  14,1    31,5
n.*1931C>T      157370625       0.0020  14,1    31,5
n.-3707C>T      157370630       0       15,1    33,4
...
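The third column of this output corresponds to the `;gnomAD_genome_AMR` line of the fields file. In a VCF, the INFO column is a semicolon-separated list of `KEY=VALUE` items (plus bare flags), so a lookup of this kind could be sketched as follows. `info_value` is a hypothetical helper, and returning an empty string for a missing key is an assumption, not necessarily what vcf_to_tsv prints.

```cpp
#include <string>

// Return the value of `key` in a VCF INFO column ("K1=V1;K2=V2;FLAG"),
// or "" when the key is absent or is a flag without a value.
std::string info_value(const std::string& info, const std::string& key) {
    std::string::size_type start = 0;
    while (start <= info.size()) {
        std::string::size_type end = info.find(';', start);
        if (end == std::string::npos) end = info.size();
        const std::string item = info.substr(start, end - start);
        const std::string::size_type eq = item.find('=');
        // Match only when the text before '=' is exactly the key.
        if (eq != std::string::npos && item.compare(0, eq, key) == 0) {
            return item.substr(eq + 1);
        }
        start = end + 1;
    }
    return "";
}
```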

Currently, the software supports only VCF files with 1 or 2 genotype fields.
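In a VCF, the FORMAT column lists genotype keys separated by colons (for example `GT:AD:DP`), and each genotype column lists the values in the same order. A `|AD` request could therefore be resolved once per genotype column, as in this sketch. The names `split_colons` and `genotype_value` are hypothetical, and using `.` for a missing value follows the VCF convention rather than confirmed vcf_to_tsv behaviour.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a colon-separated VCF FORMAT or genotype column into its pieces.
std::vector<std::string> split_colons(const std::string& column) {
    std::vector<std::string> parts;
    std::istringstream stream(column);
    std::string piece;
    while (std::getline(stream, piece, ':')) parts.push_back(piece);
    return parts;
}

// Look up `key` in FORMAT and return the value at the same position
// in one genotype column, or "." when it is not available.
std::string genotype_value(const std::string& format,
                           const std::string& sample,
                           const std::string& key) {
    const std::vector<std::string> keys = split_colons(format);
    const std::vector<std::string> values = split_colons(sample);
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (keys[i] == key) {
            return i < values.size() ? values[i] : std::string(".");
        }
    }
    return ".";
}
```

With two genotype columns, calling `genotype_value` once per column would give two output values, such as the `14,1` and `31,5` pair in the example output.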

The table below lists all the subfields added by SnpEff, together with the corresponding subfield names used by vcf_to_tsv (first column).

Subfield by vcf_to_tsv	Subfield by SnpEff	Explanation
:allele	Allele (or ALT)	The alternative allele
:annotation	Annotation (a.k.a. effect)	Annotated using Sequence Ontology terms
:putative_impact	Putative_impact	A simple estimation of putative impact / deleteriousness: {HIGH, MODERATE, LOW, MODIFIER}
:gene_name	Gene Name	Common gene name (HGNC)
:gene_id	Gene ID	Gene ID
:feature_type	Feature type	Which type of feature is in the next field
:feature_id	Feature ID	Depends on the annotation
:transcript_biotype	Transcript biotype	The bare minimum is at least a description on whether the transcript is {"Coding", "Noncoding"}. Whenever possible, use ENSEMBL biotypes
:rank	Rank / total	Exon or Intron rank / total number of exons or introns
:hgvs_c	HGVS.c	Variant using HGVS notation (DNA level)
:hgvs_p	HGVS.p	If variant is coding, this field describes the variant using HGVS notation (Protein level)
:cdna_position	cDNA_position / cDNA_len	Position in cDNA and transcript's cDNA length (one based)
:cds_position	CDS_position / CDS_len	Position and number of coding bases (one based, including START and STOP codons)
:protein_position	Protein_position / Protein_len	Position and number of AA (one based, including START, but not STOP)
:distance_to_feature	Distance to feature	All items in this field are optional; see the SnpEff page for details
:errors	Errors, Warnings or Information messages	Errors, warnings or informative messages that can affect annotation accuracy
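Within one SnpEff annotation, the subfields are separated by `|` and appear in the same order as the rows of the table above. A lookup by name could therefore be sketched as a mapping from name to position. `kAnnNames` and `ann_subfield` are hypothetical names, the name is passed without its leading `:`, and returning an empty string for an unknown or absent subfield is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Names from the first column of the table, in SnpEff's ANN order.
const std::array<const char*, 16> kAnnNames = {{
    "allele", "annotation", "putative_impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank", "hgvs_c",
    "hgvs_p", "cdna_position", "cds_position", "protein_position",
    "distance_to_feature", "errors"}};

// Return the named subfield of one '|'-separated ANN entry,
// or "" when the name is unknown or the subfield is empty or absent.
std::string ann_subfield(const std::string& entry, const std::string& name) {
    std::vector<std::string> parts;
    std::istringstream stream(entry);
    std::string piece;
    while (std::getline(stream, piece, '|')) parts.push_back(piece);
    for (std::size_t i = 0; i < kAnnNames.size(); ++i) {
        if (name == kAnnNames[i]) {
            return i < parts.size() ? parts[i] : std::string();
        }
    }
    return "";
}
```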
