input in HGVSp and HGVSc gives unexpected REF allele in reversed strand genes #1861

Open
@FedericaBrando

Describe the issue

I'm not sure whether this is an issue or whether I am not fully getting the rationale behind this behaviour, so I would like to better understand the results I get.

When I input HGVSp (protein sequence) and HGVSc (coding sequence) notation into VEP for genes on the reverse strand, I get the complementary nucleotides in REF and ALT compared with what I get from genomic-coordinate input. I was expecting REF to always be reported in the same 5'->3' orientation in the output, but this doesn't seem to be the case.

Additional information

For example, take the mutation NRAS:p.Gly13Asp. If I run VEP on this mutation in HGVS format and in genomic coordinates, this is what I get:

| Uploaded_variants | NRAS:p.Gly13Asp | 1_114716123_G/A | 1_114716123_C/T |
| --- | --- | --- | --- |
| Location | 1:114716123 | 1:114716123 | 1:114716123 |
| REF_ALLELE | G | G | C |
| Allele | A | A | T |
| Gene | ENSG00000213281 | ENSG00000213281 | ENSG00000213281 |
| Consequence | missense_variant | missense_variant | missense_variant |
| SYMBOL | NRAS | NRAS | NRAS |
| HGNC_ID | HGNC:7989 | HGNC:7989 | HGNC:7989 |
| ENSP | ENSP00000358548 | ENSP00000358548 | ENSP00000358548 |
| CANONICAL | YES | YES | YES |
| Amino_acids | G/D | A/V | G/D |
| Protein_position | 13 | 13 | 13 |
| Feature | ENST00000369535 | ENST00000369535 | ENST00000369535 |
| EXON | 2/7 | 2/7 | 2/7 |
| INTRON | - | - | - |
| Codons | gGt/gAt | gCt/gTt | gGt/gAt |
| STRAND | -1 | -1 | -1 |
| HGVSp | ENSP00000358548.4:p.Gly13Asp | ENSP00000358548.4:p.Ala13Val | ENSP00000358548.4:p.Gly13Asp |
| HGVSc | ENST00000369535.5:c.38G>A | ENST00000369535.5:c.38G>T | ENST00000369535.5:c.38G>A |
| HGVSg | - | - | - |
| MANE_SELECT | NM_002524.5 | NM_002524.5 | NM_002524.5 |

Even though the gene is on the reverse strand, I was expecting REF to be C for the HGVS input as well, so I do not fully understand these results. Could I have an explanation?
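To make my expectation concrete, here is a minimal sketch of the mapping I had in mind. It is my own illustration and not VEP code: the helper name `to_forward_strand` is hypothetical, and it assumes that alleles in genomic coordinates are written on the forward strand while the alleles in HGVSc are written on the transcript strand.

```python
# Illustration only, not VEP code: `to_forward_strand` is a hypothetical helper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def to_forward_strand(allele: str, strand: int) -> str:
    """Return a transcript-strand allele as read on the forward genomic strand."""
    if strand not in (1, -1):
        raise ValueError("strand must be 1 or -1")
    if strand == 1:
        return allele
    # Reverse strand: reverse the sequence and complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(allele.upper()))


# ENST00000369535.5:c.38G>A, with STRAND -1 as reported in the table above.
ref, alt = "G", "A"
forward_ref = to_forward_strand(ref, -1)
forward_alt = to_forward_strand(alt, -1)
print(f"1_114716123_{forward_ref}/{forward_alt}")  # prints 1_114716123_C/T
```

Under that assumption, c.38G>A on this transcript corresponds to C/T in genomic coordinates, which matches the third column of the table (input 1_114716123_C/T gives c.38G>A and p.Gly13Asp).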

System

I am using docker://ensemblorg/ensembl-vep:release_111.0

Full VEP command line

singularity exec docker://ensemblorg/ensembl-vep:release_111.0 vep -i /tmp/tmps5xkl4ld_input.tsv -o STDOUT --cache --cache_version 111 --dir_cache /home/user/.vep --assembly GRCh38 --force_overwrite --offline --hgvs --tab --fields Uploaded_variation,Location,REF_ALLELE,Allele,Gene,Consequence,SYMBOL,HGNC_ID,ENSP,CANONICAL,Amino_acids,Protein_position,Feature,EXON,INTRON,Codons,STRAND,HGVSp,HGVSc,HGVSg,MANE_SELECT --show_ref_allele --symbol --canonical --protein --numbers --no_stats --fork 4 --mane --warning_file tmp/preprocessing/vep/vep_warnings.txt

Full error message

There is no error. This is the only warning I get, and I guess it is not actually relevant:

WARNING: Possible invalid use of gene or protein identifier 'NRAS' as HGVS reference; most likely transcript will be selected
