Description
Describe the issue
I'm not sure whether this is a bug or whether I am just not following the rationale behind this behaviour, so I would like to better understand the results I get.
When I give VEP HGVSp (protein-level) or HGVSc (coding-sequence) input for genes on the reverse strand, the REF and ALT alleles in the output are the complement of the ones I expect. I was expecting REF to always be reported on the forward genomic strand (5'->3'), but this doesn't seem to be the case.
Additional information
For example, take the variant NRAS:p.Gly13Asp. If I run VEP on it in HGVS format and in genomic coordinates, this is the result:
Uploaded_variants | NRAS:p.Gly13Asp | 1_114716123_G/A | 1_114716123_C/T |
---|---|---|---|
Location | 1:114716123 | 1:114716123 | 1:114716123 |
REF_ALLELE | G | G | C |
Allele | A | A | T |
Gene | ENSG00000213281 | ENSG00000213281 | ENSG00000213281 |
Consequence | missense_variant | missense_variant | missense_variant |
SYMBOL | NRAS | NRAS | NRAS |
HGNC_ID | HGNC:7989 | HGNC:7989 | HGNC:7989 |
ENSP | ENSP00000358548 | ENSP00000358548 | ENSP00000358548 |
CANONICAL | YES | YES | YES |
Amino_acids | G/D | A/V | G/D |
Protein_position | 13 | 13 | 13 |
Feature | ENST00000369535 | ENST00000369535 | ENST00000369535 |
EXON | 2/7 | 2/7 | 2/7 |
INTRON | - | - | - |
Codons | gGt/gAt | gCt/gTt | gGt/gAt |
STRAND | -1 | -1 | -1 |
HGVSp | ENSP00000358548.4:p.Gly13Asp | ENSP00000358548.4:p.Ala13Val | ENSP00000358548.4:p.Gly13Asp |
HGVSc | ENST00000369535.5:c.38G>A | ENST00000369535.5:c.38G>T | ENST00000369535.5:c.38G>A |
HGVSg | - | - | - |
MANE_SELECT | NM_002524.5 | NM_002524.5 | NM_002524.5 |
Even though the gene is on the reverse strand, I was expecting REF to be C for the HGVS input, so I do not fully understand these results. Could I have an explanation?
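To make the expectation above concrete, here is a minimal Python sketch of the conversion I assumed would be applied to reverse-strand genes. The helper `to_forward_strand` is my own hypothetical name and is not part of VEP; it only illustrates how the transcript-strand alleles from HGVSc (c.38G>A) would map to the forward genomic strand.

```python
# Hypothetical helper, not part of VEP: rewrite an allele given on the
# transcript strand as it would read on the forward (+) genomic strand.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_forward_strand(allele: str, strand: int) -> str:
    """Return the allele as read 5'->3' on the forward genomic strand."""
    if strand == 1:
        # Transcript and genome share the same orientation.
        return allele
    if strand == -1:
        # Reverse-complement so that multi-base alleles are handled too.
        return allele.translate(COMPLEMENT)[::-1]
    raise ValueError(f"strand must be 1 or -1, got {strand!r}")


# NRAS is on STRAND -1 and the transcript-level change is c.38G>A.
transcript_ref, transcript_alt = "G", "A"
forward_ref = to_forward_strand(transcript_ref, -1)
forward_alt = to_forward_strand(transcript_alt, -1)

print(f"1_114716123_{forward_ref}/{forward_alt}")  # 1_114716123_C/T
```

Under this assumption, the HGVS input would be reported with REF_ALLELE C and Allele T, matching the 1_114716123_C/T column rather than showing G and A.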
System
I am using docker://ensemblorg/ensembl-vep:release_111.0
Full VEP command line
singularity exec docker://ensemblorg/ensembl-vep:release_111.0 vep -i /tmp/tmps5xkl4ld_input.tsv -o STDOUT --cache --cache_version 111 --dir_cache /home/user/.vep --assembly GRCh38 --force_overwrite --offline --hgvs --tab --fields Uploaded_variation,Location,REF_ALLELE,Allele,Gene,Consequence,SYMBOL,HGNC_ID,ENSP,CANONICAL,Amino_acids,Protein_position,Feature,EXON,INTRON,Codons,STRAND,HGVSp,HGVSc,HGVSg,MANE_SELECT --show_ref_allele --symbol --canonical --protein --numbers --no_stats --fork 4 --mane --warning_file tmp/preprocessing/vep/vep_warnings.txt
Full error message
This is the only warning I get, and I guess it is not relevant here:
WARNING: Possible invalid use of gene or protein identifier 'NRAS' as HGVS reference; most likely transcript will be selected