Some strain names in titer records are misspelled #60

Open
huddlej opened this issue Apr 12, 2021 · 0 comments
Labels
bug Something isn't working


Current behavior

Titer records define strain names for test and reference viruses, but we do not automatically cross-check these names with existing names in GISAID/GenBank records. As a result, there can be misspelled strain names in the titer records that lead us to omit these measurements from analyses when they do not match the corresponding sequence record's strain name.

Tal from the Bloom lab has helpfully provided a list of misspellings that he detected in our Crick titers:

"Misspelled Virus Name","Correct Virus Name"
"A/Camb/925256/2020","A/Cambodia/925256/2020"
"A/Christchurch/4/1985","A/ChristChurch/4/1985"
"A/Christchurch/515/2019","A/ChristChurch/515/2019"
"A/Christchurch/515/2019-egg","A/ChristChurch/515/2019-egg"
"A/CotedIvore/544/2016","A/CoteDIvoire/544/2016"
"A/Eng/538/2018","A/England/538/2018"
"A/Greecd/4/2017","A/Greece/4/2017"
"A/Hk/5738/2014","A/HongKong/5738/2014"
"A/Hk/656/2018","A/HongKong/656/2018"
"A/Hk/675/2018","A/HongKong/675/2018"
"A/Lyon/CHU/R1811667/2018","A/Lyon/CHU-R1811667/2018"
"A/Lyon/CHU/R181282/2018","A/Lyon/CHU-R181282/2018"
"A/Lyon/CHU/R1813393/2018","A/Lyon/CHU-R1813393/2018"
"A/Lyon/CHU/R190259/2019","A/Lyon/CHU-R190259/2019"
"A/Lyon/CHU/R190377/2019","A/Lyon/CHU-R190377/2019"
"A/Lyon/CHU/R1914685/2019","A/Lyon/CHU-R1914685/2019"
"A/Lyon/CHU/R1915450/2019","A/Lyon/CHU-R1915450/2019"
"A/Lyon/EHPAD/108/2019","A/Lyon/EHPAD-108/2019"
"A/Nor/2516/2018","A/Norway/2516/2018"
"A/Nor/2620/2018","A/Norway/2620/2018"
"A/Nor/4436/2016","A/Norway/4436/2016"
"A/Norway/3806-egg","A/Norway/3806/2016-egg"
"A/Singapore/INFIMH-16-001/2016","A/Singapore/INFIMH-16-0019/2016"
"A/Singapore/INFIMH-16-001/2016-egg","A/Singapore/INFIMH-16-0019/2016-egg"
"A/Singapore/Infimh-16-0019/2016","A/Singapore/INFIMH-16-0019/2016"
"A/Singapore/Infimh-16-0019/2016-egg","A/Singapore/INFIMH-16-0019/2016-egg"
"A/StEtienne/1912/2018","A/Saint-Etienne/1912/2018"
"A/StEtienne/1998/2018","A/Saint-Etienne/1998/2018"
"A/StEtienne/2539/2020","A/Saint-Etienne/2539/2020"
"A/Stock/6/2014","A/Stockholm/6/2014"
"A/Switz/8060/2017-egg","A/Switzerland/8060/2017-egg"
"A/Switzerlandz/8060/2017-egg","A/Switzerland/8060/2017-egg"

Expected behavior

The misspelled strain names in the list above should be corrected so that they match the strain names of their corresponding sequence records.
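As a rough illustration of the correction step, the list above could be loaded as a lookup table and applied to titer records before they are matched against sequences. This is only a sketch: the field names `virus_strain` and `serum_strain` and the helper names are assumptions for the example, not confirmed names from our schema, and only a few rows of the list are reproduced inline.

```python
import csv
import io

# A few rows copied from the list above. In practice the full CSV would be
# read from a file rather than embedded in the script.
CORRECTIONS_CSV = '''"Misspelled Virus Name","Correct Virus Name"
"A/Camb/925256/2020","A/Cambodia/925256/2020"
"A/Greecd/4/2017","A/Greece/4/2017"
"A/Hk/5738/2014","A/HongKong/5738/2014"
'''


def load_corrections(handle):
    """Build a {misspelled name: correct name} dict from the two-column CSV."""
    reader = csv.DictReader(handle)
    return {
        row["Misspelled Virus Name"]: row["Correct Virus Name"]
        for row in reader
    }


def correct_titer_record(record, corrections,
                         fields=("virus_strain", "serum_strain")):
    """Return a copy of the record with known misspellings replaced.

    The field names are hypothetical placeholders for the test and
    reference strain columns of a titer record.
    """
    fixed = dict(record)
    for field in fields:
        name = fixed.get(field)
        if name in corrections:
            fixed[field] = corrections[name]
    return fixed


corrections = load_corrections(io.StringIO(CORRECTIONS_CSV))
record = {
    "virus_strain": "A/Hk/5738/2014",
    "serum_strain": "A/Greecd/4/2017",
    "titer": "160",
}
corrected = correct_titer_record(record, corrections)
```

Returning a copy rather than editing the record in place keeps the original values available for an audit log of what was changed.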

Possible solution

In addition to manually correcting these records in our database, we should consider flagging any titer records with potential misspellings. One easy check would be to flag records whose test or reference strains have no corresponding record in the sequence database.
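A minimal sketch of that check, assuming titer records are available as dicts and sequence strain names as a plain list. The field names `virus_strain` and `serum_strain`, the function name, and the similarity cutoff are all assumptions for illustration. The fuzzy suggestions from the standard library's `difflib` are an optional extra on top of the membership test, not part of the check proposed above.

```python
import difflib


def flag_unmatched_strains(titer_records, sequence_strains,
                           fields=("virus_strain", "serum_strain")):
    """List titer strain names that have no matching sequence record.

    Each flagged entry carries up to three similar sequence strain names
    as candidate corrections for a curator to review.
    """
    known = set(sequence_strains)
    candidates = sorted(known)
    flagged = []
    for record in titer_records:
        for field in fields:
            name = record.get(field)
            if name is None or name in known:
                continue
            # The cutoff of 0.8 is an arbitrary starting point, not a tuned value.
            suggestions = difflib.get_close_matches(
                name, candidates, n=3, cutoff=0.8
            )
            flagged.append(
                {"field": field, "name": name, "suggestions": suggestions}
            )
    return flagged


sequence_strains = ["A/Greece/4/2017", "A/Norway/2516/2018"]
titer_records = [
    {"virus_strain": "A/Greecd/4/2017", "serum_strain": "A/Norway/2516/2018"},
]
flagged = flag_unmatched_strains(titer_records, sequence_strains)
```

Single-character typos such as "A/Greecd/4/2017" are likely to get a useful suggestion this way. Abbreviations such as "A/Hk/..." may return no suggestions at all, but they are still flagged as unmatched, which is the main point of the check.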

huddlej added the bug label on Apr 12, 2021