Check strain names in titer data #286

### Context

@jameshadfield recently compared titer strain names against the strain names of sequences from the new ingest workflow. The match rate is about as good (or bad) as the current match rate against fauna sequences ([Slack](https://bedfordlab.slack.com/archives/C03KWDET9/p1765500798956549)).

We should add a check to verify that titer strain names match an existing sequence strain name. This would help us identify misspelled strain names such as those in https://github.com/nextstrain/seasonal-flu/issues/60. 

### Description

Even before we move titer ingest out of fauna, we can add this check into the existing workflow for titers.

We currently download titers from fauna, split by lineage/center/passage/assay, but the `who` center is a superset of all other centers. So after `download_titers`, we can check the `data/<lineage>/who_<passage>_<assay>_titers.tsv` files against `<lineage>/metadata.tsv`. The check would verify that each `virus_strain` and `serum_strain` value in the titers matches a `strain` in the metadata, and write every value that does not match to a log file.
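A minimal sketch of what this check could look like, not a proposal for the final implementation. It assumes both TSVs have a header row with the `virus_strain`, `serum_strain`, and `strain` columns named above. The function names and the two-column log format are hypothetical.

```python
import csv


def read_strains(metadata_path):
    """Return the set of values in the `strain` column of a metadata TSV."""
    with open(metadata_path, newline="", encoding="utf-8") as handle:
        return {row["strain"] for row in csv.DictReader(handle, delimiter="\t")}


def find_unmatched_titer_strains(titers_path, known_strains):
    """Return sorted, de-duplicated (column, value) pairs missing from known_strains.

    Assumes the titers TSV has a header with `virus_strain` and `serum_strain`.
    """
    unmatched = set()
    with open(titers_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            for column in ("virus_strain", "serum_strain"):
                if row[column] not in known_strains:
                    unmatched.add((column, row[column]))
    return sorted(unmatched)


def write_log(unmatched, log_path):
    """Write the unmatched pairs to a TSV log (hypothetical format)."""
    with open(log_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["column", "strain"])
        writer.writerows(unmatched)
```

In the workflow this could run as a rule after `download_titers`, once per `who_<passage>_<assay>_titers.tsv` file, with the metadata strains loaded once per lineage.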

The log file could then be used to generate a mapping file from bad titer strain names to existing sequence strain names. That mapping could be applied in a curation step before the titer data is uploaded to S3.
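As a rough illustration of the curation step, the sketch below rewrites strain names in a titers TSV using a mapping file. The mapping file layout (a TSV with `titer_strain` and `sequence_strain` columns) and the function names are assumptions made for this example, as is the presence of a header row in the titers file.

```python
import csv


def load_strain_map(map_path):
    """Load a hypothetical two-column TSV: titer_strain -> sequence_strain."""
    with open(map_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row["titer_strain"]: row["sequence_strain"] for row in reader}


def curate_titers(in_path, out_path, strain_map):
    """Copy a titers TSV, replacing mapped virus_strain/serum_strain values."""
    with open(in_path, newline="", encoding="utf-8") as src, \
         open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        writer = csv.DictWriter(
            dst, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        for row in reader:
            for column in ("virus_strain", "serum_strain"):
                # Names without a mapping entry pass through unchanged.
                row[column] = strain_map.get(row[column], row[column])
            writer.writerow(row)
```

Leaving unmapped names untouched means the check can be rerun on the curated output to see which names still need a mapping entry.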
