
Multiple Species Labels #27

Open
@riggsd


The GUANO spec says that Species Auto ID and Species Manual ID are a "list of strings".

The serialized value should be comma-separated:

Species Manual ID: Mylu, Epfu

The Python library currently returns a str value, which in the above case would be "Mylu, Epfu".

This means that library users would need to split it and strip it to know how many and which species labels were present.
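To illustrate the burden, here is a minimal sketch of what a caller has to do today. The raw value is hard-coded as a stand-in for what the library returns; no library call is shown.

```python
# Stand-in for the str value the library currently returns for the field.
raw = "Mylu, Epfu"

# Split on commas, strip whitespace around each label, and drop empty pieces.
labels = [part.strip() for part in raw.split(",") if part.strip()]

print(len(labels))  # 2
print(labels)       # ['Mylu', 'Epfu']
```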

It also means that they might need to parse it and reassemble it in order to append a new value. (The spec doesn't actually state that values must be unique, but for most use cases that's probably desired.)
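Appending a label against the current str value looks roughly like the sketch below. The helper name append_species is hypothetical, and skipping duplicates is my assumption about the desired behavior, not something the spec requires.

```python
def append_species(raw, new_label):
    """Return the serialized field with new_label appended (hypothetical helper).

    Duplicates are skipped; this is an assumption, since the spec does not
    say that values must be unique.
    """
    # Parse the comma-separated string into a clean list of labels.
    labels = [part.strip() for part in raw.split(",") if part.strip()]
    if new_label not in labels:
        labels.append(new_label)
    # Reassemble into the serialized comma-separated form.
    return ", ".join(labels)


print(append_species("Mylu, Epfu", "Laci"))  # Mylu, Epfu, Laci
print(append_species("Mylu, Epfu", "Epfu"))  # Mylu, Epfu
```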

Therefore I think we should set up coercion/serialization for these Species fields so that they return a Python list, as follows:

  1. Species field not present: md.get("Species Auto ID") -> None and "Species Auto ID" in md -> False

  2. Species field empty/blank: md["Species Auto ID"] -> []

  3. Species field has a single species label: md["Species Auto ID"] -> ["Mylu"]

  4. Species field has multiple species labels: md["Species Auto ID"] -> ["Mylu", "Epfu"]
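A possible coerce/serialize pair for cases 2 through 4 might look like the following. The function names coerce_species and serialize_species are hypothetical, and how they would be hooked into the library is left open; this is a sketch of the conversion only.

```python
def coerce_species(value):
    """Turn the serialized str into a list of labels (hypothetical name)."""
    # An empty or blank field yields an empty list.
    return [label.strip() for label in value.split(",") if label.strip()]


def serialize_species(labels):
    """Turn a list of labels back into the comma-separated form (hypothetical name)."""
    return ", ".join(labels)


print(coerce_species(""))                    # []
print(coerce_species("Mylu"))                # ['Mylu']
print(coerce_species("Mylu, Epfu"))          # ['Mylu', 'Epfu']
print(serialize_species(["Mylu", "Epfu"]))   # Mylu, Epfu
```

Case 1 (field not present) needs no coercion, since the key is simply absent from the metadata.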

The spec defines these fields as optional, so the way to check whether a recording is a Mylu recording would be:

if "Mylu" in md.get("Species Auto ID", []): ...

It would be cleaner and more intuitive if these two fields were required, but Timestamp is the only required field at this time.

Another approach that would make it slightly cleaner is adding explicit accessor methods for all of the well-known fields: the in operator could still check whether the field was present, while def species_auto_id(self) -> list would always return a list, possibly empty.
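A rough sketch of the accessor idea, using a dict subclass named Metadata as a hypothetical stand-in for the library's metadata object (the real class and its internals may differ), and assuming the field is already stored as a list:

```python
class Metadata(dict):
    """Hypothetical stand-in for the library's metadata object."""

    def species_auto_id(self) -> list:
        # Always return a list; empty when the field is absent.
        return list(self.get("Species Auto ID", []))


md = Metadata()
print("Species Auto ID" in md)          # False
print(md.species_auto_id())             # []

md["Species Auto ID"] = ["Mylu", "Epfu"]
print("Species Auto ID" in md)          # True
print("Mylu" in md.species_auto_id())   # True
```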
