@DiamondJoseph
Contributor

Builds on the work of #1036 - committing in order to hand off to @dan-fernandes.

The general idea is that there is a root Adapter type:

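from abc import ABC, abstractmethod
from typing import Generic, List, Optional, Set, TypeVar

from tiled.storage import Storage
from tiled.structures.core import Spec, StructureFamily
from tiled.type_aliases import JSON

S = TypeVar("S")  # bound to the Structure base type in the flowchart below; left unbound here

class Adapter(ABC, Generic[S]):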
    def __init__(
        self,
        structure: S,
        *,
        metadata: Optional[JSON] = None,
        specs: Optional[List[Spec]] = None,
    ):
        self._structure = structure
        self._metadata = metadata or {}
        self._specs = specs or []

    @property
    def metadata(self) -> JSON:
        return self._metadata

    @property
    def specs(self) -> List[Spec]:
        return self._specs

    def structure(self) -> S:
        return self._structure

    @classmethod
    @abstractmethod
    def structure_family(cls) -> StructureFamily:
        ...

    @classmethod
    def supported_storage(cls) -> Set[type[Storage]]:
        return set()

And then for each Structure type there is a class that defines the Structure-type behaviours, e.g.

E = TypeVar("E")

class ArrayAdapter(Adapter[ArrayStructure], Generic[E]):
    def __init__(
        self,
        array: NDArray[E],
        structure: ArrayStructure,
        *,
        metadata: Optional[JSON] = None,
        specs: Optional[List[Spec]] = None,
    ) -> None:
        self._array = array
        super().__init__(structure, metadata=metadata, specs=specs)

    @classmethod
    def structure_family(cls) -> StructureFamily:
        return StructureFamily.array

    @classmethod
    def from_array(
        cls,
        array: NDArray[E],
        *,
        # keyword-only arguments omitted in this outline
    ) -> "ArrayAdapter[E]":
        ...

    @property
    def dims(self) -> Optional[Tuple[str, ...]]:
        ...

    def read(
        self,
        slice: NDSlice = NDSlice(...),
    ) -> NDArray[E]:
        ...

    def read_block(
        self,
        block: Tuple[int, ...],
        slice: NDSlice = NDSlice(...),
    ) -> NDArray[E]:
        ...

Specific implementations may then slightly modify the behaviour or implementation (e.g. a generic adapter may not be storable on disc or in a database, but an implementation backed by a file type or a database may be). Typing the methods, especially the constructing class methods, makes it clearer what is going on and what the design intends, and lets that structure eventually be followed up into nodes.
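
As a rough illustration of that layering, here is a minimal, self-contained sketch. It does not import tiled: `ArrayStructure`, `FileStorage`, and `FileBackedArrayAdapter` are stand-ins invented for this example, nested lists replace NumPy arrays, and the real classes in the PR carry more methods and arguments.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Generic, List, Optional, Set, Tuple, TypeVar


# Stand-ins for tiled's real types, defined here only so the sketch runs alone.
@dataclass(frozen=True)
class ArrayStructure:
    shape: Tuple[int, ...]


class FileStorage:
    """Hypothetical marker meaning 'can be persisted as a file'."""


S = TypeVar("S")


class Adapter(ABC, Generic[S]):
    """Root type: holds the structure and metadata shared by every adapter."""

    def __init__(self, structure: S, *, metadata: Optional[Dict[str, Any]] = None) -> None:
        self._structure = structure
        self._metadata = metadata or {}

    @property
    def metadata(self) -> Dict[str, Any]:
        return self._metadata

    def structure(self) -> S:
        return self._structure

    @classmethod
    @abstractmethod
    def structure_family(cls) -> str:
        ...

    @classmethod
    def supported_storage(cls) -> Set[type]:
        return set()  # by default an adapter declares no storage backends


class ArrayAdapter(Adapter[ArrayStructure]):
    """Structure-type layer: behaviour common to all array adapters."""

    def __init__(self, data: List[List[float]], structure: ArrayStructure, *, metadata=None) -> None:
        self._data = data
        super().__init__(structure, metadata=metadata)

    @classmethod
    def structure_family(cls) -> str:
        return "array"

    @classmethod
    def from_array(cls, data: List[List[float]], *, metadata=None) -> "ArrayAdapter":
        shape = (len(data), len(data[0]) if data else 0)
        return cls(data, ArrayStructure(shape), metadata=metadata)

    def read(self) -> List[List[float]]:
        return self._data


class FileBackedArrayAdapter(ArrayAdapter):
    """Specific implementation: only the storage declaration changes."""

    @classmethod
    def supported_storage(cls) -> Set[type]:
        return {FileStorage}


generic = ArrayAdapter.from_array([[1.0, 2.0], [3.0, 4.0]])
on_disc = FileBackedArrayAdapter.from_array([[5.0]], metadata={"source": "demo"})
```

Because `from_array` constructs `cls(...)`, the file-backed subclass inherits the constructor logic and overrides only `supported_storage`, which is the kind of small, typed variation described above.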

flowchart TD
    A["Adapter[S = TypeVar(bound=Structure)]"] --> B
    A["Adapter[S = TypeVar(bound=Structure)]"] --> C
    A["Adapter[S = TypeVar(bound=Structure)]"] --> F
    B["TableAdapter(Adapter[TableStructure])"] --> D
    C["ArrayAdapter(Adapter[ArrayStructure])"] --> E
    C --> G
    F["ContainerAdapter(Adapter[ContainerStructure], Mapping[str, A = TypeVar(bound=Adapter[S])])"] --> H
    F --> I
    D["ParquetAdapter(TableAdapter)"]
    E["CSVArrayAdapter(ArrayAdapter)"]
    G["JPEGAdapter(ArrayAdapter)"]
    H["MapAdapter(ContainerAdapter[A])"]
    I["ZarrGroupAdapter(ContainerAdapter[Union[ArrayAdapter, ZarrGroupAdapter]])"]

Checklist

  • Add a Changelog entry
  • Add the ticket number which this PR closes to the comment section

@danielballan
Member

Thanks for this nice write-up. There are clear benefits to moving the common methods into a base class. What would the trade-offs be if we stayed with Protocols to define the expected interfaces (as now) but then added an AdapterBase to define the boilerplate?

I see ABCs mixing the two---defining an interface while also smuggling in some implementation.

As a practical point, I can foresee defining Tiled-compatible Adapters in third-party libraries. For example, the library tifffile has been known to oblige various frameworks by implementing their I/O plugins for TIFF within tifffile. Perhaps the XDI library could ship a Tiled Adapter. There are some advantages to maintaining the Adapter along with the I/O library, and its associated tests, rather than putting all of that weight into Tiled itself. If we go the ABC route, these third-party libraries would need to accept a tiled dependency to import and subclass our ABCs. (They won't.) If we go the Protocols route---with base classes for boilerplate---third-party libraries may implement our Protocols without explicitly importing us.

[Forgive the soap box here.] Duck typing (structural subtyping) enables the distributed coordination that has made Scientific Python work, and we've emulated that in Bluesky. Any Device-like object works with the RunEngine as long as it implements the protocols. Nominal subtyping would lock us into dependency and inheritance, and I'm reluctant to go that way unless there is strong benefit.
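
A minimal sketch of the "Protocols plus a boilerplate base class" alternative described above. Every name here (`AdapterProtocol`, `AdapterBase`, `InHouseAdapter`, `ThirdPartyAdapter`, `total`) is hypothetical and chosen for illustration; none of it is tiled's actual API.

```python
from typing import Any, Dict, List, Optional, Protocol


class AdapterProtocol(Protocol):
    """Hypothetical interface: what a consumer needs from any adapter."""

    def metadata(self) -> Dict[str, Any]: ...

    def read(self) -> List[float]: ...


class AdapterBase:
    """Hypothetical optional helper that holds the shared boilerplate."""

    def __init__(self, *, metadata: Optional[Dict[str, Any]] = None) -> None:
        self._metadata = metadata or {}

    def metadata(self) -> Dict[str, Any]:
        return self._metadata


class InHouseAdapter(AdapterBase):
    """Maintained next to the framework, so it reuses the boilerplate."""

    def __init__(self, values, *, metadata=None) -> None:
        super().__init__(metadata=metadata)
        self._values = list(values)

    def read(self) -> List[float]:
        return self._values


class ThirdPartyAdapter:
    """Maintained in another library: no import of, or inheritance from, the framework."""

    def metadata(self) -> Dict[str, Any]:
        return {"origin": "third-party"}

    def read(self) -> List[float]:
        return [1.0, 2.0, 3.0]


def total(adapter: AdapterProtocol) -> float:
    # Depends only on the methods being present (structural subtyping).
    return sum(adapter.read())


in_house_total = total(InHouseAdapter([0.5, 0.5], metadata={"origin": "in-house"}))
third_party_total = total(ThirdPartyAdapter())
```

Both adapters are accepted by `total`, yet only the in-house one depends on the base class. That is the decoupling the Protocol route is meant to preserve.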

@DiamondJoseph
Contributor Author

[...] If we go the ABC route, these third-party libraries would need to accept a tiled dependency to import and subclass our ABCs. (They won't.) If we go the Protocols route---with base classes for boilerplate---third-party libraries may implement our Protocols without explicitly importing us.

I wanted to say that the horse has already bolted with respect to Spec and Storage, but what I found is even more concerning:

from abc import abstractmethod
from dataclasses import dataclass
from typing import Protocol, Union, runtime_checkable

from tiled.storage import Storage
from tiled.structures.core import Spec
from tiled.type_aliases import JSON


@runtime_checkable
class BaseAdapter(Protocol):
    supported_storage: set[type[Storage]]

    @abstractmethod
    def metadata(self) -> JSON:
        pass

    @abstractmethod
    def specs(self) -> list[Spec]:
        pass

FOREIGN_JSON = dict[str, Union[str, int, float, bool, dict[str, "FOREIGN_JSON"], list["FOREIGN_JSON"]]]

@dataclass(frozen=True)
class ForeignStorage:
    unknown_field: float

@dataclass(frozen=True)
class ForeignSpec:
    unknown_field: int


class ForeignAdapter:
    supported_storage: set[type[ForeignStorage]] = {ForeignStorage}

    def metadata(self) -> set[float]:
        return {8, 9, 10}

    def specs(self) -> list[ForeignSpec]:
        return [ForeignSpec(1)]

foreign_adapt = ForeignAdapter()
foreign_store = ForeignStorage(7.3)
foreign_spec = ForeignSpec(2)

print(isinstance(foreign_adapt, BaseAdapter))
# True
print(isinstance(foreign_store, Storage))
# False
print(isinstance(foreign_spec, Spec))
# False

ForeignSpec and ForeignStorage have nothing in common with the fields and methods that BaseAdapter should require of Spec and Storage, and metadata is overridden with a completely incompatible return type, yet ForeignAdapter reports that it walks like a Truck, quacks like a Duke and must be a Duck.
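
The same behaviour can be reproduced without tiled installed, and contrasted with a nominal check. In this sketch `AdapterLike`, `AdapterABC`, and `Foreign` are hypothetical names. A `runtime_checkable` Protocol only verifies that the named members exist, not their signatures or return types, whereas an ABC requires explicit subclassing.

```python
from abc import ABC, abstractmethod
from typing import Dict, Protocol, Set, runtime_checkable


@runtime_checkable
class AdapterLike(Protocol):
    def metadata(self) -> Dict[str, str]: ...


class AdapterABC(ABC):
    @abstractmethod
    def metadata(self) -> Dict[str, str]: ...


class Foreign:
    # Same method name, incompatible return type, no inheritance.
    def metadata(self) -> Set[float]:
        return {8.0, 9.0, 10.0}


foreign = Foreign()
structural = isinstance(foreign, AdapterLike)  # True: only checks that `metadata` exists
nominal = isinstance(foreign, AdapterABC)      # False: Foreign never subclassed AdapterABC
```

A static type checker would be expected to flag the mismatched return type if `Foreign` were passed where `AdapterLike` is annotated, but the runtime `isinstance` check cannot see it.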

@danielballan
Member

For Spec and Storage, though, third-party libraries can make their own little dataclasses that satisfy the protocol, just as many libraries implement array-likes and dataframe-likes. Bluesky does this for some tiled dataclasses.

If we're focused on isinstance checks, things get knotty, but good Python code generally tries to avoid isinstance checks where possible, to facilitate duck-typing.
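
A small sketch of that duck-typed style, assuming a Spec-shaped object exposes a `name` and an optional `version`. `SpecLike`, `ThirdPartySpec`, and `spec_labels` are hypothetical names for illustration. The consumer reads attributes directly and never calls `isinstance`.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Protocol


class SpecLike(Protocol):
    # Assumed shape for this sketch: a name plus an optional version.
    name: str
    version: Optional[str]


@dataclass
class ThirdPartySpec:
    """A 'little dataclass' a third-party library could define on its own."""

    name: str
    version: Optional[str] = None


def spec_labels(specs: Iterable[SpecLike]) -> List[str]:
    # Works with any object that has `name` and `version`; no isinstance check.
    labels = []
    for spec in specs:
        labels.append(f"{spec.name}@{spec.version}" if spec.version else spec.name)
    return labels


labels = spec_labels([ThirdPartySpec("xdi", "1.0"), ThirdPartySpec("table")])
```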

@DiamondJoseph
Contributor Author

@danielballan we've been discussing this more internally. I'm still not convinced by the argument for other libraries implementing adapters, as opposed to the existing standard of tiled defining adapters behind optional imports.

A third-party adapter would already need to be registered and associated with mimetypes to be usable. A simple wrapping Adapter type, defined in tiled [or in the script that initialises Tiled, where presumably the adapter is being registered, at which point we're very happy to accept more optional contributions], would accomplish that even if the library implemented something Adapter-shaped.
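
One way to picture the wrapping approach is the sketch below. `ThirdPartyReader`, `WrappingAdapter`, `ADAPTERS_BY_MIMETYPE`, and `open_adapter` are all hypothetical, and the plain dictionary stands in for whatever registration mechanism the server actually uses; it is not tiled's configuration API.

```python
from typing import Any, Callable, Dict, List, Optional


class ThirdPartyReader:
    """Stand-in for an Adapter-shaped object shipped by another library."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> List[int]:
        return [1, 2, 3]


class WrappingAdapter:
    """Hypothetical thin wrapper defined next to the registration code."""

    def __init__(self, reader: ThirdPartyReader, *, metadata: Optional[Dict[str, Any]] = None) -> None:
        self._reader = reader
        self._metadata = metadata or {}

    def metadata(self) -> Dict[str, Any]:
        return self._metadata

    def read(self) -> List[int]:
        # Delegate the actual I/O to the third-party object.
        return self._reader.load()

    @classmethod
    def from_uri(cls, uri: str) -> "WrappingAdapter":
        return cls(ThirdPartyReader(uri), metadata={"uri": uri})


# Hypothetical registry keyed by mimetype.
ADAPTERS_BY_MIMETYPE: Dict[str, Callable[[str], WrappingAdapter]] = {
    "application/x-demo": WrappingAdapter.from_uri,
}


def open_adapter(uri: str, mimetype: str) -> WrappingAdapter:
    try:
        factory = ADAPTERS_BY_MIMETYPE[mimetype]
    except KeyError:
        raise ValueError(f"No adapter registered for mimetype {mimetype!r}") from None
    return factory(uri)


adapter = open_adapter("file:///data/example.demo", "application/x-demo")
```

The third-party library stays free of any framework import, and the only code that knows about both sides is the wrapper that lives with the registration.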

Where StructureFamily etc. are re-defined in bluesky, there is already a dependency on tiled[client] (actually there isn't; there's just a dev requirement on tiled[all]). If we had all of these structures and optional modules organised, we could acknowledge that tiled is an optional dependency of bluesky, and when it is available the enums can be imported directly from the base product.
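
A minimal sketch of that optional-dependency pattern. It assumes `StructureFamily` is importable from `tiled.structures.core`, the module the earlier snippet imports `Spec` from. The fallback enum lists only an illustrative subset of members and is not a copy of any real downstream definition.

```python
import enum

try:
    # Preferred: reuse the enum from tiled when it is installed.
    from tiled.structures.core import StructureFamily
except ImportError:
    # Fallback for environments without tiled (illustrative subset of members).
    class StructureFamily(str, enum.Enum):
        array = "array"
        container = "container"
        table = "table"


family = StructureFamily("array")
```

Downstream code then refers to `StructureFamily` without caring which branch supplied it.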

@dan-fernandes dan-fernandes marked this pull request as ready for review August 12, 2025 10:12
@dan-fernandes dan-fernandes marked this pull request as draft August 12, 2025 10:12
@danielballan
Member

That's all valid. I am still inclined toward Protocols but am not finding the bandwidth to make good arguments here, so I think we should go ahead with the approach you propose.

We can conceivably refine later, and this would certainly be an improvement over the status quo.

@dan-fernandes dan-fernandes marked this pull request as ready for review August 20, 2025 12:20
Member

@danielballan danielballan left a comment

Thanks @dan-fernandes and @DiamondJoseph! I appreciate you cleaning up so much cruft here. It's a big improvement.

@danielballan
Member

I resolved merge conflicts. This is good to merge once CI passes.

@danielballan danielballan merged commit fb40f08 into main Oct 7, 2025
12 checks passed
@danielballan danielballan deleted the typed_adapters branch October 7, 2025 20:17
jmaruland pushed a commit to jmaruland/tiled that referenced this pull request Oct 22, 2025
* fix(devcontainer): Add stage to Docker build for devcontainer

* chore(types): Add type hints and hierarchy to Structure types

* fix type checking issues on various versions

* ClassVar

* Numpy types

* Add root Adapter type

* chore: Type hint and rationalise adapters

* Changes to fix pre-commit

* Fixes for pre-commit including correct typing of MapAdapter

* Fixes for pre-commit

* Adopt Zarr expectations of JSON type

* Extract Zarr Adapter from array assumptions

* Fixes for pre-commit

* Retype xarray adapters, MapAdapter

* Add types-cachetools to dev requirements

* Fixes to pass pre-commit

* Remove type vars from non generic classes

* Add SQLAdapter.structure_family

* Fix treating .structure_family as a property

* Make ParquetDatabaseAdapter inherit from Adapter[TableStructure]

* Fixes for pre-commit

* Make structure_family an attribute, make metadata a callable

* Make CSVAdapter inherit directly from Adapter[TableStructure]

* Fix structure family pickling issue

* Fix hdf5 adapter self referential metadata

* Change ZarrGroupAdapter container structure to use list of keys

* Change NPYAdapter to inherit directly from Adapter[ArrayStructure]

* Remove unused print statement

* Remove AnyStructure

* Fix typing for backwards compatibility

* Fix metadata argument

* Add type to ExcelAdapter(MapAdapter) generic

* Add return type, remove unused type ignore

* Fix JPEGAdapter.read_block signature, remove unused type ignore

* Fix merge conflict resolution

---------

Co-authored-by: Daniel Fernandes
Co-authored-by: Dan Allan

4 participants