Skip to content

Blob input #83

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Nov 18, 2024
Merged

Blob input #83

merged 13 commits into from
Nov 18, 2024

Conversation

rwb27
Copy link
Collaborator

@rwb27 rwb27 commented Nov 4, 2024

This PR allows blobs to be passed as input to actions. It's not yet fully general, but two important cases are supported:

  1. client code can pass a ClientBlobOutput object as input to an action, for example blob = thing.acquire_image() then thing.process_image(blob). This re-uses the blob on the server, so no large data transfer is needed. This does mean it will fail if the second call comes after the first invocation has been deleted.
  2. server code can pass Blob objects as input to actions, with everything properly type-hinted. As we're using the objects directly, there's no reliance on the action manager and no data transfer overhead.

This should be particularly useful for camera code, for example allowing us to add processing steps that accept an image as input.

I'd like to extend it to include uploading BLOBs and downloading BLOBs from other servers - but that's a future feature.

Implementation

I've consolidated BlobOutput and BlobModel into a single class, Blob, which is a subclass of BaseModel. This is a breaking change, but only because the name has changed from BlobOutput

In the serialisation and validation methods of Blob, I use context variables to generate URLs and retrieve Blob data. This means they can only be created from JSON or dumped if said context vars are set. This is done by a new FastAPI dependency.

rwb27 added 11 commits November 4, 2024 15:23
If we use a BlobOutput object as an input to an action, just send
the HREF and media type. This avoids re-downloading/uploading
the blob.

TODO: elegantly handle the case of a BLOBOutput from a different server. Currently this will not work.
Using a contextvar allows pydantic validators to retrieve outputs
from the action manager. I'm coming around to the notion that
bloboutputs should have their own /blob_output/uuid endpoint,
to break the dependency on actions.
This commit builds on the last few, and enables BlobOutput objects to be passed around the server. This should enable
**both client and server code** to do things like:
blob = thing.action1()
result = thing.action2(image=blob)

There are unit tests for both client-side and server-side chaining.

Currently there is one serialiser that's a stub, and we need to check for that, but currently there's no unit test doing so.

Being able to pass BlobOutputs around more freely will be very useful for handling images, for example.
Action inputs were being serialised to base types with
`model_dump`. Now, they are converted with `dict(model)`
which does not recursively dump models.

This should mean that actions with models as parameters receive models, not `dict`s. It stops `Blob` inputs being converted to URLs prematurely.
Blob URLs now start with `blob/` and are easier to generate. Blob
objects are `BaseModel` subclasses that can be serialised/deserialised, using contextvars to generate URLs and retrieve BlobData objects.

This obviates file responses from invocation outputs and also the `invocations/<id>/files`  endpoint
`model_to_dict` was recursively converting RootModels, which
could have led to confusing behaviour. We only need to handle
the case of empty input represented by a RootModel, and the
code now raises an error if a RootModel is used otherwise.
I used several ignores that I am surprised I needed - MyPy is not
behaving quite as I expected.
@rwb27 rwb27 mentioned this pull request Nov 18, 2024
@rwb27 rwb27 merged commit 7ac72fa into main Nov 18, 2024
9 checks passed
@rwb27 rwb27 deleted the blob_input branch November 18, 2024 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant