Support Part.from_uri for arbitrary Google GenAI inputs #1647

@kylegallatin

Is your feature request related to a problem? Please describe.

The latest Google GenAI image/video/document understanding documentation supports Part.from_uri for passing in GCS objects in a number of different formats. As far as I understand, Instructor currently lets me pass GCS objects only implicitly, and only for images with autodetect_images=True, e.g. this example from the tests.

For PDFs, I can work around this by passing the HTTPS URL of the object to PDF.from_url, e.g. using this:

https://storage.googleapis.com/cloud-samples-data/generative-ai/pdf/2403.05530.pdf

instead of this:

gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf
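
The workaround above amounts to a string rewrite from the gs:// form to the storage.googleapis.com form. A minimal sketch, assuming the object is readable over HTTPS without extra authentication; the helper name `gcs_uri_to_https_url` is hypothetical and not part of Instructor or the GenAI SDK:

```python
from urllib.parse import quote

GCS_PUBLIC_HOST = "https://storage.googleapis.com"


def gcs_uri_to_https_url(gcs_uri: str) -> str:
    """Rewrite gs://<bucket>/<object> as https://storage.googleapis.com/<bucket>/<object>."""
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {gcs_uri!r}")
    # Split off the bucket name; everything after the first "/" is the object name.
    bucket, _, object_name = gcs_uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"expected gs://<bucket>/<object>, got {gcs_uri!r}")
    # Percent-encode the object name but keep "/" so the path stays readable.
    return f"{GCS_PUBLIC_HOST}/{bucket}/{quote(object_name, safe='/')}"


url = gcs_uri_to_https_url("gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf")
print(url)
```

The resulting URL is what gets passed to PDF.from_url in the workaround. This route likely breaks down for private buckets, where a plain HTTPS fetch would not be authorized.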

Describe the solution you'd like

Ideally, I'd like a way to pass GenAI's Part.from_uri directly through Instructor to the backend. We need to process images, PDFs, and videos that all live in GCS, and we don't want to have to upload the video files when GenAI already supports reading them from GCS.
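
For reference, this is roughly what building such a part looks like when calling the SDK directly. It is a sketch, assuming the google-genai Python package and its `types.Part.from_uri(file_uri=..., mime_type=...)` constructor; the helper names `guess_mime_type` and `part_from_gcs_uri` are hypothetical, and nothing here is an existing Instructor API:

```python
import mimetypes


def guess_mime_type(uri: str) -> str:
    """Guess a MIME type from the file extension of a gs:// or https:// URI."""
    file_name = uri.rsplit("/", 1)[-1]
    mime_type, _ = mimetypes.guess_type(file_name)
    if mime_type is None:
        raise ValueError(f"cannot infer MIME type for {uri!r}")
    return mime_type


def part_from_gcs_uri(uri: str):
    """Build a GenAI Part that references the GCS object in place (no upload)."""
    # Imported lazily so guess_mime_type works without google-genai installed.
    from google.genai import types

    return types.Part.from_uri(file_uri=uri, mime_type=guess_mime_type(uri))
```

The request in this issue is for Instructor to accept a part like this (or the gs:// URI itself) for PDFs and videos as well as images; how that would be exposed is left open.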

Describe alternatives you've considered

I considered (1) creating a separate workflow in our codebase for multimodal Gemini content, or (2) using the HTTP version of these GCS URIs. However, it would be nice to use Instructor for everything in our codebase and not have to re-upload videos.

Labels: enhancement, question
