feature: Context Caching for Gemini Provider #36

Open
ONLYstcm opened this issue Dec 24, 2024 · 5 comments

@ONLYstcm

I'm exploring the flutter/ai package and would like to understand how to implement Context Caching for specific content as described in the Gemini API documentation. I want to optimize token usage by caching repeated context or static content while generating responses dynamically for user-specific inputs. However, the documentation doesn't explicitly describe how this can be integrated with Flutter's AI package.

@csells csells changed the title Context Caching for Gemini Provider feature: Context Caching for Gemini Provider Dec 24, 2024
@csells
Contributor

csells commented Dec 24, 2024

What would the scenario be here?

@ONLYstcm
Author

ONLYstcm commented Dec 24, 2024

I have a document that serves as the main source of information for all user prompts; I want every prompt to be answered from that document. In Vertex AI Studio I was roughly able to achieve this by instructing the system to only consider the provided context. Rather than feeding the document in with every user prompt, I'm investigating whether there's a more optimised approach, since several users will all be working from the same document (a single main source of information).

@csells
Contributor

csells commented Dec 24, 2024

Can you share some example code that shows what you have in mind?

@ONLYstcm
Author

I think this snippet in the Gemini docs describes it best: https://ai.google.dev/gemini-api/docs/caching?lang=python#generate-content
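
For reference, the linked sample (Python, `google-generativeai` SDK) roughly amounts to: upload the shared document once, create a cache from it, then build a model backed by the cached content so each request only has to send the user's prompt. A minimal sketch along those lines, with the file name, system instruction, and TTL as illustrative placeholders:

```python
import datetime

import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="GEMINI_API_KEY")

# Upload the shared document once (file name is illustrative).
document = genai.upload_file(path="shared_document.pdf")

# Create a cache holding the document plus a system instruction.
# Note: caching requires an explicitly versioned model name (e.g. "-001").
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    display_name="shared document",
    system_instruction="Answer only from the provided document.",
    contents=[document],
    ttl=datetime.timedelta(hours=1),
)

# Build a model backed by the cached content; per-request prompts no
# longer need to re-send the document.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)

response = model.generate_content("What does the document say about pricing?")
print(response.usage_metadata)  # includes cached_content_token_count
```

The question here is how the equivalent of that last `model` object (one built from cached content) could be plugged into the flutter/ai package.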

@csells
Contributor

csells commented Dec 24, 2024

That sample seems to show creating a model using the cached content. If that's the case, just pass that model when creating the LlmChatView.
