Description
Is your feature request related to a problem? Please describe.
Currently, the readFile tool only supports reading text files. It would be beneficial to extend its functionality to allow reading image files for models that support image input (e.g., Gemini, Anthropic).
Describe the solution you'd like
I propose updating the readFile tool to detect the file type based on its extension or MIME type. If the file is an image, the tool should read it as a base64-encoded string and return it in a format that can be consumed by multimedia-capable models.
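A minimal sketch of the detection and encoding step described above, assuming a Node.js runtime. The names IMAGE_MIME_TYPES, detectImageMimeType, encodeBase64, and readImageAsBase64 are hypothetical and are not part of the existing readFile tool; the extension table is illustrative, not exhaustive.

```typescript
import { readFileSync } from "node:fs";
import { extname } from "node:path";

// Hypothetical extension-to-MIME table (assumption); extend as needed.
const IMAGE_MIME_TYPES: Record<string, string> = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
};

// Returns the image MIME type for a path, or undefined for non-image files.
// Detection is by extension only and is case-insensitive.
function detectImageMimeType(filePath: string): string | undefined {
  return IMAGE_MIME_TYPES[extname(filePath).toLowerCase()];
}

// Encodes raw bytes as a base64 string using Node's Buffer.
function encodeBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64");
}

// Reads an image file as base64. Returns undefined, without touching the
// file system, when the extension is not a known image type, so the caller
// can fall back to the existing text path.
function readImageAsBase64(
  filePath: string,
): { mimeType: string; data: string } | undefined {
  const mimeType = detectImageMimeType(filePath);
  if (mimeType === undefined) return undefined;
  return { mimeType, data: encodeBase64(readFileSync(filePath)) };
}
```

Extension-based detection is cheap but can be fooled by a misnamed file; sniffing the file's leading bytes would be a stricter alternative if that matters.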
The implementation could be similar to how image outputs are handled in packages/livekit/src/chat/mcp-utils.ts.
Specifically, the outputSchema of the readFile tool in packages/tools/src/read-file.ts could be updated to support a content union type, similar to the ContentOutput in mcp-utils.ts:
const ContentOutput = z.union([
z.object({
...
}),
]);
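To illustrate how a caller might consume such a union, here is a rough sketch using plain TypeScript types instead of zod. The member shapes are assumptions: the actual ContentOutput fields are elided above, so TextContent, ImageContent, ReadFileContent, and describeContent are hypothetical names chosen for illustration only.

```typescript
// Hypothetical shapes (assumption): the real ContentOutput members are
// not shown in this issue.
type TextContent = { type: "text"; text: string };
type ImageContent = { type: "image"; mimeType: string; data: string };
type ReadFileContent = TextContent | ImageContent;

// Narrows on the "type" discriminant so each branch sees only its own fields.
function describeContent(content: ReadFileContent): string {
  switch (content.type) {
    case "text":
      return `text (${content.text.length} chars)`;
    case "image":
      return `image ${content.mimeType} (${content.data.length} base64 chars)`;
  }
}
```

A discriminated union like this keeps existing text consumers working unchanged, since they can continue to handle the "text" branch and ignore the rest.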
Describe alternatives you've considered
An alternative would be to create a new tool specifically for reading images, but extending the existing readFile tool seems more intuitive and efficient.
Additional context
This feature would enhance Pochi's ability to work with multimodal models and enable more complex interactions involving images.
Relevant files:
🤖 Generated with Pochi