hotfix: fixed streaming packets specifically for Vertex streaming #3453
PR Summary
Implemented a hotfix for Google Vertex AI streaming by introducing stream teeing functionality to properly handle response chunks.
- Added `getBodiesMaybeTee` in `worker/src/lib/HeliconeProxyRequest/getResponseBody.ts` to split response streams specifically for Vertex AI endpoints
- Moved stream handling logic from `ProxyRequestHandler.ts` to `getBodyInterceptor` for better code organization
- Limited tee() implementation to Google Vertex AI endpoints (`aiplatform.googleapis.com/v1`) for now
- Added buffering logic for small chunks (<50 bytes) to prevent stream fragmentation
- Included dated comment explaining tee() usage is experimental and may expand to other providers after testing
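
The conditional split described in the bullets above might look roughly like the sketch below. It is not the PR's actual `getBodiesMaybeTee`: the names `shouldTee`, `splitBodyIfVertex`, and the `Bodies` shape are hypothetical, and only the URL fragment is taken from the summary. It assumes non-Vertex responses keep going through the existing interceptor path untouched.

```typescript
// Only this URL fragment comes from the PR summary; all names below are hypothetical.
const VERTEX_URL_FRAGMENT = "aiplatform.googleapis.com/v1";

// True when the target looks like a Google Vertex AI endpoint.
function shouldTee(targetUrl: string): boolean {
  return targetUrl.includes(VERTEX_URL_FRAGMENT);
}

interface Bodies {
  forClient: ReadableStream<Uint8Array> | null;
  forLogging: ReadableStream<Uint8Array> | null;
  teed: boolean;
}

// Splits the body into two independent branches for Vertex targets only.
// Assumption: for every other provider the body is returned as-is and the
// caller keeps using its existing interception logic.
function splitBodyIfVertex(
  targetUrl: string,
  body: ReadableStream<Uint8Array> | null
): Bodies {
  if (body === null) {
    return { forClient: null, forLogging: null, teed: false };
  }
  if (!shouldTee(targetUrl)) {
    return { forClient: body, forLogging: null, teed: false };
  }
  // tee() locks the source stream and returns two readable branches.
  const [forClient, forLogging] = body.tee();
  return { forClient, forLogging, teed: true };
}
```

Checking for `null` before calling `tee()` keeps the split free of non-null assertions, which is the concern raised in the first review comment.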
2 file(s) reviewed, 2 comment(s)
    "aiplatform.googleapis.com/v1"
    )
    ) {
    const [body1, body2] = body!.tee();

logic: Non-null assertion on body could cause runtime error if maybeForceFormat returns null

Suggested change:

    - const [body1, body2] = body!.tee();
    + if (!body) throw new Error('Body stream is null');
    + const [body1, body2] = body.tee();
    }
    },
    });
    return body.pipeThrough(transformer) ?? null;

style: Returning null here could cause issues with the tee() operation later. Consider throwing an error instead

Suggested change:

    - return body.pipeThrough(transformer) ?? null;
    + const transformed = body.pipeThrough(transformer);
    + if (!transformed) throw new Error('Failed to transform body stream');
    + return transformed;
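
The transformer body is not visible in this excerpt. As a rough illustration of the small-chunk buffering mentioned in the PR summary, the sketch below holds chunks until at least 50 bytes have accumulated and emits the remainder when the stream ends. The 50-byte threshold comes from the summary; `SmallChunkBuffer`, `createBufferingTransform`, and the exact flushing policy are assumptions, not the PR's code.

```typescript
// Threshold from the PR summary; the buffering policy itself is an assumption.
const MIN_CHUNK_BYTES = 50;

class SmallChunkBuffer {
  private pending: Uint8Array[] = [];
  private pendingBytes = 0;

  // Returns one merged chunk once enough bytes are buffered, otherwise null.
  push(chunk: Uint8Array): Uint8Array | null {
    this.pending.push(chunk);
    this.pendingBytes += chunk.byteLength;
    return this.pendingBytes >= MIN_CHUNK_BYTES ? this.flush() : null;
  }

  // Returns whatever is still buffered (null if empty) and resets the state.
  flush(): Uint8Array | null {
    if (this.pendingBytes === 0) return null;
    const merged = new Uint8Array(this.pendingBytes);
    let offset = 0;
    for (const part of this.pending) {
      merged.set(part, offset);
      offset += part.byteLength;
    }
    this.pending = [];
    this.pendingBytes = 0;
    return merged;
  }
}

// Wraps the buffer in a TransformStream so it can be used with pipeThrough().
function createBufferingTransform(): TransformStream<Uint8Array, Uint8Array> {
  const buffer = new SmallChunkBuffer();
  return new TransformStream<Uint8Array, Uint8Array>({
    transform(chunk, controller) {
      const out = buffer.push(chunk);
      if (out !== null) controller.enqueue(out);
    },
    flush(controller) {
      // Emit any trailing bytes so nothing is lost when the source closes.
      const rest = buffer.flush();
      if (rest !== null) controller.enqueue(rest);
    },
  });
}
```

With a transform like this, `body.pipeThrough(createBufferingTransform())` returns the readable side of the transform. `pipeThrough` itself does not return null or undefined, so a guard is only needed on `body` before the call.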
I am unsure of the root cause, but the readable intercept is not processing each chunk properly, which causes the response to come back all at once. I am a bit unsure about the `tee` method: I remember there being a reason why we aren't using it (but that was when I designed this about two years ago), so Cloudflare may well have changed it since then to make it more reliable.

For now, this is a quick fix that uses `tee` for Google Vertex to fix this issue.