API_DOCUMENTATION.md (40 additions, 13 deletions)
@@ -166,7 +166,12 @@ curl -X POST http://localhost:3001/v1/agents/cairo-coder/chat/completions \
### Streaming Response

Set `"stream": true` to receive Server-Sent Events (SSE). The stream contains:

- OpenAI-compatible `chat.completion.chunk` frames for assistant text deltas
- Cairo Coder custom event frames with a top-level `type` and `data` field

Each frame is sent as `data: {JSON}\n\n`, and the stream ends with `data: [DONE]\n\n`.
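A client can consume this framing with a small parser. The sketch below is not from the Cairo Coder codebase (the helper name `iter_sse_json` is ours); it assumes only the frame format described above: `data: {JSON}` lines terminated by `data: [DONE]`.

```python
import json

def iter_sse_json(lines):
    """Yield decoded JSON payloads from an iterable of SSE lines.

    Each frame arrives as 'data: {JSON}'; the stream terminates with
    'data: [DONE]'. Blank lines and non-data fields are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)

# Synthetic frames shaped like the chunks documented above:
raw = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hi"}}]}',
    "",
    "data: [DONE]",
]
frames = list(iter_sse_json(raw))
```

In a real client, `lines` would be the decoded line iterator of the HTTP response body; the parser itself is transport-agnostic.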
### Custom Stream Events

In addition to OpenAI-compatible chunks, Cairo Coder emits custom events to expose retrieval context, progress, and optional reasoning. These frames have the shape `{"type": string, "data": any}` and can appear interleaved with standard chunks.

- `sources`: exactly one event is typically emitted per request, shortly after retrieval completes, carrying the documentation sources used for the answer so frontends can display them while the model is still generating. The frame shape is `data: {"type": "sources", "data": [{"title": string, "url": string}, ...]}`. The `url` field maps to the ingester `sourceLink` when available; otherwise it may be a best-effort `url` present in metadata.
- `final_response`: mirrors the final accumulated assistant content sent via OpenAI-compatible chunks.

Client guidance:

- If you only want OpenAI-compatible frames, ignore objects that include a top-level `type` field.
- To build richer UIs, render `processing` as status badges, `sources` as link previews, and `reasoning` in a collapsible area.
- Streaming errors surface as OpenAI-compatible chunks that contain `"delta": {"content": "\n\nError: ..."}`, followed by a terminating chunk and `[DONE]`.
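The filtering rule above (custom events carry a top-level `type` field) can be sketched as a small demultiplexer. This is our own illustration, not library code; it assumes decoded JSON frames in the shapes documented above.

```python
def demux(frames):
    """Split decoded frames into (assistant_text, custom_events).

    Frames with a top-level "type" field are Cairo Coder custom events
    (sources, processing, reasoning, final_response); everything else is
    treated as an OpenAI-compatible chat.completion.chunk whose delta
    text is accumulated.
    """
    text_parts, events = [], []
    for frame in frames:
        if "type" in frame:
            events.append(frame)  # custom event frame
        else:
            for choice in frame.get("choices", []):
                content = choice.get("delta", {}).get("content")
                if content:
                    text_parts.append(content)
    return "".join(text_parts), events

# Synthetic interleaved stream (URLs are placeholders):
frames = [
    {"type": "sources", "data": [{"title": "Cairo Book", "url": "https://example.com"}]},
    {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Answer"}}]},
]
text, events = demux(frames)
```

A UI would feed `events` to its badge/preview components and stream `text` into the chat view.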
### Agent Selection
@@ -280,7 +307,7 @@ Setting either `mcp` or `x-mcp-mode` headers triggers **Model Context Protocol m
- Non-streaming responses still use the standard `chat.completion` envelope, but `choices[0].message.content` contains curated documentation blocks instead of prose answers.
- Streaming responses emit the same SSE wrapper; the payloads contain the formatted documentation as incremental `delta.content` strings.
- Streaming also includes custom events: `processing` (e.g., "Formatting documentation...") and `sources`, as described in Custom Stream Events. A `final_response` frame mirrors the full final text.
- MCP mode does not consume generation tokens (`usage.completion_tokens` reflects only retrieval/query processing).