feat: new /v1/responses #161

blefo · 2025-10-10T13:40:58Z

Overview

Implements the OpenAI-compatible /v1/responses API, a more flexible alternative to Chat Completions with structured input, tool calling, web search, and streaming support.

Key Features

New /v1/responses endpoint
- OpenAI-compatible; supports streaming/non-streaming
- Integrated auth, rate limits, token tracking, and response signing
Advanced Tool Handling
- Auto-detects and executes tool calls
- Multi-turn workflows with context preservation
- Supports Python execution and concurrent tools
Web Search Integration
- Optional Brave Search enrichment via web_search param
- Adds real-time context with source attribution
Multimodal Support
- Image input validation for compatible models

Architecture

Split private.py into modular endpoints:
- /v1/chat/completions → chat.py
- /v1/responses → responses.py
New responses_tool_router.py for tool workflows
Modular API models in nilai-common

Technical Highlights

Validates model capabilities (tools, multimodal, web search)
Supports NilDB prompt retrieval and signed responses
SSE streaming with token usage and attribution

…ructure - Updated OpenAI dependency to version 1.99.2 in both `pyproject.toml` files for `nilai-api` and `nilai-common`. - Enhanced response model in `responses_model.py` by adding new fields and improving type definitions. - Refactored response handling in `responses.py` to include usage tracking for input and output tokens. - Adjusted import statements in `__init__.py` to streamline model access.

- Updated return types in `route_and_execute_tool_call` and `process_tool_calls` to use `FunctionCallOutput`. - Improved error handling and logging in tool execution. - Adjusted input handling in `handle_responses_tool_workflow` to support lists of `ResponseInputParam`. - Added new imports for `FunctionCallOutput` and related types in `nilai_common` models.

- Changed `ResponseFunctionToolCall` to `ResponseFunctionToolCallParam` in multiple functions for better type consistency. - Enhanced `handle_responses_tool_workflow` to utilize new input item types and improved handling of tool call results. - Updated imports in `__init__.py` and other files to reflect new model structures.

…e tests architecture - Introduced new test files for HTTP and OpenAI client interactions with the nilAI API. - Implemented tests for various scenarios including health checks, model retrieval, chat completions, and response generation. - Enhanced test coverage for rate limiting and code execution features. - Removed outdated test file for code execution, consolidating tests into more relevant suites.

…stream test

- Changed EC2 instance type from g4dn.xlarge to g6.xlarge in the CI workflow. - Updated the docker-compose command to use the new GPT-20B configuration file. - Added a new docker-compose file for the GPT-20B GPU service, including environment settings and health checks. - Updated the CI model reference in the test configuration to use the new GPT-20B model.

- Added a dummy API key for BRAVEE2B in the CI environment setup. - Updated the EC2 image ID to a new version in the CI configuration.

…ity group, key name, and region

…e new GPT-20B model

…nce health check logging

…rvice status visibility

…t new naming convention

…0B model

…T-20B model

…tion

…es.sh to align with new naming convention

…thod and changing ToolsConfig to use a list for implemented tools

…ng non supported functions

…'us-east-1'

…e tests

…rtions

…nd adjust large payload handling

blefo added 30 commits October 7, 2025 10:35

feat: modularized chat completion endpoint

2b3de7c

feat: reorganize API models

6dcc8bc

fix: add responses model

8ac1496

refactor: update import paths for API models

835fa29

feat: add responses endpoint and integrate with existing API

8730af2

refactor: update response handling and model structure in API

43849ea

feat: implement web search for response handling

5308a83

feat: implement tool call routing and execution in response handling

6662774

refactor: update test mocks to use new endpoint structure

f9dc8f0

fix: ruff

27ac2ba

fix: ruff format

d5f61ea

refactor: streamline test cases and enhance response validation

4f3395e

fix: update authorization header in response stream test for unique key

5c55e89

fix: correct query count and update authorization header in response …

77b6ae3

…stream test

fix: update query count and rate limit in chat completion tests

cda3fa1

feat: enhance tool execution workflow and add availability check

bb55273

chore: update CI workflow with new environment variable and EC2 image ID

ae05b7e

- Added a dummy API key for BRAVEE2B in the CI environment setup. - Updated the EC2 image ID to a new version in the CI configuration.

chore: update AWS region in CI workflow from eu-west-1 to us-east-1

1b483b9

chore: update EC2 configuration in CI workflow with new subnet, secur…

a75b00d

…ity group, key name, and region

fix: update model health check in wait_for_ci_services.sh to referenc…

118c024

…e new GPT-20B model

chore: update docker-compose configuration for GPT-20B model and enha…

4bd0d31

…nce health check logging

fix: container's name

3d6dbcc

chore: add docker ps command to wait_for_ci_services.sh for better se…

86e6553

…rvice status visibility

fix: update model container name in wait_for_ci_services.sh to reflec…

ff73448

…t new naming convention

fix: enhance health check parameters in docker-compose for GPT-20B model

4a620f2

blefo added 12 commits October 16, 2025 12:49

fix: extend start period for health check in docker-compose for GPT-2…

f310909

…0B model

fix: adjust GPU memory utilization parameter in docker-compose for GP…

28981e6

…T-20B model

feat: add tools configuration and update routing logic for tool execu…

ba5aa97

…tion

fix: update model container name and log output in wait_for_ci_servic…

422197c

…es.sh to align with new naming convention

refactor: improve config handling by updating NilAIConfig prettify me…

a841189

…thod and changing ToolsConfig to use a list for implemented tools

feat: all the test configurations and updated tool_router for detecti…

a5e80b5

…ng non supported functions

fix: update environment value in CI/CD workflow from 'production' to …

f11e782

…'us-east-1'

fix: update model container name in wait_for_ci_services.sh

45e7355

fix: extend expiration time for NUC tokens from 5 to 30 minutes in e2…

faab4ec

…e tests

test: add retry mechanism for chat completion tests and clean up asse…

2678ef9

…rtions

fix: add blank line for improved readability in chat completions test

f03574e

test: modify chat completions test to include full response logging a…

cedbaa1

…nd adjust large payload handling

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: new /v1/responses #161

feat: new /v1/responses #161

Uh oh!

blefo commented Oct 10, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

feat: new /v1/responses #161

Are you sure you want to change the base?

feat: new /v1/responses #161

Uh oh!

Conversation

blefo commented Oct 10, 2025

Overview

Key Features

Architecture

Technical Highlights

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant