A lightweight mock server that simulates the OpenAI API for development and testing purposes.
- Implements key OpenAI API endpoints:
-
/v1/chat/completions -
/v1/completions -
/v1/models
-
- Supports API key authentication
- Logs requests and responses
- Supports streaming responses
- Configurable throughput (tokens per second)
- Handles tool calls
-
Clone the repository:
git clone https://github.com/taha-yassine/llama-ipsum.git cd llama-ipsum -
Run the server:
just run
The mock server uses Jinja2 templates to generate responses. You can customize these templates to fit your specific testing needs.
-
Create a directory for your custom templates:
mkdir -p my_templates
-
Copy the templates you want to customize:
# Example: customize chat completion response cp app/templates/chat/completion.json.jinja my_templates/chat/ -
Edit the templates according to your needs.
-
Start the server with your custom templates:
uv run -m app.main --template-dir /path/to/my_templates
This project is licensed under the MIT License.