Releases · zhudotexe/kani
v1.4.3
- Llama.cpp: Add `model_path` kwarg to allow loading local GGUF models (thanks @lawrenceakka!); see the sketch below the list.

> **Note:** This is technically a minor breaking change, as the position of arguments has changed. I recommend using keyword arguments to load any models.
- Hugging Face: Do not set the `max_length` generation parameter if `max_new_tokens` is set, to avoid a verbose warning
- OpenAI: Add default context lengths for o-series models and GPT-4.1; add a warning for models without default context lengths
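
A minimal sketch of the new local-loading path, assuming the `kani.engines.llamacpp` import path; the file path is hypothetical:

```python
from kani.engines.llamacpp import LlamaCppEngine

# New in v1.4.3: load a GGUF file straight from local disk (path is hypothetical).
# Keyword arguments sidestep the positional-argument change noted above.
engine = LlamaCppEngine(model_path="/models/mistral-7b-instruct.Q4_K_M.gguf")
```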
v1.4.2
v1.4.1
- Added better options for controlling the JSON Schema generated by an `AIFunction` (see the sketch below)
- Generated JSON Schema now includes a function's docstring by default as the top-level `description` key
- Generated JSON Schema's top-level `title` key is now a function's name instead of `_FunctionSpec` by default
- Generated JSON Schema's fields only include a `title` key if a `title` kwarg is explicitly passed to `AIParam` (fixing a regression introduced some time ago)
These changes should have no effect on OpenAI function calling; these changes are made to improve compatibility with open models that use raw JSON Schema to define functions (e.g., Step-Audio).
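
A minimal sketch of how the new defaults play out (the function itself is hypothetical): the docstring becomes the schema's top-level `description`, the top-level `title` becomes `get_weather`, and the `city` field carries a `title` only because one is passed explicitly.

```python
from typing import Annotated

from kani import AIParam, Kani, ai_function


class WeatherKani(Kani):
    @ai_function()
    def get_weather(self, city: Annotated[str, AIParam(desc="The city to look up.", title="City")]):
        """Get the current weather in a given city."""
        return f"It's sunny in {city}!"
```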
v1.4.0
Mainly improvements to the llama.cpp engine in this release.
Improvements
- Update the `LlamaCppEngine` to not use the Llama 2 prompt pipeline by default. Prompt pipelines must now be explicitly passed.
- The `LlamaCppEngine` will now automatically download additional GGUF shards when a sharded model is given.
- Added `ChatTemplatePromptPipeline.from_pretrained` to create a prompt pipeline from the chat template of any model on the HF Hub, by ID (see the sketch after this list).
- Added examples and documentation for using DeepSeek-R1 (quantized).
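
A sketch of the `from_pretrained` flow in the spirit of the DeepSeek-R1 example mentioned above; the import paths are assumptions about the package layout, and the repo ID and filename are illustrative:

```python
from kani.engines.huggingface import ChatTemplatePromptPipeline
from kani.engines.llamacpp import LlamaCppEngine

# Build a prompt pipeline from the chat template of any HF Hub model, by ID.
pipeline = ChatTemplatePromptPipeline.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")
# Use it to prompt a GGUF quantization of the same model; sharded GGUFs are
# now downloaded automatically.
engine = LlamaCppEngine(
    repo_id="bartowski/DeepSeek-R1-Distill-Llama-8B-GGUF",  # illustrative repo ID
    filename="*Q4_K_M.gguf",
    prompt_pipeline=pipeline,
)
```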
Fixes
- `chat_in_terminal_async` no longer blocks the asyncio event loop when waiting for input from the terminal.
- Fixed the `LlamaCppEngine` not passing functions to the provided prompt pipeline.
v1.3.0
Enhancements
- Added `ToolCallParser`s -- these classes are wrappers around Kani `Engine`s that parse raw text generated by a model and return Kani-format tool calls. This is an easy way to enable tool calling on open-source models!
Example:

```python
from kani.engines.huggingface import HuggingEngine
from kani.prompts.impl.mistral import MISTRAL_V3_PIPELINE
from kani.tool_parsers.mistral import MistralToolCallParser

model = HuggingEngine(model_id="mistralai/Mistral-Small-Instruct-2409", prompt_pipeline=MISTRAL_V3_PIPELINE)
engine = MistralToolCallParser(model)
```
- Added `NaiveJSONToolCallParser` (e.g., Llama 3)
- Added `MistralToolCallParser`
- Added `DeepseekR1ToolCallParser`
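
Continuing the example above, the wrapped engine behaves like any other kani engine, so it drops straight into a `Kani`:

```python
from kani import Kani, chat_in_terminal

# `engine` is the MistralToolCallParser-wrapped engine from the example above.
ai = Kani(engine)
chat_in_terminal(ai)
```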
Bug Fixes et al.
- Fix compatibility issues with Pydantic 2.10
- Update documentation to better reflect supported HF models
v1.2.4
v1.2.3
v1.2.2
- fix(mistral): ensure prompt and completion tokens are passed through in the MistralFunctionCallingAdapter when streaming
- fix(streaming): don't emit text in DummyStream if it is None
- feat: add standalone width formatters
- docs: gpt-3.5-turbo -> gpt-4o-mini defaults
- fix(streaming): potential line len miscount in format_stream
v1.2.1
- Fixes various issues in the `MistralFunctionCallingAdapter` wrapper engine for Mistral-Large and Mistral-Small function calling models.
- Fixes an issue in `PromptPipeline.explain()` where manual examples would not be explained.
- Fixes an issue in `PromptPipeline.ensure_bound_function_calls()` where passing an ID translator would mutate the IDs of the underlying messages.
v1.2.0
New Features
- Hugging Face: Models loaded through the `HuggingEngine` now use chat templates for conversational prompting and tool usage by default, if available. This should make it much easier to get started with a Hugging Face model in Kani (see the sketch after this list).
- Added the ability to supply a custom tokenizer to the `OpenAIEngine` (e.g., for using OpenAI-compatible APIs)
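
A sketch of the simplified startup path (the model ID is illustrative): a model that ships a chat template now needs no explicit prompt pipeline.

```python
from kani import Kani, chat_in_terminal
from kani.engines.huggingface import HuggingEngine

# The chat template bundled with the model's tokenizer is applied automatically.
engine = HuggingEngine(model_id="meta-llama/Meta-Llama-3-8B-Instruct")
ai = Kani(engine)
chat_in_terminal(ai)
```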
Fixes/Improvements
- Fixed a missing dependency in the `llama` extra
- The `HuggingEngine` will now automatically set `device_map="auto"` if the `accelerate` library is installed