Releases: zhudotexe/kani

v1.4.3

09 Jun 17:50
  • Llama.cpp: Add model_path kwarg to allow loading local GGUF models (thanks @lawrenceakka!)

Note

This is technically a minor breaking change, as the positional order of arguments has changed. I recommend loading models with keyword arguments.

  • Hugging Face: Do not set max_length generation parameter if max_new_tokens is set to avoid a verbose warning
  • OpenAI: Add default context lengths for o-series models and GPT-4.1; warn when a model has no default context length

v1.4.2

09 Apr 22:26
  • Add model_cls to HuggingEngine to allow specifying a model class other than AutoModelForCausalLM (e.g., for Qwen-2.5-omni)

v1.4.1

05 Apr 01:46
  • Added better options for controlling the JSON Schema generated by an AIFunction
  • Generated JSON Schema now includes a function's docstring by default as the top-level description key
  • Generated JSON Schema's top-level title key is now a function's name instead of _FunctionSpec by default
  • Generated JSON Schema's fields only include a title key if a title kwarg is explicitly passed to AIParam (fixing a regression introduced some time ago)

These changes should have no effect on OpenAI function calling; they are intended to improve compatibility with open models that use raw JSON Schema to define functions (e.g., Step-Audio).
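
As a concrete sketch of the schema shape these changes produce (the get_weather function, its docstring, and its field here are hypothetical examples for illustration, not taken from kani's output), a generated schema now looks roughly like this:

```python
# Illustrative only; the exact schema kani generates may differ in detail
# (field ordering, extra keys such as "type" metadata).
schema = {
    "title": "get_weather",                     # function name, not _FunctionSpec
    "description": "Get the current weather.",  # taken from the docstring
    "type": "object",
    "properties": {
        "location": {"type": "string"},         # no "title" key unless AIParam(title=...) is passed
    },
    "required": ["location"],
}
```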

v1.4.0

21 Feb 16:09

This release mainly improves the llama.cpp engine.

Improvements

  • Update the LlamaCppEngine to not use the Llama 2 prompt pipeline by default. Prompt pipelines must now be explicitly passed.
  • The LlamaCppEngine will now automatically download additional GGUF shards when a sharded model is given.
  • Added ChatTemplatePromptPipeline.from_pretrained to create a prompt pipeline from the chat template of any model on the HF Hub, by ID.
  • Added examples and documentation for using DeepSeek-R1 (quantized).
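
Sharded GGUF files typically follow a `-00001-of-0000N` filename suffix convention. As a sketch of how the sibling shards can be enumerated from the first filename (the regex below is my own illustration, not kani's actual implementation), consider:

```python
import re

# Matches the common sharded-GGUF naming scheme, e.g. "model-00001-of-00003.gguf"
SHARD_RE = re.compile(r"^(?P<stem>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")

def sibling_shards(filename: str) -> list[str]:
    """Given the first shard of a sharded GGUF model, list every shard name."""
    m = SHARD_RE.match(filename)
    if m is None:
        return [filename]  # not sharded; a single file
    total = int(m.group("total"))
    return [
        f"{m.group('stem')}-{i:05d}-of-{total:05d}.gguf"
        for i in range(1, total + 1)
    ]

names = sibling_shards("DeepSeek-R1-Q4_K_M-00001-of-00003.gguf")
```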

Fixes

  • chat_in_terminal_async no longer blocks the asyncio event loop when waiting for input from the terminal.
  • Fixed the LlamaCppEngine not passing functions to the provided prompt pipeline.

v1.3.0

03 Feb 20:50

Enhancements

  • Added ToolCallParsers: wrappers around Kani engines that parse the raw text generated by a model and return Kani-format tool calls. This is an easy way to enable tool calling on open-source models!

Example:

```python
from kani.engines.huggingface import HuggingEngine
from kani.prompts.impl.mistral import MISTRAL_V3_PIPELINE
from kani.tool_parsers.mistral import MistralToolCallParser

model = HuggingEngine(
    model_id="mistralai/Mistral-Small-Instruct-2409",
    prompt_pipeline=MISTRAL_V3_PIPELINE,
)
engine = MistralToolCallParser(model)
```

  • Added NaiveJSONToolCallParser (e.g., for Llama 3)
  • Added MistralToolCallParser
  • Added DeepseekR1ToolCallParser
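
As a self-contained illustration of the idea behind these parser classes (a simplified sketch, not kani's actual NaiveJSONToolCallParser), pulling a JSON tool call out of raw generated text might look like:

```python
import json
import re

# Greedily match a JSON object embedded in the model's raw output.
TOOL_CALL_RE = re.compile(r"\{.*\}", re.DOTALL)

def parse_tool_call(text: str):
    """Return (remaining_text, call_dict_or_None) from raw model output."""
    m = TOOL_CALL_RE.search(text)
    if m:
        try:
            obj = json.loads(m.group(0))
        except json.JSONDecodeError:
            return text, None  # looked like JSON but wasn't; pass text through
        if isinstance(obj, dict) and "name" in obj:
            # Strip the tool call out of the text and return it separately.
            remaining = (text[: m.start()] + text[m.end() :]).strip()
            return remaining, obj
    return text, None

text, call = parse_tool_call(
    '{"name": "get_weather", "parameters": {"location": "Tokyo"}}'
)
```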

Bug Fixes et al.

  • Fix compatibility issues with Pydantic 2.10
  • Update documentation to better reflect supported HF models

v1.2.4

09 Dec 18:08
  • Pin the Pydantic dependency to pydantic<2.10.0 as this version breaks JSON schema generation and MessagePart serialization

v1.2.3

14 Nov 18:36
  • Fixes Anthropic tool calling, which was broken with anthropic-sdk>0.26.0
  • Fixes an issue where Anthropic prompts were over-eagerly trimming prompts that did not start with a user message
  • Added support for tool calling while streaming with Anthropic models

v1.2.2

25 Oct 15:42
  • fix(mistral): ensure prompt and completion tokens are passed through in the MistralFunctionCallingAdapter when streaming
  • fix(streaming): don't emit text in DummyStream if it is None
  • feat: add standalone width formatters
  • docs: gpt-3.5-turbo -> gpt-4o-mini defaults
  • fix(streaming): potential line len miscount in format_stream

v1.2.1

06 Oct 21:36
  • Fixes various issues in the MistralFunctionCallingAdapter wrapper engine for Mistral-Large and Mistral-Small function calling models.
  • Fixes an issue in PromptPipeline.explain() where manual examples would not be explained.
  • Fixes an issue in PromptPipeline.ensure_bound_function_calls() where passing an ID translator would mutate the ID of the underlying messages

v1.2.0

24 Sep 20:26

New Features

  • Hugging Face: Models loaded through the HuggingEngine now use chat templates for conversational prompting and tool usage by default, when available. This should make it much easier to get started with a Hugging Face model in Kani.
  • Added the ability to supply a custom tokenizer to the OpenAIEngine (e.g., for using OpenAI-compatible APIs)

Fixes/Improvements

  • Fixed a missing dependency in the llama extra
  • The HuggingEngine will now automatically set device_map="auto" if the accelerate library is installed
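
As a sketch of the custom-tokenizer idea (the encode interface below is an assumption for illustration; check kani's documentation for the exact protocol the OpenAIEngine expects):

```python
class WhitespaceTokenizer:
    """A toy tokenizer: one whitespace-separated word per token.

    A real custom tokenizer would wrap whatever tokenizer the
    OpenAI-compatible server actually uses, so that kani's
    context-length bookkeeping matches the server's token counts.
    """

    def encode(self, text: str) -> list[int]:
        # Fake IDs; only the count matters for context-length bookkeeping.
        return list(range(len(text.split())))

n_tokens = len(WhitespaceTokenizer().encode("hello from an OpenAI-compatible API"))
```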