
Releases: xorbitsai/inference

v1.0.0

15 Nov 10:15
4c96475

What's new in 1.0.0 (2024-11-15)

These are the changes in inference v1.0.0.

Full Changelog: v0.16.3...v1.0.0

v0.16.3

08 Nov 05:47
85ab86b

What's new in 0.16.3 (2024-11-08)

These are the changes in inference v0.16.3.

Full Changelog: v0.16.2...v0.16.3

v0.16.2

01 Nov 10:09
67e97ab

What's new in 0.16.2 (2024-11-01)

These are the changes in inference v0.16.2.

Bug fixes

  • BUG: Fix bge-reranker-v2-minicpm-layerwise rerank issue by @hustyichi in #2495

Full Changelog: v0.16.1...v0.16.2

v0.16.1

25 Oct 07:33
d4cd7b1

What's new in 0.16.1 (2024-10-25)

These are the changes in inference v0.16.1.

Full Changelog: v0.16.0...v0.16.1

v0.16.0

18 Oct 11:40
5f7dea4

What's new in 0.16.0 (2024-10-18)

These are the changes in inference v0.16.0.

New features

  • FEAT: Add support for AWQ/GPTQ vLLM inference for vision models such as Qwen2-VL by @cyhasuka in #2445
  • FEAT: Dynamic batching for the state-of-the-art FLUX.1 text_to_image interface by @ChengjieLi28 in #2380
  • FEAT: Add MLX support for qwen2.5-instruct by @qinxuye in #2444

Full Changelog: v0.15.4...v0.16.0

v0.15.4

12 Oct 10:38
c0be115

What's new in 0.15.4 (2024-10-12)

These are the changes in inference v0.15.4.

New features

  • FEAT: Tool call support for Llama 3.1 Instruct by @codingl2k1 in #2388
  • FEAT: Tool call support for qwen2.5-instruct by @codingl2k1 in #2393
  • FEAT: Add the whisper-large-v3-turbo audio model by @hwzhuhao in #2409
  • FEAT: Add an environment variable to increase the number of retry attempts after model download failures by @hwzhuhao in #2411
  • FEAT: Support retrieving progress for image models by @qinxuye in #2395
  • FEAT: Support the vLLM engine for Qwen2-VL by @amumu96 in #2428

Full Changelog: v0.15.3...v0.15.4

v0.15.3

30 Sep 13:42
00a9ee1

What's new in 0.15.3 (2024-09-30)

These are the changes in inference v0.15.3.

Bug fixes

  • BUG: [UI] Fix 'Model Format' bug on model registration page. by @yiboyasss in #2353
  • BUG: Fix default value of max_model_len for vLLM backend. by @zjuyzj in #2385

Full Changelog: v0.15.2...v0.15.3

v0.15.2

20 Sep 09:05
5de46e9

What's new in 0.15.2 (2024-09-20)

These are the changes in inference v0.15.2.

Full Changelog: v0.15.1...v0.15.2

v0.15.1

14 Sep 07:38
4c5e752

What's new in 0.15.1 (2024-09-14)

These are the changes in inference v0.15.1.

Full Changelog: v0.15.0...v0.15.1

v0.15.0

06 Sep 08:45
e2618be

What's new in 0.15.0 (2024-09-06)

These are the changes in inference v0.15.0.

Bug fixes

  • BUG: Fix Docker image startup issue caused by the entrypoint by @ChengjieLi28 in #2207
  • BUG: Fix Xinference initialization failure when the custom path is invalid by @amumu96 in #2208
  • BUG: Use default_uid to replace the uid of actors that may override the xoscar actor's uid property by @qinxuye in #2214
  • BUG: Fix rerank max length by @qinxuye in #2219
  • BUG: Fix logger bug in functions decorated as generators by @wxiwnd in #2215
  • BUG: Fix token count calculation for rerank by @qinxuye in #2228
  • BUG: Fix embedding token calculation and optimize memory usage by @qinxuye in #2221

Documentation

  • DOC: Change single quotes to double quotes in the installation documentation for Windows compatibility by @nikelius in #2211

Full Changelog: v0.14.4...v0.15.0