Releases: xorbitsai/inference

v0.13.1

12 Jul 11:10
5e3f254

What's new in 0.13.1 (2024-07-12)

These are the changes in inference v0.13.1.

New features

Enhancements

Bug fixes

  • BUG: cache status missing for model id with quantization placeholder by @Zihann73 in #1849

Documentation

Others

New Contributors

Full Changelog: v0.13.0...v0.13.1

v0.13.0

05 Jul 10:33
007408c

What's new in 0.13.0 (2024-07-05)

These are the changes in inference v0.13.0.

New features

Enhancements

Bug fixes

Tests

Documentation

Full Changelog: v0.12.3...v0.13.0

v0.12.3

28 Jun 07:36
3d9c261

What's new in 0.12.3 (2024-06-28)

These are the changes in inference v0.12.3.

New features

Enhancements

Bug fixes

Others

  • CHORE: upgrade version fix security vulnerability by @rickywu in #1674

New Contributors

Full Changelog: v0.12.2...v0.12.3

v0.12.2.post1

22 Jun 17:37
7705d4a

What's new in 0.12.2.post1 (2024-06-22)

These are the changes in inference v0.12.2.post1.

Enhancements

Full Changelog: v0.12.2...v0.12.2.post1

v0.12.2

21 Jun 09:14
5cef7c3

What's new in 0.12.2 (2024-06-21)

These are the changes in inference v0.12.2.

New features

  • FEAT: Add Tools Support for Qwen Series MOE Models by @zhanghx0905 in #1642
  • FEAT: [UI]Modify the deletion function of a custom model. by @yiboyasss in #1656
  • FEAT: [UI]Custom model presents JSON data and modifies it. by @yiboyasss in #1670
  • FEAT: Add Rerank model token input/output usage by @wxiwnd in #1657

Enhancements

  • ENH: Continuous batching supports all the models with transformers backend by @ChengjieLi28 in #1659

Bug fixes

  • BUG: show error when user launch quantized model without device supported by @Minamiyama in #1645
  • BUG: Fix default rerank type by @codingl2k1 in #1649
  • BUG: chat_completion not response while error appears more than 100 by @liuzhenghua in #1663

Tests

Others

Full Changelog: v0.12.1...v0.12.2

v0.12.1

14 Jun 09:31
34a57df

What's new in 0.12.1 (2024-06-14)

These are the changes in inference v0.12.1.

New features

Enhancements

Bug fixes

Others

New Contributors

Full Changelog: v0.12.0...v0.12.1

v0.12.0

07 Jun 07:27
55c5636

What's new in 0.12.0 (2024-06-07)

These are the changes in inference v0.12.0.

New features

Enhancements

  • ENH: make CogVLM2 support stream output by @Minamiyama in #1572
  • BLD: Docker clean all images after building image on self-hosted machine by @ChengjieLi28 in #1595
  • BLD: Fix pip is looking multiple versions of some packages while installing by @ChengjieLi28 in #1603

Bug fixes

Documentation

New Contributors

Full Changelog: v0.11.3...v0.12.0

v0.11.3

31 May 09:28
69c09cd

What's new in 0.11.3 (2024-05-31)

These are the changes in inference v0.11.3.

New features

Enhancements

Bug fixes

  • BUG: fix launch model error when use torch 2.3.0 by @amumu96 in #1543
  • BUG: fix vl-model img path error by @amumu96 in #1559
  • BUG: Fix validation errors when define a custom baichuan-chat LLM model by @buptzyf in #1557

Documentation

  • DOC: update readme and fix description about model engine by @qinxuye in #1566

Others

New Contributors

Full Changelog: v0.11.2...v0.11.3

v0.11.2.post1

24 May 11:52
ac8f334

What's new in 0.11.2.post1 (2024-05-24)

These are the changes in inference v0.11.2.post1, a hotfix version of v0.11.2.

Bug fixes

  • BUG: fix launch model error when use torch 2.3.0 by @amumu96 in #1543

Full Changelog: v0.11.2...v0.11.2.post1

v0.11.2

24 May 09:10
77e79f8

What's new in 0.11.2 (2024-05-24)

These are the changes in inference v0.11.2.

New features

Enhancements

Bug fixes

  • BUG: Fix start worker failed due to None device name by @codingl2k1 in #1539
  • BUG: Fix gpu_idx allocate error when set replica > 1 by @amumu96 in #1528

Others

Full Changelog: v0.11.1...v0.11.2