v0.11.2
What's new in 0.11.2 (2024-05-24)
These are the changes in inference v0.11.2.
New features
- FEAT: Add command cal-model-mem by @frostyplanet in #1460
- FEAT: Add deepseek LLM and coder base models by @qinxuye in #1533
- FEAT: Add codeqwen1.5 by @qinxuye in #1535 (see the usage sketch after this list)
- FEAT: Auto-detect the rerank type when it is unknown by @codingl2k1 in #1538
- FEAT: Support querying information about the cached models hosted on the queried node by @hainaweiben in #1522
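To try one of the newly added models, here is a minimal sketch using the Python RESTful client. It assumes a local Xinference server at http://localhost:9997 and that the model is available under the name codeqwen1.5-chat; the engine and size parameters are assumptions and may need to match your deployment.

```python
from xinference.client import Client

# Assumes an Xinference server is already running locally.
client = Client("http://localhost:9997")

# Launch one of the newly added models. The model name, engine, and size
# below are assumptions; adjust them to whatever your deployment provides.
model_uid = client.launch_model(
    model_name="codeqwen1.5-chat",
    model_type="LLM",
    model_engine="transformers",
    model_size_in_billions=7,
)

model = client.get_model(model_uid)
# Prompt-style chat call as used by clients around this release; newer
# releases may expect an OpenAI-style messages list instead.
print(model.chat("Write a binary search in Python."))
```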
Enhancements
- ENH: Compatible with huggingface-hub v0.23.0 by @ChengjieLi28 in #1514
- ENH: Convert command-r to chat by @qinxuye in #1537
- ENH: Support Intern-VL-Chat model by @amumu96 in #1536
- BLD: Adapt to langchain 0.2.x, which has breaking changes, by @mikeshi80 in #1521
- BLD: Fix pre-commit by @frostyplanet in #1527
- BLD: Compatible with torch 2.3.0 by @qinxuye in #1534
Bug fixes
- BUG: Fix worker startup failure caused by a None device name by @codingl2k1 in #1539
- BUG: Fix gpu_idx allocation error when replica > 1 is set by @amumu96 in #1528
Others
- CHORE: Add basic rerank benchmark benchmark/benchmark_rerank.py by @codingl2k1 in #1479
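As context for what the rerank benchmark presumably exercises, below is a minimal, hedged sketch of a rerank call through the Python client. The endpoint, the model name bge-reranker-base, and the shape of the returned result are assumptions about a typical deployment, not part of these notes.

```python
from xinference.client import Client

client = Client("http://localhost:9997")

# Launch a rerank model; with the auto-detection added in #1538, the rerank
# type no longer has to be known in advance. The model name is an assumption.
model_uid = client.launch_model(model_name="bge-reranker-base", model_type="rerank")
model = client.get_model(model_uid)

# Score candidate documents against a query.
result = model.rerank(
    documents=[
        "Xinference serves LLM, embedding, and rerank models.",
        "Bananas are rich in potassium.",
    ],
    query="How do I serve a rerank model?",
)
print(result)  # Typically a ranked list with relevance scores; layout may vary.
```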
Full Changelog: v0.11.1...v0.11.2