01 Mar 07:04

XprobeBot

7b20f76

v0.9.1

What's new in 0.9.1 (2024-03-01)

These are the changes in inference v0.9.1.

New features

FEAT: Docker for cpu only by @ChengjieLi28 in #1068

Enhancements

ENH: Support downloading gemma from modelscope by @aresnow1 in #1035
ENH: [UI] Setting quantization when registering LLM by @ChengjieLi28 in #1040
ENH: Restful client supports multiple system prompts for chat by @ChengjieLi28 in #1056
ENH: supports disabling worker reporting status by @ChengjieLi28 in #1057
ENH: Extra params for xinference launch command line by @ChengjieLi28 in #1048

Bug fixes

BUG: Fix some models that cannot download from modelscope by @ChengjieLi28 in #1066
BUG: Fix early truncation due to max_token being default to 16 instead of 1024 by @ZhangTianrong in #1061

Documentation

DOC: Update readme by @qinxuye in #1045
DOC: Fix readme by @qinxuye in #1054
DOC: Fix wechat links by @qinxuye in #1055

New Contributors

@ZhangTianrong made their first contribution in #1061

Full Changelog: v0.9.0...v0.9.1

Contributors

qinxuye, ZhangTianrong, and 2 other contributors

Assets 2

22 Feb 08:03

XprobeBot

v0.9.0

c653c97

v0.9.0

What's new in 0.9.0 (2024-02-22)

These are the changes in inference v0.9.0.

New features

FEAT: Refactor device related code and add initial Intel GPU support by @notsyncing in #968
FEAT: Support gemma series model by @aresnow1 in #1024

Enhancements

ENH: [UI] Supports replica when launching LLM models by @ChengjieLi28 in #1011
ENH: [UI] Show cluster resource information by @ChengjieLi28 in #1015

Bug fixes

BUG: fix chat completion error when indexing body.messages by @fffonion in #1008
BUG: Fix cache sd 1.5 error by @codingl2k1 in #1013
BUG: fix typo in modelscope llama-2-13b-chat-GGUF by @qinxuye in #1026
BUG: Fix missing qwen 1.5 7b gguf by @codingl2k1 in #1027

Documentation

DOC: Polish model operation command doc by @onesuper in #1000
DOC: Fix note on secret_key generation and algorithm selection for OAuth2 by @ChengjieLi28 in #1012

New Contributors

@fffonion made their first contribution in #1008
@notsyncing made their first contribution in #968

Full Changelog: v0.8.5...v0.9.0

Contributors

qinxuye, onesuper, and 5 other contributors

Assets 2

06 Feb 05:37

XprobeBot

v0.8.5

e903e05

v0.8.5

What's new in 0.8.5 (2024-02-06)

These are the changes in inference v0.8.5.

New features

FEAT: Implemented web UI for launching the text2image model. by @hainaweiben in #985
FEAT: Support qwen-1.5 series by @aresnow1 in #994

Enhancements

ENH: Download stable diffusion model from modelscope by @codingl2k1 in #980
REF: Supports pydantic v2 by @ChengjieLi28 in #983

Bug fixes

BUG: Fix load yi vl model to multiple cards by @codingl2k1 in #992
BUG: client compatible with old version of xinference by @ChengjieLi28 in #987

Others

CI: Free disk usage by @aresnow1 in #982
[DOC] Polish troubleshooting page by @onesuper in #990

New Contributors

@hainaweiben made their first contribution in #985

Full Changelog: v0.8.4...v0.8.5

Contributors

onesuper, aresnow1, and 3 other contributors

Assets 2

04 Feb 09:17

XprobeBot

v0.8.4

1b9b8c8

v0.8.4

What's new in 0.8.4 (2024-02-04)

These are the changes in inference v0.8.4.

Enhancements

ENH: [UI] Fix too long LLM model name by @ChengjieLi28 in #979
ENH: Add gguf models of llama-2-chat by @aresnow1 in #981

Bug fixes

BUG: Fix custom model tool calls by @codingl2k1 in #978
BUG: Fix chat template by @aresnow1 in #977

Documentation

DOC: Translate model docs by @onesuper in #965
DOC: Auto gen metrics doc by @codingl2k1 in #967
DOC: Update README.md by @codingl2k1 in #969

Full Changelog: v0.8.3.1...v0.8.4

Contributors

onesuper, aresnow1, and 2 other contributors

Assets 2

02 Feb 08:06

XprobeBot

v0.8.3.1

cfbe5ba

v0.8.3.1

What's new in 0.8.3.1 (2024-02-02)

These are the changes in inference v0.8.3.1.

Bug fixes

BUG: Remove flash-attn dependency by @codingl2k1 in #970

Full Changelog: v0.8.3...v0.8.3.1

Contributors

codingl2k1

Assets 2

02 Feb 07:14

XprobeBot

v0.8.3

749ef3f

v0.8.3

What's new in 0.8.3 (2024-02-02)

These are the changes in inference v0.8.3.

New features

FEAT: add whisper.small and belle distilwhisper model, fix parameter in rerank by @zhanghx0905 in #944
FEAT: Support jina-embeddings-v2-base-zh by @aresnow1 in #948
FEAT: Support Yi VL by @codingl2k1 in #946
FEAT: Support more embedding and rerank models by @aresnow1 in #959

Enhancements

ENH: Record gpu mem status in workers by @ChengjieLi28 in #941
ENH: Allow chat max_tokens is None by @codingl2k1 in #960
ENH: chatglm ggml format supports system_prompt by @ChengjieLi28 in #962

Bug fixes

BUG: Fix roles in chat UI by @aresnow1 in #949
BUG: Fix heartbeat by @codingl2k1 in #957
BUG: Fix model's content length by @aresnow1 in #955

Documentation

DOC: Update readme by @aresnow1 in #938
DOC: Add image model doc by @codingl2k1 in #947
DOC: Add audio model doc by @codingl2k1 in #954
DOC: Reorge model related docs by @onesuper in #961

New Contributors

@zhanghx0905 made their first contribution in #944

Full Changelog: v0.8.2...v0.8.3

Contributors

onesuper, zhanghx0905, and 3 other contributors

Assets 2

26 Jan 08:32

XprobeBot

v0.8.2

6fa3ee0

v0.8.2

What's new in 0.8.2 (2024-01-26)

These are the changes in inference v0.8.2.

New features

FEAT: Support events by @codingl2k1 in #916
FEAT: Support audio model by @codingl2k1 in #929
FEAT: Support orion series models by @aresnow1 in #933
Feat: Support Mixtral-8x7B-Instruct-v0.1-AWQ by @aresnow1 in #936

Enhancements

ENH: Launch model by version by @ChengjieLi28 in #896
ENH: Move multimodal to LLM by @codingl2k1 in #917
ENH: InternLM2 chat template by @aresnow1 in #919
ENH: Support use_fp16 for rerank model by @aresnow1 in #927
ENH: record instance count and version count when detailed listing model registrations by @ChengjieLi28 in #920
BLD: Resolve conflicts during installation by @aresnow1 in #924
REF: Move auth code to service for better scalability by @ChengjieLi28 in #925

Documentation

DOC: Update readme by @aresnow1 in #914
DOC: Display contributors in readme by @onesuper in #915
DOC: Merge multimodal to LLM by @codingl2k1 in #923
DOC: Model usage guide by @onesuper in #926
DOC: Audio doc by @codingl2k1 in #937

Full Changelog: v0.8.1...v0.8.2

Contributors

onesuper, aresnow1, and 2 other contributors

Assets 2

19 Jan 09:17

XprobeBot

v0.8.1

fb3985e

v0.8.1

What's new in 0.8.1 (2024-01-19)

These are the changes in inference v0.8.1.

New features

FEAT: Auto recover limit by @codingl2k1 in #893
FEAT: Prometheus metrics exporter by @codingl2k1 in #906
FEAT: Add internlm2-chat support by @aresnow1 in #913

Enhancements

ENH: Launch model asynchronously by @ChengjieLi28 in #879
ENH: qwen vl modelscope by @codingl2k1 in #902
ENH: Add "tools" in model ability by @aresnow1 in #904
ENH: Add quantization support for qwen chat by @aresnow1 in #910

Bug fixes

BUG: Fix prompt template of chatglm3-32k by @aresnow1 in #889
BUG: invalid volumn in docker compose yml by @ChengjieLi28 in #890
BUG: Revert #883 by @aresnow1 in #903
BUG: Fix chatglm backend by @codingl2k1 in #898
BUG: Fix tool calls on custom model by @codingl2k1 in #899
BUG: Fix is_valid_model_name by @aresnow1 in #907

Documentation

DOC: Update the documentation about use of docker by @aresnow1 in #901
DOC:ADD FAQ IN troubleshooting.rst by @sisuad in #911

New Contributors

@sisuad made their first contribution in #911

Full Changelog: v0.8.0...v0.8.1

Contributors

sisuad, aresnow1, and 2 other contributors

Assets 2

11 Jan 13:55

XprobeBot

v0.8.0

e4c892c

v0.8.0

What's new in 0.8.0 (2024-01-11)

These are the changes in inference v0.8.0.

New features

FEAT: qwen 1.8b gptq by @codingl2k1 in #869
FEAT: docker compose support by @Minamiyama in #868
FEAT: Simple OAuth2 system by @ChengjieLi28 in #793
FEAT: Chat vl web UI by @codingl2k1 in #882
FEAT: Yi chat gptq by @codingl2k1 in #876

Enhancements

ENH: Stream use xoscar generator by @codingl2k1 in #859
ENH: UI supports registering custom gptq models by @ChengjieLi28 in #875
ENH: make the size param of *_to_image more compatible by @liunux4odoo in #881
BLD: Update package-lock.json by @aresnow1 in #886
REF: Add model_hub property in EmbeddingModelSpec by @aresnow1 in #877

Bug fixes

BUG: Fix image model b64_json output by @codingl2k1 in #874
BUG: fix libcuda.so.1: cannot open shared object file by @superhuahua in #883
BUG: Fix auto recover kwargs by @codingl2k1 in #885

Documentation

DOC: docker image translation by @aresnow1 in #865
DOC: register model with model_family by @ChengjieLi28 in #863
DOC: Add OpenAI Client API doc by @codingl2k1 in #864
DOC: add docker instructions by @onesuper in #878

New Contributors

@superhuahua made their first contribution in #883

Full Changelog: v0.7.5...v0.8.0

Contributors

Minamiyama, onesuper, and 5 other contributors

Assets 2

05 Jan 07:56

XprobeBot

v0.7.5

56b28b3

v0.7.5

What's new in 0.7.5 (2024-01-05)

These are the changes in inference v0.7.5.

New features

FEAT: text2vec by @ChengjieLi28 in #857

Enhancements

ENH: Offload all response serialization to ModelActor by @codingl2k1 in #837
ENH: Custom model uses vLLM by @ChengjieLi28 in #861
BLD: Docker image by @ChengjieLi28 in #855

Bug fixes

BUG: Fix typing_extension version problem in notebook by @onesuper in #856
BUG: Fix multimodal cmdline by @codingl2k1 in #850
BUG: Fix generate of chatglm3 by @aresnow1 in #858

Documentation

DOC: CUDA Version recommendation by @ChengjieLi28 in #841
DOC: new doc cover by @onesuper in #843
DOC: Autogen modelhub info by @onesuper in #845
DOC: Add multimodal feature in README by @onesuper in #846
DOC: Chinese doc for user guide by @aresnow1 in #847
DOC: add notebook for quickstart by @onesuper in #854
DOC: Add docs about environments by @aresnow1 in #853
DOC: Add jupyter notebook quick start tutorial by @onesuper in #851

Others

CHORE: Add docker image with latest tag by @ChengjieLi28 in #862

Full Changelog: v0.7.4.1...v0.7.5

Contributors

onesuper, aresnow1, and 2 other contributors

Assets 2

Releases: xorbitsai/inference

v0.9.1

What's new in 0.9.1 (2024-03-01)

New features

Enhancements

Bug fixes

Documentation

New Contributors

Contributors

v0.9.0

What's new in 0.9.0 (2024-02-22)

New features

Enhancements

Bug fixes

Documentation

New Contributors

Contributors

v0.8.5

What's new in 0.8.5 (2024-02-06)

New features

Enhancements

Bug fixes

Others

New Contributors

Contributors

v0.8.4

What's new in 0.8.4 (2024-02-04)

Enhancements

Bug fixes

Documentation

Contributors

v0.8.3.1

What's new in 0.8.3.1 (2024-02-02)

Bug fixes

Contributors

v0.8.3

What's new in 0.8.3 (2024-02-02)

New features

Enhancements

Bug fixes

Documentation

New Contributors

Contributors

v0.8.2

What's new in 0.8.2 (2024-01-26)

New features

Enhancements

Documentation

Contributors

v0.8.1

What's new in 0.8.1 (2024-01-19)

New features

Enhancements

Bug fixes

Documentation

New Contributors

Contributors

v0.8.0

What's new in 0.8.0 (2024-01-11)

New features

Enhancements

Bug fixes

Documentation

New Contributors

Contributors

v0.7.5

What's new in 0.7.5 (2024-01-05)

New features

Enhancements

Bug fixes

Documentation

Others

Contributors