Commit cdff769

bump version to v0.11.1 (#4221)
* bump version to v0.11.1
* minor fix
1 parent d408dc6 commit cdff769

File tree

7 files changed: +10 −11 lines


README.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -216,7 +216,7 @@ The default prebuilt package is compiled on **CUDA 12** since v0.3.0.
 For the GeForce RTX 50 series, please install the LMDeploy prebuilt package compiled with **CUDA 12.8**
 
 ```shell
-export LMDEPLOY_VERSION=0.11.0
+export LMDEPLOY_VERSION=0.11.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````
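For clarity, here is how the two exported variables expand into the wheel URL that `pip install` fetches (a sketch using the values from the snippet above):

```shell
# Reproduce the URL expansion from the README snippet.
export LMDEPLOY_VERSION=0.11.1
export PYTHON_VERSION=310
WHEEL_URL="https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl"
# The cp${PYTHON_VERSION} tags must match your interpreter (e.g. cp310 for Python 3.10).
echo "$WHEEL_URL"
```

Only the two `export` lines change between releases; the filename pattern follows the standard wheel naming convention (version + CUDA build tag + CPython ABI tags + platform tag).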

README_zh-CN.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -217,7 +217,7 @@ pip install lmdeploy
 If you use a GeForce RTX 50 series GPU, please install the LMDeploy prebuilt package compiled with **CUDA 12.8**.
 
 ```shell
-export LMDEPLOY_VERSION=0.11.0
+export LMDEPLOY_VERSION=0.11.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu128-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu128
 ```
````

docs/en/get_started/installation.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 The default prebuilt package is compiled on **CUDA 12**. If CUDA 11+ (>=11.3) is required, you can install lmdeploy by:
 
 ```shell
-export LMDEPLOY_VERSION=0.11.0
+export LMDEPLOY_VERSION=0.11.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
````

docs/zh_cn/get_started/installation.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -23,7 +23,7 @@ pip install lmdeploy
 The default prebuilt package is compiled on **CUDA 12**. If you need CUDA 11+ (>=11.3), you can install lmdeploy with:
 
 ```shell
-export LMDEPLOY_VERSION=0.11.0
+export LMDEPLOY_VERSION=0.11.1
 export PYTHON_VERSION=310
 pip install https://github.com/InternLM/lmdeploy/releases/download/v${LMDEPLOY_VERSION}/lmdeploy-${LMDEPLOY_VERSION}+cu118-cp${PYTHON_VERSION}-cp${PYTHON_VERSION}-manylinux2014_x86_64.whl --extra-index-url https://download.pytorch.org/whl/cu118
 ```
````

lmdeploy/cli/chat.py

Lines changed: 4 additions & 0 deletions

```diff
@@ -32,6 +32,10 @@ def build_pipe(model_path, backend, **kwargs):
     from .utils import get_lora_adapters
     adapters = get_lora_adapters(kwargs['adapters'])
     engine_config.adapters = adapters
+    # disable metrics to avoid installing prometheus_client, which is not needed
+    # in interactive chat
+    engine_config.enable_metrics = False
+
     # set chat template config
     chat_template = kwargs.get('chat_template', None)
     chat_template_config = None
```
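The added lines turn metrics off so that `prometheus_client` never has to be imported on the interactive chat path. A minimal sketch of that optional-dependency pattern, with a deferred import behind the flag (the class and function names here are illustrative, not LMDeploy's actual API):

```python
# Sketch: gate an optional dependency behind a config flag so the
# common path works without the extra package installed.
class EngineConfig:
    def __init__(self, enable_metrics: bool = True):
        self.enable_metrics = enable_metrics

def build_metrics(config: EngineConfig):
    if not config.enable_metrics:
        # chat path: prometheus_client is never imported at all
        return None
    # deferred import: only evaluated when metrics are actually wanted
    from prometheus_client import Counter
    return Counter('requests_total', 'Number of requests served')
```

Because the import lives inside the function and after the early return, setting `enable_metrics = False` (as the commit does) means the package can be absent without raising `ImportError`.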

lmdeploy/pytorch/backends/cuda/blockedf8_modules.py

Lines changed: 1 addition & 6 deletions

```diff
@@ -41,12 +41,7 @@ def forward(self,
                           trans_scale=True,
                           scale_fmt=self.scale_fmt)
 
-        out = blocked_gemm_fp8(input_quant,
-                               input_scale,
-                               weight.t(),
-                               scale.t(),
-                               out_dtype=x.dtype,
-                               scale_fmt=self.scale_fmt)
+        out = blocked_gemm_fp8(input_quant, input_scale, weight.t(), scale.t(), out_dtype=x.dtype)
         if bias is not None:
             out += bias
```
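For context, a blocked FP8 GEMM multiplies quantized matrices where each tile shares a single scale factor, so dequantization broadcasts one scale over every element of its block before the matmul. A toy NumPy sketch of that idea (names, shapes, and the block size are illustrative; this is not the kernel `blocked_gemm_fp8` the source calls):

```python
import numpy as np

def dequant_blocked(q: np.ndarray, scale: np.ndarray, block: int) -> np.ndarray:
    # Expand each per-block scale to cover its (block x block) tile,
    # then dequantize elementwise.
    s = np.repeat(np.repeat(scale, block, axis=0), block, axis=1)
    return q * s[:q.shape[0], :q.shape[1]]

def blocked_gemm_ref(aq, a_scale, bq, b_scale, block=2):
    # Reference semantics: dequantize both operands, then a plain matmul.
    # Real kernels fuse the scaling into the tiled GEMM instead.
    return dequant_blocked(aq, a_scale, block) @ dequant_blocked(bq, b_scale, block)
```

The refactored call site passes the transposed weight and its transposed per-block scales together, which matches this dequantize-then-multiply semantics.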

lmdeploy/version.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,7 +1,7 @@
 # Copyright (c) OpenMMLab. All rights reserved.
 from typing import Tuple
 
-__version__ = '0.11.0'
+__version__ = '0.11.1'
 short_version = __version__
 
 
```