Add Support for Pre-trained RM #1038

RowitZou · 2025-06-25T07:13:20Z

Add fine-tuning support for RMP:

Config examples:
xtuner/xtuner/configs/reward_model/internrm/internrm_7b_full_varlenattn_custom_dataset.py
xtuner/xtuner/configs/reward_model/internrm/internrm_1_8b_full_varlenattn_custom_dataset.py

Add a client for RMP:
xtuner/xtuner/utils/rm_utils.py

RangiLyu · 2025-06-25T07:22:28Z

xtuner/configs/reward_model/internrm/internrm_1_8b_full_varlenattn_custom_dataset.py

+#                          PART 1  Settings                           #
+#######################################################################
+# Model
+pretrained_model_name_or_path = "internlm/internrm-1_8b-base"


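If this config is meant for fine-tuning a pre-trained RM, this should be changed to the HF repo name of your pre-trained model, because the model is downloaded automatically.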

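RowitZou · 2025-06-25

This is the tentative HF repo name for now. If the released name ends up changing, I will submit another PR to update it.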
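Since the repo name in this thread is tentative and the model is fetched automatically from the hub, the sketch below shows one way to keep the setting easy to override. It is only an illustration, not part of the PR: the helper `resolve_pretrained_path` and the environment variable `INTERNRM_PATH` are hypothetical names; only the default repo id comes from the diff above.

```python
import os

# Repo id quoted in the diff above; described in the thread as tentative.
DEFAULT_REPO = "internlm/internrm-1_8b-base"


def resolve_pretrained_path(default_repo=DEFAULT_REPO, env_var="INTERNRM_PATH"):
    """Pick the value for `pretrained_model_name_or_path`.

    Hypothetical helper: if the (hypothetical) environment variable points
    at an existing local directory, use that checkpoint; otherwise fall back
    to the hub repo id so the model can be downloaded automatically.
    """
    override = os.environ.get(env_var)
    if override and os.path.isdir(override):
        return override
    return default_repo


# In a config file this would take the place of the hard-coded string.
pretrained_model_name_or_path = resolve_pretrained_path()
```

With this arrangement a renamed hub repo only requires changing `DEFAULT_REPO`, and a locally downloaded checkpoint can be used without editing the config.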

RangiLyu · 2025-07-07T09:16:40Z

@pppppM Plz review this PR.

hhaAndroid · 2025-07-07T10:02:42Z

@RowitZou please fix lint

RowitZou · 2025-07-07T10:37:31Z

> @RowitZou please fix lint

@hhaAndroid Lint has been fixed. Plz try again.

RowitZou added 3 commits June 20, 2025 17:10

Support training for InternRM series.

9bc7072

Add InternRM Client.

2791c52

Add lmdeploy support.

69900bc

RangiLyu reviewed Jun 25, 2025


RowitZou added 3 commits June 30, 2025 23:58

Add reference link of InternRM in the config file.

4f6c8f4

Modify the request api for poem rm using lmdeploy.

e1cceb2

Modify names of POLAR RMs.

6e97bbb

RangiLyu approved these changes Jul 7, 2025


hhaAndroid self-assigned this Jul 7, 2025

pppppM approved these changes Jul 7, 2025


Fix lint.

5fe7b74

Fix lint.

99d4c16

hhaAndroid merged commit 00796c1 into InternLM:main Jul 7, 2025
3 checks passed
