
About the warnings printed during fine-tuning #495

@Meng98L

Description


Is it normal to see the message "You are using a model of type internlmxcomposer2 to instantiate a model of type internlm. This is not supported for all configurations of models and can yield errors."?

Also, my fine-tuning run prints the log below. Are any of these warnings ones that should not be ignored?
[2025-06-25 18:26:46,313] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/root/miniconda3/envs/xtuner/lib/python3.8/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
You are using a model of type internlmxcomposer2 to instantiate a model of type internlm. This is not supported for all configurations of models and can yield errors.
Load model from: internlm/internlm-xcomposer2-vl-7b
Set max length to 4096
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:11<00:00, 5.52s/it]
Some weights of InternLMXComposer2ForCausalLM were not initialized from the model checkpoint at internlm/internlm-xcomposer2-vl-7b and are newly initialized: ['vit.vision_tower.vision_model.post_layernorm.weight', 'vit.vision_tower.vision_model.post_layernorm.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
trainable params: 150,994,944 || all params: 8,817,835,008 || trainable%: 1.7123811441585095
Loading data...
Load 1000 samples from ['/root/autodl-tmp/train600.json', '1']
init mix data at rank 0
load 1000 data
1000samples is loaded
True
0%| | 0/125 [00:00<?, ?it/s]Set seed 88 for rank 0
/root/miniconda3/envs/xtuner/lib/python3.8/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Could not estimate the number of tokens of the input, floating-point operations will not be computed
{'loss': 2.8206, 'learning_rate': 2.5e-05, 'epoch': 0.01}
{'loss': 2.4058, 'learning_rate': 5e-05, 'epoch': 0.02}
{'loss': 2.6563, 'learning_rate': 4.999184590202141e-05, 'epoch': 0.02}
{'loss': 1.9066, 'learning_rate': 4.996738892723075e-05, 'epoch': 0.03}
{'loss': 1.5686, 'learning_rate': 4.992664502959351e-05, 'epoch': 0.04}
{'loss': 1.4275, 'learning_rate': 4.986964078748837e-05, 'epoch': 0.05}
{'loss': 1.2554, 'learning_rate': 4.979641338636935e-05, 'epoch': 0.06}
{'loss': 1.0514, 'learning_rate': 4.970701059450872e-05, 'epoch': 0.06}
{'loss': 0.8164, 'learning_rate': 4.960149073183643e-05, 'epoch': 0.07}
{'loss': 0.8936, 'learning_rate': 4.9479922631896405e-05, 'epoch': 0.08}
{'loss': 0.6487, 'learning_rate': 4.934238559694448e-05, 'epoch': 0.09}
{'loss': 0.6386, 'learning_rate': 4.918896934621734e-05, 'epoch': 0.1}
{'loss': 0.4914, 'learning_rate': 4.901977395740619e-05, 'epoch': 0.1}
{'loss': 0.5513, 'learning_rate': 4.8834909801373264e-05, 'epoch': 0.11}
{'loss': 0.523, 'learning_rate': 4.863449747015384e-05, 'epoch': 0.12}
{'loss': 0.4984, 'learning_rate': 4.8418667698290696e-05, 'epoch': 0.13}
{'loss': 0.3157, 'learning_rate': 4.8187561277552374e-05, 'epoch': 0.14}
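For reference, the `trainable params` line in the log is the usual PEFT-style summary: the percentage is simply trainable parameters divided by all parameters. A quick sanity check in Python, using the numbers printed above:

```python
# Sanity check of the "trainable params || all params || trainable%" line
# from the log above, using the values it reports.
trainable = 150_994_944     # trainable params reported by the script
total = 8_817_835_008       # all params reported by the script

ratio = trainable / total * 100
print(f"trainable%: {ratio}")  # should match the logged ~1.71238%
```

So about 1.7% of the 8.8B parameters are being updated, which is consistent with a LoRA-style partial fine-tune rather than full fine-tuning.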
