MarkupLM model training issue #1822

Open
yegoling opened this issue Nov 15, 2024 · 0 comments
Labels
bug Something isn't working

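Describe the bug (Mandatory)
After replacing the dataloader with the torch DataLoader and converting its tensors to MindSpore tensors, I fed them into the MarkupLM model for training, but the forward-pass output is still wrong: the loss and all logits are NaN.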

  • Hardware Environment (Ascend/GPU/CPU):

CPU

  • Software Environment (Mandatory):
    -- MindSpore version : 2.4.0
    -- Python version : 3.9.20

To Reproduce (Mandatory)
Run the training code to train the markuplm-base model; the forward-pass output is wrong.

Expected behavior (Mandatory)
The forward pass produces normal (finite) output.

Screenshots / Logs (Mandatory)

import mindspore as ms

# `dataloader` (a torch DataLoader) and `model` (the MarkupLM model) are defined elsewhere.
for batch in dataloader:
    # Convert every torch tensor in the batch to a MindSpore tensor via NumPy.
    inputs = {k: ms.from_numpy(v.numpy()) for k, v in batch.items()}
    outputs = model(**inputs)
    print(outputs)

Output:

TokenClassifierOutput(loss=Tensor(shape=[], dtype=Float32, value= nan), logits=Tensor(shape=[2, 512, 4], dtype=Float32, value=
[[[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]],
 [[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]]]), hidden_states=None, attentions=None)
TokenClassifierOutput(loss=Tensor(shape=[], dtype=Float32, value= nan), logits=Tensor(shape=[2, 512, 4], dtype=Float32, value=
[[[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]],
 [[      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
...
  ...
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)],
  [      -nan(ind),       -nan(ind),       -nan(ind),       -nan(ind)]]]), hidden_states=None, attentions=None)
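Since every logit is already NaN on the first batch, one thing worth ruling out is the batch itself. The following is only a minimal sketch, not part of the original report: it inspects the intermediate NumPy arrays (the result of calling `.numpy()` on each torch tensor) and reports dtype, shape, value range, and whether any float array contains NaN or Inf. The helper name `summarize_batch` and the example keys are hypothetical, chosen for illustration.

```python
import numpy as np

def summarize_batch(batch):
    """Summarize a dict of NumPy arrays: dtype, shape, value range, non-finite flag."""
    report = {}
    for name, arr in batch.items():
        arr = np.asarray(arr)
        # Integer arrays cannot hold NaN/Inf, so only float arrays are checked.
        is_float = np.issubdtype(arr.dtype, np.floating)
        non_finite = bool(is_float and not np.isfinite(arr).all())
        report[name] = {
            "dtype": str(arr.dtype),
            "shape": arr.shape,
            "non_finite": non_finite,
            "min": arr.min().item(),
            "max": arr.max().item(),
        }
    return report

# Hypothetical stand-in for one collated batch (assumed keys, not taken from the issue).
example = {
    "input_ids": np.array([[0, 5, 7, 2]], dtype=np.int64),
    "attention_mask": np.array([[1, 1, 1, 0]], dtype=np.int64),
}
report = summarize_batch(example)
for name, info in report.items():
    print(name, info)
```

If the inputs turn out to be finite and have the expected dtypes, the value ranges can still be compared against the sizes the model expects (vocabulary, label count), which would point the investigation at the model side rather than the data pipeline.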

Additional context (Optional)
The problematic MindSpore code and the PyTorch code that produces normal output (for comparison) are attached below:

MindSpore code:
mindspore.md
PyTorch code:
torch.md
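To make the comparison between the two attached scripts concrete, the logits from both frameworks can be exported to NumPy and compared numerically. The sketch below is an illustration under assumptions, not code from the attachments: `compare_logits` is a hypothetical helper, the tolerances are arbitrary defaults, and the arrays are synthetic stand-ins shaped like the `[2, 512, 4]` logits in the log above.

```python
import numpy as np

def compare_logits(reference, candidate, rtol=1e-3, atol=1e-4):
    """Compare two logits arrays and report NaN fraction, max abs difference, closeness."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Fraction of candidate entries that are NaN (1.0 means every entry is NaN).
    nan_fraction = float(np.isnan(candidate).mean())
    # Only positions where both arrays are finite contribute to the difference.
    finite = np.isfinite(reference) & np.isfinite(candidate)
    if finite.any():
        max_abs_diff = float(np.abs(reference - candidate)[finite].max())
    else:
        max_abs_diff = float("nan")
    close = bool(np.allclose(reference, candidate, rtol=rtol, atol=atol))
    return {"nan_fraction": nan_fraction, "max_abs_diff": max_abs_diff, "close": close}

# Synthetic stand-ins: a finite reference and an all-NaN candidate.
ref = np.zeros((2, 512, 4), dtype=np.float32)
cand = np.full((2, 512, 4), np.nan, dtype=np.float32)
result = compare_logits(ref, cand)
print(result)
```

In practice the reference array would presumably come from the PyTorch side (for example `outputs.logits.detach().numpy()`) and the candidate from the MindSpore side (`outputs.logits.asnumpy()`), with the same checkpoint and the same input batch fed to both models so that any difference is attributable to the framework port.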

yegoling added the bug label Nov 15, 2024