
Reproducing TinyBERT requires the Wikipedia pre-training corpus; also, will a cased TinyBERT model ("tinybert-cased") be open-sourced? #237

@hppy139

Description


Hello,

The paper states: "For the general distillation, we set the maximum sequence length to 128 and use English Wikipedia (2,500M words) as the text corpus and perform the intermediate layer distillation for 3 epochs with the supervision from a pre-trained BERT BASE and keep other hyper-parameters the same as BERT pre-training (Devlin et al., 2019)." Could you provide a download link for this pre-training corpus?
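For context, this is roughly how I am preparing an English Wikipedia corpus at the moment; the dump URL is the standard Wikimedia one, but the extraction tool (WikiExtractor) and the output layout are just my own choices, not something taken from this repo:

```bash
# Download the latest English Wikipedia dump (not necessarily the version used in the paper).
wget https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2

# Extract plain text with WikiExtractor (pip install wikiextractor);
# writes one JSON line per article under enwiki_text/.
python -m wikiextractor.WikiExtractor enwiki-latest-pages-articles.xml.bz2 \
    --json --output enwiki_text
```

If you already have an official preprocessed corpus, or can point to the exact dump version used in the paper, that would be even better.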

In addition, for the pre-training (general distillation) stage, is it acceptable to omit the --do_lower_case flag when running general_distill.py? The vocab.txt shipped with the released models is a lowercase vocabulary, so I would like to ask whether a trained, case-sensitive TinyBERT model (i.e. a "tinybert-cased") is currently available. A rough sketch of what I have in mind follows below.
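Concretely, I am imagining an invocation like the one below. Apart from --do_lower_case, the flag names and paths are only my guesses based on similar BERT distillation scripts, so please correct me if general_distill.py expects different arguments:

```bash
# Hypothetical cased setup: drop --do_lower_case and use a cased teacher/vocab.
# All flag names except --do_lower_case are assumptions, not taken from the repo.
python general_distill.py \
    --teacher_model bert-base-cased \
    --student_model student_config_dir \
    --pregenerated_data corpus_cased/ \
    --num_train_epochs 3 \
    --output_dir tinybert_general_cased/
```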

Thanks~
