models microsoft deberta large
Description: DeBERTa (Decoding-enhanced BERT with Disentangled Attention) improves on the BERT and RoBERTa models by using disentangled attention and an enhanced mask decoder. Trained on 80 GB of data, it outperforms BERT and RoBERTa on many Natural Language Understanding (NLU) tasks. Key results are reported for the SQuAD 1.1/2.0 and GLUE benchmark tasks when fine-tuned with the MNLI task. Details are available in the official repository and the related paper. If you find the model useful, cite the paper as described in its citation section.
Please Note: This model accepts masks in `[MASK]` format, as shown in the sample input.

> The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

### Inference samples

Inference type|Python sample (Notebook)|CLI with YAML
|--|--|--|
Real time|fill-mask-online-endpoint.ipynb|fill-mask-online-endpoint.sh
Batch|fill-mask-batch-endpoint.ipynb|coming soon

### Finetuning samples

Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML
|--|--|--|--|--|
Text Classification|Emotion Detection|Emotion|emotion-detection.ipynb|emotion-detection.sh
Token Classification|Named Entity Recognition|Conll2003|named-entity-recognition.ipynb|named-entity-recognition.sh
Question Answering|Extractive Q&A|SQUAD (Wikipedia)|extractive-qa.ipynb|coming soon

### Model Evaluation

Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML
|--|--|--|--|--|
Fill Mask|Fill Mask|rcds/wikipedia-for-mask-filling|evaluate-model-fill-mask.ipynb|evaluate-model-fill-mask.yml

### Sample inputs and outputs (for real-time inference)

#### Sample input

```json
{
  "inputs": {
    "input_string": ["Paris is the [MASK] of France.", "Today is a [MASK] day!"]
  }
}
```
#### Sample output

```json
[
  { "0": "capital" },
  { "0": "beautiful" }
]
```
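
The request and response shapes in the samples can be exercised locally without an endpoint. The following is a minimal sketch, not the service's client code: the helper names `build_payload` and `fill_masks` are hypothetical, and it assumes that the `"0"` key in each response object holds the predicted token for the input at the same position, which the samples suggest but the page does not state.

```python
import json

# Mask token as it appears in the sample input.
MASK_TOKEN = "[MASK]"


def build_payload(sentences):
    """Wrap sentences in the request shape shown in the sample input."""
    for sentence in sentences:
        if MASK_TOKEN not in sentence:
            raise ValueError(f"no {MASK_TOKEN} token in: {sentence!r}")
    return {"inputs": {"input_string": list(sentences)}}


def fill_masks(sentences, response):
    """Substitute each predicted token into its sentence.

    Assumption: response[i]["0"] is the prediction for sentences[i].
    """
    filled = []
    for sentence, prediction in zip(sentences, response):
        filled.append(sentence.replace(MASK_TOKEN, prediction["0"], 1))
    return filled


sentences = ["Paris is the [MASK] of France.", "Today is a [MASK] day!"]
payload = build_payload(sentences)
print(json.dumps(payload, indent=2))

# Response copied from the sample output; no endpoint is called here.
sample_response = [{"0": "capital"}, {"0": "beautiful"}]
for line in fill_masks(sentences, sample_response):
    print(line)
```

Validating the mask token before sending a request catches malformed inputs early; whether the deployed endpoint performs the same check is not documented here.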
Version: 12
Preview
computes_allow_list : ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4']
license : mit
model_specific_defaults : {'apply_deepspeed': 'true', 'apply_lora': 'true', 'apply_ort': 'true'}
task : fill-mask
View in Studio: https://ml.azure.com/registries/azureml/models/microsoft-deberta-large/version/12
License: mit
SHA: a97e054da5f34feed3d26951db4a25831dfcb486
datasets:
evaluation-min-sku-spec: 8|0|28|56
evaluation-recommended-sku: Standard_DS4_v2
finetune-min-sku-spec: 4|1|28|176
finetune-recommended-sku: Standard_NC24rs_v3
finetuning-tasks: text-classification, token-classification, question-answering
inference-min-sku-spec: 4|0|14|28
inference-recommended-sku: Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2
languages: en
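
The `*-min-sku-spec` values above are pipe-separated numbers whose meaning is not explained on this page. The sketch below parses them under the assumption that the four fields are CPU cores, GPUs, memory in GB, and storage in GB, in that order. That ordering is an assumption (it is consistent with the recommended SKUs listed alongside each spec, but it is not confirmed here), and `SkuSpec` and `parse_sku_spec` are hypothetical names.

```python
from typing import NamedTuple


class SkuSpec(NamedTuple):
    # Field order is an assumption; the page does not document it.
    cpu_cores: int
    gpus: int
    memory_gb: int
    storage_gb: int


def parse_sku_spec(spec: str) -> SkuSpec:
    """Parse a 'a|b|c|d' spec string into four integers."""
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {spec!r}")
    return SkuSpec(*(int(part) for part in parts))


# Values copied from the metadata above.
raw_specs = {
    "evaluation-min-sku-spec": "8|0|28|56",
    "finetune-min-sku-spec": "4|1|28|176",
    "inference-min-sku-spec": "4|0|14|28",
}

parsed = {name: parse_sku_spec(raw) for name, raw in raw_specs.items()}
for name, spec in parsed.items():
    print(name, spec)
```

Under this reading, only fine-tuning lists a nonzero second field, which would match the GPU SKU recommended for it.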