Skip to content

VisionEncoderDecoderModel ONNX Conversion - Swinv2-Xlm-roberta-base #2141

@Billybeast2003

Description

@Billybeast2003

System Info

I am using Google Colab

transformers version: 4.47.1

  • Platform: Linux-6.1.85+-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.27.0
  • Safetensors version: 0.4.5
  • Accelerate version: 1.2.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.5.1+cu121 (False)
  • Tensorflow version (GPU?): 2.17.1 (False)
  • Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
  • Jax version: 0.4.33
  • JaxLib version: 0.4.33

Who can help?

@sgugger

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

I want to convert my VisionEncoderDecoderModel to onnx which include swinv2 as the encoder and xlm-roberta-base as the decoder.

the command I used:
!optimum-cli export onnx --model /content/swin-xlm-image-recognition --task vision2seq-lm /content/swin-xlm-image-recognition-onnx --atol 1e-3
Error that I still got:
KeyError: "swinv2 is not supported yet for transformers. Only ['audio-spectrogram-transformer', 'albert', 'bart', 'beit', 'bert', 'blenderbot', 'blenderbot-small', 'bloom', 'camembert', 'clip', 'clip-vision-model', 'codegen', 'convbert', 'convnext', 'convnextv2', 'cvt', 'data2vec-text', 'data2vec-vision', 'data2vec-audio', 'deberta', 'deberta-v2', 'deit', 'detr', 'distilbert', 'donut', 'donut-swin', 'dpt', 'electra', 'encoder-decoder', 'esm', 'falcon', 'flaubert', 'gemma', 'glpn', 'gpt2', 'gpt-bigcode', 'gptj', 'gpt-neo', 'gpt-neox', 'groupvit', 'hubert', 'ibert', 'imagegpt', 'layoutlm', 'layoutlmv3', 'lilt', 'levit', 'longt5', 'marian', 'markuplm', 'mbart', 'mistral', 'mobilebert', 'mobilevit', 'mobilenet-v1', 'mobilenet-v2', 'mpnet', 'mpt', 'mt5', 'musicgen', 'm2m-100', 'nystromformer', 'owlv2', 'owlvit', 'opt', 'qwen2', 'llama', 'pegasus', 'perceiver', 'phi', 'phi3', 'pix2struct', 'poolformer', 'regnet', 'resnet', 'roberta', 'roformer', 'sam', 'segformer', 'sew', 'sew-d', 'speech-to-text', 'speecht5', 'splinter', 'squeezebert', 'swin', 'swin2sr', 't5', 'table-transformer', 'trocr', 'unispeech', 'unispeech-sat', 'vision-encoder-decoder', 'vit', 'vits', 'wavlm', 'wav2vec2', 'wav2vec2-conformer', 'whisper', 'xlm', 'xlm-roberta', 'yolos'] are supported for the library transformers. If you want to support swinv2 please propose a PR or open up an issue.

Is there any way to convert the model to onnx?

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingonnxRelated to the ONNX export

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions