
[experimental-webgpu] - Configuring Encoder/Decoder Precision with dtype for Local Models #50

@kostia-ilani

Description

Hello,

I’m using whisper-web (the experimental-webgpu branch) with local models (env.allowLocalModels = true and env.localModelPath = "./models"), and I’m having trouble setting distinct dtype values for encoder_model and decoder_model_merged with the small model.
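
For context, this is roughly how I configure the environment before creating the pipeline. The import path is an assumption on my side; depending on which Transformers.js build the branch bundles, it may be @xenova/transformers instead:

import { env, pipeline } from "@huggingface/transformers"; // assumption: package name may differ

// Load models from the local ./models folder instead of the Hugging Face Hub
env.allowLocalModels = true;
env.localModelPath = "./models";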

The error I see:

Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.

Is there a specific convention for the key names or values when setting dtype for the encoder/decoder precision levels (i.e., matching the model's ONNX file names)? This is the call I'm making:

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model", // local folder under env.localModelPath
  {
    dtype: {
      encoder_model: "fp32",        // full-precision encoder
      decoder_model_merged: "q4",   // 4-bit quantized merged decoder
    },
    device: "webgpu",
  }
);
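
My working assumption is that each dtype value selects a correspondingly suffixed ONNX file under onnx/ in the local model folder. The suffix table and helper below are only my guess at the convention, not something taken from the library source:

// Hypothetical sketch of the file-name convention I am assuming (unverified)
const DTYPE_SUFFIX = {
  fp32: "",         // e.g. onnx/encoder_model.onnx
  fp16: "_fp16",
  q8: "_quantized",
  q4: "_q4",        // e.g. onnx/decoder_model_merged_q4.onnx
};

function expectedOnnxFile(moduleName, dtype) {
  return `onnx/${moduleName}${DTYPE_SUFFIX[dtype]}.onnx`;
}

console.log(expectedOnnxFile("encoder_model", "fp32"));      // onnx/encoder_model.onnx
console.log(expectedOnnxFile("decoder_model_merged", "q4")); // onnx/decoder_model_merged_q4.onnx

If that convention is wrong, or if my exported files are named differently, that could explain the protobuf parsing error, since ONNX Runtime would end up parsing something that is not a valid ONNX protobuf (for example an HTML 404 response for a missing file).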
