
[experimental-webgpu] - Configuring Encoder/Decoder Precision with dtype for Local Models #50

@kostia-ilani

Description

Hello,

I’m using whisper-web (the experimental-webgpu branch) with local models (env.allowLocalModels = true and env.localModelPath = "./models"), and I’m having trouble setting distinct dtype values for encoder_model and decoder_model_merged with the small model.
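
For context, this is roughly how I configure the environment before creating the pipeline. The import path is an assumption on my side; depending on which Transformers.js build the branch bundles, it may be @xenova/transformers instead:

import { env, pipeline } from "@huggingface/transformers"; // assumption: package name may differ

// Load models from the local ./models folder instead of the Hugging Face Hub
env.allowLocalModels = true;
env.localModelPath = "./models";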

The error I see:

Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.

Is there a specific convention for the key names or values when setting dtype for the encoder/decoder precision levels (i.e., matching the model's ONNX file names)? This is the call I'm making:

const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model", // local folder under env.localModelPath
  {
    dtype: {
      encoder_model: "fp32",        // full-precision encoder
      decoder_model_merged: "q4",   // 4-bit quantized merged decoder
    },
    device: "webgpu",
  }
);
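
My working assumption is that each dtype value selects a correspondingly suffixed ONNX file under onnx/ in the local model folder. The suffix table and helper below are only my guess at the convention, not something taken from the library source:

// Hypothetical sketch of the file-name convention I am assuming (unverified)
const DTYPE_SUFFIX = {
  fp32: "",         // e.g. onnx/encoder_model.onnx
  fp16: "_fp16",
  q8: "_quantized",
  q4: "_q4",        // e.g. onnx/decoder_model_merged_q4.onnx
};

function expectedOnnxFile(moduleName, dtype) {
  return `onnx/${moduleName}${DTYPE_SUFFIX[dtype]}.onnx`;
}

console.log(expectedOnnxFile("encoder_model", "fp32"));      // onnx/encoder_model.onnx
console.log(expectedOnnxFile("decoder_model_merged", "q4")); // onnx/decoder_model_merged_q4.onnx

If that convention is wrong, or if my exported files are named differently, that could explain the protobuf parsing error, since ONNX Runtime would end up parsing something that is not a valid ONNX protobuf (for example an HTML 404 response for a missing file).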
