Hello,
I’m using whisper-web (experimental-webgpu branch) with local models
(env.allowLocalModels = true and env.localModelPath = "./models"), and I’m running into trouble setting distinct dtype values for encoder_model and decoder_model_merged with a small model. My setup is sketched below.
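Roughly, the local-model configuration looks like this (a minimal sketch; the import path assumes the v3 package name, and "my-whisper-model" is just the folder name I use under ./models):

import { env, pipeline } from "@huggingface/transformers";

// Resolve model files from the local ./models folder instead of the Hugging Face Hub
env.allowLocalModels = true;
env.localModelPath = "./models";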
The error I see:
Uncaught (in promise) Error: Can't create a session. ERROR_CODE: 7, ERROR_MESSAGE: Failed to load model because protobuf parsing failed.
Is there a specific convention for the key names or values when setting dtype for the encoder/decoder precision levels (i.e., matching the model's ONNX file names)? This is my current call:
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "my-whisper-model",
  {
    dtype: {
      encoder_model: "fp32",        // full-precision encoder
      decoder_model_merged: "q4"    // 4-bit quantized decoder
    },
    device: "webgpu"
  }
);
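For reference, this is the layout I assumed those dtype values would resolve to under env.localModelPath; the suffix convention (no suffix for "fp32", a _q4 suffix for "q4") is my assumption based on how transformers.js names its exported ONNX files, so please correct me if that is wrong:

./models/my-whisper-model/
  config.json                      (plus tokenizer/preprocessor configs)
  onnx/
    encoder_model.onnx             <- what I expect "fp32" to load
    decoder_model_merged_q4.onnx   <- what I expect "q4" to load

Is that the right mapping, or does "fp32" require an explicit suffix on the encoder file?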