uform3 not performing as well as older onnx model (webgpu) #107

@mattdesl

I'm currently testing with Chrome v137.0.7151.69 on macOS 13.0, using [email protected]. I can't seem to run the new v3 model with the WebGPU ONNX execution provider (EP) without a huge performance degradation compared to the older models.

Code:
https://gist.github.com/mattdesl/30bc5de23eb6edfd7362d91d43170922

(change the "provider" and "model" variables in main.js)

I'm testing three models:

{
  // The "new" v3 model
  v3: {
    image_encoder: "uform3-image-text-english-small/image_encoder.onnx",
    text_encoder: "uform3-image-text-english-small/text_encoder.onnx",
  },
  // The "old" models ...
  fp16: {
    text_encoder: "uform-vl-english-small-gpu-fp16/text_encoder.onnx",
    image_encoder: "uform-vl-english-small-gpu-fp16/image_encoder.onnx",
  },
  fp32: {
    text_encoder: "uform-vl-english-small-cpu-fp32/text_encoder.onnx",
    image_encoder: "uform-vl-english-small-cpu-fp32/image_encoder.onnx",
  },
}

Using webgpu backend, testing only image encoding / inference time:
v3 ~7000 ms
fp16 ~800 ms
fp32 ~750 ms

The v3 model seems to produce inaccurate/incorrect cosine similarity in webgpu mode.
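
One way to quantify the mismatch would be to compare the embedding of the same image produced by two backends. This is a minimal sketch, assuming both embeddings are available as plain arrays or `Float32Array`s of equal length. `cosineSimilarity` is a hypothetical helper written for illustration, not code from the gist.

```javascript
// Hypothetical helper (not from the gist): cosine similarity between two
// embedding vectors given as plain arrays or typed arrays.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Embedding lengths differ: " + a.length + " vs " + b.length);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; report 0 instead of dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}
```

If the cpu and webgpu embeddings of the same input give a value well below 1, that would point to a numerical problem in the webgpu path rather than in the pre- or post-processing.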

Using cpu backend:
v3 ~6500 ms
fp16 N/A
fp32 ~7000 ms
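
To check whether the timings above include a one-time setup cost, the measurement could discard a few warm-up runs and report the median of the rest. This is a rough sketch under that assumption. `timeInference` and `runOnce` are hypothetical names, and `runOnce` stands for whatever async function performs a single image encoding.

```javascript
// Hypothetical helper (not from the gist): time an async function after
// discarding warm-up runs, and report summary statistics in milliseconds.
async function timeInference(runOnce, { warmup = 1, runs = 5 } = {}) {
  // Warm-up runs are executed but not measured.
  for (let i = 0; i < warmup; i++) {
    await runOnce();
  }
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await runOnce();
    samples.push(performance.now() - start);
  }
  // Sort a copy so `samples` keeps the original run order.
  const sorted = [...samples].sort((x, y) => x - y);
  return {
    median: sorted[Math.floor(sorted.length / 2)],
    min: sorted[0],
    max: sorted[sorted.length - 1],
    samples,
  };
}
```

Usage would look like `const stats = await timeInference(() => encodeImage(input))`, where `encodeImage` is a placeholder for the actual inference call.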

I'm hoping I've just done something wrong that causes the v3 model on webgpu to both produce incorrect results and run very slowly.
