
Not able to convert OpenAI Hugging Face TF Whisper model to int8 model #394

Open
pkgoogle opened this issue Nov 27, 2024 · 8 comments
Assignees
Labels
status:awaiting ai-edge-developer type:feature For feature requests type:quantization For issues related to quantization

Comments

@pkgoogle
Contributor

Description of the bug:

Original Issue: tensorflow/tensorflow#58451
Opening on behalf of @nyadla-sys

1. System information

  • OS: Linux Ubuntu 16.04

2. Code

Provide code to help us reproduce your issues using one of the following options:

Option A: Reference colab notebooks

Reference [TensorFlow Lite Model Colab]

Option B: Paste your code here or provide a link to a custom end-to-end colab

https://colab.research.google.com/drive/1rApSDy3KMoMMaK3SIQwvu21yPas2VFjx?usp=sharing

3. Failure after conversion

  • Model produces correct results with the hybrid model.
  • Colab session crashes with the int8 model.

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

@gaikwadrahul8

This issue, originally reported by @nyadla-sys, has been moved to this dedicated repository for ai-edge-torch to enhance issue tracking and prioritization. To ensure continuity, we have created this new issue on your behalf.

We appreciate your understanding and look forward to your continued involvement.

@pkgoogle
Contributor Author

pkgoogle commented Dec 18, 2024

Hi @nyadla-sys, I was able to accomplish this with this library as follows:

import torch
import whisper

import ai_edge_torch
import tensorflow as tf


model = whisper.load_model("turbo")

# Whisper "turbo" expects a (1, 128, 3000) log-mel spectrogram and up to 448 decoder tokens.
mel_shape = (1, 128, 3000)
tokens_shape = (1, 448)

sample_input = (torch.randn(mel_shape), torch.randint(low=0, high=51865, size=tokens_shape))

# Random calibration data for post-training int8 quantization.
def representative_data_gen():
  for _ in range(100):
    yield [torch.randn(mel_shape).numpy(), torch.randint(0, 51865, tokens_shape).numpy()]

tfl_converter_flags = {
  'optimizations': [tf.lite.Optimize.DEFAULT],
  'representative_dataset': representative_data_gen,
  'target_spec.supported_ops': [tf.lite.OpsSet.TFLITE_BUILTINS_INT8],
  'inference_input_type': tf.uint8,
  'inference_output_type': tf.uint8,
}

edge_model = ai_edge_torch.convert(model.eval(), sample_input, _ai_edge_converter_flags=tfl_converter_flags)
edge_model.export("whisper.tflite")

You may wish to sample your real dataset to produce the representative dataset. Let me know if that works for you. I should note this also took me about 6 hours on 64 cores, so you may need to wait a while (or use a smaller model).
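To make the "sample your real dataset" suggestion concrete, here is a minimal sketch of the shape/dtype contract such a generator has to satisfy for the conversion script above. The mel arrays here are random NumPy placeholders; in practice each one would be a log-mel spectrogram computed from a real audio clip. The shapes and the 51865-token vocabulary bound are taken from the script above; everything else is an assumption for illustration.

```python
import numpy as np

MEL_SHAPE = (1, 128, 3000)   # (batch, n_mels, n_frames) for the "turbo" model
TOKENS_SHAPE = (1, 448)      # decoder token ids
VOCAB_SIZE = 51865

# Placeholder: in practice these would be log-mel spectrograms computed from
# real audio (e.g. with whisper's log_mel_spectrogram), not random noise.
precomputed_mels = [np.random.randn(*MEL_SHAPE).astype(np.float32) for _ in range(5)]

def representative_data_gen():
    """Yield [mel, tokens] pairs matching the converter's two expected inputs."""
    for mel in precomputed_mels:
        tokens = np.random.randint(0, VOCAB_SIZE, size=TOKENS_SHAPE).astype(np.int64)
        yield [mel, tokens]

sample = next(representative_data_gen())
```

The key point is that each yielded list must have one entry per traced model input, in order, with the same shapes and dtypes as `sample_input`.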

@pkgoogle pkgoogle added status:awaiting user response When awaiting user response type:support For use-related issues labels Dec 18, 2024
@pkgoogle pkgoogle added the type:quantization For issues related to quantization label Dec 18, 2024
@nyadla-sys

For the tiny model, use the script below. It is still at the experiment stage; I will update as soon as I have results.

!pip install git+https://github.com/google-ai-edge/ai-edge-torch.git
!pip install git+https://github.com/openai/whisper.git
import torch
import whisper

import ai_edge_torch
import tensorflow as tf


model = whisper.load_model("tiny.en")

# "tiny.en" uses an 80-bin log-mel spectrogram and up to 448 decoder tokens.
mel_shape = (1, 80, 3000)
tokens_shape = (1, 448)

sample_input = (torch.randn(mel_shape), torch.randint(low=0, high=51865, size=tokens_shape))

# Random calibration data; see below for a version based on real audio.
def representative_data_gen():
  for _ in range(100):
    yield [torch.randn(mel_shape).numpy(), torch.randint(0, 51865, tokens_shape).numpy()]

tfl_converter_flags = {
  'optimizations': [tf.lite.Optimize.DEFAULT],
  'representative_dataset': representative_data_gen,
  'target_spec.supported_ops': [tf.lite.OpsSet.TFLITE_BUILTINS_INT8],
  'inference_input_type': tf.uint8,
  'inference_output_type': tf.uint8,
}

edge_model = ai_edge_torch.convert(model.eval(), sample_input, _ai_edge_converter_flags=tfl_converter_flags)
edge_model.export("whisper.tflite")

Please replace representative_data_gen() with the implementation below:

from whisper.audio import N_FRAMES, log_mel_spectrogram, pad_or_trim

def representative_dataset():
    # Change the range to 100 and provide 100 different audio files from a
    # known dataset such as LibriSpeech.
    for _ in range(1):
      mel_from_file = log_mel_spectrogram('/content/whisper/tests/jfk.flac')
      segment = pad_or_trim(mel_from_file, N_FRAMES)
      segment = tf.expand_dims(segment, 0)
      print(segment.shape)
      yield [segment]
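One thing to watch out for: the snippet above yields only the mel segment, but the model was traced with two inputs (mel and tokens), so the representative generator likely needs to yield both or the converter will reject the calibration data. A hedged sketch of a two-input version follows; `load_mel` is a hypothetical placeholder standing in for the `log_mel_spectrogram` + `pad_or_trim` pipeline (random data here so the sketch runs without audio files), and the token ids are random as in the scripts above.

```python
import numpy as np

N_MELS, N_FRAMES = 80, 3000   # "tiny.en" mel dimensions from the script above
TOKENS_SHAPE = (1, 448)
VOCAB_SIZE = 51865

def load_mel(path):
    """Placeholder for log_mel_spectrogram(path) + pad_or_trim(..., N_FRAMES).

    Returns random data so the sketch runs without audio files; in practice
    this would compute a real log-mel segment from the file at `path`.
    """
    return np.random.randn(1, N_MELS, N_FRAMES).astype(np.float32)

def representative_dataset(paths):
    """Yield one [mel, tokens] pair per audio file path."""
    for path in paths:
        mel = load_mel(path)
        tokens = np.random.randint(0, VOCAB_SIZE, size=TOKENS_SHAPE).astype(np.int64)
        yield [mel, tokens]

batch = next(representative_dataset(['jfk.flac']))
```

With real audio, `paths` would list the ~100 calibration clips, and ideally the token ids would come from actual transcripts rather than random draws.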

@nyadla-sys

nyadla-sys commented Dec 18, 2024

@pkgoogle
While running inference with the newly generated INT8 model using the script mentioned above, I encountered the following error:

./minimal /home/nyadla/whisper.tflite/whisper.tflite ../samples/jfk.wav
n_vocab:50256
mel.n_len:3000
mel.n_mel:80
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: /home/nyadla/whisper.tflite/whisper_native/tensorflow_src/tensorflow/lite/kernels/embedding_lookup.cc:77 output->type == kTfLiteFloat32 was not true.
ERROR: Node number 242 (EMBEDDING_LOOKUP) failed to prepare.
ERROR: Failed to apply the default TensorFlow Lite delegate indexed at 0.
Error at /home/nyadla/whisper.tflite/whisper_native/tensorflow_src/tensorflow/lite/examples/minimal/minimal.cc:176

@nyadla-sys

@pkgoogle Can you please share the working Whisper TFLite model (the int8 model)?

@pkgoogle
Contributor Author

Hi @nyadla-sys, it's too large (634 MB) even when compressed. Can you share your exact .wav file? I think I can reproduce it on my end.

@pkgoogle
Contributor Author

pkgoogle commented Dec 19, 2024

Thanks @nyadla-sys, I was actually able to reproduce it just by running the minimal program:

./minimal whisper.tflite
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
ERROR: xxxxxxxx/git/tensorflow/tensorflow/lite/kernels/embedding_lookup.cc:77 output->type == kTfLiteFloat32 was not true.
ERROR: Node number 1698 (EMBEDDING_LOOKUP) failed to prepare.
ERROR: Failed to apply the default TensorFlow Lite delegate indexed at 0.
Error at xxxxxxxx/git/tensorflow/tensorflow/lite/examples/minimal/minimal.cc:62

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/kernels/embedding_lookup.cc#L77

It seems this op only supports float32 output for now.
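Given that `EMBEDDING_LOOKUP` currently only prepares with a float32 output, one possible workaround (untested here, and only a sketch) is to stop forcing uint8 model I/O and let the inputs and outputs default to float32, while still quantizing weights and internal activations to int8 via the representative dataset. The flags below are the same as in the scripts above minus the two uint8 entries; the helper name is hypothetical.

```python
import tensorflow as tf

def make_int8_flags(representative_data_gen):
    """Converter flags for int8 quantization with float32 model I/O.

    With float32 inputs/outputs, EMBEDDING_LOOKUP can keep its float32
    output while weights and internal activations are still int8.
    """
    return {
        'optimizations': [tf.lite.Optimize.DEFAULT],
        'representative_dataset': representative_data_gen,
        'target_spec.supported_ops': [tf.lite.OpsSet.TFLITE_BUILTINS_INT8],
        # 'inference_input_type' / 'inference_output_type' omitted:
        # model inputs and outputs then default to float32.
    }
```

Whether this avoids the prepare failure depends on whether the converter leaves the embedding lookup's output in float32 under these flags, which would need to be verified by re-running the conversion.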

@pkgoogle pkgoogle added type:bug Bug status:awaiting ai-edge-developer type:feature For feature requests and removed status:awaiting user response When awaiting user response type:support For use-related issues type:bug Bug labels Dec 19, 2024