Description
Observed behaviour
When converting this model...
```python
import tensorflow as tf
import larq as lq
import larq_compute_engine as lce

model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False,
    ),
    tf.keras.layers.Conv2D(32, (3, 3)),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))
```
...we obtain the following converted model, with extra dequantise and quantise nodes around the `Conv2D`:

[graph of the converted model, with Quantize and Dequantize nodes around the Conv2D]
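For anyone reproducing this without a graph viewer, the op listing can also be dumped in Python with TensorFlow's model analyzer (available from TF 2.7); a minimal sketch, assuming the analyzer's printed op names include `QUANTIZE`/`DEQUANTIZE`:

```python
import tensorflow as tf

# `converted_model` is the flatbuffer returned by lce.convert_keras_model above.
# The analyzer prints every op in the graph, so the unexpected QUANTIZE /
# DEQUANTIZE ops around the CONV_2D show up directly in the output.
tf.lite.experimental.Analyzer.analyze(model_content=converted_model)
```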
Expected behaviour
We expect there to be no dequantise or quantise nodes in a converted model when the `experimental_default_int8_range` argument is used.
If the `QuantConv2D` is replaced by a normal `Conv2D`, we get a conversion with no such nodes:
```python
model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    tf.keras.layers.Conv2D(
        32, (3, 3), padding="same", use_bias=False
    ),
    tf.keras.layers.Conv2D(32, (3, 3)),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))
```
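The expectation above can be turned into a programmatic check by capturing the analyzer's printed output and searching it for dequantise ops. A hedged sketch; `contains_dequantize` is a hypothetical helper name, and it assumes `Analyzer.analyze` prints to Python's stdout:

```python
import io
from contextlib import redirect_stdout

import tensorflow as tf

def contains_dequantize(flatbuffer_bytes):
    # Hypothetical helper: capture the analyzer's printed op listing
    # and search it for DEQUANTIZE ops.
    buf = io.StringIO()
    with redirect_stdout(buf):
        tf.lite.experimental.Analyzer.analyze(model_content=flatbuffer_bytes)
    return "DEQUANTIZE" in buf.getvalue()

# Passes for this Conv2D-only model, which converts cleanly.
assert not contains_dequantize(converted_model)
```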
Similarly, if the `Conv2D` is replaced with a `QuantConv2D`, we again get a clean conversion:
```python
model = tf.keras.models.Sequential([
    tf.keras.Input((32, 32, 3)),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False,
    ),
    lq.layers.QuantConv2D(
        32,
        (3, 3),
        input_quantizer="ste_sign",
        kernel_quantizer="ste_sign",
        padding="same",
        pad_values=1.0,
        use_bias=False,
    ),
])
converted_model = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))
```
So there is something specifically going wrong with the `QuantConv2D` > `Conv2D` combination.
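Putting the pieces together, a small sweep over the layer combinations makes the pattern explicit. This is a hedged sketch building on the `contains_dequantize` helper defined above; `make_conv` is a hypothetical factory, not part of larq or LCE:

```python
import tensorflow as tf
import larq as lq
import larq_compute_engine as lce

def make_conv(binary):
    # Hypothetical factory: a binarised QuantConv2D or a plain float Conv2D.
    if binary:
        return lq.layers.QuantConv2D(
            32, (3, 3),
            input_quantizer="ste_sign", kernel_quantizer="ste_sign",
            padding="same", pad_values=1.0, use_bias=False,
        )
    return tf.keras.layers.Conv2D(32, (3, 3), padding="same", use_bias=False)

# Report which two-layer combinations leave DEQUANTIZE ops in the flatbuffer.
for first, second in [(True, True), (True, False), (False, True), (False, False)]:
    model = tf.keras.models.Sequential(
        [tf.keras.Input((32, 32, 3)), make_conv(first), make_conv(second)]
    )
    flatbuffer = lce.convert_keras_model(model, experimental_default_int8_range=(-3, 3))
    print(f"binary={first}>{second}: dequantize={contains_dequantize(flatbuffer)}")
```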