Description
I have two models: a baseline Keras model and its equivalent QKeras model. Both are taken from QKerasTutorial.ipynb.
My Keras model is shown below:
Model: "model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 28, 28, 1)] 0
conv2d_1 (Conv2D) (None, 26, 26, 18) 180
act_1 (Activation) (None, 26, 26, 18) 0
conv2d_2 (Conv2D) (None, 24, 24, 32) 5216
act_2 (Activation) (None, 24, 24, 32) 0
flatten (Flatten) (None, 18432) 0
dense (Dense) (None, 10) 184330
softmax (Activation) (None, 10) 0
=================================================================
Total params: 189,726
Trainable params: 189,726
Non-trainable params: 0
_________________________________________________________________
My equivalent QKeras model is:
Model: "model_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 28, 28, 1)] 0
conv2d_1 (QConv2D) (None, 26, 26, 18) 180
act_1 (QActivation) (None, 26, 26, 18) 0
conv2d_2 (QConv2D) (None, 24, 24, 32) 5216
act_2 (QActivation) (None, 24, 24, 32) 0
flatten (Flatten) (None, 18432) 0
dense (QDense) (None, 10) 184330
softmax (Activation) (None, 10) 0
=================================================================
Total params: 189,726
Trainable params: 189,726
Non-trainable params: 0
_________________________________________________________________
I save the QKeras model to comparison_models/qkeras_model.h5 with:
qmodel.save('comparison_models/qkeras_model.h5')
I am not able to see any difference in model size: both saved models have the same size, even though the weights of the QKeras model are quantised when I inspect each layer's weights.

My questions are:

1. Where can we observe the actual difference between the two models, and which metrics capture it?
2. Quantised models are usually expected to run inference faster than non-quantised Keras models, but I observed slower training and slower inference for the QKeras model than for the Keras model. Is this expected?
3. What are the key metrics that show the exact differences between the baseline Keras model and the quantised QKeras model at the software level? We may see a difference in inference time and model size once the models are ported to hardware (FPGA), but how can we see the difference during software simulation, given that the model size stays the same and the inference time does not give a clear picture?

Thanks in advance.
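To make the model-size question concrete, here is a minimal sketch of one software-level metric: the storage the weights would need at their quantised bit widths, compared with 32-bit floats. The parameter counts are copied from the summaries above. The bit widths in `ASSUMED_BITS` are hypothetical placeholders, not the tutorial's actual quantizer settings, and the sketch assumes every parameter in a layer (kernel and bias) shares one bit width.

```python
# Parameter counts per layer, copied from the model summaries above.
PARAMS = {"conv2d_1": 180, "conv2d_2": 5216, "dense": 184330}

# Hypothetical bit widths per layer (an assumption for illustration only).
# Replace these with the bit widths of the quantizers actually used.
ASSUMED_BITS = {"conv2d_1": 4, "conv2d_2": 4, "dense": 4}


def size_in_bytes(params, bits_per_layer):
    """Return the storage needed for all parameters at the given bit widths."""
    total_bits = sum(count * bits_per_layer[name] for name, count in params.items())
    return total_bits / 8


# Size if every parameter is stored as a 32-bit float.
float_bytes = size_in_bytes(PARAMS, {name: 32 for name in PARAMS})

# Size if every parameter is stored at its assumed quantised bit width.
quant_bytes = size_in_bytes(PARAMS, ASSUMED_BITS)

print(f"float32 weights:   {float_bytes:.0f} bytes")
print(f"quantised weights: {quant_bytes:.0f} bytes")
print(f"reduction factor:  {float_bytes / quant_bytes:.1f}x")
```

This estimate is independent of the size of the .h5 file on disk, so it can differ between the two models even when the saved files are the same size.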