Difference between Qkeras model and Keras model #116

Open
@sandeep1404

I have two models: a baseline Keras model and its equivalent QKeras model. Both are taken from QKerasTutorial.ipynb.
My Keras model is shown below:

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_1 (Conv2D)           (None, 26, 26, 18)        180       
                                                                 
 act_1 (Activation)          (None, 26, 26, 18)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 24, 24, 32)        5216      
                                                                 
 act_2 (Activation)          (None, 24, 24, 32)        0         
                                                                 
 flatten (Flatten)           (None, 18432)             0         
                                                                 
 dense (Dense)               (None, 10)                184330    
                                                                 
 softmax (Activation)        (None, 10)                0         
                                                                 
=================================================================
Total params: 189,726
Trainable params: 189,726
Non-trainable params: 0
_________________________________________________________________

My equivalent QKeras model is:

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 28, 28, 1)]       0         
                                                                 
 conv2d_1 (QConv2D)          (None, 26, 26, 18)        180       
                                                                 
 act_1 (QActivation)         (None, 26, 26, 18)        0         
                                                                 
 conv2d_2 (QConv2D)          (None, 24, 24, 32)        5216      
                                                                 
 act_2 (QActivation)         (None, 24, 24, 32)        0         
                                                                 
 flatten (Flatten)           (None, 18432)             0         
                                                                 
 dense (QDense)              (None, 10)                184330    
                                                                 
 softmax (Activation)        (None, 10)                0         
                                                                 
=================================================================
Total params: 189,726
Trainable params: 189,726
Non-trainable params: 0
_________________________________________________________________
I save the QKeras model to comparison_models/qkeras_model.h5:
qmodel.save('comparison_models/qkeras_model.h5')

I cannot see any difference in model size: both saved models have the same size, even though the weights are quantized in the QKeras model when I inspect each layer's weights. Where can the actual difference between the two models be observed, and which metrics capture it?

Quantized models are usually expected to run inference faster than non-quantized Keras models, but I observed slower training and slower inference for the QKeras model than for the Keras model. Is this expected?

What are the key metrics that show the differences between the baseline Keras model and the quantized QKeras model at the software level? We may see a difference in inference time and model size once the models are ported to hardware (FPGA), but how can the difference be seen during software simulation, given that the model size stays the same and the inference time does not give a clear picture? Thanks in advance.
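
To make the size question concrete, here is a minimal sketch of the arithmetic, not a QKeras API. It assumes the saved .h5 file stores every parameter as a 32-bit float regardless of the quantizer, which would explain the identical file sizes, and it compares that with the storage needed if the same parameters were packed at a lower bit width. The parameter counts come from the summaries above; the 4-bit width, the helper names, and the example weights are hypothetical.

```python
# Parameter counts copied from the two model summaries above.
PARAM_COUNTS = {"conv2d_1": 180, "conv2d_2": 5216, "dense": 184330}

FLOAT32_BITS = 32
ASSUMED_QUANT_BITS = 4  # hypothetical bit width, not taken from the post


def packed_size_bytes(param_counts, bits_per_weight):
    """Bytes needed if every parameter were packed at the given bit width."""
    total_bits = sum(param_counts.values()) * bits_per_weight
    return (total_bits + 7) // 8  # round up to whole bytes


def count_distinct_levels(weights, ndigits=6):
    """Number of distinct values in a flat list of weights."""
    return len({round(w, ndigits) for w in weights})


float32_bytes = packed_size_bytes(PARAM_COUNTS, FLOAT32_BITS)
quantized_bytes = packed_size_bytes(PARAM_COUNTS, ASSUMED_QUANT_BITS)

print(f"float32 storage:      {float32_bytes} bytes")
print(f"packed {ASSUMED_QUANT_BITS}-bit storage: {quantized_bytes} bytes")

# A tensor quantized to b bits can hold at most 2**b distinct values,
# so counting distinct values is one software-level check of quantization.
example_weights = [-0.5, -0.25, 0.0, 0.25, 0.25, -0.5, 0.0]  # made-up values
print(count_distinct_levels(example_weights), "distinct levels")
```

With a real model, the flat list passed to `count_distinct_levels` could be built from the arrays returned by each layer's `get_weights()`. The packed size is only an estimate of what a hardware target might store. It says nothing about the .h5 file on disk.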
