Description
Hello, I am just curious: I adapted test.py to do real-time inference on a webcam (and made the exact same modifications to https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix), and set both to run inference on CPU only, since my goal is to run the model on a mobile phone.
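
For reference, this is a minimal sketch of the kind of webcam loop I mean. It assumes the generator has already been constructed and its weights loaded the way test.py does; `load_generator` is a hypothetical placeholder for that step:

```python
import time

import cv2
import torch
from torchvision import transforms

netG = load_generator()  # hypothetical: build the net and load weights as test.py does
netG.eval()

# Same normalization convention as the pix2pix/CycleGAN code: inputs in [-1, 1]
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # 256x256 matches the edges2shoes resolution; convert BGR (OpenCV) to RGB
    frame = cv2.cvtColor(cv2.resize(frame, (256, 256)), cv2.COLOR_BGR2RGB)
    x = preprocess(frame).unsqueeze(0)  # stays on CPU; no .cuda() anywhere

    start = time.perf_counter()
    with torch.no_grad():  # inference only, no autograd overhead
        y = netG(x)
    print(f'{1.0 / (time.perf_counter() - start):.1f} FPS (forward pass only)')
```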
While the memory usage of the compressed model is much smaller (only 5MB vs. 200MB), inference actually seems to be slower on the compressed model: I get 4 FPS with test_compressed.sh for edges2shoes_r, and 8 FPS with the original pytorch-CycleGAN-and-pix2pix code. Is this normal? I am sure the modifications I made to the inference code are identical in both cases, and the gap looks like actual model latency rather than pipeline overhead.
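
By "actual model latency" I mean the forward pass timed in isolation from the webcam capture and preprocessing, roughly like the sketch below (`netG_full` and `netG_compressed` are placeholders for the two loaded generators):

```python
import time
import torch

def bench(net, size=256, warmup=5, iters=50):
    """Time only the generator forward pass on CPU with a dummy input."""
    x = torch.randn(1, 3, size, size)
    net.eval()
    with torch.no_grad():
        for _ in range(warmup):   # warm-up runs so lazy init isn't measured
            net(x)
        start = time.perf_counter()
        for _ in range(iters):
            net(x)
        elapsed = time.perf_counter() - start
    return iters / elapsed        # frames per second

# print(f'original:   {bench(netG_full):.1f} FPS')
# print(f'compressed: {bench(netG_compressed):.1f} FPS')
```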
Going from 200MB to 5MB of RAM usage is a dramatic decrease, but is there also supposed to be a dramatic speedup?