# CelebA 64x64 Benchmark

## Basic setup

This benchmark does not aim to achieve the best possible performance, but to provide insight into the behavior of different quantization methods. Therefore, the same basic setup is used for all experiments (a minimal sketch of this setup follows the list):

- The network architecture is SimpleCNN.
- Hyperparameters:
  - Batch size: 256
  - Learning rate: 4e-4
  - Optimizer: Adam
  - Training steps: 500k
- Results are evaluated on the CelebA test split, which contains 19,962 images.
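
As a rough illustration, the setup above corresponds to something like the following sketch. The module below is only a placeholder for the repository's SimpleCNN autoencoder, and the training loop, losses, and data pipeline are omitted:

```python
import torch
import torch.nn as nn

# Placeholder autoencoder standing in for the repository's SimpleCNN;
# the real architecture, losses, and data pipeline are omitted here.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)

optimizer = torch.optim.Adam(model.parameters(), lr=4e-4)
batch_size = 256
total_steps = 500_000
```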

## VQVAE

**Effect of codebook dimension:**

| Codebook dim. | Codebook size | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|
| 4 | 512 | 100.00% | 32.2119 | 0.9517 | 0.0239 | 16.3249 |
| 8 | 512 | 100.00% | 32.2406 | 0.9520 | 0.0228 | 16.6592 |
| 16 | 512 | 68.75% | 31.6909 | 0.9473 | 0.0263 | 16.4272 |
| 32 | 512 | 66.41% | 31.7674 | 0.9480 | 0.0261 | 16.3970 |
| 64 | 512 | 56.45% | 31.5487 | 0.9453 | 0.0275 | 16.8227 |

- A smaller codebook dimension leads to higher codebook usage.
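
For reference, here is a minimal sketch of the standard VQ-VAE quantization step (nearest-neighbour lookup with a straight-through estimator). The shapes and names are illustrative, not the repository's exact implementation:

```python
import torch

def vq_quantize(z, codebook):
    """Nearest-neighbour vector quantization with a straight-through estimator.

    z:        (N, D) flattened encoder outputs, D = codebook dimension
    codebook: (K, D) embedding table, K = codebook size
    """
    dist = torch.cdist(z, codebook)      # (N, K) pairwise Euclidean distances
    indices = dist.argmin(dim=1)         # nearest code per vector
    z_q = codebook[indices]              # (N, D) quantized vectors
    # Straight-through estimator: the decoder sees z_q, gradients flow to z.
    z_q = z + (z_q - z).detach()
    return z_q, indices

# Example matching the first row above: codebook dim 4, codebook size 512.
codebook = torch.randn(512, 4)
z = torch.randn(1024, 4)
z_q, idx = vq_quantize(z, codebook)
codebook_usage = idx.unique().numel() / codebook.shape[0]
```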

**Effect of codebook size:**

| Codebook dim. | Codebook size | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|
| 64 | 512 | 56.45% | 31.5487 | 0.9453 | 0.0275 | 16.8227 |
| 64 | 1024 | 30.18% | 31.3836 | 0.9459 | 0.0272 | 16.4965 |
| 64 | 2048 | 16.06% | 31.6631 | 0.9470 | 0.0264 | 16.5808 |

- With low codebook usage, increasing the codebook size does not improve reconstruction quality.

**Effect of l2-norm codes:**

| Codebook dim. | Codebook size | l2-norm codes | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|---|
| 4 | 512 | No | 100.00% | 32.2119 | 0.9517 | 0.0239 | 16.3249 |
| 4 | 512 | Yes | 100.00% | 32.2439 | 0.9473 | | 16.4495 |
| 64 | 512 | No | 56.45% | 31.5487 | 0.9453 | 0.0275 | 16.8227 |
| 64 | 512 | Yes | 98.24% | 31.3334 | 0.9492 | 0.0209 | 12.9127 |

- l2-normalized codes improve codebook usage even when the codebook dimension is large.
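
A sketch of this variant (in the spirit of ViT-VQGAN-style quantizers, not necessarily this repository's exact code): both the encoder outputs and the codebook entries are projected onto the unit sphere before matching, so the assignment depends only on direction:

```python
import torch
import torch.nn.functional as F

def vq_quantize_l2norm(z, codebook):
    """VQ lookup with l2-normalized codes: encoder outputs and codebook
    entries are normalized before matching, so the nearest-neighbour
    assignment is equivalent to maximizing cosine similarity."""
    z_n = F.normalize(z, dim=-1)
    cb_n = F.normalize(codebook, dim=-1)
    dist = torch.cdist(z_n, cb_n)
    indices = dist.argmin(dim=1)
    z_q = cb_n[indices]
    # Straight-through estimator on the normalized latents.
    return z_n + (z_q - z_n).detach(), indices
```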

**Effect of EMA update:**

| Codebook dim. | Codebook size | Codebook update | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|---|
| 4 | 512 | VQ loss | 100.00% | 32.2119 | 0.9517 | 0.0239 | 16.3249 |
| 4 | 512 | EMA | 100.00% | 32.3070 | 0.9528 | 0.0224 | 16.3338 |
| 64 | 512 | VQ loss | 56.45% | 31.5487 | 0.9453 | 0.0275 | 16.8227 |
| 64 | 512 | EMA | 100.00% | 32.0709 | 0.9516 | 0.0228 | 15.5629 |

- Updating the codebook with an exponential moving average (EMA) improves codebook usage even when the codebook dimension is large.
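
A sketch of the EMA codebook update (following the standard VQ-VAE EMA variant; `cluster_size` and `embed_sum` are assumed to be persistent buffers, and the details may differ from this repository's implementation):

```python
import torch

@torch.no_grad()
def ema_codebook_update(codebook, cluster_size, embed_sum, z, indices,
                        decay=0.99, eps=1e-5):
    """EMA codebook update sketch.

    codebook:     (K, D) code vectors (buffers, not trained by gradients)
    cluster_size: (K,)   EMA of how many vectors were assigned to each code
    embed_sum:    (K, D) EMA of the sum of vectors assigned to each code
    z:            (N, D) encoder outputs of the current batch
    indices:      (N,)   nearest-code indices for the current batch
    """
    K, D = codebook.shape
    onehot = torch.zeros(z.shape[0], K, device=z.device, dtype=z.dtype)
    onehot.scatter_(1, indices.unsqueeze(1), 1.0)

    # Per-code statistics from the current batch.
    batch_size = onehot.sum(dim=0)     # (K,)
    batch_sum = onehot.t() @ z         # (K, D)

    # Exponential moving averages of the statistics.
    cluster_size.mul_(decay).add_(batch_size, alpha=1 - decay)
    embed_sum.mul_(decay).add_(batch_sum, alpha=1 - decay)

    # Laplace smoothing so rarely used codes do not divide by zero.
    n = cluster_size.sum()
    smoothed = (cluster_size + eps) / (n + K * eps) * n
    codebook.copy_(embed_sum / smoothed.unsqueeze(1))
```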

**Effect of entropy regularization:**

| Codebook dim. | Codebook size | Entropy reg. weight | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|---|
| 64 | 512 | 0.0 | 56.45% | 31.5487 | 0.9453 | 0.0275 | 16.8227 |
| 64 | 512 | 0.1 | 100.00% | 29.5755 | 0.9277 | 0.0422 | 14.1500 |

- Entropy regularization improves codebook usage, but it can hurt reconstruction quality.
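
One common formulation of this regularizer (a sketch only, not necessarily the exact loss used here): keep each soft assignment confident while pushing the average assignment toward uniform usage of the codebook:

```python
import torch
import torch.nn.functional as F

def entropy_regularization(logits):
    """Entropy regularizer over soft codebook assignments.

    logits: (N, K) affinities between encoder outputs and codes
            (e.g. negative squared distances).
    Minimizing the returned value keeps each per-sample assignment confident
    (low per-sample entropy) while spreading the *average* assignment over
    the whole codebook (high entropy of the mean distribution).
    """
    probs = F.softmax(logits, dim=-1)
    log_probs = F.log_softmax(logits, dim=-1)
    per_sample_entropy = -(probs * log_probs).sum(dim=-1).mean()
    avg_probs = probs.mean(dim=0)
    codebook_entropy = -(avg_probs * (avg_probs + 1e-8).log()).sum()
    return per_sample_entropy - codebook_entropy

# Added to the training loss with a small weight, e.g. 0.1 as in the table above.
```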

## FSQ-VAE

| Levels | Codebook size | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|
| [8,8,8] | 512 | 100.00% | 30.8543 | 0.9397 | 0.0315 | 15.7079 |
| [8,5,5,5] | 1000 | 100.00% | 30.9025 | 0.9433 | 0.0266 | 15.8230 |

- FSQ-VAE does not suffer from the codebook collapse problem.
- FSQ-VAE achieves performance comparable to a VQVAE with the same codebook size.
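
A simplified sketch of the FSQ quantizer, following the bounding-and-rounding scheme of the FSQ paper (the exact bounding details may differ from this repository's code):

```python
import torch

def fsq_quantize(z, levels):
    """FSQ sketch: bound each latent channel with tanh and round it to one of
    `levels[i]` values. The implicit codebook is the product of the levels
    (e.g. [8, 8, 8] -> 512 codes), and every code is reachable by construction,
    so there is no learned codebook that could collapse."""
    levels = torch.tensor(levels, dtype=z.dtype, device=z.device)
    half_l = (levels - 1) / 2
    # Shift channels with an even number of levels so rounding yields exactly
    # `levels[i]` distinct integers.
    offset = (levels % 2 == 0).to(z.dtype) * 0.5
    shift = torch.atanh(offset / half_l)
    bounded = torch.tanh(z + shift) * half_l - offset
    quantized = torch.round(bounded)
    # Straight-through estimator for the rounding step.
    return bounded + (quantized - bounded).detach()

z = torch.randn(16, 3)               # 3 latent channels for levels [8, 8, 8]
z_q = fsq_quantize(z, [8, 8, 8])
```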

## LFQ-VAE

| Dim. | Codebook size | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|
| 9 | 512 | 100.00% | 26.1391 | 0.8685 | 0.0700 | 18.5518 |

> ⚠️ The result is not as good as expected. Some details may be missing from the implementation.
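
For context, a minimal sketch of the core LFQ step (per-dimension sign quantization). The full method also uses an entropy penalty and commitment terms, which are omitted here:

```python
import torch

def lfq_quantize(z):
    """LFQ sketch: each latent dimension is quantized independently to
    {-1, +1} by its sign, so a 9-dimensional latent yields an implicit
    codebook of 2^9 = 512 entries without any embedding table."""
    quantized = torch.where(z > 0, torch.ones_like(z), -torch.ones_like(z))
    # Straight-through estimator for the sign function.
    z_q = z + (quantized - z).detach()
    # Each sign pattern maps to an integer index in [0, 2^D).
    bits = (quantized > 0).long()
    weights = 2 ** torch.arange(z.shape[-1], device=z.device)
    indices = (bits * weights).sum(dim=-1)
    return z_q, indices
```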

## SimVQ-VAE

| Codebook dim. | Codebook size | Codebook usage↑ | PSNR↑ | SSIM↑ | LPIPS↓ | rFID↓ |
|---|---|---|---|---|---|---|
| 64 | 512 | 100.00% | 31.7468 | 0.9494 | 0.0242 | 14.9863 |

- SimVQ addresses the codebook collapse problem by reparameterizing the codebook through a linear transformation layer.
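
A sketch of that idea, assuming the common SimVQ setup in which the base embedding is frozen and only the linear projection is trained; names and details are illustrative, not this repository's exact implementation:

```python
import torch
import torch.nn as nn

class SimVQCodebook(nn.Module):
    """SimVQ-style codebook sketch: a randomly initialized, frozen base
    embedding is reparameterized through a learnable linear layer, so every
    gradient step moves the whole codebook jointly rather than only the few
    codes that were selected in the batch."""

    def __init__(self, codebook_size=512, dim=64):
        super().__init__()
        self.base = nn.Embedding(codebook_size, dim)
        self.base.weight.requires_grad_(False)   # only the projection is trained
        self.proj = nn.Linear(dim, dim, bias=False)

    def forward(self, z):
        codebook = self.proj(self.base.weight)   # (K, D) reparameterized codes
        dist = torch.cdist(z, codebook)
        indices = dist.argmin(dim=1)
        z_q = codebook[indices]
        # Straight-through estimator.
        return z + (z_q - z).detach(), indices
```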