MCUCoder: Adaptive Video Compression for IoT Devices (Workshop on Machine Learning and Compression, NeurIPS 2024)
MCUCoder is an open-source adaptive bitrate video compression model designed specifically for resource-constrained Internet of Things (IoT) devices. With a lightweight encoder requiring only 10.5K parameters and a memory footprint of 350KB, MCUCoder provides efficient video compression without exceeding the capabilities of low-power microcontrollers (MCUs) and edge devices.
This video showcases the progressive output of MCUCoder. As demonstrated, using more latent channels improves the quality of the decoded video. The numbers indicate how many of the 12 latent channels were used for decoding.
mcucoder_video.mp4
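To make the idea concrete, here is a minimal, hypothetical sketch of progressive decoding: all latent channels beyond the first k are zeroed out before decoding, so reconstruction quality grows with k. The `encode`/`decode` methods, tensor shapes, and model handle are assumptions for illustration, not the repository's confirmed API.

```python
import torch

@torch.no_grad()
def decode_with_k_channels(model, frame, k, total_channels=12):
    """Decode a frame using only the first k (most important) latent channels."""
    latent = model.encode(frame)                  # assumed shape: (B, total_channels, H', W')
    mask = torch.zeros(total_channels, device=latent.device)
    mask[:k] = 1.0                                # keep the k leading channels, drop the rest
    return model.decode(latent * mask.view(1, -1, 1, 1))
```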
- Ultra-Lightweight Encoder: Only 10.5K parameters, enabling efficient processing on MCUs.
- Low Memory Usage: 350KB memory footprint, making it ideal for edge devices with limited RAM (1-2MB).
- High Compression Efficiency: Reduces bitrate by 55.65% (MCL-JCV dataset) and 55.59% (UVG dataset) while maintaining visual quality.
- Adaptive Bitrate Streaming: Latent channels are sorted by importance, so transmission can adapt dynamically to the available bandwidth (see the sketch after this list).
- Comparable Energy Consumption to M-JPEG: Ensures efficient power usage for real-time streaming applications.
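As a rough illustration of the adaptive-bitrate idea above (a sketch under assumptions, not code from this repository), the sender can count how many of the importance-sorted channels fit into the current bit budget and transmit only those:

```python
def channels_for_bandwidth(bits_per_channel, available_bits):
    """Return how many leading (importance-sorted) latent channels fit in the bit budget.

    `bits_per_channel` lists the hypothetical encoded size of each of the 12
    latent channels for the current frame; at least one channel is always
    sent so the receiver can decode a coarse preview.
    """
    used, k = 0, 0
    for cost in bits_per_channel:
        if used + cost > available_bits:
            break
        used += cost
        k += 1
    return max(k, 1)
```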
Follow these instructions to set up the required Python environment for running MCUCoder.
- Clone this Git repository to your local machine using the following command:
git clone https://github.com/ds-kiel/MCUCoder
cd MCUCoder
- Create a virtual Python environment:
virtualenv mcucoder
source mcucoder/bin/activate
- Install the necessary Python packages by running:
pip install -r req.txt
Use imagenet_prepration.py to extract the 400,000 highest-resolution ImageNet images. To train MCUCoder, use the following command:
python train.py --batch_size <YOUR_BATCH_SIZE> --imagenet_root <YOUR_IMAGENET_PATH> --wandb_name <YOUR_WANDB_NAME> --wandb_project <YOUR_WANDB_PROJECT> --loss <YOUR_LOSS_FUNCTION> --number_of_iterations <TRAIN_ITER> --number_of_channels <N>
For example:
python train.py --batch_size 16 --imagenet_root "/path/to/imagenet" --wandb_name "MCUCoder_Training" --wandb_project "MCUCoder" --loss "msssim" --number_of_iterations 1000000 --number_of_channels 196
The video_enc_dec.py script processes videos with MCUCoder: it takes an input video, encodes and decodes it with the model, and saves the result in the specified output directory.
To process a video, use the following command:
python video_enc_dec.py --batch_size <YOUR_BATCH_SIZE> --model_path <YOUR_MODEL_PATH> --video_path <YOUR_VIDEO_PATH> --output_dir <OUTPUT_DIRECTORY>
The MCUCoder pretrained model, optimized with MS-SSIM loss, trained for 1M iterations, and featuring 196 decoder channels, is available at: https://zenodo.org/records/14988203.
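A minimal sketch of loading the downloaded checkpoint with PyTorch, assuming it is stored as a plain state_dict and that an MCUCoder model instance with 196 decoder channels has already been constructed (file layout and constructor are assumptions, not confirmed here):

```python
import torch

def load_pretrained(model, checkpoint_path):
    """Load the Zenodo checkpoint into an existing MCUCoder model instance."""
    state = torch.load(checkpoint_path, map_location="cpu")
    # Assumption: the file holds a plain state_dict; adjust if it is wrapped in a dict.
    model.load_state_dict(state)
    return model.eval()
```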
This project is licensed under the MIT License. See the LICENSE file for details.
@inproceedings{
hojjat2024mcucoder,
title={{MCUC}oder: Adaptive Bitrate Learned Video Compression for IoT Devices},
author={Ali Hojjat and Janek Haberer and Olaf Landsiedel},
booktitle={Workshop on Machine Learning and Compression, NeurIPS 2024},
year={2024},
url={https://openreview.net/forum?id=ESjy0fQJJE}
}