Egregore-ai/rkllm-toolkit-cli

RKLLM Toolkit Collection

This collection contains two tools for converting Hugging Face models to RKLLM format, targeting the Rockchip RK3588(S) and RK3576 NPUs.

Available Projects

1. Hugging Face RKLLM Converter

Path: huggingface-rkllm-converter/

A comprehensive and feature-rich tool for converting Hugging Face models to RKLLM format with advanced capabilities.

Key Features:

  • Conversion of various model architectures (Qwen, OPT, etc.)
  • Support for multiple quantization formats (Q4_0, Q4_K_M, Q8_0, Q8_K_M)
  • Automatic configuration file generation
  • Model and parameter validation
  • Model metadata support
  • Detailed logging of conversion process
  • Support for 1D and 2D tensor conversion
  • Both CLI and Python API interfaces

Prerequisites:

  • Python 3.8 or higher
  • Hugging Face account and token (for private models)
  • CUDA-capable GPU (recommended for faster conversion)

Installation:

cd huggingface-rkllm-converter
python3 -m venv venv
source venv/bin/activate  # On Linux/macOS
pip install -r requirements.txt

Basic Usage:

# Simple conversion
python3 converter.py Qwen/Qwen2.5-7B

# Advanced conversion with options
python3 converter.py Qwen/Qwen2.5-7B \
    --output-dir "models/converted" \
    --quantization "Q4_K_M" \
    --max-context-len 8192 \
    --dtype "float16" \
    --device "cuda"
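When scripting many conversions, the flags above can be assembled programmatically. The sketch below is illustrative and not part of the repo: it only builds the argv list for converter.py (defaults mirror the advanced example above) and leaves execution to subprocess.

```python
# Illustrative helper (not part of the repo): assemble the converter.py
# command line from Python, mirroring the CLI flags shown above.
def build_converter_cmd(model_id,
                        output_dir="models/converted",
                        quantization="Q4_K_M",
                        max_context_len=8192,
                        dtype="float16",
                        device="cuda"):
    """Return the argv list for one conversion run."""
    return [
        "python3", "converter.py", model_id,
        "--output-dir", output_dir,
        "--quantization", quantization,
        "--max-context-len", str(max_context_len),
        "--dtype", dtype,
        "--device", device,
    ]

# Pass the result to subprocess.run(cmd, check=True) to execute it.
```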

2. Official RKLLM Toolkit CLI

Path: official-rkllm-toolkit-cli/

A command-line interface for converting HuggingFace models to RKLLM format for Rockchip NPUs (RK3588/RK3576).

Key Features:

  • Support for multiple model IDs in a single command
  • Built-in quantization options (w8a8 default for RK3588, w4a16 for RK3576)
  • Hybrid rate configuration
  • Platform-specific optimization (RK3588/RK3576)
  • Multiple installation methods (UV, pip)
  • Comprehensive quantization type support

Prerequisites:

  • Python 3.10
  • Linux operating system (x86_64)
  • Internet access for downloading models

Installation Methods:

Method 1: Using UV (Recommended)

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"

# Clone and setup
git clone <repository-url>
cd official-rkllm-toolkit-cli
uv venv
source .venv/bin/activate
uv pip install inquirer typer huggingface-hub

Method 2: Using pip

# Clone and setup
git clone <repository-url>
cd official-rkllm-toolkit-cli
python3 -m venv venv
source venv/bin/activate
pip install inquirer typer huggingface-hub

Usage Examples:

# Convert a single model with default settings
python3 -c "from src.rkllm_toolkit_cli import main; main()" microsoft/DialoGPT-medium

# Convert multiple models with different quantization types
python3 -c "from src.rkllm_toolkit_cli import main; main()" microsoft/DialoGPT-medium microsoft/DialoGPT-small --qtypes w8a8 w4a16 --platform rk3576

# Convert with hybrid quantization
python3 -c "from src.rkllm_toolkit_cli import main; main()" microsoft/DialoGPT-medium --hybrid-rates 0.5 --optimization
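For batch runs where each model needs its own quantization type or platform, a small driver script can issue one invocation per model. This is an illustrative sketch, not part of the repo; it reuses the `python3 -c` entry point shown above, and running one process per model means a single failed conversion does not abort the rest.

```python
# Illustrative batch driver (not part of the repo): one CLI invocation
# per model so a failure in one conversion doesn't stop the others.
import subprocess

ENTRY = ["python3", "-c", "from src.rkllm_toolkit_cli import main; main()"]

def build_invocation(model_id, qtype="w8a8", platform="rk3588"):
    """Return the argv list for converting a single model."""
    return ENTRY + [model_id, "--qtypes", qtype, "--platform", platform]

def convert_all(jobs):
    """jobs: iterable of (model_id, qtype, platform) tuples.
    Returns the list of model IDs whose conversion exited non-zero."""
    failed = []
    for model_id, qtype, platform in jobs:
        result = subprocess.run(build_invocation(model_id, qtype, platform))
        if result.returncode != 0:
            failed.append(model_id)
    return failed
```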

Supported Quantization Types:

  • RK3588: w8a8 (default), w8a8_g128, w8a8_g256, w8a8_g512
  • RK3576: w4a16 (default), w4a16_g32, w4a16_g64, w4a16_g128, w8a8
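The quantization names follow a regular pattern: wN is the weight bit-width, aM the activation bit-width, and an optional _gK suffix selects group-wise quantization (reading K as the group size is an assumption based on the naming convention, not something this repo states). A small parser makes the pattern explicit:

```python
# Decode RKLLM quantization type names such as "w8a8_g128" or "w4a16".
# Interpreting the _g suffix as a quantization group size is an
# assumption based on the naming convention.
import re

QTYPE_RE = re.compile(r"^w(?P<w>\d+)a(?P<a>\d+)(?:_g(?P<g>\d+))?$")

def parse_qtype(name):
    """Return (weight_bits, activation_bits, group_size_or_None)."""
    m = QTYPE_RE.match(name)
    if not m:
        raise ValueError(f"unrecognized qtype: {name}")
    g = m.group("g")
    return int(m.group("w")), int(m.group("a")), int(g) if g else None
```

For example, parse_qtype("w4a16_g32") yields (4, 16, 32); smaller groups generally trade a larger file for better accuracy.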

Which Tool Should You Use?

Choose Hugging Face RKLLM Converter if you:

  • Need detailed control over conversion parameters
  • Want comprehensive logging and debugging information
  • Require support for multiple quantization formats
  • Need both CLI and programmatic Python API access
  • Are converting complex or large models that need fine-tuning
  • Need metadata generation and model validation
  • Prefer a traditional Python environment setup

Choose Official RKLLM Toolkit CLI if you:

  • Want a quick and simple conversion process
  • Prefer minimal setup with UV or pip
  • Need to convert multiple models in batch
  • Want a lightweight tool with only a few dependencies
  • Need platform-tuned defaults (w8a8 on RK3588, w4a16 on RK3576), which work for most use cases
  • Prefer fast, modern dependency management with UV

Comparison Summary

| Feature               | Hugging Face Converter        | Official Toolkit CLI                   |
| --------------------- | ----------------------------- | -------------------------------------- |
| Setup Complexity      | Medium (Python + deps)        | Low (UV or pip)                        |
| Quantization Options  | Q4_0, Q4_K_M, Q8_0, Q8_K_M    | Platform-specific w8a8/w4a16 variants  |
| Configuration Control | High                          | Medium                                 |
| Batch Processing      | Single model                  | Multiple models                        |
| API Access            | CLI + Python API              | CLI only                               |
| Logging Detail        | Comprehensive                 | Basic                                  |
| Platform Support      | General                       | RK3588/RK3576 specific                 |
| Dependency Management | Manual (pip)                  | UV or pip                              |

Getting Started

  1. For beginners or quick conversions: Start with the Official RKLLM Toolkit CLI
  2. For advanced users or production use: Use the Hugging Face RKLLM Converter
  3. For development and experimentation: The Hugging Face converter provides more flexibility

Both tools target the same hardware (Rockchip RK3588/RK3576) and produce compatible RKLLM format files, so you can choose based on your workflow preferences and requirements.

About

CLI to convert models to RKLLM format for running on Rockchip NPUs
