Egregore-ai/local-llm-runner

LLM Server and Client for Rockchip 3588/3576

File Structure

  • ./models: contains your .rkllm models.
  • ./lib: C++ rkllm library used for inference, plus fix_freqence_platform.
  • ./app.py: REST API server.
  • ./client.py: client used to interact with the server.

Supported Python Versions:

  • Python 3.8 to 3.12

Main Features

  • Runs models on the NPU.
  • Pulls models directly from Hugging Face.
  • Includes a documented REST API (see the Python sketch after this list).
  • Lists available models.
  • Dynamic loading and unloading of models.
  • Inference requests.
  • Streaming and non-streaming modes.
  • Message history.
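
The REST API can also be called from your own code instead of the CLI client. Below is a minimal Python sketch, assuming the server listens on http://localhost:8080 and exposes a GET /models route that returns JSON; the actual host, port, and route names should be checked against the server's API documentation.

    # Minimal sketch: list the models known to the server over its REST API.
    # Assumptions (not confirmed by this README): the server listens on
    # localhost:8080 and exposes a GET /models route returning JSON.
    # Requires the `requests` package (pip install requests).
    import requests

    BASE_URL = "http://localhost:8080"  # assumed default host/port

    def list_models():
        resp = requests.get(f"{BASE_URL}/models", timeout=10)
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        print(list_models())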

Installation

  1. Download RKLLama:
git clone 
cd rkllama
  2. Install RKLLama:
chmod +x setup.sh
sudo ./setup.sh

Usage

Run Server

The conda virtual environment is activated automatically, and the NPU frequency is set on startup.

  1. Start the server
rkllama serve

Run Client

  1. Start the client:
rkllama

or

rkllama help
  2. List the available models:
rkllama list
  3. Run a model:
rkllama run <model_name>
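
Instead of the interactive client, you can also query the server programmatically. The sketch below sends an inference request over the REST API; it assumes the server runs on localhost:8080, that the target model is already loaded (for example with rkllama run <model_name>), and that a POST /generate route accepts a JSON body with a prompt and a stream flag. Verify the real routes and payload format against the server's API documentation.

    # Hypothetical sketch of an inference request; the route name, payload, and
    # response format are assumptions and must be checked against the API docs.
    import json
    import requests

    BASE_URL = "http://localhost:8080"  # assumed default host/port

    def generate(prompt: str, stream: bool = True):
        resp = requests.post(
            f"{BASE_URL}/generate",                      # assumed route
            json={"prompt": prompt, "stream": stream},   # assumed payload
            stream=stream,
            timeout=120,
        )
        resp.raise_for_status()
        if stream:
            # Assumed line-delimited JSON chunks in streaming mode.
            for line in resp.iter_lines():
                if line:
                    print(json.loads(line), flush=True)
        else:
            print(resp.json())

    if __name__ == "__main__":
        generate("Hello, who are you?")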

Adding a Model (.rkllm file)

Using the rkllama pull Command

You can download and install a model from the Hugging Face platform with the following command:

rkllama pull username/repo_id/model_file.rkllm

Alternatively, you can run the command interactively:

rkllama pull
Repo ID ( example: punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4): <your response>
File ( example: TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm): <your response>

This will automatically download the specified model file and prepare it for use with RKLLAMA.
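
If you want to script the download instead of using rkllama pull, the same file can be fetched with the huggingface_hub Python package (pip install huggingface_hub). This is a convenience sketch, not part of RKLLAMA; the repo id and filename are the examples shown above, and the file is saved into ~/RKLLAMA/models, which is where the server expects models (see Manual Installation below).

    # Sketch: fetch a .rkllm file by hand with huggingface_hub (recent versions
    # support the local_dir argument) and place it in ~/RKLLAMA/models.
    from pathlib import Path
    from huggingface_hub import hf_hub_download

    models_dir = Path.home() / "RKLLAMA" / "models"
    models_dir.mkdir(parents=True, exist_ok=True)

    path = hf_hub_download(
        repo_id="punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4",
        filename="TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm",
        local_dir=str(models_dir),
    )
    print(f"Model saved to {path}")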


Manual Installation

  1. Download the Model: obtain a .rkllm file, for example from Hugging Face (see the pull section above).

  2. Place the Model

    • Navigate to the ~/RKLLAMA/models directory on your system.
    • Place the .rkllm files in this directory.

    Example directory structure:

    ~/RKLLAMA/models/
        └── TinyLlama-1.1B-Chat-v1.0.rkllm
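
To double-check which models are installed, a short Python snippet (purely a convenience, not part of RKLLAMA) can list the .rkllm files in that directory:

    # List the .rkllm files currently present in ~/RKLLAMA/models.
    from pathlib import Path

    models_dir = Path.home() / "RKLLAMA" / "models"
    models = sorted(p.name for p in models_dir.glob("*.rkllm")) if models_dir.exists() else []
    print(f"{len(models)} model(s) in {models_dir}:")
    for name in models:
        print(f"  - {name}")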
    

Uninstall

  1. Go to the ~/RKLLAMA/ folder
    cd ~/RKLLAMA/
    cp ./uninstall.sh ../
    cd ../ && chmod +x ./uninstall.sh && ./uninstall.sh

Upcoming Features

  • Add multimodal models
  • Add embedding models
  • Add RKNN support for ONNX models (TTS, image classification/segmentation, ...)
  • GGUF/HF to RKLLM conversion software
