- `./models`: contains your rkllm models.
- `./lib`: C++ `rkllm` library used for inference and `fix_freqence_platform`.
- `./app.py`: REST API server.
- `./client.py`: Client to interact with the server.
- Python 3.8 to 3.12
- Running models on NPU.
- Pull models directly from Hugging Face.
- REST API with documentation.
- Listing available models.
- Dynamic loading and unloading of models.
- Inference requests.
- Streaming and non-streaming modes.
- Message history.
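The REST API above can be exercised with any plain HTTP client. A minimal sketch follows; note that the port, the `/generate` endpoint, and the payload field names are assumptions for illustration, not RKLLama's documented API:

```shell
# Sketch only: the port, endpoint name, and JSON fields below are assumptions,
# not RKLLama's documented API.
RKLLAMA_URL="${RKLLAMA_URL:-http://localhost:8080}"
# Example JSON payload for a non-streaming inference request (illustrative fields).
PAYLOAD='{"model":"TinyLlama-1.1B-Chat-v1.0","prompt":"Hello","stream":false}'
# Only send the request if the server is actually reachable,
# so the sketch is safe to run anywhere.
if curl -fs -o /dev/null --max-time 2 "$RKLLAMA_URL" 2>/dev/null; then
  curl -s -X POST "$RKLLAMA_URL/generate" \
       -H "Content-Type: application/json" \
       -d "$PAYLOAD"
else
  echo "rkllama server not reachable at $RKLLAMA_URL"
fi
```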
- Download RKLLama:

```bash
git clone
cd rkllama
```

- Install RKLLama:

```bash
chmod +x setup.sh
sudo ./setup.sh
```

Virtualization with conda is started automatically, as well as the NPU frequency setting.
- Start the server:

```bash
rkllama serve
```

- Command to start the client:

```bash
rkllama
```

or

```bash
rkllama help
```

- See the available models:

```bash
rkllama list
```

- Run a model:

```bash
rkllama run <model_name>
```

You can download and install a model from the Hugging Face platform with the following command:

```bash
rkllama pull username/repo_id/model_file.rkllm
```

Alternatively, you can run the command interactively:

```bash
rkllama pull
Repo ID ( example: punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4): <your response>
File ( example: TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm): <your response>
```

This will automatically download the specified model file and prepare it for use with RKLLAMA.
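The `username/repo_id/model_file.rkllm` argument bundles the Hugging Face repository ID and the file inside it. A minimal sketch of how such an argument can be split in a shell script (the variable names are illustrative, not RKLLama's internals):

```shell
# Example pull argument in user/repo/file form, as accepted by `rkllama pull`.
ARG="punchnox/Tinnyllama-1.1B-rk3588-rkllm-1.1.4/TinyLlama-1.1B-Chat-v1.0-rk3588-w8a8-opt-0-hybrid-ratio-0.5.rkllm"
# The model file name is everything after the last slash...
MODEL_FILE="${ARG##*/}"
# ...and the Hugging Face repo ID is everything before it (user/repo).
REPO_ID="${ARG%/*}"
echo "repo: $REPO_ID"
echo "file: $MODEL_FILE"
```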
- Download the Model
  - Download `.rkllm` models directly from Hugging Face.
- Place the Model
  - Navigate to the `~/RKLLAMA/models` directory on your system.
  - Place the `.rkllm` files in this directory.

Example directory structure:

```
~/RKLLAMA/models/
└── TinyLlama-1.1B-Chat-v1.0.rkllm
```
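The placement step can also be scripted. A minimal sketch that creates the models directory and copies a downloaded file into it (the source path is a placeholder to replace with your own download location):

```shell
# Where RKLLAMA looks for models, per the directory layout above.
MODELS_DIR="$HOME/RKLLAMA/models"
mkdir -p "$MODELS_DIR"
# Placeholder path: replace with wherever you downloaded the .rkllm file.
SRC="./TinyLlama-1.1B-Chat-v1.0.rkllm"
# Copy only if the file actually exists, so the sketch is safe to run as-is.
if [ -f "$SRC" ]; then
  cp "$SRC" "$MODELS_DIR/"
fi
ls "$MODELS_DIR"
```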
- Go to the `~/RKLLAMA/` folder:

```bash
cd ~/RKLLAMA/
cp ./uninstall.sh ../
cd ../ && chmod +x ./uninstall.sh && ./uninstall.sh
```
- Add multimodal models
- Add embedding models
- Add RKNN support for ONNX models (TTS, image classification/segmentation, etc.)
- GGUF/HF to RKLLM conversion software