Nyuntam is NyunAI's cutting-edge toolkit for optimizing and accelerating large language models (LLMs) through state-of-the-art compression techniques. 🛠️ With an integrated CLI, managing your workflows and experimenting with various compression methods has never been easier! ✨
Ready to dive in? Here's a minimal example to get you up and running with Nyuntam:
- Initialize Your Workspace: 🗂️ First, set up your workspace using the nyun init command. This creates the necessary directories and configurations for your experiments.
nyun init ~/my-workspace ~/my-data --extensions text-gen
This command initializes a workspace at ~/my-workspace, sets the custom data path to ~/my-data, and installs the text-gen extension.
- Run an Example Experiment: 🏃‍♀️ Now, run an example experiment using a pre-configured YAML file. For instance, to try out FLAP pruning:
nyun run examples/text-generation/flap_pruning/config.yaml
This command executes the main script using the configurations specified in the provided YAML file.
- State-of-the-Art Compression: 🗜️ Includes techniques such as pruning, quantization, and distillation that shrink models and cut inference cost while aiming to preserve task performance.
- Multi-Platform Support: 💻 Run experiments seamlessly on various platforms using Docker or virtual environments.
- Integrated CLI: ⌨️ Built-in command-line interface (nyun) for easy workspace management and experiment execution.
- Extensible Architecture: 🧩 Supports various SOTA compression algorithms through a single CLI command.
- Python 3.8 or later
- For GPU support: NVIDIA Container Toolkit (when using Docker) 🐳
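The prerequisites above can be checked before installing. The following is a minimal sketch, not part of Nyuntam: the helper name `check_prerequisites` is hypothetical, and it only assumes that `docker` and `nvidia-ctk` (the NVIDIA Container Toolkit CLI) are discoverable on `PATH` when the Docker route is used.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites


def check_prerequisites(need_gpu_docker=False, which=shutil.which, version=None):
    """Return a list of problems; an empty list means every check passed.

    `which` and `version` are injectable so the function can be tested
    without depending on the local machine.
    """
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version[0]}.{version[1]}"
        )
    if need_gpu_docker:
        # Only needed for the Docker + GPU setup.
        for tool in ("docker", "nvidia-ctk"):
            if which(tool) is None:
                problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(need_gpu_docker=True):
        print("WARNING:", problem)
```

Finding both tools on `PATH` does not prove the Docker runtime is configured for GPUs; it only catches the most common missing-install case.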
Install Nyuntam using pip:
pip install nyuntam
- Install NVIDIA Container Toolkit (Linux):
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
- Clone and Setup:
git clone --recursive https://github.com/nyunAI/nyuntam.git
cd nyuntam
docker pull nyunadmin/nyuntam-text-generation:latest
docker run -it -d --gpus all -v $(pwd):/workspace --name nyuntam-dev --network=host nyunadmin/nyuntam-text-generation:latest bash
- Clone Repository:
git clone --recursive https://github.com/nyunAI/nyuntam.git
cd nyuntam
- Setup Environment:
python3 -m venv {ENVIRONMENT_NAME}
source {ENVIRONMENT_NAME}/bin/activate
pip install -r requirements.txt
This section is for developers who want to dive deep and modify the Nyuntam codebase.
- Clone the Repository:
git clone --recursive https://github.com/nyunAI/nyuntam.git
cd nyuntam
- Choose Your Environment:
  - Docker: Follow the instructions in the "Git + Docker" section above.
  - Virtual Environment: Follow the instructions in the "Git + Virtual Environment" section above.
- Core Scripts: The core scripts, such as main.py, algorithm.py, and commands.py, are located in the root directory of the nyuntam folder.
- Examples: Practical examples for different tasks are located in the nyuntam/examples directory. Each subdirectory includes a README.md for guidance and config.yaml files for configurations.
- Modules: The main modules for text generation are located in the nyuntam/text_generation directory.
- Utilities: Utility scripts and functions are located in the nyuntam/utils directory.
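Given the layout described above, the bundled example configurations can be listed with a few lines of Python. This is a sketch rather than a Nyuntam utility: the function name `find_example_configs` is hypothetical, and it assumes only that example configurations are files named `config.yaml` somewhere below the examples directory.

```python
from pathlib import Path


def find_example_configs(examples_dir):
    """Return the sorted paths of every config.yaml below examples_dir."""
    root = Path(examples_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"examples directory not found: {root}")
    # rglob walks all subdirectories, so nested example folders are included.
    return sorted(root.rglob("config.yaml"))


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains nyuntam/.
    try:
        for config in find_example_configs("nyuntam/examples"):
            print(config)
    except FileNotFoundError as err:
        print(err)
```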
- Prepare Configuration: Create a YAML file defining your experiment parameters. Example configurations are available in the nyuntam/examples directory.
  - Refer to dataset imports and models imports for configuration details.
  - Scripts and example YAML files are available here.
- Execute:
python nyuntam/main.py --yaml_path path/to/recipe.yaml
Before running experiments, initialize your workspace:
nyun init [WORKSPACE_PATH] [CUSTOM_DATA_PATH] [OPTIONS]
Options:
- --overwrite, -o: Overwrite existing workspace
- --extensions, -e: Specify extensions to install:
  - text-gen: For text generation
  - all: Install all extensions
  - none: No extensions
Example:
nyun init ~/my-workspace ~/my-data --extensions text-gen
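When workspace setup is scripted, for example in CI, it can help to assemble the `nyun init` call programmatically and reject unsupported extension names early. The sketch below uses only the arguments and options documented above; the helper `build_init_command` is hypothetical and not part of the `nyun` CLI. For simplicity it treats the workspace path as required.

```python
import os
import subprocess

VALID_EXTENSIONS = {"text-gen", "all", "none"}  # values listed in the options above


def build_init_command(workspace, data_path=None, extensions=None, overwrite=False):
    """Build the argument list for `nyun init` from the documented options."""
    if extensions is not None and extensions not in VALID_EXTENSIONS:
        raise ValueError(
            f"unknown extension {extensions!r}; expected one of {sorted(VALID_EXTENSIONS)}"
        )
    # subprocess does not go through a shell, so expand "~" here.
    cmd = ["nyun", "init", os.path.expanduser(str(workspace))]
    if data_path is not None:
        cmd.append(os.path.expanduser(str(data_path)))
    if overwrite:
        cmd.append("--overwrite")
    if extensions is not None:
        cmd += ["--extensions", extensions]
    return cmd


def init_workspace(*args, **kwargs):
    """Run `nyun init`; raises CalledProcessError if the command fails."""
    subprocess.run(build_init_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, as a dry run.
    print(" ".join(build_init_command("~/my-workspace", "~/my-data", extensions="text-gen")))
```

Keeping command construction separate from execution makes the validation testable without `nyun` installed.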
- Prepare Configuration: Create a YAML file defining your experiment parameters. Example configurations are available in the nyuntam/examples directory.
  - Refer to dataset imports and models imports for configuration details.
  - Scripts and example YAML files are available here.
- Execute:
nyun run path/to/recipe.yaml
For chained execution:
nyun run script1.yaml script2.yaml
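A chained run can also be driven from Python, with a check that every recipe exists before anything starts. This is a sketch under assumptions: `build_run_command` and `run_chain` are hypothetical helpers, and the idea that `nyun run` processes the files in the order given is presumed rather than confirmed by the text above.

```python
import subprocess
from pathlib import Path


def build_run_command(*yaml_paths):
    """Build the argument list for `nyun run`, refusing empty or missing inputs."""
    if not yaml_paths:
        raise ValueError("at least one YAML file is required")
    missing = [str(p) for p in yaml_paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"config file(s) not found: {', '.join(missing)}")
    # Preserve the caller's order; the CLI decides how the chain is executed.
    return ["nyun", "run", *(str(p) for p in yaml_paths)]


def run_chain(*yaml_paths):
    """Invoke `nyun run`; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_run_command(*yaml_paths), check=True)
```

Calling `run_chain("script1.yaml", "script2.yaml")` would then mirror the chained command shown above, failing fast if either file is absent.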
For detailed examples and use cases, check out our examples directory, which includes:
- Maximising math performance for extreme compressions: 2-bit Llama3-8b (w2a16)
- Efficient 4-bit Quantization (w4a16) of Llama3.1-8b
- Llama3.1 70B: 0.5x the cost & size
- Achieving Up to 2.5x TensorRTLLM Speedups
- Accelerating a 4-bit Quantised Llama Model
For complete documentation, visit NyunAI Docs
Check your installed version:
nyun version
NOTE: For access to gated repositories within containers, ensure you have the necessary Hugging Face tokens configured. 🔑
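One way to confirm that a Hugging Face token is visible, inside or outside a container, is sketched below. It relies on the usual `huggingface_hub` conventions as I understand them: the `HF_TOKEN` environment variable (or the older `HUGGING_FACE_HUB_TOKEN`), and a token file stored under `HF_HOME`, which defaults to `~/.cache/huggingface`. How Nyuntam itself passes the token into its containers is not covered here, and `find_hf_token` is a hypothetical helper.

```python
import os
from pathlib import Path

TOKEN_ENV_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN")


def find_hf_token(env=None, home=None):
    """Describe where a Hugging Face token was found, or return None."""
    env = os.environ if env is None else env
    for name in TOKEN_ENV_VARS:
        if env.get(name):
            return f"environment variable {name}"
    # Fall back to the token file written by the Hugging Face login command.
    default_home = Path(home or Path.home()) / ".cache" / "huggingface"
    token_file = Path(env.get("HF_HOME") or default_home) / "token"
    if token_file.is_file():
        return f"token file {token_file}"
    return None


if __name__ == "__main__":
    source = find_hf_token()
    print(f"Token found via {source}" if source else "No Hugging Face token configured")
```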