jchip/lm-finetune2

Local 7B LLM Fine-tuning and GGUF Conversion Pipeline

A complete pipeline for fine-tuning 7B language models using QLoRA and converting them to GGUF format for efficient inference.

Features

  • QLoRA Fine-tuning: Memory-efficient fine-tuning using 4-bit quantization
  • GPU/CPU Support: Automatic fallback to CPU if CUDA is unavailable
  • Model Merging: Merge LoRA adapters with base models
  • GGUF Conversion: Convert to GGUF format with configurable quantization
  • Local Inference: Test models locally with interactive chat
  • Complete Pipeline: One-command execution of the entire workflow

Quick Start

  1. Set Up Environment

    python setup.py

  2. Create Sample Data

    python finetune_pipeline.py --create_sample_data

  3. Run Complete Pipeline

    python finetune_pipeline.py --data_path data/sample_train.jsonl

Installation

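  1. Install Python Dependencies

    pip install -r requirements.txt

  2. Install llama.cpp (Optional, for GGUF conversion)

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    make  # recent llama.cpp versions build with CMake instead: cmake -B build && cmake --build build --config Release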

Data Format

Training data should be in JSONL format: one JSON object per line, each with a "text" field containing the fully formatted prompt and response:

{"text": "<s>[INST] What is machine learning? [/INST] Machine learning is a subset of artificial intelligence...</s>"}
{"text": "<s>[INST] Explain neural networks. [/INST] Neural networks are computational models...</s>"}
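A minimal sketch of producing and checking a file in this format, using only the standard library. The helper names format_example, write_jsonl, and validate_jsonl are hypothetical and not part of this repository; the [INST] template is copied from the examples above.

```python
import json
from pathlib import Path


def format_example(instruction: str, response: str) -> dict:
    """Wrap one instruction/response pair in the template shown above."""
    return {"text": f"<s>[INST] {instruction.strip()} [/INST] {response.strip()}</s>"}


def write_jsonl(pairs, path):
    """Write one JSON object per line; returns the number of records written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with path.open("w", encoding="utf-8") as f:
        for instruction, response in pairs:
            record = format_example(instruction, response)
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def validate_jsonl(path):
    """Return (line_number, problem) tuples for lines without a usable "text" field."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((number, "empty line"))
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
                continue
            text = record.get("text") if isinstance(record, dict) else None
            if not isinstance(text, str) or not text.strip():
                problems.append((number, 'missing or empty "text" field'))
    return problems
```

Running validate_jsonl on a file before training is a cheap way to catch malformed lines; an empty list means every line parsed and carried a non-empty "text" string.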

Usage Examples

Basic Fine-tuning

python finetune_pipeline.py --data_path my_data.jsonl

Custom Model and Settings

python finetune_pipeline.py \
  --model_name microsoft/DialoGPT-medium \
  --data_path my_data.jsonl \
  --epochs 5 \
  --batch_size 2 \
  --learning_rate 1e-4

Skip GGUF Conversion

python finetune_pipeline.py --data_path my_data.jsonl --skip_gguf

Individual Components

Fine-tuning Only

python finetune.py --data_path my_data.jsonl

Merge LoRA Adapter

python merge_model.py --adapter_path ./lora_adapters --output_path ./merged_model
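
For intuition about what merging does: in the standard LoRA formulation an adapter stores two small matrices, A (r × in_features) and B (out_features × r), and merging folds their scaled product into the frozen base weight, W' = W + (lora_alpha / r) · B·A. The sketch below shows that arithmetic on toy matrices in plain Python. It illustrates the general technique, not the contents of merge_model.py, and merge_lora_weight is a hypothetical name.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora_weight(w, lora_a, lora_b, lora_alpha, r):
    """Return W + (lora_alpha / r) * (B @ A) without modifying the inputs."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)  # shape: out_features x in_features
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


# Toy example: a 2x2 base weight with a rank-1 adapter.
w = [[1.0, 0.0],
     [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # r x in_features  = 1 x 2
lora_b = [[0.5], [1.0]]    # out_features x r = 2 x 1
merged = merge_lora_weight(w, lora_a, lora_b, lora_alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the merged model can be converted to GGUF as an ordinary checkpoint.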

Convert to GGUF

python gguf_converter.py --model_path ./merged_model

Test Model

python inference.py --model_path ./merged_model --interactive

Configuration

Edit config.py to customize training parameters:

from dataclasses import dataclass

@dataclass
class TrainingConfig:
    model_name: str = "microsoft/DialoGPT-medium"
    max_seq_length: int = 512
    num_train_epochs: int = 3
    learning_rate: float = 2e-4
    lora_r: int = 64
    lora_alpha: int = 16
    quantization_level: str = "Q4_K_M"
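
As an alternative to editing the defaults in place, a variant can be derived from an existing instance with dataclasses.replace. The sketch below uses a stand-in class that mirrors only the fields listed above (the real class in config.py may define more), and how the pipeline scripts consume such an object is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass
class TrainingConfig:
    # Stand-in mirroring only the fields listed above.
    model_name: str = "microsoft/DialoGPT-medium"
    max_seq_length: int = 512
    num_train_epochs: int = 3
    learning_rate: float = 2e-4
    lora_r: int = 64
    lora_alpha: int = 16
    quantization_level: str = "Q4_K_M"


default = TrainingConfig()

# Derive a variant without touching the defaults.
low_memory = replace(default, max_seq_length=256, lora_r=16)

# LoRA scales the adapter update by lora_alpha / lora_r, so changing the
# rank alone also changes how strongly the adapter is applied.
print(default.lora_alpha / default.lora_r)        # 0.25
print(low_memory.lora_alpha / low_memory.lora_r)  # 1.0
```

If the scaling should stay constant when the rank changes, lora_alpha has to be adjusted together with lora_r.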

Output Structure

outputs/
├── lora_adapters/          # LoRA adapter files
├── merged_model/           # Merged PyTorch model
└── gguf_models/            # GGUF quantized models

data/
└── sample_train.jsonl     # Sample training data

GPU Requirements

  • Minimum: 8GB VRAM for 7B model fine-tuning with QLoRA
  • Recommended: 16GB+ VRAM for optimal performance
  • CPU Fallback: Available but significantly slower
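
A rough back-of-the-envelope check on why 4-bit quantization brings a 7B model within reach of an 8GB card. The sketch counts the base weights only; quantization constants, LoRA parameters, optimizer state, activations, and CUDA overhead all come on top, so real usage is higher. weight_memory_gb is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Memory needed for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


params_7b = 7e9
print(f"fp16 : {weight_memory_gb(params_7b, 16):.1f} GB")  # fp16 : 14.0 GB
print(f"4-bit: {weight_memory_gb(params_7b, 4):.1f} GB")   # 4-bit: 3.5 GB
```

The gap between the 4-bit figure and the card's capacity is what remains for everything else, which is why batch size and sequence length matter so much on small GPUs.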

Supported Models

The pipeline works with most causal language models on Hugging Face:

  • microsoft/DialoGPT-medium (default)
  • microsoft/DialoGPT-large
  • EleutherAI/gpt-neo-1.3B
  • EleutherAI/gpt-neo-2.7B
  • And many others...

Quantization Levels

Available GGUF quantization levels:

  • Q4_0, Q4_1: 4-bit quantization
  • Q5_0, Q5_1: 5-bit quantization
  • Q8_0: 8-bit quantization
  • Q4_K_M, Q5_K_M: 4-bit and 5-bit K-quantization (recommended)
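
The nominal bit widths slightly understate the storage cost, because each block of 32 weights also stores a scale (and, for the _1 types, a minimum). The sketch below derives bits per weight for the non-K formats from their block layout and estimates a file size from it. It assumes every tensor uses the same level and ignores file metadata, so treat the sizes as approximate; the K-quant levels use mixed layouts and are not covered. The helper names are hypothetical.

```python
# Bytes stored per block of 32 weights: packed quantized values plus an
# fp16 scale, and an fp16 minimum for the *_1 types.
BLOCK_WEIGHTS = 32
BLOCK_BYTES = {
    "Q4_0": 2 + 16,          # scale + 32 x 4-bit values
    "Q4_1": 2 + 2 + 16,      # scale + min + 32 x 4-bit values
    "Q5_0": 2 + 4 + 16,      # scale + 32 high bits + 32 x 4-bit low parts
    "Q5_1": 2 + 2 + 4 + 16,  # scale + min + 32 high bits + 32 x 4-bit low parts
    "Q8_0": 2 + 32,          # scale + 32 x 8-bit values
}


def bits_per_weight(level: str) -> float:
    """Average storage cost of one weight, including per-block overhead."""
    return BLOCK_BYTES[level] * 8 / BLOCK_WEIGHTS


def approx_size_gb(num_params: float, level: str) -> float:
    """Rough file size in GB (10**9 bytes) if every tensor used `level`."""
    return num_params * bits_per_weight(level) / 8 / 1e9


for level in BLOCK_BYTES:
    print(f"{level}: {bits_per_weight(level):.1f} bits/weight, "
          f"~{approx_size_gb(7e9, level):.1f} GB for 7B parameters")
```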

Troubleshooting

CUDA Out of Memory

  • Reduce per_device_train_batch_size
  • Increase gradient_accumulation_steps
  • Reduce max_seq_length
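
The first two adjustments are usually made together: the effective batch size is the per-device batch size times the number of accumulation steps (times the number of devices), so halving one and doubling the other lowers peak memory while keeping the same number of examples per optimizer step. A small sketch of that bookkeeping, with hypothetical helper names:

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def rebalance(per_device_batch_size, gradient_accumulation_steps, factor=2):
    """Shrink the per-device batch by `factor` and grow accumulation to match."""
    if per_device_batch_size % factor != 0:
        raise ValueError("per-device batch size must be divisible by factor")
    return per_device_batch_size // factor, gradient_accumulation_steps * factor


before = (4, 4)
after = rebalance(*before)
print(before, effective_batch_size(*before))  # (4, 4) 16
print(after, effective_batch_size(*after))    # (2, 8) 16
```

The trade-off is wall-clock time: more accumulation steps mean more forward and backward passes per optimizer update.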

GGUF Conversion Fails

  • Install llama.cpp from source
  • Ensure the llama.cpp conversion scripts are on your PATH
  • Use --skip_gguf to bypass conversion

Model Quality Issues

  • Increase training epochs (num_train_epochs, default 3)
  • Adjust the learning rate (learning_rate, default 2e-4)
  • Improve training data quality
  • Increase LoRA rank (lora_r, default 64)

License

This project is licensed under the MIT License. See individual model licenses for usage restrictions.
