My Advanced LLM Project

This repository demonstrates an advanced pipeline for training a ChatGPT-like or Claude-like model. The pipeline includes:

  • Pre-Training (optional)
  • Supervised Fine-Tuning (SFT)
  • Reward Modeling (pairwise preference data)
  • RLHF (conceptual PPO script)
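The reward-modeling and RLHF stages rest on two standard objectives, sketched below in plain PyTorch. This is an illustration of the underlying math, not the repository's code; all function and variable names are made up for the example.

import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_rewards, rejected_rewards):
    # Bradley-Terry pairwise loss: push the scalar reward of the
    # preferred ("chosen") response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

def ppo_clipped_objective(logprobs_new, logprobs_old, advantages, clip_eps=0.2):
    # Standard clipped PPO policy objective (maximized during RLHF).
    ratio = torch.exp(logprobs_new - logprobs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return torch.min(unclipped, clipped).mean()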

=== Setup ===

  1. Install dependencies
pip install -r requirements.txt
  2. Prepare data (see the sample records sketched after this list)

"- Place raw text in data/raw/ for pre-training. "- Place instruction’response data in data/sft/ for SFT. "- Place pairwise preference data in data/reward/ for reward modeling.

  3. Edit configs
  • Adjust hyperparameters in configs/.
  4. Run
  • Pre-train:
bash scripts/run_pretrain.sh
  • SFT:
bash scripts/run_sft.sh
  • Reward model:
bash scripts/run_reward_model.sh
  • RLHF (placeholder):
bash scripts/run_rlhf.sh
  5. Inference
  • Load a checkpoint and generate text (a minimal sketch follows this list):
python src/inference.py
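The scaffold does not pin down file formats, but one reasonable layout for the SFT and reward-model datasets is JSONL, one record per line. The field names below (prompt, response, chosen, rejected) and the train.jsonl file names are assumptions; match them to whatever the data loaders in src/ actually expect.

import json

# Hypothetical record layouts -- field names must match the data loaders.
sft_record = {
    "prompt": "Explain gradient clipping in one sentence.",
    "response": "Gradient clipping rescales large gradients to stabilize training.",
}
preference_record = {
    "prompt": "Explain gradient clipping in one sentence.",
    "chosen": "A clear, correct one-sentence explanation.",
    "rejected": "An off-topic or incorrect answer.",
}

with open("data/sft/train.jsonl", "a") as f:      # assumed file name
    f.write(json.dumps(sft_record) + "\n")
with open("data/reward/train.jsonl", "a") as f:   # assumed file name
    f.write(json.dumps(preference_record) + "\n")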

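src/inference.py is the provided entry point; the sketch below shows what a minimal version of that loop can look like, assuming checkpoints are saved in Hugging Face format (the checkpoint path and generation settings are placeholders).

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "checkpoints/sft"  # placeholder: point at a trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model.eval()

prompt = "Explain reward modeling in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))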
=== Notes ===

  • For large models, configure Accelerate and possibly use DeepSpeed or FSDP.
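For example, a typical Accelerate workflow looks like this (the training script name and its --config flag are hypothetical; DeepSpeed or FSDP can be enabled during accelerate config):

pip install accelerate
accelerate config                                  # one-time interactive setup
accelerate launch src/train_sft.py --config configs/sft.yaml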

  • SFT, Reward Modeling, and RLHF require carefully curated datasets.

  • The code here is a demonstration scaffold; adapt it to your needs and environment.
