This repository demonstrates an advanced pipeline for training a ChatGPT-like or Claude-like model. The pipeline includes:
- **Pre-Training** (optional)
- **Supervised Fine-Tuning (SFT)**
- **Reward Modeling** (pairwise preference data)
- **RLHF** (conceptual PPO script)
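For orientation, the sketch below shows the pairwise (Bradley-Terry style) loss that the reward-modeling stage typically optimizes: the reward model scores the chosen and rejected responses of each preference pair, and the loss pushes the chosen score above the rejected one. The function and tensor names are illustrative assumptions, not this repository's API.

```python
# Illustrative sketch of the pairwise reward-modeling loss (Bradley-Terry style).
# `chosen_scores` / `rejected_scores` are assumed scalar rewards per sequence,
# as produced by a reward model with a single-value head; names are hypothetical.
import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_scores: torch.Tensor,
                         rejected_scores: torch.Tensor) -> torch.Tensor:
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Toy batch of three preference pairs:
chosen = torch.tensor([1.2, 0.3, 2.0])     # scores for preferred responses
rejected = torch.tensor([0.4, 0.5, -1.0])  # scores for rejected responses
print(pairwise_reward_loss(chosen, rejected))  # small positive scalar
```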
=== Setup ===
- Install dependencies: `pip install -r requirements.txt`
- Prepare data:
"- Place raw text in data/raw/ for pre-training.
"- Place instruction’response data in data/sft/ for SFT.
"- Place pairwise preference data in data/reward/ for reward modeling.
- Edit configs: adjust hyperparameters in configs/.
- Run:
  - Pre-train: `bash scripts/run_pretrain.sh`
  - SFT: `bash scripts/run_sft.sh`
  - Reward model: `bash scripts/run_reward_model.sh`
  - RLHF (placeholder): `bash scripts/run_rlhf.sh`
- Inference:
  - Load a checkpoint and generate text: `python src/inference.py`
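If you want to script inference directly rather than call src/inference.py, the sketch below shows one way to do it with the Hugging Face transformers API; the checkpoint path, prompt, and generation settings are placeholder assumptions and may not match this repository's defaults.

```python
# Sketch: load a fine-tuned checkpoint and generate text with transformers.
# "checkpoints/sft" and the prompt are placeholders, not paths from this repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "checkpoints/sft"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

inputs = tokenizer("Explain RLHF in one sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```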
=== Notes ===
- For large models, configure Accelerate and possibly use DeepSpeed or FSDP (see the sketch after this list).
- SFT, Reward Modeling, and RLHF require carefully curated datasets.
- The code here is a demonstration scaffold; modify it to suit your needs and environment.
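To expand on the first note, here is a minimal sketch of wrapping a PyTorch training loop with Accelerate; the model, optimizer, and data are stand-ins, and DeepSpeed or FSDP would be selected when configuring Accelerate rather than in code.

```python
# Sketch: a training loop prepared with Hugging Face Accelerate so the same
# script runs on one GPU, many GPUs, or with DeepSpeed/FSDP chosen at launch.
# The model, optimizer, and dataset below are placeholders.
import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(16, 16)  # stand-in for a causal LM
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataset = torch.utils.data.TensorDataset(torch.randn(64, 16), torch.randn(64, 16))
dataloader = torch.utils.data.DataLoader(dataset, batch_size=8)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```

Run `accelerate config` once to describe your hardware (and optionally enable DeepSpeed or FSDP), then start training with `accelerate launch` instead of `python`.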