Mini_LLama is a lightweight implementation of the LLama (Large Language Model) architecture, optimized for efficient training and inference on limited hardware. This project is designed for research and experimentation in Natural Language Processing (NLP) and deep learning.
- Efficient transformer-based architecture (see the sketch after this list)
- Customizable model size and training configurations
- Support for fine-tuning on custom datasets
- Lightweight inference for deployment on resource-constrained devices
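For context, a LLaMA-style model is built by stacking decoder blocks that combine RMSNorm pre-normalization, multi-head causal self-attention, and a SwiGLU feed-forward network. The sketch below is a minimal PyTorch illustration of one such block, not the repository's actual modules; the names `DecoderBlock`, `dim`, and `n_heads` are placeholders, and rotary position embeddings and KV caching are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer normalization, as used in LLaMA-style models."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps) * self.weight

class DecoderBlock(nn.Module):
    """One pre-norm decoder block: causal self-attention + SwiGLU feed-forward."""

    def __init__(self, dim: int = 512, n_heads: int = 8, hidden_dim: int = 1376):
        super().__init__()
        self.n_heads = n_heads
        self.attn_norm = RMSNorm(dim)
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.attn_out = nn.Linear(dim, dim, bias=False)
        self.ffn_norm = RMSNorm(dim)
        self.w_gate = nn.Linear(dim, hidden_dim, bias=False)
        self.w_up = nn.Linear(dim, hidden_dim, bias=False)
        self.w_down = nn.Linear(hidden_dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        # Causal multi-head self-attention with a pre-norm residual connection.
        h = self.attn_norm(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        q, k, v = (z.view(b, t, self.n_heads, d // self.n_heads).transpose(1, 2)
                   for z in (q, k, v))
        attn = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.attn_out(attn.transpose(1, 2).reshape(b, t, d))
        # SwiGLU feed-forward with a second residual connection.
        h = self.ffn_norm(x)
        return x + self.w_down(F.silu(self.w_gate(h)) * self.w_up(h))
```

Stacking several such blocks on top of a token embedding, followed by a final norm and an output projection to the vocabulary, yields the usual decoder-only language model.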
```bash
# Clone the repository
git clone https://github.com/dzungnguyen21/Mini_LLama.git
cd Mini_LLama

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

To train the model:

```bash
python train.py --config configs/train_config.json
```

To run inference with a trained checkpoint:

```bash
python infer.py --model checkpoint/model.pth --text "Your input text here"
```

Model and training parameters can be customized in the `configs/` directory. Example:
```json
{
  "model_size": "small",
  "learning_rate": 0.001,
  "batch_size": 32,
  "epochs": 10
}
```

The dataset should be in JSON or CSV format and placed in the `data/` directory. Modify `data_loader.py` to preprocess your specific dataset.
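As a starting point, here is a minimal sketch of what such a preprocessing step could look like for a JSON Lines file with one `"text"` field per record. This is a hypothetical example, not the actual contents of `data_loader.py`; the path `data/train.jsonl`, the `"text"` field, and the `tokenizer` callable are all assumptions.

```python
import json
from pathlib import Path

import torch
from torch.utils.data import Dataset

class TextDataset(Dataset):
    """Hypothetical dataset: one JSON object with a "text" field per line."""

    def __init__(self, path: str, tokenizer, max_length: int = 512):
        self.tokenizer = tokenizer
        self.max_length = max_length
        self.samples = [
            json.loads(line)["text"]
            for line in Path(path).read_text(encoding="utf-8").splitlines()
            if line.strip()
        ]

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, idx: int) -> torch.Tensor:
        # `tokenizer` is assumed to map a string to a list of token ids.
        ids = self.tokenizer(self.samples[idx])[: self.max_length]
        return torch.tensor(ids, dtype=torch.long)

# Example usage (hypothetical tokenizer and path):
# dataset = TextDataset("data/train.jsonl", tokenizer=my_tokenizer)
# loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
```

Whatever format you choose, the goal is the same: have `data_loader.py` yield token ID tensors that the training loop can batch.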
Contributions are welcome! Please follow these steps:
- Fork the repository.
- Create a new branch (e.g., `feature-branch`).
- Commit your changes.
- Push to your fork and create a pull request.
For questions and collaborations, feel free to reach out via GitHub issues or email.