4 changes: 2 additions & 2 deletions README.md
@@ -5,7 +5,7 @@
<br>
</p>

🚀**LLaMA2-Accessory** is an open-source toolkit for pre-training, fine-tuning and deployment of **Large Language Models (LLMs)** and **mutlimodal LLMs**. This repo is mainly inherited from [LLaMA-Adapter](https://github.com/OpenGVLab/LLaMA-Adapter) with more advanced features.🧠
🚀**LLaMA2-Accessory** is an open-source toolkit for pre-training, fine-tuning and deployment of **Large Language Models (LLMs)** and **multimodal LLMs**. This repo is mainly inherited from [LLaMA-Adapter](https://github.com/OpenGVLab/LLaMA-Adapter) with more advanced features.🧠

## News
- **[2023.08.21]** We release the Quantization code and Evaluation results🔥🔥🔥
@@ -19,7 +19,7 @@
- 🌈 Multi-modal fine-tuning with image-text pairs ([LAION](https://laion.ai/blog/laion-5b/), [COYO](https://github.com/kakaobrain/coyo-dataset) and more), interleaved image-text data ([MMC4](https://github.com/allenai/mmc4) and [OBELISC](https://github.com/huggingface/OBELISC)) and visual instruction data ([LLaVA](https://github.com/haotian-liu/LLaVA), [Shikra](https://github.com/shikras/shikra), [Bard](https://bard.google.com/))
- 🔧 LLM for API Control ([GPT4Tools](https://github.com/StevenGrove/GPT4Tools) and [Gorilla](https://github.com/ShishirPatil/gorilla)).
* **⚡Efficient Optimization and Deployment**
- 🚝 Parameter-efficient fine-tuning with [Zero-init Attenion](https://github.com/OpenGVLab/LLaMA-Adapter) and [Bias-norm Tuning](https://github.com/OpenGVLab/LLaMA-Adapter).
- 🚝 Parameter-efficient fine-tuning with [Zero-init Attention](https://github.com/OpenGVLab/LLaMA-Adapter) and [Bias-norm Tuning](https://github.com/OpenGVLab/LLaMA-Adapter).
- 💻 Fully Sharded Data Parallel ([FSDP](https://engineering.fb.com/2021/07/15/open-source/fsdp/)), [Flash Attention 2](https://github.com/Dao-AILab/flash-attention) and [QLoRA](https://github.com/artidoro/qlora).
* **🏋️‍♀️Support More Visual Encoders and LLMs**

10 changes: 5 additions & 5 deletions docs/finetune.md
@@ -17,18 +17,18 @@ This document demonstrates the fine-tuning use cases supported by LLaMA2-Accesso

> ## Prerequisites
>
> To run our provided experiment scripts on you own machine, please first adjust the following configurations:
> To run our provided experiment scripts on your own machine, please first adjust the following configurations:
>
> + Modify the value of the `pretrained_path` variable in the `.sh` file. This variable should point to the directory containing checkpoints to fine-tune from.
> + If you fine-tune from the officianl LLaMA / LLaMA2 checkpoints released by META, the directory should be like:
> + If you fine-tune from the official LLaMA / LLaMA2 checkpoints released by META, the directory should be like:
> ```
> pretrained_path
> ├── consolidated.00.pth
> ├── consolidated.01.pth
> └── ...
> ```
>
> and your should set `pretrained_type=meta_ori` in the `.sh` file.
> and you should set `pretrained_type=meta_ori` in the `.sh` file.
> + Alternatively, you may also fine-tune from checkpoints saved by LLaMA2-Accessory. In such cases, the directory should be like:
> ```
> pretrained_path
@@ -199,15 +199,15 @@ python demos/multi_turn.py \
> My content is:
>
> ```
> I want study math everyday because I like it. It makes me feel very good. I do not like English because it is hard. I can not remember the words
> I want to study math every day because I like it. It makes me feel very good. I do not like English because it is hard. I can not remember the words
> ```
>
> **Assistant:** "I am eager to study mathematics daily as I find it fascinating. It brings me great satisfaction. On the other hand, I do not have the same enthusiasm for English as it is a challenging subject for me. Memorizing vocabulary and grammar rules is not my strong suit."
>
> **Human:** My second content is:
>
> ```
> My name is Jimmy, a master student studing computer science. I have done some projects. For example, I designed a image classification model, its name is CLSModel. I learned a lot from this project.
> My name is Jimmy, a master student studying computer science. I have done some projects. For example, I designed an image classification model, its name is CLSModel. I learned a lot from this project.
> ```
>
> **Assistant:** "My name is Jimmy, and I am a master's student pursuing a degree in computer science. I have had the opportunity to work on various projects, one of which was an image classification model called CLSModel. Through this project, I gained valuable experience and knowledge that has helped me grow as a computer scientist."
6 changes: 3 additions & 3 deletions docs/pretrain.md
@@ -4,9 +4,9 @@ LLaMA2-Accessory currently supports two kinds of pre-training datasets: the *van

The vanilla dataset is supported in [`data/falcon.py`](../accessory/data/falcon.py). It loads data directly from `.parquet` data files (as an example, see [*Falcon Refined-web*](https://huggingface.co/datasets/tiiuae/falcon-refinedweb)). With the vanilla dataset, every piece of data will be converted into tokens of fixed length. Specifically, it will be truncated if it is longer than the target length, and padded if shorter.
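
To make the truncate-or-pad rule concrete, here is a minimal sketch. It is only an illustration of the behaviour described above, not the actual code in [`data/falcon.py`](../accessory/data/falcon.py), and the padding token id used here (`0`) is an assumption.

```python
# Illustrative sketch of converting a token list to a fixed length; the real
# implementation in accessory/data/falcon.py may differ (e.g. in the pad token id).
def to_fixed_length(tokens, target_len, pad_id=0):
    if len(tokens) >= target_len:
        return tokens[:target_len]                          # longer pieces are truncated
    return tokens + [pad_id] * (target_len - len(tokens))   # shorter pieces are padded

print(to_fixed_length([5, 8, 13], target_len=6))  # -> [5, 8, 13, 0, 0, 0]
```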

An example for pre-training with the vanilla dataset is provided in [exps/pretrain/vanilla.sh](../accessory/exps/pretrain/vanilla.sh). Here are some notes about the script:
An example of pre-training with the vanilla dataset is provided in [exps/pretrain/vanilla.sh](../accessory/exps/pretrain/vanilla.sh). Here are some notes about the script:

+ To run the script one your own environment, point the `llama_config` variable to the `params.json` file defining the model structure, and the `tokenizer_path` variable to the `tokenizer.model` file.
+ To run the script in your own environment, point the `llama_config` variable to the `params.json` file defining the model structure, and the `tokenizer_path` variable to the `tokenizer.model` file.
+ A meta file specifying the list of `.parquet` files to use should be created and pointed to by the `data_meta_path` variable. We provide an example meta file for the *Falcon Refined-web* dataset [here](../data_example/PretrainMeta.json).
+ The elements in the meta file should be either absolute paths, or paths relative to `data_root`.
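
For illustration, a meta file of the kind described above could be written as follows. The shard names and the flat-list layout are assumptions made for this sketch; see the linked `PretrainMeta.json` for the format the toolkit actually expects.

```python
# Hypothetical sketch: write a meta file listing .parquet shards, with paths given
# relative to data_root. The file names below are made up for illustration.
import json

parquet_files = [
    "falcon-refinedweb/part-00000.parquet",
    "falcon-refinedweb/part-00001.parquet",
]
with open("my_pretrain_meta.json", "w") as f:
    json.dump(parquet_files, f, indent=2)
```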

@@ -18,7 +18,7 @@ For more efficient token utilization, the packed dataset is supported in [data/f
python -u tools/generate_packed_data.py
```

An example for pre-training with the packed dataset is provided in [exps/pretrain/13B_packed.sh](../accessory/exps/pretrain/13B_packed.sh). Similar to the case of the vanilla dataset, you also need to create a meta file and point `data_meta_path` to it. If you use our `generate_packed_dataset.py` to preprocess data, elements in the meta file should end with `.pkl` (See [here](../data_example/PretrainMetaPacked.json) for example).
An example of pre-training with the packed dataset is provided in [exps/pretrain/13B_packed.sh](../accessory/exps/pretrain/13B_packed.sh). Similar to the case of the vanilla dataset, you also need to create a meta file and point `data_meta_path` to it. If you use our `generate_packed_data.py` to preprocess data, elements in the meta file should end with `.pkl` (see [here](../data_example/PretrainMetaPacked.json) for an example).
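
To picture what packing does, here is a toy sketch of concatenating tokenized documents and slicing the stream into fixed-length chunks. It is only an illustration of the idea; the actual preprocessing is performed by `tools/generate_packed_data.py`, which works on real corpora, stores its output as `.pkl` files, and may handle details such as the chunk remainder differently.

```python
# Toy illustration of packing: flatten tokenized documents into one stream and slice
# it into fixed-length chunks so little space is wasted on padding. Not the logic of
# tools/generate_packed_data.py.
def pack_documents(tokenized_docs, chunk_len):
    stream = [tok for doc in tokenized_docs for tok in doc]  # concatenate all documents
    # keep only complete chunks; a real implementation may carry the remainder over
    return [stream[i:i + chunk_len] for i in range(0, len(stream) - chunk_len + 1, chunk_len)]

print(pack_documents([[1, 2, 3], [4, 5], [6, 7, 8, 9]], chunk_len=4))
# -> [[1, 2, 3, 4], [5, 6, 7, 8]]
```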