
Commit 2decd09 — Update README.md (1 parent 497ec55)

1 file changed: README.md (+54 −30)
# 🐟 Evolutionary Optimization of Model Merging Recipes

🤗 [Models](https://huggingface.co/SakanaAI) | 👀 [Demo](TODO) | 📚 [Paper](TODO) | 📝 [Blog](TODO) | 🐦 [Twitter](https://twitter.com/SakanaAILabs)

This repository serves as a central hub for SakanaAI's [Evolutionary Model Merge](TODO) series, showcasing its releases and resources. It includes models and code for reproducing the evaluation presented in our paper. Stay tuned for more updates and additions coming soon.
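The core idea, an evolutionary search over model-merging recipes, can be sketched with a toy (1+1)-style loop. Everything here is illustrative: the fitness function, hyperparameters, and the simple hill-climbing strategy are stand-ins, not the paper's method, which uses far more capable evolutionary algorithms and real benchmark scores as fitness.

```python
import random

def evolve_coeffs(fitness, dim, iters=200, sigma=0.1, seed=0):
    """Toy (1+1) evolution strategy over merge coefficients in [0, 1]."""
    rng = random.Random(seed)
    best = [rng.random() for _ in range(dim)]
    best_fit = fitness(best)
    for _ in range(iters):
        # Mutate the current best with Gaussian noise, clipped to [0, 1].
        cand = [min(1.0, max(0.0, c + rng.gauss(0, sigma))) for c in best]
        f = fitness(cand)
        if f > best_fit:  # keep the candidate only if it improves fitness
            best, best_fit = cand, f
    return best, best_fit

# Illustrative fitness: pretend the best recipe has coefficients near 0.6.
target = [0.6, 0.6, 0.6]
fit = lambda c: -sum((a - b) ** 2 for a, b in zip(c, target))
coeffs, score = evolve_coeffs(fit, dim=3)
```

Because a (1+1) strategy only ever accepts improvements, the returned score is at least as good as the random starting point.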
## Models

### Our Models

| Model | Size | License | Source |
| :-- | --: | :-- | :-- |
| [EvoLLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-7B) | 7B | Microsoft Research License | [shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1), [WizardMath-7B-V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1), [GAIR/Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002) |
| [EvoLLM-JP-v1-10B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B) | 10B | Microsoft Research License | EvoLLM-JP-v1-7B, [shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) |
| [EvoLLM-JP-A-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-A-v1-7B) | 7B | Apache 2.0 | [shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1), [Arithmo2-Mistral-7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B), [GAIR/Abel-7B-002](https://huggingface.co/GAIR/Abel-7B-002) |
| [EvoVLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoVLM-JP-v1-7B) | 7B | Apache 2.0 | [LLaVA-1.6-Mistral-7B](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b), [shisa-gamma-7b-v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) |
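The "Source" column above lists the models each release was merged from. The basic weight-space operation behind merging can be sketched as a weighted combination of source-model parameters. This is a toy sketch only: the single global coefficient per model is illustrative, whereas the paper evolves much richer per-layer recipes.

```python
import numpy as np

def merge_weights(state_dicts, coeffs):
    """Toy weight-space merge: a per-model weighted sum of parameters.

    `state_dicts` maps parameter names to arrays; all models are assumed
    to share the same architecture (same names and shapes).
    """
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(c * sd[name] for c, sd in zip(coeffs, state_dicts))
    return merged

# Two tiny stand-in "models" with one shared parameter tensor each.
model_a = {"w": np.array([1.0, 2.0])}
model_b = {"w": np.array([3.0, 4.0])}
merged = merge_weights([model_a, model_b], [0.5, 0.5])
print(merged["w"])  # → [2. 3.]
```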
### Comparing EvoLLM-JP w/ Source LLMs

For details on the evaluation, please refer to Section 4.1 of the paper.

| Model | MGSM-JA (acc ↑) | [lm-eval-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable) (avg ↑) |
| :-- | --: | --: |
| [Shisa Gamma 7B v1](https://huggingface.co/augmxnt/shisa-gamma-7b-v1) | 9.6 | 66.1 |
| [WizardMath 7B V1.1](https://huggingface.co/WizardLM/WizardMath-7B-V1.1) | 18.4 | 60.1 |
| [Abel 7B 002](https://huggingface.co/GAIR/Abel-7B-002) | 30.0 | 56.5 |
| [Arithmo2 Mistral 7B](https://huggingface.co/upaya07/Arithmo2-Mistral-7B) | 24.0 | 56.4 |
| [EvoLLM-JP-A-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-A-v1-7B) | **52.4** | **69.0** |
| [EvoLLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-7B) | **52.0** | **70.5** |
| [EvoLLM-JP-v1-10B](https://huggingface.co/SakanaAI/EvoLLM-JP-v1-10B) | **55.6** | **68.2** |
### Comparing EvoVLM-JP w/ Existing VLMs

For details on the evaluation, please see Section 4.2 of the paper.

| Model | JA-VG-VQA-500 (ROUGE-L ↑) | JA-VLM-Bench-In-the-Wild (ROUGE-L ↑) |
| :-- | --: | --: |
| [LLaVA-1.6-Mistral-7B](https://llava-vl.github.io/blog/2024-01-30-llava-next/) | 14.32 | 41.10 |
| [Japanese Stable VLM](https://huggingface.co/stabilityai/japanese-stable-vlm) | -<sup>*1</sup> | 40.50 |
| [Heron BLIP Japanese StableLM Base 7B llava-620k](https://huggingface.co/turing-motors/heron-chat-blip-ja-stablelm-base-7b-v1-llava-620k) | 8.73<sup>*2</sup> | 27.37<sup>*2</sup> |
| [(Ours) EvoVLM-JP-v1-7B](https://huggingface.co/SakanaAI/EvoVLM-JP-v1-7B) | **19.70** | **51.25** |

* \*1: Japanese Stable VLM cannot be evaluated on the JA-VG-VQA-500 dataset because the model was trained on it.
* \*2: We are checking with the authors whether these results are valid.
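ROUGE-L, the metric in the table above, scores the longest common subsequence (LCS) between a prediction and a reference. A minimal word-level sketch follows; it is not the exact evaluation code, and the F-measure weight `beta=1.2` is one common convention, assumed here for illustration.

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l(pred, ref, beta=1.2):
    """Word-level ROUGE-L F-measure between two whitespace-tokenized strings."""
    p_toks, r_toks = pred.split(), ref.split()
    lcs = lcs_len(p_toks, r_toks)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(p_toks), lcs / len(r_toks)
    return (1 + beta**2) * prec * rec / (rec + beta**2 * prec)

print(rouge_l("the cat sat", "the cat sat"))  # identical strings → 1.0
```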

## Reproducing the Evaluation

### 1. Clone the Repo

```bash
git clone https://github.com/SakanaAI/evolving-merged-models.git
cd evolving-merged-models
```
### 2. Download fastText Model

We use fastText to detect language for evaluation. Please download `lid.176.ftz` from [this link](https://fasttext.cc/docs/en/language-identification.html) and place it in your current directory. If the file is in a different directory, point to it with the `LID176FTZ_PATH` environment variable:

```bash
export LID176FTZ_PATH="path-to-lid.176.ftz"
```

### 3. Install Libraries

```bash
pip install -e .
```

We tested under Python 3.10.12 and CUDA 12.3; we cannot guarantee that the code works in other environments.
### 4. Run

To launch the evaluation, run the following script with a config. All the configs used for the paper are in `configs`.

```bash
python evaluate.py --config_path {path-to-config}
```
## Acknowledgement

We would like to thank the developers of the source models for their contributions and for making their work available.
