Description
Hi authors,
I am very happy to see such a careful repository with step-by-step guidelines in your labs; it made me confident that I could reproduce the results. However, I get badly incorrect accuracy when reproducing the main results. For example, I tried to reproduce MixLoRA with the Gemma 2B model on the ARC-e dataset. The result reported in the paper is 76.3%, but when I run these two commands:
python ./launch.py gen --template mixlora --tasks arc-e
python ./launch.py run --base_model google/gemma-2b
inside the MoE-PEFT repository, I get:
[
  {
    "adapter_name": "arc_e_0",
    "task_name": "arc-e",
    "date_time": "2025-04-17 10:33:58",
    "metrics": {
      "accuracy": 0.2398989898989899
    }
  }
]
[2025-04-17 10:33:59,224] MoE-PEFT: saving evaluation result to ./moe_peft_train_1744860839.json
Note that 0.2399 is essentially chance level for a four-way multiple-choice task like ARC-e (1/4 = 0.25), so the adapter does not seem to have learned anything. One thing I have noticed is that the training loss sometimes spikes sharply and then decreases again, which does not look stable (a quick script I used to spot these spikes is at the end of this report). I have also tried Llama 2 7B and observed the same phenomenon. The moe_peft.json generated by the first command matches the configuration in the paper:
{
  "cutoff_len": 512,
  "save_step": 1000,
  "train_lora_candidate_num": 2,
  "train_lora_simultaneously_num": 2,
  "train_strategy": "optim",
  "lora": [
    {
      "name": "arc_e_0",
      "task_name": "arc-e",
      "optim": "adamw",
      "scheduler_type": "constant",
      "warmup_steps": 0,
      "lr": 0.0002,
      "batch_size": 16,
      "micro_batch_size": 8,
      "evaluate_batch_size": 16,
      "num_epochs": 2,
      "r": 16,
      "lora_alpha": 32,
      "lora_dropout": 0.05,
      "target_modules": {
        "q_proj": true,
        "k_proj": true,
        "v_proj": true,
        "o_proj": true,
        "gate_proj": true,
        "down_proj": true,
        "up_proj": true
      },
      "routing_strategy": "mixlora",
      "num_experts": 8,
      "top_k": 2,
      "group_by_length": false
    }
  ]
}
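Just to make sure I am not misreading the MoE settings, here is a minimal PyTorch sketch of how I understand the top-2-of-8 routing implied by "num_experts": 8 and "top_k": 2 (my own illustration, not the repository's actual code):

import torch
import torch.nn as nn

# My own minimal illustration of top-k expert routing (num_experts=8, top_k=2);
# the actual MixLoRA router inside MoE-PEFT may differ in detail.
class TopKRouterSketch(nn.Module):
    def __init__(self, hidden_size, num_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.top_k = top_k

    def forward(self, x):
        # x: (tokens, hidden_size); each token is routed to its top-k experts
        logits = self.gate(x)                     # (tokens, num_experts)
        weights, indices = torch.topk(logits, self.top_k, dim=-1)
        weights = torch.softmax(weights, dim=-1)  # renormalize over the chosen experts
        return weights, indices

If this interpretation of the routing is wrong on my side, please correct me, since a misconfigured router would explain the chance-level accuracy.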
Moreover, I have tried running both with and without
export MOE_PEFT_EVALUATE_MODE=1
as suggested in some other issues, but I still get the wrong reproduction results. Here is my environment, in which the repository runs smoothly:
Package Version
---------------------------- --------------
absl-py 2.2.2
accelerate 1.6.0
aiofiles 23.2.1
aiohappyeyeballs 2.6.1
aiohttp 3.11.16
aiosignal 1.3.2
altair 5.5.0
annotated-types 0.7.0
anyio 4.9.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 3.0.0
astunparse 1.6.3
async-lru 2.0.5
async-timeout 5.0.1
attrs 25.3.0
babel 2.17.0
beautifulsoup4 4.13.3
bitsandbytes 0.43.1
black 25.1.0
bleach 6.2.0
certifi 2025.1.31
cffi 1.17.1
charset-normalizer 3.4.1
click 8.1.8
comm 0.2.2
contourpy 1.3.1
cycler 0.12.1
datasets 3.5.0
debugpy 1.8.14
decorator 5.2.1
defusedxml 0.7.1
dill 0.3.8
distro 1.9.0
einops 0.8.1
evaluate 0.4.3
exceptiongroup 1.2.2
executing 2.2.0
fastapi 0.115.12
fastjsonschema 2.21.1
ffmpy 0.5.0
filelock 3.18.0
fire 0.7.0
flake8 7.2.0
flash-attn 2.7.1.post1
flatbuffers 25.2.10
fonttools 4.57.0
fqdn 1.5.1
frozenlist 1.5.0
fsspec 2024.12.0
gast 0.6.0
google-pasta 0.2.0
gradio 4.38.1
gradio_client 1.1.0
grpcio 1.71.0
h11 0.14.0
h5py 3.13.0
httpcore 1.0.8
httpx 0.28.1
huggingface-hub 0.30.2
idna 3.10
importlib_resources 6.5.2
ipdb 0.13.13
ipykernel 6.29.5
ipython 8.35.0
ipywidgets 8.1.6
isoduration 20.11.0
isort 6.0.1
jedi 0.19.2
Jinja2 3.1.6
jiter 0.9.0
joblib 1.4.2
json5 0.12.0
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
jupyter_client 8.6.3
jupyter_core 5.7.2
jupyter-events 0.12.0
jupyter-lsp 2.2.5
jupyter_server 2.15.0
jupyter_server_terminals 0.5.3
jupyterlab 4.4.0
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.3
jupyterlab_widgets 3.0.14
keras 3.9.2
kiwisolver 1.4.8
libclang 18.1.1
Markdown 3.8
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.10.1
matplotlib-inline 0.1.7
mccabe 0.7.0
mdurl 0.1.2
mistune 3.1.3
mixlora 0.2.3
ml_dtypes 0.5.1
mlora 0.3.1
moe_peft 2.0.2
mpmath 1.3.0
multidict 6.4.3
multiprocess 0.70.16
mypy-extensions 1.0.0
namex 0.0.8
narwhals 1.34.1
nbclient 0.10.2
nbconvert 7.16.6
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.4.2
ninja 1.11.1.4
notebook 7.4.0
notebook_shim 0.2.4
numpy 2.2.4
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-cusparselt-cu12 0.6.2
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
openai 1.72.0
opt_einsum 3.4.0
optree 0.15.0
orjson 3.10.16
overrides 7.7.0
packaging 24.2
pandas 2.2.3
pandocfilters 1.5.1
parso 0.8.4
pathspec 0.12.1
peft 0.11.1
pexpect 4.9.0
pillow 10.4.0
pip 25.0
platformdirs 4.3.7
prometheus_client 0.21.1
prompt_toolkit 3.0.50
propcache 0.3.1
protobuf 5.29.4
psutil 7.0.0
ptyprocess 0.7.0
pure_eval 0.2.3
pyarrow 19.0.1
pycodestyle 2.13.0
pycparser 2.22
pydantic 2.11.3
pydantic_core 2.33.1
pydub 0.25.1
pyflakes 3.3.2
Pygments 2.19.1
pyparsing 3.2.3
python-dateutil 2.9.0.post0
python-json-logger 3.3.0
python-multipart 0.0.20
pytz 2025.2
PyYAML 6.0.2
pyzmq 26.4.0
referencing 0.36.2
regex 2024.11.6
requests 2.32.3
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rich 14.0.0
rpds-py 0.24.0
ruff 0.11.5
safetensors 0.5.3
scikit-learn 1.6.1
scipy 1.15.2
semantic-version 2.10.0
Send2Trash 1.8.3
sentencepiece 0.2.0
setuptools 75.8.0
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
soupsieve 2.6
stack-data 0.6.3
starlette 0.46.2
sympy 1.13.1
tensorboard 2.19.0
tensorboard-data-server 0.7.2
tensorflow-io-gcs-filesystem 0.37.1
termcolor 3.0.1
terminado 0.18.1
tf-slim 1.1.0
threadpoolctl 3.6.0
tiktoken 0.9.0
tinycss2 1.4.0
tokenize_rt 6.1.0
tokenizers 0.19.1
tomli 2.2.1
tomlkit 0.12.0
torch 2.5.1
tornado 6.4.2
tqdm 4.67.1
traitlets 5.14.3
transformers 4.44.2
triton 3.1.0
typer 0.15.2
types-python-dateutil 2.9.0.20241206
typing_extensions 4.13.1
typing-inspection 0.4.0
tzdata 2025.2
uri-template 1.3.0
urllib3 2.3.0
uvicorn 0.34.1
wcwidth 0.2.13
webcolors 24.11.1
webencodings 0.5.1
websocket-client 1.8.0
websockets 11.0.3
Werkzeug 3.1.3
wheel 0.45.1
widgetsnbextension 4.0.14
wrapt 1.17.2
xxhash 3.5.0
yarl 1.19.0
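Finally, here is the quick script I mentioned above for spotting the loss spikes in a saved console log. The regex is my guess at the log format (I am assuming lines containing "loss: <value>"); adjust it to the actual MoE-PEFT output:

import re
import sys

# Hypothetical log format: I assume training lines contain "loss: <float>";
# adjust the pattern to whatever MoE-PEFT actually prints.
pattern = re.compile(r"loss[:=]\s*([0-9.]+)")

with open(sys.argv[1]) as f:
    losses = [float(m.group(1)) for line in f for m in pattern.finditer(line)]

# Flag steps where the loss jumps upward by more than 50%
for step, (prev, curr) in enumerate(zip(losses, losses[1:]), start=1):
    if curr > 1.5 * prev:
        print(f"step {step}: loss jumped {prev:.3f} -> {curr:.3f}")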
Please help me figure out what I am missing so that I can reproduce the results of this valuable repository. Thank you so much!