Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
It is challenging to add new models efficiently. Could we create a generic convert.py
script that allows seamless conversion of any Hugging Face model to GGUF?
The code referenced here: https://github.com/ggerganov/llama.cpp/blob/master/convert_hf_to_gguf_update.py#L67-L112 is only one of the places that needs updating. Every time we add a new model, we also have to touch several other parts of the codebase, which makes the code difficult to maintain and follow.
```shell
# using deepseek as an example, this is the list of files we need to update when adding a new model
$ grep -ril deepseek *
README.md
convert_hf_to_gguf.py
convert_hf_to_gguf_update.py
docs/backend/CANN.md
examples/server/README.md
examples/server/webui/index.html
examples/server/public_legacy/system-prompts.js
examples/server/public_legacy/prompt-formats.js
examples/server/public_legacy/index-new.html
examples/main/README.md
gguf-py/gguf/tensor_mapping.py
gguf-py/gguf/constants.py
gguf-py/tests/test_metadata.py
include/llama.h
models/ggml-vocab-deepseek-coder.gguf
models/ggml-vocab-deepseek-llm.gguf
src/llama-vocab.cpp
src/llama-kv-cache.cpp
src/llama.cpp
src/llama-arch.h
src/llama-chat.h
src/llama-arch.cpp
src/llama-model.cpp
src/llama-hparams.h
src/llama-chat.cpp
tests/CMakeLists.txt
tests/test-tokenizer-random.py
tests/test-chat-template.cpp
```
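One possible direction (a minimal sketch only, not the actual llama.cpp implementation; all names such as `ModelBase`, `register`, and `get_converter` are hypothetical) is a registry pattern: each architecture registers its converter class once, so supporting a new model would mean adding a single class instead of editing many files.

```python
# Hypothetical registry-based converter sketch. Each Hugging Face
# architecture name maps to exactly one converter class; shared logic
# lives in the base class, and per-model differences are overrides.

MODEL_REGISTRY = {}

def register(*hf_names):
    """Map one or more HF architecture names to a converter class."""
    def wrap(cls):
        for name in hf_names:
            MODEL_REGISTRY[name] = cls
        return cls
    return wrap

class ModelBase:
    def __init__(self, hparams):
        self.hparams = hparams

    def map_tensor_name(self, name):
        # shared HF -> GGUF tensor-name mapping would live here
        return name

@register("DeepseekForCausalLM")
class DeepseekModel(ModelBase):
    def map_tensor_name(self, name):
        # only model-specific overrides go here
        return name.replace("model.", "")

def get_converter(architecture, hparams):
    """Look up the converter for an HF architecture string."""
    try:
        return MODEL_REGISTRY[architecture](hparams)
    except KeyError:
        raise NotImplementedError(f"Architecture {architecture!r} is not supported")
```

With something like this, the per-model knowledge (tensor mapping, hyperparameters, vocab handling) would be concentrated in one class, and the rest of the file list above would ideally not need to change.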
For reference, a real PR adding support for a new model: #11049
Motivation
Make the codebase easier to follow and maintain when adding new models.
Possible Implementation
No response