Description
Name and Version
The latest llama.cpp won't quantize a Llama 3 model with an expanded BPE tokenizer. The model works fine at fp16 and fp8 on Aphrodite / Transformers / KoboldCpp.
Operating systems
Linux
GGML backends
CUDA
Hardware
2x A6000
Models
Llama 3.1
Problem description & steps to reproduce
Running convert_hf_to_gguf.py on the model fails during vocab setup with:

NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()

The full traceback is in the log output section below.
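For context, a minimal sketch of the lookup pattern that appears to trigger this error: the converter encodes a fixed probe string with the model's tokenizer, hashes the resulting token-ID list, and matches the hash against a table of known pre-tokenizers; an unrecognized hash raises the `NotImplementedError` above. The table contents and token IDs here are hypothetical, for illustration only.

```python
from hashlib import sha256

# Hypothetical hash -> name table; the real converter hard-codes hashes of
# known tokenizers' encodings of a fixed probe string.
KNOWN_PRE_TOKENIZERS = {
    sha256(str([1, 2, 3]).encode()).hexdigest(): "llama-bpe",
}

def identify_pre_tokenizer(token_ids):
    # Hash the token-ID list produced by the model's tokenizer for the probe string.
    chkhsh = sha256(str(token_ids).encode()).hexdigest()
    try:
        return KNOWN_PRE_TOKENIZERS[chkhsh]
    except KeyError:
        # An expanded/custom BPE tokenizer produces an unknown hash,
        # which is exactly the failure mode reported here.
        raise NotImplementedError(
            "BPE pre-tokenizer was not recognized - update get_vocab_base_pre()"
        )
```

This is why a model that runs fine at fp16 elsewhere can still fail conversion: the tokenizer works, but its fingerprint is not in the converter's table.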
First Bad Commit
No response
Relevant log output
Traceback (most recent call last):
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5689, in <module>
    main()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5575, in main
    model_instance.write()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 434, in prepare_metadata
    self.set_vocab()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 1624, in set_vocab
    self._set_vocab_gpt2()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 746, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 527, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 734, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()