Description
Name and Version
The latest llama.cpp won't quantize a Llama 3 model with an expanded BPE tokenizer. The model works fine at fp16 and fp8 on Aphrodite / Transformers / KoboldCpp.
Operating systems
Linux
GGML backends
CUDA
Hardware
2x A6000
Models
Llama 3.1
Problem description & steps to reproduce
Running convert_hf_to_gguf.py on the model fails during vocab setup with:

NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()

The full traceback is in the log output section below.
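For context, a minimal sketch of the lookup pattern that appears to trigger this error: the converter encodes a fixed probe string with the model's tokenizer, hashes the resulting token-ID list, and matches the hash against a table of known pre-tokenizers; an unrecognized hash raises the `NotImplementedError` above. The table contents and token IDs here are hypothetical, for illustration only.

```python
from hashlib import sha256

# Hypothetical hash -> name table; the real converter hard-codes hashes of
# known tokenizers' encodings of a fixed probe string.
KNOWN_PRE_TOKENIZERS = {
    sha256(str([1, 2, 3]).encode()).hexdigest(): "llama-bpe",
}

def identify_pre_tokenizer(token_ids):
    # Hash the token-ID list produced by the model's tokenizer for the probe string.
    chkhsh = sha256(str(token_ids).encode()).hexdigest()
    try:
        return KNOWN_PRE_TOKENIZERS[chkhsh]
    except KeyError:
        # An expanded/custom BPE tokenizer produces an unknown hash,
        # which is exactly the failure mode reported here.
        raise NotImplementedError(
            "BPE pre-tokenizer was not recognized - update get_vocab_base_pre()"
        )
```

This is why a model that runs fine at fp16 elsewhere can still fail conversion: the tokenizer works, but its fingerprint is not in the converter's table.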
First Bad Commit
No response
Relevant log output
Traceback (most recent call last):
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5689, in <module>
    main()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 5575, in main
    model_instance.write()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 441, in write
    self.prepare_metadata(vocab_only=False)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 434, in prepare_metadata
    self.set_vocab()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 1624, in set_vocab
    self._set_vocab_gpt2()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 746, in _set_vocab_gpt2
    tokens, toktypes, tokpre = self.get_vocab_base()
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 527, in get_vocab_base
    tokpre = self.get_vocab_base_pre(tokenizer)
  File "/home/sicarius/llama.cpp/convert_hf_to_gguf.py", line 734, in get_vocab_base_pre
    raise NotImplementedError("BPE pre-tokenizer was not recognized - update get_vocab_base_pre()")
NotImplementedError: BPE pre-tokenizer was not recognized - update get_vocab_base_pre()