Issue converting models to new format #1419
wiseman-timelord started this conversation in General
Replies: 1 comment · 1 reply
-
You'll want to use a floating point (non-quantized) model as input.
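For example, a rough sketch of the intended workflow (the paths and the convert.py invocation below are illustrative assumptions, not taken from this thread): first produce an f16 GGML file from the original, unquantized weights, then pass that file to quantize.

```
# 1) Produce a floating-point (f16) GGML file from the original model weights.
#    convert.py is llama.cpp's conversion script; the directory and the
#    --outtype flag are assumed here -- check `python convert.py --help`.
python convert.py ./models/13B/ --outtype f16

# 2) Quantize the f16 file, using the same argument order as in the post:
#    quantize <input> <output> <type> [nthreads]
quantize.exe "./models/13B/ggml-model-f16.bin" "./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
```

Feeding an already-quantized .q4_0/.q5_0 file back into quantize fails with the "unsupported for integer quantization" error shown in the log below.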
-
```
quantize.exe "./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin" "./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
main: build = 531 (553fd4d)
main: quantizing './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin' to './models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin' as q5_0 using 21 threads
llama.cpp: loading model from ./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin
llama.cpp: saving model to ./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin
[ 1/ 363] tok_embeddings.weight - 5120 x 32001, type = q5_0, llama_model_quantize: failed to quantize: type q5_0 unsupported for integer quantization
main: failed to quantize model from './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin'
```
The same thing happens with the q5_1 version. I also tried "wizard-vicuna-13B.ggml.q4_0.bin" with the "q4_0" setting, "ggml-vic13b-q5_1.bin" with "q5_1", and "vicuna-13b-free-q4_0.bin" with "q4_0"; it keeps happening.
It creates files 423 KB in size and then gives up. I have 60 GB free on the drive, so it can't be a disk space issue.