Issue converting models to new format #1419
wiseman-timelord started this conversation in General
Replies: 1 comment · 1 reply
-
You'll want to use a floating point (non-quantized) model as input.
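For example, a rough sketch of the intended workflow (the paths and the convert.py invocation below are illustrative assumptions, not taken from this thread): first produce an f16 GGML file from the original, unquantized weights, then pass that file to quantize.

```
# 1) Produce a floating-point (f16) GGML file from the original model weights.
#    convert.py is llama.cpp's conversion script; the directory and the
#    --outtype flag are assumed here -- check `python convert.py --help`.
python convert.py ./models/13B/ --outtype f16

# 2) Quantize the f16 file, using the same argument order as in the post:
#    quantize <input> <output> <type> [nthreads]
quantize.exe "./models/13B/ggml-model-f16.bin" "./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
```

Feeding an already-quantized .q4_0/.q5_0 file back into quantize fails with the "unsupported for integer quantization" error shown in the log below.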
-
```
quantize.exe "./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin" "./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin" q5_0 21
main: build = 531 (553fd4d)
main: quantizing './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin' to './models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin' as q5_0 using 21 threads
llama.cpp: loading model from ./models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin
llama.cpp: saving model to ./models/13B/RQ-gpt4-x-vicuna-13B.ggml.q5_0.bin
[ 1/ 363] tok_embeddings.weight - 5120 x 32001, type = q5_0, llama_model_quantize: failed to quantize: type q5_0 unsupported for integer quantization
main: failed to quantize model from './models/13B/gpt4-x-vicuna-13B.ggml.q5_0.bin'
```
The same thing happens with the q5_1 version. I also tried "wizard-vicuna-13B.ggml.q4_0.bin" with the "q4_0" setting, "ggml-vic13b-q5_1.bin" with "q5_1", and "vicuna-13b-free-q4_0.bin" with "q4_0"; it keeps happening.
It creates files 423 KB in size and then gives up. I have 60 GB free on the drive, so it can't be a disk space issue.