
Evaluation returns an extremely large value when quantizing to 4-bit #105

Open
@JiachuanDENG

Description


I followed the steps to get a 4-bit version of llama-7b by running `python -m llama.llama_quant decapoda-research/llama-7b-hf c4 --wbits 4 --groupsize 128 --save pyllama-7B4b.pt`. The script runs without errors, but at the evaluation stage it reports a very large number: 251086.96875.

[Screenshot 2023-06-07 13:45: evaluation output showing the value 251086.96875]
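
For reference, by "evaluating" I mean a perplexity check along these lines. This is a minimal sketch using the HuggingFace `transformers` API on the fp16 checkpoint, not pyllama's actual eval loop, and the sample text is arbitrary:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Minimal perplexity sanity check -- a sketch, not pyllama's eval code.
# A healthy llama-7b should land at roughly single-digit perplexity on c4,
# so 251086.96875 suggests the quantized weights are effectively random.
name = "decapoda-research/llama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")
model.eval()

text = "The quick brown fox jumps over the lazy dog."
ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    loss = model(ids, labels=ids).loss  # mean token cross-entropy
print(torch.exp(loss).item())           # perplexity = exp(mean NLL)
```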

And when I test with the quantized .pt file, the model returns unreadable results.

[Screenshot 2023-06-07 13:50: garbled generation output from the quantized model]
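
For comparison, this is the kind of greedy-generation check I mean. A sketch assuming the HF-style API; the prompt is arbitrary, and I have left out how the 4-bit .pt is loaded back, since that part is pyllama-specific:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Greedy-decode sanity check. The quantized model would be swapped in
# here; the fp16 checkpoint is shown only because pyllama's .pt loading
# path isn't reproduced in this sketch.
name = "decapoda-research/llama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16, device_map="auto")

prompt = "The capital of France is"
ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
out = model.generate(ids, max_new_tokens=16, do_sample=False)  # deterministic decode
print(tokenizer.decode(out[0], skip_special_tokens=True))
```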

Has anyone run into the same problem?
