When using the script in the README to fine-tune Llama 2, the training loss collapses to 0 and the eval loss becomes NaN at a seemingly random step during training.
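
To pinpoint the exact step where this happens, a callback like the one below can flag the first occurrence. This is a minimal sketch that assumes the README script trains with the Hugging Face `Trainer`; the `LossWatchdog` class is a hypothetical helper, not part of the script:

```python
import math

from transformers import TrainerCallback


class LossWatchdog(TrainerCallback):
    """Warn the first time the training loss collapses to 0
    or the eval loss turns into NaN."""

    def on_log(self, args, state, control, logs=None, **kwargs):
        logs = logs or {}
        train_loss = logs.get("loss")       # logged during training steps
        eval_loss = logs.get("eval_loss")   # logged after evaluation runs
        if train_loss == 0.0:
            print(f"[step {state.global_step}] training loss hit 0")
        if eval_loss is not None and math.isnan(eval_loss):
            print(f"[step {state.global_step}] eval loss is NaN")
```

Passing it via `Trainer(..., callbacks=[LossWatchdog()])` should show whether the loss collapse and the NaN eval loss appear at the same step or at different ones.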