Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task
- My own task or dataset (give details below)
Reproduction
Full function (`open-unlearning/src/trainer/unlearn/simnpo.py`):
```python
class SimNPO(GradDiff):
    def __init__(self, delta=0.0, beta=1.0, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.delta = delta
        self.beta = beta

    def compute_loss(self, model, inputs, return_outputs=False):
        forget_inputs = inputs["forget"]
        forget_labels = forget_inputs["labels"]
        loss_mask = forget_labels != -100

        forget_loss, forget_outputs = compute_batch_nll(model, forget_inputs)
        forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
        forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta

        retain_inputs = inputs["retain"]
        retain_inputs = {
            "input_ids": retain_inputs["input_ids"],
            "attention_mask": retain_inputs["attention_mask"],
            "labels": retain_inputs["labels"],
        }
        retain_loss = self.compute_retain_loss(model=model, retain_inputs=retain_inputs)

        loss = self.gamma * forget_loss + self.alpha * retain_loss
        return (loss, forget_outputs) if return_outputs else loss
```
❓ Place of doubt:

```python
forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta
```
🛠️ My correction:

```python
forget_loss = -self.beta * (forget_loss / loss_mask.sum(-1)) - self.delta
forget_loss = -F.logsigmoid(forget_loss).mean() * 2 / self.beta
```
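To make the comparison concrete, here is a minimal, self-contained sketch (not from the repo; the toy tensors and the `beta`/`delta` values are made up for illustration) that evaluates the batch forget loss under both variants:

```python
# Minimal sketch comparing the two forget-loss variants discussed above.
# toy_nll stands in for the summed per-sample NLL returned by compute_batch_nll,
# toy_lengths for loss_mask.sum(-1); the beta/delta values are arbitrary.
import torch
import torch.nn.functional as F

beta, delta = 2.5, 0.0
toy_nll = torch.tensor([12.0, 30.0, 7.5])      # summed NLL per forget sample (hypothetical)
toy_lengths = torch.tensor([10.0, 24.0, 5.0])  # number of unmasked label tokens per sample

# current implementation: sigmoid argument is beta * (NLL / |y| - delta)
current_loss = -F.logsigmoid(beta * (toy_nll / toy_lengths - delta)).mean() * 2 / beta

# variant proposed above: sigmoid argument is -beta * (NLL / |y|) - delta
proposed_loss = -F.logsigmoid(-beta * (toy_nll / toy_lengths) - delta).mean() * 2 / beta

print(f"current: {current_loss.item():.4f}, proposed: {proposed_loss.item():.4f}")
```

The two variants move in opposite directions as the length-normalized forget NLL grows, so the difference is not just a reparameterization of `delta`.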
Expected behavior
Hello! Thanks for your work on the repository; it's great and very valuable!
I ran into some confusion with SimNPO. When I tested unlearning on my own dataset (https://huggingface.co/datasets/AnniBorri/PopQA_forget_splits), the SimNPO algorithm kept diverging no matter which hyperparameters I adjusted: the ROUGE-L metric on the forget set was higher after unlearning than before it.
I looked at the paper, and it seems to me that there is a slight discrepancy between the code and the formula. Is the SimNPO implementation correct? Please correct me if I'm wrong.
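For reference, this is the forget-side loss as I read it in the SimNPO paper (I am quoting from memory, so please double-check the exact form there; $\gamma$ is the paper's reward margin and $|y|$ the response length):

$$
\mathcal{L}_{\mathrm{SimNPO}}(\theta) = -\frac{2}{\beta}\,\mathbb{E}_{(x,\,y)\in\mathcal{D}_f}\left[\log\sigma\!\left(-\frac{\beta}{|y|}\log\pi_\theta(y\mid x)-\gamma\right)\right]
$$

I may simply be mis-mapping the code's `delta` and `beta` onto the paper's $\gamma$ and $\beta$, so apologies if that is the case.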
Thank you!