
SimNPO implementation question #123

Open
@Anya-wUw

Description


Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task
  • My own task or dataset (give details below)

Reproduction

Paper's Formula: SimNPO Loss
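(Reproducing the loss here as I read it from the paper, where gamma is the reward margin, delta in the code, and |y| is the number of target tokens:)

\mathcal{L}_{\text{SimNPO}}(\theta) = -\frac{2}{\beta}\,\mathbb{E}_{(x,y)\in\mathcal{D}_f}\left[\log\sigma\!\left(-\frac{\beta}{|y|}\log\pi_\theta(y\mid x)-\gamma\right)\right]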

Full Function (open-unlearning/src/trainer/unlearn/simnpo.py)

class SimNPO(GradDiff):
    def __init__(self, delta=0.0, beta=1.0, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.delta = delta
        self.beta = beta

    def compute_loss(self, model, inputs, return_outputs=False):
        forget_inputs = inputs["forget"]

        forget_labels = forget_inputs["labels"]
        loss_mask = forget_labels != -100
        forget_loss, forget_outputs = compute_batch_nll(model, forget_inputs)
        forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
        forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta

        retain_inputs = inputs["retain"]
        retain_inputs = {
            "input_ids": retain_inputs["input_ids"],
            "attention_mask": retain_inputs["attention_mask"],
            "labels": retain_inputs["labels"],
        }
        retain_loss = self.compute_retain_loss(model=model, retain_inputs=retain_inputs)

        loss = self.gamma * forget_loss + self.alpha * retain_loss
        return (loss, forget_outputs) if return_outputs else loss

❓ Place of doubt:

forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta
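If I'm reading it right (assuming compute_batch_nll returns the per-sequence summed NLL, i.e. NLL(y|x) = -log pi_theta(y|x)), these two lines compute

-\frac{2}{\beta}\,\operatorname{mean}\!\left[\log\sigma\!\left(\beta\left(\frac{\mathrm{NLL}(y\mid x)}{|y|}-\delta\right)\right)\right] = -\frac{2}{\beta}\,\operatorname{mean}\!\left[\log\sigma\!\left(-\frac{\beta}{|y|}\log\pi_\theta(y\mid x)-\beta\delta\right)\right]

so the margin term enters the sigmoid as beta*delta rather than as gamma in the paper's formula.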

🛠️ My correction:

forget_loss = - self.beta * (forget_loss / loss_mask.sum(-1)) - self.delta
forget_loss = -F.logsigmoid(forget_loss).mean() * 2 / self.beta
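For illustration, a small standalone comparison of the two variants (toy NLL values in place of the compute_batch_nll output; beta and delta are just example settings, not the repo defaults):

import torch
import torch.nn.functional as F

beta, delta = 2.0, 0.0
nll = torch.tensor([12.0, 30.0])   # stand-in for per-sequence summed NLL from compute_batch_nll
n_tok = torch.tensor([6.0, 10.0])  # stand-in for loss_mask.sum(-1)

# current implementation
x_repo = nll / n_tok - delta
loss_repo = -F.logsigmoid(beta * x_repo).mean() * 2 / beta

# my proposed variant
x_prop = -beta * (nll / n_tok) - delta
loss_prop = -F.logsigmoid(x_prop).mean() * 2 / beta

print(loss_repo.item(), loss_prop.item())  # the two values differ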

Expected behavior

Hello! Thanks for your work on the repository; it's great and very valuable!
I ran into some confusion with SimNPO: when I tested unlearning on my dataset (https://huggingface.co/datasets/AnniBorri/PopQA_forget_splits), the algorithm kept diverging no matter which parameters I adjusted, and the ROUGE-L metric on the forget set was higher after unlearning than before it.
I looked at the paper and it seems to me that there are slight discrepancies between the code and the formula. Is the implementation of SimNPO correct? Please correct me if I'm wrong.
Thank you!
