
SimNPO implementation question #123

Open
@Anya-wUw

Description


Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task
  • My own task or dataset (give details below)

Reproduction

Paper's Formula: SimNPO Loss
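(Reproducing the loss here as I read it from the paper, where gamma is the reward margin, delta in the code, and |y| is the number of target tokens:)

\mathcal{L}_{\text{SimNPO}}(\theta) = -\frac{2}{\beta}\,\mathbb{E}_{(x,y)\in\mathcal{D}_f}\left[\log\sigma\!\left(-\frac{\beta}{|y|}\log\pi_\theta(y\mid x)-\gamma\right)\right]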

Full Function (open-unlearning/src/trainer/unlearn/simnpo.py)

class SimNPO(GradDiff):
    def __init__(self, delta=0.0, beta=1.0, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.delta = delta
        self.beta = beta

    def compute_loss(self, model, inputs, return_outputs=False):
        forget_inputs = inputs["forget"]

        forget_labels = forget_inputs["labels"]
        loss_mask = forget_labels != -100
        forget_loss, forget_outputs = compute_batch_nll(model, forget_inputs)
        forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
        forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta

        retain_inputs = inputs["retain"]
        retain_inputs = {
            "input_ids": retain_inputs["input_ids"],
            "attention_mask": retain_inputs["attention_mask"],
            "labels": retain_inputs["labels"],
        }
        retain_loss = self.compute_retain_loss(model=model, retain_inputs=retain_inputs)

        loss = self.gamma * forget_loss + self.alpha * retain_loss
        return (loss, forget_outputs) if return_outputs else loss

❓ Place of doubt:

forget_loss = forget_loss / loss_mask.sum(-1) - self.delta
forget_loss = -F.logsigmoid(self.beta * forget_loss).mean() * 2 / self.beta
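If I'm reading it right (assuming compute_batch_nll returns the per-sequence summed NLL, i.e. NLL(y|x) = -log pi_theta(y|x)), these two lines compute

-\frac{2}{\beta}\,\operatorname{mean}\!\left[\log\sigma\!\left(\beta\left(\frac{\mathrm{NLL}(y\mid x)}{|y|}-\delta\right)\right)\right] = -\frac{2}{\beta}\,\operatorname{mean}\!\left[\log\sigma\!\left(-\frac{\beta}{|y|}\log\pi_\theta(y\mid x)-\beta\delta\right)\right]

so the margin term enters the sigmoid as beta*delta rather than as gamma in the paper's formula.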

🛠️ My correction:

forget_loss = - self.beta * (forget_loss / loss_mask.sum(-1)) - self.delta
forget_loss = -F.logsigmoid(forget_loss).mean() * 2 / self.beta
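For illustration, a small standalone comparison of the two variants (toy NLL values in place of the compute_batch_nll output; beta and delta are just example settings, not the repo defaults):

import torch
import torch.nn.functional as F

beta, delta = 2.0, 0.0
nll = torch.tensor([12.0, 30.0])   # stand-in for per-sequence summed NLL from compute_batch_nll
n_tok = torch.tensor([6.0, 10.0])  # stand-in for loss_mask.sum(-1)

# current implementation
x_repo = nll / n_tok - delta
loss_repo = -F.logsigmoid(beta * x_repo).mean() * 2 / beta

# my proposed variant
x_prop = -beta * (nll / n_tok) - delta
loss_prop = -F.logsigmoid(x_prop).mean() * 2 / beta

print(loss_repo.item(), loss_prop.item())  # the two values differ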

Expected behavior

Hello! Thanks for your work on the repository; it's great and very valuable!
I ran into some confusion with SimNPO: when I tested unlearning on my dataset (https://huggingface.co/datasets/AnniBorri/PopQA_forget_splits), the algorithm kept diverging no matter which parameters I adjusted, and the ROUGE-L metric on the forget set was higher after unlearning than before it.
I looked at the paper and it seems to me that there are slight discrepancies between the code and the formula. Is the implementation of SimNPO correct? Please correct me if I'm wrong.
Thank you!
