
Conversation

jan-schuchardt commented Nov 13, 2024

Hi,

this is a draft pull request related to issue #274.

I have implemented instances of AdditiveNoisePrivacyLoss whose privacy loss distribution is dominated by a pair of mixture distributions. The class hierarchy looks as follows:

DoubleMixturePrivacyLoss
├── DoubleMixtureLaplacePrivacyLoss
└── DoubleMixtureGaussianPrivacyLoss
    └── MixtureGaussianPrivacyLoss

I have kept the existing MixtureGaussianPrivacyLoss as a special case of DoubleMixtureGaussianPrivacyLoss, both for backwards compatibility and because it has certain optimizations that are not implemented in its superclasses.
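For intuition, here is a minimal, self-contained sketch of the privacy loss for a pair of Gaussian mixtures (toy helper functions for illustration only, not the actual class API):

```python
import numpy as np
from scipy.stats import norm


def log_mixture_pdf(x, means, weights, sigma):
  """Log-density of a Gaussian mixture with shared standard deviation."""
  # Log-sum-exp over the components for numerical stability.
  log_terms = [np.log(w) + norm.logpdf(x, loc=m, scale=sigma)
               for m, w in zip(means, weights)]
  return np.logaddexp.reduce(log_terms)


def privacy_loss(x, means_p, weights_p, means_q, weights_q, sigma):
  """Privacy loss ln(P(x) / Q(x)) for a pair of Gaussian mixtures P, Q."""
  return (log_mixture_pdf(x, means_p, weights_p, sigma)
          - log_mixture_pdf(x, means_q, weights_q, sigma))


# A mixture vs. a single zero-mean component: the special case covered by
# the existing MixtureGaussianPrivacyLoss.
print(privacy_loss(0.5,
                   means_p=[0.0, 1.0], weights_p=[0.9, 0.1],
                   means_q=[0.0], weights_q=[1.0],
                   sigma=1.0))
```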

Specifically, the DoubleMixtureXYZ classes are less optimized in the following sense:

  • No caching of constant terms in the privacy loss
  • No specialized heuristic for finding binary search bounds

These optimizations cannot be trivially generalized to the superclasses, because they rely on the (inverse) likelihood ratio decomposing into a sum of single-distribution ratios. This is not the case when we have two mixtures, as sketched below.
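To spell this out for the Gaussian case (a sketch assuming shared variance, with the Laplace case analogous): when the lower distribution is a single zero-mean Gaussian, the likelihood ratio decomposes as

```math
\frac{P(x)}{Q(x)} = \frac{\sum_i w_i\,\mathcal{N}(x;\mu_i,\sigma^2)}{\mathcal{N}(x;0,\sigma^2)} = \sum_i w_i \exp\left(\frac{2\mu_i x - \mu_i^2}{2\sigma^2}\right),
```

so each summand has a constant term $-\mu_i^2/(2\sigma^2)$ that can be cached, and per-component bounds give cheap binary search brackets. With two mixtures we instead get

```math
\frac{P(x)}{Q(x)} = \frac{\sum_i w_i \exp\left(-\frac{(x-\mu_i)^2}{2\sigma^2}\right)}{\sum_j v_j \exp\left(-\frac{(x-\nu_j)^2}{2\sigma^2}\right)},
```

which is not a sum of single-distribution ratios, and neither is its inverse.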

Before I create a final pull request, it would be great if we could discuss the following questions:

  • Can we live with the above optimization-related issues? (The implementation was still sufficiently fast for use with DP-SGD in my last project).
  • The DoubleMixtureXYZ classes are currently only being tested via tests for MixtureGaussianPrivacyLoss. Do we need separate tests for each class?

arung54 (Contributor) commented Nov 20, 2024

Hi Jan, sorry for the delayed response. Wanted to follow up on this:

> Can we live with the above optimization-related issues? (The implementation was still sufficiently fast for use with DP-SGD in my last project).

I think it is okay for now, and if someone requests we can work on optimizing later (either internally, or with your help).

> The DoubleMixtureXYZ classes are currently only being tested via tests for MixtureGaussianPrivacyLoss. Do we need separate tests for each class?

It would be good to make sure any behavior specific to these classes is being tested. If you feel this is a lot of added work, we can potentially have you submit the current version, and I can make a follow-up change that adds more robust tests.

In addition, there are some upcoming changes to the structure of privacy_loss_mechanism that might affect the PR (the amount of work needed to correct the PR for these changes should be pretty small). So it may be best to wait until those changes are out to make a final pull request :)

jan-schuchardt (Author) commented Nov 22, 2024

> So it may be best to wait until those changes are out to make a final pull request :)

All right, let's do that.

The existing tests do not look too complicated, so I can also take care of those.

pritkamath (Contributor) commented Nov 29, 2024

Hi Jan.

The changes alluded to by Arun have been pushed now (see changes to privacy_loss_mechanism.py in this commit).

Basically, your PR made us realize that it would be more natural to introduce a class MonotonePrivacyLoss as a common abstraction of AdditiveNoisePrivacyLoss and MixtureGaussianPrivacyLoss.
Please sync your PR with those changes. Thank you, and sorry about the extra trouble.
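To illustrate the idea behind the abstraction (a hypothetical sketch; the method names and signatures here are assumptions for illustration, not the actual library code): the shared property is that the privacy loss is monotone in the output value, which is all that binary-search-based inversion needs.

```python
import abc


class MonotonePrivacyLoss(abc.ABC):
  """A privacy loss ln(P(x)/Q(x)) assumed to be non-increasing in x."""

  @abc.abstractmethod
  def privacy_loss(self, x: float) -> float:
    """Returns the privacy loss at outcome x."""

  def inverse_privacy_loss(self, loss: float, lower: float, upper: float,
                           tolerance: float = 1e-9) -> float:
    """Binary search for an x whose privacy loss is close to `loss`.

    This works for any subclass purely because the loss is monotone,
    which is what AdditiveNoisePrivacyLoss, MixtureGaussianPrivacyLoss
    and the new DoubleMixture classes have in common.
    """
    while upper - lower > tolerance:
      mid = 0.5 * (lower + upper)
      if self.privacy_loss(mid) > loss:
        lower = mid  # Loss is non-increasing, so the target lies to the right.
      else:
        upper = mid
    return 0.5 * (lower + upper)
```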

jan-schuchardt (Author) commented
Ok, great. I'm on vacation / conference travel for the next couple of weeks, but I'll look into doing the rebase after that.

jan-schuchardt marked this pull request as ready for review on February 20, 2025, 13:21
jan-schuchardt (Author) commented Feb 20, 2025

Hi!

I've been somewhat busy, but I've now gotten around to integrating the proposed changes into the new class hierarchy:

MonotonePrivacyLoss
└─ DoubleMixturePrivacyLoss
   ├─ DoubleMixtureGaussianPrivacyLoss
   │  └─ MixtureGaussianPrivacyLoss
   └─ DoubleMixtureLaplacePrivacyLoss

I've also added unit tests to ensure that

  • We recover the behavior of MixtureGaussianPrivacyLoss or LaplacePrivacyLoss when one of the two mixtures has a single component with mean 0
  • We get the correct privacy_loss and delta_for_epsilon when both mixtures have multiple components

The reference values for delta_for_epsilon are computed via scipy.integrate.quad; a sketch of the approach is below.
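That is, by numerically integrating the hockey-stick divergence. A minimal sketch of such a reference computation (with illustrative parameters, not the ones used in the actual tests):

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm


def mixture_pdf(x, means, weights, sigma):
  """Density of a Gaussian mixture with shared standard deviation."""
  return sum(w * norm.pdf(x, loc=m, scale=sigma)
             for m, w in zip(means, weights))


def reference_delta(epsilon, means_p, weights_p, means_q, weights_q, sigma):
  """Hockey-stick divergence: the integral of max(P(x) - e^eps * Q(x), 0)."""
  def integrand(x):
    p = mixture_pdf(x, means_p, weights_p, sigma)
    q = mixture_pdf(x, means_q, weights_q, sigma)
    return max(p - np.exp(epsilon) * q, 0.0)
  value, _ = integrate.quad(integrand, -np.inf, np.inf, limit=200)
  return value


# Both dominating distributions have multiple components.
print(reference_delta(1.0,
                      means_p=[0.0, 1.0], weights_p=[0.7, 0.3],
                      means_q=[0.0, -1.0], weights_q=[0.5, 0.5],
                      sigma=1.0))
```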

It would be great if you could have another look at my PR.

arung54 (Contributor) commented Feb 21, 2025

Thanks Jan for making the changes! I am on vacation right now but will review the PR once I'm back.

arung54 (Contributor) left a comment

Hi Jan, overall the changes look great! Most of these comments catch typos or offer suggestions on style.

jan-schuchardt (Author) commented Mar 11, 2025

Thank you! I've set all comments with trivial fixes (typos etc.) to resolved.

There are five open comments. It would be nice if you could have another look at those and let me know if everything looks ok to you, or if we should make any additional changes.

arung54 (Contributor) commented Mar 17, 2025

Hi Jan,

Thanks for your responses! At this point everything LGTM. I'll take a second pass to double-check, and I need to do a bit of work on my end (some internal permissions, nothing you need to worry about) before I can merge the PR, but it should be ready to submit.

