Error checking for #1461 (#1462)
base: main
Conversation
Hi @Game4Move78! Thank you for your pull request and welcome to our community.
Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly. If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
We might add an adjustment for this. It should solve the issue; however, it means that the user might specify "I want llambda to be 20" and Nevergrad decides to set llambda to 30.
Nevergrad may ignore user-specified llambda if it is smaller than num_workers
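For illustration only, a minimal sketch of the kind of adjustment being described here; resolve_llambda is a made-up helper name and this is an assumption, not the snippet from the thread:

```python
def resolve_llambda(user_llambda: int, num_workers: int) -> int:
    # Hypothetical helper: never let the DE population be smaller than the
    # number of parallel workers, even if the user asked for fewer.
    return max(user_llambda, num_workers)

assert resolve_llambda(20, 30) == 30  # user asks for 20, Nevergrad uses 30
```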
Your code looks good to me; the problem might be in MixDeterministicRL. I'll investigate. Thanks for your work.
@@ -158,6 +159,8 @@ def _internal_ask_candidate(self) -> p.Parameter:
         self.population[candidate.uid] = candidate
         self._uid_queue.asked.add(candidate.uid)
         return candidate
+    # stop queue wrapping around to lineage waiting for a tell
+    assert self._uid_queue.told, "More untold asks than population size (exceeds num_workers)"
@jrapin you are the expert for self._uid_queue.told (among so many things...), do you validate this assert?
Oh I guess the error is in class Portfolio. Let me propose a fix (fingers crossed :-) ).
> @jrapin you are the expert for self._uid_queue.told (among so many things...), do you validate this assert?

If it helps, my thinking was that there should be a tell preceding every ask after the initialization phase, keeping the told queue non-empty. Even in the worst case, where popsize == num_workers and all workers are evaluating untold points, the worker that beats the others to the tell can use the same point again on the next ask.
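To make the queue dynamics described above concrete, here is a simplified model of an asked/told uid queue; it is an assumption for illustration, not nevergrad's actual UidQueue:

```python
from collections import deque

class ToyUidQueue:
    """Simplified model of the asked/told uid queue discussed in this thread.

    This is an assumption for illustration, not nevergrad's actual UidQueue.
    """

    def __init__(self) -> None:
        self.asked: deque = deque()  # uids handed out and still awaiting a tell
        self.told: deque = deque()   # uids whose evaluation has been reported

    def ask(self) -> str:
        if self.told:
            uid = self.told.popleft()
        else:
            # "Wrapping around": reuse a uid that is still waiting for a tell.
            uid = self.asked.popleft()
        self.asked.append(uid)
        return uid

    def tell(self, uid: str) -> None:
        if uid in self.asked:
            self.asked.remove(uid)
        self.told.append(uid)
```

Under this model, as long as every post-initialization ask is preceded by a tell, ask() always finds a uid in told; the fallback branch is the wrap-around that the assert in the diff above is meant to rule out.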
I've just sent a message to Jeremy, who knows that code better than anyone else and who might not have been close to GitHub recently. Sorry for the delay; your PR is interesting.
I used to be strict about the rule that we should not go beyond num_workers, but I changed my mind a couple of years ago: there are many cases where you don't master all the details of what is happening (e.g. a process dies and you'll never get the result), most of the time the user won't deal with it, and we should be robust to it to simplify use. The code was then supposed to be robust, but visibly there are corner cases :s
I would therefore rather make it robust to this case (would that just take removing duplicates in UidQueue.told? It should be light-speed, so not a problem).
cc @bottler: you seemed to disagree and want the user to strictly conform to the "contract"; maybe we can discuss and adapt depending on whether I change your mind or not ;)
@Game4Move78 as a power user, would you rather it failed explicitly, or be robust to those corner cases? (Why did you happen to ask for more points?)
I followed the hyper-parameter settings of papers that used DE for HPO and set popsize to 20 explicitly without providing num_workers, and thought it would be robust. I then asked for more points and handed them to my own adaptive resource allocation + early stopping implementation, which evaluated HPO choices with multiple budgets and only provided a tell to the NG optimiser when points were either stopped early or allocated the maximum budget.
This would work fine for hundreds of points until it hit the corner case of a point in the told queue that had been deleted from the population. My current workaround is to provide feedback immediately on the minimum budget and then treat all evaluations on higher budgets as unasked points, which works fine for DE.
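For reference, a minimal sketch of such a workaround using nevergrad's public ask/tell interface; the objective train, the budget values, and the Scalar parametrization are placeholders chosen for illustration:

```python
import nevergrad as ng

def train(x: float, budget: int) -> float:
    # Placeholder objective: pretend larger budgets give better estimates.
    return (x - 1.0) ** 2 / budget

param = ng.p.Scalar(lower=-5.0, upper=5.0)
opt = ng.optimizers.DE(parametrization=param, budget=200)

for _ in range(50):
    cand = opt.ask()
    # Immediate feedback on the minimum budget keeps the told queue moving.
    opt.tell(cand, train(*cand.args, budget=1))
    # A later, higher-budget evaluation is reported as an unasked point.
    promoted = opt.parametrization.spawn_child(new_value=cand.value)
    opt.tell(promoted, train(*promoted.args, budget=10))
```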
If you want it less strict (I do too), how about we allow duplicates in told, but at L162 we add:

    while lineage not in self.population:
        lineage = self._uid_queue.ask()

which I believe would toss away those points that were deleted from the population by a better tell that was not asked. Future asks will be biased toward duplicate points. I added a commit that checks for a duplicate tell using absence from the asked queue, although there may be a more intuitive way.
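A rough sketch of where that guard could sit in the ask path, reusing the ToyUidQueue model from the earlier sketch; this paraphrases the idea and is not the actual _DE code:

```python
def next_live_lineage(uid_queue: ToyUidQueue, population: dict) -> str:
    # Paraphrased ask path (an assumption, not the real implementation).
    lineage = uid_queue.ask()
    # Toss away uids whose individual was removed from the population,
    # e.g. replaced through a better tell that was never asked.
    while lineage not in population:
        lineage = uid_queue.ask()
    return lineage
```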
My personal preference, to help users master those details where they can, is to copy Ax's client interface with an abandon_tell. For most optimisers this would just tell a large value, and the BO optimisers might do something different to avoid damaging the model.
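As an illustration only, a hypothetical abandon_tell helper in the spirit of that suggestion; neither nevergrad nor Ax exposes this exact function, and the penalty strategy is an assumption:

```python
import nevergrad as ng

def abandon_tell(opt, candidate, penalty: float = 1e9) -> None:
    # Hypothetical helper, not an existing nevergrad or Ax API: report an
    # abandoned evaluation as a large loss so the optimiser can move on.
    # A model-based optimiser could instead drop the point to protect its model.
    opt.tell(candidate, penalty)

# Usage sketch:
opt = ng.optimizers.DE(parametrization=ng.p.Scalar(), budget=100)
cand = opt.ask()
abandon_tell(opt, cand)  # the evaluation was early-stopped or lost
```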
so ParaPortfolio is not really parallel.
Avoid adding uid to queue twice. This handles both cases:
- More asks than workers (point used twice but added to told queue once)
- Ask without a tell (last worker grabs this point from asked queue)
facebookresearch#1462 (comment)
Reworded comment
@jrapin Any chance of getting this merged 😃?
The line referenced at nevergrad/nevergrad/optimization/utils.py, line 340 (commit 8403d6c), uses UidQueue.asked to check for presence in told, and self._uid_queue.asked is already used directly on many lines in _DE.
I believe this code enforces that, for asked points with the same parent, the lineage will be added to the told queue only once in subsequent tells.
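The referenced snippet is not reproduced above; purely as an illustration of the kind of guard being described, it could look like this on top of the ToyUidQueue sketch from earlier (an assumption, not the code from commit 8403d6c):

```python
class GuardedUidQueue(ToyUidQueue):
    def tell(self, uid: str) -> None:
        # Only move the uid to `told` if it is still pending in `asked`, so
        # repeated tells for the same lineage enter the told queue only once.
        if uid in self.asked:
            self.asked.remove(uid)
            self.told.append(uid)
```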
Types of changes
Motivation and Context / Related issue
#1461
How Has This Been Tested (if it applies)
Checklist