Skip to content

Conversation

jlnav
Copy link
Member

@jlnav jlnav commented May 9, 2024

See checklist in #1307 (review)

@jlnav jlnav marked this pull request as ready for review May 16, 2024 20:06
@codecov
Copy link

codecov bot commented May 16, 2024

Codecov Report

❌ Patch coverage is 87.75510% with 54 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.75%. Comparing base (36ea32c) to head (a383dc0).
⚠️ Report is 26 commits behind head on develop.

Files with missing lines Patch % Lines
libensemble/generators.py 80.00% 16 Missing and 6 partials ⚠️
libensemble/gen_classes/aposmm.py 82.66% 4 Missing and 9 partials ⚠️
libensemble/utils/runners.py 92.22% 4 Missing and 3 partials ⚠️
libensemble/sim_funcs/borehole_kills.py 0.00% 5 Missing ⚠️
libensemble/utils/misc.py 95.65% 2 Missing and 3 partials ⚠️
libensemble/gen_funcs/aposmm_localopt_support.py 50.00% 1 Missing ⚠️
libensemble/libE.py 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1307      +/-   ##
===========================================
+ Coverage    78.23%   78.75%   +0.52%     
===========================================
  Files           76       79       +3     
  Lines         7561     7983     +422     
  Branches      1116     1195      +79     
===========================================
+ Hits          5915     6287     +372     
- Misses        1447     1477      +30     
- Partials       199      219      +20     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

}

my_APOSMM = APOSMM(gen_specs)
my_APOSMM.setup()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be separate to constructor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you've seen already, but setup() sets attributes that can't be pickled, but need to be done anyway. So I believe a separate generator.setup() needs to exist.

But perhaps for outside-of-libE purposes, to save a line, this could resemble: my_APOSMM = APOSMM(gen_specs, setup=True).

@shuds13
Copy link
Member

shuds13 commented May 30, 2024

@jlnav

Currently this has two existing user functions refactored to ask/tell, and it is a breaking change for codes using those gens. I don't think this is what we want. So, for now, we could supply these as duplicates (keeping the original), or remove them. Alternative would be to refactor all appropriate gens to ask/tell, but that would hold this up, so I think it may be better to supply these two as a duplicate for now.

Also, would it be better for these ask/tell gens (rand sample and gpCAM) to use AskTellGenRunner and not use the wrapper gen_f (given a breaking change anyway).

Issues we need to address:

  • cleanup/resolve ask/tell refactoring - clear separation of gens - see branch Making new gpCAM gen class #1316
  • Alt. could be to make old imports use the ask/tell gens?
  • re-test runs with libE and Optimas that previously failed with threads
  • compare running with wrapper and ask/tell runner - compare results
    - results match. ask/tell runner about 10% slower than wrapper for randsample test.
  • CI passing
  • Parameters in wrapper were rearranged (???) - requires discussion (for now changing on Making new gpCAM gen class #1316) - deciding this is another reason why we may not want to replace the old gen_f's yet.
  • test_1d_asktell_gen.py is not 1d, its 2d - changing on Making new gpCAM gen class #1316
  • are we now getting reproducible/matching random numbers in APOSMM ask/tell test - check seeding (inc. when running via Optimas).
  • Need assert on minima found in aposmm test - if losing precision - why.
  • check comparison using diff alloc for APOSMM - including initial sample processing.

@jlnav
Copy link
Member Author

jlnav commented May 30, 2024

@jlnav

Currently this has two existing user functions refactored to ask/tell, and it is a breaking change for codes using those gens. I don't think this is what we want. So, for now, we could supply these as duplicates (keeping the original), or remove them. Alternative would be to refactor all appropriate gens to ask/tell, but that would hold this up, so I think it may be better to supply these two as a duplicate for now.

Sounds good. Maybe we could raise DeprecationWarnings?

Also, would it be better for these ask/tell gens (rand sample and gpCAM) to use AskTellGenRunner and not use the wrapper gen_f?

I prefer the AskTellGenRunner myself (I did develop it), but if all the gens move to my runner, then what would you want to do with your wrapper?

@shuds13
Copy link
Member

shuds13 commented May 30, 2024

I think there is also some opportunity for inheritence with the gpCAM class. And gp_cam_simple is more complicated than gp_cam_asktell. So, if anything the latter would be the base class, and renamed to avoid confusion. See #1316

# ==================== Local version ===============================


def _retrieve_generator(gen_specs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why this is needed. If gen is on a thread, should not need to be pickled.

Copy link
Member

@shuds13 shuds13 Jun 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed on my branch for now, so don't change here

@jlnav
Copy link
Member Author

jlnav commented Jun 5, 2024

Checking random sample when running without Optimas:

>>> import numpy as np
>>> normal = np.load("persistent_aposmm_nlopt_history_length=2005_evals=2000_workers=4.npy")
>>> asktell = np.load("persistent_aposmm_nlopt_asktell_history_length=2002_evals=2000_workers=3.npy")
>>> all([i in asktell["x"][:100] for i in normal["x"][:100]])
True

@jlnav
Copy link
Member Author

jlnav commented Jun 5, 2024

With Optimas, currently looks like random sample doesn't match, even when trying to account for seed

@jlnav
Copy link
Member Author

jlnav commented Jun 6, 2024

Precision fixed in the Optimas example: optimas-org/optimas@08a835b

@shuds13
Copy link
Member

shuds13 commented Jul 17, 2024

We may need to update gpCAM gens for latest gpCAM release. Make sure any changes made in gen_f is reflected here before pulling in.

  • Check gpCAM updates

@jlnav jlnav requested a review from shuds13 July 30, 2024 22:00
self.all_y = np.vstack((self.all_y, self.y_new))

if self.my_gp is None:
self.my_gp = GP(self.all_x, self.all_y, noise_variances=self.noise * np.ones(len(self.all_y)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to update for new gpCAM interface

"""

def __init__(self, _, persis_info, gen_specs, libE_info=None) -> list:
# self.H = H
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove I guess.

self.all_y = np.empty((0, 1))
np.random.seed(0)

def __init__(self, H, persis_info, gen_specs, libE_info=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will put above _initialize_gpcAM

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and make _initialize_gpcAM _initialize_gpCAM

self.all_x = np.empty((0, self.n))
self.all_y = np.empty((0, 1))
np.random.seed(0)

Copy link
Member

@shuds13 shuds13 Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to decide __init__ interface.
We questioned before whether we keep the same interface - which mirrors the current gen_f, or to rearrange, as H is often not given (basically an H0). So it could be gen_specs first. I'm leaning towards keeping the original ordering as it mirrors our user functions, but this should be discussed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. My opinion/intuition is a user is more likely to prefer either "classical" gens (e.g. Jeff) or ask/tell gens (e.g. other CAMPA folks). With these gens' interfaces and users being so different, I don't think an arguably simpler rearrangement of the input parameters is too confusing.

Similarly to how some people prefer numpy or pandas; they do similar things, but their interfaces being different isn't a point of contention.

I'd also lean towards if someone were to initialize some object, like a gen, themselves, they'd prefer their specifications be provided as early and clearly as possible:

my_gen = Generator(param=1,
     option="two",
)

vs.

my_gen = Generator(None,
     {},
     {"param": 1, "option": "two"},
     {},
)

Copy link
Member

@shuds13 shuds13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we pull this in, are we treating this as provisional, without documenting gen_classes, so interface (e.g. __init__ option ordering) is mutable. Or are we ready to say a user can use the gen_classes interface for their own gens if desired? This questions applies to the numpy interface (ask_np/tell_np). As for the outer API, we certainly have not finalized that, and it should somehow be noted.

E.g.,
# This feature is in beta and its interface may change in future releases

Note, that currently the two implemented classes (rearranged to ask/tell) are duplicates of gen_f, but in the long term we will not want to have duplicates.

Discussions:

  • Agree current status of gen_classes (internal or public).

  • If provisional - mark somehow in code

  • Agree wrapper paradigm (outer v inner interface)

  • Agree naming ask_np / tell_np

    • still not sure
      ask_numpy()

    is better than

      @convert_dict_to_numpy
      ask()

    It seems we need separate function names, and the output of an ask() would be ambiguous.

  • Do we allow passing a class for ask/tell (if a libE generator) so initialize internally.

  • this is currently supported with the wrapper method but not internal ask/tell method

  • I think we should probably not support both wrapper and ask/tell approaches.

  • Agree num_points for ask policy (initial and thereafter):

  • Always empty ask and generator has to decide (most similar to now but what about external gens?)

  • intercept:

    • init: gen_specs['user'] fields initial_batch_size or gen_batch_size and else use num workers
    • after: intercept num points coming in for tell and ask for that many
    • Problem with intercepting gen_specs['user'] is names for batch size may be different.
  • Add fixed gen_specs options gen_specs['initial_batch_size']

    • and maybe something like gen_specs['return_policy'] (or better name)
      • fixed batch size or generate num points got back or let generator decide points - empty ask()
      • (at some point may replace alloc_specs async_return if we want most user not to worry about allocs)?
  • Data conversion - how treat multi-dimensional fields.

  • currently makes a numpy array (as a dictionary value).

  • Agree gpCAM class names (e.g., GP_CAM_Covar) - and match gen_func names (original functions).

  • Naming convention for tests - we have:

    • test_gpCAM_class.py
    • test_asktell_surmise.py
    • test_persistent_surmise_killsims_asktell.py
    • test_persistent_aposmm_nlopt_asktell.py
    • test_sampling_asktell_gen.py
  • Do our gens all support passing in intiail points? And if so, in standard format?

Interfacer:

  • Does it work with executor (consideration of things added to executor in worker.py - including first-tier comm object)
  • Does it work when there is no executor
  • For user function should we insert comm into libE_info["comm"] inside qcomm_main
  • Works with processes and threads - or picked one (and named right in generators.py)
  • Works with fork and spawn (for spawn be clear what in calling script needs to be inside main block).
  • Sometimes get hang at end of runs using the interfacer - esp. if no final_tell, is there something not getting shut-down? Check this works.
  • Need to deal with final_tell when using gen (e.g. optimas)
  • Need to sepaarate final_tell as required by standard - tell and extract data function.
  • Constructor wrapper approach - demonstrate alternatives in notebook

Todo:

  • SH - update gpCAM to reflect current interface.
  • Change/rename original gpCAM to match naming in the class.
  • Need to run 'extra' tests for gpCAM tests to get run. Check passes.
  • Check correct no. points produced each call of gen (usually number sent back).

Check works:

  • Running libE ask/tell generator - e.g. gpCAM through Optimas (it will go via ask/tell wrappers).
  • Use libEnsemble ask/tell generator standalone (outside of libE)
  • must libE be installed to use generators?
  • Running libE with our ask_numpy/tell_numpy functions.
  • Running libE with ask/tell function (standardised format).

@shuds13 shuds13 requested a review from jmlarson1 July 31, 2024 16:47

class Generator(ABC):
"""
v 0.7.2.24
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this version number something we need?

"""

def __init__(
self, gen_specs: dict, History: npt.NDArray = [], persis_info: dict = {}, libE_info: dict = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have a different constructor signature to those functions in gen_classes.

@shuds13
Copy link
Member

shuds13 commented Aug 6, 2024

In the full tests: https://github.com/Libensemble/libensemble/actions/runs/10256911231/job/28376940998#step:22:684

 ---Test 7: test_asktell_surmise.py starting with local on 4 processes 
/home/runner/miniconda3/envs/condaenv/lib/python3.11/site-packages/pydantic/_internal/_config.py:322: UserWarning: Valid config keys have changed in V2:
* 'orm_mode' has been renamed to 'from_attributes'
  warnings.warn(message, UserWarning)
Traceback (most recent call last):
  File "/home/runner/work/libensemble/libensemble/libensemble/tests/regression_tests/test_asktell_surmise.py", line 90, in <module>
    H_out, _a, _b = borehole(list_dicts_to_np(point), {}, sim_specs, {"H_rows": np.array([point["sim_id"]])})
                             ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/work/libensemble/libensemble/libensemble/utils/misc.py", line 102, in list_dicts_to_np
    first = list_dicts[0]
            ~~~~~~~~~~^^^
KeyError: 0
Error: The operation was canceled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants