Issue/17 #36

maxwest-uw · 2024-10-04T21:02:29Z

Change Description

Creates the LoopConfiguration class, which collects the configuration options for the learn_loop function. It adds pre-run validation for these options, plus the ability to write them all to a JSON file that can be read back to rebuild the configuration at runtime.

I tried to make this change in a way that is minimally invasive to the preexisting API. For instance, if learn_loop was previously called like:

learn_loop(
    nloops=1,
    features_method="bazin",
    strategy="RandomSampling",
    path_to_features=output_file,
    output_metrics_file=os.path.join(dir_name,"just_a_name.csv"),
    output_queried_file=os.path.join(dir_name,"just_other_name.csv"),
)

the new API could be

learn_loop(
    LoopConfiguration(
        nloops=1,
        features_method="bazin",
        strategy="RandomSampling",
        path_to_features=output_file,
        output_metrics_file=os.path.join(dir_name,"just_a_name.csv"),
        output_queried_file=os.path.join(dir_name,"just_other_name.csv"),
    )
)

Of course, you can also just instantiate it separately:

lc = LoopConfiguration(
  nloops=1,
  features_method="bazin",
  strategy="RandomSampling",
  path_to_features=output_file,
  output_metrics_file=os.path.join(dir_name,"just_a_name.csv"),
  output_queried_file=os.path.join(dir_name,"just_other_name.csv"),
)
learn_loop(lc)

Or write it to a JSON file and read it back:

lc1 = LoopConfiguration(
  nloops=1,
  features_method="bazin",
  strategy="RandomSampling",
  path_to_features=output_file,
  output_metrics_file=os.path.join(dir_name,"just_a_name.csv"),
  output_queried_file=os.path.join(dir_name,"just_other_name.csv"),
)
lc1.to_json("./config_cache.json")
...
# some lines later...
...
lc2 = LoopConfiguration.from_json("./config_cache.json")
learn_loop(lc2)

resolves #17

My PR includes a link to the issue that I am addressing

Solution Description

Created the LoopConfiguration class and placed all of the config parameters in there. Added some validation steps.
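
For readers who want a picture of the pattern, here is a minimal sketch of a configuration dataclass with pre-run validation and a JSON round trip. It is not the code in this PR: `SketchLoopConfig` is a hypothetical stand-in, only a handful of fields are shown, and the single validation rule is an assumption chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SketchLoopConfig:
    """Hypothetical stand-in for LoopConfiguration; only a few fields are shown."""

    nloops: int
    features_method: str
    strategy: str
    path_to_features: str
    output_metrics_file: str
    output_queried_file: str

    def __post_init__(self):
        # Assumed validation rule: fail at construction time,
        # before any loop iteration has run.
        if self.nloops < 1:
            raise ValueError("nloops must be at least 1")

    def to_json(self, file_path):
        # Serialize every field to a flat JSON object.
        with open(file_path, "w") as fp:
            json.dump(asdict(self), fp)

    @classmethod
    def from_json(cls, file_path):
        # Rebuilding through the constructor re-runs the validation.
        with open(file_path) as fp:
            return cls(**json.load(fp))
```

Because `from_json` goes through the constructor, a hand-edited JSON file with an invalid value is rejected in the same way as a bad keyword argument.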

For the learn_loop.py module, I also passed the LoopConfiguration instance through some of the ancillary functions. My arbitrary cutoff was that if a function originally took more than three of the config parameters, I replaced those parameters with the full config instance and accessed the fields directly inside the function.
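
A rough sketch of what that replacement looks like for one helper. The field names (`photo_ids_to_file`, `SNANA_types`, `photo_ids_froot`) come from the learn_loop.py diff in this PR; the `SketchConfig` class, the `photo_ids_file_name` helper, and the default values are hypothetical and only there to make the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    # Field names are taken from the diff in this PR; the defaults are assumptions.
    photo_ids_to_file: bool = False
    SNANA_types: bool = False
    photo_ids_froot: str = "./photo_ids"


def photo_ids_file_name(config, iteration_step, file_name_suffix=".dat"):
    """Hypothetical helper: return the output file name, or None if nothing is saved."""
    # One config argument replaces the separate flag and prefix parameters.
    if config.photo_ids_to_file or config.SNANA_types:
        return config.photo_ids_froot + "_" + str(iteration_step) + file_name_suffix
    return None
```

The function signature stays short, and adding a new option later only touches the config class rather than every call site.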

While I was going through the refactor, I also took some time to write a few more tests for learn_loop and changed the styling so that multi-line function/object calls have one parameter per line.

Code Quality

I have read the Contribution Guide
My code follows the code style of this project
My code builds (or compiles) cleanly without any errors or warnings
My code contains relevant comments and necessary documentation

New Feature Checklist

I have added or updated the docstrings associated with my feature using the NumPy docstring format
I have updated the tutorial to highlight my new feature (if appropriate)
I have added unit/End-to-End (E2E) test cases to cover my new feature
My change includes a breaking change
- My change includes backwards compatibility and deprecation warnings (if possible)

Documentation Change Checklist

Any updated docstrings use the NumPy docstring format

github-actions · 2024-10-04T21:05:19Z

Before [`0c8d74f`] <v0.1>	After [`7738bfa`]	Ratio	Benchmark (Parameter)
failed	137±1ms	n/a	benchmarks.time_feature_creation
failed	168±1ms	n/a	benchmarks.time_learn_loop('KNN', 'RandomSampling')
failed	168±3ms	n/a	benchmarks.time_learn_loop('KNN', 'UncSampling')
failed	2.68±0.01s	n/a	benchmarks.time_learn_loop('RandomForest', 'RandomSampling')
failed	2.67±0.01s	n/a	benchmarks.time_learn_loop('RandomForest', 'UncSampling')

maxwest-uw · 2024-10-04T21:19:01Z

src/resspect/database.py

@@ -1622,16 +1622,17 @@ def save_metrics(self, loop: int, output_metrics_file: str, epoch: int, batch=1)

        # write to file)
        queried_sample = np.array(self.queried_sample)
-        flag = queried_sample[:,0].astype(int) == epoch
+        if len(queried_sample) > 0:


This is part of a small bug fix that I'm not totally sure about, and I would like a second opinion from someone on the science team. When I was setting up the tests for the learn_loop function, I ran into an issue with the test data where queried_sample was empty, which caused the removed line above to fail. I added a check here, and in one other place in database.py, for an empty list before continuing. That seems OK in tandem with the sum(flag) > 0 check before writing, but I wanted to make sure I wasn't causing unexpected behavior.
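
To make the failure mode concrete, here is a small standalone sketch, not the code from database.py. `epoch_flag` is a hypothetical helper, and returning an empty boolean array in the empty case is an assumption about the desired behavior. An empty list becomes a 1-D array of shape `(0,)`, so the 2-D index `[:, 0]` raises an `IndexError`; the length check avoids that.

```python
import numpy as np


def epoch_flag(queried_sample, epoch):
    """Hypothetical helper: flag the rows whose first column matches the epoch."""
    arr = np.array(queried_sample)
    if len(arr) > 0:
        return arr[:, 0].astype(int) == epoch
    # Assumed fallback: an empty mask, so a later sum(flag) > 0 check is False.
    return np.zeros(0, dtype=bool)


# Without the guard, np.array([])[:, 0] raises IndexError
# because the empty array has only one dimension.
print(epoch_flag([], 3).sum() > 0)
print(epoch_flag([[3, 10], [4, 11]], 3))
```

With an empty mask, the downstream `sum(flag) > 0` check evaluates to false, so in this sketch nothing would be written for that epoch.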

drewoldag · 2024-10-07T17:34:19Z

src/resspect/learn_loop.py

-    if is_save_photoids_to_file or is_save_snana_types:
-        file_name = file_name_prefix + '_' + str(iteration_step) + file_name_suffix
+    if config.photo_ids_to_file or config.SNANA_types:
+        file_name = config.photo_ids_froot + '_' + str(iteration_step) + file_name_suffix


Cute - I didn't notice froot when I first looked at the LoopConfig dataclass.

drewoldag

Overall, this looks like a nice clean up. Thanks for tidying up the function definitions as well - I like the one-parameter-per-line look too :)

maxwest-uw added 3 commits October 3, 2024 15:42

create LoopConfiguration + tests

d3ae572

refactor learn_loop

9f4289b

add ability to write to and from json + tests

f1717e0

fix learn_loop benchmarks

963c63d

maxwest-uw commented Oct 4, 2024

maxwest-uw self-assigned this Oct 4, 2024

maxwest-uw requested a review from drewoldag October 4, 2024 22:14

drewoldag reviewed Oct 7, 2024

drewoldag approved these changes Oct 7, 2024

maxwest-uw merged commit 06c1a25 into main Oct 7, 2024
7 checks passed

maxwest-uw deleted the issue/17 branch October 7, 2024 21:21

maxwest-uw mentioned this pull request Nov 5, 2024

Refactor time_domain_loop with new TimeDomainConfiguration #74

Merged

4 tasks
