EAMxx: add pytorch ML emulator test for cld_fraction #7568
base: master
Conversation
- Add (empty) init method to the module
- Use try/catch blocks when calling methods (for debugging)
Note: the emulator is NOT good, it just showcases the capability
Looks good, I will review this carefully in a second. I added @ndkeen for awareness (no need to review, but ofc welcome to do so)
this is "data" (not code) and as such likely doesn't belong here at all (let's either throw it on the inputdata repo, or we can create a dedicated public place on github for toy models that we can just wget)
I thought about the inputdata server. But I thought that a) it's a relatively small file (<100k), and b) I want to wait until the feature is "stable" before starting to put stuff on the data server. I feel that once data is on the server, it's doomed to stay there (as folks could check out older versions of master). I'd like to give the feature/test some "probation" time...
model = CldFracNet(nlevs, nlevs)
model.load_state_dict(torch.load(model_file, map_location=torch.device('cpu')))
in the design (which we can revisit), I was thinking of a few things:
- We actually don't need the model defined above fwiw; we can just save it along with the weights and instantiate it without the class code above
- I was hoping we would have options for users to run different versions, say:
  - option 1: run the c++ stuff we have by default
  - option 2: run regular python stuff to just reproduce the c++ (since we have that)
  - option 3: run the pytorch python stuff (which is being added in this PR)
- I would do some guarding (`try ... except` type of stuff) to help give informative errors/messages to users (maybe this should also be done below)
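The guarding suggested above could look something like the sketch below. This is a hypothetical illustration, not the eamxx implementation: the function name `load_emulator` and the error messages are made up, and only the `torch.load(..., map_location=...)` call mirrors what the test actually does.

```python
def load_emulator(model_file):
    """Load emulator weights, converting low-level failures into
    informative errors for the user. Illustrative sketch only."""
    try:
        # guard the import: a missing torch module is a common setup error
        import torch
    except ImportError as exc:
        raise RuntimeError(
            "the pytorch emulator requires the 'torch' module; "
            "please install it in the python environment used at runtime"
        ) from exc
    try:
        # load the weights on cpu, as in the test code above
        return torch.load(model_file, map_location=torch.device("cpu"))
    except Exception as exc:
        raise RuntimeError(
            f"failed to load emulator weights from '{model_file}': {exc}"
        ) from exc
```

Either failure mode (missing torch, or an unreadable/missing weights file) then surfaces as a single `RuntimeError` with a message pointing at the actual cause, instead of a bare traceback.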
I agree on all counts. In more detail:
- Yes, I agree. It's just that I never used pytorch (/* noob mode on */) and was getting torch errors when loading the full model. It was in the early cycles of the impl, so maybe I fixed the core issue and can use the full model pth file now...
- We do that for this test. There are 3 tests that do precisely that. The `_py` and `_cpp` tests are already verified to be bfb with each other, while the torch one is ofc not bfb with them, but runs on the same input file.
- I do have some try/catch blocks in c++, but maybe there are other places that I need to guard.
Why don't I see any build or runtime mods pointing to a python install?
The python hooks for eamxx require
Edit: of course, I forgot to install the torch module...
Adds a unit test showcasing how to hook up a pytorch emulator to an eamxx atm process.
[BFB]
Notes
Credit to @mahf708 for creating the pytorch model.