INCOMPLETE Corrections for NOACC component type when targeting GPU / OpenACC #110
Wrong observation on issue , 2. is still present :-) Another difference relevant for |
@g5t the only missing element now seems to be to raise
|
Would both a component |
@willend Could you run the new tests on your
|
Pytest output shows a few issues, most if not all related to h5 stuff:
I simply built and installed into my "usual" McStas dev env on the machine, i.e. not a completely "clean" one. Do you want another attempt from a completely fresh environment, @g5t?
Success: Current code is functional for e.g.
One smaller, usability-oriented wish regarding integration with e.g. mcrun: could you add a "CFLAGS" line à la the standard mcstas cogen? Then mcrun will pick up dependencies automatically :-)
(mcstas-3.x-dev/miniconda3) NCrystal_example $ mcstas-antlr -t NCrystal_example.instr -I$MCSTAS
No initialization present?
-----------------------------------------------------------
Generating single GPU kernel or single CPU section layout:
-> SPLIT 10 at component monochromator
-> SPLIT 10 at component powder_sample
-----------------------------------------------------------
Generating GPU/CPU -DFUNNEL layout:
-> GPU kernel from component origin
-> GPU kernel from component source
-> GPU kernel from component mono_arm
Component monochromator is NOACC, CPUONLY=True
->FUNNEL mode enabled, SPLIT within buffer.
-> SPLIT within buffer at component monochromator
Component powder_sample is NOACC, CPUONLY=True
->FUNNEL mode enabled, SPLIT within buffer.
-> SPLIT within buffer at component powder_sample
-----------------------------------------------------------
(mcstas-3.x-dev/miniconda3) NCrystal_example $ mcstas -t NCrystal_example.instr -I$MCSTAS
Info: 'NCrystal_sample' is a contributed component.
-----------------------------------------------------------
Generating single GPU kernel or single CPU section layout:
-> SPLIT N at component monochromator
-> SPLIT N at component powder_sample
-----------------------------------------------------------
Generating GPU/CPU -DFUNNEL layout:
Component monochromator is NOACC, CPUONLY=1
-> FUNNEL mode enabled, SPLIT within buffer.
-> SPLIT within buffer at component monochromator
-> GPU kernel from component mono_out
Component powder_sample is NOACC, CPUONLY=1
-> FUNNEL mode enabled, SPLIT within buffer.
-> SPLIT within buffer at component powder_sample
-> GPU kernel from component powder_pattern_detc
-----------------------------------------------------------
CFLAGS= -Wl,-rpath,CMD(ncrystal-config --show libdir) -Wl,CMD(ncrystal-config --show libpath) -ICMD(ncrystal-config --show includedir) -DFUNNEL
If you add the above |
Your HDF5/h5py problems were fixed by putting everything into a single virtual environment? I know it's not what you're after, but if you copy
(with no C file to worry about re-generating or accidentally modifying) As for the
|
Nope, au contraire: I suspect the issues arose from installing in a "dirty" env. But it could also be something else e.g. missing/unresolved dependencies (and there is no reason to suspect things should not function once "deployed" correctly e.g. via conda-forge).
True, but what I am after is something else: Compatibility with the legacy tools that people are familiar with. 😜
There is a regex match for (a single)
Nope that should be no issue. Resolved result from mcstas-antlr internals should be just fine. |
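For readers following the CFLAGS discussion, here is a minimal sketch of how a wrapper script could pick a single `CFLAGS=` line out of translator output. It assumes the line starts at the beginning of a line of standard output, as in the log above; the function name `extract_cflags` is hypothetical and is not taken from mcrun or mccode-antlr.

```python
import re
from typing import Optional

# Assumption: the translator prints exactly one line beginning with "CFLAGS=".
# [ \t]* (rather than \s*) keeps the match from running onto the next line.
CFLAGS_LINE = re.compile(r"^CFLAGS=[ \t]*(.*)$", re.MULTILINE)


def extract_cflags(translator_stdout: str) -> Optional[str]:
    """Return the text after 'CFLAGS=' on the first matching line, or None."""
    match = CFLAGS_LINE.search(translator_stdout)
    if match is None:
        return None
    return match.group(1).strip()
```

The returned string still contains any unexpanded `CMD(...)` tokens; resolving those is a separate step.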
Found another minor thing:
(If successfully compiled with the |
There should be The |
And agreed |
I've updated the tests to check explicitly for
|
Some weirdness is still going on, but I think I have to conclude it is at my end... CFLAGS from
But is then apparently missing from the final compilation command:
|
Yay! Confirmed to function on my 8-way GPU box! |
Leaving in the translation-time STDOUT CFLAGS= line, because I sometimes find it necessary to compile a translated file by hand, and not needing to expand the special flags + copy & pasting sounds like a good idea.
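When compiling a translated file by hand, the `CMD(...)` tokens in a flags string such as the one quoted earlier need to be replaced by the output of the enclosed command. The following is only a sketch of that substitution, not the code mcrun or mccode-antlr actually uses; `expand_cmd_tokens` and its `run` parameter are hypothetical names, and nested parentheses inside `CMD(...)` are assumed not to occur.

```python
import re
import shlex
import subprocess

# Matches CMD(<command>) where <command> contains no parentheses (assumption).
CMD_TOKEN = re.compile(r"CMD\(([^()]*)\)")


def _run_command(command: str) -> str:
    """Run a command without a shell and return its stripped standard output."""
    result = subprocess.run(
        shlex.split(command), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def expand_cmd_tokens(flags: str, run=_run_command) -> str:
    """Replace every CMD(...) token in flags with the output of its command.

    The run callable is injectable so the expansion can be tried out
    without the real tools (e.g. ncrystal-config) being installed.
    """
    return CMD_TOKEN.sub(lambda m: run(m.group(1)), flags)
```

With `ncrystal-config` on the `PATH`, calling `expand_cmd_tokens("-ICMD(ncrystal-config --show includedir) -DFUNNEL")` would yield a string that can be pasted directly onto a compiler command line.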
#109
@g5t I've tested that this works at least for ISIS_CRISP.instr: I built a patched mccode-antlr, generated the attached C code locally on my Mac, transported it to my 8-way GPU machine and ran it on all the GPUs in parallel there: ISIS_CRISP.c.txt
For a reason I do not yet fully understand, the code generated for NCrystal_example (perhaps because of the 2 instances of NCrystal_sample?) still fails to compile with OpenACC.
NCrystal_example.c.txt