MPI_T Events #13133
Signed-off-by: Nathan Hjelm <[email protected]> Signed-off-by: Kingshuk Haldar <[email protected]>
…or fixes. Signed-off-by: Chris Chambreau <[email protected]>
dss.h has been removed since this PR was originally opened Signed-off-by: Howard Pritchard <[email protected]>
…cking. Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
…functionality Signed-off-by: Chris Chambreau <[email protected]>
- Correct MPI_T_event_dropped_cb_function arguments - Check for negative event index Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
new profile interface generation method Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
by fixing some code Signed-off-by: Howard Pritchard <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
values. Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Chris Chambreau <[email protected]>
Signed-off-by: Howard Pritchard <[email protected]>
@hppritcha @jsquyres @hjelmn @cchambreau |
Has this PR been rebased on main? The mpi4py failure looks like it needs rebasing.
I rebased onto main, which involved resolving a few conflicts. That may be the reason; I'm checking this.
It looks like the singleton test fails in the mpi4py test. The branch seems up to date with |
here's the traceback:
|
Thank you. Yes, I'm checking whether removing the opal locks during registration helps.
Removing Similar functions such as Could any of you please suggest any plausible effects I need to check for before removing them? It seems to me that they aren't necessary. The alternative is to unlock before calling |
Signed-off-by: Kingshuk Haldar <[email protected]>
Another option would be to use |
There seems to already be an There is a valid reason why pvars do not need to be protected: they can only be created during the init step, and the performance variables stay alive for the entire duration of the application. As long as opal_util_init/fini are called in a protected way, the pvars get the same protection. I don't know what a source event is, so I cannot comment on that.
The problem is the lock is taken first in the call to |
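The hazard under discussion appears to be a lock that is acquired again by a thread that already holds it. That reading is an assumption, since the function names are missing from the thread. The sketch below illustrates it with plain POSIX threads rather than the OPAL locking macros. `registry_lock`, `register_source`, `register_event_nested`, and `register_event_unlock_first` are hypothetical names, not Open MPI symbols. An error-checking mutex is used so that the nested acquisition reports `EDEADLK` instead of hanging.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical registry state; none of these names come from Open MPI. */
static pthread_mutex_t registry_lock;
static int registry_count = 0;

/* Create the lock as an error-checking mutex so that a second lock
 * attempt by the owning thread fails with EDEADLK instead of blocking. */
static int registry_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0) {
        rc = pthread_mutex_init(&registry_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Inner helper that takes the lock itself. */
static int register_source(void)
{
    int rc = pthread_mutex_lock(&registry_lock);
    if (rc != 0) {
        return rc; /* EDEADLK when the caller already holds the lock */
    }
    registry_count++;
    assert(registry_count > 0);
    pthread_mutex_unlock(&registry_lock);
    return 0;
}

/* Problematic pattern: the helper is called while the lock is held. */
static int register_event_nested(void)
{
    int rc = pthread_mutex_lock(&registry_lock);
    if (rc != 0) {
        return rc;
    }
    rc = register_source();
    pthread_mutex_unlock(&registry_lock);
    return rc;
}

/* One alternative raised in the thread: unlock before calling the helper. */
static int register_event_unlock_first(void)
{
    int rc = pthread_mutex_lock(&registry_lock);
    if (rc != 0) {
        return rc;
    }
    /* ... work that needs the lock would go here ... */
    pthread_mutex_unlock(&registry_lock);
    return register_source();
}
```

After `registry_lock_init()`, calling `register_event_nested()` returns `EDEADLK` and leaves the counter untouched, while `register_event_unlock_first()` succeeds. With a default (non-error-checking) mutex the nested variant may simply hang, which would be consistent with a test that never finishes.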
The number of sources can increase during an application run (see section 15.3.8 of the MPI 4.1 standard), so there should probably be locking beyond the source init and finalize methods (which was the only locking in the file in @hjelmn's original PR).
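If sources can be added while other threads query them, both the add path and the lookup path need to agree on one lock. The following is a minimal sketch of that idea with POSIX threads, not the code in this PR. `source_t`, `MAX_SOURCES`, `source_add`, `source_get_num`, and `source_get_name` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_SOURCES 64 /* hypothetical fixed capacity for the sketch */

typedef struct {
    const char *name;
} source_t;

static source_t sources[MAX_SOURCES];
static int num_sources = 0;
static pthread_mutex_t sources_lock = PTHREAD_MUTEX_INITIALIZER;

/* Append a source under the lock. Returns its index, or -1 if full. */
static int source_add(const char *name)
{
    int idx = -1;
    assert(name != NULL);
    pthread_mutex_lock(&sources_lock);
    if (num_sources < MAX_SOURCES) {
        idx = num_sources;
        sources[idx].name = name;
        num_sources++;
    }
    pthread_mutex_unlock(&sources_lock);
    return idx;
}

/* Read the current count under the same lock as the writer. */
static int source_get_num(void)
{
    pthread_mutex_lock(&sources_lock);
    int n = num_sources;
    pthread_mutex_unlock(&sources_lock);
    return n;
}

/* Look up a source name; rejects negative and out-of-range indices. */
static const char *source_get_name(int idx)
{
    const char *name = NULL;
    pthread_mutex_lock(&sources_lock);
    if (idx >= 0 && idx < num_sources) {
        name = sources[idx].name;
    }
    pthread_mutex_unlock(&sources_lock);
    return name;
}
```

None of these functions calls another while holding `sources_lock`, so the lock is never taken twice by the same thread. Whether this level of protection is needed depends on whether sources really are registered outside a single sequential initialization path, which is the open question in this thread.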
…line mca_base_source_get_by_name() called from one place. Signed-off-by: Kingshuk Haldar <[email protected]>
I don't see a lock in Section 15.3.8 of the MPI standard does indeed state that event sources can be added during application execution, but in OMPI event sources are added during module initialization, which happens in a single sequential context. Unless we expect sessions to be created from multiple threads and allow them to load different sets of modules, I don't think we need any protection on the event creation path. Looking more carefully at this code, I see other issues with locking. Let's look at At this point this PR looks very sketchy. |
This is the work from PR8057 rebased after PR13086.
Feel free to comment with any suggestions. I will list the changes suggested in the old PR and ask for advice on whether they are still relevant.