-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sort output for test_iProbe. #119
base: develop
Are you sure you want to change the base?
Conversation
tests/mpi/eckit_test_mpi.cc
Outdated
@@ -1000,6 +1001,8 @@ CASE("test_iProbe") { | |||
--count; | |||
} | |||
|
|||
std::sort(data.begin(), data.begin() + nproc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we know the size is nproc, we may as well write
std::sort(data.begin(), data.end());
or better still
std::sort(begin(data), end(data));
Also, I'd expect the same fix to be also needed for test_probe
, although I'm too surprised we hit this
first with test_iProbe
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I have updated as you say.
The failure is transient, so maybe it does affect test_probe as well, I just haven't really noticed. I have made the same change for test_probe.
Is this transient failure still happening? The test should not fail without the sorting either though. I'm puzzled why sorting should be needed. Also the "test_probe" was also mentioned as intermittently failing, and this sorting is not applied there. We can also try to make the test more verbose in case of failure, e.g. each MPI rank logging the content of "data" |
More verbosity upon failure would be helpful. It isn't trivial though as steps would need to be taken to ensure the output from different processors was presented in a readable way, just printing stuff out results in garbled output. |
I have added in develop branch more output on this test in case of failures... |
I run the test with your updates with the extra output and this is what it produced as output: 76: Test command: /opt/cray/snplauncher/7.6.0/bin/mpiexec "-n" "4" "/home/d03/frwd/cylc-run/eckit-47/share/make_mo__cray_gnu/eckit/tests/mpi/eckit_test_mpi_parallel" |
Are you sure you ran "develop" branch? because the mentioned line 1009 is e.g. not one of the lines that would write output: |
I have run tests with the develop branch. Unfortunately I don't seem to be able to recreate the failure now. Do you think you might have fixed the issue? |
I did not fundamentally change anything except adding some more diagnostic output. If it occurs again, sporadically, please update the issue with the diagnostic output. |
Would this be a potential fix for #47? It seems that sometimes the output is not in the correct order for the ASSERT. I'm not sure whether this is a situation that the output is required to be in a particular order or not. If not, then sorting the output might be a solution.