QUDA internally refines the multi-shift inversions using a single-shift CG.
The MILC code then calls the single-shift CG (either CPU or GPU) again.
This second call adds a lot of overhead and essentially always runs zero iterations, so it just wastes time.
There is an option `NO_REFINE` which skips the refinement step if the Naik epsilon of the higher shifts is identical to the one for the zeroth shift.
It would be beneficial to turn off any refinement call from the MILC code, i.e. make `NO_REFINE` the default option.
Any objections, @detar, @stevengottlieb?
In a short test on a 32^4 lattice, this reduced the runtime of the RHMC (single precision) by a factor of 2. Admittedly, I basically just modified an existing test case, so the iteration count is low and the overhead is more pronounced.
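
For reference, the condition under which `NO_REFINE` skips the refinement could be sketched roughly as below. This is only an illustration of the check described above, not the actual MILC or QUDA code: the function name `can_skip_refinement` and its parameters are hypothetical, and it assumes the Naik epsilon of each shift is available in a plain array.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of MILC or QUDA).
 * Returns true when every higher shift uses the same Naik epsilon as
 * shift 0, i.e. the case in which the extra refinement call is skipped.
 * naik_eps[i] is assumed to hold the Naik epsilon for shift i.
 */
static bool can_skip_refinement(const double *naik_eps, size_t num_shifts)
{
    for (size_t i = 1; i < num_shifts; i++) {
        /* Exact comparison on purpose: the values are expected to be
           copies of the same input parameter, not computed results. */
        if (naik_eps[i] != naik_eps[0]) {
            return false;
        }
    }
    return true; /* also true for zero or one shift */
}
```

With all epsilons equal this returns true and the caller would skip the MILC-side refinement; if any higher shift differs from shift 0 it returns false and the refinement would still run.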