Build reference build #5

Open
1 of 3 tasks
ilectra opened this issue Oct 11, 2024 · 8 comments

@ilectra
Collaborator

ilectra commented Oct 11, 2024

Build the reference build in /home/dp208/dp208/shared/RAC16. Also build the dependencies there, so that everyone else can point to them.

DONE:

@ilectra
Collaborator Author

ilectra commented Oct 11, 2024

Document how it works in the wiki.

This was referenced Oct 11, 2024
@qiUip
Collaborator

qiUip commented Oct 15, 2024

The build currently fails at the SU3 tests; check the log in shared folder/ed_test.

shared folder/mg_test has a no-sp3 branch that combines the commit that works on the A100 with a patch to guard against SP3. I created issue #6 to try to fix the build. We should not let that block this issue if we can get the first two tasks done.

@qiUip
Collaborator

qiUip commented Oct 29, 2024

These are the flags I'm using for CPU. @asifsamiarain are these the same?

 ../configure \
    --prefix=${prefix} \
    --enable-comms=mpi \
    --enable-simd=AVX2 \
    --enable-gen-simd-width=256 \
    --disable-gparity \
    --disable-fermion-reps \
    --enable-Sp \
    --enable-Nc=4 \
    --with-lime=${prefix} \
    --with-gmp=${prefix} \
    --with-mpfr=${prefix} \
    CXX=mpicxx \
    CXXFLAGS="-std=c++17"

@asifsamiarain
Collaborator

Here we have a script for testing:

/home/dp208/dp208/shared/RAC16/ma_test/deploy.sh

@qiUip
Collaborator

qiUip commented Oct 29, 2024

The --disable-unified flag is GPU-specific: it is for unmanaged memory.
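One way the CPU flags quoted earlier and this GPU-only flag could be kept in one place is sketched below. The function name grid_configure_flags and its cpu/gpu argument are hypothetical and not part of Grid or of deploy.sh. The sketch also assumes that the remaining flags are shared by both targets, which the thread does not confirm, and it leaves out the other accelerator options a real GPU build needs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid or deploy.sh): print one configure
# flag per line for the given target, using only flags quoted in this thread.
# Usage: grid_configure_flags cpu|gpu PREFIX
grid_configure_flags() {
    _target="$1"
    _prefix="$2"
    case "$_target" in
        cpu|gpu) ;;
        *) echo "unknown target: $_target" >&2; return 1 ;;
    esac
    # Assumption: these flags are shared by the CPU and GPU builds.
    printf '%s\n' \
        "--prefix=${_prefix}" \
        "--enable-comms=mpi" \
        "--disable-gparity" \
        "--disable-fermion-reps" \
        "--enable-Sp" \
        "--enable-Nc=4" \
        "--with-lime=${_prefix}" \
        "--with-gmp=${_prefix}" \
        "--with-mpfr=${_prefix}"
    if [ "$_target" = cpu ]; then
        # SIMD settings from the CPU configure line quoted in the thread.
        printf '%s\n' "--enable-simd=AVX2" "--enable-gen-simd-width=256"
    else
        # GPU-specific flag for unmanaged memory; other accelerator
        # options a real GPU build needs are intentionally omitted.
        printf '%s\n' "--disable-unified"
    fi
}

# Dry run: print the flags instead of calling ../configure.
grid_configure_flags gpu "${HOME}/prefix"
```

To configure for real, the output could be passed on with something like ../configure $(grid_configure_flags cpu "$prefix") CXX=mpicxx CXXFLAGS="-std=c++17". That relies on word splitting, so it only works if the prefix contains no spaces.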

@asifsamiarain
Collaborator

asifsamiarain commented Oct 31, 2024

> These are the flags I'm using for CPU. @asifsamiarain are these the same?
>
>  ../configure \
>     --prefix=${prefix} \
>     --enable-comms=mpi \
>     --enable-simd=AVX2 \
>     --enable-gen-simd-width=256 \
>     --disable-gparity \
>     --disable-fermion-reps \
>     --enable-Sp \
>     --enable-Nc=4 \
>     --with-lime=${prefix} \
>     --with-gmp=${prefix} \
>     --with-mpfr=${prefix} \
>     CXX=mpicxx \
>     CXXFLAGS="-std=c++17"

Just to mention: I did a fresh compilation in one go, without make check. It took nearly 2 hours on the gpu-a100-80 partition, and the final directory size is 14G.

@qiUip
Collaborator

qiUip commented Oct 31, 2024

> Just to mention: I did a fresh compilation in one go, without make check. It took nearly 2 hours on the gpu-a100-80 partition, and the final directory size is 14G.

I built with the above configure options on the login node with make -j32 and it took 29 min (including the configure step). The size at least is the same as in your case (14G). Based on this, I think building on the GPU nodes is a significant waste of resources (GPU compute allocation).
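For anyone who wants to repeat the timing comparison, a minimal sketch of a timing wrapper follows. The helper name run_timed is hypothetical, and it assumes a date command that supports +%s (seconds since the epoch), which is common but not guaranteed everywhere.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and report its wall-clock time.
# Usage: run_timed LABEL CMD [ARGS...]
run_timed() {
    _label="$1"; shift
    _start=$(date +%s)
    _status=0
    # Run the command as given; remember its exit status without aborting.
    "$@" || _status=$?
    _end=$(date +%s)
    _elapsed=$(( _end - _start ))
    echo "${_label}: $(( _elapsed / 60 )) min $(( _elapsed % 60 )) s (exit ${_status})"
    return "$_status"
}

# Intended use on the login node, from inside the build directory (not run here):
#   run_timed configure ../configure <flags as above>
#   run_timed build make -j32
# Harmless demonstration so that the sketch runs anywhere:
run_timed demo sleep 1
```

Timing configure and make separately makes it easier to see which step dominates when comparing the login node against a GPU partition.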

@qiUip
Collaborator

qiUip commented Nov 28, 2024

Further investigation into clean builds that pass the testing framework with make check:

  1. I found no easy way to remove tests from the Makefile.am files: a script appears to collect every file in a directory into a Make.inc, which each Makefile.am then includes, so the tests get picked up regardless. A proper solution is out of scope, and our goal is just to be able to run make check after introducing changes to the code. My suggestion is therefore to disable the features only while developing and, for a PR, to build the code without disabling them. As PR builds will not happen as frequently, we can accommodate the additional build times.

  2. A further test, tests/forces/Test_fthmc, does not work when Nc != 3. To fix that, I applied the same logic as in the patch within the test itself:

#if Nc == 3
#include <Grid/qcd/smearing/GaugeConfigurationMasked.h>
#include <Grid/qcd/smearing/JacobianAction.h>
#endif
int main (...){
#if Nc != 3
#warning FTHMC2p1f_3GeV will not work for Nc != 3
  std::cout << "This program will currently only work for Nc == 3." << std::endl;
#else
 ...
#endif
}
  3. Several of the hmc tests define a struct, struct SmearingParameters: Serializable. However, the Grid namespace already contains a struct with that name, defined at Grid/qcd/smearing/HISQSmearing.h:65, so the build fails with "error: redefinition of ‘struct Grid::SmearingParameters’". I'm not sure how this is avoided in the CI?
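To see which files are involved in the redefinition from the last point, a rough grep-based sketch could help. The helper name find_struct_defs is hypothetical, and the pattern is only a heuristic: it matches "struct NAME" followed by ':' or '{' on the same line, so a definition whose brace sits on the next line would be missed. It relies on grep -r, which GNU and BSD grep both provide.

```shell
#!/bin/sh
# Hypothetical helper: list likely definitions of a struct with a given name.
# Usage: find_struct_defs NAME DIR...
find_struct_defs() {
    _name="$1"; shift
    # -r: recurse, -n: show line numbers, -E: extended regular expressions.
    # Heuristic: "struct NAME" followed by ':' (inheritance) or '{' (body).
    grep -rnE "struct[[:space:]]+${_name}[[:space:]]*[:{]" "$@"
}

# Example, run from the top of a Grid checkout (paths taken from the thread):
#   find_struct_defs SmearingParameters Grid/qcd/smearing tests
```

Comparing the hits under tests/ with the one in HISQSmearing.h shows which tests need attention. It does not by itself explain how the CI avoids the clash.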
