Describe the bug
The ppconvert tests started failing. I know ppconvert is not GPU code, but this is the setup in which I see the bug.
It could be that the fix to #3307, #3303 is not enough.
$ OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 ctest -R ppconvert --output-on-failure
Test project /builds/correaa/boost-multi/qmcpack/build
Start 65: build_output_ppconvert_exists
1/3 Test #65: build_output_ppconvert_exists .... Passed 0.00 sec
Start 66: ppconvert_runs
2/3 Test #66: ppconvert_runs ................... Passed 1.19 sec
Start 67: ppconvert_o_diff
3/3 Test #67: ppconvert_o_diff .................***Failed 0.13 sec
----------------
##6701 #:1 <== -nan
##6701 #:1 ==> -4.63469388240121e-20
@ @@
##6701 #:2 <== -nan
##6701 #:2 ==> 1.06342318361146e-03
@ @@
##6701 #:3 <== -nan
##6701 #:3 ==> 2.12686003182093e-03
@ @@
----------------
##6702 #:1 <== -nan
##6702 #:1 ==> 3.19032420922778e-03
@ @@
##6702 #:2 <== -nan
##6702 #:2 ==> 4.25382938043140e-03
@ @@
##6702 #:3 <== -nan
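The diff above shows `-nan` on one side and finite values on the other. To find how many entries of an output file are affected, a small scanner along these lines may help. This is only a minimal sketch: `count_nan_tokens` is a hypothetical helper, not part of ppconvert, and it assumes the values appear as whitespace-separated tokens.

```cpp
#include <cmath>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ppconvert): counts whitespace-separated
// tokens that parse entirely as a floating-point NaN, e.g. "nan" or "-nan".
inline int count_nan_tokens(const std::string& text) {
    std::istringstream in(text);
    std::string tok;
    int count = 0;
    while (in >> tok) {
        char* end = nullptr;
        const double v = std::strtod(tok.c_str(), &end);
        // Accept only tokens that strtod consumed completely, so markers
        // such as "##6701" or "<==" are ignored.
        const bool fully_parsed = (end != tok.c_str()) && (*end == '\0');
        if (fully_parsed && std::isnan(v)) {
            ++count;
        }
    }
    return count;
}
```

Running it over the generated file and the reference file separately would show which of the two contains the NaN entries.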
To Reproduce
Steps to reproduce the behavior:
All the steps are summarized here: https://gitlab.com/correaa/boost-multi/-/jobs/11854946333
git clone --depth 1 https://github.com/QMCPACK/qmcpack.git
cd qmcpack
cd build/
CUDACXX=/usr/local/cuda/bin/nvcc cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DBUILD_AFQMC=1 -DQMC_CXX_STANDARD=17 -DQMC_GPU=cuda -DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc -DCMAKE_CUDA_HOST_COMPILER=g++ -DCMAKE_CXX_FLAGS="-Wno-deprecated -Wno-deprecated-declarations" -DCMAKE_CUDA_ARCHITECTURES=native
nvcc --version
CUDACXX=nvcc cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DBUILD_AFQMC=1 -DQMC_CXX_STANDARD=17 -DQMC_GPU=cuda -DCMAKE_CUDA_COMPILER=nvcc -DCMAKE_CUDA_HOST_COMPILER=g++ -DCMAKE_CXX_FLAGS="-Wno-deprecated -Wno-deprecated-declarations" -DCMAKE_CUDA_ARCHITECTURES=native
make -j 4 ppconvert afqmc test_afqmc_matrix test_afqmc_numerics test_afqmc_slaterdeterminantoperations test_afqmc_walkers test_afqmc_hamiltonians test_afqmc_hamiltonian_operations test_afqmc_phmsd test_afqmc_wfn_factory test_afqmc_prop_factory test_afqmc_estimators qmc-afqmc-performance
ctest -R ppconvert --output-on-failure
Expected behavior
The ppconvert tests should pass.
System:
- system name: docker ubuntu 24.04, nvcc 12.9, g++ 13.3.0
Additional context
- I remember there was some controversial use of NaN in ppconvert; see "Fix ppconvert build" (#3307).
- I could not reproduce this locally with:
$ g++ --version
g++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Copyright (C) 2023 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
correaa@proart-linux:~/qmcpack/build$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0