PIConGPU mallocmc error on taursml (possible duplicate of #3064) #3433

Open
PrometheusPi opened this issue Nov 16, 2020 · 22 comments
Labels
duplicate duplicate issue or pull-request (link main issue!) machine/system machine & HPC system specific issues question

Comments

@PrometheusPi
Member

PrometheusPi commented Nov 16, 2020

While attempting to run a 256^3 cells per GPU PIConGPU simulation (laser only), I ran into the following memory error on the taurus V100 nodes:

Unhandled exception of type 'St13runtime_error' with message '/scratch/ws/1/s...-ml_streaming/pic_env/build/picongpu/thirdParty/cupla/alpaka/include/alpaka/mem/buf/BufUniformCudaHipRt.hpp(360) 'cudaMalloc( &memPtr, static_cast<std::size_t>(widthBytes))' returned error  : 'cudaErrorMemoryAllocation': 'out of memory'!', terminating
[cupla] Error: </scratch/ws/1/s...-ml_streaming/pic_env/build/picongpu/include/pmacc/../pmacc/memory/buffers/HostBufferIntern.hpp>:73 
[taurusml22:163119:0:163119] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x9)
Unhandled exception of type 'St13runtime_error' with message '/scratch/ws/1/s...-ml_streaming/pic_env/build/picongpu/thirdParty/cupla/alpaka/include/alpaka/mem/buf/BufUniformCudaHipRt.hpp(360) 'cudaMalloc( &memPtr, static_cast<std::size_t>(widthBytes))' returned error  : 'cudaErrorMemoryAllocation': 'out of memory'!', terminating
*** Error in `/lustre/scratch2/ws/1/s...-ml_streaming/runs/001_test/input/bin/picongpu': free(): invalid pointer: 0x00000000151c07f0 ***

@steindev did you encounter this before?
@psychocoderHPC @sbastrakov Do you have any idea what might have caused this? Is this really a device memory issue?

@PrometheusPi PrometheusPi added question machine/system machine & HPC system specific issues labels Nov 16, 2020
@PrometheusPi
Member Author

PrometheusPi commented Nov 16, 2020

Might this be induced by extremely slow IO? It took 15 minutes to create simOutput, and after 20 minutes there is still no feedback from picongpu.
I know this is a RAM out-of-memory error, but do we recreate streams, buffers, etc. when no IO can be performed?

@steindev
Member

I do not remember this specific error. You may try running with blocking kernels enabled for debugging (pic-build -c "-DPMACC_BLOCKING_KERNEL=ON"). Just a sanity check: are you sure the filesystem is not full? And are you sure the simulation fits on the GPUs, i.e. did you try with half the size in all directions?
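
Both sanity checks can be approximated with a few lines of Python. This is a rough sketch, not PIConGPU's actual memory accounting: it assumes three 3-component single-precision fields (E, B, J) without guard cells, and it ignores temporary fields and the particle heap, so real usage is higher. The function names are made up for illustration; the disk check uses the standard shutil.disk_usage.

```python
import shutil


def field_bytes(cells_per_dim, n_fields=3, components=3, bytes_per_value=4):
    """Bytes for n_fields vector fields on a cubic grid (assumption: no guard cells)."""
    return cells_per_dim ** 3 * n_fields * components * bytes_per_value


def free_fraction(path="."):
    """Fraction of free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    gib = 1024 ** 3
    for n in (256, 128):
        # Lower bound only: fields E, B, J, nothing else.
        print(f"{n}^3 cells: {field_bytes(n) / gib:.4f} GiB for the fields")
    print(f"free space in the current directory: {free_fraction():.1%}")
```

Under these assumptions the fields of a 256^3 grid need about 0.56 GiB, and halving the size in all directions reduces that by a factor of 8.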

@PrometheusPi
Member Author

I wanted to check whether linking on the nodes was correct, but ldd picongpu only reports that the program is not dynamically linked.

@PrometheusPi
Member Author

Okay, the missing ldd output originates from the Power architecture. On Power nodes, it works fine.
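
A quick way to see which architecture a binary was built for, without relying on ldd, is to read the ELF header directly. The sketch below is only an illustration (the helper names are hypothetical); it uses the standard ELF layout, where byte 5 of e_ident gives the byte order and the 16-bit e_machine field sits at offset 18. A ppc64le binary reports machine 21 with little-endian data.

```python
import struct

# Subset of e_machine values from the ELF specification.
ELF_MACHINES = {3: "x86", 21: "ppc64", 62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture named in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA): 1 = little endian, 2 = big endian.
    byte_order = "<" if header[5] == 1 else ">"
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})")


def elf_machine_of_file(path: str) -> str:
    """Read just the header bytes of `path` and report its architecture."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Calling elf_machine_of_file("./picongpu") on the login node and comparing the result with the login node's own architecture would show whether ldd was simply run on the wrong kind of machine.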

@PrometheusPi
Member Author

@steindev Reducing the simulation size by a factor of 4 and switching blocking kernels on still leads to the same error.

@PrometheusPi
Member Author

PrometheusPi commented Nov 17, 2020

I tried running PIConGPU on a single GPU and it never showed up in nvidia-smi on any of the 6 GPUs.

[two screenshots]

The CPU seems to do something. But I do not know what. The CUDA architecture is set to 70, thus a "JIT" compile should not take place.
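
Instead of watching the nvidia-smi table by hand, the per-GPU memory use can be polled from a script. This is a minimal sketch under assumptions: it relies on the nvidia-smi options --query-gpu and --format=csv,noheader,nounits, the helper names are hypothetical, and the 100 MiB threshold for calling a GPU "busy" is an arbitrary choice.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows (values in MiB) into a list of int tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def busy_gpus(rows, threshold_mib=100):
    """Indices of GPUs whose used memory exceeds the (arbitrary) threshold."""
    return [index for index, used, _ in rows if used > threshold_mib]


def query_gpus():
    """Run nvidia-smi on the current node and return the parsed rows."""
    out = subprocess.run(QUERY, check=True, capture_output=True, text=True).stdout
    return parse_gpu_memory(out)
```

If busy_gpus(query_gpus()) stays empty while picongpu is running, the process has not allocated device memory on any GPU of the node.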

@PrometheusPi
Member Author

@franzpoeschel Did your PIConGPU simulation run?

@sbastrakov
Member

Sorry for the late response.

simOutput is not created by PIConGPU itself; it is created right after the task starts executing, before even cuda_memcheck is started. So if that took a long time after your job started, it indicates filesystem problems on the cluster.

From the exception message alone I would assume something is wrong with device memory. But given the aforementioned simOutput issue, this may be a secondary effect.

@psychocoderHPC
Member

psychocoderHPC commented Nov 17, 2020

'cudaErrorMemoryAllocation': 'out of memory'!', terminating

It looks like the message we always got from taurus ml. IMO these are broken GPUs/drivers.
I saw the same with HIP on another system. I will check whether I can find my hacked branch in which I generated a native mini-app out of PIConGPU's memory allocation patterns to show the vendor that they did something wrong.

@psychocoderHPC
Member

@PrometheusPi On Friday I will create a patched version out of psychocoderHPC@5e93302 that we can run on a single GPU, to build a template we can later use to write native CUDA code.
What we need is interactive access to a broken node.

@PrometheusPi
Member Author

@psychocoderHPC That sounds great. Thanks for the info.
Just as a note for me: currently I am always getting errors on taurusml9. Now I got taurusml29; perhaps this node works / has no broken drivers.

@PrometheusPi
Member Author

Some more details from the error messages on taurusml31:

./picongpu -s 100 -g 128 128 128
full simulation time:  8sec 402msec = 8 sec
Unhandled exception of type 'St13runtime_error' with message '/scratch/ws/1/s5960712-ml_streaming/pic_env/build/picongpu/thirdParty/cupla/alpaka/include/alpaka/mem/buf/BufUniformCudaHipRt.hpp(360) 'cudaMalloc( &memPtr, static_cast<std::size_t>(widthBytes))' returned error  : 'cudaErrorMemoryAllocation': 'out of memory'!', terminating
*** Error in `./picongpu': double free or corruption (!prev): 0x0000000035bd9690 ***
======= Backtrace: =========
/lib64/libc.so.6(cfree+0x4a0)[0x200000f09be0]
/sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(_ZdlPv+0x18)[0x200000c47c38]
/sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(_ZdlPvm+0x18)[0x200000c47c78]
./picongpu(_ZN16cupla_cuda_async13cuplaFreeHostEPv+0x15c)[0x1055152c]
./picongpu(_ZN5pmacc6BufferINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3EED1Ev+0x2c)[0x1046504c]
./picongpu(_ZN5pmacc18DeviceBufferInternINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3EED2Ev+0x68)[0x10465188]
./picongpu(_ZN5pmacc10GridBufferINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3ES39_ED2Ev+0x5c8)[0x104660d8]
./picongpu(_ZThn56_N8picongpu9ParticlesIN5pmacc4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEEN5boost3mpl6v_itemINS_11chargeRatioINS_20ChargeRatioElectronsENS1_13pmacc_isAliasEEENS7_INS_9massRatioINS_18MassRatioElectronsESA_EENS7_INS_7currentINS_13currentSolver9EsirkepovINS_9particles6shapes3TSCENSG_8strategy16CachedSupercellsELj3EEESA_EENS7_INS_13interpolationINS_28FieldToParticleInterpolationISK_NS_30AssignedTrilinearInterpolationEEESA_EENS7_INS_5shapeISK_SA_EENS7_INS_14particlePusherINSI_6pusher5BorisESA_EENS6_7vector0IN4mpl_2naEEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS7_INS_9weightingENS7_INS_8momentumENS7_INS_8positionINS_12position_picESA_EES13_Li0EEELi0EEELi0EEEED0Ev+0xbc)[0x1047779c]
./picongpu(_ZNSt19_Sp_counted_deleterIPN5pmacc15ISimulationDataESt14default_deleteIS1_ESaIvELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0x38)[0x102d4ce8]
./picongpu(_ZN5pmacc13DataConnectorD2Ev+0x24c)[0x10456aac]
/lib64/libc.so.6(+0x44994)[0x200000eb4994]
/lib64/libc.so.6(exit+0x24)[0x200000eb49e4]
/lib64/libc.so.6(+0x25208)[0x200000e95208]
/lib64/libc.so.6(__libc_start_main+0xc4)[0x200000e953f4]
======= Memory map: ========
10000000-10ac0000 r-xp 00000000 00:32 162131688502177885                 /lustre/scratch2/ws/1/s5960712-ml_streaming/picInput/LaserOnly_streaming/.build/picongpu
10ac0000-10ad0000 r--p 00ab0000 00:32 162131688502177885                 /lustre/scratch2/ws/1/s5960712-ml_streaming/picInput/LaserOnly_streaming/.build/picongpu
10ad0000-10ae0000 rw-p 00ac0000 00:32 162131688502177885                 /lustre/scratch2/ws/1/s5960712-ml_streaming/picInput/LaserOnly_streaming/.build/picongpu
10ae0000-10b70000 rw-p 00000000 00:00 0 
31860000-31bf0000 rw-p 00000000 00:00 0                                  [heap]
31bf0000-31c10000 rw-p 00000000 00:00 0                                  [heap]
31c10000-31c20000 rw-p 00000000 00:00 0                                  [heap]
31c20000-31c30000 rw-p 00000000 00:00 0                                  [heap]
31c30000-31c40000 rw-p 00000000 00:00 0                                  [heap]
31c40000-31c50000 rw-p 00000000 00:00 0                                  [heap]
31c50000-37510000 rw-p 00000000 00:00 0                                  [heap]
200000000-200400000 ---p 00000000 00:00 0 
200400000-200600000 rw-s 00000000 00:06 119821                           /dev/nvidiactl
200600000-200800000 rw-s 00000000 00:06 180240                           /dev/nvidia0
200800000-200c00000 rw-s 00000000 00:05 559859184                        /dev/zero (deleted)
200c00000-200e00000 rw-s 00000000 00:06 180240                           /dev/nvidia0
200e00000-201e00000 ---p 00000000 00:00 0 
201e00000-202000000 rw-s 00000000 00:06 119821                           /dev/nvidiactl
202000000-202200000 rw-s 00000000 00:06 119821                           /dev/nvidiactl
202200000-202600000 rw-s 00000000 00:05 559859185                        /dev/zero (deleted)
202600000-202a00000 rw-s 00000000 00:05 559859186                        /dev/zero (deleted)
202a00000-202e00000 rw-s 00000000 00:05 559859187                        /dev/zero (deleted)
202e00000-203200000 rw-s 00000000 00:05 559859188                        /dev/zero (deleted)
203200000-203600000 rw-s 00000000 00:05 559859189                        /dev/zero (deleted)
203600000-203a00000 rw-s 00000000 00:05 559859190                        /dev/zero (deleted)
203a00000-203e00000 rw-s 00000000 00:05 559859191                        /dev/zero (deleted)
203e00000-204000000 rw-s 00000000 00:05 559859192                        /dev/zero (deleted)
204000000-204200000 rw-s 00000000 00:05 559859193                        /dev/zero (deleted)
204200000-204400000 rw-s 00000000 00:05 559859194                        /dev/zero (deleted)
204400000-204600000 rw-s 00000000 00:05 559859195                        /dev/zero (deleted)
204600000-204800000 rw-s 00000000 00:05 559859196                        /dev/zero (deleted)
204800000-204a00000 rw-s 00000000 00:05 559859197                        /dev/zero (deleted)
204a00000-204c00000 rw-s 00000000 00:05 559859198                        /dev/zero (deleted)
204c00000-204e00000 rw-s 00000000 00:05 559859199                        /dev/zero (deleted)
204e00000-205000000 rw-s 00000000 00:05 559859200                        /dev/zero (deleted)
205000000-205200000 rw-s 00000000 00:05 559859201                        /dev/zero (deleted)
205200000-205400000 rw-s 00000000 00:05 559859202                        /dev/zero (deleted)
205400000-205600000 rw-s 00000000 00:05 559859203                        /dev/zero (deleted)
205600000-205800000 rw-s 00000000 00:05 559879641                        /dev/zero (deleted)
205800000-205a00000 rw-s 00000000 00:05 559879642                        /dev/zero (deleted)
205a00000-205c00000 rw-s 00000000 00:05 559879643                        /dev/zero (deleted)
205c00000-205e00000 rw-s 00000000 00:05 559879644                        /dev/zero (deleted)
205e00000-206000000 rw-s 205e00000 00:06 194563                          /dev/nvidia-uvm
206000000-206200000 rw-s 00000000 00:06 119821                           /dev/nvidiactl
206200000-206400000 ---p 00000000 00:00 0 
206400000-206600000 rw-s 00000000 00:06 119821                           /dev/nvidiactl
206600000-206800000 rw-s 00000000 00:05 559893733                        /dev/zero (deleted)
206800000-300200000 ---p 00000000 00:00 0 
10000000000-10004000000 ---p 00000000 00:00 0 
200000000000-200000030000 r-xp 00000000 09:01 201331802                  /usr/lib64/ld-2.17.so
200000030000-200000040000 r--p 00020000 09:01 201331802                  /usr/lib64/ld-2.17.so
200000040000-200000050000 rw-p 00030000 09:01 201331802                  /usr/lib64/ld-2.17.so
200000050000-200000070000 r-xp 00000000 00:00 0                          [vdso]
200000070000-2000001b0000 r-xp 00000000 00:2d 79992436                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi.so.40.10.4
2000001b0000-2000001c0000 r--p 00130000 00:2d 79992436                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi.so.40.10.4
2000001c0000-2000001e0000 rw-p 00140000 00:2d 79992436                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi.so.40.10.4
2000001e0000-2000001f0000 rw-p 00000000 00:00 0 
2000001f0000-200000210000 r-xp 00000000 00:2d 79992440                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi_cxx.so.40.10.1
200000210000-200000220000 ---p 00020000 00:2d 79992440                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi_cxx.so.40.10.1
200000220000-200000230000 r--p 00020000 00:2d 79992440                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi_cxx.so.40.10.1
200000230000-200000240000 rw-p 00030000 00:2d 79992440                   /software/ml/OpenMPI/3.1.4-gcccuda-2018b/lib/libmpi_cxx.so.40.10.1
200000240000-200000270000 r-xp 00000000 00:32 162131686539173694         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_filesystem.so.1.71.0
200000270000-200000280000 r--p 00020000 00:32 162131686539173694         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_filesystem.so.1.71.0
200000280000-200000290000 rw-p 00030000 00:32 162131686539173694         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_filesystem.so.1.71.0
200000290000-2000002a0000 r-xp 00000000 00:32 162131686539190333         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_system.so.1.71.0
2000002a0000-2000002b0000 r--p 00000000 00:32 162131686539190333         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_system.so.1.71.0
2000002b0000-2000002c0000 rw-p 00010000 00:32 162131686539190333         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_system.so.1.71.0
2000002c0000-200000370000 r-xp 00000000 00:32 162131686539174124         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_math_tr1.so.1.71.0
200000370000-200000380000 r--p 000a0000 00:32 162131686539174124         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_math_tr1.so.1.71.0
200000380000-200000390000 rw-p 000b0000 00:32 162131686539174124         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_math_tr1.so.1.71.0
200000390000-2000003a0000 -w-s 00000000 00:06 180240                     /dev/nvidia0
2000003a0000-2000003b0000 r--s ff010000 00:06 21669                      /dev/infiniband/uverbs1
2000003b0000-2000003d0000 r-xp 00000000 09:01 207014255                  /usr/lib64/libpthread-2.17.so
2000003d0000-2000003e0000 r--p 00010000 09:01 207014255                  /usr/lib64/libpthread-2.17.so
2000003e0000-2000003f0000 rw-p 00020000 09:01 207014255                  /usr/lib64/libpthread-2.17.so
2000003f0000-200000420000 r-xp 00000000 00:2d 68930154                   /software/ml/zlib/1.2.11-GCCcore-7.3.0/lib/libz.so.1.2.11
200000420000-200000430000 r--p 00020000 00:2d 68930154                   /software/ml/zlib/1.2.11-GCCcore-7.3.0/lib/libz.so.1.2.11
200000430000-200000440000 rw-p 00030000 00:2d 68930154                   /software/ml/zlib/1.2.11-GCCcore-7.3.0/lib/libz.so.1.2.11
200000440000-2000004e0000 r-xp 00000000 00:32 162131686539174239         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_program_options.so.1.71.0
2000004e0000-2000004f0000 r--p 00090000 00:32 162131686539174239         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_program_options.so.1.71.0
2000004f0000-200000500000 rw-p 000a0000 00:32 162131686539174239         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_program_options.so.1.71.0
200000500000-200000560000 r-xp 00000000 00:32 162131686539174327         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_serialization.so.1.71.0
200000560000-200000570000 r--p 00050000 00:32 162131686539174327         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_serialization.so.1.71.0
200000570000-200000580000 rw-p 00060000 00:32 162131686539174327         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib/libboost_serialization.so.1.71.0
200000580000-200000590000 r-xp 00000000 09:01 207014259                  /usr/lib64/librt-2.17.so
200000590000-2000005a0000 r--p 00000000 09:01 207014259                  /usr/lib64/librt-2.17.so
2000005a0000-2000005b0000 rw-p 00010000 09:01 207014259                  /usr/lib64/librt-2.17.so
2000005b0000-200000630000 r-xp 00000000 00:2d 67380129                   /software/ml/CUDA/9.2.88-GCC-7.3.0-2.30/targets/ppc64le-linux/lib/libcudart.so.9.2.148
200000630000-200000640000 rw-p 00070000 00:2d 67380129                   /software/ml/CUDA/9.2.88-GCC-7.3.0-2.30/targets/ppc64le-linux/lib/libcudart.so.9.2.148
200000640000-200000710000 r-xp 00000000 09:01 201331822                  /usr/lib64/libm-2.17.so
200000710000-200000720000 r--p 000c0000 09:01 201331822                  /usr/lib64/libm-2.17.so
200000720000-200000730000 rw-p 000d0000 09:01 201331822                  /usr/lib64/libm-2.17.so
200000730000-200000990000 r-xp 00000000 00:32 162131686539210679         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libopenPMD.so
200000990000-2000009a0000 ---p 00260000 00:32 162131686539210679         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libopenPMD.so
2000009a0000-2000009b0000 r--p 00260000 00:32 162131686539210679         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libopenPMD.so
2000009b0000-2000009c0000 rw-p 00270000 00:32 162131686539210679         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libopenPMD.so
2000009c0000-2000009d0000 r-xp 00000000 00:32 162131686539199953         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11_mpi.so.2.6.0
2000009d0000-2000009e0000 r--p 00000000 00:32 162131686539199953         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11_mpi.so.2.6.0
2000009e0000-2000009f0000 rw-p 00010000 00:32 162131686539199953         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11_mpi.so.2.6.0
2000009f0000-200000b70000 r-xp 00000000 00:32 162131686539199949         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11.so.2.6.0
200000b70000-200000b80000 r--p 00170000 00:32 162131686539199949         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11.so.2.6.0
200000b80000-200000b90000 rw-p 00180000 00:32 162131686539199949         /lustre/scratch2/ws/1/s5960712-ml_streaming/pic_env/local/lib64/libadios2_cxx11.so.2.6.0
200000b90000-200000db0000 r-xp 00000000 00:2d 75652055                   /software/ml/GCCcore/7.3.0/lib64/libstdc++.so.6.0.24
[taurusml31:26070] *** Process received signal ***
[taurusml31:26070] Signal: Aborted (6)
[taurusml31:26070] Signal code:  (-6)
[taurusml31:26070] [ 0] [0x2000000504d8]
[taurusml31:26070] [ 1] /lib64/libc.so.6(abort+0x2b4)[0x200000eb2094]
[taurusml31:26070] [ 2] /lib64/libc.so.6(+0x88d10)[0x200000ef8d10]
[taurusml31:26070] [ 3] /lib64/libc.so.6(cfree+0x4a0)[0x200000f09be0]
[taurusml31:26070] [ 4] /sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(_ZdlPv+0x18)[0x200000c47c38]
[taurusml31:26070] [ 5] /sw/installed/GCCcore/7.3.0/lib64/libstdc++.so.6(_ZdlPvm+0x18)[0x200000c47c78]
[taurusml31:26070] [ 6] ./picongpu(_ZN16cupla_cuda_async13cuplaFreeHostEPv+0x15c)[0x1055152c]
[taurusml31:26070] [ 7] ./picongpu(_ZN5pmacc6BufferINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3EED1Ev+0x2c)[0x1046504c]
[taurusml31:26070] [ 8] ./picongpu(_ZN5pmacc18DeviceBufferInternINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3EED2Ev+0x68)[0x10465188]
[taurusml31:26070] [ 9] ./picongpu(_ZN5pmacc10GridBufferINS_9SuperCellINS_5FrameINS_15ParticlesBufferINS_19ParticleDescriptionINS_4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEENS_4math2CT6VectorIN4mpl_10integral_cIiLi8EEESD_NSC_IiLi4EEEEEN5boost3mpl6v_itemIN8picongpu9weightingENSI_INSJ_8momentumENSI_INSJ_8positionINSJ_12position_picENS_13pmacc_isAliasEEENSH_7vector0INSB_2naEEELi0EEELi0EEELi0EEENSI_INSJ_11chargeRatioINSJ_20ChargeRatioElectronsESO_EENSI_INSJ_9massRatioINSJ_18MassRatioElectronsESO_EENSI_INSJ_7currentINSJ_13currentSolver9EsirkepovINSJ_9particles6shapes3TSCENS13_8strategy16CachedSupercellsELj3EEESO_EENSI_INSJ_13interpolationINSJ_28FieldToParticleInterpolationIS17_NSJ_30AssignedTrilinearInterpolationEEESO_EENSI_INSJ_5shapeIS17_SO_EENSI_INSJ_14particlePusherINS15_6pusher5BorisESO_EESS_Li0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS_17HandleGuardRegionINS_9particles8policies17ExchangeParticlesENS15_8boundary29CallPluginsAndDeleteParticlesEEESS_SS_EESF_N8mallocMC9AllocatorIN6alpaka3acc12AccGpuCudaRtISt17integral_constantImLm3EEjEENS21_16CreationPolicies7ScatterINSJ_16DeviceHeapConfigENS29_11ScatterConf27DefaultScatterHashingParamsEEENS21_20DistributionPolicies4NoopENS21_11OOMPolicies10ReturnNullENS21_19ReservePoolPolicies9AlpakaBufIS28_EENS21_17AlignmentPolicies6ShrinkINS2M_12ShrinkConfig19DefaultShrinkConfigEEEEELj3EE29OperatorCreatePairStaticArrayILj256EEENS4_IS7_SF_NSI_INS_9multiMaskENSI_INS_12localCellIdxESV_Li0EEELi0EEES1S_S1Z_SS_NSI_INS_12NextFramePtrINSB_3argILi1EEEEENSI_INS_16PreviousFramePtrIS31_EESS_Li0EEELi0EEEEEEEEELj3ES39_ED2Ev+0x5c8)[0x104660d8]
[taurusml31:26070] [10] ./picongpu(_ZThn56_N8picongpu9ParticlesIN5pmacc4meta6StringIJLc101ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0ELc0EEEEN5boost3mpl6v_itemINS_11chargeRatioINS_20ChargeRatioElectronsENS1_13pmacc_isAliasEEENS7_INS_9massRatioINS_18MassRatioElectronsESA_EENS7_INS_7currentINS_13currentSolver9EsirkepovINS_9particles6shapes3TSCENSG_8strategy16CachedSupercellsELj3EEESA_EENS7_INS_13interpolationINS_28FieldToParticleInterpolationISK_NS_30AssignedTrilinearInterpolationEEESA_EENS7_INS_5shapeISK_SA_EENS7_INS_14particlePusherINSI_6pusher5BorisESA_EENS6_7vector0IN4mpl_2naEEELi0EEELi0EEELi0EEELi0EEELi0EEELi0EEENS7_INS_9weightingENS7_INS_8momentumENS7_INS_8positionINS_12position_picESA_EES13_Li0EEELi0EEELi0EEEED0Ev+0xbc)[0x1047779c]
[taurusml31:26070] [11] ./picongpu(_ZNSt19_Sp_counted_deleterIPN5pmacc15ISimulationDataESt14default_deleteIS1_ESaIvELN9__gnu_cxx12_Lock_policyE2EE10_M_disposeEv+0x38)[0x102d4ce8]
[taurusml31:26070] [12] ./picongpu(_ZN5pmacc13DataConnectorD2Ev+0x24c)[0x10456aac]
[taurusml31:26070] [13] /lib64/libc.so.6(+0x44994)[0x200000eb4994]
[taurusml31:26070] [14] /lib64/libc.so.6(exit+0x24)[0x200000eb49e4]
[taurusml31:26070] [15] /lib64/libc.so.6(+0x25208)[0x200000e95208]
[taurusml31:26070] [16] /lib64/libc.so.6(__libc_start_main+0xc4)[0x200000e953f4]
[taurusml31:26070] *** End of error message ***
Abgebrochen

@PrometheusPi
Member Author

Thanks to @steindev for pointing out that this is most likely related to #3064 (possible duplicate).

@steindev steindev added the duplicate duplicate issue or pull-request (link main issue!) label Nov 19, 2020
@steindev steindev changed the title PIConGPU mallocmc error on taursml PIConGPU mallocmc error on taursml (possible duplicate of #3064) Nov 19, 2020
@PrometheusPi
Member Author

PrometheusPi commented Nov 19, 2020

Current list of "defective" (?) nodes:

| Node | Status |
|---|---|
| taurusml1 | defective |
| taurusml2 | defective |
| taurusml3 | defective |
| taurusml4 | defective |
| taurusml5 | pending |
| taurusml6 | defective |
| taurusml7 | defective |
| taurusml8 | defective |
| taurusml9 | defective |
| taurusml10 | defective |
| taurusml11 | defective |
| taurusml12 | defective |
| taurusml13 | defective |
| taurusml14 | defective |
| taurusml15 | defective |
| taurusml16 | defective |
| taurusml17 | defective |
| taurusml18 | defective |
| taurusml19 | defective |
| taurusml20 | defective |
| taurusml21 | defective |
| taurusml22 | defective |
| taurusml23 | defective |
| taurusml24 | defective |
| taurusml25 | defective |
| taurusml26 | defective |
| taurusml27 | defective |
| taurusml28 | defective |
| taurusml29 | defective |
| taurusml30 | defective |
| taurusml31 | defective |
| taurusml32 | defective |

@psychocoderHPC
Member

psychocoderHPC commented Nov 20, 2020

I prepared the code we need to debug the issue:

git remote add psychocoderHPC git@github.com:psychocoderHPC/picongpu.git
git fetch psychocoderHPC
# go to the branch you use for testing (it should ideally be based on the current PIConGPU dev, or a state from the last 2 months)
git cherry-pick 5dd93949e51c88bac25926bafcbc608d45794872
# compile on taurus and run interactively on a single GPU
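
With this patch applied, the run appears to print one replayable allocation statement per device allocation (see the stdout posted further down in this thread). To get a rough idea of how much memory those statements request, the following sketch could be used to tally them. It is not part of PIConGPU: the function names `requestedBytes` and `totalRequestedBytes` are hypothetical, and the regular expressions assume exactly the line format seen in that stdout. The result is only a lower bound, since pitch and alignment padding added by the CUDA runtime is not counted.

```cpp
#include <cstdint>
#include <istream>
#include <regex>
#include <string>

// Lower-bound estimate of the bytes requested by one logged allocation line.
// Padding added by the CUDA runtime (pitch/alignment) is NOT included.
// Returns 0 for lines that are not recognised allocation statements.
std::uint64_t requestedBytes(const std::string& line)
{
    // cudaMalloc3D lines: make_cudaExtent(widthInBytes, height, depth)
    static const std::regex extent3d(
        R"(make_cudaExtent\((\d+)llu,(\d+)llu,(\d+)llu\))");
    // cudaMallocPitch lines: the last two arguments are widthInBytes and height
    static const std::regex pitched(
        R"(cudaMallocPitch\(&memPtr,&pitchBytes,(\d+)llu,(\d+)llu\))");

    std::smatch m;
    if (std::regex_search(line, m, extent3d))
        return std::stoull(m[1].str()) * std::stoull(m[2].str())
             * std::stoull(m[3].str());
    if (std::regex_search(line, m, pitched))
        return std::stoull(m[1].str()) * std::stoull(m[2].str());
    return 0;
}

// Sum the estimate over every line of a stream, e.g. the saved stdout.
std::uint64_t totalRequestedBytes(std::istream& in)
{
    std::uint64_t total = 0;
    for (std::string line; std::getline(in, line);)
        total += requestedBytes(line);
    return total;
}
```

One way to use it would be to redirect the stdout of the single-GPU run into a file, open that file with `std::ifstream`, pass it to `totalRequestedBytes`, and compare the sum with the memory available on the device.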

@PrometheusPi
Member Author

The analysis is (nearly) finished: no node seems to be working.

@PrometheusPi
Member Author

@psychocoderHPC I will test your approach.

@PrometheusPi
Member Author

PrometheusPi commented Nov 20, 2020

@psychocoderHPC The stdout of the picongpu run on a single GPU can be found below:

PIConGPUVerbose PHYSICS(1) | Sliding Window is OFF
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1728llu,144llu,136llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,12llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,12llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1728llu,144llu,136llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,196608llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,3072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,48llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,12llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,12llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1728llu,144llu,136llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,393216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,96llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,589824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,128llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,9216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,144llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,2llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(1536llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,13824llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(24llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,216llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,324llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(36llu,3llu,3llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,324llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(576llu,144llu,136llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,4llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,4llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,32llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,32llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,1llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,131072llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,65536llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,128llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,1024llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,1llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(512llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,512llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(4llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,16llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,32llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(8llu,2llu,2llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,32llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,4llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,4llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(3072llu,128llu,128llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
PIConGPUVerbose PHYSICS(1) | used Random Number Generator: RNGProvider3XorMin seed: 42
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,0llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,64llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,0llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,0llu,1llu));}
{ size_t pitchBytes = 0; auto ex=make_cudaExtent(432llu,18llu,34llu); cudaPitchedPtr pptr;pptr.ptr=nullptr; CUDA_CHECK(cudaMalloc3D(&pptr,ex));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2359296llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2359296llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2359296llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2359296llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,786432llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,262144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,24576llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,8192llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,6144llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,8llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMallocPitch(&memPtr,&pitchBytes,2048llu,1llu));}
{ size_t pitchBytes = 0; void * memPtr; CUDA_CHECK(cudaMalloc(&memPtr,32139378688llu));}
full simulation time:  7sec 626msec = 7 sec

@PrometheusPi
Member Author

@psychocoderHPC If you need stderr as well, let me know. (I have already lost access to the node again.)

@psychocoderHPC
Member

psychocoderHPC commented Nov 20, 2020

native CUDA reproducer

main.txt

# please rename main.txt to main.cu
nvcc -std=c++14 -arch sm_70 main.cu
./a.out # this should crash on a bad node

[updated the example to guarantee a crash (increased the last allocation size)]

@PrometheusPi
Member Author

Result of offline work with @psychocoderHPC:
The issue is in the memory allocation: apparently, not all of the GPU memory can be allocated. ZIH IT has been informed.

@PrometheusPi
Member Author

PrometheusPi commented Nov 20, 2020

Increasing the reserved GPU memory from 350 MB to 2047 MB solved the problem for now.
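
For readers following along, here is a rough sketch of what the larger reserve means for the final `cudaMalloc` in the log above (32139378688 bytes). It rests on assumptions that are not confirmed in this thread: that the large request is simply the free device memory minus the reserve, that "MB" means MiB, and that the free memory stays the same between runs. `heapBytes`, `assumedFree` and `newRequest` are hypothetical names for illustration only, not PIConGPU code.

```cpp
#include <cstdint>

// Hypothetical helper (not PIConGPU code): bytes left for one large heap
// allocation once `reservedBytes` are held back from the free device memory.
// Returns 0 if the reserve is larger than the free memory.
constexpr std::uint64_t heapBytes(std::uint64_t freeBytes, std::uint64_t reservedBytes)
{
    return freeBytes > reservedBytes ? freeBytes - reservedBytes : 0;
}

constexpr std::uint64_t MiB = 1024ull * 1024ull;

// Size of the last cudaMalloc call in the log above.
constexpr std::uint64_t loggedRequest = 32139378688ull;

// ASSUMPTION: the logged request was "free memory minus a 350 MiB reserve".
constexpr std::uint64_t assumedFree = loggedRequest + 350ull * MiB;

// Request size under the same assumption with the 2047 MiB reserve.
constexpr std::uint64_t newRequest = heapBytes(assumedFree, 2047ull * MiB);

// The request shrinks by exactly the extra reserve: 2047 MiB - 350 MiB.
static_assert(loggedRequest - newRequest == 1697ull * MiB,
              "request shrinks by the additional reserve");
```

Under these assumptions the single large allocation drops by 1697 MiB to 30359945216 bytes, which would leave more headroom for whatever part of the device memory cannot be allocated on the affected nodes.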
