Picongpu: ISAAC on CUDA 7.5

Created on 17 Nov 2017 · 12 Comments · Source: ComputationalRadiationPhysics/picongpu

In the current dev as of 988547900, compiling ISAAC 1.3.3 on CUDA 7.5 fails with:

lib/isaac/isaac_kernel.hpp(213): internal error:
  assertion failed: alloc_copy_of_pending_pragma: copied pragma has source sequence entry
  (/dvs/p4/build/sw/rel/gpu_drv/r352/r352_00/drivers/compiler/edg/EDG_4.9/src/pragma.c, line 512)

Modules

  1) gcc/4.9.2                     4) cuda/7.5
  2) cmake/3.7.2                   5) openmpi/1.8.4.kepler.cuda75
  3) boost/1.62.0                  6) pngwriter/0.5.6

Build

pic-build -b cuda:37

# ALPAKA_ACC_GPU_CUDA_ONLY_MODE
# ALPAKA_ACC_GPU_CUDA_ENABLED

on Hypnos4 & Hypnos 5

cuda plugin

All 12 comments

I am on it

In parallel, I am already testing CUDA 8.0 to see if it's throwing there as well

Update: CUDA 8.0.44 & GCC 5.3.0 is fine

At least I can reproduce! :smile:

What the hell did you do to my server! It ran for months until you started to do things...

*** Error in `./isaac': malloc(): memory corruption: 0x00007efe7c85b190 ***

However I restarted it inside a gdb session now. So if it crashes again in some months, I'll get a backtrace :smiley:

Seems to be related to https://github.com/roboptim/roboptim-core/issues/106. I'll try deactivating some GCC diagnostic pragmas. If that helps, I can simply not suppress the warning for older gcc/nvcc versions. :smile_cat:

> However I restarted it inside a gdb session now. So if it crashes again in some months, I'll get a backtrace :smiley:

challenge accepted. I will crash it right away!

P.S.: gdb on the server makes the visualization reaaaaally slow.

Okay, I found the cause of the error! It's these 3 #pragma lines:
https://github.com/ComputationalRadiationPhysics/isaac/blob/dev/lib/isaac/isaac_kernel.hpp#L215

However, I don't know whether this is an nvcc 7.5 problem, a gcc 4.9 problem, or both. I will check with nvcc 8.0 and gcc 4.9.

related to the link above, I think it's a nvcc 7.5 issue when they are used.

I'll find out :shipit:

the nice thing is, it's a GCC 5 warning you want to suppress there. since nvcc 7.5 did support only gcc <=4.9, you can just add a check to only add the pragmas for gcc 5.0+ (and due to that nvcc 8+ where nothing crashes).
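A minimal sketch of the guard suggested above, assuming the three lines are a `push` / `ignored` / `pop` group of GCC diagnostic pragmas. The macro name `EXAMPLE_SUPPRESS_GCC5_WARNING`, the function `example_kernel_body`, and the warning `-Wbool-compare` are stand-ins for illustration only; the actual warning suppressed in `isaac_kernel.hpp` is not named in this thread.

```cpp
#include <cassert>

// Hypothetical guard: only emit the diagnostic pragmas for GCC 5.0 or newer.
// CUDA 7.5 supports host GCC only up to 4.9, so with this check its front end
// never sees the pragmas that trigger the internal error.
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 5)
#    define EXAMPLE_SUPPRESS_GCC5_WARNING 1
#else
#    define EXAMPLE_SUPPRESS_GCC5_WARNING 0
#endif

#if EXAMPLE_SUPPRESS_GCC5_WARNING
#    pragma GCC diagnostic push
// Assumption: "-Wbool-compare" is a placeholder for whichever GCC 5 warning
// the real header wants to silence.
#    pragma GCC diagnostic ignored "-Wbool-compare"
#endif

// Placeholder for the code that would otherwise trigger the warning.
inline int example_kernel_body(int x)
{
    return x * 2;
}

#if EXAMPLE_SUPPRESS_GCC5_WARNING
#    pragma GCC diagnostic pop
#endif

// Tiny self-check so the sketch can be exercised from a test or main().
inline void example_self_check()
{
    assert(example_kernel_body(21) == 42);
}
```

With GCC 4.9 (and therefore nvcc 7.5) the preprocessor drops all three pragmas, so the code compiles as if they were never written; with GCC 5+ the warning stays suppressed only between `push` and `pop`.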

this issue might actually be related to the CUDA 7.5 cudafe++ hang we experience in https://github.com/ComputationalRadiationPhysics/picongpu/pull/2368#issuecomment-343925035

Neat!

Okay, as expected: this bug does not happen with nvcc 8.0 and gcc 4.9
