Picongpu: Compile issue with icc + Boost

Created on 1 Feb 2019 · 5 Comments · Source: ComputationalRadiationPhysics/picongpu

It turned out that we have issues with the size of the boost::mpl vector when using the Intel compiler (discovered with icc version 19.0.0.117, gcc version 7.3.0 compatibility), probably the same as here. The suspects are these workarounds in our CMake, which make us fall back on the pmacc defines. Changing the latter to 30, 40 or 50 does not help, as other issues appear (and more than 50 is not supported by Boost).

@psychocoderHPC and I spent some time investigating. The problem seems to be that sometimes there are too many combinations: e.g. with HDF5 and ADIOS the standard LaserWakefield example builds (and runs) fine, but KelvinHelmholtz does not build.
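To illustrate the "too many combinations" suspicion, here is a minimal sketch of how pairing a list of output items with a list of I/O backends can push a compile-time sequence past a fixed size cap. Everything in it is hypothetical and only for illustration: `TypeList`, `Item`, `MakeItems`, `combinationCount`, `fitsLimit` and the count of 30 items are not PMacc or Boost names, and the real numbers for the KelvinHelmholtz example are not known here. Only the cap of 50 is taken from the description above.

```cpp
#include <cstddef>
#include <utility>

// Cap on the sequence size; 50 is the largest value Boost supports
// according to the issue description.
constexpr std::size_t kMaxSequenceSize = 50;

// Minimal variadic type list (illustration only, not a PMacc/Boost type).
template <typename... Ts>
struct TypeList
{
    static constexpr std::size_t size = sizeof...(Ts);
};

// Hypothetical tags for the two I/O backends and for the output items.
struct Hdf5 {};
struct Adios {};
template <std::size_t I>
struct Item {};

// Helper that builds TypeList<Item<0>, ..., Item<N-1>>.
template <typename Seq>
struct MakeItemsImpl;

template <std::size_t... Is>
struct MakeItemsImpl<std::index_sequence<Is...>>
{
    using type = TypeList<Item<Is>...>;
};

template <std::size_t N>
using MakeItems = typename MakeItemsImpl<std::make_index_sequence<N>>::type;

// Number of (item, backend) pairs when every item is combined with every backend.
template <typename Items, typename Backends>
constexpr std::size_t combinationCount()
{
    return Items::size * Backends::size;
}

constexpr bool fitsLimit(std::size_t n)
{
    return n <= kMaxSequenceSize;
}

// 30 items is an arbitrary number chosen for the sketch.
using Items = MakeItems<30>;

// One backend: 30 combinations, within the cap.
static_assert(fitsLimit(combinationCount<Items, TypeList<Hdf5>>()),
              "30 combinations fit into a sequence of at most 50 elements");

// Two backends: 60 combinations, beyond the cap.
static_assert(!fitsLimit(combinationCount<Items, TypeList<Hdf5, Adios>>()),
              "60 combinations exceed a sequence of at most 50 elements");
```

Under these assumptions, dropping one backend halves the number of combinations, which would be consistent with the observation that the build succeeds when either HDF5 or ADIOS is disabled.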

Details on the failing configuration:

  • System: batch partitions of JURECA and JUWELS.
  • Our standard profiles for these partitions, with HDF5 and ADIOS.
  • Example: standard KelvinHelmholtz. Note that the standard LaserWakefield works fine, and KelvinHelmholtz works if we disable HDF5 or ADIOS.
  • Compiler: icc 19.0.0.117 (gcc version 7.3.0 compatibility).
  • Boost 1.68.0, as available on these systems for the enabled compiler. We suspect the issue is not specific to this version of Boost. Manually compiled Boost has not been tried yet.
  • PIConGPU version: latest dev e8c4268 . We suspect the issue is not specific to this or any recent commit.
  • Output of the failing build

Current workaround: reduce the number of combinations by disabling either ADIOS or HDF5. Untested: maybe reducing the number of combinations in fileOutput.param also helps.

omp2b bug

All 5 comments

Just to add additional info for issue readers and documentation:
Which version of boost did you test and how was it compiled? (Even though mpl is header-only, install might generate macros with maximum lengths for pre-variadic implementations.) On which OS & machine? What's the exact error message (snippet) for documentation? Are you referring to PIConGPU dev as of the latest commit e8c4268c1771d80a7017068a021d648571841cbe ? Did you try a specific example?
It's just generally helpful to have that info as well documented.
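As background on why "maximum lengths" exist at all for pre-variadic implementations, the following sketch contrasts a fixed-arity type sequence with a variadic one. It is a simplified imitation written for this thread, not Boost code: `FixedVector`, `VariadicVector` and the local `na` placeholder are hypothetical names, and the arity of 3 is chosen only to keep the example short.

```cpp
#include <cstddef>
#include <type_traits>

// Placeholder type marking an unused slot (illustration only).
struct na {};

// Pre-variadic style: the maximum arity (3 here) is baked into the declaration.
// Supporting longer sequences means generating a declaration with more slots.
template <typename T0 = na, typename T1 = na, typename T2 = na>
struct FixedVector
{
    // Count the slots that were actually filled in.
    static constexpr std::size_t size =
        (!std::is_same<T0, na>::value) +
        (!std::is_same<T1, na>::value) +
        (!std::is_same<T2, na>::value);
};

// Variadic style (C++11 and later): no arity is baked into the declaration.
template <typename... Ts>
struct VariadicVector
{
    static constexpr std::size_t size = sizeof...(Ts);
};

static_assert(FixedVector<int, char>::size == 2, "two of three slots used");
static_assert(VariadicVector<int, char, long, float>::size == 4,
              "four elements, no predefined maximum");

// FixedVector<int, char, long, float> would be rejected by the compiler,
// because the template only declares three parameters.
```

In the fixed-arity style, the cap is a property of the generated headers, which is why the way a library was installed or configured can matter even when it is header-only.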

Due to the lack of CI, our ICC support is indeed flaky and breaks now and then. Maybe we can remove the workarounds, as you suggest, for recent ICC+Boost combinations (and/or add further ones) :) We also have a new in-house CI ready, which we can integrate for future-proof checks; someone just has to do it with @tobiasfrust

But honestly, MPL is slow for CT and a legacy implementation. That's why we have to move all our code-bases to more modern implementations: #1997 (probably instead of investing time to fix it)

It was Boost 1.68.0 and a nearly up-to-date dev branch of PIConGPU.
We also played around with the workaround to see if this issue was solved in newer ICC versions. Removing the workaround does not solve the issue.

> But honestly, MPL is slow for CT and a legacy implementation. That's why we have to move all our code-bases to more modern implementations: #1997 (probably instead of investing time to fix it)

I agree, this issue was opened mostly to document this behavior.

> I agree, this issue was opened mostly to document this behavior.

yep, that's excellent :+1:

@ax3l good points, updated the description.

So it seems there is no quick fix we can do for this setup and this issue will just document the behaviour and provide workarounds. Some final points on this:

  • Manually building Boost with our instructions behaves the same as the system one: with HDF5 and ADIOS the LWFA example builds fine, while the KHI one fails.
  • Reducing the number of combinations in fileOutput.param does not provide a workaround.