PIConGPU: mpiexec error

Created on 20 Aug 2020 · 5 Comments · Source: ComputationalRadiationPhysics/picongpu

Hello,

I have probably met the error below before, but I forgot its cause. There is no problem with the input files, as they are used for a model running on another computer. I just tried to run this model on a computer that might be behind on the latest updates for PIConGPU.

Here is the compilation output:

out.txt

and the runtime error:

Running program...
[mpiexec@T7500] match_arg (utils/args/args.c:163): unrecognized argument am
[mpiexec@T7500] HYDU_parse_array (utils/args/args.c:178): argument matching returned error
[mpiexec@T7500] parse_args (ui/mpich/utils.c:1642): error parsing input array
[mpiexec@T7500] HYD_uii_mpx_get_parameters (ui/mpich/utils.c:1694): unable to parse user arguments
[mpiexec@T7500] main (ui/mpich/mpiexec.c:148): error parsing parameters

Any help is appreciated. Thank you.

question

All 5 comments

Hello @cbontoiu. Perhaps that happened earlier as well, but I do not remember either.

So the compilation of PIConGPU with the model does seem to have gone fine. This error, however, is related to the run-time parameters. In order to investigate, please attach the (zipped) tbg subdirectory of your output directory. Even though PIConGPU itself did not start, this subdirectory should have been created before that. If for some reason it was not, please provide the .cfg file used.

However, I personally can only take a look at that tomorrow.

Here it is. Thanks.

tbg.zip

Thanks. So in that archive there is a file, submit.start, that contains the exact commands submitted to your machine after tbg had done all its work and the settings from your environment were applied. In particular, line 57 has the PIConGPU launch. It contains the -am parameter, which your implementation of mpiexec does not understand.
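As a side note, the function names in the error trace (HYDU_parse_array, ui/mpich/utils.c) suggest that this mpiexec is MPICH's Hydra launcher, whereas -am is an Open MPI option. A minimal sketch of how one might check which implementation is installed is given below. The marker strings are assumptions based on typical `mpiexec --version` output, and the function names are hypothetical helpers, not part of PIConGPU or tbg.

```python
import subprocess


def classify_mpiexec(version_text: str) -> str:
    """Guess the MPI launcher family from the text printed by `mpiexec --version`.

    Assumption: MPICH's Hydra mentions "HYDRA" in its build details, and
    Open MPI mentions "Open MPI" or "OpenRTE". Other launchers return "unknown".
    """
    text = version_text.lower()
    if "hydra" in text:
        return "mpich-hydra"
    if "open mpi" in text or "openrte" in text:
        return "openmpi"
    return "unknown"


def mpiexec_version_text(executable: str = "mpiexec") -> str:
    """Run `<executable> --version` and return its combined output (not called here)."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout + result.stderr


# Illustrative inputs only; the real output on your machine may differ.
print(classify_mpiexec("HYDRA build details:\n    Version: 3.3"))
print(classify_mpiexec("mpiexec (OpenRTE) 4.0.5"))
```

On the affected machine, `classify_mpiexec(mpiexec_version_text())` would be the call to try. If it reports a Hydra launcher, that would be consistent with the -am option being rejected.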

I think the simplest fix to try is to just remove it, as this is a setting for MPI and not PIConGPU-specific. There are two ways to try it:

  • manually: take the contents of that submit.start file, remove the fragment -am /media/cristi/Depozit/PARALLEL_LAYERS_PLATES_2D_40_3.70e+29_las_65.0nm_1.0e-03_10fs_LX_msh_2400_2400_st_539/tbg/openib.conf --mca mpi_leave_pinned 0 (I am not sure about the --mca part, but it seems related to -am) and execute the result in the terminal. I believe we also suggested this technique to investigate some other issue.
  • edit the .tpl file used in your tbg run to remove the same fragment. I assume you use this one; if so, remove -am !TBG_dstPath/tbg/openib.conf --mca mpi_leave_pinned 0 from line 60. Then resubmit your job with tbg using the new .tpl file. You can check in [new_output_directory]/tbg/submit.start whether the mpiexec parameters were indeed changed.
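Either variant comes down to deleting the same fragment from a text file. A minimal sketch of that edit follows, assuming the fragment appears on one line exactly as quoted above. The function name and the example launch line (including `-n 4 ./bin/picongpu`) are hypothetical and not taken from the actual submit.start. The --mca part is removed together with -am, following the suggestion above.

```python
import re

# Matches " -am <file>" optionally followed by "--mca mpi_leave_pinned <value>".
# The leading whitespace is consumed so that no double space is left behind.
AM_FRAGMENT = re.compile(r"\s-am\s+\S+(?:\s+--mca\s+mpi_leave_pinned\s+\S+)?")


def strip_am_option(text: str) -> str:
    """Return `text` with every -am fragment (and its --mca companion) removed."""
    return AM_FRAGMENT.sub("", text)


# Hypothetical launch line, for illustration only.
line = "mpiexec -am !TBG_dstPath/tbg/openib.conf --mca mpi_leave_pinned 0 -n 4 ./bin/picongpu"
print(strip_am_option(line))
```

To apply it to a file, one could read the .tpl (or submit.start) contents with `pathlib.Path.read_text()`, pass them through `strip_am_option`, and write the result to a new file, so that the original stays available for comparison.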

I can confirm that using Spack to install the latest version of openmpi (v. 4.0.5) and the latest picongpu@develop, as of August 2020, removes this problem.
