Hello @KratosMultiphysics/technical-committee ,
I managed to compile Kratos (and the PoromechanicsApplication) in MPI, but when I run a test case I get the following warning:
[WARNING] ParallelFillCommunicator: All nodes have a PARTITION_INDEX index of 0! This could mean that PARTITION_INDEX was not assigned
Then a segmentation fault occurs. I guess the two are related. Do you know when and where PARTITION_INDEX is supposed to be assigned?
I copy here all the output just in case you see something wrong:
ipouplana@ipouplana:~/Ara/mpi/poro_mpi_test_1.gid$ mpiexec -np 2 python3 MainKratos.py --using-mpi
| / |
' / __| _` | __| _ \ __|
. \ | ( | | ( |\__ \
_|\_\_| \__,_|\__|\___/ ____/
Multi-Physics 7.1.0-cad4fbcde6-Release
Compiled with OpenMP and MPI support.
Maximum OpenMP threads: 1.
MPI world size: 2.
Importing KratosExternalSolversApplication
Initializing KratosExternalSolversApplication...
Importing KratosFluidDynamicsApplication
Initializing KratosFluidDynamicsApplication...
Importing KratosStructuralMechanicsApplication
KRATOS ___| | | |
\___ \ __| __| | | __| __| | | __| _` | |
| | | | | ( | | | | ( | |
_____/ \__|_| \__,_|\___|\__|\__,_|_| \__,_|_| MECHANICS
Initializing KratosStructuralMechanicsApplication...
Importing KratosPoromechanicsApplication
Initializing KratosPoromechanicsApplication...
Poromechanics Analysis: Sat Mar 21 17:00:19 2020
Poromechanics Analysis: MPI parallel configuration. OMP_NUM_THREADS = 1
Importing KratosTrilinosApplication
KRATOS _____ _ _ _
|_ _| __(_) (_)_ __ ___ ___
| || '__| | | | '_ \ / _ \/ __|
| || | | | | | | | | (_) \__ \
|_||_| |_|_|_|_| |_|\___/|___/
Initializing KratosTrilinosApplication...
UPwSolver: Construction of UPwSolver finished.
TrilinosUPwSolver: : Construction of MPI UPwSolver finished.
UPwSolver: Variables added correctly.
Importing KratosMetisApplication
KRATOS __ __ _ _
| \/ | ___| |_(_)___
| |\/| |/ _ \ __| / __|
| | | | __/ |_| \__ \
|_| |_|\___|\__|_|___/
Initializing KratosMetisApplication...
Node Partition
Partition 0: 125 objects.
Partition 1: 126 objects.
Element Partition
Partition 0: 219 objects.
Partition 1: 225 objects.
Condition Partition
Partition 0: 0 objects.
Partition 1: 14 objects.
No isolated nodes found.
NumColors : 1
ModelPartIO: [Total Lines Read : 1594]
::[DistributedImportModelPartUtility]::: Metis divide finished.
ModelPartIO: [Reading Nodes : 137 nodes read]
ModelPartIO: [Reading Elements : 219 elements read] [Type: UPwSmallStrainFICElement2D3N]
ModelPartIO: [Reading Conditions : 0 conditions read] [Type: UPwFaceLoadCondition2D2N]
ModelPartIO: [Total Lines Read : 1137]
Read materials: Started
Read materials: Finished
UPwSolver: Constitutive law was successfully imported via json.
UPwSolver: Model reading finished.
[WARNING] ParallelFillCommunicator: All nodes have a PARTITION_INDEX index of 0! This could mean that PARTITION_INDEX was not assigned
[WARNING] ParallelFillCommunicator: All nodes have a PARTITION_INDEX index of 0! This could mean that PARTITION_INDEX was not assigned
::[DistributedImportModelPartUtility]::: MPI communicators constructed.
TrilinosUPwSolver: : Model reading finished.
UPwSolver: DOFs added correctly.
TrilinosUPwSolver: : Solver initialization finished.
Poromechanics Analysis: Analysis -START-
Poromechanics Analysis: STEP: 1
Poromechanics Analysis: TIME: 0.01
[ipouplana:03054] *** Process received signal ***
[ipouplana:03053] *** Process received signal ***
[ipouplana:03053] Signal: Segmentation fault (11)
[ipouplana:03053] Signal code: Address not mapped (1)
[ipouplana:03053] Failing at address: 0xb9
[ipouplana:03054] Signal: Segmentation fault (11)
[ipouplana:03054] Signal code: Address not mapped (1)
[ipouplana:03054] Failing at address: (nil)
[ipouplana:03053] [ 0] [ipouplana:03054] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7f40c30bcf20]
[ipouplana:03053] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20)[0x7fc13bfa9f20]
[ipouplana:03054] [ 1] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x9b2ea)[0x7f409c48d2ea]
[ipouplana:03053] [ 2] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0xc3228/home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0xc3204)[0x7fc115430204]
[ipouplana:03054] [ 2] )[0x7f409c4b5228]
[ipouplana:03053] [ 3] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3dd74d)/home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3dd74d)[0x7fc11574a74d]
[ipouplana:03054] [ 3] [0x7f409c7cf74d]
[ipouplana:03053] [ 4] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3e0d6a)[0x7fc11574dd6a]
[ipouplana:03054] [ 4] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3c32a6/home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3e0d6a)[0x7f409c7d2d6a]
[ipouplana:03053] [ 5] )[0x7fc1157302a6]
[ipouplana:03054] [ 5] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0x3c32a6)[0x7f409c7b52a6]
/home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0xed3f2)[0x7fc11545a3f2]
[ipouplana:03054] [ 6] [ipouplana:03053] [ 6] /home/ipouplana/Kratos/bin/Release/libs/KratosTrilinosApplication.cpython-36m-x86_64-linux-gnu.so(+0xed3f2)[0x7f409c4df3f2]
[ipouplana:03053] [ 7] python3(_PyCFunction_FastCallDict+0x35c)[0x5674fc]
[ipouplana:03054] [ 7] python3(_PyCFunction_FastCallDict+0x35c)[0x5674fc]
[ipouplana:03053] [ 8] python3[0x50abb3]
[ipouplana:03054] [ 8] python3[0x50abb3]
[ipouplana:03053] [ 9] python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
[ipouplana:03054] [ 9] python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
[ipouplana:03053] [10] python3[0x509d48]
[ipouplana:03054] [10] python3[0x509d48]
[ipouplana:03053] [11] python3[0x50aa7d]
[ipouplana:03054] [11] python3(_PyEval_EvalFrameDefault+0xpython3[0x50aa7d]
[ipouplana:03053] [12] 449)[0x50c5b9]
[ipouplana:03054] [12] python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
[ipouplana:03053] python3[0x509d48]
[13] [ipouplana:03054] [13] python3[0x509d48]
[ipouplana:03053] [14] python3[0x50aa7d]
[ipouplana:03054] [14] python3[0x50aa7d]
[ipouplana:03053] [15] python3(_PyEval_EvalFrameDefault+0x449)python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
[ipouplana:03053] [16] [0x50c5b9]
[ipouplana:03054] [15] python3[0x509d48]
[ipouplana:03053] [17] python3[0x509d48]
[ipouplana:03054] [16] python3[0x50aa7d]
[ipouplana:03053] [18] python3[0x50aa7d]
[ipouplana:03054] [17] python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
[ipouplana:03053] python3[19] (_PyEval_EvalFrameDefault+0x449)[0x50c5b9python3[0x509d48]
[ipouplana:03053] [20] ]
[ipouplana:03054] [18] python3[0x50aa7d]
[ipouplana:03053] [21] python3[0x509d48]
[ipouplana:03054] [19] python3(_PyEval_EvalFrameDefault+0x449)[0x50c5b9]
python3[0x50aa7d]
[ipouplana:03054] [20] [ipouplana:03053] [22] python3[0x508245]
[ipouplana:03053] [23] python3(_PyEval_EvalFrameDefault+0x449)[0xpython3(PyEval_EvalCode+0x23)[0x50b40350c5b9]
[ipouplana:03054] [21] ]
[ipouplana:03053] python3[0x508245]
[ipouplana:03054] [24] python3[0x635222]
[ipouplana:03053] [25] [22] python3(PyEval_EvalCode+0x23)[0xpython3(PyRun_FileExFlags+0x97)[0x6352d7]
50b403]
[ipouplana:03054] [23] [ipouplana:03053] [26] python3[0x635222]
python3(PyRun_SimpleFileExFlags+0x17f)[0x638a8f]
[ipouplana:03054] [24] [ipouplana:03053] [27] python3(PyRun_FileExFlags+0x97)[0x6352d7]
[ipouplana:03054] [25] python3(Py_Main+0x591)[0x639631]
[ipouplana:03053] [28] python3(main+0xe0)[0x4b0f40]
[ipouplana:03053] [29] python3(PyRun_SimpleFileExFlags+0x17f)[0x638a8f]
[ipouplana:03054] [26] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7f40c309fb97]
python3(Py_Main+0x591)[0x639631]
[ipouplana:03053] *** End of error message ***
[ipouplana:03054] [27] python3(main+0xe0)[0x4b0f40]
[ipouplana:03054] [28] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)[0x7fc13bf8cb97]
[ipouplana:03054] [29] python3(_start+0x2a)[0x5b2fda]
[ipouplana:03054] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node ipouplana exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
ipouplana@ipouplana:~/Ara/mpi/poro_mpi_test_1.gid$
Many thanks in advance!
At first glance it seems that your reading is done in serial so you are not partitioning the mesh.
Use the DistributedImportModelPartUtility.
@rubenzorrilla I also thought that, but the warning appears at this line
The message can still appear if one of the partitions is empty after reading the parallel mesh.
Could you check that the modelparts that are written (nameofyourmodelpart_0.mdpa, nameofyourmodelpart_1.mdpa, ...) are all non empty?
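A minimal sketch of how that check could be scripted, assuming the partitioned files follow the `<name>_<rank>.mdpa` naming mentioned above and use the standard `Begin Nodes` / `End Nodes` block. The helper names `count_nodes` and `nodes_per_partition` are hypothetical and not part of Kratos:

```python
import glob
import os


def count_nodes(mdpa_path):
    """Count the entries between 'Begin Nodes' and 'End Nodes' in one mdpa file."""
    count = 0
    inside_nodes = False
    with open(mdpa_path) as mdpa_file:
        for line in mdpa_file:
            stripped = line.strip()
            if stripped.startswith("Begin Nodes"):
                inside_nodes = True
            elif stripped.startswith("End Nodes"):
                inside_nodes = False
            elif inside_nodes and stripped and not stripped.startswith("//"):
                # Every non-empty, non-comment line inside the block is one node.
                count += 1
    return count


def nodes_per_partition(directory, base_name):
    """Map each '<base_name>_<rank>.mdpa' file in 'directory' to its node count."""
    pattern = os.path.join(directory, base_name + "_*.mdpa")
    return {
        os.path.basename(path): count_nodes(path)
        for path in sorted(glob.glob(pattern))
    }
```

Calling `nodes_per_partition(".", "nameofyourmodelpart")` in the folder that holds the partitioned files returns a dictionary of node counts per file; any entry equal to 0 would indicate an empty partition.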
I just checked the two written model parts and they are non empty...
The message can still appear if one of the partitions is empty after reading the parallel mesh.
Really? o,o
That was not the intention; I will fix it.
For the overall issue, let me take a look,
although at first sight I don't see a problem.
Do you know exactly where it crashes? It seems like it is in the first timestep
Try setting the number of OpenMP threads to 1 to see whether a meaningful exception is thrown.
@RiccardoRossi I already fixed the number of threads to 1... If you look at the beginning of the output it says:
Maximum OpenMP threads: 1.
Do you know when and where the PARTITION_INDEX is supposed to be assigned ?
This should be assigned during the partitioning process (i.e., by Metis).
It is strange that this does not work for you. I added this warning mostly because I keep forgetting to assign PARTITION_INDEX when I do the partitioning by hand.
Could you post the mdpa file you are using?
Here I attach the case zipped:
I looked at the warning and it checks the wrong thing, so you can ignore it. I will fix it in the next few days.
@roigcarlo I looked for where PARTITION_INDEX is assigned, but I couldn't find it anywhere. Do you know where this happens? I thought it was somewhere in the MetisApp, but searching for PARTITION_INDEX does not give a single hit.
It is not assigned in the code; it is written when we create the partition:
So when we perform the second read, it is already there.
The value is calculated in the Metis process and passed to the modelpart io via the function in the link.
Thanks, I suspected this.
But where is this done when the partitioning is done in memory?
Same place.
As you can see that uses not one but two modelpart_io instances. The serial_model_part_io gets its buffer automatically assigned from the metis in memory process:
When the partitioning is done in memory, the partition data is written to a stream buffer instead of a file and passed directly to the modelpart io (the swap function), which then reads it from the stream instead of a regular file.
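The idea of writing to a stream and reading it back can be illustrated with a small, self-contained Python sketch. It does not use the Kratos API: `write_partition` and `read_partition_index` are hypothetical helpers, and the text layout is a simplified assumption modelled on the mdpa `Begin NodalData` block (node id, fixity flag, value). The point is only that a reader written against a stream works the same on an in-memory buffer as on a file:

```python
import io


def write_partition(stream, nodes, partition_of):
    """Write the nodes and their PARTITION_INDEX to any text stream."""
    stream.write("Begin Nodes\n")
    for node_id, (x, y, z) in sorted(nodes.items()):
        stream.write(f"{node_id} {x} {y} {z}\n")
    stream.write("End Nodes\n\n")
    stream.write("Begin NodalData PARTITION_INDEX\n")
    for node_id in sorted(nodes):
        # Columns: node id, fixity flag (always 0 here), owning partition.
        stream.write(f"{node_id} 0 {partition_of[node_id]}\n")
    stream.write("End NodalData\n")


def read_partition_index(stream):
    """Read the PARTITION_INDEX block back from a file or in-memory stream."""
    result = {}
    inside = False
    for line in stream:
        stripped = line.strip()
        if stripped == "Begin NodalData PARTITION_INDEX":
            inside = True
        elif stripped == "End NodalData":
            inside = False
        elif inside and stripped:
            node_id, _is_fixed, value = stripped.split()
            result[int(node_id)] = int(value)
    return result


# Hypothetical three-node mesh split over two partitions.
nodes = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (1.0, 1.0, 0.0)}
partition_of = {1: 0, 2: 0, 3: 1}

buffer = io.StringIO()          # stands in for the in-memory stream buffer
write_partition(buffer, nodes, partition_of)
buffer.seek(0)                  # rewind so the reader starts at the beginning
indices = read_partition_index(buffer)
```

Replacing `io.StringIO()` with `open("some_file.mdpa", "w")` followed by a reopen for reading would exercise the file-based path with the same two functions.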
The warning should be fixed in #6637.
As @philbucher said, this should be fixed. Please reopen if needed.