I am opening this issue for the sake of record and discussion.
Recently I came across a mesh file that causes an error when ReorientTetMesh is called in parallel with certain numbers of MPI tasks. The error looks similar to the one reported in https://github.com/mfem/mfem/issues/1028, but it differs in that it occurs without refinement. Also, the error message identifies the offending shared triangle, which was not seen before. Another difference is that the mesh is 2nd order.
The error occurs with 192 and 256 MPI tasks, but not with 32, 64, or 128. We are using MFEM 4.1, in case that matters.
The problematic mesh file has already been sent to Mark off-line. Please let me know what you think.
Verification failed: (stria_flag[i] == stria_master_flag[i]) is false:
--> inconsistent vertex ordering found, shared triangle 230: (475, 472, 688), local flag: 1, master flag: 0
... in function: void mfem::ParMesh::ReorientTetMesh()
... in file: /global/homes/s/shiraiwa/.conda/envs/20200331/src/mfem/mesh/pmesh.cpp:2664
Rank 119 [Tue Oct 13 14:31:42 2020] [c3-0c2s14n2] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 119
Verification failed: (stria_flag[i] == stria_master_flag[i]) is false:
--> inconsistent vertex ordering found, shared triangle 275: (162, 738, 546), local flag: 0, master flag: 1
... in function: void mfem::ParMesh::ReorientTetMesh()
... in file: /global/homes/s/shiraiwa/.conda/envs/20200331/src/mfem/mesh/pmesh.cpp:2664
Rank 143 [Tue Oct 13 14:31:42 2020] [c3-0c2s14n3] application called MPI_Abort(MPI_COMM_WORLD, 1) - process 143
Hi Syun'ichi, I'll take a look at this issue -- I thought we fixed the algorithm in ParMesh::ReorientTetMesh() in #1119 but maybe not.
I was able to load and view the mesh in serial with MFEM built in non-debug mode. However, in debug mode, I get an "invalid mesh topology" error from this check: https://github.com/mfem/mfem/blob/7e45098ad1bc9a52fb8f968addb2b82baf71f854/mesh/mesh.cpp#L2611-L2623
What this means is that, topologically, two elements with a common face are on the same side of that common face.
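To make the "same side of the common face" condition concrete, here is a minimal geometric sketch for straight-sided tets. It is not the MFEM check linked above, which works on the mesh connectivity; the names `Vec3`, `signed_volume6`, and `apexes_on_same_side` are hypothetical and introduced only for illustration. Given the shared triangle (a, b, c) and the fourth vertex of each of the two tets, it compares the signs of the two signed volumes.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;

// Six times the signed volume of the tetrahedron (a, b, c, d):
// the determinant of the edge vectors (b - a, c - a, d - a).
double signed_volume6(const Vec3 &a, const Vec3 &b, const Vec3 &c, const Vec3 &d)
{
    const Vec3 u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
    const Vec3 v{c[0] - a[0], c[1] - a[1], c[2] - a[2]};
    const Vec3 w{d[0] - a[0], d[1] - a[1], d[2] - a[2]};
    return u[0] * (v[1] * w[2] - v[2] * w[1])
         - u[1] * (v[0] * w[2] - v[2] * w[0])
         + u[2] * (v[0] * w[1] - v[1] * w[0]);
}

// True when the fourth vertices d1 and d2 of two tets sharing the face
// (a, b, c) lie strictly on the same side of that face. In a valid mesh
// the two elements sit on opposite sides, so "true" flags a problem.
bool apexes_on_same_side(const Vec3 &a, const Vec3 &b, const Vec3 &c,
                         const Vec3 &d1, const Vec3 &d2)
{
    return signed_volume6(a, b, c, d1) * signed_volume6(a, b, c, d2) > 0.0;
}
```

For extremely thin elements the signed volumes are close to zero, so a geometric test like this one becomes sensitive to round-off.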
I also get the message:
Elements with wrong orientation: 6 / 820958 (NOT FIXED)
which could be a related issue. Note that this check only examines the element centers. A more detailed check in mesh-explorer (option x with subdivision factor 3) shows 10319 bad tets, and there may be more (a larger subdivision factor may be needed to see them).
We may be able to fix some of the inverted elements by moving nodes with TMOP; however, that cannot help with the topological issue.
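As a rough illustration of why a center-only check can miss inverted high-order elements, consider a 1D analogue: a quadratic element with end nodes x0, x1 and a mid node xm. This is only a sketch under that simplifying assumption, not what MFEM or mesh-explorer actually compute, and the function names are hypothetical. The Jacobian at the center equals x1 - x0 regardless of where the mid node sits, while sampling more points can reveal a sign change.

```cpp
#include <algorithm>
#include <cassert>

// Derivative dx/dxi of the quadratic Lagrange map on xi in [0, 1] that
// passes through x0 at xi = 0, xm at xi = 0.5, and x1 at xi = 1.
double jacobian(double x0, double xm, double x1, double xi)
{
    return x0 * (4.0 * xi - 3.0) + xm * (4.0 - 8.0 * xi) + x1 * (4.0 * xi - 1.0);
}

// Smallest Jacobian over n + 1 equally spaced sample points in [0, 1].
// A negative value means the element folds over somewhere.
double min_sampled_jacobian(double x0, double xm, double x1, int n)
{
    assert(n >= 1);
    double jmin = jacobian(x0, xm, x1, 0.0);
    for (int i = 1; i <= n; ++i)
    {
        jmin = std::min(jmin, jacobian(x0, xm, x1, static_cast<double>(i) / n));
    }
    return jmin;
}
```

With x0 = 0, xm = 0.9, x1 = 1, the Jacobian at the center (xi = 0.5) is positive, yet sampling with n = 3 gives a negative minimum because the map folds near xi = 1. A finer sampling can only find a smaller or equal minimum when the coarser sample points are a subset of the finer ones.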
Thank you @v-dobrev. This is a great guide for diagnosing and narrowing down the root cause. I will take a closer look at the suspicious mesh elements.
Please close this. A close look at the mesh file revealed that there are very (extremely) thin elements, which become topologically invalid, as @v-dobrev pointed out. After cleaning the input CAD geometry and revising the mesh sequence, the topology error is eliminated. I think this is the root cause. We are going to test the new mesh file on the cluster tomorrow as a final check.