Orleans: -Failed to get ping responses from all 1 silos that are currently listed as Active in the Membership table

Created on 22 Mar 2019 · 5 comments · Source: dotnet/orleans

Orleans.Runtime.MembershipService.MembershipOracleData[100661]
-Failed to get ping responses from all 1 silos that are currently listed as Active in the Membership table. Newly joining silos validate connectivity with all pre-existing silos that are listed as Active in the table and have written I Am Alive in the table in the last 00:10:00 period, before they are allowed to join the cluster. Active silos are: [[SiloAddress=S10.0.75.1:11111:290933742 SiloName=Silo_048a5 Status=Active HostName=LiBo ProxyPort=30000 RoleName= UpdateZone=0 FaultZone=0 StartTime = 2019-03-22 06:55:43.000 GMT IAmAliveTime = 2019-03-22 07:00:51.000 GMT ]]
warn: Orleans.Runtime.Scheduler stg/15/0000000f.WorkItemGroup[101215]
Task [Id=1653, Status=Faulted] in WorkGroup [SystemTarget: S172.19.227.65:11111:290933946
stg/15/0000000f@S0000000f] took elapsed time 0:00:00.2714147 for execution, which is longer than 00:00:00.2000000. Running on thread System.Threading.Thread

All 5 comments

So what is the question here?

A wild guess (I may be wrong - not enough data) - you might be restarting a previously abruptly shut down cluster with the same cluster ID, and some silos are still listed in the table as Active while they are actually dead.

Closing due to inactivity. Feel free to reopen if needed.

cluster ID

That's right. We are working as a team, and everyone was using the same cluster ID while debugging locally. We later changed the setup so that each developer debugs with a separate cluster ID, and now everything works normally. The problem is solved.

you might be restarting a previously abruptly shut down cluster with the same cluster ID, and some silos are still listed in the table as Active while they are actually dead.

We have exactly the same issue in this case.

What can we do about the startup of the very first silo in the cluster? It always seems to fail with a timeout at start.

What can we do about the startup of the very first silo in the cluster? It always seems to fail with a timeout at start.

If this is a dev/test scenario, the recommendation is to generate a unique cluster ID for each run.
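The recommendation above can be sketched as follows. This is a minimal illustration, assuming the Orleans 3.x generic-host APIs (`Microsoft.Orleans.Server` with `Microsoft.Extensions.Hosting`); the service name `MyService` is a placeholder, and localhost clustering stands in for whatever membership provider you actually use:

```csharp
// Sketch: generate a fresh ClusterId per debug run so a new silo never
// tries to ping stale "Active" rows left behind by abruptly killed silos.
using System;
using Microsoft.Extensions.Hosting;
using Orleans.Configuration;
using Orleans.Hosting;

var host = Host.CreateDefaultBuilder(args)
    .UseOrleans(siloBuilder =>
    {
        siloBuilder
            .UseLocalhostClustering()
            .Configure<ClusterOptions>(options =>
            {
                // Unique per run: each developer/debug session gets its
                // own logical cluster in the membership table.
                options.ClusterId = $"dev-{Guid.NewGuid():N}";
                options.ServiceId = "MyService"; // placeholder name
            });
    })
    .Build();

await host.RunAsync();
```

Keeping `ServiceId` stable while varying `ClusterId` preserves grain persistence across runs while isolating each run's membership, which is why only the cluster ID is randomized here.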

