Silo failed to start with error:
info: Orleans.Runtime.Silo[100452]
Start Incoming message agents took 33 Milliseconds to finish
info: Orleans.Threading.ThreadPoolThread[0]
Starting thread Runtime.Messaging.IncomingMessageAgent/Application0 on managed thread 22
info: Orleans.Runtime.GrainDirectory.LocalGrainDirectory[0]
Start
info: Runtime.GrainDirectory.AdaptiveDirectoryCacheMaintainer1[0]
Starting AsyncAgent Runtime.GrainDirectory.AdaptiveDirectoryCacheMaintainer1 on managed thread 5
info: Runtime.GrainDirectory.GlobalSingleInstanceActivationMaintainer[0]
Starting AsyncAgent Runtime.GrainDirectory.GlobalSingleInstanceActivationMaintainer on managed thread 5
info: Orleans.Threading.ThreadPoolThread[0]
Starting thread Runtime.GrainDirectory.AdaptiveDirectoryCacheMaintainer`10 on managed thread 23
info: Orleans.Runtime.Silo[100452]
Start local grain directory took 13 Milliseconds to finish
info: Orleans.Threading.ThreadPoolThread[0]
Starting thread Runtime.GrainDirectory.GlobalSingleInstanceActivationMaintainer0 on managed thread 24
info: Orleans.Runtime.Silo[100452]
Init implicit stream subscribe table took 11 Milliseconds to finish
info: Orleans.Runtime.Silo[100452]
Create system targets and inject dependencies took 27 Milliseconds to finish
info: Orleans.Runtime.SiloLifecycleSubject[100452]
Lifecycle observer Orleans.Runtime.Silo started in stage 4000 which took 118 Milliseconds.
info: Orleans.Runtime.SiloLifecycleSubject[100452]
Starting lifecycle stage 4000 took 118.1442 Milliseconds
info: Orleans.Runtime.Catalog[100507]
Before collection#1: memory=6MB, #activations=0, collector=<#Activations=0, #Buckets=0, buckets=[]>.
info: Orleans.Runtime.Catalog[100508]
After collection#1: memory=6MB, #activations=0, collected 0 activations, collector=<#Activations=0, #Buckets=0, buckets=[]>, collection time=00:00:00.0137324.
info: Orleans.Runtime.Silo[100452]
Init grain services took 1 Milliseconds to finish
info: Orleans.Runtime.MembershipService.MembershipOracleData[100603]
MembershipOracle starting on host = szf-sl address = S127.0.0.1:11111:274874211 at 2018-09-17 09:56:51.881 GMT, backOffMax = 00:00:02
info: Orleans.Runtime.MembershipService.SystemTargetBasedMembershipTable[100635]
Creating in-memory membership table
info: Orleans.Runtime.MembershipService.MembershipTableSystemTarget[100637]
GrainBasedMembershipTable Activated.
fail: Orleans.Runtime.Messaging.IncomingMessageAcceptor[101017]
Exception trying to process 198 bytes from endpoint 127.0.0.1:63001
System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Orleans.Runtime.MessagingStatisticsGroup.OnMessageReceive(Message msg, Int32 headerBytes, Int32 bodyBytes) in D:buildagent_work23ssrcOrleans.CoreStatisticsMessagingStatisticsGroup.cs:line 192
at Orleans.Runtime.IncomingMessageBuffer.TryDecodeMessage(Message& msg) in D:buildagent_work23ssrcOrleans.CoreMessagingIncomingMessageBuffer.cs:line 197
at Orleans.Runtime.Messaging.IncomingMessageAcceptor.ReceiveCallbackContext.ProcessReceived(SocketAsyncEventArgs e) in D:buildagent_work23ssrcOrleans.RuntimeMessagingIncomingMessageAcceptor.cs:line 659
fail: Orleans.Runtime.Messaging.IncomingMessageAcceptor[101027]
ProcessReceivedBuffer exception with RemoteEndPoint 127.0.0.1:63001:
System.IndexOutOfRangeException: Index was outside the bounds of the array.
at Orleans.Runtime.MessagingStatisticsGroup.OnMessageReceive(Message msg, Int32 headerBytes, Int32 bodyBytes) in D:buildagent_work23ssrcOrleans.CoreStatisticsMessagingStatisticsGroup.cs:line 192
at Orleans.Runtime.IncomingMessageBuffer.TryDecodeMessage(Message& msg) in D:buildagent_work23ssrcOrleans.CoreMessagingIncomingMessageBuffer.cs:line 197
at Orleans.Runtime.Messaging.IncomingMessageAcceptor.ReceiveCallbackContext.ProcessReceived(SocketAsyncEventArgs e) in D:buildagent_work23ssrcOrleans.RuntimeMessagingIncomingMessageAcceptor.cs:line 659
at Orleans.Runtime.Messaging.IncomingMessageAcceptor.ProcessReceive(SocketAsyncEventArgs e) in D:buildagent_work23ssrcOrleans.RuntimeMessagingIncomingMessageAcceptor.cs:line 489
It happens on all 2.1.0 versions (including beta1, rc1, and CI builds). There is no such error if I reference the Orleans.Server source project directly instead of the Microsoft.Orleans.Server package.
Can you confirm that all Orleans packages you reference in your projects have the same version? Can you share your silo config/startup code?
@sergeybykov It's here: https://github.com/csyszf/orleans/tree/%234990/Samples/2.0/HelloWorld
It's just the HelloWorld sample with 2.1.0-rc1 packages.
@sergeybykov Ah... It's not a 2.1.0 issue. On 2.1.0 those exceptions are thrown when the silo starts, and on 2.0.4 they are thrown when a client tries to connect to the silo.
It seems to be specific to my local development environment. It works fine on other systems, both Linux and Windows.
I've tried clearing the NuGet caches, but that did not help.
Did you check that all projects reference the same versions of Orleans and the other NuGet packages?
@onionhammer provided a repro - https://github.com/onionhammer/PoC.Orleans. We are looking into it.
Try turning off TieredCompilation. Looks like it is breaking serialization somehow.
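For anyone landing here, a minimal sketch of opting out at the project level. It assumes an SDK-style project file targeting .NET Core 2.1; the `TieredCompilation` MSBuild property goes in the silo and client host projects, and the surrounding `PropertyGroup` is illustrative rather than taken from the repro.

```xml
<!-- Illustrative fragment for the host project's .csproj (assumed SDK-style). -->
<PropertyGroup>
  <!-- Opt this application out of tiered compilation. -->
  <TieredCompilation>false</TieredCompilation>
</PropertyGroup>
```

If the feature was switched on through the `COMPlus_TieredCompilation` environment variable instead, removing that variable or setting it to `0` should have the same effect.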
Strange, wonder why that is. BTW, tiered compilation will be enabled by default in .NET Core 2.2 or 3.0, I believe, so this will be a big issue.
Found an environment variable COMPlus_TieredCompilation=1 set on my system. After removing it, the silo works fine.
So it is a tiered compilation problem. Thanks a lot, @sergeybykov @onionhammer
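Since the culprit here was a machine-wide environment variable that is easy to forget about, a startup check can make it visible. The following is only a sketch: `TieredCompilationGuard` and its members are hypothetical names, not part of Orleans, and it only looks for the value `1` reported in this thread.

```csharp
using System;

// Hypothetical helper, not part of Orleans: warns when tiered compilation
// has been forced on through the environment, as happened in this issue.
public static class TieredCompilationGuard
{
    // Name of the environment variable reported in this thread.
    public const string VariableName = "COMPlus_TieredCompilation";

    // True when the raw value is "1", the setting that was found here.
    public static bool IsForcedOn(string rawValue) =>
        rawValue != null && rawValue.Trim() == "1";

    // Call early in silo or client startup, before building the host.
    public static void WarnIfForcedOn()
    {
        string raw = Environment.GetEnvironmentVariable(VariableName);
        if (IsForcedOn(raw))
        {
            Console.Error.WriteLine(
                $"Warning: {VariableName}=1 is set; this thread links it to " +
                "IndexOutOfRangeException in Orleans messaging on .NET Core 2.1.");
        }
    }
}
```

Calling `TieredCompilationGuard.WarnIfForcedOn()` as the first line of `Main` would print the warning on an affected machine and do nothing elsewhere.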
We suspect it might be a JIT bug, but not 100% sure yet. Continuing investigation.
Opened https://github.com/dotnet/coreclr/issues/20040 with a repro
We'll document that TieredCompilation should not be turned on for now. Will reevaluate when the CLR issue is fixed.
This has been fixed in .NET Core 2.2.