I've got an Orleans silo running on a VM within a VNET, with IP address and port 10.1.2.4:30000.
I have an Azure App Service which has been configured with VNET integration to the VM's VNET.
I've configured the Network Security Groups and OS firewalls to allow all connections over the port ranges the silo is configured with.
When I attempt to connect a ClusterClient from the .NET Core web app running in the Azure App Service, I get back:
    Orleans.Networking.Shared.SocketConnectionException
    Unable to connect to endpoint S10.1.2.4:30000:0. See InnerException Unable to connect to 10.1.2.4:30000. Error: AccessDenied
Note: There is no InnerException.
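For reference, the client is built roughly like this (a minimal sketch, not our exact code: the cluster/service IDs are placeholders, and I'm showing static clustering pointed at the silo's gateway endpoint purely for illustration):

    using System.Net;
    using Orleans;
    using Orleans.Configuration;
    using Orleans.Hosting;

    // Sketch: a ClusterClient pointed directly at the silo's gateway endpoint.
    // ClusterId/ServiceId are placeholders; a real app reads them from config.
    var client = new ClientBuilder()
        .Configure<ClusterOptions>(options =>
        {
            options.ClusterId = "dev";
            options.ServiceId = "MyService";
        })
        .UseStaticClustering(new IPEndPoint(IPAddress.Parse("10.1.2.4"), 30000))
        .Build();

    await client.Connect(); // the SocketConnectionException surfaces here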
In the Kudu console of the Azure App Service I'm able to run tcpping 10.1.2.4:30000 and I get back:
    D:\home>tcpping 10.1.2.4:30000
    Connected to 10.1.2.4:30000, time taken: 172ms
    Connected to 10.1.2.4:30000, time taken: 13ms
    Connected to 10.1.2.4:30000, time taken: 11ms
    Connected to 10.1.2.4:30000, time taken: <1ms
    Complete: 4/4 successful attempts (100%). Average success time: 49ms
This indicates to me that the App Service is correctly VNET integrated.
I'm also able to successfully connect to the silo with a ClusterClient from my local machine (which has a VPN connection to the VNET), so there isn't an issue with the silo not accepting connections.
Is the IClusterClient meant to be able to run in an Azure App Service? Does it perhaps attempt to listen on a local socket, which is not permitted under the Azure App Service restrictions?
Is there some kind of workaround that does not involve converting everything to run in the hugely complex Kubernetes/Docker nightmare, or setting up IIS on virtual machines?
A bit more information, as I think it will help narrow down where the problem is.
Our Silo VM is running on a VNET with only IPv4 addresses (10.1.2.4).
The client in the Azure App Service has access to both IPv4 and IPv6 addresses, and the issue appears to be that when SocketConnectionFactory.cs builds up the Socket, it does not explicitly pass the silo endpoint's AddressFamily to the Socket.
Instead it uses the Socket constructor that picks the address family based on what the OS supports (preferring IPv6 when available):
    var socket = new Socket(SocketType.Stream, ProtocolType.Tcp)
    {
        LingerState = new LingerOption(true, 0),
        NoDelay = true
    };
The Azure App Service reports that it supports IPv6, so it appears that when the client opens the Socket it defaults to attempting IPv6, which is not mapped by the Azure App Service VNET integration, and then for some reason it never falls back to the IPv4 address family.
That is what I'm thinking at the moment, at least.
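A quick way to see what I mean (my own sketch, not Orleans code) is to compare the address family of sockets created with the two constructors on a dual-stack host:

    using System;
    using System.Net;
    using System.Net.Sockets;

    var target = IPEndPoint.Parse("10.1.2.4:30000"); // the silo's gateway endpoint

    // The constructor Orleans currently uses: it prefers IPv6 (dual-mode)
    // whenever the OS reports IPv6 support, regardless of the target address.
    var agnostic = new Socket(SocketType.Stream, ProtocolType.Tcp);
    Console.WriteLine(agnostic.AddressFamily); // InterNetworkV6 on a dual-stack host

    // Constructing from the endpoint's address family yields a plain IPv4 socket.
    var fromEndpoint = new Socket(target.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
    Console.WriteLine(fromEndpoint.AddressFamily); // InterNetwork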
After further testing, I built the Orleans repo with the following change in the SocketConnectionFactory:
    var socket = new Socket(endpoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp)
    {
        LingerState = new LingerOption(true, 0),
        NoDelay = true
    };
With this change, the IClusterClient can now connect to a silo in a VNET when the client is running in an Azure App Service with VNET integration.
@adminnz, I'm just wondering, what version of Orleans are you using?
I use the same setup, and I've never had issues connecting an app service to a VMSS.
@cosmintiru We are using Microsoft.Orleans.Server 3.0.0 and Microsoft.Orleans.Client 3.0.0.
Are you running in an Azure App Service (S0) with VNET integration to a Virtual Machine without a public IP address?
@adminnz, yes, the VMSS is without a public IP address. I've tried both configurations, with an internal LB and without it.
I think the only difference (I assume this solely because you didn't mention it) is that my VMSS doesn't live in the same VNET as my app service. We have an ARM template that spawns everything, then does a VNET peering between the App Service VNET and the VMSS VNET.
Also, we're using the "preview" VNET integration experience, the one that doesn't require a Gateway.
We're relying on Azure Table Storage as a cluster membership provider.
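The client side of that looks roughly like the following (a sketch, not our exact code: the connection string and IDs are placeholders, and it assumes the UseAzureStorageClustering extension from Microsoft.Orleans.Clustering.AzureStorage):

    using Orleans;
    using Orleans.Configuration;
    using Orleans.Hosting;

    // Sketch: the client discovers gateways via the Azure Table membership
    // table instead of being given a static endpoint list.
    var client = new ClientBuilder()
        .Configure<ClusterOptions>(options =>
        {
            options.ClusterId = "dev";       // placeholder
            options.ServiceId = "MyService"; // placeholder
        })
        .UseAzureStorageClustering(options =>
            options.ConnectionString = "<storage-connection-string>")
        .Build();

    await client.Connect();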
Hello @adminnz, did you manage to fix it?
No.
I have raised this with Azure support, and they are unsure why the App Service is not working as expected.
I've reproduced the issue for them multiple times with them capturing network traffic from both the backend VM and the app service.
@cosmintiru Would you be willing to share your Azure setup (App Service, Virtual Machine, and VNET integration ARM template) as well as how you're creating and connecting the IClusterClient (if it's different from the ASP.NET 2.0 sample)?
We have been in contact with Azure technical support, and they are saying this is an issue with how the Orleans library establishes the Socket.
Hi @adminnz, did you manage to fix this problem somehow? I've faced the same situation:
    Orleans.Runtime.Messaging.ConnectionFailedException: Unable to connect to endpoint S10.1.1.9:30000:0. See InnerException ---> Orleans.Networking.Shared.SocketConnectionException: Unable to connect to 10.1.1.9:30000. Error: AccessDenied
       at Orleans.Networking.Shared.SocketConnectionFactory.ConnectAsync(EndPoint endpoint, CancellationToken cancellationToken)
       at Orleans.Runtime.Messaging.ConnectionFactory.ConnectAsync(SiloAddress address, CancellationToken cancellationToken)
       at Orleans.Runtime.Messaging.ConnectionManager.ConnectAsync(SiloAddress address)
       --- End of inner exception stack trace ---
       at Orleans.Runtime.Messaging.ConnectionManager.ConnectAsync(SiloAddress address)
       at Orleans.Runtime.Messaging.ConnectionManager.GetConnectionAsync(SiloAddress endpoint)
       at Orleans.Messaging.ClientMessageCenter.<GetGatewayConnection>g__ConnectAsync|37_1(SiloAddress gateway, ValueTask`1 connectionTask, Message message, Boolean directGatewayMessage)
       at Orleans.Internal.OrleansTaskExtentions.<ToTypedTask>g__ConvertAsync|4_0[T](Task`1 asyncTask)
       at Orleans.OutsideRuntimeClient.<>c__DisplayClass55_0.<<StartInternal>b__1>d.MoveNext()
    --- End of stack trace from previous location where exception was thrown ---
       at Orleans.OutsideRuntimeClient.<StartInternal>g__ExecuteWithRetries|55_2(Func`1 task, Func`2 shouldRetry)
@vbobyr Not yet. At this point I believe Orleans will need a patch to fix how it opens sockets. However, I don't know what issues that will cause in other environments (and it's next to impossible to unit test those environments, especially when the internals of the Socket library make the real decisions).
Just wanted to let you know I'm facing the same issue...
@adminnz did your change here fix this for you? https://github.com/dotnet/orleans/issues/6093#issuecomment-552593129
If so, we can take that change and release 3.0.2 with it
We made that one-line change on a previous version of Orleans when we were first trying to track down this issue, and it did fix the problem.
When should we expect 3.0.2 to be available as NuGet packages?
Thank you for verifying!
> When should we expect 3.0.2 to be available as NuGet packages?
Possibly as early as tomorrow afternoon
3.0.2 is released. Please let us know if this fixes things for you
Yes, this release fixes the bug for me.
Now I'm able to connect from my Azure App Service with VNet Integration enabled to my VM running a Silo on the same Virtual Network.
Thanks all!
Thank you guys. For me it also works like a charm.
Thanks all!
Great, I'm glad this helped! Thank you all for putting in the leg work 🙂
I am also facing the same issue. I have an App Service integrated with a VNet and a silo running on an Azure VM on the same network. The App Service is not able to connect to the silo. When I launch the application from my local machine I am able to connect to the silo running on the Azure VM (using a VPN on the local machine), but the same does not work from the Azure App Service.
I verified the following before commenting:
Can someone please help me figure out what the issue is here? I am pretty new to configuring a silo in Azure.
@Jain-Nidhi this might be worth a new issue, or a conversation over MS Teams (if you are internal to MS) or Gitter (if you are external), since it is difficult to track comments on closed issues.
Thanks for providing those diagnostics. Do you know which IP the silo is listening on and which IP it has written into the SQL membership table? Feel free to provide responses in a new issue or a message and we can continue diagnosing from there.
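For example, the address a silo advertises is controlled by its endpoint configuration; here is a sketch of the relevant call (the siloBuilder variable, IP, and ports are placeholders, not your actual values):

    using System.Net;
    using Orleans.Hosting;

    // Sketch: advertise the VNET IP in the membership table while listening on
    // all host addresses; ports follow the common silo/gateway convention.
    siloBuilder.ConfigureEndpoints(
        advertisedIP: IPAddress.Parse("10.1.2.4"),
        siloPort: 11111,
        gatewayPort: 30000,
        listenOnAnyHostAddress: true);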