Azure-docs: SQL availability group Load Balancer requirement IaaS vs On-Prem

Created on 10 Apr 2018 · 10 Comments · Source: MicrosoftDocs/azure-docs

Is there a more detailed document illustrating why a SQL availability group needs a Load Balancer in Azure IaaS but not On-Prem?

Is there a way around it in higher versions of SQL or Windows Server?

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 09504e18-c72f-07ee-3ef8-dc9ed44562ca
Version Independent ID: 8058bb4f-2321-ddf2-9930-0cc767b68a62
Content: Create a SQL Server availability group listener in Azure virtual machines
Content Source: articles/virtual-machines/windows/sql/virtual-machines-windows-portal-sql-alwayson-int-listener.md
Service: virtual-machines-sql
GitHub Login: @MikeRayMSFT
Microsoft Alias: mikeray

Pri2 assigned-to-author doc-bug triaged virtual-machines-sqsvc

Ayanmullick

All 10 comments

@Ayanmullick Thanks for the question! We are currently investigating and will update you shortly.

mimckitt on 10 Apr 2018

@Ayanmullick I assume you are referring to the following statement:

An availability group requires a load balancer when the SQL Server instances are on Azure virtual machines.

I believe this statement is misleading. There is nothing stopping you from creating a SQL machine in an AV set without a load balancer. However, when creating the load balancer you can opt to balance traffic between the VMs in an AV set. In addition, it is best practice to combine an AV set with a load balancer to manage high availability.

@MikeRayMSFT can you confirm and determine if this needs to be made more clear?

mimckitt on 10 Apr 2018

Yes. So what challenges could one face when deploying a 2-node SQL Always On cluster in an availability set and connecting directly to the SQL listener IP without any load balancer in between?

Ayanmullick on 10 Apr 2018

@Ayanmullick if you don't use a load balancer and one of your SQL machines goes down, traffic would not be automatically directed to the running machine. This could impact production, as any services using the machine that went down would stop working.

One of the main purposes of a load balancer is to distribute traffic to available machines and redirect it when there is a problem. So although you could consider your environment "Always On", it would not meet the criteria for high availability.

mimckitt on 10 Apr 2018

But why don't I face the same issue without a load balancer on-prem? We've deployed Always On on-prem without load balancers, and it works fine.

Ayanmullick on 10 Apr 2018

@Ayanmullick I believe that is just the difference between running in the cloud and running on-prem. The environment is different and requires different settings to work. I'm not 100% sure I can provide a complete answer, as it is just how it was designed to work. @MikeRayMSFT might be able to elaborate further than I can.

mimckitt on 10 Apr 2018

In a regular on-prem WSFC (Windows Server Failover Cluster) setup, creating the AG listener creates a DNS record for the listener with the IP(s) provided. This IP address then has to map to the MAC address of the current Primary node in the ARP tables of the switches/routers in the network. The cluster does this with Gratuitous ARP (GARP): whenever a new Primary is elected after a failover, it broadcasts the latest IP-to-MAC mapping to the network. Here, the IP is the listener's and the MAC is the current Primary's. This GARP should force an update of the ARP table entries on the switches/routers, so that a user connection to the listener IP address seamlessly goes to the current Primary.
GARP (and even ARP) is not supported on public clouds (Azure, GCP and, I believe, AWS as well) for security reasons. In short, no kind of broadcast is supported in a cloud setup.
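
As a rough illustration of the on-prem behavior described above, here is a toy Python model of an ARP cache being overwritten by a gratuitous ARP after failover. It is only a sketch: the `ArpCache` class, the listener IP, and the MAC addresses are made-up names and values for illustration, and no real network traffic is involved.

```python
# Toy model of an ARP cache on a switch/router. Not real networking code.
# The listener IP and MAC addresses below are made-up example values.
LISTENER_IP = "10.0.0.50"
NODE_A_MAC = "00:15:5d:00:00:01"
NODE_B_MAC = "00:15:5d:00:00:02"


class ArpCache:
    """Hypothetical stand-in for the IP-to-MAC table kept by network devices."""

    def __init__(self):
        self.table = {}  # IP address -> MAC address

    def on_gratuitous_arp(self, ip, mac):
        # A gratuitous ARP is an unsolicited announcement;
        # in this model the receiver simply overwrites its entry for that IP.
        self.table[ip] = mac

    def resolve(self, ip):
        # Returns the MAC currently associated with the IP, or None if unknown.
        return self.table.get(ip)


cache = ArpCache()

# Node A is Primary and announces that it owns the listener IP.
cache.on_gratuitous_arp(LISTENER_IP, NODE_A_MAC)
mac_before_failover = cache.resolve(LISTENER_IP)

# Failover: node B becomes Primary and announces the same IP with its own MAC.
cache.on_gratuitous_arp(LISTENER_IP, NODE_B_MAC)
mac_after_failover = cache.resolve(LISTENER_IP)
```

In a cloud network the announcement step is not available, so the table would never learn about the new Primary on its own.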

So, in a public cloud's network infrastructure, load balancers provide the traffic routing instead. In short, the load balancer is set up with a frontend IP that corresponds to the listener, and a probe port is assigned that the LB polls periodically for status. Incoming traffic is forwarded to the VM that responds successfully to the probe on this port. At any one time, only one SQL VM (the Primary) responds to this TCP probe. There is also a configuration step at the WSFC level, where the same probe port is set on the cluster IP resource, which ensures that the Primary node does respond to TCP probe requests on this port.
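
The probe-based routing described above can be sketched as a small Python simulation. This is a toy model under stated assumptions, not how Azure Load Balancer is implemented: `SqlVm`, `pick_backend`, `fail_over`, the VM names, and the probe port value are all hypothetical, and the "probe" is a plain method call rather than a real TCP connection.

```python
from dataclasses import dataclass

# Hypothetical probe port; in a real setup it must match the value
# configured on the cluster IP resource.
PROBE_PORT = 59999


@dataclass
class SqlVm:
    name: str
    is_primary: bool

    def answers_probe(self, port: int) -> bool:
        # In this model, only the current Primary responds on the probe port.
        return self.is_primary and port == PROBE_PORT


def pick_backend(vms, port=PROBE_PORT):
    """Return the VM that traffic to the frontend IP would be forwarded to, or None."""
    responding = [vm for vm in vms if vm.answers_probe(port)]
    return responding[0] if responding else None


def fail_over(vms, new_primary_name):
    # Simulates the cluster moving the Primary role to another node.
    for vm in vms:
        vm.is_primary = (vm.name == new_primary_name)


vms = [SqlVm("sqlvm1", True), SqlVm("sqlvm2", False)]

target_before = pick_backend(vms).name  # probe succeeds only on sqlvm1
fail_over(vms, "sqlvm2")
target_after = pick_backend(vms).name   # next probe round finds sqlvm2
```

The point of the sketch is that the load balancer never needs a broadcast: it discovers the new Primary by polling, which takes over the role GARP plays on-prem.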

Does that help?

MikeRayMSFT on 18 Apr 2018

Yes. Thanks. So we'd continue to need an additional load balancer for Azure IaaS even in later versions of SQL or Windows Server, right?

Also, Azure private DNS just went 'in Preview'. Could that make a difference in the process of associating the new IP after failover?

Ayanmullick on 18 Apr 2018

@Ayanmullick - first let me apologize for not replying to this conversation sooner.

You will still need a load balancer for IaaS on Azure VMs (or any other cloud service that hosts VMs) going forward.

I don't think the Private DNS will change that because GARP is still not available on the network.

please-close

MikeRayMSFT on 8 Oct 2018

@mikerayMSFT This is exactly what I have been looking for over the past two weeks while struggling to set up an AO AG on Google Cloud Platform. Could you please post the above info ("In regular WSFC (Windows server failover cluster) on-prem setup, when AG listener is created, it will create a DNS record for AG listener with the IP(s) provided. ....") as a post on TechNet? There is not a single post explaining the difference between running an AO AG on-prem vs. in the cloud and how the listener IP is broadcast. I wish you allowed search crawling on the comments as well :)

fivefq on 22 Nov 2019
