Framework: Azure SQL Server - Laravel not detecting lost database connections

Created on 18 Apr 2018  路  12Comments  路  Source: laravel/framework

  • Laravel Version: 5.6
  • PHP Version: 7.2
  • Database Driver & Version: pdo_sqlsrv

Description:

Running Laravel on CentOS 7.4 in an Azure VM. Using Azure SQL Server database and Azure queues.

We have queue workers running. These monitor the queues with no issues. However, after a short time (a few hours or thereabouts) if a job comes in on the queue, the job throws an exception as it tries to access the database. The exception includes the message:

ERROR: SQLSTATE[08S02]: [Microsoft][ODBC Driver 13 for SQL Server]SMux Provider: Physical connection is not usable [xFFFFFFFF].

This continues for all further jobs, until the queue workers are killed.

Steps To Reproduce:

Most of the details are above. Basically, use Azure SQL Server with pdo_sqlsrv driver, run a artisan queue:work, do nothing for a couple of hours, then dispatch a job to the queue that needs to use the database.

I have found the list of error messages that Laravel uses to decide whether a database connection has been lost:

https://github.com/illuminate/database/blob/71f75f4d1d7ffaa33da4bde0da6f246ed7741479/DetectsLostConnections.php#L10

I have hacked "Physical connection is not usable" into that list on our system to see if that fixes the problem. Will know later today.

Assistance sought here: https://laracasts.com/discuss/channels/laravel/how-do-i-add-a-message-to-the-list-detecting-a-closed-database-connection

Most helpful comment

Hello @judgej, I have this problem as well and it's caused by running the workers in daemon mode, causing the connection to the DB to be left open.

When changing to not use it in production it fixes the issue (although jobs and processed much slower).

You can increase the timeout in SQL Server but not sure how to do it for AzureDB

SQLSTATE[08S02]: [Microsoft][ODBC Driver 13 for SQL Server]SMux Provider: Physical connection is not usable [xFFFFFFFF]

All 12 comments

A 17:00 BST scheduled job we have ran successfully after being left without processing any queued jobs for 2.5 hours. It's looking promising. I'll leave the system for the midnight job to see how that goes, before submitting a PR.

Just a note, but the list of messages that Laravel checks for, to determine whether the database connection has been closed, are all English. How does this work in other locales, where the PDO driver may be returning non-English error messages? Are they all having problems running queue worker daemons?

Had a different "connection lost" error this time:

SQLSTATE[08S01]: [Microsoft][ODBC Driver 13 for SQL Server]TCP Provider: Error code 0x68

It is becoming apparent to me that this list of error messages that indicate a connection was lost, needs to be configurable. I can see no way to override that Trait, hard-coded in the Connection class, that is not macroable (AFAICS), except by an override in the autoloader.

If the trait moved the list of messages to a property, and provided some public methods to manipulate that list, that could be one approach. Though, this trait is used in multiple places, so maybe not. Perhaps it should be a singleton class and not a trait.

Hello @judgej, I have this problem as well and it's caused by running the workers in daemon mode, causing the connection to the DB to be left open.

When changing to not use it in production it fixes the issue (although jobs and processed much slower).

You can increase the timeout in SQL Server but not sure how to do it for AzureDB

SQLSTATE[08S02]: [Microsoft][ODBC Driver 13 for SQL Server]SMux Provider: Physical connection is not usable [xFFFFFFFF]

I've not been able to find much information on what running the jobs in non-daemon mode actually does. If it continuously restarts itself every few seconds, then that is a horrendous use of resources - constant database connects, constant process restart processing costs. I would rather avoid that if I can. But the documentation on what it actually does is very sparse.

Having the database connection dropped though inactivity is fine, so long as Laravel invisibly reconnects when the connection is needed again. Obviously avoid keeping transactions open for long periods, as you could end up with lost data when that drops.

I am finding this is becoming an issue with long-running queue listeners though. When the queue connection drops, it sends an error message to Laravel, and Laravel is not able to handle it and reconnect. For now, I'm restarting the queue workers every ten minutes, which is a reasonable compromise.

The PR I've submitted has the "Physical connection is not usable" message you are seeing included in the list of message Laravel uses to recognise lost connections, which should help you if/when it gets merged in. However, I do feel a better solution is needed there, something that will work in non-English locales would be good for a start, or a way for applications to inject their own list of messages would be an okay workaround. It's all hard-coded in a trait at the moment, and I cannot see a way to extend that, except by an auto-loader override, which can be a bit flaky as you have no control over the order composer runs its filesystem search.

Would that make Laravel auto reconnect? Or just show a more friendly error message?
We never seem to have dropped connections our MySQL or PostgreSQL applications.

I will try your PR on my local and report back!

EDIT: Just read your comment on the PR!

You do have dropped connections with MySQL. It is Laravel that quietly brushes them under the carpet, reconnects, then retries the SQL or DML again for you. That's the magic of using a framework - it does stuff like that so you don't have to think about it :-) Until it doesn't :-(

This is becoming a real problem on production, I wrote some terrible code which seems to have fixed it until your PR goes through, since you can't easily override the trait

private function checkForDroppedConnection()
{
    $location = '../vendor/laravel/framework/src/Illuminate/Database/DetectsLostConnections.php';
    $file = file_get_contents($location);

    if (!\Illuminate\Support\Str::contains($file, 'Physical connection is not usable')) {
        $newFile = file_get_contents('../storage/hacks/DetectsLostConnections.php');
        $fp = fopen($location, 'wb');
        fwrite($fp, $newFile);
        fclose($fp);
    }
}

Which copies over the new file, this can be run as a composer post-install hook, or even just stick it in a middleware

$newFile is the same just with your changes

Another way is to duplicate the whole class, make your changes, then make sure it is loaded before the Laravel version is loaded (just a nasty hacky include in index.php will make sure your odified version is loaded), and the composer autoloader won't then try to load the original unpatched version.

Did I submit the PR to the wrong place?

I think so! Realised none of these PR's got merged then saw this was read only. I reopened it on the real one and was merged (sorry, feel like I took your credit) I tagged this issue to be looked at.

@adamtester I'll hunt you down - for a beer. I obviously failed to get this thing fixed, and you did, so thank you :-)

@adamtester Thank you.
I ran this command in my docker container and the error disappeared.
sudo docker exec -ti dockerizedapp_app_1 supervisorctl restart laravel-worker:laravel-worker_00

Was this page helpful?
0 / 5 - 0 ratings

Related issues

gabriellimo picture gabriellimo  路  3Comments

ghost picture ghost  路  3Comments

digirew picture digirew  路  3Comments

klimentLambevski picture klimentLambevski  路  3Comments

lzp819739483 picture lzp819739483  路  3Comments