There seems to be a runaway memory leak when using any of the join DB functions.
On a clean Laravel 5.8 install just drop this inside your console.php:
// Memory usage consistent
Artisan::command('static-memory-test', function () {
while(true) {
$users = DB::table('users')->get();
var_dump ('current memory usage: '.memory_get_usage());
}
});
// Memory leak
Artisan::command('leak-memory-test', function () {
while(true) {
$users = DB::table('users')->join('password_resets', 'users.email', '=', 'password_resets.email')->get();
var_dump ('current memory usage: '.memory_get_usage());
}
});
Running php artisan static-memory-test shows:
string(30) "current memory usage: 12982728"
string(30) "current memory usage: 12982728"
string(30) "current memory usage: 12982728"
string(30) "current memory usage: 12982728"
...
But running php artisan leak-memory-test shows:
string(30) "current memory usage: 12819376"
string(30) "current memory usage: 12821864"
string(30) "current memory usage: 12824320"
string(30) "current memory usage: 12826776"
...
I noticed this, because I have a very complicated SQL query with lots of joins on a continuous loop as part of a long-running process. The script would keep crashing after just a few minutes saying it was out of memory. After many hours I managed track down the cause to the join statements.
What I've been able to work out is any/all join statements are affected -i.e. join(), leftJoin() etc.
I did some further testing, and discovered the bug does NOT exist in either Laravel 5.1 or Laravel 5.2. In both those versions you can run a loop with a join and the memory usage remains static, so we know it can be done.
But on Laravel 5.3.0 the memory usage issue appears. I think I've tracked down the PR that caused this to a refactor submitted here: https://github.com/laravel/framework/pull/13576
The problem is there have been so many code changes and other PRs since then, it's difficult to work out the solution here.
Ping @acasar (original PR contributor)
Ping @staudenmeir (resident DB expert)
p.s. and the size of the memory leak is related to the size of the join dataset. So in the example above it looks small, but when you have large data sets, it causes the memory to rapidly get out of hand.
I can reproduce this, but the memory usage regularly "resets" (when the garbage collection kicks in?):
string(30) "current memory usage: 16562760"
string(30) "current memory usage: 16565160"
string(30) "current memory usage: 16567560"
string(30) "current memory usage: 12581160"
string(30) "current memory usage: 12583560"
string(30) "current memory usage: 12585960"
Do you get the same behavior?
Do you get the same behavior?
Interesting... yes I do.
If I add a counter, and check when the reset occurs, its the same at 1423 loops (at least for me).
The problem is for my "real" script - it only lasts 20-25 loops before the memory failure kicks in (due to the size of the queries).
If I add a gc_collect_cycles() into the loop, the memory issue goes away (at least on this basic script)... but doesnt solve it on my larger complex script (although I'll go see if I can tweak it somehow).
So what's the root cause though? I'm guessing a reference not being released somewhere...
The memory leak is caused by this line:
Every JoinClause instance receives and stores the parent query. The instance itself then gets stored in the parent query's $joins property. This leads to some kind of "recursion".
The question is whether this is the expected behavior or a PHP bug.
Simple example:
class A
{
public $b;
public function b()
{
$this->b = new B($this);
}
}
class B
{
public $a;
public function __construct($a)
{
$this->a = $a;
}
}
while (true) {
(new A)->b();
var_dump(memory_get_usage());
usleep(1000);
}
The question is whether this is the expected behavior or a PHP bug.
This seems relevant: https://medium.com/@johann.pardanaud/about-circular-references-in-php-10f71f811e9
According to that, a solution is coming in PHP7.4: https://wiki.php.net/rfc/weakrefs
I guess the only other thought is I'll have a look and see if we can refactor the code at all to remove the circular reference... but given the issue has gone undetected for so long, I might just have to use some raw SQL queries on my particular use case to as an intrim solution... 馃槩
Do your queries use the JoinClause::newQuery() and forSubQuery() methods?
If not, you could create your own JoinClause class and remove the $this->parentQuery = $parentQuery; line by overriding the constructor.
It looks like we can fix this by storing the parent query's connection, grammar, processor and class name separately instead of the whole query object.
Do your queries use the JoinClause::newQuery() and forSubQuery() methods? If not, you could create your own JoinClause class and remove the $this->parentQuery = $parentQuery; line by overriding the constructor.
Yeah, I dont use either, so that seems to help in this instance.
It looks like we can fix this by storing the parent query's connection, grammar, processor and class name separately instead of the whole query object.
How do we get around the call to newQuery() and forSubQuery() though if we dont have the parentQuery?
edit: oh - if we change the constructor, i guess that works...
The only downside is the breaking change. I'll submit a PR.
The only downside is the breaking change. I'll submit a PR.
Just target 5.9 I guess? It's been here since 5.3, so it's not like a new urgent bug...?
edit: I've changed my script and added the gc_collect_cycles() as an intrim until ?5.9 (if it gets accepted).