PhantomJS version: PhantomJS 2 (built from source from master on 13 Nov 2014)
I wrote a script to scan websites using PhantomJS. I open no more than 4 pages simultaneously at any point.
A serious memory leak occurs when I run it with:
page.settings.loadImages = false;
I run my script on an Amazon Linux EC2 server with 1 GB of RAM and a single-core processor (it's one of the lighter servers).
When not loading images, RAM usage gradually increases to 99%, as shown by the "top" command, before PhantomJS crashes.
However, when loading images, it stabilizes at around 6-7%.
+1, I have the same issue too.
Can you build the tag 2.0.0 and try it again? If that still happens, I'll mark this as Regression.
@ariya ok will try again on 2.0.0
This is a WebKit-related issue:
https://bugreports.qt.io/browse/QTBUG-34494 (https://codereview.qt-project.org/#/c/76934/)
https://bugreports.qt.io/browse/QTBUG-36530
https://bugreports.qt.io/browse/QTBUG-38857
Take a look at this test script. You'll see that memory usage grows up to 1 GB with --load-images=false, but with --load-images=true it doesn't grow beyond 350 MB:
var page = require('webpage').create(),
u = ["http://www.google.com/","https://www.facebook.com/","https://www.youtube.com/","https://www.yahoo.com/","http://www.amazon.com/","http://www.bing.com/","http://www.ebay.com/","http://www.wikipedia.org/","http://detroit.craigslist.org/","https://www.linkedin.com/","https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=12&ct=1422984414&rver=6.4.6456.0&wp=MBI_SSL_SHARED&wreply=https:%2F%2Fmail.live.com%2Fdefault.aspx%3Frru%3Dinbox&lc=1033&id=64855&mkt=en-us&cbcxt=mai","https://twitter.com/","https://accounts.google.com/ServiceLogin?service=blogger&passive=1209600&continue=https://www.blogger.com/home&followup=https://www.blogger.com/home&ltmpl=start","http://www.aol.com/","http://go.com/","https://www.pinterest.com/","http://www.msn.com/","https://www.tumblr.com/","http://www.cnn.com/","http://www.ask.com/","http://www.huffingtonpost.com/","https://www.netflix.com/us/","https://www.paypal.com/home","http://www.weather.com/","http://conduit.com/","http://espn.go.com/","http://instagram.com/","https://wordpress.com/","https://www.bankofamerica.com/","http://akamihd.net/","http://www.imdb.com/","https://www.chase.com/","http://www.microsoft.com/en-us/default.aspx","http://www.about.com/","http://www.avg.com/us-en/homepage","http://www.pornhub.com/","http://xfinity.comcast.net/tt2/?cid=BBI","http://www.foxnews.com/","http://www.apple.com/","http://www.walmart.com/","http://xhamster.com/","http://home.mywebsearch.com/index.jhtml","https://www.wellsfargo.com/","http://www.xvideos.com/","http://www.yelp.com/","http://imgur.com/","http://www.nytimes.com/","http://www.nbcnews.com/","http://www.cnet.com/","http://www.reddit.com/","http://www.adobe.com/","http://www.ehow.com/","http://www.pandora.com/","http://www.pch.com/","http://www.hulu.com/","http://www.zedo.com/","https://www.etsy.com/","https://www.flickr.com/","http://www.outbrain.com/","http://optmd.com/","http://www.indeed.com/","http://new.livejasmin.com/en/?showPreviousVersionLink=0","http://www.zillow.com/",
"http://www.target.com/","http://www.xnxx.com/","http://www.homedepot.com/","http://www.redtube.com/","http://www.answers.com/","http://www.att.com/","http://www.shopathome.com/","http://www.wikia.com/Wikia","http://www.dailymail.co.uk/ushome/index.html","https://www.usps.com/","http://www.babylon.com/","http://www.ups.com/","http://www.bestbuy.com/","http://www.youporn.com/","http://www.reference.com/","https://www.godaddy.com/","http://www.groupon.com/","http://www.deviantart.com/","http://www.usatoday.com/","http://www.pof.com/","https://www.capitalone.com/","http://www.bbc.co.uk/","http://www.washingtonpost.com/","http://www.match.com/","http://drudgereport.com/","http://mlb.mlb.com/home","http://www.tripadvisor.com/","http://www.pogo.com/","http://www.verizonwireless.com/","https://accounts.google.com/ServiceLogin?service=blogger&passive=1209600&continue=https://www.blogger.com/home&followup=https://www.blogger.com/home&ltmpl=start","http://www.buzzfeed.com/","http://doublepimp.com/Account/LogIn","http://inksr.com/","http://www.fedex.com/","http://inksdata.com/","http://www.aweber.com/","http://abcnews.go.com/","https://vimeo.com/","https://hootsuite.com/","http://bleacherreport.com/","http://www.lowes.com/","http://www.yellowpages.com/","https://www.americanexpress.com/","http://www.tube8.com/","https://www.salesforce.com/"],
i = 0;
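// Swallow JavaScript errors raised by the visited pages.
page.onError = function() {};

// Open the next URL in the list, then schedule the one after it.
function open() {
    var url = u.shift();
    if (url) {
        page.open(url, function(x) {
            // x is the load status reported by PhantomJS ('success' or 'fail').
            console.log(i++ + ' : ' + url + ' : ' + x);
            setTimeout(open, 2000);
        });
    } else {
        phantom.exit();
    }
}
open();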
+1 here, any thoughts on a fix for this?
+1 me too.
+1 same issue
+1 here
Any ETAs for a fix for this issue?
+1 please fix it
+1 please fix it
Guys, this is a Qt bug. If you want it fixed, please vote here: https://bugreports.qt.io/browse/QTBUG-38857
Voted.
Here's a workaround that was adequate for my purposes: set loadImages to true, then use onResourceRequested to abort requests for image URLs. In my case, I'm using casper.on('resource.requested'). It seems to stop the leaks without other side effects.
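For anyone on plain PhantomJS rather than CasperJS, the workaround described above might look roughly like the sketch below. It is only an illustration: IMAGE_EXT and isImageUrl are made-up names, the extension list and the example.com URL are assumptions, and the wiring is guarded so that it only runs inside PhantomJS.

```javascript
// Sketch of the workaround for plain PhantomJS (ES5 syntax).
// IMAGE_EXT and isImageUrl are hypothetical names, not part of the PhantomJS API.
var IMAGE_EXT = /\.(jpe?g|png|gif|bmp|ico|tiff?)$/i;

// Returns true when the URL ends in a common image file extension.
function isImageUrl(url) {
    return IMAGE_EXT.test(url);
}

// Only wire up a page when actually running inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();

    // Leave image loading enabled so the leaking code path is not taken.
    page.settings.loadImages = true;

    // Abort image requests before they are sent.
    page.onResourceRequested = function (requestData, networkRequest) {
        if (isImageUrl(requestData.url)) {
            networkRequest.abort();
        }
    };

    // Placeholder URL, used here only for illustration.
    page.open('http://www.example.com/', function (status) {
        console.log('Loaded with status: ' + status);
        phantom.exit();
    });
}
```

Matching on the file extension is a heuristic: images served from URLs without a recognizable extension will still be downloaded.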
@behrangsa voted
@mepard good idea!
+1 to fix it
@mepard I'm getting error: TypeError: Object #
This is working for me (OS X 10.11 and Windows 10):
function ResourceRequested (request, networkRequest)
{
    if (options.workAroundLoadImagesLeak)
    {
        // Work around memory leak in WebKitQT in PhantomJS 2.0.0 when not loading images.
        // In initialize above we told PhantomJS to always load images, but here we'll suppress
        // the image resources.
        if (/\.(jpg|jpeg|png|gif|tif|tiff|mov)$/i.test(request.url))
        {
            if (options.trackResources)
            {
                console.log(moment().format() + ': Suppressing image #' + request.id + ': ' + request.url);
            }
            networkRequest.abort();
            return;
        }
    }
}
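One caveat with the pattern above: because it is anchored at the end of the URL, it will not match image URLs that carry a query string or fragment (for example photo.jpg?w=200). A possible variation, assuming you want those suppressed as well, is to test only the path portion. hasImageExtension is a hypothetical helper name, not part of PhantomJS.

```javascript
// Hypothetical helper: checks the extension of the URL path only,
// ignoring any query string or fragment.
function hasImageExtension(url) {
    // Drop the fragment first, then the query string.
    var path = url.split('#')[0].split('?')[0];
    return /\.(jpg|jpeg|png|gif|tif|tiff|mov)$/i.test(path);
}
```

Inside the handler, the call hasImageExtension(request.url) could stand in for the inline regular expression test.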
FWIW, https://codereview.qt-project.org/#/c/76934/2 claims to have fixed this, but it doesn't seem like it to me. The master branch currently has the change, but the leak still happens.
@mepard The bug is still open: https://bugreports.qt.io/browse/QTBUG-38857
Turns out it's a WebKit bug. I suspect it's the same as https://bugs.webkit.org/show_bug.cgi?id=17469 even though that report doesn't mention autoLoadImages.
+1
I just did a run where I visited around 330 pages with lots of images. Without this patch, memory consumption hit around 2.5 GB; with the patch, it was about 500 MB.
Setting the option load_image to True resolved my problem.
It has been 3 years without any fixes. 2.1.1 still has the memory leak issue with Python 2.7 on all operating systems.
I appreciate that this is an open source project, but if you were an end user of your own project and had to eat your own dog food, would you be happy with such an important bug being left unfixed for three years?
At least start a campaign, ask for an amount of donation that you think would be enough to fix this problem, ask people to contribute and donate money, and then focus on this issue and fix it.
Goddamit ;) https://youtu.be/0ubcetW9-D8?t=4s
+1 fix please!
+1 ...
+1
+1
+1
...20 years later: +1
ahaha, yeah! old guys are still pressing the button "+1"
+1
+1
+1
+1
+1
+1
+1
+1 (please)
+1 please!!
+1 Please, this is still an issue 1 year later. :(
+1
Let's track this in https://github.com/ariya/phantomjs/issues/11390
+1
+1 (2.1.1)