Phantomjs: evaluate querySelector extremely slow on one machine for phantomjs 2.1.1

Created on 7 Mar 2016 · 14 Comments · Source: ariya/phantomjs

A few weeks ago my local machine suddenly started taking over 3 hours to run all tests, whereas the Jenkins instance takes 20 minutes. The problem seems to lie in the querySelector wrapper, which basically does this:

exports.querySelector = function(query) {
  return casper.page.evaluate(function (query) {
    return document.querySelector(query);
  }, query);
};

(I am using CasperJS, but the problem persists when calling PhantomJS directly via casper.page as done in the above example.)

I put in some profiling code and tested this on several machines. On an Amazon EC2 instance running Ubuntu (same as Jenkins instance), the results are

[profile] querySelector: 17 calls between 0.001 .. 0.969s, avg = 0.878235294117647s; 1 misses @ avg = 0.001s/call; 16 hits @ avg = 0.9330625s

On my local Mac Mini,

[profile] querySelector: 17 calls between 0 .. 22.454s, avg = 16.966s; 2 misses @ avg = 0.0005s/call; 15 hits @ avg = 19.228066666666667s

On my neighbor's Mac Mini,

[profile] querySelector: 17 calls between 0 .. 0.901s, avg = 0.7819411764705881s; 2 misses @ avg = 0.0005s/call; 15 hits @ avg = 0.8861333333333333s

The problem goes away by downgrading to PhantomJS 1.9.2, the only other version I was able to install via HomeBrew.
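
For anyone who wants to collect similar timings, here is a minimal sketch of a timing wrapper. It is not the profiling code used for the numbers above: `makeProfiler` and its `report` method are hypothetical names, and it does not reproduce the hits/misses breakdown shown in the output.

```javascript
// Hypothetical helper (not the reporter's profiling code): wraps a function
// and records how long each call takes, in seconds.
function makeProfiler(name, fn, now) {
    // `now` returns the current time in milliseconds; defaults to the wall clock.
    now = now || function () { return new Date().getTime(); };
    var durations = [];

    function wrapped() {
        var start = now();
        try {
            return fn.apply(this, arguments);
        } finally {
            // Record the duration even if fn throws.
            durations.push((now() - start) / 1000);
        }
    }

    // Summarize the recorded calls as a single line.
    wrapped.report = function () {
        if (durations.length === 0) {
            return "[profile] " + name + ": 0 calls";
        }
        var total = 0, min = durations[0], max = durations[0];
        for (var i = 0; i < durations.length; i++) {
            total += durations[i];
            if (durations[i] < min) { min = durations[i]; }
            if (durations[i] > max) { max = durations[i]; }
        }
        return "[profile] " + name + ": " + durations.length +
            " calls between " + min + " .. " + max + "s, avg = " +
            (total / durations.length) + "s";
    };

    return wrapped;
}
```

Wrapping the export, e.g. `exports.querySelector = makeProfiler("querySelector", exports.querySelector)`, and logging `exports.querySelector.report()` at the end of a run would give a per-call summary.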

Labels: Need more information, Need reproduction

All 14 comments

Well, returning DOM elements between execution contexts is extremely slow and inefficient due to the Qt<->JSC bridge.

I don't see how that is relevant, considering I am seeing the expected performance by downgrading.
PhantomJS 1.9.2: average 0.9s per call.
PhantomJS 2.1.1: average 19.2s per call.
That's a 21x increase in processing time.

Qt<->JSC is a bit complicated. Perhaps something is slowing it down on your local Mac mini. Could you please check that you're running the same versions (PhantomJS, CasperJS) across all your environments?

Yeah I was running 2.1.1 on my local Mac Mini (19.2s / call), and my neighbor was running 2.1.1 on his Mac Mini (0.9s / call).

This is definitely not the case for all OS X 2.1.1 installations, but I have no idea what is different on my side. Especially since downgrading to 1.9.2 resolved the issue for me.

Thanks. In this case could you please also write a small example script to reproduce your problem?

var webPage = require('webpage');
var page = webPage.create();

function querySelector(selector) {
    var starttime = new Date().getTime();
    var result = page.evaluate(function(sel) {
        return document.querySelector(sel);
    }, selector);
    var endtime = new Date().getTime();
    console.log("took " + ((endtime-starttime)/1000.0) + "s");
    return result;
}

page.open('http://m.bing.com', function(status) {

    var title = querySelector("title");

    console.log(title.textContent);
    phantom.exit();

});

On my home machine (not the Mac mini at work), I get a 10x decrease in performance for PJS 2.1.1 compared to PJS 1.9.2:

$ phantomjs --version
1.9.2
$ phantomjs pjs.js
2016-03-08 00:01:54.020 phantomjs[11580:2994311] *** WARNING: Method userSpaceScaleFactor in class NSView is deprecated on 10.7 and later. It should not be used in new applications. Use convertRectToBacking: instead.
took 0.141s
Bing
$ phantomjs pjs.js
2016-03-08 00:02:03.524 phantomjs[11591:2994387] *** WARNING: Method userSpaceScaleFactor in class NSView is deprecated on 10.7 and later. It should not be used in new applications. Use convertRectToBacking: instead.
took 0.118s
Bing
$ phantomjs pjs.js
2016-03-08 00:02:05.544 phantomjs[11596:2994424] *** WARNING: Method userSpaceScaleFactor in class NSView is deprecated on 10.7 and later. It should not be used in new applications. Use convertRectToBacking: instead.
took 0.122s
Bing
$ brew unlink phantomjs192
Unlinking /usr/local/Cellar/phantomjs192/1.9.2... 1 symlinks removed
$ brew link phantomjs
Linking /usr/local/Cellar/phantomjs/2.1.1... 2 symlinks created
$ phantomjs --version
2.1.1
$ phantomjs pjs.js
took 1.324s
Bing
$ phantomjs pjs.js
took 1.342s
Bing
$ phantomjs pjs.js
took 1.394s
Bing

This is on the machine at work -- same results:

$ phantomjs --version
1.9.2
$ phantomjs pjs.js
took 0.116s
Bing
$ phantomjs pjs.js
took 0.117s
Bing
$ phantomjs pjs.js
took 0.116s
Bing
$ brew unlink phantomjs192
Unlinking /usr/local/Cellar/phantomjs192/1.9.2... 1 symlinks removed
$ brew link phantomjs
Linking /usr/local/Cellar/phantomjs/2.1.1... 2 symlinks created
$ phantomjs --version
2.1.1
$ phantomjs pjs.js
took 1.36s
Bing
$ phantomjs pjs.js
took 1.299s
Bing
$ phantomjs pjs.js
took 1.315s
Bing
$

The degraded performance is also present in PhantomJS 2.0.1 (binary from https://github.com/Vitallium/phantomjs/releases/tag/2.0.1). Unfortunately I'm unable to build earlier PJS versions due to OpenSSL, so I can't figure out exactly where this problem started occurring.

Edit: I managed to track down binaries covering 1.9.8 up to 2.0.1. The problem first appears in PhantomJS 2.0.0 and persists from there on. PhantomJS 1.9.8 gives the desired performance, while 2.0.0 is 10x slower for the given test case. The cause could of course be any change made after 1.9.8 and before the 2.0.0 release.

@kallewoof How did you configure CasperJS to work with an older version of PhantomJS?
I ran your test script on 1.9.8 and on a recent PhantomJS and saw the same 10x slowdown.

Use PHANTOMJS_EXECUTABLE environment variable, pointing it at e.g. /usr/local/bin/phantomjs.

Returning anything other than plain JavaScript object is not really supported. Even if it seems to work, it's a hit and miss, including a potential performance problem.

From the test case:

    var result = page.evaluate(function(sel) {
        return document.querySelector(sel);
    }, selector);

the problematic part is returning the result of querySelector, which is a DOM element.
Try changing it so that you return a plain value instead, e.g. return document.querySelector(sel).textContent, and check the performance again.
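
A minimal sketch of that change, assuming `page` is a PhantomJS WebPage object (for example from `require('webpage').create()`). The name `queryText` is hypothetical. It returns a string, or null when nothing matches, so only a plain value crosses the bridge:

```javascript
// Hypothetical wrapper: returns the matched element's text, not the element.
// Assumes `page` is a PhantomJS WebPage object.
function queryText(page, selector) {
    return page.evaluate(function (sel) {
        // This function runs inside the page context.
        var el = document.querySelector(sel);
        // Hand back a primitive; a DOM element is not a supported return value.
        return el ? el.textContent : null;
    }, selector);
}
```

In the test script above, `var title = queryText(page, "title"); console.log(title);` would replace the `querySelector("title")` call and the `title.textContent` lookup.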

That worked fine, but aren't you concerned over the 10x increase in execution time?

Edit: never mind, I realize now that you're saying it's not actually supported. It's also written in the docs. I have no idea how I could have missed that part. Thanks.

If the performance degradation were on plain JS objects, we would have been very concerned!

Thanks for the report and the detailed investigation! I'm closing this since there is not much we can follow up on here.

Just to clarify for my future self and others:
$("[aria-hidden='true']:contains('some text')"); // SLOW and possibly not supported?
$("[aria-hidden='true']:contains('some text')").text(); // FAST
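
The same idea without jQuery, as a rough sketch: when several elements match, collect their text into an array of strings inside the page and return that. `queryAllText` is a hypothetical name and `page` is assumed to be a PhantomJS WebPage object. Note that `:contains()` is a jQuery extension, so it cannot be used in the selector passed to `querySelectorAll`.

```javascript
// Hypothetical wrapper: returns the text of every match as an array of strings.
// Assumes `page` is a PhantomJS WebPage object.
function queryAllText(page, selector) {
    return page.evaluate(function (sel) {
        // This function runs inside the page context.
        var nodes = document.querySelectorAll(sel);
        var texts = [];
        for (var i = 0; i < nodes.length; i++) {
            texts.push(nodes[i].textContent);
        }
        // An array of strings is JSON-serializable, so it is safe to return.
        return texts;
    }, selector);
}
```

An array of strings survives the trip back from `evaluate` because it can be serialized as JSON, which is the rule of thumb the PhantomJS docs give for arguments and return values.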
