I'm currently using PhantomJS to render HTML pages as images. This works fine most of the time, but sometimes the rendered page has no text. I compiled PhantomJS myself last week so that I can use webfonts from Typekit and fonts.net. That works fine, but every now and then the following error occurs while requesting the font file:
Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "https://fast.fonts.net/someFont"
I looked up this error in the Qt documentation, which states the following:
the operation was canceled via calls to abort() or close() before it was finished.
This gave me the idea that PhantomJS might call abort() or close() for some reason. I'm just not sure why it would call those methods only about 5% of the time.
What I am doing is pretty straightforward:
var page = require('webpage').create();
page.onInitialized = function() {
page.customHeaders = {
"Accept-Language": "de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4",
"Cache-Control": "no-cache",
"Connection": "keep-alive",
"Pragma": "no-cache",
"Origin": "https://myDomain.com"
};
};
page.viewportSize = { width: 1024, height: 768 };
// Old Chrome version without woff2 support
page.settings.userAgent = "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1866.237 Safari/537.36";
page.open('https://myDomain.com/myFile.html', function (status)
{
if(status !== 'success') {
console.log('Unable to load the address!');
phantom.exit(1);
}
else
{
page.clipRect = {top:0, left:0, width: 1024, height: 768};
page.render('image.jpg', {quality: 100, format: 'jpeg'});
phantom.exit();
}
});
The HTML file is requested and loads the fonts with Webfontloader. For some reason it seems to load the JS and CSS files from fast.fonts.net but not the fonts (otherwise it wouldn't request the fonts). For Typekit, it fails loading the p.gif file with a long query string attached (I guess that's some kind of authentication script disguised as an image).
I call the script like this:
phantomjs --ignore-ssl-errors=true --web-security=false script.js
I set web-security to false because the file contains file:/// URLs. I tested it on a local dev server running Debian Wheezy and on an online root server running Debian Jessie. The requested URL has a self-signed certificate on the dev server, which is why I ignore SSL errors there. On the root server the certificate is valid and I call it without --ignore-ssl-errors. I tried pretty much every other option as well, but nothing seems to solve the problem. Since it happens infrequently, I assumed that it's probably a problem with PhantomJS; it's not a release version after all.
Thank you for the clear and detailed bug report. Unfortunately, we need _even more_ information before we can do anything about it. Specifically, we can't reproduce this ourselves because we don't have access to your development server -- so we need you to construct the server side of a self-contained test case. I can give some advice on how to go about that, but I can't do it for you.
This is probably a bug in Qt's implementation of HTTP and it's probably triggered by something weird that fast.fonts.net is doing. The way I would begin is by capturing packet traces of phantomjs talking to your servers and to this third-party service. Look for differences in the HTTP responses between when your script works and when it doesn't work. If you can identify a difference, then program a scratch webserver to do the when-it-doesn't-work thing all the time. At the same time, cut down your HTML, CSS, and JavaScript to the smallest possible set of things that still reproduces the problem, and strip out anything that you couldn't give permission for us to stick into our automated test suite (and thus redistribute to the world under a BSD license).
The end result of this process is ideally a small collection of files plus a webserver configuration, such that if you spin up that webserver to serve those files, and run your phantomjs script against it, the network error happens 100% of the time, and it doesn't contact any _other_ web servers in the process. If you can't make 100% happen, "often enough that you should see it if you run the script over and over again" is good enough. If you can't make "no other web servers" happen, "only public CDNs" is good enough (e.g. loading jQuery from Google is fine, but loading a font that you need to pay money to access is not fine).
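A scratch webserver of the kind described above could start out as small as the following Node.js sketch. Everything in it is an assumption made for illustration: the page, the /font.woff path, the use of plain HTTP, and above all the misbehaviour itself (dropping the connection), which is only a placeholder for whatever the packet traces actually show.

```javascript
var http = require('http');

// Minimal page that pulls in one webfont, so the only interesting request
// is the one for /font.woff (a made-up path).
var PAGE = '<!DOCTYPE html><html><head><style>' +
    '@font-face { font-family: "T"; src: url("/font.woff") format("woff"); }' +
    'body { font-family: "T", serif; }' +
    '</style></head><body>Hello</body></html>';

// Request handler, kept separate from the server so it can be exercised
// without opening a socket.
function handleRequest(req, res) {
    if (req.url === '/font.woff') {
        // HYPOTHETICAL misbehaviour: drop the connection without answering.
        // Replace this with the behaviour observed in the packet traces.
        req.socket.destroy();
        return;
    }
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(PAGE);
}

function createScratchServer() {
    return http.createServer(handleRequest);
}

// Usage (port 8080 is arbitrary):
//   createScratchServer().listen(8080);
// then point page.open() in the PhantomJS script at http://localhost:8080/
```

Keeping the handler as a plain function makes it easy to swap in a different failure mode once the real difference between working and failing responses is known.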
Also, I don't see anything in your report about which version of PhantomJS you are using. We are about to release 2.1, which contains major updates to Qt and Webkit that _may_ have eliminated the problem. If it's not too much trouble, please build and test the latest development sources.
Wow, thanks for the fast response.
Also, I don't see anything in your report about which version of PhantomJS you are using. We are about to release 2.1, which contains major updates to Qt and Webkit that may have eliminated the problem. If it's not too much trouble, please build and test the latest development sources.
I compiled from the development sources four days ago. Since then I didn't see any important commits that might solve my problem but I can try to compile it again later.
As for the other stuff, I'll try to prepare a test case. It's not that easy, since it fails on paid fonts but I'll try to make it fail on another, open resource.
I compiled from the development sources four days ago. Since then I didn't see any important commits that might solve my problem but I can try to compile it again later.
It looks like the Qt 5.5 update happened before that point, so yeah, maybe better focus on the test case.
As for the other stuff, I'll try to prepare a test case. It's not that easy, since it fails on paid fonts but I'll try to make it fail on another, open resource.
I doubt this is a problem with the font itself, but it could easily be a problem with the way the font is being served -- look very closely at the HTTP request and response headers for the font.
OK, it is pretty hard to create a good test case here, because sometimes I don't see the problem the entire day and sometimes it happens on every 10th execution. Today was one of those days with a lot of occurrences.
I wasn't able to inspect the raw packets because of SSL, so I sent everything through an HTTP proxy. Every time the problem occurred, the SSL handshake completed successfully but PhantomJS sent no follow-up request. I assume PhantomJS received the handshake response, because it usually throws an error if the handshake fails. The handshake also didn't differ from the working attempts (same SSL protocol version and same ciphers).
In the JS script, I added listeners for onResourceRequested, onResourceReceived, onResourceTimeout and onResourceError.
onResourceRequested is always called (as expected).
onResourceReceived isn't called when the problem occurs (as expected).
onResourceTimeout isn't called either.
onResourceError also isn't called, so I can't really read a more detailed error message.
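Since none of the callbacks fires for the stuck request, one way to at least list such requests is to remember every requested id and drop it again when a final event arrives. The following is a minimal sketch under that assumption; attachRequestTracker and unfinished are names made up here, and the helper overwrites any handlers already set on the page.

```javascript
// Records which resource ids were requested and which of them finished
// (received, errored, or timed out), so the leftovers can be listed.
function attachRequestTracker(page) {
    var inFlight = {};   // id -> url of requests with no final event yet

    page.onResourceRequested = function (requestData) {
        inFlight[requestData.id] = requestData.url;
    };
    page.onResourceReceived = function (response) {
        // PhantomJS reports a 'start' and an 'end' stage; only 'end' is final.
        if (response.stage === 'end') { delete inFlight[response.id]; }
    };
    page.onResourceError = function (resourceError) {
        delete inFlight[resourceError.id];
    };
    page.onResourceTimeout = function (request) {
        delete inFlight[request.id];
    };

    return {
        // URLs that were requested but never produced a final event.
        unfinished: function () {
            return Object.keys(inFlight).map(function (id) {
                return inFlight[id];
            });
        }
    };
}

// Usage inside a PhantomJS script:
//   var page = require('webpage').create();
//   var tracker = attachRequestTracker(page);
//   page.open(url, function () {
//       console.log('Unfinished: ' + tracker.unfinished().join(', '));
//       phantom.exit();
//   });
```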
I also found one case where the same error shows up consistently on the same file. When I request "https://www.google.com", I always get the same error message when it requests "https://www.google.com/textinputassistant/tia.png". When I run the same script through the proxy, though, it can load that image and the error occurs a bit later when requesting a JavaScript file.
I created the following script for that, which failed on all systems I tested (Debian Wheezy, Debian Jessie and Windows 8.1).
var page = require('webpage').create();
page.onResourceError = function(resourceError) {
// The URL comes from resourceError; there is no `request` variable in this handler.
console.log('\033[1;31m' + 'Error (#' + resourceError.id + '): URL: ' + resourceError.url.substr(0,80) + ' || ' + resourceError.errorCode + ' || Description: ' + resourceError.errorString + '\033[0m');
};
page.onResourceTimeout = function(request) {
console.log('\033[1;33m' + 'Timeout (#' + request.id + '): ' + request.url.substr(0,80) + '\033[0m');
};
page.onResourceRequested = function(request) {
console.log('\033[0;36m' + '--> Request (#' + request.id + '): ' + request.url.substr(0,80) + '\033[0m');
};
page.onResourceReceived = function(response) {
console.log('\033[1;34m' + '<-- Response (#' + response.id + '): ' + response.url.substr(0,80) + '\033[0m');
};
page.viewportSize = { width: 1024, height: 768 };
// Old Chrome version without woff2 support
page.settings.userAgent = "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1866.237 Safari/537.36";
page.settings.resourceTimeout = 1000;
page.open('https://www.google.com', function (status)
{
if(status !== 'success') {
console.log('Unable to load the address!');
phantom.exit(1);
}
else
{
page.clipRect = {top:0, left:0, width: 1024, height: 768};
page.render('image.jpg', {quality: 100, format: 'jpeg'});
phantom.exit();
}
});
I called it with no additional parameters other than --debug=true, and it shows the following line in the debug output:
2015-12-17T16:54:39 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "https://www.google.com/textinputassistant/tia.png"
I'm pretty sure that this has something to do with SSL, and it seems to happen between the handshake and the request for the actual file.
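Because the error only shows up in the debug output, it can help to pull the affected URLs out of a captured log when running the script many times. A minimal sketch, assuming the log lines have exactly the format shown above (extractCanceledUrls is a name made up here, and it can be run with Node.js on a saved log):

```javascript
// Matches the debug line format shown above and captures the quoted URL.
var CANCELED_RE = /Resource request error: QNetworkReply::NetworkError\(OperationCanceledError\).*URL: "([^"]*)"/;

// Returns the URLs of all canceled requests found in the given log text,
// in the order in which they appear.
function extractCanceledUrls(logText) {
    var urls = [];
    logText.split(/\r?\n/).forEach(function (line) {
        var match = CANCELED_RE.exec(line);
        if (match) { urls.push(match[1]); }
    });
    return urls;
}
```

Counting how often each URL appears across runs gives a rough failure rate per resource.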
I encountered this issue when web pages redirect to a different domain via window.location (usually from a mobile "m." site to the desktop site, or a GeoIP-based domain switch such as .co.uk to .com). The target URL appears in the debug output, but sadly there is no way to catch the error from the PhantomJS script.
E.g., when requesting http://m.noisey.vice.com/en_uk/blog/sam-smith-has-awoken-to-racism:
2016-01-25T16:36:55 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://m.noisey.vice.com/en_uk/blog/sam-smith-has-awoken-to-racism"
config.json
{
"ignoreSslErrors": true,
"webSecurityEnabled": false,
"sslProtocol": "TLSv1",
"outputEncoding": "utf8",
"scriptEncoding": "utf8"
}
will-error.js
'use strict';
var system = require('system');
var webPage = require('webpage');
var page = webPage.create();
var DEFAULT_PAGE_SETTINGS = {
encoding: "utf8",
headers: {
"Accept-Language": 'en',
"Cache-Control": 'no-cache',
"DNT": '1'
}
};
page.open('http://m.noisey.vice.com/en_uk/blog/sam-smith-has-awoken-to-racism', DEFAULT_PAGE_SETTINGS, function(status){
if (status === 'fail') {
system.stderr.write('Page failed to load.');
phantom.exit(75);
}
else {
phantom.exit(0);
}
});
It will sometimes succeed and sometimes fail, but in general page.content contains some HTML but not the <body> tag or anything inside it.
PS: I had the issue on Phantom 2.0 and it persists with Phantom 2.1.1.
$ phantomjs --config=config.json --debug=true redirect.js
2016-01-25T17:09:12 [DEBUG] CookieJar - Created but will not store cookies (use option '--cookies-file=<filename>' to enable persistent cookie storage)
2016-01-25T17:09:12 [DEBUG] Set "http" proxy to: "" : 1080
2016-01-25T17:09:12 [DEBUG] Phantom - execute: Configuration
2016-01-25T17:09:12 [DEBUG] 0 objectName : ""
2016-01-25T17:09:12 [DEBUG] 1 cookiesFile : ""
2016-01-25T17:09:12 [DEBUG] 2 diskCacheEnabled : "false"
2016-01-25T17:09:12 [DEBUG] 3 maxDiskCacheSize : "-1"
2016-01-25T17:09:12 [DEBUG] 4 diskCachePath : ""
2016-01-25T17:09:12 [DEBUG] 5 ignoreSslErrors : "true"
2016-01-25T17:09:12 [DEBUG] 6 localUrlAccessEnabled : "true"
2016-01-25T17:09:12 [DEBUG] 7 localToRemoteUrlAccessEnabled : "false"
2016-01-25T17:09:12 [DEBUG] 8 outputEncoding : "utf8"
2016-01-25T17:09:12 [DEBUG] 9 proxyType : "http"
2016-01-25T17:09:12 [DEBUG] 10 proxy : ":1080"
2016-01-25T17:09:12 [DEBUG] 11 proxyAuth : ":"
2016-01-25T17:09:12 [DEBUG] 12 scriptEncoding : "utf8"
2016-01-25T17:09:12 [DEBUG] 13 webSecurityEnabled : "false"
2016-01-25T17:09:12 [DEBUG] 14 offlineStoragePath : ""
2016-01-25T17:09:12 [DEBUG] 15 localStoragePath : ""
2016-01-25T17:09:12 [DEBUG] 16 localStorageDefaultQuota : "-1"
2016-01-25T17:09:12 [DEBUG] 17 offlineStorageDefaultQuota : "-1"
2016-01-25T17:09:12 [DEBUG] 18 printDebugMessages : "true"
2016-01-25T17:09:12 [DEBUG] 19 javascriptCanOpenWindows : "true"
2016-01-25T17:09:12 [DEBUG] 20 javascriptCanCloseWindows : "true"
2016-01-25T17:09:12 [DEBUG] 21 sslProtocol : "tlsv1"
2016-01-25T17:09:12 [DEBUG] 22 sslCiphers : "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-RC4-SHA:ECDHE-RSA-RC4-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA:DHE-RSA-AES256-SHA:AES128-GCM-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:RC4-SHA:RC4-MD5"
2016-01-25T17:09:12 [DEBUG] 23 sslCertificatesPath : ""
2016-01-25T17:09:12 [DEBUG] 24 sslClientCertificateFile : ""
2016-01-25T17:09:12 [DEBUG] 25 sslClientKeyFile : ""
2016-01-25T17:09:12 [DEBUG] 26 sslClientKeyPassphrase : ""
2016-01-25T17:09:12 [DEBUG] 27 webdriver : ":"
2016-01-25T17:09:12 [DEBUG] 28 webdriverLogFile : ""
2016-01-25T17:09:12 [DEBUG] 29 webdriverLogLevel : "INFO"
2016-01-25T17:09:12 [DEBUG] 30 webdriverSeleniumGridHub : ""
2016-01-25T17:09:12 [DEBUG] Phantom - execute: Script & Arguments
2016-01-25T17:09:12 [DEBUG] script: "redirect.js"
2016-01-25T17:09:12 [DEBUG] Phantom - execute: Starting normal mode
2016-01-25T17:09:12 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:12 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:12 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:12 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 10
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 30
2016-01-25T17:09:12 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 32
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 34
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 37
2016-01-25T17:09:12 [DEBUG] WebPage - updateLoadingProgress: 37
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 39
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 41
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 43
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 45
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 47
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 48
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 48
2016-01-25T17:09:13 [DEBUG] WebPage - updateLoadingProgress: 49
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 49
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 49
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 49
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 49
2016-01-25T17:09:14 [DEBUG] CookieJar - Saved "isMobile=false; domain=m.noisey.vice.com; path=/"
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 100
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 10
Page succeeded to load.
2016-01-25T17:09:14 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:14 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://noisey.vice.com/en_uk/blog/sam-smith-has-awoken-to-racism"
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 100
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 10
2016-01-25T17:09:14 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 100
2016-01-25T17:09:14 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 10
2016-01-25T17:09:14 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/fs.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/system.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] FileSystem - _open: ":/modules/webpage.js" QMap(("mode", QVariant(QString, "r")))
2016-01-25T17:09:14 [DEBUG] WebPage - updateLoadingProgress: 100
2016-01-25T17:09:14 [DEBUG] WebPage - setupFrame ""
2016-01-25T17:09:14 [DEBUG] CookieJar - Purged (session) "isMobile=false; domain=m.noisey.vice.com; path=/"
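In the log above, the success message is printed before the redirect target is requested, so the page.open callback fires for the first load only. A possible workaround, which does not fix the cancellation itself, is to wait until no new load has started for a short while before reading page.content. The sketch below assumes that approach; waitForSettledLoad, quietMs and the injectable timers argument are made up for illustration, and the helper overwrites any existing onLoadStarted and onLoadFinished handlers.

```javascript
// Calls done(status) once, after a load has finished and no further load
// has started for quietMs milliseconds. `timers` exists only so the logic
// can be exercised outside PhantomJS; it defaults to the global timers.
function waitForSettledLoad(page, quietMs, done, timers) {
    timers = timers || {
        set: function (fn, ms) { return setTimeout(fn, ms); },
        clear: function (id) { clearTimeout(id); }
    };
    var pending = null;   // handle of the scheduled done() call, if any
    var called = false;

    function cancelPending() {
        if (pending !== null) {
            timers.clear(pending);
            pending = null;
        }
    }

    page.onLoadStarted = function () {
        // A new navigation began (for example a window.location redirect),
        // so the previous "finished" no longer describes the final page.
        cancelPending();
    };

    page.onLoadFinished = function (status) {
        cancelPending();
        if (called) { return; }
        pending = timers.set(function () {
            pending = null;
            called = true;
            done(status);
        }, quietMs);
    };
}

// Usage inside a PhantomJS script (500 ms is an arbitrary quiet period):
//   waitForSettledLoad(page, 500, function (status) {
//       console.log(status, page.content.length);
//       phantom.exit(status === 'success' ? 0 : 75);
//   });
//   page.open(url);
```

A redirect that starts later than the quiet period will still be missed, so the value has to be tuned for the pages in question.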
I am experiencing the same issue.
Non-redirect links are rendered/accessed fine. But sometimes requests to https://maps.googleapis.com (Static Maps API), https://fonts.gstatic.com (fonts) and some others just fail, though not always.
Hmm, what is interesting: with debug disabled, it almost always succeeds. (I hadn't noticed anything in the first few days, when I was using the lib without debug.)
I'm experiencing the same problem. In my case it is with local assets: I'm running UI tests (Mink) with the PhantomJS driver, and the website has a lot of JavaScript files (around 100) that are loaded asynchronously using RequireJS. The errors are random. This is an extract of one of them:
2016-05-07T00:05:18 [DEBUG] Network - Resource request error: QNetworkReply::NetworkError(OperationCanceledError) ( "Operation canceled" ) URL: "http://localhost:8000/bower/mustache/mustache.min.js"
2016-05-07T00:05:18 [DEBUG] WebPage - updateLoadingProgress: 10
2016-05-07T00:05:18 [DEBUG] WebPage - evaluateJavaScript result QVariant(QString, "{\"status\":0,\"value\":null}")
[ERROR - 2016-05-06T23:05:18.961Z] Session [ee019f50-13de-11e6-932b-712400caa33e] - page.onError - msg: Error: Script error for "mustache", needed by: modallica
http://requirejs.org/docs/errors.html#scripterror
phantomjs://platform/console++.js:263 in error
[ERROR - 2016-05-06T23:05:18.961Z] Session [ee019f50-13de-11e6-932b-712400caa33e] - page.onError - stack:
defaultOnError (http://localhost:8000/bower/requirejs/require.js:143)
onError (http://localhost:8000/bower/requirejs/require.js:547)
onScriptError (http://localhost:8000/bower/requirejs/require.js:1735)
phantomjs://platform/console++.js:263 in error
I have a hunch that some of these problems are related to pages that include webfonts (.woff).
In my experience, I've seen phantomjs v2 and higher fail to load css and js resources when the page also includes webfonts.
At times, if you look at the javascript phantomjs has parsed, it appears that it's mixed up content from one buffer (say, a js file), and another buffer (another js file or even css). We found that removing our woff resource from the page was a pretty reliable way of getting our "Operation Canceled" requests to stop appearing.
Can anyone else try that out? Would be great to be able to fix some of these underlying bugs...
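For anyone who wants to try this without editing the page itself, the font requests can be aborted from the PhantomJS script. This is a rough sketch, not a fix: isWebfontUrl and blockWebfonts are names made up here, only URLs ending in .woff or .woff2 are matched (extensionless font URLs are not), and the helper overwrites any existing onResourceRequested handler.

```javascript
// True for URLs whose path ends in .woff or .woff2; a query string or
// fragment after the extension is allowed.
function isWebfontUrl(url) {
    return /\.woff2?([?#].*)?$/i.test(url);
}

// Aborts every webfont request before it is sent.
function blockWebfonts(page) {
    page.onResourceRequested = function (requestData, networkRequest) {
        if (isWebfontUrl(requestData.url)) {
            networkRequest.abort();
        }
    };
}

// Usage inside a PhantomJS script:
//   var page = require('webpage').create();
//   blockWebfonts(page);
//   page.open(url, function (status) { /* render as usual */ });
```

The aborted font requests will probably be reported as errors of their own, so they need to be filtered out when checking whether the other "Operation canceled" errors have gone away.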
Duplicate. Please see https://github.com/ariya/phantomjs/issues/12750