Phantomjs: File download

Created on 1 Mar 2011  ·  159 Comments  ·  Source: ariya/phantomjs

_alexsa...@gmail.com commented:_

It would be good to accept (and save) 'Content-Disposition: attachment; filename=' content.

Disclaimer:
This issue was migrated on 2013-03-15 from the project's former issue tracker on Google Code, Issue #52.
:star2:   40 people had starred this issue at the time of migration.

PJS core

All 159 comments

_ariya.hi...@gmail.com commented:_

This is again related to issue 41.

 
_Metadata Updates_

  • Milestone updated: FutureRelease (was: ---)

_roejame...@gmail.com commented:_

Issue 92 has been merged into this issue.

_brian.th...@gmail.com commented:_

I'm trying to implement this functionality and not making much progress. Using the attached patch, I run:

$ bin/phantomjs examples/download.js

and get this output:

WebPage instantiated
WebPage instantiated
Download complete - fail

I added cout of "WebPage instantiated" (to verify my debug messages work as expected). I also added a cout in my downloadRequested slot. That one did not get displayed. Can someone spot what I'm doing wrong or let me know if I'm on the completely wrong track?

Here is where I found out about the downloadRequested signal: http://doc.qt.nokia.com/latest/qwebpage.html#downloadRequested

_brian.th...@gmail.com commented:_

Whoops, here is the patch file attachment without the ANSI color codes

_nperria...@gmail.com commented:_

Any progress on this issue?

_ariya.hi...@gmail.com commented:_

No progress as of now.

_nperria...@gmail.com commented:_

A friend of mine (http://svay.com/) just told me a nice trick for working around this issue: use XHR within the page environment and base64 encoding to retrieve the file contents. It works rather well. For the record, you can find an example here: http://jsfiddle.net/3kUXy/

_gopiredd...@gmail.com commented:_

The URL to the file is not always known so XHR is not a general solution. For instance, if you are downloading a utility/bank/cc statement, you may have to click a link which will possibly execute some JS code and trigger another page load with a frame embedding the PDF. Or the statement comes in as an attachment.

What will it take to support the file download feature?

Requirement: Download files that come in embedded in the page/frame or as attachments. The URLs may or may not be known. Allow saving the files to the file system or "upload" them to a web server (so the server can save the files in a DB for instance).

_ja...@recovend.com commented:_

I've got an early but functional version of this at

https://github.com/woodwardjd/phantomjs/tree/add_download_capabilities

Example:

var page = require('webpage').create();

page.onUnsupportedContentReceived = function(data) {
    console.log('Got a download at url: ' + data.url);
    page.saveUnsupportedContent('some.file.path', data.id);
    phantom.exit();
};

page.open('http://some.pdf.url.com/some.pdf');

I call this "early but functional" because it works where I've tested it (Linux, PDF downloads), but it likely has a small memory leak, and I'm not 100% convinced the callback mechanism I used is ideal.

Comments desired.

_rotava...@gmail.com commented:_

I've downloaded and built the git for above, but I can't seem to get the onUnsupportedContentReceived event to fire and calling saveUnsupportedContent throws an undefined error. Are there special build steps required to enable it?

Thanks,
Robert

_ja...@recovend.com commented:_

No special build steps required, as far as I know. If
saveUnsupportedContent is undefined, maybe you haven't built the version in
the add_download_capabilities branch (git checkout
add_download_capabilities after the git clone)? Just speculating.

_audi...@gmail.com commented:_

I second the XHR+base64 method. It takes another 50+ lines of code to send to page.evaluate(), and I have to de-base64 the content afterward, and that's basically how CasperJS does it (as far as I can tell from their code; they do a lot of weird (unnecessary, in my book) binding with window.utils in the page context).

I used this one (first answer):
http://stackoverflow.com/questions/7370943/retrieving-binary-file-content-using-javascript-base64-encode-it-and-reverse-de

It works great. Just be sure to try-catch the call to base64ArrayBuffer(), because Uint8Array(arrayBuffer) may throw an error, and check xhr.getResponseHeader('content-type') == 'application/pdf' if you're doing PDF downloads like I was.
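A minimal sketch of the encoding half of the XHR + base64 approach described above. The name `bytesToBase64` is hypothetical (it is not taken from the linked Stack Overflow answer), and it covers only the byte-to-text step; fetching the bytes with XHR inside page.evaluate() and decoding them again on the PhantomJS side are left out.

```javascript
// Hypothetical helper: encode an array of byte values (0-255) as base64 text,
// so binary data can be returned from page.evaluate() as a plain string.
function bytesToBase64(bytes) {
    var alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
    var out = '';
    // Process the input three bytes at a time; each group becomes four characters.
    for (var i = 0; i < bytes.length; i += 3) {
        var hasB1 = i + 1 < bytes.length;
        var hasB2 = i + 2 < bytes.length;
        var b0 = bytes[i];
        var b1 = hasB1 ? bytes[i + 1] : 0;
        var b2 = hasB2 ? bytes[i + 2] : 0;
        out += alphabet.charAt(b0 >> 2);
        out += alphabet.charAt(((b0 & 3) << 4) | (b1 >> 4));
        // Missing trailing bytes are marked with '=' padding.
        out += hasB1 ? alphabet.charAt(((b1 & 15) << 2) | (b2 >> 6)) : '=';
        out += hasB2 ? alphabet.charAt(b2 & 63) : '=';
    }
    return out;
}

console.log(bytesToBase64([37, 80, 68, 70])); // JVBERg== (the bytes of "%PDF")
```

Because the function only indexes into its argument, it should also accept a typed array, for example `bytesToBase64(new Uint8Array(xhr.response))` inside the page.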

_subel...@gmail.com commented:_

I need this as well. Can't use the XHR method because the inline attachments I need to scrape don't come with a URL I can hit.

_audi...@gmail.com commented:_

Wouldn't inline attachments be even more easily downloaded? For an image:
var content = page.evaluate(function() {
    return $('img#whatever').attr('src');
});
fs.write(yer_path, content, 'w');


Ariya, can you give some estimate of how long this feature (downloading a url) would take to implement? I'd love to get involved in PhantomJS development, but maybe this issue is a lot trickier than it sounds?

_subel...@gmail.com commented:_

Sorry, I didn't mean to write "inline". The file I need is not an image and is not part of the DOM. It gets sent as a result of a POST with the Content-Disposition header 'attachment;filename="report.csv"'

_bogusan...@gmail.com commented:_

Hi there. I think the base64-encoding solution can only be a stop-gap solution.

  • Downloading big files will probably exhaust memory and base64 encoding and -decoding it will use up resources that would have better been spent elsewhere - therefore we want to have the option to redirect a downloaded stream to file
  • We may have pages where we cannot control the loading of a file that is not supported (e.g. PDF)
  • We may want to save resources that have already been loaded as part of the page (e.g. images)

I think the optimal solution would be to add functionality to the onResourceReceived hook to allow setting up a "redirection" handler, and if such a handler is set, unsupported file formats should silently be downloaded. This handler could then have another onDownloadFinished hook to resume operation once the download is done.

_james.m....@gmail.com commented:_

 

 
_Metadata Updates_

  • Label(s) removed:

    • Type-Defect

  • Label(s) added:

    • Type-Enhancement

  • Status updated: Accepted

I'm interested in committing some of my company's resources to adding this feature. Is anyone already working on it? If so, could my company sponsor your work? If not, we can assign it to one of our own people. I just want to avoid duplicating anyone else's work.

I'm also interested in helping with this feature. We're trying to capture an Acrobat file that is sent as a result of a POST with the Content-Disposition header 'attachment;filename="file.pdf"' Is anyone working on this? I don't want to duplicate effort. Ideally we want to access the functionality from CasperJS as well.

any progress on this?

I'd love to see this fixed too. I saw @Vitallium has a fork with download support, as well as a few other fixes.

https://github.com/Vitallium/phantomjs/tree/download-support

Anyone else able/available to help merge the new code? I wouldn't be doing anyone a favor if I messed with the C codebase. I wouldn't mind donating towards a bounty for this.

This feature is under development. When it's ready, it'll be merged into the master tree.
I can't say when this feature will be ready.

I'm also interested in this issue. Will we be able to render the pdf content as png / jpeg? Or is that altogether a different problem?

@FergusNelson that's a different problem, but much more easily solved using ghostscript, X11, ImageMagick, etc.

looks like @Vitallium is pretty far along with an awesome solution in his download-support branch, described here: https://groups.google.com/forum/#!msg/phantomjs/JChUakj--24/epby47h3ZGAJ

I see that there are at least two attempts to address this issue on GitHub. @woodwardjd's add_download_capabilities branch, and @Vitallium's download-support branch. Is one of those a more promising path forward than the other? What work is outstanding before it would be ready to merge upstream?

@Vitallium how close is this to being merged with the master?

I rebased @Vitallium's download-support branch on a recent master HEAD.

I've been exercising it with a happy path test case, and it seems to be working fine.

@ariya and @Vitallium,

I'd like to continue the work that @Vitallium started if there's more to do.

What do you think blocks merging this upstream?

I actually want to rework the 'download-support' branch. I want to make it behave like real browsers.
But I haven't posted my ideas to the mailing list yet (https://groups.google.com/forum/#!topic/phantomjs/JChUakj--24).
So, I want to:

  • split download support out into a separate class, DownloadManager (or something similar)
  • and make it a module, so that downloads do not depend on a WebPage instance

Hi, we are having trouble with downloading files too. We gave the download-support branch a try, but the onFileDownload() callback does not seem to be called. We assume that is because the web page does not return a "Content-Disposition" header, only an "application/octet-stream" content type. (As the target page is not our code, we can't change anything on the server side.)

It seems that PhantomJS stops executing when the "download" button is clicked.
So we are actually not sure whether onFileDownload is not called, or whether the whole process is lost and suspended somewhere. However, we still think it is because of the "application/octet-stream" content-type header.

I'm not sure if I'm making myself clear, but we want to know 1) whether our understanding about the missing Content-Disposition header is correct, 2) whether Vitallium's DownloadManager will solve this problem, and finally, 3) if yes, whether it will be available sometime soon (say, within a month).

Thank you,
minami

UPDATE:
it seems this one works in our case:
https://github.com/ariya/phantomjs/pull/11484

thank you

May I ask what is the progress for this function?

:+1:

:+1:

For some cases, one workaround is enabling phantomjs cache and scanning cache directory to retrieve that downloaded attachment.

This feature will be in the next version. So, stay tuned!

up!

Need this ASAP :-)

+1

@Vitallium do you have any details about when that will be?

For those who need file download ability now, from what I understand casperjs solves this.

Correction. I tried out casperjs and downloading large files does not work, they are 0 bytes. CasperJS folks say this relates to another bug in phantomjs, inability to set a larger timeout value. Please fix these bugs, downloading large files is very important for automation and testing!

push!

Happy to beta test anything here.

I'm trying to download an xlsx file and get access to the content.

+1 for fixing the large timeout bug

I need to download a 25 MB Excel file every day at the same time, after logging in, searching, and so on.

So CasperJS was my friend... or could have been, because with this bug I cannot download the file... sgrunt!

@realtebo, did you try using CasperJS with SlimerJS?
Because of PhantomJS bugs I use SlimerJS and it works very well.

I need this too ASAP

my current workaround is to use an XMLHttpRequest to GET the file as 'arraybuffer' inside page.evaluate() so we keep the page context with cookies and all, then use the 'fs' module to write the binary data.

var results = page.evaluate(function () {
    // downloads have to be in the context of the web page
    function downloadReport(id, name) {
        console.log('downloading: ' + name);
        var result = {};
        try {
            var xhr = new XMLHttpRequest();
            // third argument false = synchronous request
            xhr.open("GET", "http://host/api/v1/reports/" + id, false);
            xhr.responseType = 'arraybuffer';
            xhr.send(null);
            var bin = xhr.response;
            // convert each byte to one character so the data survives the return from evaluate()
            var u8 = new Uint8Array(bin), ic = u8.length, bs = [];
            while (ic--) { bs[ic] = String.fromCharCode(u8[ic]); }
            result.data = bs.join('');
            result.name = name;
        } catch (e) {
            result.error = JSON.stringify(e);
        }
        return result;
    }

    var result = [];
    result.push(downloadReport(123, 'report.pdf'));
    return result;
}, token); // token comes from the surrounding script, which is not shown

results.forEach(function (item) {
    if (item.data != null)
        fs.write(item.name, item.data, { mode: 'wb' });
    else
        console.log(item.error);
});

+1

+1

I came up with another workaround. From within page.evaluate I click on the link I need to download, then listen for onResourceReceived.

page.set('onResourceReceived', function (resource) {
    if (resource.contentType && resource.stage === 'end' && resource.contentType.indexOf('application/pdf') > -1) {
        console.log(resource);
        // Here you can download the file from resource.url by using http(s) request (e.g. https://gist.github.com/ialpert/3136595)
    }
});
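The condition in the handler above can be pulled out into a small predicate and checked on its own. This is only a sketch: `isFinishedResponseOfType` is a made-up name, and the sample objects simply mimic the `stage` and `contentType` fields that the snippet above reads.

```javascript
// Hypothetical helper: true when a resource has finished loading
// (stage 'end') and its content type contains the wanted MIME type.
function isFinishedResponseOfType(resource, mimeType) {
    if (!resource || resource.stage !== 'end' || !resource.contentType) {
        return false;
    }
    // contentType may carry parameters, e.g. 'application/pdf;charset=UTF-8'
    return resource.contentType.toLowerCase().indexOf(mimeType.toLowerCase()) > -1;
}

// Sample objects shaped like the ones the handler above receives (URLs are placeholders)
var done = { stage: 'end', contentType: 'application/pdf;charset=UTF-8', url: 'http://example.com/a.pdf' };
var started = { stage: 'start', contentType: 'application/pdf', url: 'http://example.com/a.pdf' };

console.log(isFinishedResponseOfType(done, 'application/pdf'));    // true
console.log(isFinishedResponseOfType(started, 'application/pdf')); // false
```

Inside the handler, the check would then read `if (isFinishedResponseOfType(resource, 'application/pdf')) { ... }`.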

@Vitallium, if you need help getting this into the next release please do let me know. This is _the_ biggest barrier to my testing so far. Thanks.

This issue is still marked as FutureRelease. Is there any possibility it will be included in 2.0?

Has anyone already tried the @mrampersad solution?

Did a quick test. Seem to work for me.

Well, one issue: onDownloadFinished callback is not called, but file is saved fine.
Nevermind. I was exiting phantom in onLoadFinished(fail), so it did not have chance to be called.

@agr would you be able to share a working example?

var page = require('webpage').create();
page.onFilePicker = function(oldFile)
{
    console.log('onFilePicker(' + oldFile + ') called');
    return 'master.zip';
}
page.onDownloadFinished = function(status)
{
    console.log('onDownloadFinished(' + status + ')');
    phantom.exit(1);
}
page.onLoadFinished = function(status)
{
    console.log('onLoadFinished(' + status + ')');
}
page.open('https://github.com/facebook/php-webdriver/archive/master.zip');

Has the bug with large downloads been fixed?

Files over 2GB never have, and still don't, work for me. If anyone got this
to work please let me know how.


Is this still being worked on? Did it ever get merged into master?

I'm working on Windows.
@agr @mrampersad
Is it possible to share the binary with download support?
I was not able to build it successfully.
I have been struggling with this a lot, so please help.

If you are not afraid of running random executables from Internet, here you go: https://github.com/agr/phantomjs/releases/tag/pj-download

Thanks a lot @agr

@agr do you have the binaries for osx and Linux too ?

nope

@agr do you have the binary for 32bit Windows os

ping

@agr the shared PhantomJS binary with file download support does not work on Windows XP. It says it is not a valid 32-bit application. Is this expected behaviour?
PhantomJS 1.9.8 works fine.

Does PhantomJS 2 support file download?

@ankitgr8 It worked for me on 64-bit Windows 8 and 7. Maybe the binary is 64-bit.

Binary _is_ 32 bit:

$ file phantomjs.exe
phantomjs.exe: PE32 executable (console) Intel 80386, for MS Windows

It might be built with minimum Windows version set to something higher, than XP, but I am not sure how to check that.

Building should be independent of the Windows version; it depends on the compiler we used (VC++),
unless we used some Windows API that is not available in XP.

But since PhantomJS 1.9.8 works fine on Windows XP, I'm wondering which new API used in this PhantomJS branch is not compatible with Windows XP.

Do we have the steps for creating the binary from this branch?

Clone repository to local disk, run build.cmd? That's what I did.

The branch we are talking about is https://github.com/Vitallium/phantomjs/tree/download-support

It does not have build.cmd. Can you please share the branch to clone?

Hi.
I tried to use phantomjs with selenium as replacement for firefox.
And I have a question: how do I use this functionality with WebDriver?

@momogentoo thanks for the suggestion to scan the cache.

One question : cache files seem compressed. Did you find out how to uncompress them from the command line ?

I was able to build PhantomJS 2.0 on Windows and also merged the download capability feature into the PhantomJS 2.0 branch.
Here's the link to download the exe if anyone needs it: https://github.com/ankitgr8/phantomjs2.0
Thanks to Vitallium and ariya for this feature and for the easy build environment for PhantomJS 2.0.
I have also uploaded a readme on how to use the feature at the above link.
Thanks everyone for all the help.

@ankitgr8 :beers:

PR #11557 looked like it was the furthest along on features. The last comment from @Vitallium looked like some tests were potentially holding up the merge.

There were also some notes about the need to have better tracking of a download's completion. In my local copy I had created a public property to expose m_downloadingFiles.count(); so I could see if files were actively downloading. Following the normal conventions though, this should be implemented as a Callback function when the private downloadFinished completes. I normally look at .NET so it isn't clear to me if this is accessible As-Is. It doesn't look like there is a clear method to tie the onFileDownload to a download ID that can be matched in onResourceReceived which I think is the closest to an OnDownloadFinished that exists to cover this need in the current implementation.

Is there something in Qt5 that may suggest a different approach to this? And require more tweaks?

Is there general acceptance that the API used in this version with page.onFileDownload and page.onFileDownloadError is where we are going? I would be more comfortable applying the patch locally and doing a Local build to know that this is how it will look once it is officially merged.

I read somewhere that @Vitallium had made a comment about wanting to provide some better download progress status, so I don't know if this bar is needed to get this accepted.

So do we just need to provide some tests and a version of the PR for the current Master (still getting my head around Git to know if this is needed) to move this forward or is more needed?

Sorry if this is too verbose, and Thank you to everyone that has been active on these threads, it has been very helpful to understand where things are.

🙏

Oh lawd! praise white baby jesus!

Simple page to use as a test case for this: http://www.fangraphs.com/projections.aspx?pos=all&stats=bat&type=steameru (click the "Export Data" link). Hard to believe PhantomJS can't handle this basic scenario, but good luck to those working on a solution.

FYI casperjs can handle this fine, as explained here (the answer by julianjm): http://stackoverflow.com/questions/16144252/downloading-a-file-that-comes-as-an-attachment-in-a-post-request-response-in-pha

My own code had even fewer lines, and was more like:

var casper = require('casper').create({
    pageSettings: {
        webSecurityEnabled: false
    }
});

casper.start();

casper.thenOpen('http://www.fangraphs.com/projections.aspx?pos=all&stats=pit&type=steameru', function() {
    var postbody = this.page.evaluate(function() {
        $('#__EVENTTARGET').val('ProjectionBoard1$cmdCSV');
        return $('#form1').serialize();
    });
    casper.download('http://www.fangraphs.com/projections.aspx?pos=all&stats=pit&type=steameru', 'fg_pitchers.csv', 'POST', postbody);
});

casper.run();

Call the script from command line like so:

casperjs --ssl-protocol=any --cookies-file=cookies.txt myscript.js

All hail the "download" function 🙌

This is the most frustrating issue with Phantom!

I need to be able to download pdf files for testing. While not able to do this directly via Phantom, I can do it using sync xmlhttprequest or async (to set the responseType to blob or arrayBuffer).

But either way, I end up with garbage.

The blob only outputs [object object] in the file and the arrayBuffer outputs a pdf that is garbled. The text portions are fine, but the encoded portions are garbage.

I have tried binary writes, conversion of the arrayBuffer as mentioned earlier, and all types of charsets (which should not be used on the binary format) but all result in corrupted pdfs.

Anyone have a working solution???????

Phantom really needs a solid download solution. I have tried some of the linked "solutions" but they result in no file. The file I am downloading is typed as application/pdf but named xxxx.sap

Hi tommunro

Not sure which OS you are looking for, but if you are working on Windows, this is the PhantomJS 2.0 build with download functionality built in. You can download the exe from https://github.com/ankitgr8/phantomjs2.0 for your testing purposes. I have also included sample JS code there.

Thank you for the fast response.

I have tried a couple different download builds for windows, but neither seemed to fire the ondownload event.

The link is a "getpdf.sap?..." link.

I can receive the file using both sync and async xmlhttprequest calls under page.evaluate such as:

var xhr = new XMLHttpRequest();
xhr.open('GET', tempLink, true);
xhr.responseType = 'arrayBuffer';
xhr.onload = function(e) {
    window.callPhantom(xhr.response);
};

The above works when I include a wait in the evaluate for the async to complete.

With async I can get a blob or an arrayBuffer, but I have not been able to save the blob to a file.

With sync I can return the response directly rather than through a callback.

Both methods however result in a pdf file that looks ok in an editor, but the encoded (binary) portions are encoded incorrectly so the text does not appear.

I save the file using fs.write('test.pdf', data, 'w'); (tried wb also – just gives different encoding)

I should get:

%PDF-1.3

zG_ÕùßJ€·°#s6­¦dR L„s­

1 0 obj

<<

But instead I get:

%PDF-1.3

zG_���J���#s6­�dR L�s­

1 0 obj

<<

With your download build, I β€œclick” on the pdf link and use

page.onFileDownload = function(status) {
    console.log('onFileDownload(' + status + ')');
    return 'test.pdf';
}

page.onFileDownloadError = function(status) {
    console.log('onFileDownloadError(' + status + ')');
    //phantom.exit(1);
}

But these never seem to get called.

Tom
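A possible explanation for the garbled bytes, offered as an assumption and not confirmed anywhere in this thread: XHR `responseType` values are case-sensitive, so 'arrayBuffer' (capital B) is not a recognized value and the response may fall back to text, which cannot represent arbitrary bytes. Values passed through page.evaluate() or window.callPhantom() also have to be JSON-serializable, so an ArrayBuffer needs converting to a string first. The sketch below, with hypothetical helper names, shows one lossless conversion in which each byte becomes exactly one character with code 0-255.

```javascript
// Hypothetical helpers, not part of PhantomJS.
// Turn bytes into a "binary string": one character per byte, code 0-255.
function bytesToBinaryString(bytes) {
    var chars = [];
    for (var i = 0; i < bytes.length; i++) {
        chars.push(String.fromCharCode(bytes[i]));
    }
    return chars.join('');
}

// Reverse step: recover the byte values from a binary string.
function binaryStringToBytes(str) {
    var bytes = [];
    for (var i = 0; i < str.length; i++) {
        bytes.push(str.charCodeAt(i) & 0xff);
    }
    return bytes;
}

// Round trip over a few bytes, including high values that are not valid UTF-8 on their own
var sample = new Uint8Array([0x25, 0x50, 0x44, 0x46, 0xd5, 0xf9, 0xdf, 0x80]);
var restored = binaryStringToBytes(bytesToBinaryString(sample));
console.log(restored.join(',')); // 37,80,68,70,213,249,223,128
```

Inside the page this would be called as `bytesToBinaryString(new Uint8Array(xhr.response))`, and the resulting string would be written on the PhantomJS side with a binary mode, as the earlier arraybuffer workaround in this thread does.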


Try adding some debug output to your script to check what request is being sent and what response is being received, and check the content type of the response. If it is HTML, then onFileDownload will not be called.

Try the script example below to show what was sent and received:

page.onResourceReceived = function(status){console.log('onResourceReceived(' + status.contentType + ')'); if(status.stage === 'end'){phantom.exit(0);}}
page.onResourceRequested = function(requestData, networkRequest){console.log('onResourceRequested(' + JSON.stringify(requestData) + ')');}

ok, cool.

After much tweaking, it turns out that the funky onclick event for this particular link was not generating a resource request like the others. I changed the page to "open" the link directly and that did fire the onDownload, while failing to load the page as expected. The pdf came out intact!

So your download build works as described! Thank you!

Is there a build for Linux? or build instructions?

Linux build instructions are on the PhantomJS 2.0 site.

Latest source code did not work for me. I built phantomjs, but it did not download file. Even onFilePicker event did not fire. So I used previous source code provided by agr.

For those who are interested, here is a binary with download support (it's not static) for Debian 7:
https://github.com/skornev/phantomjsbinary

+1, in the vain hope this helps move this issue to the top of the queue...

++, lets make that vain hope double

++

++

++

Can you please share some sample code for downloading with CasperJS?

@ballesdbc Please share sample code, would be very appreciated.

Thanks, this works!

@pwaldhauer
@ballesdbc
Where is the code sample? Does it work with large files too?

this works great for me:

// these two functions answer to ankitgr8 download patch
page.onFileDownload = function(status) {
    messageLog('==> Download complete');
    var name = rawPdf + '.pdf';
    return name;
}
page.onFileDownloadError = function(status) {
    messageLog('onFileDownloadError(' + status + ')');
}

// in my case, the page link will not work by "clicking" it - I need to formally open it
// the page open triggers the above function
page.open(tempLink, function(status) {
    if (status !== 'success') {
        //console.log('FAIL to load pdf');
    }
});

hope this helps everyone.

I rebased @Vitallium's download-support on 2.0 here https://github.com/sgraham/phantomjs/tree/download-support-vs-2.0 in case anyone else wants it, to save the bother of doing the merge/apply.

I followed the Linux build instructions here http://phantomjs.org/build.html and this simple example works OK. Would be great to have something in mainline!

var page = require('webpage').create();

page.onResourceReceived = function(status) {
  console.log('onResourceReceived(' + status.contentType + ')');
  if(status.stage === 'end') {
    phantom.exit(0);
  }
}

page.onResourceRequested = function(requestData, networkRequest) {
  console.log('onResourceRequested(' + JSON.stringify(requestData) + ')');
}

page.onFileDownload = function(status) {
  console.log('done');
  return 'sample.pdf';
}

page.open('http://www.cbu.edu.zm/downloads/pdf-sample.pdf');

I compiled sgraham's branch above, but there seems to be a difference with the windows build by ankitgr8 (https://github.com/ankitgr8/phantomjs2.0). The build of sgraham's branch does not trigger onFileDownload in my case (zipfile download). So for the moment, I'm running ankitgr8's thing with wine. ( yes, I know -- newbie).

(masked) output of onResourceReceived:

Received {
    "body": "",
    "bodySize": 0,
    "contentType": "application/zip;charset=UTF-8",
    "headers": [
        {
            "name": "Date",
            "value": "Thu, 14 Jan 2016 xx:xx:xx GMT"
        },
        {
            "name": "Content-Disposition",
            "value": "attachment; filename=somefilename.zip"
        },
        {
            "name": "Expires",
            "value": "0"
        },
        {
            "name": "Cache-Control",
            "value": "must-revalidate, post-check=0, pre-check=0"
        },
        {
            "name": "Pragma",
            "value": "public"
        },
        {
            "name": "Keep-Alive",
            "value": "timeout=10, max=100"
        },
        {
            "name": "Connection",
            "value": "Keep-Alive"
        },
        {
            "name": "Transfer-Encoding",
            "value": "chunked"
        },
        {
            "name": "Content-Type",
            "value": "application/zip;charset=UTF-8"
        }
    ],
    "id": XX,
    "redirectURL": null,
    "stage": "end",
    "status": 200,
    "statusText": "OK",
    "time": "2016-xx-xxTxx:xx:xx.xxxZ",
    "url": "https://domain/dir"
}
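Given a response shaped like the one above, a script could check whether it is an attachment and pull out the suggested file name. A rough sketch: `attachmentFilename` is a hypothetical helper, and it only handles the simple `filename=` form (quoted or unquoted), not the extended `filename*=` encoding.

```javascript
// Hypothetical helper: return the file name from a Content-Disposition
// attachment header, or null if the response is not an attachment.
function attachmentFilename(response) {
    var headers = response.headers || [];
    for (var i = 0; i < headers.length; i++) {
        // Header names are case-insensitive
        if (headers[i].name.toLowerCase() !== 'content-disposition') {
            continue;
        }
        var value = headers[i].value;
        if (!/^\s*attachment/i.test(value)) {
            return null;
        }
        // Matches filename=name.zip and filename="name.zip"
        var match = /filename\s*=\s*("([^"]*)"|[^;\s]+)/i.exec(value);
        if (!match) {
            return null;
        }
        return match[2] !== undefined ? match[2] : match[1];
    }
    return null;
}

// Trimmed-down copy of the response shown above
var sample = {
    contentType: 'application/zip;charset=UTF-8',
    headers: [
        { name: 'Content-Disposition', value: 'attachment; filename=somefilename.zip' },
        { name: 'Content-Type', value: 'application/zip;charset=UTF-8' }
    ]
};
console.log(attachmentFilename(sample)); // somefilename.zip
```

This only inspects headers that have already arrived; it does not by itself save anything to disk.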

Does downloading large files with PhantomJS 2.1 work?

How? How can I download this quite big file with PhantomJS, for example?
https://download.jetbrains.com/idea/ideaIC-15.0.3.tar.gz

@lanzorg no, you can't download files with PhantomJS.

Did you try my private build? It is for Windows:
https://github.com/ankitgr8/phantomjs2.0

I have also attached sample code there.

Any chance it will be available in the next release?

+1. File downloading would be a great feature for PhantomJS.

+1

+1

+1. Just wasted a couple of days converting my Selenium Firefox script that downloads a CSV to use PhantomJS only to find this ticket. Very sad now.

+1

+1

+1, the ability to download files is really needed

+1

Thanking you
Indranil Gayen


+1

This is awesome. I have been watching this issue for 2 years now and it amazes me how people keep bumping it :)

+1

+1

+Googolplex

5 years have passed!

It would be really useful to implement a download feature.

Yeah, why no download feature? ((( I want to download files with PhantomJS

I think the author is a liar and he never planned to release this feature.

I think the author is a liar and he never planned to release this feature.

Hey, the author is doing this in his spare time and is giving all of it away for free. Things get in the way all the time, and he hasn't promised a deadline for this. Please try to show some respect...

Agreed, but the issue was opened 5 years ago! Does this feature require a lot of time to implement? Because a lot of people really need it.

I would kindly ask all the haters to read about forking and pull requests.

Hey, don't blame the author please. They are doing it entirely for free.
Only God does the same. It's very rude to tell this. If you want, develop
it yourself.

Thanking you
Indranil Gayen


So many butt hurt in this thread :) )))))))))))))) .

hi, as i mentioned in my post (way back already), in our case #11484 did help. i believe it depends on what you are trying to do, but it may work for others too. not sure of the latest phantomjs status, but we are downloading files through phantomjs all right... and it's helping us a lot!

Just a friendly reminder: PhantomJS team consists of volunteers. We do not always have the spare time necessary to implement every feature request. Competing priorities are unavoidable.

If you can't live with this, see #13861 where I outlined various ways you can help us. If you're still not satisfied, pay your favorite consultant to contribute a feature. These are all way more productive than whining or even insulting our work.

Please don't confuse those of us who have upvoted the feature request with the obnoxious comments above!

There is also the electron-based nightmare.js https://github.com/segmentio/nightmare which seems to perform a lot faster, too.

I rebased @Vitallium's download-support on 2.1:
https://github.com/SeNaP/phantomjs

Could you explain what that means? Can we already use the download feature with phantomjs 2.1?

Yep

Clone repository https://github.com/SeNaP/phantomjs
Build http://phantomjs.org/build.html

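    var page = require('webpage').create();

    // Register the file-name callback once a response arrives.
    page.onResourceReceived = function (response) {
        // In this example, the returned string is used as the name of the saved file.
        page.onFilePicker = function () {
            var filename = "filename.extension"; // e.g. file.zip
            console.log("save file: " + filename);
            return filename;
        };
    };

    page.onDownloadFinished = function (status) {
        console.log('onDownloadFinished(' + status + ')');
    };
    page.onLoadFinished = function (status) {
        console.log('onLoadFinished(' + status + ')');
    };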

SeNaP, could you provide a binary for Windows? I do not know how to compile phantomjs (I tried, but it did not work).

@SeNaP
So why is it still not merged into the official phantomjs repository?
Could you please explain?

@lanzorg My wild guess is that the feature is only implemented on 2.1 and would not compile on the current master branch.
Update 1:
@Staf4 After a ridiculous amount of time I managed to compile SeNaP's version. For some odd reason it's quite a bit larger than the original, but it's working just fine:
--> phantomjs.zip
Update 2:
@SeNaP Would your fork work with Selenium (Python)?

I built the linux-64 version of @SeNaP's fork. Here is the link: phantomjs.tar.gz

I'm trying to figure out how to use it with poltergeist (https://github.com/teampoltergeist/poltergeist) so I can download files with Capybara (https://github.com/teamcapybara/capybara). Any help is welcome.

+1 😜

Did someone manage to build a Mac version of @SeNaP's fork?

+1 :100:

+1

I built the OS X binary of @SeNaP's fork.
https://github.com/jomix/phantomjs/raw/2.1/bin/phantomjs
SHA1 bbecc70411c8094e95b7b8c6f3a1403cc7edc1e3
Enjoy.

Has someone done it for Java-Selenium?

I have done it, but for which OS: Windows or Linux?

On 08-May-2017 7:22 PM, "rakeshnambiar" wrote:

Has someone done it for Java-Selenium?

Hello @ankitgr8 ... I am looking for Windows.

Thanks

Hi @ankitgr8
I am also looking for the same. I would just like to know how we can set the default download directory if we use your PhantomJS exe and download the file in Java-Selenium.

Many Thanks
Musaffir

Here's the link to download the exe if anyone needs it: https://github.com/ankitgr8/phantomjs2.0..
The above exe has the download capability.
It runs on 64-bit Windows.

To set the default download location, you can create the download JS at runtime and set the download directory at that time. Below is a sample of the JS code, which can be generated from Java. Check for "downloadFileName".

// fos (a Writer for the generated .js file), webDriver, downloadFileName
// and downloadURL are assumed to be defined elsewhere.
BufferedWriter bos = new BufferedWriter(fos);
bos.append("var page = require('webpage').create(); ");
// Copy the cookies from the Selenium session into PhantomJS.
for (Cookie ck : webDriver.manage().getCookies()) {
    bos.append(" phantom.addCookie({ name: '" + ck.getName() + "', value: '" + ck.getValue() + "', domain: '" + ck.getDomain() + "' }); ");
    bos.newLine();
}

// The download callback returns the name to save the file under.
bos.append(" page.onFileDownload = function(status){console.log('onFileDownload(' + status + ')'); return '" + downloadFileName + "'; }");
bos.newLine();
// Exit as soon as a resource reaches the 'end' stage.
bos.append(" page.onResourceReceived = function(status){console.log('onResourceReceived(' + status.stage + ')'); if(status.stage === 'end'){phantom.exit(1);}}");
bos.newLine();
bos.append(" page.onResourceRequested = function(status){console.log('onResourceRequested(' + status + ')'); }");
bos.newLine();
bos.append(" page.onFileDownloadError = function(status){console.log('onFileDownloadError(' + status + ')');phantom.exit(1);}");
bos.newLine();
bos.append(" page.onLoadStarted = function(status){console.log('onLoadStarted(' + status + ')');}");
bos.newLine();
bos.append(" page.onLoadFinished = function(status){console.log('onLoadFinished(' + status + ')');}");
bos.newLine();
// Opening the URL starts the download.
bos.append(" page.open('" + downloadURL + "');");
bos.flush();
bos.close();
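
One thing to watch with the snippet above: cookie values, the file name, and the URL are spliced straight into single-quoted JavaScript literals, so a value containing a quote or a backslash would produce a broken script. Below is a minimal sketch of one way to guard against that. The names `PhantomScriptBuilder`, `jsString`, and `buildScript` are hypothetical and introduced only for illustration, and cookies are passed as a plain map so the sketch does not depend on Selenium. `page.onFileDownload` is simply the callback name used in the comment above; it is assumed to exist in that custom build and may not exist in stock PhantomJS. The escaping covers only backslashes, single quotes, and line breaks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PhantomScriptBuilder {

    /** Quotes a Java string as a single-quoted JavaScript string literal. */
    static String jsString(String value) {
        StringBuilder sb = new StringBuilder("'");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash -> \\
                case '\'': sb.append("\\'");  break; // quote     -> \'
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.append("'").toString();
    }

    /** Builds the download script text; every embedded value goes through jsString. */
    static String buildScript(Map<String, String> cookies, String cookieDomain,
                              String downloadFileName, String downloadUrl) {
        StringBuilder js = new StringBuilder();
        js.append("var page = require('webpage').create();\n");
        for (Map.Entry<String, String> c : cookies.entrySet()) {
            js.append("phantom.addCookie({ name: ").append(jsString(c.getKey()))
              .append(", value: ").append(jsString(c.getValue()))
              .append(", domain: ").append(jsString(cookieDomain))
              .append(" });\n");
        }
        // Assumption: onFileDownload is the hook of the custom build discussed above.
        js.append("page.onFileDownload = function (status) { return ")
          .append(jsString(downloadFileName)).append("; };\n");
        js.append("page.open(").append(jsString(downloadUrl)).append(");\n");
        return js.toString();
    }

    public static void main(String[] args) {
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("session", "it's-a-value"); // the quote would break naive concatenation
        System.out.print(buildScript(cookies, "example.com",
                "report.zip", "http://example.com/report"));
    }
}
```

The returned string can be written to a file with the same `BufferedWriter` approach and then passed to the phantomjs executable.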

@ankitgr8 Many thanks... I will come back to you if I face any issues.

@ankitgr8 I am working on Windows 10. Do you know where the PhantomJS default download directory is?

I'm not sure what the path is, but you can print the path in your JS script to get the details.

If you hit this wall when building on a MacBook (Xcode):

Xcode not set up properly. You may need to confirm the license
   agreement by running /usr/bin/xcodebuild without arguments.
ERROR: Failed to build PhantomJS! Configuration of Qt Base failed.

First double-check that you have the full Xcode (not just the Command Line Tools version). If the build still fails, the following should work around it:

cd /Applications/Xcode.app/Contents/Developer/usr/bin/
sudo ln -s xcodebuild xcrun

I am also looking to set up a default download directory for PhantomJS,
so that when I click on an element, the file is downloaded into that directory. Please suggest how.

Windows or Linux?

On 04-Dec-2017 5:50 PM, "Yogesh" wrote:

I am also looking for setting up PhantomJs default download directory.
So when I click on element it should download in that directory. Please
suggest

On both Windows and Linux.

Due to our very limited maintenance capacity (see #14541 for more details), we need to prioritize our development focus on other tasks. Therefore, this issue will be automatically closed. In the future, if we see the need to attend to this issue again, then it will be reopened.
Thank you for your contribution!

😱
