Pdf.js: Memory Leaking with pdf viewer

Created on 28 Aug 2018 · 15 Comments · Source: mozilla/pdf.js

Hi, I am using Bower, RequireJS, and Backbone.js, which I believe is all that is necessary to know about my web software. I am trying to integrate this viewer and am including everything necessary (pdf.js, the worker, and web/viewer).

The issue is that no matter what I delete and remove, the viewer still holds memory. Is there a way to tell the pdf_viewer to remove all references to its children so that they are removed from memory?
[Images: heap, destroy, howimloading]

I create a new viewer when the view containing the PDF display is created (which is necessary for our software). So I need it to clear itself out completely on destroy, since everything is AJAX and asynchronous.

I have gotten it down to a small amount of memory leaking, but it is still holding onto old references. Attached are heap screenshots and the code for what I am doing. Please let me know what else I need to call to release the remaining memory.

Thank you

Labels: 1-viewer, 2-performance

All 15 comments

This is after leaving the view that generates the viewer 4 times (so it created 4 PDFViewer instances). I delete them, and I even go through the pages and delete the annotationLayerFactory, textLayerFactory, and renderingQueue, as well as the linkService and renderingQueue of the viewer. As you can see, it is still holding memory and does not free anything (even when "deleting" them). Please help.
[Image: comparison]

The viewer does it like this: https://github.com/mozilla/pdf.js/blob/4ea663aa8a23f7e77108223b1698b3380b5211d7/web/app.js#L1230-L1241

Perhaps the ordering is important here, I'm not sure. Also, which PDF.js version are you using?

It would be interesting to know if that doesn't work correctly since that may indicate a memory leak in the cleanup code.
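
For readers comparing their own cleanup against the viewer's, here is a minimal sketch of one possible teardown order. The helper name `teardownViewer` and the shape of the `state` object are hypothetical; the calls it makes (`destroy()` on the loading task, `setDocument(null)` on the viewer and the link service) do exist in PDF.js, but whether this order releases all memory is exactly the open question in this thread.

```javascript
// Hypothetical helper (not part of PDF.js): detach the document from the
// viewer components and destroy the loading task.
// `state` is an assumed holder: { loadingTask, pdfViewer, linkService, container }.
function teardownViewer(state) {
  const { loadingTask, pdfViewer, linkService, container } = state;

  // Begin destroying the loading task first.
  const destroyed = loadingTask ? loadingTask.destroy() : Promise.resolve();

  // Detach the document from every component that may still reference it.
  if (pdfViewer) pdfViewer.setDocument(null);
  if (linkService) linkService.setDocument(null, null);

  // Clear any rendered pages that remain in the DOM.
  if (container) container.textContent = "";

  // Drop our own references so nothing in `state` keeps the objects alive.
  state.loadingTask = null;
  state.pdfViewer = null;
  state.linkService = null;
  state.container = null;

  // Resolves once the loading task has finished its own cleanup.
  return Promise.resolve(destroyed);
}
```

Calling `teardownViewer(state).then(...)` from a view's destroy hook would let the caller wait for the cleanup before taking a heap snapshot.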

This is the bower file from the pdfjs-dist package that I integrated just last week:
"name": "pdfjs-dist",
"version": "2.0.550",
"main": [
"build/pdf.js",
"build/pdf.worker.js"
],
"ignore": [],
"keywords": [
"Mozilla",
"pdf",
"pdf.js"
]

Hi Tim,
I'm using pdf.js 1.4.20 and opening the PDF file in a modal window in an ASP.NET application. In the production environment, whenever multiple documents are opened simultaneously by users, stack overflow exceptions occur continuously. The following is the error:

Faulting application name: w3wp.exe, version: 8.5.9600.16384, time stamp: 0x52157ba0
Faulting module name: KERNEL32.DLL, version: 6.3.9600.17415, time stamp: 0x545049be
Exception code: 0xc00000fd
Fault offset: 0x0001bb0b
Faulting process id: 0x14bc
Faulting application start time: 0x01d453e495a2c365
Faulting application path: C:\Windows\SysWOW64\inetsrv\w3wp.exe
Faulting module path: C:\Windows\SYSTEM32\KERNEL32.DLL
Report Id: d3d7358f-bfd7-11e8-80e1-00505695600f
Faulting package full name:
Faulting package-relative application ID:

Are there any known memory-usage issues when multiple documents are opened and accessed, such that even after the documents are closed the references are not released and the heap is not cleared? If so, is this addressed in newer versions of PDF.js, such as 1.9.x?

Please help me to identify and resolve the issue.

Thank you.

As far as I can see, there are quite a few memory leaks in the viewer. After removing the PDF viewer from the DOM and tidying up, the memory snapshot looks like this:

[Image: memory snapshot]

Any update on this?

Unfortunately not. It's a difficult topic, and I've got almost no free time this month. I hope to be able to return to the project soon.

You don't happen to have both free time and a lot of curiosity? I always appreciate pull requests! Or hints, to put the bar a bit lower :).

I have gotten memory to drop low enough that it isn't a huge problem, but it still leaks a little memory, so it would be nice to have a fix for this sometime.

Possible dupe of #9902

Both seem to be related to a memory leak...

If someone has a simple example that I can run locally that demonstrates this leak, it will be much easier to look into.

Unfortunately I do not. If you create a viewer with new PDFViewer.PDFViewer, then call PDFJS.getDocument, then clean up and remove it from the DOM, and then run it all again, you will see leaks in dev tools -> Memory.

I provided all the code necessary to ensure that it leaks in the initial post.

For those who want my temporary solution for fewer memory leaks: I create the viewer only once and then reuse that same viewer throughout the software, so the only thing that leaks is a little PDFDocument memory each time it loads a new document. Since PDFViewer was the main culprit of the leaking, I don't remove it from memory once it's created and just use the pointer to that viewer everywhere we display PDFs.

EDIT: Also, we're using pdfjs-dist with Bower, if that makes any difference for your testing purposes.
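
The reuse workaround described above could be sketched roughly as follows. The names `getSharedViewer`, `showDocument`, and `createViewer` are hypothetical and not part of PDF.js; the sketch only assumes that the viewer exposes `setDocument` and that a loading task exposes `promise` and `destroy()`, as PDF.js loading tasks do.

```javascript
// Hypothetical module-level cache: the viewer is built once and then reused.
let sharedViewer = null;

function getSharedViewer(createViewer) {
  if (sharedViewer === null) {
    // `createViewer` is an assumed factory, e.g. one that wraps `new PDFViewer(...)`.
    sharedViewer = createViewer();
  }
  return sharedViewer;
}

// Swap the displayed document without recreating the viewer.
function showDocument(viewer, previousTask, nextTask) {
  // Detach the old document before releasing it.
  viewer.setDocument(null);
  const released = previousTask ? previousTask.destroy() : Promise.resolve();

  return Promise.resolve(released)
    .then(() => nextTask.promise)
    .then((pdfDocument) => {
      viewer.setDocument(pdfDocument);
      return pdfDocument;
    });
}
```

With this shape, each view asks for `getSharedViewer(...)` instead of constructing its own instance, so only per-document objects are created and released between views.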

I am using version 2.2.15, and it still suffers from this issue. It makes the viewer difficult to use for documents above a certain number of pages, with the limit depending on the client machine. In my case (Win10/Chrome) the limit is ~200 pages before it crashes the browser. My colleague's Mac/FF doesn't crash for 500 pages, but eats about 7 GB of RAM.

Labeling as a performance issue for the viewer so we can find it more easily.

Perhaps you could try the latest development version, i.e., version 2.2.189 as found on https://mozilla.github.io/pdf.js/web/viewer.html or the latest master checkout, to see if it helps. Some fixes were done very recently with regards to less object creation (intermediate strings and primitive objects) and cache clearing (images and primitive objects), so perhaps that improves the situation a bit.

In the memory usage report I do see a lot of page proxy and page view objects. From the screenshot in https://github.com/mozilla/pdf.js/issues/10021#issuecomment-417007360 I can at least say that the PDFObjects is kept alive by the PDFPageProxy and the PDFPageProxy is probably kept alive by: https://github.com/mozilla/pdf.js/blob/fef86cc3e327f0554136482fee901b52909fa37a/src/display/api.js#L2213
Perhaps this is a start for looking into where the cleanup is going wrong.

Can report the same/similar issue (using Firefox 68.0.1 32-bit) when embedding PDF files into the HTML through the object/embed tag.
I am loading a lot of different PDF files (~200 KB) one after another while staying on the same site, with only one PDF rendered at any given time. I am not sure why the garbage collector is neglecting them in the first place, since there isn't any reference to the file after loading another one. Even IE11 handles the situation as expected.

[Images: task, memory]

Having the same problem in Firefox when using object/embed with Firefox's PDF application. Is the Promise or wrapper tied to Firefox's copy of PDF.js what is preventing garbage collection?

[Image: FireFox_PDFJS_Memory]

Different approach: using the same code, I forced a hosted copy of PDF.js to load into an iframe (as for IE, Edge, and Safari) instead of using Firefox's PDF application version of PDF.js. With this approach, memory is released as desired when the PDF data and PDF.js are removed.

[Image: FireFox_PDFJS_Memory_released]
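
A rough sketch of the iframe approach, under stated assumptions: `mountViewerFrame` is a hypothetical helper, `viewerUrl` is an assumed path to a self-hosted copy of web/viewer.html, and the `file` query parameter is the one the default viewer reads. Passing `doc` in explicitly (normally the page's `document`) is only a choice to keep the sketch self-contained.

```javascript
// Hypothetical helper: show a PDF through a self-hosted viewer inside an iframe
// and return a function that removes the frame again.
function mountViewerFrame(doc, parent, viewerUrl, fileUrl) {
  const frame = doc.createElement("iframe");
  // The default viewer takes the document URL from the `file` query parameter.
  frame.src = viewerUrl + "?file=" + encodeURIComponent(fileUrl);
  parent.appendChild(frame);

  // Removing the frame discards the viewer together with its whole browsing context.
  return function dispose() {
    if (frame.parentNode) {
      frame.parentNode.removeChild(frame);
    }
  };
}
```

Because the viewer lives in its own browsing context, removing the frame drops the viewer's objects with it, which would be consistent with the memory release reported above.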
