Attach (recommended) or Link to PDF file here:
https://jsfiddle.net/p3ybwp7d/1/
Configuration:
Steps to reproduce the problem:
Create loading task and add onProgress callback:
let loadingTask: any = PDFJS.getDocument(this.src);
loadingTask.onProgress = (progressData) => {
// progressData won't contain "total", only "loaded"
};
What is the expected behaviour? (add screenshot)
In previous versions, onProgress did return both total and loaded.
What went wrong? (add screenshot)
The total field is undefined in the loadingTask.onProgress callback.
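Until the cause is found, a progress handler can be written so that it tolerates a missing total. The following is a minimal sketch that assumes only the { loaded, total } payload shape described above; the helper name formatProgress is hypothetical and not part of PDF.js.

```javascript
// Hypothetical helper, not part of PDF.js: turns a progress payload into a
// display string, falling back to a byte count when `total` is missing.
function formatProgress(progressData) {
  const { loaded, total } = progressData;
  // `total` may be undefined (the case reported here) or zero.
  if (typeof total === "number" && total > 0) {
    const percent = Math.min(100, Math.round((loaded / total) * 100));
    return `${percent}%`;
  }
  return `${loaded} bytes`;
}

// Usage with a loading task (assumes `loadingTask` comes from getDocument):
// loadingTask.onProgress = (data) => console.log(formatProgress(data));
```

With this shape the UI degrades to a byte counter instead of showing NaN when total is absent.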
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
https://jsfiddle.net/p3ybwp7d/1/
Is this Chrome-specific? In Firefox I get the following output, which looks good:
PDF.js ProgressData
{"loaded":14480,"total":1016315}
{"loaded":836400,"total":1016315}
{"loaded":1016315,"total":1016315}
It's related to fetch, since you can observe the same issue in Firefox too (with the dom.streams.enabled and javascript.options.streams prefs set in about:config).
@timvandermeij, can I try this issue?
Since this issue can be observed in two different browsers, are we sure that this isn't a problem with the Fetch standard[1] itself?
If it's a Fetch standard limitation, or perhaps a browser one, then I'm not sure if we'd be reasonably able to fix this in the PDF.js library.
[1] This looks like the relevant part of the specification: https://fetch.spec.whatwg.org/#terminology-headers
Basically, while using the Firefox browser, total is defined in this manner:
https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/network.js#L427
and while using the Chrome browser, total is defined in this way, because it uses fetch_stream:
https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/fetch_stream.js#L152
So basically this._contentLength is undefined in both cases but in the first case data.total contains the total count so it doesn't cause a problem. So we can try to do something similar in the second case too.
Shall I give this a try?
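The difference described in this comment can be sketched as follows. This is an illustration under assumptions, not the PDF.js source: resolveTotal is a hypothetical name, the XHR path is assumed to receive a ProgressEvent-like object (lengthComputable, loaded, total), and the fetch path is assumed to have only a stored content length to fall back on.

```javascript
// Hypothetical sketch, not the PDF.js source.
// `evt` is a ProgressEvent-like object or null; `contentLength` is whatever
// length the stream stored earlier (it may be undefined).
function resolveTotal(evt, contentLength) {
  // XHR progress events carry their own total when the length is computable.
  if (evt && evt.lengthComputable) {
    return evt.total;
  }
  // A fetch-based stream has no such event, so only the stored value remains.
  return contentLength;
}
```

If the stored content length is undefined and no progress event is available, the result is undefined, which matches the symptom reported in Chrome.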
So basically this._contentLength is undefined in both cases but in the first case data.total contains the total count so it doesn't cause a problem.
In fetch_stream we are setting this._contentLength = source.length here and both source.length and data.total are logically same things. So I don't think doing it the way you are thinking is going to solve the problem.
@mukulmishra18, OK, let me look into this.
@mukulmishra18, could you explain when the PDFFetchStream constructor is called? https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/fetch_stream.js#L34 I just want to see where the source argument is passed from, as source.length turns out to be undefined.
First we are setting PDFNetworkStream at https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/pdf.js#L35-L45 based on the environment and support for streams. So if streaming is supported by the browser, then PDFNetworkStream is PDFFetchStream.
After this we are calling this constructor with all the provided params from: https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/api.js#L255
If you want to see what these params are, you can read here: https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/api.js#L98-L140
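The selection step described above can be sketched roughly like this. It is a simplified illustration, not the code at the linked lines: pickNetworkStream and its env parameter are hypothetical, and the exact feature checks PDF.js performs are an assumption here.

```javascript
// Hypothetical sketch of choosing a transport by feature detection.
// `env` stands in for the global object so the check can be exercised
// outside a browser; the real check may differ.
function pickNetworkStream(env) {
  const hasStreamingFetch =
    typeof env.Response !== "undefined" &&
    "body" in env.Response.prototype &&
    typeof env.ReadableStream !== "undefined";
  // Streaming support selects the fetch-based stream; otherwise fall back
  // to the XHR-based one.
  return hasStreamingFetch ? "PDFFetchStream" : "XHR-based PDFNetworkStream";
}
```

This would explain why Chrome, and Firefox with the streams prefs enabled, take the fetch path while default Firefox takes the XHR path.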
@mukulmishra18, thanks for all this explanation; now I understand things better.
In fetch_stream we are setting this._contentLength = source.length here and both source.length and data.total are logically same things.
But when I debugged this I found that in both cases, i.e. with fetchStream or without it, source.length is always undefined. I also found that the source object has no length property.
https://github.com/mozilla/pdf.js/blob/fad2a3f427db76033873200b77ecb137420a7119/src/display/network.js#L433
In this line data.total contains the total size.
https://github.com/mozilla/pdf.js/blob/fad2a3f427db76033873200b77ecb137420a7119/src/display/network.js#L142
In this line evt is actually the data object used above. So should I try doing something like this in the fetchStream case too?
Or
Can you suggest me something ? Because I can see that @Snuffleupagus was right on this as I think this is a fetch API limitation
Can you suggest me something ?
I will suggest you to create a simple PDF.js app and try to run in Chrome and check if you are getting right headers(especially Content-Length) somewhere: https://github.com/mozilla/pdf.js/blob/237bc2ef9df204069c4996e14433e0a35123444a/src/display/fetch_stream.js#L106
I also think it may be a problem of fetch standard or browser as mentioned in https://github.com/mozilla/pdf.js/issues/9103#issuecomment-357745218. If that is the case, we can't do a lot in PDF.js to fix this.
I will suggest you to create a simple PDF.js app
I have been using the app all the time, mentioned in https://github.com/mozilla/pdf.js/issues/9103#issue-271184807
check if you are getting right headers(especially Content-Length) somewhere:
That's what I'm trying to say: I can't find the content.length, and it always comes out to be undefined. So I now also think that this is a shortcoming of the Fetch API.
FYI, _contentLength is set at:
and the value originates from https://github.com/mozilla/pdf.js/blob/6b7e2cbcd1fbfd68c17f92178ce47df7f6665c31/src/display/network_utils.js#L42-L47
But this value is guarded behind range requests. Before the above snippet, the function returns early if Range requests are disabled:
https://github.com/mozilla/pdf.js/blob/6b7e2cbcd1fbfd68c17f92178ce47df7f6665c31/src/display/network_utils.js#L23-L42
I don't see an immediate reason for blocking that, so perhaps it makes sense to unconditionally use the value of the Content-Length header (unless Transfer-Encoding is specified but not starting with identity).
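The proposal in this comment could look roughly like the sketch below. It is not the PDF.js implementation: contentLengthFromHeaders is a hypothetical helper, and getHeader(name) is assumed to return the header value as a string, or null when the header is absent.

```javascript
// Hypothetical helper, not the PDF.js implementation.
// Reads Content-Length regardless of whether range requests are enabled,
// unless a non-identity Transfer-Encoding makes the value unreliable.
function contentLengthFromHeaders(getHeader) {
  const encoding = (getHeader("Transfer-Encoding") || "").trim().toLowerCase();
  if (encoding && !encoding.startsWith("identity")) {
    return undefined; // body is re-encoded in transit; do not trust the length
  }
  const length = parseInt(getHeader("Content-Length"), 10);
  // A missing or malformed header parses to NaN and is reported as undefined.
  return Number.isInteger(length) && length >= 0 ? length : undefined;
}
```

The returned value could then be used as the suggested length before any range-request checks, so that progress reporting gets a total even when range requests are disabled.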
Hi! This is my first comment on GitHub, sorry if I make a mistake.
I kind of found a solution:
The suggestedLength of returnValues is never really updated with the calculated length.
So I did:
returnValues.suggestedLength = length;
before any "if".
Also, the HTTP header name must match, so pay attention to Content-Length vs. Content-length (case sensitive).
We just updated to 1.10.97 which broke our loading task. As a (hopefully temporary) workaround, we do a header-only fetch for the content-length header beforehand:
const total = await fetch(new Request(documentUrl, { method: 'HEAD', credentials: 'include' }))
.then(res => parseInt(res.headers.get('content-length'), 10));
const loadingTask = window.PDFJS.getDocument({url: documentUrl, withCredentials: true});
loadingTask.onProgress = ({ loaded }) => {
// do stuff with `loaded` and `total`
};
Adding good-beginner-bug label because I've already explained the issue and how it can be fixed in https://github.com/mozilla/pdf.js/issues/9103#issuecomment-363436612
Can anybody help me with the architecture of pdf.js? I'm working on the project.