Configuration:
Steps to reproduce the problem:
My code is:
import pdflib from 'pdfjs-dist'
pdflib.PDFJS.workerSrc = './node_modules/pdfjs-dist/build/pdf.worker.entry.js'
exactly as described in https://github.com/mozilla/pdf.js/wiki/Setup-pdf.js-in-a-website#with-webpack,
yet I get a 'Warning: Setting up fake worker.' in my console when I actually load a document, which makes it seem like the instructions did not work.
Additionally the wording on the instructions seems wrong as "PDFJS.workerSrc _shall_ be set to point to this file" (the current wording) means that it's automatically set, whereas "PDFJS.workerSrc _should_ be set to point to this file" means you have to set it yourself.
The example code also isn't extremely helpful as it uses the relative paths into the repository (pdfjsLib.PDFJS.workerSrc = '../../build/webpack/pdf.worker.bundle.js';) that a person importing PDFJS would not be able to do.
I'm also confused as I searched through issues/PRs that are < 1 year old like https://github.com/mozilla/pdf.js/pull/6595 that do some automatic loading of the worker script, but that code seems to no longer exist in the repo, so both setting and not setting the workerSrc cause the fake worker to load for me... The fake worker knows where to find the worker script built by webpack (e.g. 1.bundle.js is the worker when bundle.js is the script), so I'm confused why a real worker couldn't use this logic as well.
I've tried pointing workerSrc to the 1.bundle.js file created and even using webpack's worker-loader to instantiate and replace PDFWorker (pdflib.PDFJS.PDFWorker = require('worker!pdfjs-dist/build/pdf.worker.entry.js')), but that didn't work either, so I'm completely lost as to how the worker is supposed to work with webpack.
The fake worker knows where to find the worker script built by webpack (e.g. 1.bundle.js is the worker when bundle.js is the script), so I'm confused why a real worker couldn't use this logic as well.
Check whether bundle.js includes the worker -- it's wrong (for page-loading performance and size) to have it there. The entire pdf.worker.js file shall be placed into a separate bundle.
The example code also isn't extremely helpful as it uses the relative paths into the repository (pdfjsLib.PDFJS.workerSrc = '../../build/webpack/pdf.worker.bundle.js';) that a person importing PDFJS would obviously not be able to do (not a very useful example).
pdf.worker.bundle.js is a file you create as bundle output that includes the pdf.worker.js module (imported from pdfjs-dist).
The description of the issue is not clear. Can you provide complete example source code?
Check whether bundle.js includes the worker -- it's wrong (for page-loading performance and size) to have it there. The entire pdf.worker.js file shall be placed into a separate bundle.
Checked the bundled code and can confirm it does not include the worker. As I mentioned, the worker script is bundled as 1.bundle.js. Upon loading a PDF, a script tag for 1.bundle.js is inserted into my <head> tag (not sure if this is from PDFJS or webpack).
pdf.worker.bundle.js is a file you create as bundle output that includes the pdf.worker.js module (imported from pdfjs-dist).
Is there an example that uses the first, and arguably preferred, method in the Wiki of loading the worker script from node_modules? From the wiki section: "The worker shall be built into a separate bundle: take the file "./node_modules/pdfjs-dist/build/pdf.worker.entry.js"
@yurydelendik could you elaborate on the auto-detection/loading of the worker in #6595 that seems to no longer be in the codebase? I'm building a library that uses pdf.js, so if someone had to import pdf.js code to make my library work, that would be fairly tedious (managing dependencies of dependencies).
The description of the issue is not clear. Can you provide complete example source code?
I didn't attach source code as there's really not much else helpful or relevant, but here's a ~50 line summary: http://pastebin.com/raw/PHs6bfby. The file argument is literally a file from an <input type='file' /> element.
@yurydelendik could you elaborate on the auto-detection/loading of the worker in #6595 that seems to no longer be in the codebase?
This functionality is not intended for bundlers/packagers.
I'm building a library that uses pdf.js, so if someone had to import pdf.js code to make my library work, that would be a bit annoying (managing dependencies of dependencies).
So far we have not found a bundler that properly manages web workers, and we don't want to give preference to webpack or browserify -- we had problems supporting both at the same time.
The solution to managing dependencies of dependencies is not trivial. But keep in mind that efficient parsing and rendering of PDFs is not a trivial task either. If you abandon the web worker, and you are free to do that, the UI performance will suffer and that will be your trade-off.
I didn't attach source code as there's really not much else helpful or relevant
You can publish a small example of a library that demonstrates the intent of what you are trying to achieve. The provided code snippets are not useful since they are not runnable and are not what you are trying to build -- a library.
I am experiencing the same issue. See https://cdn.kidoju.com/support/viewer/index.html.
The code can be found at https://github.com/kidoju/Kidoju-Help. Use the build cmd file.
See also https://github.com/kidoju/Kidoju-Help/issues/2.
This functionality is not intended for bundlers/packagers.
Didn't realize that 👍
So far we have not found a bundler that properly manages web workers, and we don't want to give preference to webpack or browserify -- we had problems supporting both at the same time.
The solution to managing dependencies of dependencies is not trivial. But keep in mind that efficient parsing and rendering of PDFs is not a trivial task either. If you abandon the web worker, and you are free to do that, the UI performance will suffer and that will be your trade-off.
@yurydelendik I'd agree with you but I don't think that trade-off needs to be made. Webpack has worker-loader and Browserify has webworkify -- wouldn't detecting the bundler system and using one or the other completely solve this problem?
Seems like it could be done here: https://github.com/mozilla/pdf.js/blob/master/src/display/api.js#L1132, using the direct path to the worker entry with
var worker = require('worker!../pdf.worker.entry.js')
in webpack or
var worker = require('webworkify')('../pdf.worker.entry.js')
in browserify.
If you think I hit the mark with that, I'd be happy to write a PR for that.
You can publish a small example of a library that demonstrates the intent of what you are trying to achieve. The provided code snippets are not useful since they are not runnable and are not what you are trying to build -- a library.
The snippet I attached is all of the code that would be in the library for now (pdf-to-dataURL). I could make a quick example that uses that snippet if @jlchereau's example isn't good enough (it doesn't seem to require pdfjs-dist from NPM, so not sure about the accuracy of it).
Webpack has worker-loader and Browserify has webworkify -- wouldn't detecting the bundler system and using one or the other completely solve this problem?
Yes, I tried that at #6785, later at #6791 and then reverted that. Having require('worker!... causes issues in browserify, and require('webworkify')(...) causes issues in webpack.
If you think I hit the mark with that, I'd be happy to write a PR for that.
Yes, a working PR would be really good to have. So far pdfjs-dist somehow works with webpack and browserify, along with system.js and node.js, and we will try to keep it this way. Thanks.
Also notice that if a worker is not available for some reason (security or just a legacy browser), it shall load the code as a script tag. I was planning to add an overloaded constructor for PDFWorker that would accept a web worker as a parameter -- this might provide an alternative solution for some webpack/browserify use cases.
btw, webpack has entry-loader which can be used with workerSrc
Yes, I tried that at #6785, later at #6791 and then reverted that. Having require('worker!... causes issues in browserify, and require('webworkify')(...) causes issues in webpack.
But wouldn't your __webpack_require__ check here https://github.com/mozilla/pdf.js/pull/6785/commits/79c2f69c3288494c5ce2809182c896484bf4be5c#diff-5f93a8d6c23cf0a169c6ee7347477ce8R30 prevent browserify from parsing that statement? (not sure if the ensure part was causing problems)
Yes, a working PR would be really good to have. So far pdfjs-dist somehow works with webpack and browserify, along with system.js and node.js, and we will try to keep it this way. Thanks.
I'll probably get to this later next week -- is there a test to run to check against various bundlers/platforms?
I was planning to add an overloaded constructor for PDFWorker that would accept a web worker as a parameter -- this might provide an alternative solution for some webpack/browserify use cases.
I think this would be a fantastic alternative! Specifically, if it could accept a Worker class, then webpack folks could use something like: webworkify-webpack and browserify folks use webworkify and just pass the loader/Worker as an argument. So it would be
var worker = new WorkerFromArgs('../pdf.worker.entry.js')
in the overloaded case.
This would offload the configuration of worker loaders logic into user-land so potentially messy PRs that check for platform/bundler into pdf.js aren't necessary (it would be on the user to install the proper loader in any case)
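To make the user-land idea above a bit more concrete, here is a minimal sketch of what "pass the Worker class in as an argument" could look like. Everything in it is hypothetical: createWorkerProvider and StubWorker are names made up for illustration and are not part of pdf.js, and a real setup would pass whatever Worker class the bundler-specific loader provides instead of the stub.

```javascript
// Hypothetical sketch: none of these names are pdf.js APIs.
// The caller supplies the Worker class, so the library never needs to
// know which bundler (webpack, browserify, ...) produced it.
function createWorkerProvider(WorkerCtor, workerPath) {
  if (typeof WorkerCtor !== 'function') {
    throw new TypeError('WorkerCtor must be a constructor');
  }
  // Defer construction until the library actually needs the worker.
  return () => new WorkerCtor(workerPath);
}

// Stand-in for a bundler-provided Worker class, so the sketch runs anywhere.
class StubWorker {
  constructor(path) {
    this.path = path;
  }
}

const provideWorker = createWorkerProvider(StubWorker, '../pdf.worker.entry.js');
console.log(provideWorker().path); // ../pdf.worker.entry.js
```

The point of the design is only that the bundler-specific part stays in the application, which is what the comment above is asking for.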
yet I get a Warning: Setting up fake worker.' in my console when I actually load a document, which makes it seem like the instructions did not work.
That's awesome, but as stated above the issue is not addressable without a complete example. Shall we close this one and wait for the PR?
@jlchereau gave one example https://github.com/mozilla/pdf.js/issues/7612#issuecomment-245973303 where you can similarly see Warning: Setting up fake worker in the console, and I can give another if need be.
This issue is still relevant as workerSrc should work in the current implementation, but doesn't.
In any case, the PR would resolve this issue so shouldn't this be left open for tracking until then?
I would also like to hear your feedback on my questions above before starting a PR (regarding why browserify complained when you tried checking __webpack_require__, as I would do the same in my PR, and whether there are any tests I should run to test all bundlers/platforms simultaneously).
@agilgur5, you say:
The snippet I attached is all of the code that would be in the library for now (pdf-to-dataURL).
I could make a quick example that uses that snippet if @jlchereau's example isn't good enough
(it doesn't seem to require pdfjs-dist from NPM, so not sure about the accuracy of it).
See https://github.com/kidoju/Kidoju-Help/blob/master/src/main.js and uncomment as you see fit:
require('../web/viewer.css');
require('../web/compatibility.js');
// require('pdfjs-dist/web/compatibility.js');
require('../web/l10n.js');
window.pdfjsDistBuildPdf = require('../build/pdf.js');
// window.pdfjsDistBuildPdf = require('pdfjs-dist/build/pdf.js');
// require('../web/debugger.js');
require('./viewer.js');
The reason I have been trying both is that https://github.com/mozilla/pdf.js/blob/master/web/viewer.js and https://github.com/mozilla/pdfjs-dist/blob/master/web/pdf_viewer.js are not the same, and I deemed it more relevant to keep all files from the same source/version.
Anyway, both exhibit the same behaviour as far as the worker is concerned.
@yurydelendik it doesn't seem like you checked out @jlchereau's example yet. I also made pdf-to-dataURL as a tiny package that exhibits this bug.
I was planning to add an overloaded constructor for PDFWorker that would accept a web worker as a parameter -- this might provide an alternative solution for some webpack/browserify use cases.
Are there any updates on this? As I previously mentioned, I feel that's a much better solution than the one I proposed (to which you hadn't answered the questions I asked so I couldn't really make a PR anyway) and is far more generic for future use cases and other bundlers.
I ran into the same issue with my webpack project, but I solved it differently. Instead of relying on webpack's bundling or loaders, I made use of the CopyWebpackPlugin to copy the worker source into my build directory.
In my viewer:
import pdfjsLib from 'pdfjs-dist';
if (process.env.NODE_ENV !== 'production') {
  // in dev, get it from the node_modules directory
  // NOTE: don't use the "entry" file -- the script will fail and the web worker will not be used
  pdfjsLib.PDFJS.workerSrc = `${process.env.APP_ROOT}/node_modules/pdfjs-dist/build/pdf.worker.js`;
} else {
  // in prod, get it from the build directory
  pdfjsLib.PDFJS.workerSrc = `${process.env.APP_ROOT}/build/pdf.worker.js`;
}
In webpack.config.js:
const CopyWebpackPlugin = require('copy-webpack-plugin');
return {
  //... rest of config omitted
  plugins: [
    // array-of-patterns form, as in the original post; newer plugin versions may expect a different options shape
    new CopyWebpackPlugin([{
      from: 'node_modules/pdfjs-dist/build/pdf.worker.js',
      to: 'pdf.worker.js'
    }])
  ]
};
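The dev/prod switch in the viewer code above can also be pulled into a small pure function, which makes it easy to check without a browser or a bundler. This is only a sketch: resolveWorkerSrc is a made-up helper name, and the '/app' root is a placeholder for whatever process.env.APP_ROOT holds in a real project.

```javascript
// Hypothetical helper (not part of pdf.js): choose the worker URL per environment.
function resolveWorkerSrc(nodeEnv, appRoot) {
  // Drop trailing slashes so the joined path never contains "//".
  const base = appRoot.replace(/\/+$/, '');
  return nodeEnv === 'production'
    ? `${base}/build/pdf.worker.js` // copied there by CopyWebpackPlugin
    : `${base}/node_modules/pdfjs-dist/build/pdf.worker.js`; // served straight from node_modules
}

console.log(resolveWorkerSrc('production', '/app/'));
// /app/build/pdf.worker.js
console.log(resolveWorkerSrc('development', '/app'));
// /app/node_modules/pdfjs-dist/build/pdf.worker.js
```

The result would then be assigned to the workerSrc setting exactly as in the viewer snippet.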
@agilgur5, I just ran into this issue and it was because I was using the CommonsChunkPlugin. In my case, the webworker was loading but ran into an error Uncaught ReferenceError: webpackJsonp is not defined (because that code got pulled into a common chunk and wasn't available to the worker). This caused the worker to exit early and fall back to the fake implementation.
You can either not use the CommonsChunkPlugin or use the solution @ctowler suggested.
Hope this solves your problem.
Hey all,
I was struggling a lot to get pdf.js working with Webpack. The key thing is I didn't want the worker to be in a separate file.
If anyone is facing problems like the 'Warning: Setting up fake worker.' message, here is how I solved it. I included raw-loader in my package.json to be able to import files as plain text.
"raw-loader": "latest",
I configured Webpack so that pdf.worker.js is loaded via raw-loader.
module: {
  rules: [
    {
      test: /pdf\.worker(\.min)?\.js$/,
      use: 'raw-loader',
    },
    {
      test: /\.(js|jsx)$/,
      exclude: [/node_modules/, /pdf\.worker(\.min)?\.js$/],
      use: 'babel-loader',
    },
  ],
},
Now comes the tricky part. The only way to pass a worker to PDF.js is via the workerSrc setting. Unfortunately, doing stuff like
pdfjsLib.PDFJS.workerSrc = require('pdfjs-dist/build/pdf.worker');
won't work.
BUT! We can create URLs on the fly from Blobs, and we can create Blobs from strings on the fly!
So, the working solution for me was:
const pdfjsLib = require('pdfjs-dist');
const pdfjsWorker = require('pdfjs-dist/build/pdf.worker.min');
const pdfjsWorkerBlob = new Blob([pdfjsWorker]);
const pdfjsWorkerBlobURL = URL.createObjectURL(pdfjsWorkerBlob);
pdfjsLib.PDFJS.workerSrc = pdfjsWorkerBlobURL;
🎉 :D
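The Blob step above can be tried in isolation, without pdf.js or raw-loader. The sketch below assumes an environment where Blob and URL.createObjectURL are available (browsers, and recent Node.js versions); createBlobUrlFromSource and the one-line workerSource are placeholders standing in for the string that raw-loader would hand back.

```javascript
// Minimal sketch: turn a script's source text into a blob: URL.
// `source` stands in for the worker code imported as plain text.
function createBlobUrlFromSource(source) {
  const blob = new Blob([source], { type: 'application/javascript' });
  return URL.createObjectURL(blob);
}

// Placeholder worker body, used here only so the example is self-contained.
const workerSource = 'self.onmessage = (event) => self.postMessage(event.data);';
const workerUrl = createBlobUrlFromSource(workerSource);

// The resulting string is what would be assigned to the workerSrc setting.
console.log(workerUrl.startsWith('blob:'));
```

Blob URLs stay alive until the page unloads unless they are released with URL.revokeObjectURL, so a long-lived app may want to revoke the URL once the worker has started.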
PDF.js itself contains this code:
require.ensure([], function () {
  var worker;
  worker = require('./pdf.worker.js');
  callback(worker.WorkerMessageHandler);
});
If you require pdf.worker.min like I did, Webpack will get confused and include pdf.worker.js anyway because of this stuff. What's even worse, even the minified version of PDF.js calls for the non-minified pdf.worker.js. So how can we deal with this crap?
new webpack.NormalModuleReplacementPlugin(
  /pdf\.worker(\.min)?\.js$/,
  path.join(__dirname, 'node_modules', 'pdfjs-dist', 'build', 'pdf.worker.min.js'),
),
Anything matching /pdf\.worker(\.min)?\.js$/ - more specifically, pdf.worker.js and pdf.worker.min.js - will be replaced with the minified version of it.
Phew. Hope this helps anyone!
@wojtekmaj we added pdfjs-dist/webpack for zero-configuration for webpack users. You can see its usage at https://github.com/yurydelendik/pdfjs-react
@yurydelendik Thanks, yes, I was aware of this. However, I didn't manage to get it working fully, and I was facing multiple issues because ending up with one compiled file was a necessity for me.
This is because I'm working on react-pdf and it has to be super easy for my users to install it. package.json + import, boom, nothing else.
There's no way I could tell them to figure pdf.worker.js on their own, let alone write instructions for webpack, browserify and whatnot.
it has to be super easy for my users to install it. package.json + import, boom, nothing else.
@wojtekmaj I don't really understand your requirements. I don't see how adding pdfjs-dist and using pdfjs-dist/webpack will be impossible to use in a react component project. And user will just include the former (a component project).
ending up with one compiled file was a necessity for me.
One compiled file is not what you want: a) for page startup, b) caching and transmission size, c) possible problems with worker -- but it's your choice.
@yurydelendik
Oh, sorry, I misread your previous post. I thought you were talking about /examples/webpack, which is a completely different thing. It should definitely be updated to use pdfjs/webpack! Thank you!
One more thing... Using pdfjs-dist/webpack does not stop pdf.js itself from trying to require pdf.worker.js on its own. I end up with:
When I define pdf.worker as one of the entries, it gets even worse, I end up with:
How do I fix this problem?
After running yarn build with my react-pdf example above, I've got the following files:
...
File sizes after gzip:
198.42 KB build/7b14afe24b211632ecc8.worker.js
197.76 KB build/static/js/0.974e8de4.chunk.js
130.14 KB build/static/js/main.5a79c9e3.js
4.19 KB build/static/css/main.d852b162.css
...
That's normal: 'build/static/js/main.5a79c9e3.js' is the app (pdf.js plus react), 'build/static/js/0.974e8de4.chunk.js' is the pdf.worker.js fallback code that is loaded when the worker is disabled, and 'build/7b14afe24b211632ecc8.worker.js' is the worker code. Am I missing something?
@wojtekmaj please prepare complete react component (example?) with user's test app and report in the separate issue with STR -- I think your particular problem is not related to this issue.
PDFJS.workerSrc = 'scripts.bundle.js';
PDFJS.getDocument(getPdfName).then((pdfFile: any) => {
  this.pdfFile = pdfFile;
  this.renderPdfIntoPages(pdfFile, getPdfPages, this.pdfReady);
});
Assign the value like this and then it works...
Thanks...
While using @yurydelendik's solution, if anyone gets a window not defined error, please put
globalObject: 'true'
in the output segment of your webpack config.
There seems to be a bug in webpack: it messes with the window object when working with web workers. So the above hack solves that issue.
@wojtekmaj:
Thanks for your solution! It works fine for me in Chrome, FF, Edge, Opera, Safari. But as you exclude it from babel-loader it isn't transpiled back to ES5. So I get a problem in IE11 and so on. Or am I missing something there?
I may be doing something wrong here, so please correct oh smart people, but I took @wojtekmaj's train of thought, and got it working much more simply.
In webpack.config:
...
{
test: /pdf\.worker(\.min)?\.js$/,
loader: 'file-loader'
},
And then, in my code:
import PDFJS from 'pdfjs-dist';
import PDFJSWorker from 'pdfjs-dist/build/pdf.worker.min';
PDFJS.GlobalWorkerOptions.workerSrc = PDFJSWorker;
...
Configuration :
Hey, I had some trouble using webpack and pdfjs (and its worker).
Due to webpack stuff, I had this error trying to load the worker:
I couldn't find anything to fix it.
vendors_pdfjsWorker existed but wasn't in this path. In my case, I want the worker to be in the same place as pdf.js.
At first I tried to change workerSrc, as wojtekmaj explained. But my workerSrc wasn't used by pdfjs to get the worker. Webpack's parsing changes pdfjs (l.9915) from:
if (typeof window === 'undefined') {
  isWorkerDisabled = true;
  if (typeof require.ensure === 'undefined') {
    require.ensure = require('node-ensure');
  }
  useRequireEnsure = true;
} else if (typeof require !== 'undefined' && typeof require.ensure === 'function') {
  useRequireEnsure = true;
}
into:
if (typeof window === 'undefined') {
  isWorkerDisabled = true;
  if (typeof require.ensure === 'undefined') {
    require.ensure = require('node-ensure');
  }
  useRequireEnsure = true;
} else if (true) {
  useRequireEnsure = true;
}
So fakeWorkerFilesLoader is set (l.9932):
fakeWorkerFilesLoader = useRequireEnsure ? function () {
Then my workerSrc isn't used, because fakeWorkerFilesLoader is defined:
var loader = fakeWorkerFilesLoader || function () {
  return (0, _dom_utils.loadScript)(_getWorkerSrc()).then(function () {
    return window.pdfjsWorker.WorkerMessageHandler;
  });
};
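The effect of that last `||` is easy to miss, so here is a stripped-down model of it. This is not the pdf.js source: pickLoader, bundlerLoader and the returned strings are invented for illustration, and only the `fakeWorkerFilesLoader || fallback` shape is taken from the snippet above.

```javascript
// Simplified model of the selection logic quoted above; not the pdf.js source.
// If a bundler-provided loader exists, it wins and workerSrc is never read.
function pickLoader(fakeWorkerFilesLoader, workerSrc) {
  return fakeWorkerFilesLoader || function loadFromWorkerSrc() {
    return `script:${workerSrc}`;
  };
}

// Stand-in for the require.ensure-based loader that webpack's rewrite enables.
const bundlerLoader = () => 'require.ensure:./pdf.worker.js';

console.log(pickLoader(bundlerLoader, 'custom.worker.js')());
// require.ensure:./pdf.worker.js
console.log(pickLoader(null, 'custom.worker.js')());
// script:custom.worker.js
```

In other words, once the `else if (true)` branch has set useRequireEnsure, a custom workerSrc only matters for the real worker, not for the fallback path.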
In my webpack configuration I added :
module: {
  noParse: (filename) => {
    return /\\node_modules\\pdfjs-dist\\build\\pdf.js/.test(filename);
  },
  rules: [
    .......................
  ]
},
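One thing to watch in the noParse function above: the pattern is written with backslash separators, so it appears to match only Windows-style paths. A possible cross-platform variant is sketched below. The pattern and the sample paths are assumptions made for illustration; it also escapes the dot and anchors the end so that pdf.worker.js is not caught by accident.

```javascript
// Assumed alternative pattern: [\\/] matches either "\" (Windows) or "/" (POSIX).
const pdfJsBuildPattern = /[\\/]node_modules[\\/]pdfjs-dist[\\/]build[\\/]pdf\.js$/;

// Sample paths, made up for the example.
const windowsPath = 'C:\\project\\node_modules\\pdfjs-dist\\build\\pdf.js';
const posixPath = '/project/node_modules/pdfjs-dist/build/pdf.js';
const workerPath = '/project/node_modules/pdfjs-dist/build/pdf.worker.js';

console.log(pdfJsBuildPattern.test(windowsPath)); // true
console.log(pdfJsBuildPattern.test(posixPath));   // true
console.log(pdfJsBuildPattern.test(workerPath));  // false
```

The same expression could be returned from the noParse callback in place of the original one.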
And then I had this error:
It tells me my script "ecm_viewer.worker.js" doesn't exist.
I added an entry in my webpack config:
entry: {
  'ecm_viewer': getFileList(),
  'ecm_viewer.worker': './node_modules/pdfjs-dist/build/pdf.worker.entry',
}
And it works perfectly, even if I remove the noParse function. I wasn't able to debug the real error until I added this noParse webpack option.
I don't know if I'm in the right place to write this; I can move my post to stackoverflow or somewhere else. It may help someone one day.
This fixed an issue my team has been hunting down for WEEKS. Thank you @ctowler :D <3
@vivektiwary I am running into the same issue. It keeps saying ReferenceError: Can't find variable: window
I did put that globalObject:'true' setting in the Webpack.config file but the app now won't load at all. Are you sure it worked?
Yes @taihuuho, did you put that in output obj in the config?
btw what is the error which you are getting?
@vivektiwary I am getting this error ReferenceError: Can't find variable: window when using that pdfjs-dist/webpack
I ended up using @varunarora's solution and that works really well. Apparently, this documentation page for Webpack https://github.com/mozilla/pdf.js/tree/master/examples/webpack doesn't seem to work for everyone
Without needing to edit the webpack config:
PDFJS.GlobalWorkerOptions.workerSrc = require('!!file-loader!pdfjs-dist/build/pdf.worker.min.js').default;
or
import PDFJS from 'pdfjs-dist';
import PDFJSWorker from '!!file-loader!pdfjs-dist/build/pdf.worker.min.js';
PDFJS.GlobalWorkerOptions.workerSrc = PDFJSWorker;
and of course make sure you have file-loader installed.
I am using electron-forge which caused file-loader to put the worker up a directory so I had to use
PDFJS.GlobalWorkerOptions.workerSrc = '../' + require('!!file-loader!pdfjs-dist/build/pdf.worker.min.js').default;
Using file-loader somehow had side effects on the rest of my app, because other libraries need it. So the other way that I found is to get the pdf.worker.js file from a CDN:
cf here: https://github.com/wojtekmaj/react-pdf/issues/321#issuecomment-451291757