Package version: 3.0.3
Other: Heroku (4 web dynos all running horizon)
We're using Laravel Horizon to handle our Redis queues but getting errors which I think are due to our multi server architecture.
Whenever a "AppendQueryToSheet" job is handled by a different server to the one that handled the "QueueExport" job we get these errors...
Undefined offset: -1 {"exception":"[object] (ErrorException(code: 0): Undefined offset: -1 at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Csv.php:191)"} []File "/tmp/laravel-excelIxge6a" does not exist. {"exception":"[object] (InvalidArgumentException(code: 0): File \"/tmp/laravel-excelIxge6a\" does not exist. at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:137)"}From what I can tell QueueExport writes a temporary file to the current server, which in theory won't be available to any other servers in a multi server setup.
You would need at least 2 servers running Horizon, pointing to the same Redis instance.
They should be working jobs from the queue and whenever you get a scenario where a server handles "QueueExport" but another handles "AppendQueryToSheet" I think you will see the issue. (see logs below for successfully and unsuccessful scenarios)
Expected behavior:
Here's a scenario that went ok, the entire job was processed by the same web server. Doing this locally (1 server) also works flawlessly
# OK
Jun 17 19:11:45 app/web.3: [2018-06-17 18:11:44] Processing: Maatwebsite\Excel\Jobs\QueueExport
Jun 17 19:11:45 app/web.3: [2018-06-17 18:11:44] Processed: Maatwebsite\Excel\Jobs\QueueExport
Jun 17 19:11:45 app/web.3: [2018-06-17 18:11:44] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 17 19:11:46 app/web.3: [2018-06-17 18:11:45] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 17 19:11:46 app/web.3: [2018-06-17 18:11:45] Processing: Maatwebsite\Excel\Jobs\CloseSheet
Jun 17 19:11:46 app/web.3: [2018-06-17 18:11:45] Processed: Maatwebsite\Excel\Jobs\CloseSheet
Jun 17 19:11:46 app/web.3: [2018-06-17 18:11:45] Processing: Maatwebsite\Excel\Jobs\StoreQueuedExport
Jun 17 19:11:46 app/web.3: [2018-06-17 18:11:45] Processed: Maatwebsite\Excel\Jobs\StoreQueuedExport
Actual behavior:
Here's a log of 2 problematic scenarios
# Errors when server handling job isn't the one that processed "QueueExport" job. web.1 handled "QueueExport" but web.2 and web.4 threw errors
Jun 18 13:23:26 app/web.1: [2018-06-18 12:23:25] Processing: Maatwebsite\Excel\Jobs\QueueExport
Jun 18 13:23:27 app/web.1: [2018-06-18 12:23:25] Processed: Maatwebsite\Excel\Jobs\QueueExport
Jun 18 13:23:27 app/web.1: [2018-06-18 12:23:25] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:28 app/web.1: [2018-06-18 12:23:27] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:28 app/web.1: [2018-06-18 12:23:27] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:32 app/web.1: [2018-06-18 12:23:30] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:32 app/web.1: [2018-06-18 12:23:30] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:38 app/web.1: [2018-06-18 12:23:36] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:38 app/web.1: [2018-06-18 12:23:36] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:45 app/web.1: [2018-06-18 12:23:44] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:45 app/web.1: [2018-06-18 12:23:44] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:54 app/web.1: [2018-06-18 12:23:52] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:23:54 app/web.1: [2018-06-18 12:23:52] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:05 app/web.1: [2018-06-18 12:24:03] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:05 app/web.1: [2018-06-18 12:24:03] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:19 app/web.1: [2018-06-18 12:24:17] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:19 app/web.1: [2018-06-18 12:24:17] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:37 app/web.1: [2018-06-18 12:24:35] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:37 app/web.1: [2018-06-18 12:24:35] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:53 app/web.1: [2018-06-18 12:24:52] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:24:53 app/web.1: [2018-06-18 12:24:52] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:14 app/web.1: [2018-06-18 12:25:13] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:14 app/web.1: [2018-06-18 12:25:13] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.4: [2018-06-18 12:25:38] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.1: [2018-06-18 12:25:38] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.2: [2018-06-18 12:25:39] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.2: [2018-06-18 12:25:39] Failed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.2: [2018-06-18 12:25:40] production.ERROR: Undefined offset: -1 {"exception":"[object] (ErrorException(code: 0): Undefined offset: -1 at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Csv.php:191)"} []
Jun 18 13:25:40 app/web.2: [2018-06-18 12:25:39] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:25:40 app/web.2: [2018-06-18 12:25:39] production.ERROR: Undefined offset: -1 {"exception":"[object] (ErrorException(code: 0): Undefined offset: -1 at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Csv.php:191)"} []
Jun 18 13:25:41 app/web.4: [2018-06-18 12:25:39] production.ERROR: File "/tmp/laravel-excelIxge6a" does not exist. {"exception":"[object] (InvalidArgumentException(code: 0): File \"/tmp/laravel-excelIxge6a\" does not exist. at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:137)"} []
# Another example, web 1 haandles QueueExport but web.3 and web.4 end up throwing errors
Jun 18 13:56:05 app/web.1: [2018-06-18 12:56:04] Processing: Maatwebsite\Excel\Jobs\QueueExport
Jun 18 13:56:05 app/web.1: [2018-06-18 12:56:04] Processed: Maatwebsite\Excel\Jobs\QueueExport
Jun 18 13:56:05 app/web.1: [2018-06-18 12:56:04] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:07 app/web.1: [2018-06-18 12:56:06] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:07 app/web.1: [2018-06-18 12:56:06] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:11 app/web.1: [2018-06-18 12:56:10] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:11 app/web.1: [2018-06-18 12:56:10] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:16 app/web.4: [2018-06-18 12:56:16] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:17 app/web.1: [2018-06-18 12:56:16] Processed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:17 app/web.3: [2018-06-18 12:56:16] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:17 app/web.4: [2018-06-18 12:56:16] production.ERROR: File "/tmp/laravel-excelGLNTc7" does not exist. {"exception":"[object] (InvalidArgumentException(code: 0): File \"/tmp/laravel-excelGLNTc7\" does not exist. at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:137)"} []
Jun 18 13:56:18 app/web.4: [2018-06-18 12:56:16] Processing: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:18 app/web.4: [2018-06-18 12:56:17] Failed: Maatwebsite\Excel\Jobs\AppendQueryToSheet
Jun 18 13:56:18 pp/web.3: [2018-06-18 12:56:17] production.ERROR: File "/tmp/laravel-excelGLNTc7" does not exist. {"exception":"[object] (InvalidArgumentException(code: 0): File \"/tmp/laravel-excelGLNTc7\" does not exist. at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:137)"} []
Jun 18 13:56:18 pp/web.4: [2018-06-18 12:56:17] production.ERROR: File "/tmp/laravel-excelGLNTc7" does not exist. {"exception":"[object] (InvalidArgumentException(code: 0): File \"/tmp/laravel-excelGLNTc7\" does not exist. at /app/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Shared/File.php:137)"} []
The order of the log message may not be 100% accurate but the timestamps should be accurate.
The only way I could see how CSV.php could end up complaining about an offset of -1 is when the file that is being read doesn't exist meaning the $numberLines variable remains as 0.
# /vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Reader/Csv.php
$numberLines = 0;
....
$middleIdx = floor(($numberLines - 1) / 2); // -1
$series[$middleIdx] // Error
Hey @carltondickson. Thanks for submitting the issue. Multi server might indeed be the issue. You could try to change the temp path https://github.com/Maatwebsite/Laravel-Excel/blob/3.0/config/excel.php#L28 to something that is shared? Also it would be best if the export where handled by one and the same worker. Multiple workers don't really any benefit, as the jobs are "chained" (so they don't end up in a different order), so perhaps trying to make sure only 1 worker picks up these export jobs would be the easiest solution?
@patrickbrouwers thanks for the suggestions. I'll give the config one a try first as I'm not sure I can control which worker processes the file.
@carltondickson let me know if you make any progress with this! Happy to update the docs with information of getting this to work with Horizon.
@patrickbrouwers I've tried a few approaches now I've had some time to look at this
AWS EFS
Other things I tried...
Specify temp file location
\Maatwebsite\Excel\Writer::tempFile, so I was able to specify the temp file path. I tried registering a stream wrapper for s3 (https://docs.aws.amazon.com/sdk-for-php/v3/developer-guide/s3-stream-wrapper.html) so I could say my temp file should be at s3://bucket/file. There were a few issues in phpspreadsheet writing to s3 in that wb+ mode isn't supported, also the mime type couldn't be retrieved and the contents of the file appeared to be empty in S3.Horizon
You could send all export jobs to a single queue exports:
(new InvoicesExport)
->queue('invoices.xlsx')
->allOnQueue('exports');
and have this queue processed by just one worker, next to the default queue.
php artisan queue:work --queue=exports,default
@tomlankhorst that's what I thought would work as well, however haven't tested it in combination with horizon. Let us know what you're findings are with this @carltondickson .
@tomlankhorst, @carltondickson, @patrickbrouwers : I will close this ticket now. I hope it all worked out fine.
@patrickbrouwers We're actually having a very similar issue to this using multiple servers with imports, though our situation is a bit different.
We currently have a few web servers handing the actual traffic. For our import, we upload the file, and copy it over to S3. We then pass that file into the Import job and queue it. The issue here is that our queue server is separate from the web servers, and it appears that the temporary file is being created on the web server making the request, and not on the actual queue server responsible for running the chunked reads, thus throwing a file not found exception.
Right now there is no way to specify the drive for the temporary file to be written to, and since it's written before the queue, there isn't any way for us to fix this. Being that our traffic is routed between different app servers, we can't just run a queue on them since we can't guarantee the temp file is created on the same server that will be running the actual import job.
After looking at the replies above, it seems our team is on the same page attempting to come up with solutions to this. Right now it seems like being able to specify a disk where the temporary file could be stored (and thus could be read by all servers) might be the best solution. Looking at how everything is written, it doesn't seem very feasible to wait until the queued jobs run to create the temp file, as uploaded files would make that difficult.
I surprisingly haven't been able to find any other similar issues, but I believe that our server setup (having a dedicated queue server) is fairly common, and I doubt we're the only ones to run into something like this. We're happy to work on a PR that would add the ability to define a disk for the temporary file if that's something that you'd be interested in, otherwise I'd love to hear some idea of how we could resolve this!
Thanks @cmorbitzer for the PR. We'll add an option to specify a shared disk for the temporary files. This can either be a local disk (no performance difference) or a remote disk (small performance penalty, as on each job it will have to redownload the file)