Environment
(pip Windows CI hits this)
Description
The PAX-format sdists for wheel 0.34.1 fail to install with a UnicodeEncodeError on Python 2.7 on Windows, and on non-Windows systems in a non-UTF-8 locale: https://github.com/pypa/wheel/issues/331
Expected behavior
Unicode file names from the PAX tarball are correctly encoded for the local filesystem.
How to Reproduce
Attempt to install a PAX formatted tarball containing a file name that cannot be encoded to the default code page (Windows) or the default locale encoding (non-Windows).
In GNU tar, the affected paths are pre-mangled to something ASCII compatible, but PAX tar preserves them correctly, so the installer needs to handle them itself.
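The reproduction condition can be sketched in Python 3 as follows. This is only an illustration, not pip's actual unpacking code: the helper names `make_pax_tarball` and `unencodable_members` are hypothetical, the member name is made up, and `"ascii"` stands in for a restrictive code page or locale encoding.

```python
import io
import tarfile


def make_pax_tarball(member_name, payload=b"data"):
    """Build an in-memory PAX tarball holding one file (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def unencodable_members(tar_bytes, fs_encoding):
    """Return member names that cannot be encoded to fs_encoding."""
    bad = []
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as tar:
        for member in tar.getmembers():
            try:
                member.name.encode(fs_encoding)
            except UnicodeEncodeError:
                bad.append(member.name)
    return bad


# PAX keeps the non-ASCII name intact, so the reader sees it unmangled.
archive = make_pax_tarball("pkg/t\u00e9st.txt")

# "ascii" is an assumed stand-in for a restrictive locale or code page.
blocked_in_ascii = unencodable_members(archive, "ascii")
blocked_in_utf8 = unencodable_members(archive, "utf-8")
```

Under these assumptions the name is flagged for the ASCII case and accepted for UTF-8, which mirrors the failure described above: the archive is valid, but the target encoding cannot represent the path.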
Output
See https://dev.azure.com/pypa/pip/_build/results?buildId=18040&view=logs&j=404e6841-f5ba-57d9-f2c8-8c5322057572&t=0219f6bf-240d-5b08-c877-377b12af5079&l=309 for a Windows example in the pip test suite.
The wheel issue linked above has some Linux examples.
@ncoghlan Just an FYI, the issue I noted on https://github.com/pypa/wheel/issues/331 was using Python 3.6 (in case that has any bearing here).
In the process of justifying not fixing this, I figured out enough to fix it. :( See #7668.
@johnthagen Yeah, the non-universal locale encoding problem I mention in https://github.com/pypa/pip/pull/7668#issuecomment-579706165 will apply to Python 3 as well.
However, Python 3.7+ mitigates it significantly, as it doesn't believe the OS when it claims to be using ASCII, and automatically switches to using UTF-8 instead.
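To check which encodings a given interpreter ends up with, something like the following rough sketch may help. The function names `encoding_report` and `can_encode_for_filesystem` are hypothetical, and `sys.flags.utf8_mode` only exists on Python 3.7+, so it is read defensively here.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings that affect file name handling (hypothetical helper)."""
    return {
        "filesystem_encoding": sys.getfilesystemencoding(),
        "locale_encoding": locale.getpreferredencoding(False),
        # utf8_mode is only present on Python 3.7+; default to 0 elsewhere.
        "utf8_mode": getattr(sys.flags, "utf8_mode", 0),
    }


def can_encode_for_filesystem(name):
    """Return True if name is representable in the filesystem encoding."""
    try:
        name.encode(sys.getfilesystemencoding())
    except UnicodeEncodeError:
        return False
    return True


report = encoding_report()
```

Running this under `LC_ALL=C` on different Python versions is one way to see whether the interpreter kept ASCII or switched to UTF-8 for the filesystem encoding.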