It seems system Python builds use this option (and a few others I haven't checked). It would be nice if pyenv builds matched common distro builds by default, to reduce surprises.
Sharing libraries between the system Python and a pyenv install is more difficult when the above option isn't specified. Some third-party tools also stop working, for example rubypython-0.6.3, which fails with this error:
ImportError: ~/.pyenv/versions/.../lib/python2.7/lib-dynload/operator.so: undefined symbol: _PyUnicodeUCS2_AsDefaultEncodedString (RubyPython::PythonError)
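An error like this one already names the Unicode width the extension module was compiled for, because CPython 2.x renames its Unicode C API to PyUnicodeUCS2_* on narrow builds and PyUnicodeUCS4_* on wide builds. Below is a minimal sketch that reads the width out of the message. The helper name `extension_unicode_width` is hypothetical and only for illustration.

```python
import re

# CPython 2.x mangles its Unicode C API names per build type, so an
# "undefined symbol" message names the width the extension expects.
_SYMBOL_RE = re.compile(r"undefined symbol: _?PyUnicode(UCS[24])_")


def extension_unicode_width(error_message):
    """Return 'ucs2' or 'ucs4' for a mangled-symbol error, else None."""
    match = _SYMBOL_RE.search(error_message)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    msg = ("ImportError: operator.so: undefined symbol: "
           "_PyUnicodeUCS2_AsDefaultEncodedString")
    print(extension_unicode_width(msg))
```

A result of ucs2 means the module was built against a narrow interpreter and is being loaded by a wide one. A result of ucs4 means the reverse.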
Basically, pyenv builds Pythons with default options. If you want to build Python with specific configuration options, please specify them via the PYTHON_CONFIGURE_OPTS or PYTHON_MAKE_OPTS environment variables. Please also see the README.md of python-build.
We recently approved an upstream PEP defining a naming scheme and build environment for prebuilt wheel files for Linux systems, which currently only has wide Unicode builds in the build environment: https://www.python.org/dev/peps/pep-0513/
While that shouldn't be a problem for pyenv Python 3.x builds (since the wide/narrow build distinction was removed back in 3.3), the Python 2.x builds are going to hit this problem: most of the Linux wheels uploaded to PyPI aren't going to be compatible with narrow builds of Python, so pyenv users would still need to compile from source even if a wheel targeting wide builds is available.
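To make the compatibility problem concrete: for CPython 2.7, the ABI tag in a wheel filename is cp27mu for wide builds and cp27m for narrow builds, and an installer only accepts a binary wheel whose tag matches the running interpreter. The following is a simplified sketch of that comparison, not pip's actual logic. It assumes a pymalloc build and ignores pure-Python wheels and debug builds. The function names and the wheel filename are made up for illustration.

```python
def py27_abi_tag(maxunicode):
    """ABI tag expected for a CPython 2.7 build (pymalloc assumed)."""
    # Wide (UCS4) builds carry the extra "u" flag; narrow builds do not.
    return "cp27mu" if maxunicode > 0xFFFF else "cp27m"


def wheel_abi_tag(filename):
    """Extract the ABI tag from a wheel filename."""
    # Layout: name-version[-build]-pythontag-abitag-platformtag.whl
    stem = filename[:-len(".whl")]
    return stem.split("-")[-2]


def wheel_matches_build(filename, maxunicode):
    """True if the wheel's ABI tag matches the given sys.maxunicode."""
    return wheel_abi_tag(filename) == py27_abi_tag(maxunicode)


if __name__ == "__main__":
    # Hypothetical wheel name following the manylinux1 convention.
    wheel = "example_pkg-1.0-cp27-cp27mu-manylinux1_x86_64.whl"
    print(wheel_matches_build(wheel, 65535))    # narrow interpreter
    print(wheel_matches_build(wheel, 1114111))  # wide interpreter
```

With a narrow interpreter the wide wheel is rejected, so the package has to be built from source.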
(I was directed here from a distutils-sig thread discussing the Unicode build settings for common pre-built binaries on Linux)
@yyuu I think we should reopen.
Thanks for being willing to reconsider this. For the record, we're still discussing the possibility of simply adding a Python 2.7 narrow build to the reference build environment, as switching from narrow builds to wide builds would pose a binary extension module compatibility problem for any redistributor making the switch: https://mail.python.org/pipermail/distutils-sig/2016-February/028284.html
I opened PR #542 for the fix for this. Can anyone try it?
Thank you!
This change is a big deal. Anyone who installed a version of Python prior to this change may suddenly start getting divergent UCS encodings among their installs. We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv. Or pyenv could start distinguishing versions based on version number _and_ UCS encoding setting.
Just a note. This change is having some consequences with pretty popular modules like PIL: https://github.com/python-pillow/Pillow/issues/1753
We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv.
Yes -- this bit me and it took a long while of baffled googling before I ended up here and figured out why.
Anyone who installed a version of Python prior to this change may suddenly start getting divergent UCS encodings among their installs. We need to communicate to users that they need to reinstall all their Python versions after upgrading to this version of pyenv.
FWIW, I wasn't able to escape from the undefined symbol: PyUnicodeUCS2_FromString rabbit hole with these instructions. I had to use the advice found here:
$ pyenv uninstall 2.7.11
$ PYTHON_CONFIGURE_OPTS="--enable-unicode=ucs2" pyenv install 2.7.11
That solved it for me for the times I have to switch into Python 2 (in my case, when using fabric).
Further communication of the reason for the change is definitely desirable here, as folks forcing their Python install back to narrow Unicode builds are going to end up in the situation where their installation experience is worse, since they won't be able to use prebuilt wide unicode wheel files published to PyPI.
A preferable approach would be to use pip to reinstall all the existing packages in the environment:
$ pip freeze | pip install --ignore-installed --no-use-wheel -r /dev/stdin
(The "--no-use-wheel" is needed as Python 2.7 wheels built with versions of pip prior to 8.0.0 didn't have their ABI dependencies encoded correctly, and hence would still show up as compatible)
@konklone Sorry if I misled you. That's actually what I had to do as well. My comment was regarding the fact that people will start having different UCS encodings among their Python versions installed by pyenv. The only way to make them all consistent is to reinstall any versions that were installed prior to this update.
I ran into the issue now where homebrew is building python2 packages against the system python which apparently is using a narrow build on OS X. Thus homebrew-compiled packages are incompatible with pyenv unless using ucs2. Is there any recommendation here? The package in question is from homebrew because it's fairly difficult to compile on its own.
Just to say that the change to UCS4 on OSX is a pretty big deal, because this makes pyenv the only - to my knowledge - UCS4 Python build on OSX. Therefore very few people are building wheels that work with pyenv, and so using pyenv leads to the kind of problem mentioned in the issue above - where the user is surprised to find that wheel installs are not working.
Regarding my own comments above: they're specific to _Linux_, where the system Python configuration is determined by distro policy rather than CPython's default build settings, and distros long ago opted for ucs4 as the default (before the question became irrelevant in Python 3.3+). The manylinux1 specification then inherited that convention.
For Mac OS X, the upstream CPython defaults (i.e. a narrow ucs2/UTF-16 build) would be the more portable choice at the pyenv level.
Confirming that all the OSX Python variants that I know of are UCS2 builds:
$ # System Python
$ /usr/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Python.org Python
$ /Library/Frameworks/Python.framework/Versions/2.7/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Homebrew Python
$ /usr/local/Cellar/python/2.7.12/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Macports Python
$ /opt/local/bin/python2.7 -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
$ # Anaconda Python
$ /Users/bnaul/anaconda/bin/python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
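The same check can be scripted over several interpreters at once. This is a small sketch that reuses the one-liner from the commands above. The helper names are hypothetical, and the candidate paths are examples that need adjusting per machine. Note that on Python 3.3+ the check always reports ucs4, since the narrow/wide distinction no longer exists there.

```python
import os
import subprocess
import sys

# Same one-liner as used in the shell commands above.
_CHECK = "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"


def unicode_build_of(interpreter):
    """Run the check in `interpreter` and return 'ucs2' or 'ucs4'."""
    output = subprocess.check_output([interpreter, "-c", _CHECK])
    return output.decode("ascii").strip()


def survey(interpreters):
    """Map each existing interpreter path to its reported build type."""
    return {path: unicode_build_of(path)
            for path in interpreters if os.path.exists(path)}


if __name__ == "__main__":
    # Example paths only; missing ones are skipped.
    candidates = [sys.executable, "/usr/bin/python", "/opt/local/bin/python2.7"]
    for path, build in sorted(survey(candidates).items()):
        print("%s: %s" % (path, build))
```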
I think you did the switch to UCS4 to be more compatible with standard Linux distributions. For the same reason, here's a plea to switch back to UCS2 by default on OSX. Otherwise pyenv users on OSX are going to have many more problems installing standard packages.
The encoding configuration is done at https://github.com/yyuu/pyenv/blob/v1.0.2/plugins/python-build/bin/python-build#L1922-L1925
I can tweak the lines to stop configuring UCS4 on OS X. Although, I'm not sure how it should be on other platforms like BSDs.
How about tweaking that line for OSX only for now? I don't know what BSD's defaults are for Python builds either, but I guess it would be reasonable to make changes for BSD later, when more information comes in?
It appears this issue just came up again for Python / Pillow (see link above).
FreeBSD 10 via pkg install python:
# python -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs4
OpenBSD 6.0 via pkg_add python:
# python2.7 -c "import sys; print('ucs2' if sys.maxunicode == 65535 else 'ucs4')"
ucs2
So maybe UCS4 for FreeBSD, UCS2 for OpenBSD.
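Putting the observations from this thread together, a per-platform rule for Python 2.x builds could look like the sketch below. It is written in Python purely for illustration and is not python-build's actual code (python-build is a shell script). The table reflects only what was reported here, and the function name `configure_opts_for` is hypothetical.

```python
import platform

# Unicode widths reported in this thread for Python 2.x packages.
# Keys are values returned by platform.system().
_OBSERVED_DEFAULTS = {
    "Linux": "ucs4",
    "FreeBSD": "ucs4",
    "Darwin": "ucs2",
    "OpenBSD": "ucs2",
}


def configure_opts_for(system=None):
    """Return extra configure options matching the platform's usual build."""
    system = system or platform.system()
    if _OBSERVED_DEFAULTS.get(system) == "ucs4":
        return ["--enable-unicode=ucs4"]
    # ucs2 is CPython 2.x's own default, so no flag is needed. Unknown
    # platforms also fall back to the upstream default.
    return []


if __name__ == "__main__":
    print(" ".join(configure_opts_for()))
```

Falling back to the upstream default for unlisted platforms keeps the rule conservative until more information comes in.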
I've opened #726 to stop configuring --enable-unicode=ucs4 on OS X. Please give it a try; I will merge it if it works.