Numpy: BUG: Python from windows 10 store does not extend PATH via numpy.__config__.py

Created on 5 Jan 2019  ·  45 Comments  ·  Source: numpy/numpy

Hi,

I'm not sure whether it is an error in numpy or in Python, but when using Python 3.7 from the Windows Store, importing numpy fails with "DLL load failed". Even though there are already many quite similar issues, I never found this issue in combination with a fresh Python installation (from the Microsoft Store).

Reproducing code example:

pip install numpy

import numpy

Error message:

Traceback (most recent call last):
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\core\__init__.py", line 16, in <module>
    from . import multiarray
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\__init__.py", line 142, in <module>
    from . import add_newdocs
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Users\stroy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.7_qbz5n2kfra8p0\LocalCache\local-packages\Python37\site-packages\numpy\core\__init__.py", line 26, in <module>
    raise ImportError(msg)
ImportError:
Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` (removes all
files not under version control).  Otherwise reinstall numpy.

Original error was: DLL load failed: The specified module could not be found.

Numpy/Python version information:


Python 3.7.2 (v3.7.2:9a3ffc0492, Dec 23 2018, 23:19:17) [MSC v.1916 64 bit (AMD64)]
Numpy 1.15.4



All 45 comments

The numpy C-extension module multiarray.cp37-win_amd64.pyd links to LIBOPENBLAS.CSRRD7HKRKC3T3YXA7VY7TAZGLSWDKW6.GFORTRAN-WIN_AMD64.DLL, which is part of the wheel package installed by pip. Pip seems to put this in the numpy directory under 'numpy/.libs', which must be added to PATH before importing numpy. After adding that directory to PATH, I checked with depends.exe that no other dependencies are missing. Still, something is wrong: even though os.path.exists(<path-to-multiarray>) is True, ctypes.CDLL(<path-to-multiarray>) fails.

Python via the Microsoft App Store is only available if you enable the preview updates, and the python package itself warns "NOTE: We are still evaluating the Microsoft Store release of Python, and not all features are guaranteed stable. See https://docs.python.org/3.7/using/windows.html for more information."

I opened https://bugs.python.org/issue35688, maybe they can help solve this.

Looking at stock Python 3.7 from python.org, pip install numpy; import numpy succeeds. After it imports, somehow site-packages\numpy\.libs is added to os.environ['PATH']. I don't see how that happens in the code, but however it happens, it is perhaps blocked in the Python from the app store?

My first guess was that it is an issue of the Microsoft Store release of python, but since there had been a few DLL load failed issues being caused by numpy, I tried my luck here.
I'm aware that the Microsoft Store release is not fully supported, but since there hasn't been any existing issue, I wanted to at least inform someone about it. Thanks for creating an issue at Python!

We have this code for win32 in numpy.__config__.py (which is generated when building NumPy):

# Excerpt from the generated numpy/__config__.py (imports added for context):
import os
import sys

extra_dll_dir = os.path.join(os.path.dirname(__file__), '.libs')

# Append the bundled DLL directory to PATH so the Windows loader can
# resolve libopenblas when the multiarray extension module is imported.
if sys.platform == 'win32' and os.path.isdir(extra_dll_dir):
    os.environ.setdefault('PATH', '')
    os.environ['PATH'] += os.pathsep + extra_dll_dir

While this works properly in "regular" Python, it does not work in the Python from the app store. os.environ['PATH'] is modified properly and the DLL exists in that directory, but somehow the change is not propagated to the DLL search mechanism.

It's more likely that the DLL is refusing to load for some reason... perhaps it's using a banned dependency? (Though I thought there weren't any of those anymore)

Can you try loading the DLL directly using ctypes? And if it fails, look at GetLastError? The import loader will hide a lot of load problems, even when the module is found (a separate CPython issue... maybe we need some debugging flags for this)

The error from import numpy is DLL load failed: The specified module could not be found. Also, ctypes.CDLL(fname), where fname is the multiarray*.pyd file (existence verified via os.path.exists), gives me OSError: [WinError 126] The specified module could not be found, and ctypes.GetLastError() gives 6 (ERROR_INVALID_HANDLE, "The handle is invalid").

I can load the dependent DLL (cblas_file = libopenblas*.dll; ctypes.CDLL(cblas_file)), and I can verify that the directory of the dependent DLL is on the path: os.path.dirname(cblas_file) in os.environ['PATH'] is True.

I can use depends.exe to successfully load fname once I add os.path.dirname(cblas_file) to the path.

Hmm, I guess it is store apps refusing to look at PATH anymore. That may qualify as a bug for a full-trust app, but it's not been a recommended approach for a long time already and so it may just be deprecated in this context.

If you make an AddDllDirectory call instead of modifying PATH, does that work?
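For anyone wanting to try this, here is a minimal ctypes sketch of such a call. The helper name is mine and error handling is deliberately minimal; note also that AddDllDirectory only influences loads performed with the LOAD_LIBRARY_SEARCH_DEFAULT_DIRS flag (e.g. after SetDefaultDllDirectories has been called), so by itself it may not change anything:

```python
import ctypes
import os
import sys

def add_dll_directory(path):
    """Ask the Windows loader to also search `path` for DLLs.

    Wraps kernel32.AddDllDirectory via ctypes. The directory is only
    consulted for loads that use LOAD_LIBRARY_SEARCH_DEFAULT_DIRS, e.g.
    after SetDefaultDllDirectories(0x1000) has been called.
    """
    if sys.platform != "win32":
        raise OSError("AddDllDirectory is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.AddDllDirectory.restype = ctypes.c_void_p  # DLL_DIRECTORY_COOKIE
    kernel32.AddDllDirectory.argtypes = [ctypes.c_wchar_p]
    cookie = kernel32.AddDllDirectory(os.path.abspath(path))
    if not cookie:
        raise ctypes.WinError(ctypes.get_last_error())
    return cookie  # pass to kernel32.RemoveDllDirectory to undo
```

The returned cookie can later be handed to RemoveDllDirectory to unregister the directory again.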

#10229 added code to use AddDllDirectory, but we removed it in #11449 because it broke other things: it required calling windll.kernel32.SetDefaultDllDirectories(0x1000).

Why should store apps refuse to look at PATH and what is the recommended approach? If looking at PATH is considered dangerous, the AddDllDirectory direction seems even more so.

Perhaps the host Python should provide a blessed directory for ancillary DLLs, and we could adjust our installation scripts to use that.

Looking at PATH is more dangerous because it's so easy for someone else to override the DLLs that will be loaded.

The correct way to do it is to reference DLLs by full path when using LoadLibrary[Ex]. Python already does this when loading pyd's, so it's irrelevant there, but it does get used when the loader is resolving static (load-time) dependencies of the PE file.

The correct way to reference dependencies is to put them in the same directory as the module that is loading them. The module directory is always searched first, since the files closest to the module being loaded are assumed to be the correct ones (with the exception of the KnownDLLs list, but you're not hitting that here).

Providing a blessed directory for ancillary DLLs has never been the solution and is still not the solution (and removing the existing DLLs directory is still a backwards-compatibility concern, or I'd have merged those files into the Lib folder already).

The second most reliable way to have the loader use dependencies from a separate directory is to explicitly load them first using LoadLibrary[Ex] with a full path before importing the module that needs them. (I know, I know, I keep providing exactly the same suggestions every time this problem comes up. It's not my fault you guys don't like them :) )
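The "explicitly load them first by full path" approach can be sketched in a few lines of ctypes. The helper names are mine, and the ".libs" layout is the one the numpy wheel uses; splitting path discovery from the actual (Windows-only) load keeps the logic testable anywhere:

```python
import ctypes
import glob
import os
import sys

def find_bundled_dlls(libs_dir):
    """Return absolute paths of all DLLs bundled in libs_dir, sorted for a
    deterministic load order (pure path logic, works on any platform)."""
    pattern = os.path.join(os.path.abspath(libs_dir), "*.dll")
    return sorted(glob.glob(pattern))

def preload_bundled_dlls(libs_dir):
    """Explicitly load each bundled DLL by full path (Windows only).

    Once a DLL is mapped into the process, later loads that reference it
    by name resolve to the already-loaded copy, so extension modules that
    statically link against these DLLs import without touching PATH.
    Keep the returned handles alive for the life of the process.
    """
    if sys.platform != "win32":
        return []
    return [ctypes.WinDLL(p) for p in find_bundled_dlls(libs_dir)]
```

A package would call preload_bundled_dlls at the top of its __init__.py, before any of its extension modules are imported.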

Now apparently the libraries that stopped working with the SetDefaultDllDirectories call also need to fix their issues, as they are relying on bad behavior too. I'm prepared to consider making this call in CPython 3.8, and possibly expose AddDllDirectory through either sys or importlib, though since we already know it'll cause significant breakage it's probably better to save it until a more significant version change (and multiple AddDllDirectory calls simply give you a random search order unless you follow the guidance above, so it may not help that much anyway, apart from preventing DLL hijacking).
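For reference, CPython 3.8 did eventually move in this direction on Windows, resolving extension-module dependencies with safer search flags and exposing the API as os.add_dll_directory. A guarded sketch of using it (the helper name and the ".libs" layout are illustrative, not any library's actual code):

```python
import os
import sys

def register_extra_dll_dir(libs_dir):
    """Register libs_dir with the Windows loader for DLL resolution.

    os.add_dll_directory exists on Windows from Python 3.8; it wraps
    AddDllDirectory and affects how extension-module dependencies are
    resolved. Returns the registration handle (call .close() to undo),
    or None where the API does not apply.
    """
    if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
        return os.add_dll_directory(os.path.abspath(libs_dir))
    return None
```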

Continuing the discussion on the cpython issue, since it seems a more general issue than just NumPy.

Edit: add link

(asking here as well as on the cpython issue for discoverability) @zooba is there a "best practices" guide? I am sure we are not the only project with concerns.

I still think something a bit like this is impacting the Windows portability of #12523 when using DLL load as the mechanism. I see issues both locally on Windows & on Azure DevOps with OSError: [WinError 126] The specified module could not be found that don't crop up on POSIX using the same workflow.

It seems Anaconda has worked out how to use AddDllDirectory ContinuumIO/anaconda-issues#10628 in a way that fixes what we had in #10229 but had to remove in #11449.

Anaconda translates _all_ PATH entries into AddDllDirectory entries, which basically just (potentially) randomizes the order of what was there previously. I suggested only adding the directories that are known to have necessary DLLs in them, and I'll suggest the same if you're going to look at this approach.

@zooba the problem is we have no idea which directories have DLLs in them. We are only one of many packages in the Python ecosystem, see for instance this issue where using AddDllDirectory without adding PATH broke someone else's work.

Looking at that issue, it was the SetDefaultDllDirectories call that did it. It's unfortunate that it applies to the entire process, but the only way to deal with it any better is for CPython to enable it at startup, which leaves everyone just as broken (not that I'm _completely_ opposed to doing this for 3.8, but it is going to cause a lot of pain).

I just moved your libopenblas...dll into numpy/core instead of .libs and everything seems to be fine. I still don't understand why this is such a bad solution?

Is there a reliable way to probe a DLL from CPython / ctypes and get the names of the DLLs it is looking for, but can't find? To be fair, in my hands depends.exe does no better than ctypes when operating on libopenblas.dll when something major is "missing."

It seems to be reduced to things either working or I get OSError: [WinError 126] The specified module could not be found, without much traction for debugging / error handling in between. This would be immensely useful if one were relatively naive about the requisite DLLs in a build workflow & wanted to reconstitute a successful build, even if only by copying said DLLs into the path of the loading module as suggested.

just move your libopenblas...dll into numpy/core instead of .libs and everything seems to be fine

@zooba this is a solution for numpy, and we will probably adopt that.

I am more worried about countless other Python developers who have been using os.environ['PATH'] to integrate with third-party DLLs. For instance, I remember having to change the version of Intel IPP I was using with a C-extension module depending on various runtime configuration constraints. What is the solution for these types of tasks? Copying the files into the directory is not an option, and having to use LoadLibraryEx() would require analysing which DLLs Intel is loading for each version of IPP for each CPU the testing code runs on. Another common use case is a testbed for various versions of hardware, each with its own SDK version, in different directories.

Is there a reliable way to probe a DLL from CPython / ctypes and get the names of the DLLs it is looking for, but can't find?

Not that I'm aware of, much as I'd love one (and we'd totally have an improved import error message in this case).

depends.exe doesn't do much better for me these days, as it hasn't ever caught up with more recent Windows enhancements like API sets (though Dependencies looks interesting).

I am more worried about countless other Python developers who have been using os.environ['PATH'] to integrate with third-party DLLs.

The long-term solution is part of the debate that's been occasionally flaring up on Twitter recently about Python's distribution model.

Basically, the status quo assumes that everyone installing anything is building it all from source (even wheels are just a caching mechanism here, as demonstrated by all the packages that hit this issue). Assembling an executable application from a set of binaries is not something that should be done automatically.

For your specific example, the solution sounds like AddDllDirectory along with SetDefaultDllDirectories. Now, the "problem" with this approach is that it breaks loading DLLs from environment variables, but the _alternate_ way to look at this is that _you apparently don't know what DLLs you need_, which is a fundamental issue when assembling a working application from binaries.

Basically, all of the current tool implementations we have for creating Python environments don't work well on Windows (or, for similar reasons, any portable environment). If you compare to actual Windows applications that do this, you'll find they have very specific installers that essentially vendor all their dependencies, but only one level deep. So they may vendor Python, but Python doesn't vendor the C Runtime - the app has to vendor it itself. And there are well-paid engineers who track and manage these dependencies, who also despise the tools that try to do it automatically because they know those will all eventually fail.

So basically, numpy et al. are at the forefront of "our current approach to automatic Python environment construction is lacking" because we haven't accounted for properly vendored and isolated environments (occasionally "the deployment problem" is used to refer to the same thing).

The correct way to reference dependencies is to put them in the same directory as the module that is loading them

just move your libopenblas...dll into numpy/core instead of .libs and everything seems to be fine

@zooba this is a solution for numpy, and we will probably adopt that.

Doesn't this break down as soon as multiple submodules need a dll? At least according to the docs on module search path, no directories higher up are searched. I think for numpy we need BLAS in numpy/core and LAPACK in numpy/linalg; for scipy we need both BLAS and LAPACK in many submodules.

Forgive my ignorance, but I thought the point was that, once the DLL is loaded, Windows no longer searches for it, but uses the loaded DLL. So, this is a hack that depends on something in numpy/core loading the DLL (and finding in its own directory) before anything else tries to load the DLL.

depends on something in numpy/core loading the DLL (and finding in its own directory) before anything else tries to load the DLL

This is correct. I believe the original idea of the DLLs directory was probably to allow all of the standard library extension modules to be alongside all of the dependencies they'd need (e.g. hashlib and ssl both require OpenSSL) _without_ polluting the Lib directory with random DLLs.

Possibly what we need is better support for having numpy.core._multiarray_umath.pyd and numpy.linalg._umath_linalg.pyd in a single directory with libopenblas...dll and Python is able to resolve the modules correctly. Though this would likely require a divergence from "normal" Python import semantics (as I said, Python is simply not well designed for assembling a Windows application).

So, this is a hack …

Actually, this is the correct way. The hack is when you explicitly LoadLibrary the DLL from some other location before loading anything that may depend on it, not when you deliberately put the DLL in the correct search path ;)

FWIW, import numpy already does import numpy.core._multiarray_umath which means you can't avoid having the DLL loaded on import numpy.linalg.

FWIW, import numpy already does import numpy.core._multiarray_umath which means you can't avoid having the DLL loaded on import numpy.linalg.

yes for numpy this happens to work, because there is a core that is guaranteed to be imported first and will load the right DLL. In general there's no such guarantee for a Python package, and in SciPy that will be the problem - there are 17 submodules that can be loaded independently, and there's no core.

So, this is a hack …

Actually, this is the correct way. The hack is when you explicitly LoadLibrary the DLL from some other location before loading anything that may depend on it, not when you deliberately put the DLL in the correct search path ;)

The "correct way" is clearly not scalable, as the SciPy example shows. In general, say you have N submodules that each need some subset of M DLLs. There's a number of ways one could deal with that:

  1. put all extension modules for those N submodules and M DLLs in a single dir (the "correct way").
  2. what we did until now: put them in a special dir (.libs) and put that dir on the search path. This worked for a long time, but had some problems, and now both Windows Store Python and Conda have broken this method.
  3. modify Python import semantics (not sure how, but doesn't sound very appealing)
  4. put DLLs in a special dir (e.g. .libs) and use some other method than extending PATH to find them
    4a. @zooba's 2nd suggestion: explicitly load them first using LoadLibrary[Ex] with a full path before importing the module that needs them.
    4b. put relative symlinks for each DLL in every submodule

1 creates a mess. 2 was just broken. 3 doesn't sound good, and doesn't solve things in the right timeframe anyway. I'm not sure if 4b is portable (it probably isn't), but otherwise 4a seems like the way to go.

yes for numpy this happens to work, because there is a core that is guaranteed to be imported first and will load the right DLL. In general there's no such guarantee for a Python package,

True, but it's easy enough to set it up. Easier than modifying global state in a particular import. (The guarantee for a Python package is that in import a.b, a will _always_ be imported before a.b.)

On the others, 4a is a good enough hack, though not really any different from 1. 4b is not portable at all. 3 would be tied into 1 to be able to resolve submodules rather than top-level extension modules (e.g. currently the easiest way to make 1 work is to name it numpy_core_multiarray_umath.pyd without any dots, so that it can be imported as a top-level module without having to negotiate any packages, while 3 could enable numpy.core._multiarray_umath to resolve to numpy.core._multiarray_umath.pyd rather than numpy/core/_multiarray_umath.pyd)

True, but it's easy enough to set it up. Easier than modifying global state in a particular import. (The guarantee for a Python package is that in import a.b, a will _always_ be imported before a.b.)

Here is the list of DLLs in the latest scipy release (from py37 64-bit wheel on PyPI):

$ ls scipy/extra-dll
lib_arpack-.XS6PLV3734SEBIN3L7VHQL4V6AFVR3MS.gfortran-win_amd64.dll
lib_blas_su.NB4WQJWKUFT4P25L5KENNGUG4L73EMTU.gfortran-win_amd64.dll
lib_test_fo.JF5HTWMUPBXWGAYEBVEJU3OZAHTSVKCT.gfortran-win_amd64.dll
libansari.R6EA3HQP5KZ6TAXU4Y4ZVTRPT7UVA53Z.gfortran-win_amd64.dll
libbanded5x.H4XYVA4HYHIISTP5NNCPCQPACG6FKUND.gfortran-win_amd64.dll
libbispeu.5N2XSD7URZS4WTOSLTOG4DDMA4HGB46U.gfortran-win_amd64.dll
libblkdta00.ZGG7V3JKZ4GEJEF2MTA5BHMT7BJIUCKN.gfortran-win_amd64.dll
libchkder.G7WSOGIYYQO3UWFVEZ3PPXCXR53ADVPA.gfortran-win_amd64.dll
libcobyla2.JEGTSUUFJ7DFXWZN5PAYZTTLBDATC4WD.gfortran-win_amd64.dll
libd_odr.R6AEFCLW5EDWH44PJDZ4WA2EIUUQWWJM.gfortran-win_amd64.dll
libdcosqb.BFJ36UD5XZWZE5UMOTP5UDYKAJ3LWZ6R.gfortran-win_amd64.dll
libdcosqb.YMN7XEXYADIEZSKAGEVNR4E3MD7AXDG2.gfortran-win_amd64.dll
libdcsrch.I2AOPDCXAPDRFNPWY55H5UE7XZSU5CVN.gfortran-win_amd64.dll
libdet.MC6JBNE6VNYD6FNPXJKRRGWJPYV2NN2N.gfortran-win_amd64.dll
libdfft_sub.IJGUYXR6WJE4HITNWLK3Q6CCYYDR5M44.gfortran-win_amd64.dll
libdfitpack.PJU6IBGOYZCWITNVROHYOQAYNGAXO3HT.gfortran-win_amd64.dll
libdgamln.L2MD744DJHVE3HDIDGQFBRBYHR75FSAI.gfortran-win_amd64.dll
libdop853.6TJTQZW3I3Q3QIDQHEOBEZKJ3NYRXI4B.gfortran-win_amd64.dll
libdqag.NBT4GJCYXTJZ6FKYPTMU262SBAG2QI76.gfortran-win_amd64.dll
libgetbreak.3JNHTDZYEWBO45P4ZRQMAM3G2F777JB5.gfortran-win_amd64.dll
liblbfgsb.UBS3OB2ZGZATGJADSNFQXG6JBUJXYZFS.gfortran-win_amd64.dll
libmvndst.5VXNIPAPINAF5NIHXAFNA4OTHOPNDEWG.gfortran-win_amd64.dll
libnnls.IXEEHJUCGHJL42YZEM6UIEMROJWXHMLJ.gfortran-win_amd64.dll
libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
libslsqp_op.LIFGE6AEK5GZMIV4YAH6Q4UEDG4INU5S.gfortran-win_amd64.dll
libspecfun.BHLTWMBI4EYWDACZN4DQUESSDJRJNGEL.gfortran-win_amd64.dll
libvode.UPE44X4HLFF56JWPT3ESOS5IVN5QQ5A4.gfortran-win_amd64.dll
libwrap_dum.MRQ7UAVPNY36S6LDFETEBLDUEUIEUBHR.gfortran-win_amd64.dll
libwrap_dum.Y5YNB62CEIOAELCP2WZDLJGRR3KZYA7H.gfortran-win_amd64.dll

I think it's just libopenblas that's used by multiple submodules, but I'm not sure. It may be necessary, but redistributing all that stuff and adding code to load extra DLLs certainly doesn't look easier than just adding extra-dll to the search path (and it will increase import time a bit as well, because libopenblas will now always get loaded instead of only when needed).

On the others, 4a is a good enough hack, though not really any different from 1.

I think they are different to some extent; (1) requires a bigger reorganization as well as writing code that either actually loads the DLLs or that changes all imports in submodules. So I'd go with 4a.

FWIW it seems Anaconda is pushing forward with a solution that is not on your list: patching python to modify the DLL load path just before internal python calls to LoadLibraryEx. I don't know if they are CCed on this thread.

it seems Anaconda is pushing forward

I'm aware of that work, and I pushed back against it for similar reasons, but encouraged it for others.

One advantage that they have is that they are the system integrator, so it's "okay" for them to decide to combine all the DLLs required by an environment and reference them that way. They also (in theory) control the full set of packages in the environment, and so enabling the default path lookup is also "okay".

As a component in someone else's integration task, numpy/scipy don't get to make the same assumption. In your case, arguably it's _best_ to have a separate .libs folder so that it's easier for conda to merge those into one location as part of their build/install process (and arguably it would be nice for pip to support an "install into the DLLs directory" option, but that has all sorts of flow on problems).

Ultimately, system integrators need to figure this out. Tools like pip and conda will always fall short, because they try to do it automatically, though conda has a _slight_ chance of succeeding if they're careful (and the reason I pushed back against their approach is because they're not being careful, they're simply adding all directories in PATH as search paths which creates repetition and potentially randomizes the search order - I suggested defining their own list of search paths and inferring that from installed packages, but they didn't like it for whatever reason).

Ultimately, system integrators need to figure this out. Tools like pip and conda will always fall short,

Unfortunately a large fraction of users will stay with pip and be their own system integrator, so we're it :(

Agree that the Anaconda thing could make sense for them if they're careful, but it doesn't help us. Anyway, we seem to have a way forward here.

It would be nice for the Windows Store Python to be pulled, and that behavior changed only for Python 3.8. That can then be done without breaking already released wheels. For now I guess all we can do is fix this in the next release, and tell people to not use Windows Store Python in the meantime.

It would be nice for the Windows Store Python to be pulled, and that behavior changed only for Python 3.8.

Maybe, but ultimately this was never a guarantee made by Python, and any configuration change to Windows could also break things in the same way.

In any case, I wouldn't worry too much about Windows Store Python just yet - it'll take quite some time for it to be the source of a serious amount of usage. In the meantime, we should get enough usage to discover and fix issues like this.

@tylerjereddy

Is there a reliable way to probe a DLL from CPython / ctypes and get the names of the DLLs it is looking for, but can't find?

  • to look at the requirements of a specific DLL, use dumpbin /dependents your.dll from within a Visual Studio Command Prompt.
  • to follow the loading at runtime, use Process Monitor. Add a filter by Python PID, do import foo or whatever triggers the load. Look for the LoadFile(your.dll) (LoadFile == however MS spells open) for your DLL. Then look at the immediately subsequent open calls -- you will see it try the DLL name in a number of different paths.

@ihnorton is there a way to do this from the command line or as a script (like ldd on linux)?

@mattip yes, dumpbin is a command line program. To do it from a python script there is a library called (something like) “petools” — you could check what the condabuild verifier uses.
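(The library being thought of is presumably pefile.) For illustration, the list of imported DLL names can also be read straight out of the PE import directory with nothing but the standard library. This is a sketch only: PE32 and PE32+ headers, no bounds checking, and delay-load imports are not reported:

```python
import struct

def pe_imported_dlls(data):
    """Return the DLL names a PE image (bytes of an EXE/DLL/PYD) links
    against, by walking its import directory. Sketch only: no bounds
    checks, and delay-load imports are not reported."""
    # DOS header: e_lfanew at 0x3C gives the offset of the "PE\0\0" signature
    pe = struct.unpack_from("<I", data, 0x3C)[0]
    assert data[pe:pe + 4] == b"PE\0\0", "not a PE file"
    num_sections, = struct.unpack_from("<H", data, pe + 6)
    opt_size, = struct.unpack_from("<H", data, pe + 20)
    opt = pe + 24                                  # optional header
    magic, = struct.unpack_from("<H", data, opt)   # 0x10B=PE32, 0x20B=PE32+
    dd = opt + (96 if magic == 0x10B else 112)     # data directories
    import_rva, = struct.unpack_from("<I", data, dd + 8)  # entry 1: imports
    if import_rva == 0:
        return []
    sections = opt + opt_size                      # section table

    def rva_to_offset(rva):
        # Map a relative virtual address to a file offset via the sections.
        for i in range(num_sections):
            s = sections + 40 * i
            vsize, vaddr, rsize, roff = struct.unpack_from("<IIII", data, s + 8)
            if vaddr <= rva < vaddr + max(vsize, rsize):
                return roff + (rva - vaddr)
        raise ValueError("RVA not mapped by any section")

    names, desc = [], rva_to_offset(import_rva)
    while True:
        # Import descriptors are 20 bytes; the Name RVA sits at offset 12.
        name_rva, = struct.unpack_from("<I", data, desc + 12)
        if name_rva == 0:                          # all-zero terminator
            break
        off = rva_to_offset(name_rva)
        names.append(data[off:data.index(b"\0", off)].decode("ascii"))
        desc += 20
    return names
```

Usage would be pe_imported_dlls(open("multiarray.cp37-win_amd64.pyd", "rb").read()); any name in the result that cannot be found on the loader's search path is a candidate for WinError 126.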

Hi guys, I only stumbled upon this thread by mistake, please try to remember to CC someone from Anaconda when discussing AD stuff.

@rgommers

but had some problems so now we have Windows Store Python and Conda that broke this method.

We released exactly 1 build of Python which essentially (in retrospect!) contained a prototype of our latest AddDllDirectory patch. As soon as @mattip brought it to our attention we modified the patch so that it respects any changes made to PATH. We also entirely disabled this new stuff by default. To use it you must set the CONDA_DLL_SEARCH_MODIFICATION_ENABLE env. var. When people report DLL-hell to us, we advise them to set this. This has worked exactly as expected for everyone we've suggested it to. In the period in which it was enabled it caught that PyTorch was distributing an old CUDA DLL with their wheels, and PyTorch only worked because it would find the one in C:\Windows\System32 first (which was the original, sole intention of this patch). PyTorch has since removed this DLL.

@zooba

Anaconda translates all PATH entries into AddDllDirectory entries,

.. disabled by default at present (I hope to change that). At the end of the day, if it causes some insurmountable trouble we can disable it by default again. All of the bugs (the numpy one and the PyTorch one) that this caused or helped identify have now been fixed but while it is disabled new problems will not become apparent.

which basically just (potentially) randomizes the order of what was there previously. I suggested only adding the directories that are known to have necessary DLLs in them, and I'll suggest the same if you're going to look at this approach.

This is not the case at present. They are added in LIFO order, so to present entries in PATH to AddDllDirectory you need only to go through them in reverse order.

@mingwandroid Since you've dropped in here, please look at https://bugs.python.org/issue36085 as well (I don't know your bpo username to add you directly)

As soon as @mattip brought it to our attention we modified the patch so that it respects any changes made to PATH

Yes, much appreciated! We did get ~2 bug reports per day for a week or two.

All of the bugs (the numpy one and the PyTorch one) that this caused or helped identify have now been fixed but while it is disabled new problems will not become apparent.

Well, arguably it wasn't a NumPy bug, and it certainly hasn't been fixed for SciPy which uses the exact same mechanism (see comment https://github.com/numpy/numpy/issues/12667#issuecomment-465706501 above).

Well, arguably it wasn't a NumPy bug

I was careless in my description. Clearly it was my bug.

Still there are some of us at anaconda who feel the default on this new code should be switched so it is used by default, myself included.

@mingwandroid could you participate in https://bugs.python.org/issue36085 ? This seems like it is a general windows-python issue and not one specific to Anaconda. If a solution can be agreed upon for python 3.8, and if it is properly documented along with "what needs to change and why", it would provide a nice reference for filing change requests for packages. Then the solution could become widely accepted, and not Anaconda-specific.

Still there are some of us at anaconda who feel the default on this new code should be switched so it is used by default, myself included.

If a solution can be agreed upon for python 3.8 ...

+1 for making the change to the default only in Python 3.8, in Anaconda as well. Otherwise we're back to at least "all existing wheels on PyPI are broken". Unfortunately it's easy and common for numpy to be pulled into a conda env with pip.

Otherwise we're back to at least "all existing wheels on PyPI are broken".

No, we are not. Anaconda will not (deliberately) do anything that breaks pip or Python in general, because this bug could affect even a simple script: basically any code that adds an entry to os.environ['PATH'] and then tries to import a module from that entry. I fixed this bug 2 days after @mattip reported it to me. Unfortunately this meant our latest Python build contained the bug for maybe 4 days. Still, anyone with any problems due to the buggy build should be able to just conda update python, and anyone with further problems due to DLL-hell can set CONDA_DLL_SEARCH_MODIFICATION_ENABLE and be happy.

Please take a look at the patch. It's pretty simple and should help to clear up these misconceptions.

From my POV, so long as we don't change the behaviour of Python wrt PATH and module search we can 'get away' with somewhat hacky fixes such as this. The Anaconda Distribution will continue to use PATH as the primary DLL and Python extension module look-up mechanism. At the end of the day we didn't create DLL-hell, but we will try to mitigate it for our users. There are two mitigation schemes that I know of:

  1. Delete the conflicting DLLs from C:\Windows\System32
  2. Put all DLLs into the same directory

1 is terrible advice and will break (badly engineered / packaged) 3rd party software. 2 is not possible when you have an ecosystem that includes multiple DLLs with the same name compiled with different compilers (MSVC and mingw-w64).

Then all we are left with is PATH, but hey, that's fine: that's what's documented to work on Windows (apart from the complete mess of 1. looking in C:\Windows\System32 before PATH with no way to reconfigure that, and 2. allowing ISVs to write to C:\Windows\System32 in any non-driver-installation scenario).

Rest assured we're not interested in breaking pip or Python on Windows. Quite the opposite. We want things to work as smoothly as possible and in my opinion, CONDA_DLL_SEARCH_MODIFICATION_ENABLE achieves that. The only potential issue with it is that Microsoft could turn around and say "OK, the docs warning about the order of AddDllDirectory has come to pass, it is now truly random", and then we'd have to come up with something else. From past experience Microsoft is not in the business of breaking existing software via operating system updates, so I'm hopeful that we'll be OK here.

But please @zooba, if you can, chat with people about adding RPATH and ORIGIN support to the Windows Loader. It would fix DLL hell. If you copy Linux's implementation it would be easier for people to adjust their tooling.

@mattip

Then the solution could become widely accepted, and not Anaconda-specific.

That would be great. I would definitely prefer not to have to maintain patches such as this (or any patches for anything, generally speaking).

I'm not sure how much time I have to contribute to the issue on BPO but I will try. I am more than happy for people to discuss the 'Anaconda solution' and will answer any questions about it. Clearly for it to be acceptable for upstream Python all the HARDCODE_CONDA_PATHS stuff would be removed.

Please feel free to try our fix in a range of scenarios though. Here are some lines you can paste into cmd.exe to install Miniconda (into %TEMP%\mc; careful, I force deletion of that), update Python to the latest one (with this change in its latest incarnation), install numpy from pip and run it (disproving @rgommers' concern), then finally fake up a DLL-hell scenario and test that our mitigation works correctly:

cd %TEMP%
powershell -command "& { (New-Object Net.WebClient).DownloadFile('https://repo.continuum.io/miniconda/Miniconda3-latest-Windows-x86_64.exe', 'mc3.exe') }"
rmdir /s /q mc 2> NUL
start /wait "" mc3.exe /InstallationType=JustMe /AddToPath=0 /RegisterPython=0 /NoRegistry=1 /S /D=%TEMP%\mc
%TEMP%\mc\Scripts\activate.bat
conda update -y python conda
pip install numpy
python -v -c "import numpy; print(numpy.version.version); import numpy.core.multiarray" 2>&1 | findstr multiarray
@REM OK, great, PIP installed numpy wheels work just fine (unless you encounter DLL hell).

@REM Put a broken DLL into C:\Windows\System32
copy %TEMP%\mc\Library\bin\sqlite3.dll libcrypto-1_1-x64.dll
explorer .
@REM At this point, MOVE (do *not* copy) libcrypto-1_1-x64.dll to %windir%\System32\libcrypto-1_1-x64.dll
@REM Try to import _ssl
python -c "import _ssl"
@REM Observe:
@REM Traceback (most recent call last):
@REM   File "<string>", line 1, in <module>
@REM ImportError: DLL load failed: The specified procedure could not be found.
@REM Apply our flag to enable the new behaviour
set CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1
@REM Try again:
python -c "import _ssl"
@REM Yay, problems due to DLL-hell and inconsiderate 3rd party ISVs avoided.

If you followed this please be careful to delete %windir%\System32\libcrypto-1_1-x64.dll afterwards.

I am of the opinion that if we want to address DLL-hell, this (to be clear, calling AddDllDirectory() with each entry in PATH in reverse order before attempting to load any extension module, with caching so we do not call AddDllDirectory() all the time) is currently the only tool we have.
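
A rough Python-level sketch of that mechanism follows. This is not the actual Anaconda patch (which lives in CPython's C code); it only models the described behaviour, and it assumes Python 3.8's os.add_dll_directory() for the Windows-only registration step:

```python
import os

_registered = set()  # cache so each PATH entry is only handled once


def new_path_dll_dirs(path_value):
    """Return PATH entries in reverse order, skipping blanks, duplicates,
    and entries seen on a previous call. A sketch of the caching
    behaviour described above, not the actual patch."""
    new_entries = []
    for entry in reversed(path_value.split(os.pathsep)):
        entry = entry.strip()
        if not entry or entry in _registered:
            continue
        _registered.add(entry)
        new_entries.append(entry)
    return new_entries


# On Windows with Python 3.8+, each new entry could then be registered
# before any extension module is loaded:
# for d in new_path_dll_dirs(os.environ.get("PATH", "")):
#     if os.path.isdir(d):
#         os.add_dll_directory(d)
```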

arguably it's best to have a separate .libs folder so that it's easier for conda to merge those into one location as part of their build/install process (and arguably it would be nice for pip to support an "install into the DLLs directory" option, but that has all sorts of flow on problems).

@zooba, this option is not open to us unfortunately. We need to use a mixture of compilers to compile DLLs loaded by Python. The compilers generate DLLs with the same name that are incompatible. We cannot easily go for a DLL-renaming scheme either, as we'd have to adjust the build system for every package we build that creates DLLs. Please think about this; you have suggested this as a possibility for us on a few occasions and it really is not one.

Closing: since #13019, numpy preloads the DLLs in .libs, so importing numpy now works.
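
For reference, the preload approach can be sketched like this. It is a simplified, hypothetical rendering of the idea (load everything in the wheel's .libs directory up front so later imports resolve against the already-loaded copies), not numpy's actual code:

```python
import ctypes
import glob
import os


def preload_dlls(libs_dir):
    """Load every DLL found in libs_dir (Windows only), so that later
    extension-module imports resolve against copies already mapped into
    the process. A sketch of the approach, not numpy's actual code."""
    loaded = []
    for dll in glob.glob(os.path.join(libs_dir, "*.dll")):
        ctypes.WinDLL(dll)  # ctypes never calls FreeLibrary, so it stays mapped
        loaded.append(dll)
    return loaded


# e.g. preload_dlls(os.path.join(os.path.dirname(numpy.__file__), ".libs"))
```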

A big thank you to all of you for that!
