Numpy: BUG: np.min does not always propagate NaNs

Created on 11 Jan 2018 · 65 Comments · Source: numpy/numpy

This investigation arises from scipy CI starting to fail on AppVeyor in the last few days, i.e. after the 1.14 release.

On my home computer (macOS, conda python 3.6.2, conda numpy):

>>> import numpy as np
>>> np.version.version
'1.14.0'
>>> np.min([1.0, 2.0, np.nan])
nan
>>> np.min([1.0, np.nan, 2.0])
nan
>>> np.min([np.nan, 1.0, 2.0])
nan
>>> np.min([np.nan, 1.0])
/Users/andrew/miniconda3/envs/dev3/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce
  return umr_minimum(a, axis, None, out, keepdims)
nan

On my work computer (macOS, conda python 3.6.2, numpy installed via pip in a clean env):

>>> import numpy as np
>>> np.version.version
'1.14.0'
>>> np.min([1., 2., 3., 4., np.nan])
nan
>>> np.min([1., 2., 3., np.nan, 4.])
nan
>>> np.min([1., 2., np.nan, 3., 4.])
nan
>>> np.min([1., np.nan, 2., 3., 4.])
nan
>>> np.min([np.nan, 1., 2., 3., 4.])
nan
>>> np.min([np.nan, 1.])
nan
>>> np.min([np.nan, 1., np.nan])
nan
>>> np.min([1., np.nan])
nan
>>> np.seterr(all='raise')
{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}
>>> np.min([1., np.nan])
nan
>>> np.min([np.nan, 1.])
nan
>>> np.min([np.nan, 1., 2., 3., 4.])
nan
>>> np.min([np.nan, 1., 2., 3., 4.])
nan

With the first set of examples, why do only the last, and not the first three, result in a warning?
With the second set of examples, no warning is emitted at all.

Labels: 00 - Bug, numpy.core

All 65 comments

xref scipy/scipy#8282, scipy/scipy#8279

I think another point here is that it might be different code paths that trigger different behaviour (not really numpy's doing, I think), e.g. due to using AVX or such. I think I have sometimes seen the vectorized code fail to set the CPU error flags, and whether those paths are used can appear a bit random...

Dunno if that has much to do with it, but I expect it does; numpy normally does not do much about floating point errors, except checking whether something happened.

And as Chuck said, a lot of this floating-point error flag behaviour unfortunately differs between systems.

Ok, understood why the error only occurs on some installs and not others.
The reason scipy uses np.min on an array that may contain NaN is that it's a quick way of checking for their presence. The numpy documentation suggests that this is allowable:

NaN values are propagated, that is if at least one item is NaN, the corresponding min value will be NaN as well.

Given that this is one of the allowed use cases for np.min, from a usage point of view I wouldn't expect any warnings/errors to be emitted at all.

(There are obviously other ways of achieving this, e.g. np.isnan(arr).any(), but that doesn't change my reasoning above)
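For reference, a minimal sketch of both approaches (all public numpy APIs; np.errstate is the supported way to silence the invalid-value warning locally):

import numpy as np

arr = np.array([1.0, np.nan, 2.0])

# Check for NaNs directly; this never touches the FP status flags.
has_nan = np.isnan(arr).any()

# Or keep using min() but suppress the invalid-value warning locally.
with np.errstate(invalid='ignore'):
    result = np.min(arr)

print(has_nan, result)  # True nan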

Emitting no warnings at all is a plausible change, I guess.

+1 for no warnings at all

Hi,

I stumbled upon a similar inconsistency:

>>> import numpy
>>> import warnings
>>> warnings.simplefilter("always", category=RuntimeWarning)
>>> numpy.min(numpy.full((7,), numpy.nan, dtype=numpy.float64))
nan
>>> numpy.min(numpy.full((8,), numpy.nan, dtype=numpy.float64))
/home/user/env/lib/python3.6/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce
  return umr_minimum(a, axis, None, out, keepdims)
nan

Why does a shape of 8 or more raise that RuntimeWarning?
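A small probe sketch to map where the threshold sits (results vary by build and platform, so the sizes that warn here are not guaranteed):

import numpy as np
import warnings

# Report which all-NaN array sizes emit the
# "invalid value encountered in reduce" warning on this build.
for n in range(1, 17):
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", category=RuntimeWarning)
        np.min(np.full((n,), np.nan, dtype=np.float64))
    print(n, "warns" if caught else "silent")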

@NotSqrt @seberg I'm seeing similar behaviour where min stops propagating NaNs correctly once the input array reaches size 8:

> cat np-min-wat.py
import numpy as np

print "numpy version:", np.__version__
print ""

def show_min(input):
    print ""
    arr = np.array(input)
    print arr.dtype, arr
    print "this should be nan as per docs:", arr.min()

input = [31., 487., np.nan, 10000., 10000., 19., 101., 22., 1000., 300., 10.]
for x in xrange(3, len(input) + 1):
    show_min(input[0:x])

This shows rather odd behaviour on OSX & Windows, but not Linux, in a fresh virtualenv using python 2.7.13 and numpy 1.14.2:

numpy version: 1.14.2

/Users/kip/ac/Environments/test/lib/python2.7/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce
  return umr_minimum(a, axis, None, out, keepdims)

float64 [ 31. 487.  nan]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000.]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000. 10000.]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000. 10000.    19.]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000. 10000.    19.   101.]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.]
this should be nan as per docs: 19.0

float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.]
this should be nan as per docs: 19.0

float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.   300.]
this should be nan as per docs: nan

float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.   300.
    10.]
this should be nan as per docs: nan

See the two "this should be nan as per docs: 19.0" lines in there?

Also n.b. the warning does not appear on numpy 1.13.1 (which is where I first observed this behaviour).

@kippr Where did you get NumPy?

I'm going to guess the eight-element boundary may be related to AVX512, and that the problem is some combination of compiler and CPU. What CPU/compiler are you seeing the problem on?

@juliantaylor Thoughts?

Alignment might also be having an effect.

Hi @charris

Thanks for taking a look at my post.. to answer your questions:

CPU:
โƒ sysctl -n machdep.cpu.brand_string Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

NumPy was installed through pip on my mac in a fresh virtualenv (python 2.7.13 installed through homebrew), so I guess it uses all the default compiler flags etc?

I just recreated the virtual env and reran pip install, and here is what looks pertinent to me, but there are a lot of messages that come by on install.. (full log attached.) Please let me know if there is other info I can fish out of the build directory or whatnot, or if there are compile flags to try.

[..]
 Downloading numpy-1.14.2.zip (4.9MB):                                                                                                                                                                                                                                            
  Downloading from URL https://pypi.python.org/packages/0b/66/86185402ee2d55865c675c06a5cfef742e39f4635a4ce1b1aefd20711c13/numpy-1.14.2.zip#md5=080f01a19707cf467393e426382c7619 (from https://pypi.python.org/simple/numpy/)                                                      
...Downloading numpy-1.14.2.zip (4.9MB): 4.9MB downloaded              
[..]
    building library "npymath" sources
    get_default_fcompiler: matching types: '['gnu95', 'nag', 'absoft', 'ibm', 'intel', 'gnu', 'g95', 'pg']'
    customize Gnu95FCompiler
    Found executable /usr/local/bin/gfortran
    customize Gnu95FCompiler
    customize Gnu95FCompiler using config
    C compiler: clang -fno-strict-aliasing -fno-common -dynamic -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes

    compile options: '-Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c'
    clang: _configtest.c
    clang _configtest.o -o _configtest
    success!
    removing: _configtest.c _configtest.o _configtest

Thanks
Kris

build.log

P.S. Probably no surprise, but the same behaviour appears with max:

numpy version: 1.14.2


arr.dtype, arr: float64 [ 31. 487.  nan]
/Users/kip/ac/Environments/meh/lib/python2.7/site-packages/numpy/core/_methods.py:29: RuntimeWarning: invalid value encountered in reduce
  return umr_minimum(a, axis, None, out, keepdims)
arr.min(): nan
/Users/kip/ac/Environments/meh/lib/python2.7/site-packages/numpy/core/_methods.py:26: RuntimeWarning: invalid value encountered in reduce
  return umr_maximum(a, axis, None, out, keepdims)
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 31.0

arr.dtype, arr: float64 [   31.   487.    nan 10000.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 31.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 31.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 19.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.   101.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 19.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.]
arr.min(): 19.0
arr.max(): 10000.0
np.amin(arr): 19.0
np.nanmin(arr): 19.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.]
arr.min(): 19.0
arr.max(): 10000.0
np.amin(arr): 19.0
np.nanmin(arr): 19.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.   300.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 19.0

arr.dtype, arr: float64 [   31.   487.    nan 10000. 10000.    19.   101.    22.  1000.   300.
    10.]
arr.min(): nan
arr.max(): nan
np.amin(arr): nan
np.nanmin(arr): 10.0

I am using the same numpy version (on arch) with a CPU that also has AVX2 and SSE4.2 capabilities, and I am not seeing it on linux. Maybe it has something to do with mac/clang?

EDIT/Amendment: (I experimented a bit with provoking different alignments; I did not check the warning part, but I do not see the wrong-result part.)
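For anyone who wants to repeat the alignment experiments: a sketch of one way to provoke differently aligned inputs, by slicing offset views out of a larger float64 buffer (each step shifts the data pointer by 8 bytes):

import numpy as np

base = np.arange(12.0)
base[5] = np.nan                    # every slice below still contains the NaN

for offset in range(4):
    view = base[offset:offset + 8]  # same length, different start address
    print(offset, view.ctypes.data % 16, np.min(view))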

I'm guessing that there is a cpu flag that is not getting set correctly, probably compiler related.

We should have a test that turns up this problem, if only to track it. @kippr What does

np.min(np.diagflat([np.nan]*10), axis=0)

do on your installation?

Hi @charris

That seems ok:

In [1]: import numpy as np

In [2]: np.min(np.diagflat([np.nan]*10), axis=0)
Out[2]: array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan])

But I tried a few more combos and found the following (see last one, 8 values and axis=1):

In [3]: np.min(np.diagflat([np.nan]*10), axis=1)
Out[3]: array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan])

In [4]: np.min(np.diagflat([np.nan]*8), axis=0)
Out[4]: array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan])

In [5]: np.min(np.diagflat([np.nan]*8), axis=1)
Out[5]: array([ nan,  nan,   0.,   0.,  nan,   0.,  nan,  nan])

Quite mysterious... not sure if that helps pin down the cause at all.

@seberg - yes that's right, I observed this behaviour on my macs and also on windows, but linux was fine.

Thanks
Kris

@kippr This is a really disturbing bug, as it affects a basic operation, and the problem probably doesn't originate with NumPy. We can try turning off vectorization on Mac and Windows and see if that helps. Do you only see this problem on Python 2.7?

I cannot see anything in NumPy that would have changed this behavior for 1.14. Maybe OpenBLAS is fiddling with the controls ...

@juliantaylor @VictorRodriguez Thoughts?

Hi @charris

I see this behaviour also in python 3 on my Mac:

Python 3.6.0 (default, Jan 23 2017, 20:09:28)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np

>>> np.min(np.diagflat([np.nan]*8), axis=1)
array([nan, nan,  0.,  0., nan,  0., nan, nan])
>>> np.__version__
'1.14.2'

N.b. I first noticed this behaviour in NumPy version 1.13 under python 2.7 on my Mac, so it was not a regression introduced in 1.14:

Python 2.7.10 (default, Feb  7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.min(np.diagflat([np.nan]*8), axis=1)
array([ nan,  nan,   0.,   0.,  nan,   0.,  nan,  nan])
>>> np.__version__
'1.13.1'

And reproduced it in cygwin / windows:

$ python
Python 2.7.13 (default, Mar 13 2017, 20:56:15)
[GCC 5.4.0] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.min(np.diagflat([np.nan]*8), axis=1)
array([ nan,  nan,   0.,   0.,  nan,   0.,  nan,  nan])
>>> np.__version__
'1.13.1'

But don't see this problem in linux:

Python 2.7.6 (default, Oct 26 2016, 20:32:47) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.min(np.diagflat([np.nan]*8), axis=1)
array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan])
>>> np.__version__
'1.13.1'

@charris In my work on performance optimizations for numpy I have seen that openblas and compiler flags can affect the behavior of numpy/scipy (based on my experience). The first step we could take is to reproduce this in a C test (using the openblas libraries) so we can isolate the behavior and see if it can be replicated. Also, just to double check: this is only on Mac and Windows, right? I can't see this behavior on Linux (FWIW the AVX patches go into 1.15, so they should not affect this). @kippr how did you build your numpy on mac/windows? Regards

Hi @VictorRodriguez

This was via pip install on both platforms, a fresh virtualenv in the Mac case. See above where I pasted the log for pip including compilation output for python 2.7 install. (Python 3 pip install seems to be a wheel.)

Cheers Kris

Hi folks
Is there any more info I can provide to help?

Thanks

On the mac (and maybe other platforms as well) the bug occurs because the compiler (on mac the default is clang/llvm, not gcc) reorders the SSE2 instructions in a problematic way. The bug in:

np.min(np.diagflat([np.nan]*8), axis=1)

is happening because the code that is being run is the sse2_minimum_DOUBLE generated from:
https://github.com/numpy/numpy/blob/d7d5cb3feccc1fc6cf57159e8b9fe0a733968706/numpy/core/src/umath/simd.inc.src#L1020

Specifically, let's look at this code (I've auto-expanded the SSE2 form) inside the function at line 1041:

        }
        c1 = _mm_min_pd(c1, c2);

        if (npy_get_floatstatus() & NPY_FPE_INVALID) {
            *op = NPY_NAN;
        }
        else {
            npy_double tmp = sse2_horizontal_min___m128d(c1);
            ....
         }

The compiler does not understand that the c1 assignment and npy_get_floatstatus() are related, and thus changes the code to:

        }
        if (npy_get_floatstatus() & NPY_FPE_INVALID) {
            *op = NPY_NAN;
        }
        else {
            c1 = _mm_min_pd(c1, c2);
            npy_double tmp = sse2_horizontal_min___m128d(c1);
            ....
         }

which is of course nonsense... I do not know the recommended way to make it not do this under optimizations. (I think other platforms have the STDC FENV_ACCESS pragma for this?)

Since the changes in this code are 5 years old, maybe the bug is caused by a new version of clang and different optimizations?

Thanks @tzickel

I've not written C since a high school class many moons ago, so I'm definitely not going to figure out the smart way to solve this, but I did try a few things to force the compiler to evaluate in the original order by changing the if statement of the above to:

if ((sizeof(c1) != 0) || (sizeof(c1) == 0) & npy_get_floatstatus() & NPY_FPE_INVALID) {

And indeed, once I do this, the bug disappears.

(I also tried a bunch of other, less dumb forms to get the compiler to evaluate _mm_min_pd ahead of the if statement, for example by putting references to c1 on both sides of the if statement and after the if/else block, all to no avail. I suspect it either reordered them anyway but always ran the c1 = _mm_min_pd assignment, or figured that my calls were actually no-ops of one form or another.)

But in any case I can confirm you've identified the bug I'm seeing; let's hope someone has a good way to hint to the compiler to leave the order of _mm_min_pd and npy_get_floatstatus alone.

OK, one other thing that Google suggested also worked: marking c1 as volatile:

https://github.com/numpy/numpy/blob/d7d5cb3feccc1fc6cf57159e8b9fe0a733968706/numpy/core/src/umath/simd.inc.src#L1029

becomes:

        /* load the first elements */
        @vtype@ volatile c1 = @vpre@_load_@vsuf@((@type@*)&ip[i]);
        @vtype@ c2 = @vpre@_load_@vsuf@((@type@*)&ip[i + stride]);
        i += 2 * stride;

But I'm not sure what implications that has, or whether it's the best way to accomplish the goal.

Thanks a lot for the extra debugging on this @tzickel, this was very deep debugging. @kippr I have a question: you mention it is a fresh pip install on a Mac, so it is not supposed to build from scratch, right? Also, if I run on my Linux system with gcc:

$ python
Python 3.6.5 (default, Apr 1 2018, 15:40:54)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import numpy as np
>>> np.version.version
'1.14.2'
>>> np.min([1., 2., 3., 4., np.nan])
nan
>>> np.min([1., 2., 3., np.nan, 4.])
nan
>>> np.min([1., 2., np.nan, 3., 4.])
nan
>>> np.min([1., np.nan, 2., 3., 4.])
nan
>>> np.min([np.nan, 1., 2., 3., 4.])
nan
>>> np.min([np.nan, 1.])
nan
>>> np.min([np.nan, 1., np.nan])
nan
>>> np.min([1., np.nan])
nan
>>> np.seterr(all='raise')
{'divide': 'warn', 'over': 'warn', 'under': 'ignore', 'invalid': 'warn'}
>>> np.min([np.nan, 1.0])
nan

and I see no warnings at all, so maybe, as described, the problem is with clang. The only thing that makes me wonder is why you are seeing this with pip install numpy. How was it built, and what flags were used?

Also no error with my mac :

https://hastebin.com/cuzinajero.swift

I wonder if this could be made safer with something like

if (c1 = _mm_min_pd(c1, c2), npy_get_floatstatus() & NPY_FPE_INVALID) {

@juliantaylor Thoughts?

Or even

return_nan = (c1 = _mm_min_pd(c1, c2), npy_get_floatstatus() & NPY_FPE_INVALID);
if (return_nan) {

The comma operator is supposed to be a sequence point, although it may not be thread safe. Hmm, I wonder how much in the simd code is thread safe.

@VictorRodriguez if I run pip install numpy on my mac with python2.7, it builds from source. If I run pip install with python3, it installs from a wheel.

I captured the full output of pip install: https://github.com/numpy/numpy/files/1912086/build.log

To test the changes above, I built from source (current master), but checked that without the tweak I was still seeing the broken behaviour. When building that I used the command line suggested in INSTALL.rst.txt:
python setup.py build -j 4 install --prefix $HOME/.local

A word of caution: someone here says they've seen the bug on Windows as well. I assume it was compiled not with clang on Windows but with MSVC or GCC, which might mean that numpy isn't respecting the rules of the platform. On Windows, numpy uses _statusfp to get the FP state, and the MSDN documentation states:

Many math library functions modify the floating-point status word, with unpredictable results. Optimization can reorder, combine, and eliminate floating-point operations around calls to _status87, _statusfp, and related functions. Use the /Od (Disable (Debug)) compiler option or the fenv_access pragma directive to prevent optimizations that reorder floating-point operations.

GCC and MSVC (but not clang for now) have a pragma to control this.

@tzickel pretty sure the windows build said it was with cygwin. It does not reproduce for me with MSVC.

Overall I like @charris's suggestion: it is standard C, requires no pragmas, and AFAICT does not add any overhead

/* use the comma operator to prevent optimization changing the order of evaluation */
if (c1 = _mm_min_pd(c1, c2), npy_get_floatstatus() & NPY_FPE_INVALID) {

which is supported even by the Visual Studio 8 compiler used for Python 2.7. The second, more compact suggestion seems too obfuscated.

Are there other places in the code that are susceptible to this bug?

  1. The cygwin build says GCC, which means the bug should act like linux, no?

  2. If I understand correctly, the comma operator does not help with optimizations; it works at another level of the code generation. The compiler is still free to assume both expressions are unrelated and to move them around. Here is a small example showing this (try swapping the commented code with the line next to it), check the minpd relative to the call to the outside function (make sure -O3 is in the compiler flags), and switch between GCC and CLANG (it seems GCC trunk now has the same bug :) ):

https://godbolt.org/g/Zoc5xr

  3. If you look at fenv.h, basically every function that accesses it has the same potential for issues, so everything in numpy/core/src/npymath/ieee754.c that calls them is affected.

  4. In my opinion, clang currently cannot produce safe code for this type of function with optimizations, so the options are:
    A. Compile with gcc (at least the official mac wheels), and produce a warning if compiled with clang.
    B. Clang supports an optnone attribute per function, which can be sprinkled on such functions to disable all optimizations (slower but correct code) when compiled with clang:
    https://clang.llvm.org/docs/AttributeReference.html#optnone-clang-optnone-clang-optnone

Switching to the comma operator won't help at all - ; is already a sequence point.

@eric-wieser Are you sure? I don't see ; listed as a sequence point. https://msdn.microsoft.com/en-us/library/azk8zbxd.aspx .

In any case, the order in which the operands of a comma operator are evaluated is guaranteed, which is not the case for statements separated by ;.

@charris unfortunately the suggested change doesn't seem to resolve the issue:

diff --git a/numpy/core/src/umath/simd.inc.src b/numpy/core/src/umath/simd.inc.src
index 2241414ac..8345e3ef7 100644
--- a/numpy/core/src/umath/simd.inc.src
+++ b/numpy/core/src/umath/simd.inc.src
@@ -1038,9 +1038,8 @@ sse2_@kind@_@TYPE@(@type@ * ip, @type@ * op, const npy_intp n)
             c1 = @vpre@_@VOP@_@vsuf@(c1, v1);
             c2 = @vpre@_@VOP@_@vsuf@(c2, v2);
         }
-        c1 = @vpre@_@VOP@_@vsuf@(c1, c2);

-        if (npy_get_floatstatus() & NPY_FPE_INVALID) {
+        if (c1 = @vpre@_@VOP@_@vsuf@(c1, c2), npy_get_floatstatus() & NPY_FPE_INVALID) {
             *op = @nan@;
         }
         else {

N.B. that marking the variable as volatile does fix the problem.

Do let me know if you want to try other variants.

It seems compilers are not adhering to the spec regarding comma operators, or at least not to the spec as we understand it. The simplest solution given the current state of compilers seems to be adding volatile, although it feels brittle.

Well, GCC 8.1 was just released, and guess what... GCC with optimizations now produces the same issue that clang does here https://github.com/numpy/numpy/issues/10370#issuecomment-384154230 (and the comma does not help there, as expected, but the volatile does). I don't know if distutils enables gcc flags which mitigate this; I would think not.

Here is the code on GCC 8.1 (you can compare to 7.3 where it's ok):
https://godbolt.org/g/AJRdRQ

Search there for the minpd and npy_get_floatstatus call in the asm.

Let's go with volatile for the time being, together with some tests that should hopefully signal the problem if it arises. Another option we might look at is a separate function where we could use volatile or compiler directives, but have it in one place to make it easier to manage.
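A sketch of what such a test might look like (the test actually added to numpy may differ): sweep sizes around the vector width and every NaN position, so both the peeled scalar edges and the SIMD core get exercised:

import numpy as np

def test_minmax_nan_propagation():
    for size in range(1, 20):
        for pos in range(size):
            arr = np.arange(float(size))
            arr[pos] = np.nan
            # Both reductions must propagate the NaN regardless of position.
            assert np.isnan(arr.min()), (size, pos)
            assert np.isnan(arr.max()), (size, pos)

test_minmax_nan_propagation()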

This is a much bigger case than this specific issue. All of the numpy code which calls functions in ieee754.c needs to be audited and fixed. (Maybe rename this issue yet again, or, better yet, open a new issue.)

The surprising behavior of the minpd instruction may be relevant for https://github.com/numpy/numpy/issues/10370#issuecomment-381241813:

If only one value is a NaN (SNaN or QNaN) for this instruction, the second operand (source operand), either a NaN or a valid floating-point value, is written to the result.

See: http://www.felixcloutier.com/x86/MINPD.html

@mattkretz doesn't seem so (that's expected behaviour, and why this check is already in place).

@tzickel Out of curiosity, at what level of optimization does the problem appear? We used to limit ourselves to -O2, but I think some platforms have been using -O3 or equivalent.

Similar to this gcc bug, I think: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=6065

It seems like setting #pragma STDC FENV_ACCESS ON is the correct solution here, but it requires C99, which we need to drop python 2.7 to use. Perhaps GCC's --std=c90 mode assumes it can make the optimization because the pragma is absent.

For all your questions, just go to the link here:

https://godbolt.org/g/AJRdRQ (it seems to be shared, so maybe copy the input to a new window)

and try changing the compiler version, compiler flags and code, and see the result on the fly...

This happens even at -O1 on both of the latest compilers. The #pragma does not seem to do anything.

That version is very complicated; there is too much assembler and it is hard to relate the assembler to the C code. The original version was much simpler and also demonstrated the problem.

See also the PR.

Hmm... can anybody tell me which OS / compiler numpy-1.14.3-cp27-cp27mu-manylinux1_x86_64.whl (from the pypi archive) was compiled on? I think there might be another (related) issue hiding here as well.

@tzickel I believe the default ubuntu trusty compiler is gcc 4.8, I can't see that anything more recent was installed.

@charris the manylinux container runs Centos5 with gcc 4.2 installed.

@ngoldbaum Thanks for the info. We will need to build wheels for Python 3.7 soon. Will that be as easy as adding an MB_PYTHON_VERSION=3.7 entry to the travis.yml build matrix?

I think you'll need to wait until the manylinux tooling gets updated. That took a few weeks after python3.6 came out IIRC. @njsmith probably knows more.

@ngoldbaum What drove the choice of gcc 4.2? Is it easy to use a later compiler version if we so desire?

What drove the choice of gcc 4.2?

I believe the goal was to enable compiling against a glibc old enough that binary-compatibility issues will not be a problem in practice.

Is it easy to use a later compiler version if we so desire?

I don't know. I also don't know how numpy manages building wheels. I'm using Matthew Brett's multibuild project for my projects and I'll need to wait until that's updated to build python3.7 wheels for my projects.

Ok, not sure if it's the final related issue, but I think I've found another issue in that SSE code:

In numpy 1.14.0 some code was added to throw the runtime warnings if there are FP errors:
https://github.com/numpy/numpy/commit/d47ca7b26172c42b01c3132d0e46e70578c8ea21

But if we look at the SSE implementation again:
https://github.com/numpy/numpy/blob/d7d5cb3feccc1fc6cf57159e8b9fe0a733968706/numpy/core/src/umath/simd.inc.src#L1020

We can see that it passes only the middle, aligned part of the array through the FPU: the header and trailer (which can't use SSE because they aren't memory aligned) are checked for NaN manually and never go via the FPU, while the SSE part does. Thus only a NaN in the middle part will trigger a warning, while one in the edges (although equivalent in input and output) won't. Is that OK?

np.min([1, np.nan, 1, 1, 1, 1, 1, 1]) won't trigger a runtime warning
but np.min([1, 1, np.nan, 1, 1, 1, 1, 1]) does.
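A small sketch that makes the asymmetry visible by recording warnings (the True/False comments reflect the report above for an affected build, not guaranteed behaviour):

import numpy as np
import warnings

def min_warns(values):
    # True if np.min(values) emits a RuntimeWarning on this build.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", category=RuntimeWarning)
        np.min(np.array(values))
    return bool(caught)

print(min_warns([1., np.nan, 1., 1., 1., 1., 1., 1.]))  # reportedly False
print(min_warns([1., 1., np.nan, 1., 1., 1., 1., 1.]))  # reportedly True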

np.min([1, np.nan, 1, 1, 1, 1, 1, 1]) won't trigger a runtime warning

@tzickel this is related to #11029, right?


It seems the source issue was #8954, which led to PR #9020

Yes, but my point is that #9020 did not cover all of the possible cases. One of them is this SSE code (which subverts that mechanism for the sake of optimization).

As for #11029, I'm still trying to figure out why, in the posts both here and there, besides the NaN propagation bug, warnings sometimes show up and sometimes don't.

Another question: if NaN is the outcome of both min and max whenever it exists in the input, and we already check isnan / invalid, shouldn't it fast-exit upon discovering the first NaN?
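At the Python level, the idea would look something like this hypothetical helper (a sketch only; a real fast exit would have to live inside the C reduction loop):

import numpy as np

def nan_aware_min(arr, block=4096):
    # Scan in blocks and bail out at the first NaN, mimicking the
    # early exit asked about above. Assumes a non-empty array.
    flat = np.asarray(arr, dtype=np.float64).ravel()
    for start in range(0, flat.size, block):
        if np.isnan(flat[start:start + block]).any():
            return np.nan
    return flat.min()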

@tzickel none of the reduction operations early-exit. They might in the future if they get refactored as gufuncs.
