Incubator-mxnet: The ** operator returns NaN on the Windows GPU build

Created on 26 Jan 2018 · 12 Comments · Source: apache/incubator-mxnet

Description

I found a bug when using the ** operator to compute gradient clipping on the GPU, for example:

import mxnet as mx
from mxnet import nd

p = nd.array([-7.11766724e-03, -1.11747989e-02, -3.17790220e-03, -2.94371421e-04], ctx=mx.gpu(0))
print(p)
print(p ** 2)

The result is as follows:

[-0.00711767 -0.0111748 -0.0031779 -0.00029437]

[ nan nan nan nan]
However, it works correctly on the CPU:

a=nd.array([-7.11766724e-03,-1.11747989e-02,-3.17790220e-03,-2.94371421e-04])
print(a)
print(a**2)
The result is as follows:

[-0.00711767 -0.0111748 -0.0031779 -0.00029437]

[ 5.06611868e-05 1.24876125e-04 1.00990619e-05 8.66545307e-08]
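
As a reference point, the CPU values above can be reproduced without MXNet. The sketch below uses NumPy (an assumption; the report itself only uses MXNet) to compute the expected squares in float32, which is MXNet's default dtype. It also checks that plain elementwise multiplication gives the same numbers, which suggests `p * p` as a possible workaround for `p ** 2` on the affected builds; that workaround has not been verified on the Windows GPU package here.

```python
import numpy as np

# Input values copied from the report, stored as float32.
values = np.array(
    [-7.11766724e-03, -1.11747989e-02, -3.17790220e-03, -2.94371421e-04],
    dtype=np.float32,
)

squared_pow = values ** 2      # power operator, as used in the report
squared_mul = values * values  # plain multiplication, no pow routine involved

print(squared_pow)
print(np.allclose(squared_pow, squared_mul))
```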

Environment info (Required)

----------Python Info----------
Version : 3.6.1
Compiler : MSC v.1900 64 bit (AMD64)
Build : ('default', 'May 11 2017 13:25:24')
Arch : ('64bit', 'WindowsPE')
------------Pip Info-----------
Version : 9.0.1
Directory : E:\Anaconda2\envs\gluon\lib\site-packages\pip
----------MXNet Info-----------
Version : 1.0.0
Directory : E:\Anaconda2\envs\gluon\lib\site-packages\mxnet
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform : Windows-10-10.0.15063-SP0
system : Windows
node : DESKTOP-OALBEUS
release : 10
version : 10.0.15063
----------Hardware Info----------
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
Name
Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz

Package used (Python/R/Scala/Julia):
(I'm using Python 3.6.1)

PS: The error also occurs in the following win_amd64 packages:

mxnet-cu80-1.0.1b20180102
mxnet-cu80-1.0.0
mxnet-cu80-1.0.0b20171213

python version: 3.6.0

Bug NDArray

All 12 comments

I have the same issue. However, it does not occur with Python 2.7 (mxnet-gpu version). I also use Windows 10. How can I fix it?

Thanks for the notifications. But I'm afraid I cannot help you here :o)

@yajiedesign @zhreshold would either of you take a look? Thanks.

Related post:
https://discuss.gluon.ai/t/topic/989

This issue has been raised by multiple users. Thanks.

Interestingly, I ran into it myself on Windows using the GPU as well: https://github.com/apache/incubator-mxnet/issues/9555

@Feywell @zhreshold Is the issue still there? I tested on Linux GPU and CPU with both MXNet 1.0.0 and the master version, and there is no issue; the behavior on CPU and GPU is the same. If the issue is still there, have you tried upgrading MXNet?

Sorry for the late reply. The problem has come back, and I'll try to find out why.

@cgraywang I find that the issue still exists.
I upgraded the mxnet-cu80 Windows version to

1.1.0b20180212

@Feywell OK, I will look into it. Since I need to set up a Windows machine, it may take some time, but I will get back to you ASAP.

@Feywell Just to follow up: we can now reproduce the error with mxnet-cu80 version 1.1.0. We are working on fixing the issue in the new release.

@cgraywang That's great! Looking forward to the new release.

It has now been determined to be caused by fastmath.
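
The thread does not spell out the mechanism. One common way fast-math modes approximate pow(x, y) is as exp2(y * log2(x)), and the logarithm of a negative number is NaN. Assuming that is what happens here, the failure can be emulated on the CPU with NumPy. `fast_pow` below is a hypothetical helper written for illustration; it is not an MXNet or CUDA function.

```python
import numpy as np

def fast_pow(base, exponent):
    """Emulate a log/exp based pow approximation: 2 ** (exponent * log2(base)).

    Hypothetical illustration only; this is an assumption about how a
    fast-math pow may behave, not code taken from MXNet.
    """
    base = np.asarray(base, dtype=np.float32)
    # log2 of a negative number is NaN; silence the NumPy warning for it.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.exp2(np.float32(exponent) * np.log2(base))

# Input values copied from the report.
values = np.array(
    [-7.11766724e-03, -1.11747989e-02, -3.17790220e-03, -2.94371421e-04],
    dtype=np.float32,
)

print(fast_pow(values, 2))          # NaN for every negative base
print(fast_pow(np.abs(values), 2))  # close to the true squares for positive bases
```

Under this assumption only negative bases are affected, which is consistent with the all-negative input in the report producing NaN in every position.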
