Incubator-mxnet: nd.reshape truncates values

Created on 27 Feb 2019 · 13 Comments · Source: apache/incubator-mxnet

reshape with a smaller shape silently truncates the values in the tensor, which is surprising and inconsistent with NumPy:

>>> import mxnet as mx
>>> a = mx.nd.arange(10)
>>> a

[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
<NDArray 10 @cpu(0)>
>>> b = a.reshape((1,2))
>>> b

[[0. 1.]]
<NDArray 1x2 @cpu(0)>
>>>

whereas in numpy:

>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.reshape((1,2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: cannot reshape array of size 10 into shape (1,2)
Labels: Bug, Operator

All 13 comments

@junrushao1994 @reminisce

There might be issues with shape inference.

Not sure about others, but I like this behavior.

It allows me to create a maximum-sized array in imperative mode and reshape it to the right size each time through the loop, at zero cost and with zero allocations.
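
A minimal sketch of that pattern (the buffer size and view shapes here are illustrative, not from the original comment):

>>> import mxnet as mx
>>> buf = mx.nd.empty(1024)     # one allocation at the maximum size
>>> v1 = buf.reshape((10,))     # zero-cost view over the first 10 elements
>>> v2 = buf.reshape((4, 64))   # another view; still no new allocation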

When running in training mode with autograd, though, this will give you the error you want:

>>> import mxnet as mx
>>> x = mx.nd.random.randn(10)
>>> x.reshape(1,2)

[[1.1630787 0.4838046]]
<NDArray 1x2 @cpu(0)>
>>> with mx.autograd.record():
...   x.reshape(1,2)
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python3.7/site-packages/mxnet/ndarray/ndarray.py", line 1062, in reshape
    ctypes.byref(handle)))
  File "/usr/local/lib/python3.7/site-packages/mxnet/base.py", line 251, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [15:15:07] src/ndarray/ndarray.cc:229: Check failed: shape_.Size() == shape.Size() (10 vs. 2) NDArray.Reshape: target shape must have the same size as current shape when recording with autograd.

@steventhornton @eric-haibin-lin What if we add a boolean flag to the API, indicating whether an error should be thrown in this situation?
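
For illustration, one hypothetical shape such a flag could take (the strict name and the flag itself are assumptions, not an existing MXNet API):

# Hypothetical sketch only: the `strict` flag does not exist in MXNet.
b = a.reshape((1, 2), strict=True)   # would raise an error, matching NumPy
b = a.reshape((1, 2), strict=False)  # would keep today's truncating behavior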

I'm okay if we add a boolean flag to turn this feature on. I'm even okay if we move this feature to a different operator if you want 100% parity with NumPy for all operators that share a name with NumPy operators. I just want to retain ability to use this feature :)

Thanks,
Stephen

I think it's a bug, not a feature.

I suggest that, by default, we throw an error if the shapes mismatch, and add an option to enable the truncating in-place reshape if users choose to.

@NonvolatileMemory Do you mean you think it's a bug because users should expect mx.nd.Reshape() to behave identically to NumPy's reshape()?

If so, I don't have an opinion at all; as mentioned, I am happy for this feature to be turned on optionally, or to be renamed to something else entirely, e.g. mx.nd.SomeNewOperator().

If you mean it's a bug for users to want this feature, then I disagree. For example, consider the world of NLP, where you have dynamically shaped inputs. Maybe one of my inputs is only 5 words long, and another is 10 words long. I might have a routine I want to run at inference time that needs an NDArray with one of its dimensions sized to fit however many words are in my utterance. I would like (1) to not have to do dynamic memory allocations during inference, and (2) to not have to allocate extra space for each shape I want to support. You can imagine it is convenient to be able to do something like this:

import mxnet as mx

def init():
    global some_array
    # Preallocate a single buffer at the maximum shape, once.
    some_array = mx.nd.empty(max_shape)

def infer():
    ...
    # Zero-cost view of the preallocated buffer at the current shape.
    some_array_local = some_array.reshape(current_shape)
    ...

Just to reiterate: it doesn't matter to me whether Reshape() retains this behavior, just that MXNet has some way of spelling it.

In my opinion, we should fix the bug in the Reshape op and add a new operator named adapted_crop to keep the useful behavior.

e.g.

a = nd.zeros((3,6,9))
b = nd.zeros((2,8,5))
a_crop, b_crop = nd.adapted_crop(a, b, mode='tiny', merge=False)
a_crop2, b_crop2 = nd.adapted_crop(a, b, mode='full', fill_value=3, merge=False)

It is equivalent to the following code:

# mode='tiny': crop each array to the elementwise-minimum shape, (2, 6, 5)
a_crop = a[:2, :6, :5]
b_crop = b[:2, :6, :5]

# mode='full': pad each array to the elementwise-maximum shape, (3, 8, 9)
a_crop2 = nd.full((3, 8, 9), fill_value=3)
a_crop2[:3, :6, :9] = a

b_crop2 = nd.full((3, 8, 9), fill_value=3)
b_crop2[:2, :8, :5] = b

Meanwhile, we could also add a flag to support a single output.

As for nd.reshape, this behavior is very dangerous. I would prefer nd.free_reshape or some other name.

Gentle ping.
I think it is a bug, which still exists in MXNet 1.5.0.

@reminisce do we have a NumPy-compatible reshape in mx.np.reshape?

@eric-haibin-lin Yes, there is one. It's just that it returns a copy of the input ndarray instead of a view.
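
For reference, a sketch of that NumPy-compatible behavior (this assumes a later MXNet release that ships the mx.np module; the exact error type may differ):

>>> from mxnet import np as mnp
>>> a = mnp.arange(10)
>>> a.reshape((1, 2))       # raises an error instead of silently truncating
>>> b = a.reshape((2, 5))   # valid, but b is a copy of a's data, not a view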
