Incubator-mxnet: #12285 Breaks NDArrayIter For 3D Arrays

Created on 12 Sep 2018 · 10 Comments · Source: apache/incubator-mxnet

Description

Our MNIST smokescreen tests fail with an index-out-of-range error in the latest build (mxnet-1.3.0b20180911) as a result of PR #12285.

Environment info (Required)

Breaks on both Linux and OSX. Build mxnet-1.3.0b20180909 is fine; build 1.3.0b20180911 is faulty.

Bug NDArray

All 10 comments

Can you provide more details? For example, the length of the NDArray and the batch size?

I'd like to understand the issue with an example. If this is critical and will take time to fix, we can revert that commit until we find the root cause.

@sandeep-krishnamurthy @zhreshold

Here's a specific example from the Nightly Binary test which just failed:

[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] IndexError                                Traceback (most recent call last)
[StraightDope: Python2 Single-GPU] <ipython-input-8-d40071ee971d> in <module>()
[StraightDope: Python2 Single-GPU]      20     train_data.reset()
[StraightDope: Python2 Single-GPU]      21     iter = 0
[StraightDope: Python2 Single-GPU] ---> 22     for batch in train_data:
[StraightDope: Python2 Single-GPU]      23         ############################
[StraightDope: Python2 Single-GPU]      24         # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] /work/mxnet/python/mxnet/io/io.pyc in next(self)
[StraightDope: Python2 Single-GPU]     678             raise StopIteration
[StraightDope: Python2 Single-GPU]     679         data = self.getdata()
[StraightDope: Python2 Single-GPU] --> 680         label = self.getlabel()
[StraightDope: Python2 Single-GPU]     681         # iter should stop when last batch is not complete
[StraightDope: Python2 Single-GPU]     682         if data[0].shape[0] != self.batch_size:
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] /work/mxnet/python/mxnet/io/io.pyc in getlabel(self)
[StraightDope: Python2 Single-GPU]     748     def getlabel(self):
[StraightDope: Python2 Single-GPU]     749         """Get label."""
[StraightDope: Python2 Single-GPU] --> 750         return self._batchify(self.label)
[StraightDope: Python2 Single-GPU]     751
[StraightDope: Python2 Single-GPU]     752     def getpad(self):
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] /work/mxnet/python/mxnet/io/io.pyc in _batchify(self, data_source)
[StraightDope: Python2 Single-GPU]     730             self.cursor + self.batch_size > self.num_data:
[StraightDope: Python2 Single-GPU]     731             pad = self.batch_size - self.num_data + self.cursor
[StraightDope: Python2 Single-GPU] --> 732             first_data = self._getdata(data_source, start=self.cursor)
[StraightDope: Python2 Single-GPU]     733             second_data = self._getdata(data_source, end=pad)
[StraightDope: Python2 Single-GPU]     734             return self._concat(first_data, second_data)
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] /work/mxnet/python/mxnet/io/io.pyc in _getdata(self, data_source, start, end)
[StraightDope: Python2 Single-GPU]     692         assert start is not None or end is not None, 'should at least specify start or end'
[StraightDope: Python2 Single-GPU]     693         start = start if start is not None else 0
[StraightDope: Python2 Single-GPU] --> 694         end = end if end is not None else data_source[0][1].shape[0]
[StraightDope: Python2 Single-GPU]     695         s = slice(start, end)
[StraightDope: Python2 Single-GPU]     696         return [
[StraightDope: Python2 Single-GPU]
[StraightDope: Python2 Single-GPU] IndexError: list index out of range

http://jenkins.mxnet-ci.amazon-ml.com/job/NightlyTests_onBinaries/148/console

Two notebooks from The Straight Dope book both reproduce the out-of-bounds error:
chapter14_generative-adversarial-networks/dcgan and
chapter14_generative-adversarial-networks/pixel2pixel available at
https://github.com/zackchase/mxnet-the-straight-dope
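
For readers following the traceback, here is a minimal pure-Python sketch of the failing pattern. It assumes the iterator was built without a label, so the label source is an empty list and `data_source[0]` has nothing to index. The function names and the guard are hypothetical stand-ins for illustration; they are not the MXNet source and not necessarily the patch that was merged.

```python
# Illustrative stand-ins only: these mirror names from the traceback but are
# NOT the MXNet source, and the guard is not necessarily the merged patch.

def getdata_unguarded(data_source, start=None, end=None):
    """Slice every (name, array) pair; indexes data_source[0] unconditionally."""
    assert start is not None or end is not None, 'should at least specify start or end'
    start = start if start is not None else 0
    # Fails with IndexError when data_source is an empty list.
    end = end if end is not None else len(data_source[0][1])
    return [(name, arr[start:end]) for name, arr in data_source]


def getdata_guarded(data_source, start=None, end=None):
    """Hypothetical guard: an empty source yields an empty result."""
    if not data_source:
        return []
    return getdata_unguarded(data_source, start=start, end=end)


# Assumed setup: 10 samples, batch size 4, cursor at 8, so the last batch
# wraps around with pad = batch_size - num_data + cursor = 2.
num_data, batch_size, cursor = 10, 4, 8
pad = batch_size - num_data + cursor
data = [('data', list(range(num_data)))]
label = []  # assumption: no label was passed, so the label source is empty

# The data path works: tail of the array plus `pad` samples from the front.
first = getdata_unguarded(data, start=cursor)
second = getdata_unguarded(data, end=pad)
wrapped = first[0][1] + second[0][1]

# The label path hits the same kind of failure as the traceback.
try:
    getdata_unguarded(label, start=cursor)
    outcome = 'no error'
except IndexError as exc:
    outcome = 'IndexError: ' + str(exc)

guarded = getdata_guarded(label, start=cursor)
```

In this sketch the data path returns a wrapped batch, the unguarded label path raises `IndexError`, and the guarded variant returns an empty list.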

Vishaal

Thanks Vishaal.

On a side note, how did we miss the nightly master build failure? We need to revisit that.

Thanks @vishaalkapoor.
I'm working on it.

Thanks for submitting the issue @iamthebot
@mxnet-label-bot[NDArray, Bug]

@stu1130 Do we still need a repro? Sorry, I haven't gotten around to it.

@iamthebot I was able to find the root cause, so don't worry about it. Thanks.

@iamthebot
Could you share a repro showing the data shape and how you initialized and used the NDArrayIter?
We would like to make sure all existing use cases work!
Thank you so much!

The patch is merged. @sandeep-krishnamurthy, please close this issue.
