The current `gluon.utils.split_data()` has:

```python
step = size // num_slice

# If size < num_slice, make fewer slices
if not even_split and size < num_slice:
    step = 1
    num_slice = size

if batch_axis == 0:
    slices = [data[i*step:(i+1)*step] if i < num_slice - 1 else data[i*step:size]
              for i in range(num_slice)]
```
Consider an example: we have a tensor of shape (31, *) that we want to split into 8 slices. According to the function, `step` will be 31 // 8 = 3, so the tensor is split into 8 tensors of sizes [3, 3, 3, 3, 3, 3, 3, 10], where the last slice is excessively large. A better result would be [4, 4, 4, 4, 4, 4, 4, 3].
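For reference, here is a quick sketch of that arithmetic on a plain Python list (a stand-in for the tensor, not real MXNet code):

```python
size, num_slice = 31, 8
step = size // num_slice  # 3
data = list(range(size))
# Same slicing as split_data: fixed step, remainder dumped into the last slice
slices = [data[i*step:(i+1)*step] if i < num_slice - 1 else data[i*step:size]
          for i in range(num_slice)]
print([len(s) for s in slices])  # [3, 3, 3, 3, 3, 3, 3, 10]
```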
Maybe we can follow `np.array_split()`?
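For comparison, `np.array_split` already produces the balanced partition for this case:

```python
import numpy as np

parts = np.array_split(np.arange(31), 8)
print([len(p) for p in parts])  # [4, 4, 4, 4, 4, 4, 4, 3]
```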
```python
# Bounds for slice `index`: the first `length % num_slice`
# slices get one extra element (np.array_split style).
slice_len = length // num_slice
rest = length % num_slice
start = slice_len * index + min(index, rest)
end = start + slice_len + (index < rest)
```
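A minimal check of the formula (the `slice_bounds` helper is just for illustration, not part of gluon):

```python
def slice_bounds(length, num_slice, index):
    # Same formula as above, wrapped as a function for testing
    slice_len = length // num_slice
    rest = length % num_slice
    start = slice_len * index + min(index, rest)
    end = start + slice_len + (index < rest)
    return start, end

sizes = [e - s for s, e in (slice_bounds(31, 8, i) for i in range(8))]
print(sizes)  # [4, 4, 4, 4, 4, 4, 4, 3]
```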
Thank you, this is a clean solution.
Following np.array_split is a good idea. It should have been done from the beginning. Would you like to create a PR?
@leezu Yes
@leezu @zburning How about labeling it as a performance issue?