Fastai: Expected input batch_size (17664) to match target batch_size (32).

Created on 9 Apr 2019 · 3 Comments · Source: fastai/fastai

Describe the bug

learn.lr_find() fails with: Expected input batch_size (17664) to match target batch_size (32).

Provide your installation details

=== Software === 
python        : 3.7.3
fastai        : 1.0.51
fastprogress  : 0.1.20
torch         : 1.0.1.post2
nvidia driver : 410.79
torch cuda    : 8.0.61 / is available
torch cudnn   : 7102 / is enabled

=== Hardware === 
nvidia gpus   : 2
torch devices : 2
  - gpu0      : 11178MB | GeForce GTX 1080 Ti
  - gpu1      : 11178MB | GeForce GTX 1080 Ti

=== Environment === 
platform      : Linux-4.15.0-43-generic-x86_64-with-debian-stretch-sid
distro        : #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018
conda env     : /mnt/4A50ACA463E069FB/xxx/.env
python        : /mnt/4A50ACA463E069FB/xxx/.env/bin/python
sys.path      : /mnt/4A50ACA463E069FB/xxx
/mnt/4A50ACA463E069FB/xxx/.env/lib/python37.zip
/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7
/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/lib-dynload

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages
/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/IPython/extensions
/home/lixin/.ipython

To Reproduce

data_lm = TextDataBunch.from_csv(path, 'train.csv', 
test='test.csv', text_cols='comment_text', label_cols='label', bs=32)

data_lm.save('data_lm.pkl')
data_lm = load_data(path, 'data_lm.pkl', bs=32)

then

learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3, path='.')
learn.lr_find()

The output is:
LR Finder is complete, type {learner_name}.recorder.plot() to see the graph.

ValueError Traceback (most recent call last)
in
----> 1 learn.lr_find()
2 # learn.recorder.plot(skip_end=15)

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, wd)
30 cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
31 epochs = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 32 learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
33
34 def to_fp16(learn:Learner, loss_scale:float=None, max_noskip:int=1000, dynamic:bool=True, clip:float=None,

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
194 callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
195 if defaults.extra_callbacks is not None: callbacks += defaults.extra_callbacks
--> 196 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
197
198 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
98 for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
99 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 100 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
101 if cb_handler.on_batch_end(loss): break
102

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
27
28 if not loss_func: return to_detach(out), yb[0].detach()
---> 29 loss = loss_func(out, *yb)
30
31 if opt is not None:

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/fastai/layers.py in __call__(self, input, target, **kwargs)
265 if self.floatify: target = target.float()
266 input = input.view(-1,input.shape[-1]) if self.is_2d else input.view(-1)
--> 267 return self.func.__call__(input, target.view(-1), **kwargs)
268
269 def CrossEntropyFlat(*args, axis:int=-1, **kwargs):

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
902 def forward(self, input, target):
903 return F.cross_entropy(input, target, weight=self.weight,
--> 904 ignore_index=self.ignore_index, reduction=self.reduction)
905
906

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
1968 if size_average is not None or reduce is not None:
1969 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 1970 return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
1971
1972

/mnt/4A50ACA463E069FB/xxx/.env/lib/python3.7/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
1786 if input.size(0) != target.size(0):
1787 raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 1788 .format(input.size(0), target.size(0)))
1789 if dim == 2:
1790 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (17664) to match target batch_size (32).
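
The two numbers in the message can be related by simple arithmetic. As a rough sketch (plain Python, not fastai code), the snippet below mimics the two reshapes visible at lines 266-267 of the traceback, input.view(-1, input.shape[-1]) and target.view(-1), using shape tuples only. It assumes the language model emits one prediction per token, that 17664 = 32 * 552 reflects a padded sequence length of 552, and a hypothetical vocabulary size of 60000; none of these values is stated in the report.

```python
from math import prod

def rows_seen_by_loss(pred_shape, target_shape):
    """Row counts after pred.view(-1, pred.shape[-1]) and target.view(-1)."""
    pred_rows = prod(pred_shape[:-1])   # every axis except the class axis is flattened
    target_rows = prod(target_shape)    # the target is flattened completely
    return pred_rows, target_rows

bs = 32
seq_len = 17664 // bs   # 552, inferred from the error message (assumption)
vocab = 60000           # hypothetical vocabulary size, for illustration only

# One label per document (classifier-style targets) against per-token predictions.
clas_style = rows_seen_by_loss((bs, seq_len, vocab), (bs,))

# One next-token target per position (language-model-style targets).
lm_style = rows_seen_by_loss((bs, seq_len, vocab), (bs, seq_len))

print(clas_style)  # (17664, 32)
print(lm_style)    # (17664, 17664)
```

Under these assumptions, the mismatch is what one would expect when per-token predictions are compared against one label per document rather than one target per token.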

Expected behavior

learn.lr_find() should complete without raising an error.

Additional context

The same problem is also reported at
https://forums.fast.ai/t/v-1-0-50-lr-find-expected-input-batch-size-102512-to-match-target-batch-size-16/41721
If bs is not set, the call fails with an out-of-memory error instead.

Source

Cugtyt

All 3 comments

You should use TextLMDataBunch. TextDataBunch shouldn't be used directly, it only implements base methods for the subclasses TextLMDataBunch and TextClasDataBunch.

sgugger on 9 Apr 2019

👍3
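
For readers who want to see the suggestion above applied to the reproduction code, here is a minimal sketch. It assumes fastai 1.0.x and the same file and column names as in the report; the function name build_lm_learner is hypothetical, and the label column is left out on the assumption that a language model derives its own next-token targets.

```python
def build_lm_learner(path, bs=32):
    """Sketch: build the learner from the report using TextLMDataBunch."""
    # Imported lazily so the sketch can be loaded without fastai installed.
    from fastai.text import TextLMDataBunch, language_model_learner, AWD_LSTM

    # The language-model data bunch pairs each token with the next one,
    # so no label column is passed (assumption: labels are not needed here).
    data_lm = TextLMDataBunch.from_csv(
        path, 'train.csv', test='test.csv', text_cols='comment_text', bs=bs
    )
    # Same learner call as in the report, now fed language-model batches.
    return language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3, path='.')
```

Calling learn = build_lm_learner(path) and then learn.lr_find() would be the equivalent of the original steps; this has not been run against the reporter's data.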

I am having a similar issue:
"ValueError: Expected input batch_size (8) to match target batch_size (1382400)."

I'm following the same procedure as presented in https://nbviewer.jupyter.org/github/fastai/course-v3/blob/master/nbs/dl1/lesson3-camvid.ipynb but in my case I'm trying to implement it using squeezenet1_1 instead of resnet34.

The only change I've made to the code is to replace
learn = unet_learner(data, models.resnet34, metrics=metrics, wd=wd)
with
learn = cnn_learner(data, models.squeezenet1_1, metrics=accuracy)

However, executing lr_find() produces the following:

`ValueError Traceback (most recent call last)
in ()
----> 1 learn.lr_find()
2 learn.recorder.plot()

8 frames
/usr/local/lib/python3.6/dist-packages/fastai/train.py in lr_find(learn, start_lr, end_lr, num_it, stop_div, wd)
30 cb = LRFinder(learn, start_lr, end_lr, num_it, stop_div)
31 epochs = int(np.ceil(num_it/len(learn.data.train_dl)))
---> 32 learn.fit(epochs, start_lr, callbacks=[cb], wd=wd)
33
34 def to_fp16(learn:Learner, loss_scale:float=None, max_noskip:int=1000, dynamic:bool=True, clip:float=None,

/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
197 callbacks = [cb(self) for cb in self.callback_fns + listify(defaults.extra_callback_fns)] + listify(callbacks)
198 if defaults.extra_callbacks is not None: callbacks += defaults.extra_callbacks
--> 199 fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
200
201 def create_opt(self, lr:Floats, wd:Floats=0.)->None:

/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in fit(epochs, learn, callbacks, metrics)
99 for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
100 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 101 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
102 if cb_handler.on_batch_end(loss): break
103

/usr/local/lib/python3.6/dist-packages/fastai/basic_train.py in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
28
29 if not loss_func: return to_detach(out), yb[0].detach()
---> 30 loss = loss_func(out, *yb)
31
32 if opt is not None:

/usr/local/lib/python3.6/dist-packages/fastai/layers.py in __call__(self, input, target, **kwargs)
265 if self.floatify: target = target.float()
266 input = input.view(-1,input.shape[-1]) if self.is_2d else input.view(-1)
--> 267 return self.func.__call__(input, target.view(-1), **kwargs)
268
269 def CrossEntropyFlat(*args, axis:int=-1, **kwargs):

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
491 result = self._slow_forward(*input, **kwargs)
492 else:
--> 493 result = self.forward(*input, **kwargs)
494 for hook in self._forward_hooks.values():
495 hook_result = hook(self, input, result)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py in forward(self, input, target)
940 def forward(self, input, target):
941 return F.cross_entropy(input, target, weight=self.weight,
--> 942 ignore_index=self.ignore_index, reduction=self.reduction)
943
944

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
2054 if size_average is not None or reduce is not None:
2055 reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2056 return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
2057
2058

/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
1867 if input.size(0) != target.size(0):
1868 raise ValueError('Expected input batch_size ({}) to match target batch_size ({}).'
-> 1869 .format(input.size(0), target.size(0)))
1870 if dim == 2:
1871 ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

ValueError: Expected input batch_size (8) to match target batch_size (1382400).`

(P.s. I'm relatively inexperienced with fastai and ML in general)

WarrenPretorius on 29 May 2019

👍1

@WarrenPretorius (and everybody else with this error), I was having the same problem as you, then I realized that writing learn = cnn_learner(...) instead of learn = unet_learner(...) had something to do with this error… :facepalm:

Now I have a more manageable RuntimeError: CUDA error: device-side assert triggered, so the bulk is done.
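
The numbers in the segmentation variant of the error can be checked the same way. The sketch below uses plain shape tuples rather than fastai. It assumes a batch of 8 masks at 360 x 480 (since 1382400 = 8 * 360 * 480, which would match the CamVid notebook's half-resolution setting if left unchanged), a hypothetical count of 32 classes, and that the class axis is moved last before the flattening shown at lines 266-267 of the traceback.

```python
from math import prod

def rows_seen_by_loss(pred_shape, target_shape):
    """Row counts after pred.view(-1, pred.shape[-1]) and target.view(-1)."""
    return prod(pred_shape[:-1]), prod(target_shape)

bs, n_classes = 8, 32       # n_classes is an assumed value, for illustration
height, width = 360, 480    # assumed mask size: 1382400 // 8 == 360 * 480

mask_shape = (bs, 1, height, width)

# A classifier head (as built by cnn_learner) emits one score vector per image.
classifier_rows = rows_seen_by_loss((bs, n_classes), mask_shape)

# A segmentation head (as built by unet_learner) emits one score vector per pixel.
segmentation_rows = rows_seen_by_loss((bs, height, width, n_classes), mask_shape)

print(classifier_rows)    # (8, 1382400)
print(segmentation_rows)  # (1382400, 1382400)
```

Under these assumptions, a per-image classifier cannot line up with per-pixel mask targets, which is consistent with the error disappearing once unet_learner is used.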