Keras: merging two ResNet50 branches causes fit_generator to crash after more than 5 minutes, before training starts

Created on 15 Feb 2017 · 4 Comments · Source: keras-team/keras

Hello,
I'm experimenting with Keras and ResNet50, and I tried the following:

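from keras.applications.resnet50 import ResNet50
from keras.layers import Input, merge
from keras.models import Model

input_dim = (3, 224, 224)  # channels-first image shape
input_a = Input(shape=input_dim)
input_b = Input(shape=input_dim)

# A single ResNet50 instance, so both branches share the same weights
base_model = ResNet50(weights='imagenet', include_top=False, input_tensor=None, input_shape=input_dim)

out_a_base = base_model(input_a)
out_b_base = base_model(input_b)
# Element-wise sum of the two branch outputs
concatenated = merge([out_a_base, out_b_base], mode='sum')
model = Model(input=[input_a, input_b], output=concatenated)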

This works, and model.compile works as well. But when I call

model.fit_generator(...)

it hangs for a long time and then, before training starts, it produces a long error message that ends with

Exception: ('The following error happened while compiling the node', GpuElemwise{RoundHalfToEven,no_inplace}(GpuElemwise{Composite{sqrt(clip(i0, i1, i2))},no_inplace}.0), '\n', 'nvcc return status', 2, 'for cmd', 'nvcc -shared -O3 --maxrregcount=32 -arch=sm_37 -m64 -Xcompiler -fno-math-errno,-Wno-unused-label,-Wno-unused-variable,-Wno-write-strings,-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden -Xlinker -rpath,/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/usr/local/cuda-8.0/include -I/home/oak/venv2/local/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -I/home/oak/venv2/local/lib/python2.7/site-packages/theano/gof -I/home/oak/venv2/local/lib/python2.7/site-packages/theano/sandbox/cuda -o /home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/tmpKFacqe/1a0cc683bdd484bffedd2637c51df231.so mod.cu -L/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -L/usr/lib -lcudart -lcublas -lcuda_ndarray -lpython2.7', '[GpuElemwise{RoundHalfToEven,no_inplace}(

I replaced the base model with a simpler structure like:
base_model = Sequential()
base_model.add(Flatten(input_shape=input_dim))
base_model.add(Dense(1024, activation='relu'))
base_model.add(Dropout(0.5))
base_model.add(Dense(1024, activation='relu'))

~~and fit_generator in this case is working just fine~~. What can be the issue with ResNet50 and the merge layer?

Is it a memory issue?

EDIT:

Without the nvcc compiler (I removed it from the PATH), it does not crash, but it takes a very long time (many minutes) to reach the training phase.

EDIT2:

With the nvcc compiler and the Theano backend, it also crashes for the simple model.

oak-tree

Most helpful comment

@patyork, it looks like you are right. I installed the latest version from the Theano repo with

pip install --upgrade git+https://github.com/Theano/Theano.git#egg=Theano

and it seems to fix this issue
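
To confirm which Theano build is actually active after such an upgrade, something like the following could be used. This is only a minimal sketch: the helper name `installed_version` is hypothetical, and it assumes Python 3.8+ (for `importlib.metadata`), whereas the thread itself uses Python 2.7.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string of the installed Theano distribution,
# or None if Theano is not installed in this environment.
print(installed_version("Theano"))
```

If the printed version is still 0.8.2, the upgrade did not take effect in the environment that Keras is running in.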

oak-tree on 16 Feb 2017

All 4 comments

This is a problem related to Theano. For some reason, it isn't able to
compile some GPU code.

Can you give the full error message? There is information in it that I'm
missing.

Fred

nouiz on 15 Feb 2017

@nouiz It seems to be an issue with the RoundHalfToEven rounding mode. I think the master branch of Theano has fixed this (so I think this is a 0.8.2 issue). Another thread here.

The default mode was changed in the Keras backend fairly recently to match what TF does (I believe). The prior fix was just to change the rounding method back to half_away_from_zero.
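
For readers unfamiliar with the two rounding modes named above, here is a small pure-Python illustration of how they differ on ties. It is only a sketch using the standard `decimal` module, not the Theano or Keras implementation, and the two helper function names are made up for this example.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_to_even(x):
    """Round to the nearest integer; ties go to the nearest even integer."""
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


def round_half_away_from_zero(x):
    """Round to the nearest integer; ties go away from zero."""
    # In the decimal module, ROUND_HALF_UP breaks ties away from zero.
    return int(Decimal(str(x)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


values = [0.5, 1.5, 2.5, -0.5, -2.5]
print([round_half_to_even(v) for v in values])         # [0, 2, 2, 0, -2]
print([round_half_away_from_zero(v) for v in values])  # [1, 2, 3, -1, -3]
```

The two modes agree everywhere except on exact halves, so switching between them changes results only for tie values.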