Hello,
I'm trying to play with Keras and ResNet50. I was trying to do the following:
input_dim = (3, 224, 224)
input_a = Input(shape=input_dim)
input_b = Input(shape=input_dim)
base_model = ResNet50(weights='imagenet', include_top=False, input_tensor=None, input_shape=input_dim)
out_a_base = base_model (input_a)
out_b_base = base_model (input_b)
concatenated = merge([out_a_base,out_b_base], mode='sum')
model = Model(input=[input_a,input_b], output=distance )
This works, and model.compile works as well. But when I try to run
model.fit_generator(...)
it hangs for a long time and then, before training starts, it produces a long error message that ends with
Exception: ('The following error happened while compiling the node', GpuElemwise{RoundHalfToEven,no_inplace}(GpuElemwise{Composite{sqrt(clip(i0, i1, i2))},no_inplace}.0), '\n', 'nvcc return status', 2, 'for cmd', 'nvcc -shared -O3 --maxrregcount=32 -arch=sm_37 -m64 -Xcompiler -fno-math-errno,-Wno-unused-label,-Wno-unused-variable,-Wno-write-strings,-DCUDA_NDARRAY_CUH=c72d035fdf91890f3b36710688069b2e,-DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION,-fPIC,-fvisibility=hidden -Xlinker -rpath,/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -I/usr/local/cuda-8.0/include -I/home/oak/venv2/local/lib/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -I/home/oak/venv2/local/lib/python2.7/site-packages/theano/gof -I/home/oak/venv2/local/lib/python2.7/site-packages/theano/sandbox/cuda -o /home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/tmpKFacqe/1a0cc683bdd484bffedd2637c51df231.so mod.cu -L/home/oak/.theano/compiledir_Linux-4.4--generic-x86_64-with-Ubuntu-16.04-xenial-x86_64-2.7.12-64/cuda_ndarray -L/usr/lib -lcudart -lcublas -lcuda_ndarray -lpython2.7', '[GpuElemwise{RoundHalfToEven,no_inplace}(
I replaced the base model with a simpler structure like:
base_model= Sequential()
base_model.add(Flatten(input_shape=input_dim))
base_model.add(Dense(1024, activation='relu'))
base_model.add(Dropout(0.5))
base_model.add(Dense(1024, activation='relu'))
and fit_generator in this case works just fine. What could be the issue with ResNet50 and the merge layer?
Is it a memory issue?
EDIT:
Without the nvcc compiler (I took it out of the PATH), it does not crash, but it takes forever (many minutes) until it gets to the training phase.
EDIT2:
With the nvcc compiler and the Theano backend, it also crashes for the simple model.
This is a problem related to Theano. For some reason, it isn't able to
compile some GPU code.
Can you give the full error message? There is information in it that I'm
missing.
Fred
@nouiz It seems to be an issue with the RoundHalfToEven rounding mode. I think the master branch of Theano has fixed this (so I think this is a 0.8.2 issue). Another thread here.
The default mode was changed in the Keras backend fairly recently to match what TF does (I believe). The prior fix was just to change the rounding method back to half_away_from_zero.
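For anyone unsure what the two rounding modes mean, here is a minimal sketch in plain Python. It is not the Theano or Keras code; the helper names are made up for illustration, and it uses the standard decimal module only to show how ties (values ending in .5) are treated differently.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_half_to_even(x):
    """Ties go to the nearest even integer (hypothetical helper)."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


def round_half_away_from_zero(x):
    """Ties go away from zero; decimal names this mode ROUND_HALF_UP."""
    return int(Decimal(x).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# Only ties differ between the two modes, so the sample values all end in .5.
values = [0.5, 1.5, 2.5, -0.5, -2.5]
even = [round_half_to_even(v) for v in values]
away = [round_half_away_from_zero(v) for v in values]

print(even)  # [0, 2, 2, 0, -2]
print(away)  # [1, 2, 3, -1, -3]
```

The two modes agree everywhere except on exact ties, so switching between them should only change results for values that sit exactly halfway between two integers.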
@nouiz
Here is a gist with the full error log: https://gist.github.com/oak-tree/ccec4bf5ec0931c29a11629e1a0f9d46
@patyork looks like you are right. I installed the latest from the Theano repo with
pip install --upgrade git+https://github.com/Theano/Theano.git#egg=Theano
and it seems to fix this issue.