I'm trying to build a CNN in Keras (TensorFlow backend) using the Model class API.
The model compiles without any issues and gets through the first epoch of training. At the end of the first epoch, while calculating the validation accuracy, the program crashes with the following error:
InvalidArgumentError (see above for traceback): Tensor must be 4-D with last dim 1, 3, or 4, not [1,5,5,32,1]
[[Node: conv2d_1/kernel_0_1 = ImageSummary[T=DT_FLOAT, bad_color=Tensor<type: uint8 shape: [4] values: 255 0 0...>, max_images=3, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_1/kernel_0_1/tag, ExpandDims_1/_351)]]
[[Node: batch_normalization_2/moments/sufficient_statistics/count/_445 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_226_batch_normalization_2/moments/sufficient_statistics/count", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
None of the tensors I'm feeding into the network, nor any of the layers, have more than 4 dimensions (including batch size). The node conv2d_1/kernel_0_1 does not have any data flow edge of that size either. If I build the model again, the error occurs at a different Conv2D node. I'm not sure what's causing this error (especially since it occurs only during validation).
Setup - tensorflow 1.0.1 + keras 2.0.3 + python 3.5.3 + NVIDIA GTX 960M
The issue is with the TensorBoard callback, and only when write_graph=True, write_images=True. If I don't use that callback, or if I set write_graph=False, write_images=False, everything works fine for both random arrays and images.
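One way the 5-D shape in the error message could arise from a weight tensor, sketched in plain Python on shape tuples. This assumes (it is not confirmed in this thread) that the callback squeezes the kernel and then wraps it with a leading batch axis and a trailing channel axis before writing the image summary. The helper names `squeeze_shape`, `wrap_for_image_summary`, and `is_valid_image_shape` are hypothetical and exist only for illustration.

```python
def squeeze_shape(shape):
    """Drop every axis of length 1, as a squeeze operation would."""
    return tuple(d for d in shape if d != 1)


def wrap_for_image_summary(shape):
    """Add a leading batch axis and a trailing channel axis (assumed behaviour)."""
    return (1,) + tuple(shape) + (1,)


def is_valid_image_shape(shape):
    """Rule quoted in the error: 4-D with last dim 1, 3, or 4."""
    return len(shape) == 4 and shape[-1] in (1, 3, 4)


# Kernel of the first Conv2D: 5x5 window, 1 input channel, 32 filters.
kernel_shape = (5, 5, 1, 32)
summary_shape = wrap_for_image_summary(squeeze_shape(kernel_shape))

print(summary_shape)                        # (1, 5, 5, 32, 1)
print(is_valid_image_shape(summary_shape))  # False
```

Under that assumption, the offending tensor is a layer weight rather than any of the input arrays, which would be consistent with the error disappearing when write_images=False.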
Here's the code (I've skipped the data preprocessing part).
Run the code below to reproduce the error:
from keras.models import Model
from keras.layers import (
Input,
Activation,
Dense,
Flatten,
Reshape
)
from keras.layers.convolutional import (
Conv2D,
MaxPooling2D,
AveragePooling2D
)
from keras.layers.merge import add,concatenate
from keras.layers.normalization import BatchNormalization
from keras.regularizers import l2
from keras import backend as K
import numpy as np
from keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping, TensorBoard
image_train = np.random.rand(1000,72,120,1)
data_train = np.random.rand(1000,2)
target_train = np.random.rand(1000,2)
batch_size = 100
input_shape = (72,120,1)
input_two_shape = (2,)
ROW_AXIS = 1
COL_AXIS = 2
CHANNEL_AXIS = 3
nb_epoch = 2
lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1), cooldown=0, patience=5, min_lr=0.5e-6)
early_stopper = EarlyStopping(min_delta=0.001, patience=10)
tbCallBack = TensorBoard(log_dir='./Graphs', histogram_freq= 1 , write_graph=True, write_images=True)
input_one = Input(shape=input_shape, name = 'Input_One')
input_two = Input(shape = input_two_shape, name = 'Input_Two')
conv_layer = Conv2D(filters= 32, kernel_size=(5, 5), strides=(2, 2), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(input_one)
batchNorm_layer1 = BatchNormalization(axis=CHANNEL_AXIS)(conv_layer)
relu_layer1 = Activation("relu")(batchNorm_layer1)
conv_layer1 = Conv2D(filters= 64, kernel_size=(3, 3), strides=(1,1), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer1)
batchNorm_layer2 = BatchNormalization(axis=CHANNEL_AXIS)(conv_layer1)
relu_layer2 = Activation("relu")(batchNorm_layer2)
conv_layer2 = Conv2D(filters= 64, kernel_size=(3, 3), strides=(1,1), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer2)
batchNorm_layer3 = BatchNormalization(axis=CHANNEL_AXIS)(conv_layer2)
relu_layer3 = Activation("relu")(batchNorm_layer3)
conv_layer3 = Conv2D(filters= 64, kernel_size=(3, 3), strides=(1,1), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer3)
add_layer13 = add([conv_layer1, conv_layer3])
batchNorm_layer4 = BatchNormalization(axis=CHANNEL_AXIS)(add_layer13)
relu_layer4 = Activation("relu")(batchNorm_layer4)
conv_layer4 = Conv2D(filters= 128, kernel_size=(3, 3), strides=(2,2), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer4)
batchNorm_layer5 = BatchNormalization(axis=CHANNEL_AXIS)(conv_layer4)
relu_layer5 = Activation("relu")(batchNorm_layer5)
conv_layer5 = Conv2D(filters= 128, kernel_size=(3, 3), strides=(1,1), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer5)
batchNorm_layer6 = BatchNormalization(axis=CHANNEL_AXIS)(conv_layer5)
relu_layer6 = Activation("relu")(batchNorm_layer6)
conv_layer6 = Conv2D(filters= 128, kernel_size=(3, 3), strides=(1,1), padding= 'same' , kernel_initializer='he_normal', kernel_regularizer=l2(1.e-4))(relu_layer6)
add_layer46 = add([conv_layer4, conv_layer6])
batchNorm_layer7 = BatchNormalization(axis=CHANNEL_AXIS)(add_layer46)
relu_layer7 = Activation("relu")(batchNorm_layer7)
head_shape = K.int_shape(relu_layer7)
pool_layer = AveragePooling2D(pool_size=(head_shape[ROW_AXIS], head_shape[COL_AXIS]),strides=(1, 1))(relu_layer7)
flat_end = Reshape((128,))(pool_layer)
fully_connected = concatenate([flat_end, input_two], axis = 1)
dense_1 = Dense(units=100, kernel_initializer="he_normal", activation="relu")(fully_connected)
dense_2 = Dense(units = 2, kernel_initializer="he_normal", activation="linear", name = 'Output_2')(dense_1)
model = Model(inputs = [input_one, input_two], outputs=dense_2)
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit({'Input_One' : image_train, 'Input_Two' : data_train}, {'Output_2' : target_train }, batch_size=batch_size, epochs = nb_epoch, shuffle=True, callbacks=[lr_reducer, early_stopper,tbCallBack], validation_split = 0.01, verbose = 1)
Does it work if you set the histogram_freq argument to 0?
tbCallBack = TensorBoard(log_dir='./Graphs', histogram_freq=0, write_graph=True, write_images=True)
@karimpedia It does work! But I don't have any distribution/histogram data now?
Same issue with same configuration as Chandrahas1991, except for python 2.7 and gtx 980M.
@gokceneraslan could you please take a look?
This means the tensor you input must be 4-dimensional.
A shape like [512,512,1] will not work; change it to [1,512,512,1].
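A minimal NumPy sketch of what that suggestion amounts to for an input array. The `image` array is hypothetical; note that the arrays in the original post are already 4-D (for example `image_train` has shape (1000,72,120,1)), so this may not be the cause there.

```python
import numpy as np

# Hypothetical single grayscale image with (height, width, channels) axes.
image = np.random.rand(512, 512, 1)

# Add a leading batch axis so the array is 4-D: (batch, height, width, channels).
batched = np.expand_dims(image, axis=0)

print(image.shape)    # (512, 512, 1)
print(batched.shape)  # (1, 512, 512, 1)
```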