TensorFlow reshape error with custom pooling/unpooling layer
I am trying to implement a smaller-scale version of SegNet, described in this paper (https://arxiv.org/pdf/1511.00561.pdf), tailored towards detecting edges.
Dataset:
I am using the BSDS500 boundary dataset. I cropped and rotated the images so their sizes are 320x480x3 instead of 321x481x3.
Input shapes, 200 training images and 100 validation images:
x_train: (200, 320, 480, 3)
x_val: (100, 320, 480, 3)
y_train: (200, 153600)
y_val: (100, 153600)
Framework:
I am using Keras with the TensorFlow backend.
These are the functions I am using for the custom pooling and unpooling layers:
import tensorflow as tf

def pool_argmax2D(x, pool_size=(2,2), strides=(2,2)):
    padding = 'SAME'
    pool_size = [1, pool_size[0], pool_size[1], 1]
    strides = [1, strides[0], strides[1], 1]
    ksize = [1, pool_size[0], pool_size[1], 1]
    output, argmax = tf.nn.max_pool_with_argmax(
        x,
        ksize = ksize,
        strides = strides,
        padding = padding
    )
    return [output, argmax]
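One detail in pool_argmax2D may be worth double-checking, independently of the reshape error: pool_size is overwritten with a 4-element list before ksize is built from it, so ksize indexes into the already-expanded list. A plain-Python trace of just those assignments (no TensorFlow needed; the helper name build_ksize is hypothetical and only mirrors the lines above) shows the value that would be passed:

```python
def build_ksize(pool_size=(2, 2)):
    # Same assignments, in the same order, as in pool_argmax2D.
    pool_size = [1, pool_size[0], pool_size[1], 1]  # expanded to 4 elements
    ksize = [1, pool_size[0], pool_size[1], 1]      # indexes the expanded list
    return ksize


ksize = build_ksize((2, 2))
print(ksize)  # [1, 1, 2, 1]
```

If a 2x2 window is intended, passing the expanded pool_size list directly as ksize would presumably give [1, 2, 2, 1]. The output shapes in the summary would look the same either way, since with 'SAME' padding they are determined by the strides.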
def unpool2D(pool, argmax, ksize=(2,2)):
    with tf.variable_scope("unpool"):
        input_shape = tf.shape(pool)
        output_shape = [input_shape[0],
                        input_shape[1] * ksize[0],
                        input_shape[2] * ksize[1],
                        input_shape[3]]
        flat_input_size = tf.cumprod(input_shape)[-1]
        flat_output_shape = tf.cast([output_shape[0],
                                     output_shape[1] * output_shape[2] * output_shape[3]], tf.int64)
        pool_ = tf.reshape(pool, [flat_input_size])
        batch_range = tf.reshape(tf.range(tf.cast(output_shape[0], tf.int64), dtype=tf.int64),
                                 shape=[input_shape[0], 1, 1, 1])
        b = tf.ones_like(argmax) * batch_range
        b = tf.reshape(b, [flat_input_size, 1])
        ind_ = tf.reshape(argmax, [flat_input_size, 1]) % flat_output_shape[1]
        ind_ = tf.concat([b, ind_], 1)
        ret = tf.scatter_nd(ind_, pool_, shape=flat_output_shape)
        ret = tf.reshape(ret, output_shape)
        return ret
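To make the intent of the two functions concrete, here is a minimal pure-Python sketch of argmax pooling followed by scatter-style unpooling. It is an illustration under simplifying assumptions, not the TensorFlow implementation: a single image, a single channel, even height and width, a fixed 2x2 window with stride 2, and a flat index of y * W + x. The names max_pool_with_argmax_2x2 and unpool_2x2 are hypothetical.

```python
def max_pool_with_argmax_2x2(image):
    """2x2, stride-2 max pooling of one single-channel image (list of rows).

    Returns the pooled values and, for each one, the flat index
    y * width + x of the position the maximum came from.
    """
    height, width = len(image), len(image[0])
    pooled, argmax = [], []
    for top in range(0, height, 2):
        pooled_row, argmax_row = [], []
        for left in range(0, width, 2):
            window = [(image[y][x], y * width + x)
                      for y in range(top, top + 2)
                      for x in range(left, left + 2)]
            value, index = max(window)
            pooled_row.append(value)
            argmax_row.append(index)
        pooled.append(pooled_row)
        argmax.append(argmax_row)
    return pooled, argmax


def unpool_2x2(pooled, argmax, out_height, out_width):
    """Scatter each pooled value back to its recorded position; zeros elsewhere."""
    flat = [0] * (out_height * out_width)
    for pooled_row, argmax_row in zip(pooled, argmax):
        for value, index in zip(pooled_row, argmax_row):
            flat[index] = value
    return [flat[r * out_width:(r + 1) * out_width] for r in range(out_height)]


image = [[1, 5, 2, 0],
         [3, 4, 7, 6],
         [9, 8, 1, 2],
         [0, 1, 3, 4]]
pooled, argmax = max_pool_with_argmax_2x2(image)
restored = unpool_2x2(pooled, argmax, 4, 4)
print(pooled)    # [[5, 7], [9, 4]]
print(argmax)    # [[1, 6], [8, 15]]
print(restored)  # [[0, 5, 0, 0], [0, 0, 7, 0], [9, 0, 0, 0], [0, 0, 0, 4]]
```

The scatter step pairs every pooled value with exactly one index, so the values and the indices need to have the same number of elements. unpool2D relies on the same pairing when it reshapes both pool and argmax to flat_input_size.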
This is the code for the model:
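from keras.layers import Input, Conv2D, BatchNormalization, Activation, Lambda, Flatten
from keras.models import Model

batch_size = 4
kernel = 3
pool_size = (2,2)
img_shape = (320,480,3)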
inputs = Input(shape=img_shape, name='main_input')
conv_1 = Conv2D(32, (kernel, kernel), padding="same")(inputs)
conv_1 = BatchNormalization()(conv_1)
conv_1 = Activation("relu")(conv_1)
conv_2 = Conv2D(32, (kernel, kernel), padding="same")(conv_1)
conv_2 = BatchNormalization()(conv_2)
conv_2 = Activation("relu")(conv_2)
pool_1, mask_1 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_2)
conv_3 = Conv2D(64, (kernel, kernel), padding="same")(pool_1)
conv_3 = BatchNormalization()(conv_3)
conv_3 = Activation("relu")(conv_3)
conv_4 = Conv2D(64, (kernel, kernel), padding="same")(conv_3)
conv_4 = BatchNormalization()(conv_4)
conv_4 = Activation("relu")(conv_4)
pool_2, mask_2 = Lambda(pool_argmax2D, arguments={'pool_size': pool_size, 'strides': pool_size})(conv_4)
conv_5 = Conv2D(64, (kernel, kernel), padding="same")(pool_2)
conv_5 = BatchNormalization()(conv_5)
conv_5 = Activation("relu")(conv_5)
unpool_1 = Lambda(unpool2D, output_shape = (160,240,64), arguments={'ksize':pool_size, 'argmax': mask_2})(conv_5)
conv_6 = Conv2D(64, (kernel, kernel), padding="same")(unpool_1)
conv_6 = BatchNormalization()(conv_6)
conv_6 = Activation("relu")(conv_6)
conv_7 = Conv2D(64, (kernel, kernel), padding="same")(conv_6)
conv_7 = BatchNormalization()(conv_7)
conv_7 = Activation("relu")(conv_7)
unpool_2 = Lambda(unpool2D, output_shape = (320,480,64), arguments={'ksize':pool_size, 'argmax': mask_1})(conv_7)
conv_8 = Conv2D(32, (kernel, kernel), padding="same")(unpool_2)
conv_8 = BatchNormalization()(conv_8)
conv_8 = Activation("relu")(conv_8)
conv_9 = Conv2D(32, (kernel, kernel), padding="same")(conv_8)
conv_9 = BatchNormalization()(conv_9)
conv_9 = Activation("relu")(conv_9)
conv_10 = Conv2D(1, (1, 1), padding="same")(conv_9)
conv_10 = BatchNormalization()(conv_10)
flatten_1 = Flatten()(conv_10)
outputs = Activation('softmax')(flatten_1)
model = Model(inputs=inputs, outputs=outputs)
The model compiles properly when I run:
model.compile(optimizer='adam', loss='mean_absolute_error', metrics=['accuracy'])
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
main_input (InputLayer)      (None, 320, 480, 3)       0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 320, 480, 32)      896
_________________________________________________________________
batch_normalization_1 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_1 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 320, 480, 32)      9248
_________________________________________________________________
batch_normalization_2 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_2 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
lambda_1 (Lambda)            [(None, 160, 240, 32), (N 0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 160, 240, 64)      18496
_________________________________________________________________
batch_normalization_3 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_3 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_4 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_4 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
lambda_2 (Lambda)            [(None, 80, 120, 64), (No 0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 80, 120, 64)       36928
_________________________________________________________________
batch_normalization_5 (Batch (None, 80, 120, 64)       256
_________________________________________________________________
activation_5 (Activation)    (None, 80, 120, 64)       0
_________________________________________________________________
lambda_3 (Lambda)            (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_6 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_6 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 160, 240, 64)      36928
_________________________________________________________________
batch_normalization_7 (Batch (None, 160, 240, 64)      256
_________________________________________________________________
activation_7 (Activation)    (None, 160, 240, 64)      0
_________________________________________________________________
lambda_4 (Lambda)            (None, 320, 480, 64)      0
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 320, 480, 32)      18464
_________________________________________________________________
batch_normalization_8 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_8 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 320, 480, 32)      9248
_________________________________________________________________
batch_normalization_9 (Batch (None, 320, 480, 32)      128
_________________________________________________________________
activation_9 (Activation)    (None, 320, 480, 32)      0
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 320, 480, 1)       33
_________________________________________________________________
batch_normalization_10 (Batc (None, 320, 480, 1)       4
_________________________________________________________________
flatten_1 (Flatten)          (None, 153600)            0
_________________________________________________________________
activation_10 (Activation)   (None, 153600)            0
=================================================================
Total params: 205,893
Trainable params: 204,995
Non-trainable params: 898
_________________________________________________________________
However, when I try to fit the model:
history = model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=3, verbose=2, validation_data=(x_val,y_val))
I encounter this error:
InvalidArgumentError: Input to reshape is a tensor with 4915200 values, but the requested shape has 9830400
[[{{node lambda_4/unpool/Reshape_3}} = Reshape[T=DT_INT64, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](lambda_1/MaxPoolWithArgmax:1, lambda_4/unpool/Reshape_2/shape)]]
[[{{node lambda_4/unpool/strided_slice_6/_515}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1479_lambda_4/unpool/strided_slice_6", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I have looked over the shapes after each layer and they are what I expect. I also tested the pooling/unpooling functions on sample tensors and they produced the expected output. What am I doing wrong here?
I've been pulling my hair out trying to solve this; any help is greatly appreciated!
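For reference, the two element counts in the error message can be reproduced from shapes already shown in the summary. This is only arithmetic, assuming a full batch of 4 and assuming the argmax output of lambda_1 has the same shape as its pooled output, (None, 160, 240, 32); the helper num_elements is hypothetical.

```python
from functools import reduce
from operator import mul


def num_elements(shape):
    # Product of all dimensions of a fully known shape.
    return reduce(mul, shape, 1)


batch_size = 4
mask_1_shape = (batch_size, 160, 240, 32)  # argmax from lambda_1 (assumed shape)
conv_7_shape = (batch_size, 160, 240, 64)  # activation_7, the input to lambda_4

mask_1_count = num_elements(mask_1_shape)
conv_7_count = num_elements(conv_7_shape)
print(mask_1_count)  # 4915200
print(conv_7_count)  # 9830400
```

If this reading is right, the failing Reshape_3 node in lambda_4 is reshaping the 32-channel mask_1 to flat_input_size, which is computed from the 64-channel conv_7, so the channel counts of the two tensors passed to unpool2D may be where the mismatch comes from.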
python tensorflow machine-learning keras computer-vision
batch_normalization_4 (Batch (None, 160, 240, 64) 256
_________________________________________________________________
activation_4 (Activation) (None, 160, 240, 64) 0
_________________________________________________________________
lambda_2 (Lambda) [(None, 80, 120, 64), (No 0
_________________________________________________________________
conv2d_5 (Conv2D) (None, 80, 120, 64) 36928
_________________________________________________________________
batch_normalization_5 (Batch (None, 80, 120, 64) 256
_________________________________________________________________
activation_5 (Activation) (None, 80, 120, 64) 0
_________________________________________________________________
lambda_3 (Lambda) (None, 160, 240, 64) 0
_________________________________________________________________
conv2d_6 (Conv2D) (None, 160, 240, 64) 36928
_________________________________________________________________
batch_normalization_6 (Batch (None, 160, 240, 64) 256
_________________________________________________________________
activation_6 (Activation) (None, 160, 240, 64) 0
_________________________________________________________________
conv2d_7 (Conv2D) (None, 160, 240, 64) 36928
_________________________________________________________________
batch_normalization_7 (Batch (None, 160, 240, 64) 256
_________________________________________________________________
activation_7 (Activation) (None, 160, 240, 64) 0
_________________________________________________________________
lambda_4 (Lambda) (None, 320, 480, 64) 0
_________________________________________________________________
conv2d_8 (Conv2D) (None, 320, 480, 32) 18464
_________________________________________________________________
batch_normalization_8 (Batch (None, 320, 480, 32) 128
_________________________________________________________________
activation_8 (Activation) (None, 320, 480, 32) 0
_________________________________________________________________
conv2d_9 (Conv2D) (None, 320, 480, 32) 9248
_________________________________________________________________
batch_normalization_9 (Batch (None, 320, 480, 32) 128
_________________________________________________________________
activation_9 (Activation) (None, 320, 480, 32) 0
_________________________________________________________________
conv2d_10 (Conv2D) (None, 320, 480, 1) 33
_________________________________________________________________
batch_normalization_10 (Batc (None, 320, 480, 1) 4
_________________________________________________________________
flatten_1 (Flatten) (None, 153600) 0
_________________________________________________________________
activation_10 (Activation) (None, 153600) 0
=================================================================
Total params: 205,893
Trainable params: 204,995
Non-trainable params: 898
_________________________________________________________________
However, when I try to fit the model:
history = model.fit(x=x_train, y=y_train, batch_size=batch_size, epochs=3, verbose=2, validation_data=(x_val,y_val))
I encounter this error:
InvalidArgumentError: Input to reshape is a tensor with 4915200 values, but the requested shape has 9830400
[[{{node lambda_4/unpool/Reshape_3}} = Reshape[T=DT_INT64, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"](lambda_1/MaxPoolWithArgmax:1, lambda_4/unpool/Reshape_2/shape)]]
[[{{node lambda_4/unpool/strided_slice_6/_515}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_1479_lambda_4/unpool/strided_slice_6", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I have checked the output shape after each layer, and they are all what I expect. I also tested the pooling/unpooling functions on sample tensors, and they produced the expected output. What am I doing wrong here?
I've been pulling my hair out trying to solve this; any help is greatly appreciated!
python tensorflow machine-learning keras computer-vision
edited Nov 20 '18 at 13:01
desertnaut
asked Nov 20 '18 at 12:14
jasonacp
1 Answer
Found the problem: mask_1 has 32 channels (it comes from the pooling layer after conv_2), while unpool_2 applies it to conv_7, which has 64 channels, so the reshape inside unpool2D fails. I rearranged the layers so that the depths line up.
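For anyone hitting the same error, the two numbers in the message can be reproduced from the shapes in the model summary. The sketch below is plain Python and does not use TensorFlow; `num_elements` and `check_unpool_inputs` are hypothetical helpers added for illustration, not part of the original code or of any library.

```python
batch_size = 4

# Per-sample shapes read from the model summary (channels last).
mask_1_shape = (160, 240, 32)  # argmax output of lambda_1 (pooling after conv_2)
conv_7_shape = (160, 240, 64)  # tensor that unpool_2 passes to unpool2D as `pool`


def num_elements(batch, shape):
    """Hypothetical helper: total number of values in a batch of tensors."""
    total = batch
    for dim in shape:
        total *= dim
    return total


# unpool2D reshapes `argmax` to [flat_input_size, 1], where flat_input_size
# is the number of elements in `pool`, so the two counts must be equal.
mask_values = num_elements(batch_size, mask_1_shape)
pool_values = num_elements(batch_size, conv_7_shape)

print(mask_values)  # 4915200, the "tensor with 4915200 values" in the error
print(pool_values)  # 9830400, the size of the "requested shape" in the error


def check_unpool_inputs(pool_shape, argmax_shape):
    """Hypothetical helper: fail early if the mask and the pooled tensor differ."""
    if tuple(pool_shape) != tuple(argmax_shape):
        raise ValueError(
            "unpool2D needs matching shapes, got pool=%s and argmax=%s"
            % (tuple(pool_shape), tuple(argmax_shape))
        )
```

A check like this, run on the static shapes before building each unpooling layer, would likely have flagged the mismatch at model-construction time instead of during `model.fit`.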
answered Nov 24 '18 at 20:26
jasonacp