Hi,
I want to build a simple convolutional network for binary classification, CIFAR-10 like, but I have a problem with ConvReLULayer function from Macros.ndl. I've tried to rewrite it to BrainScript but I'm obviously missing something because an exception occurred.
EXCEPTION occurred: Convolution operation requires that kernel dim 16 <= input dim 3.
My conv.bs contains
imageW = 64
imageH = 64
inputChannels = 3
labelDim = 2
features = ImageInput(imageW, imageH, inputChannels, tag = "feature", imageLayout="cudnn")
featOffs = Constant(128)
featScaled = Minus(features, featOffs)
labels = Input(labelDim, tag='label')
# conv1
kW1 = 5
kH1 = 5
cMap1 = 16 # number of feature maps
inWCount1 = kW1 * kH1 * inputChannels
hStride1 = 1
vStride1 = 1
wScale = 0.0043
conv1 = ConvReLULayer(featScaled, cMap1, inWCount1, kW1, kH1, hStride1, vStride1, wScale)
My macro.bs contains
ConvW(outMap, inWCount, wScale) = [
W = Parameter(outMap, inWCount, init = "uniform", initValueScale=wScale, initOnCPUOnly=true)
].W
ConvB(outMap) = [
b = ParameterTensor(1 : 1 : outMap)
].b
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale) = [
W = ConvW(outMap, inWCount, wScale)
b = ConvB(outMap)
c = Convolution(W, inp, (kW : kH : outMap), stride=(hStride: vStride : outMap), imageLayout = "cudnn")
z = Plus(c, b);
y = RectifiedLinear(z);
].y
Original ConvReLULayer function
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale, bValue)
[
W = LearnableParameter(outMap, inWCount, init = Gaussian, initValueScale = wScale)
b = ImageParameter(1, 1, outMap, init = fixedValue, value = bValue, imageLayout = $imageLayout$)
c = Convolution(W, inp, kW, kH, outMap, hStride, vStride, zeroPadding = true, imageLayout = $imageLayout$)
p = Plus(c, b)
y = RectifiedLinear(p)
]
Hi Arminea,
The issue is related to the convolutional shape you're using in the ConvReLULayer.
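You're using a convolution kernel of (kW : kH : outMap). For conv1 that gives a 5 x 5 x 16 kernel, but the input has only 3 channels, which is exactly what the exception reports (kernel dim 16 <= input dim 3).
To fix it, use an ND convolution whose kernel depth is the inMap parameter: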
ConvReLULayer(inp, inMap, outMap, inWCount, kW, kH, hStride, vStride, wScale) = [
W = ConvW(outMap, inWCount, wScale)
b = ConvB(outMap)
c = Convolution(W, inp, (kW : kH : inMap), stride=(hStride: vStride : inMap), imageLayout = "cudnn")
z = Plus(c, b);
y = RectifiedLinear(z);
].y
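To make the failing check concrete, here is a minimal Python sketch (not CNTK code) of the constraint behind the first exception: every kernel dimension has to fit inside the matching input dimension. The helper name `check_kernel_fits` is hypothetical; the shapes are the ones from conv.bs above.

```python
def check_kernel_fits(kernel_dims, input_dims):
    """Return the axes on which the kernel is larger than the input."""
    return [
        axis
        for axis, (k, i) in enumerate(zip(kernel_dims, input_dims))
        if k > i
    ]


input_dims = (64, 64, 3)   # imageW x imageH x inputChannels
bad_kernel = (5, 5, 16)    # (kW : kH : outMap), as in the original macro
good_kernel = (5, 5, 3)    # (kW : kH : inMap), as in the proposed fix

bad_axes = check_kernel_fits(bad_kernel, input_dims)
good_axes = check_kernel_fits(good_kernel, input_dims)
print(bad_axes)    # [2]  -> the channel axis: 16 > 3
print(good_axes)   # []   -> every kernel dim fits
```

Only the third axis fails with the original kernel, which matches the "kernel dim 16 <= input dim 3" message.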
Let us know if it fixes your issue.
Morgan
Hi,
thanks for answering so quickly :) I'll try it later today. Just curious, what should be the value of inMap? I suppose 3, am I right? And I was not sure about b = ParameterTensor(1 : 1 : outMap) in ConvB function. Is it ok?
Tereza
Hi,
Yes, you're right: for the first layer, inMap will be 3. More generally, the inMap of layer N is the outMap of layer N-1 :).
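The chaining rule can be sketched in a few lines of Python. This is only an illustration: `chain_channels` is a hypothetical helper, 3 and 16 are the values from conv.bs, and the second layer's 32 feature maps are an assumed value for the example.

```python
def chain_channels(input_channels, out_maps):
    """Pair each layer's inMap with its outMap, in network order."""
    pairs = []
    in_map = input_channels
    for out_map in out_maps:
        pairs.append((in_map, out_map))
        in_map = out_map  # the next layer consumes this layer's feature maps
    return pairs


layer_maps = chain_channels(3, [16, 32])
print(layer_maps)  # [(3, 16), (16, 32)]
```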
For the bias parameter, there is a predefined macro in BrainScript:
BS.Parameters.BiasParam(outDim), which is defined as follows:
BiasParam (dim) = ParameterTensor ((dim), init='fixedValue', value=0.0)
So you can replace your bias declaration with :
b = BS.Parameters.BiasParam(1:1:outMap)
Morgan
I've tried your solution. It works ... sort of. Here is the log:
Validating network. 27 nodes to process in pass 1.
Validating --> labels = InputValue() : -> [2 x *]
Validating --> ol.W = LearnableParameter() : -> [2 x 128]
Validating --> h1.W = LearnableParameter() : -> [128 x 16 x 16 x 32]
Validating --> conv2.W.W = LearnableParameter() : -> [32 x 400]
Validating --> conv1.W.W = LearnableParameter() : -> [16 x 75]
Validating --> features = InputValue() : -> [64 x 64 x 3 x *]
Validating --> featOffs = LearnableParameter() : -> [1 x 1]
Validating --> featScaled = Minus (features, featOffs) : [64 x 64 x 3 x *], [1 x 1] -> [64 x 64 x 3 x *]
Validating --> conv1.c = Convolution (conv1.W.W, featScaled) : [16 x 75], [64 x 64 x 3 x *] -> [64 x 64 x 1 x *]
Validating --> conv1.b.b = LearnableParameter() : -> [1 x 1 x 16]
Validating --> conv1.z = Plus (conv1.c, conv1.b.b) : [64 x 64 x 1 x *], [1 x 1 x 16] -> [64 x 64 x 16 x *]
Validating --> conv1.y = RectifiedLinear (conv1.z) : [64 x 64 x 16 x *] -> [64 x 64 x 16 x *]
Validating --> pool1 = MaxPooling (conv1.y) : [64 x 64 x 16 x *] -> [32 x 32 x 16 x *]
Validating --> conv2.c = Convolution (conv2.W.W, pool1) : [32 x 400], [32 x 32 x 16 x *] -> [32 x 32 x 1 x *]
Validating --> conv2.b.b = LearnableParameter() : -> [1 x 1 x 32]
Validating --> conv2.z = Plus (conv2.c, conv2.b.b) : [32 x 32 x 1 x *], [1 x 1 x 32] -> [32 x 32 x 32 x *]
Validating --> conv2.y = RectifiedLinear (conv2.z) : [32 x 32 x 32 x *] -> [32 x 32 x 32 x *]
Validating --> pool2.p = Pooling (conv2.y) : [32 x 32 x 32 x *] -> [16 x 16 x 32 x *]
Validating --> h1.t = Times (h1.W, pool2.p) : [128 x 16 x 16 x 32], [16 x 16 x 32 x *] -> [128 x *]
Validating --> h1.b = LearnableParameter() : -> [128 x 1]
Validating --> h1.z = Plus (h1.t, h1.b) : [128 x *], [128 x 1] -> [128 x 1 x *]
Validating --> h1.y = Sigmoid (h1.z) : [128 x 1 x *] -> [128 x 1 x *]
Validating --> ol.z.PlusArgs[0] = Times (ol.W, h1.y) : [2 x 128], [128 x 1 x *] -> [2 x 1 x *]
Validating --> ol.b = LearnableParameter() : -> [2 x 1]
Validating --> ol.z = Plus (ol.z.PlusArgs[0], ol.b) : [2 x 1 x *], [2 x 1] -> [2 x 1 x *]
Validating --> ce = CrossEntropyWithSoftmax (labels, ol.z) : [2 x *], [2 x 1 x *] -> [1]
Validating --> errs = ErrorPrediction (labels, ol.z) : [2 x *], [2 x 1 x *] -> [1]
Validating network. 16 nodes to process in pass 2.
Validating network, final pass.
conv1.c: using GEMM convolution engine for geometry: Input: 64 x 64 x 3, Output: 64 x 64 x 1, Kernel: 5 x 5 x 3, Map: 1, Stride: 1 x 1 x 3, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Validating --> conv1.c = Convolution (conv1.W.W, featScaled) : [16 x 75], [64 x 64 x 3 x *] -> [64 x 64 x 1 x *] FAILED
[CALL STACK]
> Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNode
- Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNodes
- Microsoft::MSR::CNTK::ComputationNetwork:: ValidateNetwork
- Microsoft::MSR::CNTK::ComputationNetwork:: CompileNetwork
- Microsoft::MSR::CNTK::ComputationNetwork:: ConstructFromRoots
- Microsoft::MSR::CNTK::ComputationNetwork:: ComputationNetwork
- std::make_shared<Microsoft::MSR::CNTK::ComputationNetwork,std::shared_ptr<Microsoft::MSR::ScriptableObjects::IConfigRecord> const & __ptr64>
- Microsoft::MSR::ScriptableObjects::MakeRuntimeObject<Microsoft::MSR::CNTK::ComputationNetwork>
- <lambda_56a46857eb9bf0001b5ffab01c6f03ed>:: operator ()
- std::_Callable_obj<<lambda_56a46857eb9bf0001b5ffab01c6f03ed>,0>::_ApplyX<std::shared_ptr<Microsoft::MSR::ScriptableObjects::Object>,std::shared_ptr<Microsoft::MSR::ScriptableObjects::IConfigRecord>>
- std::_Func_impl<std::_Callable_obj<<lambda_56a46857eb9bf0001b5ffab01c6f03ed>,0>,std::allocator<std::_Func_class<std::shared_ptr<Microsoft::MSR::ScriptableObjects::Object>,std::shared_ptr<Microsoft::MSR::ScriptableObjects::IConfigRecord>>>,std:: shared_pt
- std::_Func_class<std::shared_ptr<Microsoft::MSR::ScriptableObjects::Object>,std::shared_ptr<Microsoft::MSR::ScriptableObjects::IConfigRecord>>:: operator ()
- Microsoft::MSR::BS:: Evaluate
- <lambda_a048d19f5114b6bccccaea8ea1203939>:: operator ()
- std::_Callable_obj<<lambda_a048d19f5114b6bccccaea8ea1203939>,0>::_ApplyX<Microsoft::MSR::ScriptableObjects::ConfigValuePtr>
- std::_Func_impl<std::_Callable_obj<<lambda_a048d19f5114b6bccccaea8ea1203939>,0>,std::allocator<std::_Func_class<Microsoft::MSR::ScriptableObjects::ConfigValuePtr>>,Microsoft::MSR::ScriptableObjects::ConfigValuePtr>:: _Do_call
EXCEPTION occurred: Convolution weight matrix conv1.W.W should have dimension [1, 75] which is [kernelCount, kernelWidth * kernelHeight * inputChannels]
Can you share your conv1 definition and the associated macro?
The conv1.W.W is very strange ...
conv.bs file
# conv1
kW1 = 5
kH1 = 5
cMap1 = 16 # number of feature maps
inWCount1 = kW1 * kH1 * inputChannels
hStride1 = 1
vStride1 = 1
wScale = 0.0043
conv1 = ConvNDReLULayer(featScaled, kW1, kH1, inputChannels, inWCount1, cMap1, hStride1, vStride1, wScale, 1)
macros.bs file
ConvNDReLULayer(inp, kW, kH, inMap, inWCount, outMap, hStride, vStride, wScale, bValue) = [
W = Parameter(outMap, inWCount, init = "uniform", initValueScale=wScale, initOnCPUOnly=true)
b = ParameterTensor(1 : 1 : outMap)
c = Convolution(W, inp, (kW : kH : inMap), stride=(hStride : vStride : inMap), imageLayout="cudnn")
# sharing=(true : true : true),
z = Plus(c, b);
y = RectifiedLinear(z);
].y
Hi, I tried it again this morning and the strange second W just went away. The error is now: EXCEPTION occurred: Convolution weight matrix conv1.W should have dimension [1, 75] which is [kernelCount, kernelWidth * kernelHeight * inputChannels]. I have no idea why.
Here is my whole project if you want to see it https://1drv.ms/f/s!As_qGuBAm_dbhudr98UXaBKczYIw4g. You can run it yourself :)
Anything new? :)
The naming is a bit confusing here. The dimension of the weight matrices for convolution should be [kernelCount x (kernelWidth * kernelHeight * inputChannels)], as the exception message states.
The first dimension is the number of outputs for each pixel = depth of feature map = kernel count = number of output channels.
Your convolution operation will run your input through this many filters. For each of the (width * height) pixel positions, you will get this many values, stored in a tensor of shape [width x height x numOutputChannels].
The filter kernel depends on two things: The number of pixels you want it to cover (kernelWidth, kernelHeight) and the depth of the feature map you are processing (inputChannels).
The filter kernel is really a rank-3 tensor, but w.r.t. the weight matrix, only the number of parameters must be specified. That value is the product of these three values kernelWidth * kernelHeight * inputChannels.
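As a quick sanity check of that rule, here is a small Python sketch (not CNTK code; `conv_weight_shape` is a hypothetical helper) that computes the weight matrix shape from the kernel size and channel counts. The conv1 numbers come from conv.bs; for conv2 a 5 x 5 kernel over 16 input channels is assumed, which is consistent with the [32 x 400] shape in the log.

```python
def conv_weight_shape(kernel_w, kernel_h, in_channels, out_channels):
    """Rows = number of filters; columns = parameters per filter."""
    rows = out_channels
    cols = kernel_w * kernel_h * in_channels
    return (rows, cols)


conv1_shape = conv_weight_shape(5, 5, 3, 16)    # first layer, RGB input
conv2_shape = conv_weight_shape(5, 5, 16, 32)   # assumed 5 x 5 kernel
print(conv1_shape)  # (16, 75)
print(conv2_shape)  # (32, 400)
```

Both results agree with the LearnableParameter shapes printed during validation.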
I understand your explanation. But isn't it strange that the log file contains the line
Validating --> conv1.W = LearnableParameter() : -> [16 x 75]
and at the end there is an exception ?
Convolution weight matrix conv1.W should have dimension [1, 75] which is [kernelCount, kernelWidth * kernelHeight * inputChannels]
I suppose that conv1.W is the same matrix the whole time, so why is the size different?
Ah. The validation process runs in multiple passes (which is required due to recurrent networks, which you are not using). Only the last pass sets up the actual convolution engine, and that's where the failure is detected.
I have trouble seeing what the problem is, though. A 5 x 5 x 3 kernel indeed has 75 parameters.
@Alexey-Kamenev, do you have an idea what might be wrong?
conv1.c: using GEMM convolution engine for geometry: Input: 64 x 64 x 3, Output: 64 x 64 x 1, Kernel: 5 x 5 x 3, Map: 1, Stride: 1 x 1 x 3, Sharing: (1), AutoPad: (1), LowerPad: 0, UpperPad: 0.
Validating --> conv1.c = Convolution (conv1.W.W, featScaled) : [16 x 75], [64 x 64 x 3 x *] -> [64 x 64 x 1 x *] FAILED
I would expect to see
... -> [64 x 64 x *16* x *]
In fact, you need to specify the mapDims parameter in the ND Convolution node.
Like this:
c = Convolution(W, inp, (kW:kH:inMap), mapDims=outMap, stride=(hStride:vStride:inMap))
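A rough Python model of why the exception asked for [1, 75] may help here. It is a sketch only, not CNTK's implementation, and it assumes the map count falls back to 1 when mapDims is not given, as the "Map: 1" in the log suggests. The function names are hypothetical.

```python
def expected_weight_shape(kernel_dims, map_count):
    """[kernelCount, kernelWidth * kernelHeight * inputChannels]."""
    kernel_size = 1
    for d in kernel_dims:
        kernel_size *= d
    return (map_count, kernel_size)


def weight_matches(weight_shape, kernel_dims, map_count):
    """True if the weight matrix has the shape the geometry expects."""
    return tuple(weight_shape) == expected_weight_shape(kernel_dims, map_count)


weight_shape = (16, 75)   # conv1.W as created by the macro
kernel_dims = (5, 5, 3)   # (kW : kH : inMap)

without_map_dims = weight_matches(weight_shape, kernel_dims, map_count=1)
with_map_dims = weight_matches(weight_shape, kernel_dims, map_count=16)
print(expected_weight_shape(kernel_dims, 1))  # (1, 75)
print(without_map_dims)                       # False
print(with_map_dims)                          # True
```

With the map count at 1 the expected shape is (1, 75), so a [16 x 75] matrix is rejected; with mapDims=outMap=16 the shapes agree.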
Sorry for the delay :)
Morgan
I would take that as an opportunity for improvement. The weight matrix's row dimension should be sufficient to specify this.
Making the change now. You can now leave out mapDims; it will default to the row dimension of the weight matrix. It will take a few days to land in master, though. Thanks for the feedback!
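The described default can be pictured with a tiny Python sketch. It only illustrates the behavior stated in this reply (mapDims falling back to the weight matrix's row dimension) and is not the actual CNTK change; `resolve_map_count` is a hypothetical name.

```python
def resolve_map_count(weight_shape, map_dims=None):
    """Use map_dims if given, else the row dimension of the weight matrix."""
    if map_dims is None:
        return weight_shape[0]
    return map_dims


default_count = resolve_map_count((16, 75))       # mapDims left out
explicit_count = resolve_map_count((16, 75), 16)  # mapDims=outMap
print(default_count)   # 16
print(explicit_count)  # 16
```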
@mfuntowicz Thanks, that really helped :) Now it's working without any exception. Is there any new documentation for BrainScript? I was looking for the Convolution function in the CNTK book and there is just an old example for NDL.
@frankseide Thanks to you too. It would be very helpful :) I can surely wait a few days :)
Convolution() is documented here. I just updated it w.r.t. the change (which will be online soon).
@frankseide Thanks :) ... I believe that's all for now. I'm closing this issue :)
Hi, I'm new to CNTK.
If I want to change the dimensions, do I have to rebuild it in Visual Studio?
Or is it enough to change the config and NDL files and then run it from the command prompt?
Just change the config and ndl/brain script files. No need to recompile.