Convolution
Frank Seide edited this page Jul 27, 2016 · 31 revisions
Convolution() computes the convolution of a weight matrix with an image or tensor. This operation is used in image-processing applications and language processing. Convolution() supports any dimensions, stride, sharing or padding. The syntax is:
Convolution(w, input,
{kernel dimensions},
mapCount = {map dimensions},
stride = {stride dimensions},
sharing = {sharing flags},
autoPadding = {padding flags (boolean)},
lowerPad = {lower padding (int)},
upperPad = {upper padding (int)},
maxTempMemSizeInSamples = 0,
imageLayout = "cudnn")
Where:
- w - convolution weight matrix, it has the dimensions of [mapCount, kernelDimensionsProduct].
- input - convolution input
- kernel dimensions - dimensions of the kernel
- mapCount - [named, optional, default is 0] depth of feature map. 0 means use the row dimension of w
- stride - [named, optional, default is 1] stride dimensions
- sharing - [named, optional, default is true] sharing flags for each input dimension
- autoPadding - [named, optional, default is true] automatic padding flags for each input dimension
- lowerPad - [named, optional, default is 0] precise lower padding for each input dimension
- upperPad - [named, optional, default is 0] precise upper padding for each input dimension
- maxTempMemSizeInSamples - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using workspace as it may improve performance. However, sometimes this may lead to higher memory utilization. Default is 0 which means the same as the input samples.
All values of the form {...} must actually be given as a colon-separated sequence of values, e.g. (5:5) for the kernel dimensions. (If you use the deprecated NDLNetworkBuilder, these must be comma-separated and enclosed in { } instead.)
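The two spellings of a dimension list can be illustrated with a small Python sketch. It is not part of CNTK: the helper names brainscript_dims and ndl_dims are hypothetical, and the exact whitespace used inside the braces is an assumption.

```python
def brainscript_dims(dims):
    """Render dimensions as a colon-separated sequence, e.g. (5:5)."""
    return "(" + ":".join(str(d) for d in dims) + ")"


def ndl_dims(dims):
    """Render dimensions for the deprecated NDLNetworkBuilder:
    comma-separated and enclosed in braces (spacing is an assumption)."""
    return "{" + ",".join(str(d) for d in dims) + "}"


kernel = [5, 5]
print(brainscript_dims(kernel))  # (5:5)
print(ndl_dims(kernel))          # {5,5}
```

The same list of values is passed in both cases; only the separator and the enclosing brackets differ.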
Example (ConvReLULayer NDL macro):
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale, bValue) =
[
    W = LearnableParameter (outMap, inWCount, init="gaussian", initValueScale=wScale)
    b = ImageParameter (1, 1, outMap, init="fixedValue", value=bValue)
    c = Convolution (W, inp, (kW:kH), stride=(hStride:vStride), autoPadding=true)
    y = RectifiedLinear (c + b)
].y
Note: If you are using the deprecated NDLNetworkBuilder, there should be no trailing .y in the example.
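The macro takes inWCount as the second dimension of W. Assuming W follows the [mapCount, kernelWidth * kernelHeight * inputChannels] shape given for the 2D syntax on this page, a caller could derive inWCount as in the Python sketch below. The function name and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
def conv_weight_shape(out_map, k_w, k_h, in_channels):
    """Shape of W under the assumed rule
    [mapCount, kernelWidth * kernelHeight * inputChannels]."""
    in_w_count = k_w * k_h * in_channels
    return (out_map, in_w_count)


# Hypothetical layer: 16 output maps, 5x5 kernel, 3 input channels.
print(conv_weight_shape(16, 5, 5, 3))  # (16, 75)
```

With these numbers, the macro would be called with outMap = 16 and inWCount = 75.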
The 2D convolution syntax is:
Convolution(w, image,
kernelWidth, kernelHeight,
horizontalStride, verticalStride,
zeroPadding=false, maxTempMemSizeInSamples=0, imageLayout="cudnn" /* or "HWC"*/ )
where:
- w - convolution weight matrix, it has the dimensions of [mapCount, kernelWidth * kernelHeight * inputChannels].
- image - the input image.
- mapCount - depth of output feature map (number of output channels)
- kernelWidth - width of the kernel
- kernelHeight - height of the kernel
- horizontalStride - stride in horizontal direction
- verticalStride - stride in vertical direction
- zeroPadding - [named optional] specifies whether the sides of the image should be padded with zeros. Default is false.
- maxTempMemSizeInSamples - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved to perform convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using workspace as it may improve performance. However, sometimes this may lead to higher memory utilization. Default is 0 which means the same as the input samples.
- imageLayout - [named optional] the storage format of each image. By default it’s HWC, which means each image is stored as [channel, width, height] in column major. If you use cuDNN to speed up training, you should set it to cudnn, which means each image is stored as [width, height, channel]. Note that cudnn layout will work both on GPU and CPU so it is recommended to use it by default.
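To make the difference between the two layouts concrete, the following Python sketch computes where one element of an image lands in flat memory. It is an illustration, not CNTK code: the function names are hypothetical, and it assumes that column major means the first listed index varies fastest and that the same convention applies to the cudnn layout.

```python
def offset_hwc(c, w, h, channels, width):
    """Flat offset for [channel, width, height] in column-major order:
    channel varies fastest, then width, then height."""
    return c + channels * (w + width * h)


def offset_cudnn(c, w, h, width, height):
    """Flat offset for [width, height, channel] in column-major order
    (assumed): width varies fastest, then height, then channel."""
    return w + width * (h + height * c)


# Hypothetical image: 3 channels, 4 pixels wide, 2 pixels high.
channels, width, height = 3, 4, 2
print(offset_hwc(1, 2, 1, channels, width))
print(offset_cudnn(1, 2, 1, width, height))
```

Under these assumptions, HWC keeps all channels of one pixel next to each other, while the cudnn layout stores each channel as a contiguous plane.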