Convolution
Compute the convolution of a weight matrix with an image. There is a simplified syntax for 2D convolutions and a more advanced syntax for ND convolutions. The 2D convolution syntax is:
Convolution(w, image,
            kernelWidth, kernelHeight, outputChannels,
            horizontalStride, verticalStride,
            [zeroPadding=false, maxTempMemSizeInSamples=0, imageLayout="HWC"|"cudnn"])
Where:
- `w` - convolution weight matrix; it has the dimensions of `[outputChannels, kernelWidth * kernelHeight * inputChannels]`.
- `image` - the input image.
- `kernelWidth` - width of the kernel.
- `kernelHeight` - height of the kernel.
- `outputChannels` - number of output feature maps (must match the first dimension of `w`).
- `horizontalStride` - stride in the horizontal direction.
- `verticalStride` - stride in the vertical direction.
- `zeroPadding` - [named optional] specifies whether the sides of the image should be padded with zeros. Default is false.
- `maxTempMemSizeInSamples` - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved for performing convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using a workspace, as it may improve performance; however, it may also lead to higher memory utilization. Default is 0, which means the same as the number of input samples.
- `imageLayout` - [named optional] the storage format of each image. By default it is `HWC`, which means each image is stored as `[channel, width, height]` in column-major order. If you use cuDNN to speed up training, you should set it to `cudnn`, which means each image is stored as `[width, height, channel]`. Note that the `cudnn` layout works on both GPU and CPU, so it is recommended to use it by default.
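For example, for a 3-channel RGB input convolved with 5 x 5 kernels into 32 output feature maps, `w` has dimensions `[32, 5 * 5 * 3] = [32, 75]`; with `zeroPadding = true` and stride 1 the spatial output size matches the input, while with `zeroPadding = false` a 32 x 32 input shrinks to 28 x 28.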
Example (ConvReLULayer NDL macro):
ConvReLULayer(inp, outMap, inWCount, kW, kH, hStride, vStride, wScale, bValue) = [
    # convolution weights: [outMap, kW * kH * inputChannels]
    W = LearnableParameter(outMap, inWCount, init = Gaussian, initValueScale = wScale)
    # per-map bias stored as a 1 x 1 x outMap image
    b = ImageParameter(1, 1, outMap, init = fixedValue, value = bValue, imageLayout = $imageLayout$)
    # 2D convolution with zero padding
    c = Convolution(W, inp, kW, kH, outMap, hStride, vStride, zeroPadding = true, imageLayout = $imageLayout$)
    p = Plus(c, b)
    y = RectifiedLinear(p)
]
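For instance (a sketch with illustrative values and node names, not taken from a specific CNTK sample), the macro could be applied to a 3-channel image input node `features` with 5 x 5 kernels, 32 output maps (so `inWCount` = 5 * 5 * 3 = 75), and stride 1 in both directions:

# hypothetical layer; `features` is assumed to be the network's image input node
conv1 = ConvReLULayer(features, 32, 75, 5, 5, 1, 1, 0.0043, 0)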
ND convolution allows creating convolutions with arbitrary dimensions, strides, sharing, and padding. The syntax is:
Convolution(w, input,
{kernel dimensions},
mapCount = {map dimensions},
stride = {stride dimensions},
sharing = {sharing},
autoPadding = {padding (boolean)},
lowerPad = {lower padding (int)},
upperPad = {upper padding (int)},
maxTempMemSizeInSamples = 0,
imageLayout = "cudnn")
Where:
- `w` - convolution weight matrix; it has the dimensions of `[kernelCount, kernelDimensionsProduct]`.
- `input` - convolution input.
- `{kernel dimensions}` - dimensions of the kernel.
- `mapCount` - [named] number of output feature maps.
- `stride` - [named, optional, default is 1] stride dimensions.
- `sharing` - [named, optional, default is true] sharing flags for each input dimension.
- `autoPadding` - [named, optional, default is true] automatic padding flags for each input dimension.
- `lowerPad` - [named, optional, default is 0] precise lower padding for each input dimension.
- `upperPad` - [named, optional, default is 0] precise upper padding for each input dimension.
- `maxTempMemSizeInSamples` - [named optional] maximum amount of auxiliary memory (in samples) that should be reserved for performing convolution operations. Some convolution engines (e.g. cuDNN and GEMM-based engines) can benefit from using a workspace, as it may improve performance; however, it may also lead to higher memory utilization. Default is 0, which means the same as the number of input samples.
- `imageLayout` - [named optional] the storage format of each image. The only supported value is `cudnn`, which means each image is stored as `[width, height, channel]` (column-major notation).
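As an illustration (a sketch only; the node names and sizes are hypothetical), the 2D convolution from the earlier example can be expressed with the ND syntax by making the channel dimension part of the kernel: the kernel spans all 3 input channels, the stride in the channel dimension equals the number of channels so the kernel does not slide across them, and automatic padding is disabled for that dimension. Here `W` has dimensions `[32, 5 * 5 * 3]`:

# ND equivalent of a 5 x 5 convolution over a 3-channel input producing 32 feature maps
c = Convolution(W, inp, {5, 5, 3}, mapCount = 32,
                stride = {1, 1, 3},
                sharing = {true, true, true},
                autoPadding = {true, true, false},
                imageLayout = "cudnn")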