Description
Hello. I have a task: build a neural network that finds circles and ellipses in an image.
The target object will also be black. I realize that leaves very few features to work with, but still.
I am doing this in C# because the network then has to be integrated into an existing project.
I tried the TensorFlow and CNTK libraries, but since TensorFlow was terribly slow both when building networks and when evaluating them, I went with CNTK. There is probably not much difference between them.
And so, the most important question: how do I configure the network so that it works correctly?
I generated a bunch of pictures with different ellipses and circles.
I also generated the answers (labels) in the form of a roundedRect structure that contains the coordinates of the object's center (X, Y), the length and width of the object's rectangle (W, H), and the rotation angle (Angle). That is 5 values in total.
Before feeding these 5 values to the network, I scale them to the range 0 to 1: the coordinates and sizes are divided by the width and height of the frame, respectively, and the angle is divided by 360.
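As a rough illustration of the scaling just described, here is a minimal sketch in plain C#. The names `EllipseLabel`, `LabelScaling`, `Normalize`, and `Denormalize` are assumptions made up for this sketch; they are not part of CNTK or of the original project. It also assumes that X and W are divided by the frame width and Y and H by the frame height.

```csharp
using System;

// Hypothetical label type holding the five values described above.
public struct EllipseLabel
{
    public float X, Y, W, H, Angle;

    public EllipseLabel(float x, float y, float w, float h, float angle)
    {
        X = x; Y = y; W = w; H = h; Angle = angle;
    }
}

public static class LabelScaling
{
    // Scales a label given in pixels and degrees down to the network's target vector.
    // Assumption: X and W are divided by the frame width, Y and H by the frame height.
    public static float[] Normalize(EllipseLabel label, float frameWidth, float frameHeight)
    {
        return new float[]
        {
            label.X / frameWidth,
            label.Y / frameHeight,
            label.W / frameWidth,
            label.H / frameHeight,
            label.Angle / 360f
        };
    }

    // Inverse mapping: converts a 5-element network output back to pixels and degrees.
    public static EllipseLabel Denormalize(float[] values, float frameWidth, float frameHeight)
    {
        if (values == null || values.Length != 5)
            throw new ArgumentException("Expected exactly 5 values.", nameof(values));

        return new EllipseLabel(
            values[0] * frameWidth,
            values[1] * frameHeight,
            values[2] * frameWidth,
            values[3] * frameHeight,
            values[4] * 360f);
    }
}
```

For a 320x240 frame, a label with center (160, 120), size 80x60, and angle 90 maps to the vector 0.5, 0.5, 0.25, 0.25, 0.25, and `Denormalize` maps that vector back to the original values.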
`int labels_count = 5;
NDShape inputDim = NDShape.CreateNDShape(new int[] { 320, 240, 1 }); // input array for the grayscale image
NDShape outputDim = NDShape.CreateNDShape(new int[] { labels_count }); // output array of the object's rectangle parameters: X, Y, W, H, Angle
// input data layer
Variable input_shape = CNTKLib.InputVariable(inputDim, DataType.Float, "features");
Variable output_shape = CNTKLib.InputVariable(outputDim, DataType.Float, "labels");
// create the layers
double convWScale = 0.26; // not used in the lines shown here
var view = new NDArrayView(NDShape.CreateNDShape(new int[] { 3, 3, 1 }), new double[] { -1, 0, 1, -2, 0, 2, -1, 0, 1 }, DeviceDescriptor.CPUDevice, false); // 3x3 Sobel-style kernel, not used in the lines shown here
var scaledInput = CNTKLib.ElementTimes(Constant.Scalar<float>(1.0f / 255.0f, DeviceDescriptor.CPUDevice), input_shape); // scaling layer
int kernelWidth1 = 5, kernelHeight1 = 5, numInputChannels1 = 1, outFeatureMapCount1 = 4;
var conv1 = CNTKHelper.ConvolutionWithMaxPooling(scaledInput, DeviceDescriptor.CPUDevice, kernelWidth1, kernelHeight1, numInputChannels1, outFeatureMapCount1, 2, 2, 3, 3);
var layer2 = CNTKHelper.Dense(conv1, 64, DeviceDescriptor.CPUDevice, CNTKHelper.Activation.ReLU, "layer2");
var classifierOutput = CNTKHelper.Dense(layer2, labels_count, DeviceDescriptor.CPUDevice, CNTKHelper.Activation.Sigmoid, "classifierOutput"); // final network`
I trained the network for a day, but the value of PreviousMinibatchLossAverage does not fall below 30.
Maybe someone knows how to choose the layers correctly?
Is there some way to solve this problem?