development of a neural network for object search

Hello. I need to build a neural network that finds circles and ellipses in an image.
The target object will be black. I realize that gives the network very few features to work with, but still.

I am doing this in C# because the network later has to be integrated into an existing project.
I tried both the TensorFlow and CNTK libraries. Since TensorFlow was terribly slow both when building networks and when evaluating them, I went with CNTK. There is probably not much difference between them otherwise.

So, the most important question: how do I configure the network so that it works correctly?
I generated a set of images with different ellipses and circles.
I also generated the labels as a roundedRect structure that contains the center of the object (X, Y), the width and height of the object's rectangle (W, H), and the rotation angle (Angle), 5 values in total.

Before feeding the 5 values to the network, I scale them to the range 0 to 1: the coordinates and sizes are divided by the frame width and height, respectively, and the angle is divided by 360.
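For reference, the scaling described above could look roughly like the sketch below. `LabelScaling`, `Normalize`, and `Denormalize` are hypothetical names made up for this illustration, not part of CNTK, and the sketch assumes that X and W are divided by the frame width while Y and H are divided by the frame height.

```csharp
using System;

// Hypothetical helper (not part of CNTK): scales one label to the range 0..1
// and converts a network output back to pixels and degrees.
public static class LabelScaling
{
    // Assumption: X and W are divided by the frame width,
    // Y and H by the frame height, and the angle by 360.
    public static float[] Normalize(float x, float y, float w, float h, float angleDeg,
                                    int frameWidth, int frameHeight)
    {
        return new[]
        {
            x / frameWidth,
            y / frameHeight,
            w / frameWidth,
            h / frameHeight,
            angleDeg / 360f
        };
    }

    // Inverse of Normalize: expects 5 values in the order X, Y, W, H, Angle.
    public static float[] Denormalize(float[] label, int frameWidth, int frameHeight)
    {
        if (label == null || label.Length != 5)
            throw new ArgumentException("Expected 5 values: X, Y, W, H, Angle.", nameof(label));

        return new[]
        {
            label[0] * frameWidth,
            label[1] * frameHeight,
            label[2] * frameWidth,
            label[3] * frameHeight,
            label[4] * 360f
        };
    }
}
```

With a 320x240 frame, `LabelScaling.Normalize(160, 120, 80, 60, 90, 320, 240)` gives the 5 values 0.5, 0.5, 0.25, 0.25, 0.25, and applying `Denormalize` to a network output restores pixels and degrees.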



```csharp
int labels_count = 5;
NDShape inputDim = NDShape.CreateNDShape(new int[] { 320, 240, 1 }); // input array for the grayscale image
NDShape outputDim = NDShape.CreateNDShape(new int[] { labels_count }); // output array with the object's rectangle parameters: X, Y, W, H, Angle

// input data layer
Variable input_shape = CNTKLib.InputVariable(inputDim, DataType.Float, "features");
Variable output_shape = CNTKLib.InputVariable(outputDim, DataType.Float, "labels");
// create the layers

double convWScale = 0.26; // not used below

// 3x3 kernel values; not used below
var view = new NDArrayView(NDShape.CreateNDShape(new int[] { 3, 3, 1 }), new double[] { -1, 0, 1, -2, 0, 2, -1, 0, 1 }, DeviceDescriptor.CPUDevice, false);

var scaledInput = CNTKLib.ElementTimes(Constant.Scalar<float>(1.0f / 255.0f, DeviceDescriptor.CPUDevice), input_shape); // scaling layer

int kernelWidth1 = 5, kernelHeight1 = 5, numInputChannels1 = 1, outFeatureMapCount1 = 4;
var conv1 = CNTKHelper.ConvolutionWithMaxPooling(scaledInput, DeviceDescriptor.CPUDevice, kernelWidth1, kernelHeight1, numInputChannels1, outFeatureMapCount1, 2, 2, 3, 3);

var layer2 = CNTKHelper.Dense(conv1, 64, DeviceDescriptor.CPUDevice, CNTKHelper.Activation.ReLU, "layer2");
var classifierOutput = CNTKHelper.Dense(layer2, labels_count, DeviceDescriptor.CPUDevice, CNTKHelper.Activation.Sigmoid, "classifierOutput"); // final network
```

![389](https://user-images.githubusercontent.com/54429272/228136283-cb4b4714-cacf-4f04-9dda-3071bf5a8e38.jpg)

![356](https://user-images.githubusercontent.com/54429272/228136287-fe8e9e04-d4b4-48a7-8935-0f88a91e8be7.jpg)

![359](https://user-images.githubusercontent.com/54429272/228136291-3dff28f6-ab32-4442-86d2-3dbf99bbd2a5.jpg)

![363](https://user-images.githubusercontent.com/54429272/228136294-ef8a7796-a924-49ab-bc35-cf3ffa0a288f.jpg)


I trained the network for a day, but the value of PreviousMinibatchLossAverage does not fall below 30.
Does anyone know how to choose the layers correctly?
Is there some way to solve this problem?
