Commit 1406a09: Merge pull request #20 from KasunThushara/main ("ADD: introduction to DNN"), 2 parents e6317fc + 1553999.

21 files changed: +154 −10 lines.
# Introduction to DNN

## What is Deep Learning?

Deep learning is a type of artificial intelligence in which computers learn to understand data in a way loosely inspired by the human brain. It uses layers of "neurons" to recognize patterns in data such as images, text, and sound. By analyzing large amounts of data, deep learning lets computers make decisions and predictions without needing explicit instructions for every task. It is the technology behind many modern tools, such as facial recognition, voice assistants, and self-driving cars.

## Why Deep Learning?

- Efficiently processes unstructured data (e.g., images, text, audio).
- Discovers hidden relationships and patterns in large datasets.
- Excels at complex tasks such as image recognition, natural language processing, and more.

## What are the components of a deep learning network?

A deep neural network (DNN) is built from layers of artificial "neurons," inspired by how the human brain works. Here's a simple breakdown of the key components:

![NN](../../pictures/Chapter1/nn.gif)

- **Input layer**: This is where your data enters the network. Each input is converted into numbers, which are then passed to the next layer.

- **Hidden layers**: These are the layers between the input and output. The more hidden layers you have, the deeper your network. Each neuron in a hidden layer takes inputs, processes them, and passes the result to the next layer. Hidden layers help the network learn complex patterns.

- **Output layer**: This final layer produces the result, which could be a classification (e.g., yes or no) or a range of values, depending on the task.

- **Weights and biases**: Every connection between neurons has a weight, which determines the importance of an input. A bias is an extra value that shifts the neuron's output and helps it make better decisions.

- **Activation function**: Each neuron applies an activation function to decide whether it should "fire" or not. This function simulates the way real neurons work and is what lets the network learn and adjust.

If you're ready to dive deep, [this guide](https://www.3blue1brown.com/lessons/neural-networks) is perfect for exploring every detail.

In essence, a DNN learns by adjusting its weights and biases through training on data, gradually improving its predictions. The goal is to get better at recognizing patterns and making accurate decisions.
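The components above fit together in a few lines of plain Python. This is an illustrative toy (the layer sizes, weights, and biases are made up, not from any trained model): each layer computes a weighted sum of its inputs, adds a bias, and passes the result through an activation function.

```python
import math

def sigmoid(x):
    # Activation function: squashes any number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    # One layer of neurons: each neuron takes a weighted sum of the
    # inputs, adds its bias, and applies the activation function.
    return [sigmoid(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
            for neuron_w, b in zip(weights, biases)]

# Toy network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
x = [0.5, -1.0]                                # input layer (numbers in)
hidden = layer(x, [[0.1, 0.4], [-0.3, 0.2]], [0.0, 0.1])
output = layer(hidden, [[0.7, -0.5]], [0.2])   # output layer (prediction out)
print(output)  # a single value between 0 and 1
```

Training would adjust those hard-coded weights and biases; here they are frozen just to show how data flows from the input layer to the output layer.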

## What is a Neuron?

A neuron (or perceptron) in a neural network receives inputs, each multiplied by a weight that reflects its importance. It adds these weighted inputs together, and if the sum passes a certain threshold, the neuron "fires" by sending an output signal. The sum is passed through an activation function, which decides whether the neuron should activate (e.g., output a 1) or stay inactive (e.g., output a 0). The result is then sent to the next neurons in the network to continue the process.

![Neuron](../../pictures/Chapter1/nuron.gif)
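The behaviour just described can be written out directly. This is a hypothetical perceptron with hand-picked weights and bias (chosen here so it acts like a logical AND gate) and a simple step function as the activation:

```python
def perceptron(inputs, weights, bias):
    # Weighted sum of the inputs, plus the bias...
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    # ...then a step activation: "fire" (1) only if the sum crosses 0.
    return 1 if total > 0 else 0

# Weights [1, 1] and bias -1.5 are picked by hand so the neuron
# behaves like AND: it only fires when both inputs are 1.
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", perceptron([a, b], [1.0, 1.0], -1.5))
```

Only the last line prints `1 1 -> 1`; for every other input pair the weighted sum stays below the threshold and the neuron stays inactive.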

## How Does a Neural Network Train?

- **Feed data into the network**: Input your training data into the network through the input layer.
- **Forward propagation**: The data moves through the hidden layers, and the network makes a prediction based on its current weights and biases.
- **Calculate loss**: Compare the network's prediction to the actual target (desired output) and calculate the error, known as the loss.
- **Backpropagation**: Adjust the weights and biases by passing the error backward through the network to minimize the loss.
- **Repeat and optimize**: Continue feeding data and adjusting weights over multiple iterations (epochs) until the network learns to make accurate predictions.

![Training](../../pictures/Chapter1/training.gif)
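The training loop can be sketched for the smallest possible "network": a single linear neuron learning that y = 2x. The data and learning rate are made up for illustration, but the loop follows the same five steps: feed data, forward propagation, loss, backpropagation, repeat.

```python
# Training data: the network should learn that y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w, b = 0.0, 0.0   # start with an untrained weight and bias
lr = 0.05         # learning rate: how big each correction step is

for epoch in range(200):                 # repeat and optimize
    for x, y_true in data:               # feed data into the network
        y_pred = w * x + b               # forward propagation
        loss = (y_pred - y_true) ** 2    # calculate loss (squared error)
        grad = 2 * (y_pred - y_true)     # backpropagation: d(loss)/d(pred)
        w -= lr * grad * x               # adjust the weight...
        b -= lr * grad                   # ...and the bias to reduce loss

print(round(w, 2), round(b, 2))  # w approaches 2, b approaches 0
```

A real DNN does exactly this, except the backpropagation step applies the chain rule through every layer instead of a single derivative.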

## Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are a specialized type of deep learning model designed to process and analyze visual data, such as images and videos. They are particularly effective at recognizing patterns and spatial hierarchies within images, making them ideal for tasks like object detection, image classification, and facial recognition. Unlike traditional neural networks, CNNs use convolutional layers to automatically learn local features, which allows them to excel at capturing visual information. This makes CNNs the state-of-the-art approach for many image-related AI applications.

![CNN](../../pictures/Chapter1/cnn1.jpg)

[Reference](https://towardsdatascience.com/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way-3bd2b1164a53)

### Why CNNs are Different from DNNs

- **CNNs are designed for image data**:
  - CNNs are specialized for processing and analyzing images by automatically learning patterns like edges, shapes, and textures.
  - DNNs are more general and can be used for various tasks, but they don't excel at spatial pattern recognition the way CNNs do.
- **Local vs. global feature learning**:
  - CNNs use convolutional layers that focus on small regions of an image (local features), capturing spatial relationships.
  - DNNs use fully connected layers that consider the entire input at once (global features), making them less effective for image data.
- **CNNs use fewer parameters**:
  - A CNN's convolutional layers are sparsely connected (not every neuron connects to every input), reducing the number of parameters and the amount of computation.
  - A DNN's layers are fully connected, which increases the number of parameters, making them less efficient for image processing tasks.
- **Better for spatial data**:
  - CNNs excel at image-related tasks like object detection and classification because they recognize spatial hierarchies in data.
  - DNNs, although effective, do not naturally handle spatial information in the same way.
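The "fewer parameters" point is easy to verify with a little arithmetic. This sketch compares a hypothetical convolutional layer (16 filters of size 3x3) against a fully connected layer with 16 output neurons, both applied to a small 32x32 RGB image; all sizes are chosen only for illustration.

```python
# Input: a 32 x 32 RGB image (3 channels).
h, w, channels = 32, 32, 3

# Convolutional layer: 16 filters of size 3x3, shared across the
# whole image. Parameters per filter = 3*3*3 weights + 1 bias.
filters = 16
conv_params = filters * (3 * 3 * channels + 1)

# Fully connected layer with 16 output neurons: every neuron
# connects to every one of the 32*32*3 input values, plus a bias.
outputs = 16
fc_params = outputs * (h * w * channels + 1)

print(conv_params)  # 448
print(fc_params)    # 49168
```

Because the filter weights are reused at every image position, the convolutional layer needs over a hundred times fewer parameters here, and the gap grows with image size.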

### Basic CNN Structure

**Convolution Layer:**
- Extracts features from the image by applying filters (kernels) that detect patterns like edges, textures, etc.
- Output: feature maps that represent the learned patterns.

![CNN](../../pictures/Chapter1/conv.gif)

[Reference](https://compneuro.neuromatch.io/tutorials/W1D5_DeepLearning/student/W1D5_Tutorial2.html)
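The sliding-filter idea can be sketched in plain Python. Here a hypothetical 3x3 vertical-edge kernel is slid over a tiny 4x4 grayscale "image" (both made up for illustration); each output value is the sum of the element-wise products between the kernel and the image patch under it.

```python
def convolve2d(image, kernel):
    # Slide the kernel over every position where it fully fits
    # ("valid" mode) and record the sum of element-wise products.
    # (Strictly this is cross-correlation, which is what deep
    # learning frameworks call "convolution".)
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Tiny 4x4 image: dark (0) on the left, bright (9) on the right.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]

# A vertical-edge kernel: responds where brightness changes left-to-right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

feature_map = convolve2d(image, kernel)
print(feature_map)  # [[27, 27], [27, 27]]
```

Every position of this feature map sits on the dark-to-bright boundary, so the edge filter responds strongly everywhere; a filter trained by a real CNN would learn its values instead of having them hand-written.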

**Pooling Layer:**
- Reduces the size of the feature maps (down-sampling) to make computation more efficient.
- Common technique: max-pooling, where the maximum value in each region is kept to reduce the data size.

![CNN](../../pictures/Chapter1/maxpool.gif)
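Max-pooling is simple enough to write out directly. This sketch applies 2x2 max-pooling with stride 2 to a made-up 4x4 feature map, keeping only the largest value in each region and halving each dimension:

```python
def max_pool(feature_map, size=2):
    # Split the map into size x size blocks and keep each block's maximum.
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]), size)]
            for r in range(0, len(feature_map), size)]

fmap = [[1, 3, 2, 4],
        [5, 6, 1, 2],
        [7, 2, 9, 1],
        [0, 8, 3, 4]]

print(max_pool(fmap))  # [[6, 4], [8, 9]]
```

The 4x4 map shrinks to 2x2, so the layers after it have a quarter of the data to process, while the strongest responses from each region survive.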

**Fully Connected (FC) Layer:**

- A traditional layer in which every neuron is connected to every neuron in the previous layer.
- Combines the features extracted by the convolution layers to make the final prediction.

**Output Layer:**
- The final layer, where the model gives its prediction, such as identifying the object in an image.

[Reference](https://www.youtube.com/watch?v=CXOGvCMLrkA)

## What are the Popular Image Classification Architectures?

**LeNet**
- LeNet, developed by Yann LeCun in 1998, is one of the first CNN models, designed for handwritten digit recognition (e.g., the MNIST dataset).
- It has a simple structure: two convolutional layers, each followed by a pooling layer, and fully connected layers for classification.
- LeNet laid the foundation for modern CNNs and was used in early computer vision tasks like digit classification.

![CNN](../../pictures/Chapter1/lenet.png)
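One way to see how LeNet's conv and pooling layers shrink an image down to a prediction is to track the feature-map sizes. The helper below computes the output size of an unpadded convolution or pooling step and walks through LeNet's classic 32x32 input; the layer sizes follow LeCun's 1998 design, while the helper itself is just an illustrative sketch.

```python
def conv_out(size, kernel, stride=1):
    # Output width/height of a convolution (or pooling) with no padding.
    return (size - kernel) // stride + 1

size = 32                      # LeNet's 32x32 input image
size = conv_out(size, 5)       # conv, 5x5 kernels -> 28x28 (6 maps)
size = conv_out(size, 2, 2)    # 2x2 pooling       -> 14x14
size = conv_out(size, 5)       # conv, 5x5 kernels -> 10x10 (16 maps)
size = conv_out(size, 2, 2)    # 2x2 pooling       -> 5x5

flattened = size * size * 16   # 16 maps of 5x5 -> 400 values
print(flattened)               # 400, fed into FC layers 120 -> 84 -> 10
```

The final fully connected layers (120, then 84, then 10 neurons) turn those 400 values into scores for the ten digit classes.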

**VGG16**
- VGG16, created by the Visual Geometry Group at Oxford, is a deep CNN with 16 weight layers, used primarily for image classification tasks.
- It uses small 3x3 convolution filters and stacks many layers together to capture detailed features, followed by fully connected layers.
- VGG16 is popular for its simplicity and effectiveness in large-scale image classification and object detection tasks.

![CNN](../../pictures/Chapter1/vgg16.png)

## Object Detection

Object detection is a computer vision technique that identifies and localizes objects within images or video by marking them with bounding boxes. Unlike simple image classification, which labels an entire image, object detection provides spatial information, detecting multiple objects and their positions simultaneously. It enables applications ranging from autonomous driving to real-time surveillance by combining classification and localization. This makes it a crucial step toward understanding visual scenes in depth.
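Since detectors output bounding boxes, a standard way to score how well a predicted box matches a ground-truth box is Intersection over Union (IoU): the overlap area divided by the combined area. A minimal sketch, with boxes given as (x1, y1, x2, y2) corner coordinates made up for illustration:

```python
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with (x1, y1) the top-left corner.
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero if the boxes don't overlap).
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    # Union = sum of the areas minus the double-counted intersection.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.14
```

An IoU of 1.0 means a perfect match and 0.0 means no overlap; detection benchmarks commonly count a prediction as correct when its IoU with the ground truth exceeds a threshold such as 0.5.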

## Object Detection Architectures

**Two-Stage Detectors**

Two-stage detectors work in two main steps. First, they generate region proposals: likely areas of the image where objects might be located. Then, in the second stage, they refine these proposals and classify them into specific object categories. Focusing on the most relevant parts of the image improves accuracy, but the extra stage slows down processing.

e.g., R-CNN, Fast R-CNN

![RCNN](../../pictures/Chapter1/RCNN.png)

**Single-Stage Detectors**

Single-stage detectors streamline the process by predicting bounding boxes and class labels in a single pass over the image. Instead of generating region proposals first, they treat object detection as a dense prediction problem, examining the entire image at once, which makes them faster than two-stage methods. These models are generally better suited to real-time applications, though they are sometimes less accurate.

e.g., SSD and YOLO

![Yolo](../../pictures/Chapter1/YOLO.png)

articles/Chapter 1 - Introduction to AI/Introduction_to_OpenCV.md renamed to articles/Chapter 2 - Configuring the RaspberryPi Environment/Introduction_to_OpenCV.md

The file's image paths were updated from `pictures/Chapter1/` to `pictures/Chapter2/` (10 additions, 10 deletions).

New images added under `pictures/Chapter1/`: RCNN.png, YOLO.png, cnn1.jpg, conv.gif, lenet.png, maxpool.gif, nuron.gif, training.gif.
