By now, you probably know about machine learning, a branch of computer science that studies the design of algorithms that can learn. Deep learning is a sub-field of machine learning based on artificial neural networks, which in turn are inspired by biological neural networks.
Convolutional Neural Networks (CNNs) are very similar to ordinary neural networks: they are made up of neurons with learnable weights and biases. Among neural networks, CNNs are used mainly for image recognition, image classification, object detection, face recognition, and similar tasks.
What do we mean by Convolution?
Convolution comes from the Latin convolvere, which means “to roll together”. Convolution is a mathematical operation on two functions (f and g) that produces a third function describing how the shape of one is modified by the other. It is the integral that measures how much the two functions overlap as one is shifted over the other. Think of convolution as a way of combining two functions: slide one across the other, multiply them point by point at each shift, and add up the products.
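To make the "slide, multiply, and add" idea concrete, here is a minimal sketch of the discrete version of convolution in Python with NumPy. The function name convolve_1d and the sample values of f and g are made up for illustration; the result is compared against NumPy's own np.convolve.

```python
import numpy as np

def convolve_1d(f, g):
    """Full discrete convolution: out[n] = sum over k of f[k] * g[n - k]."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):
        for k in range(len(f)):
            # Only add terms where the shifted index falls inside g.
            if 0 <= n - k < len(g):
                out[n] += f[k] * g[n - k]
    return out

# Illustrative sample values (assumptions, not from the article).
f = [1, 2, 3]
g = [0, 1, 0.5]

result = convolve_1d(f, g)
print(result)             # [0.  1.  2.5 4.  1.5]
print(np.convolve(f, g))  # NumPy's built-in gives the same values
```

Each output value is the sum of products for one particular shift of g relative to f, which is the discrete counterpart of the "area of overlap" in the continuous definition.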
“The green curve shows the convolution of the blue and red curves as a function of t, the position indicated by the vertical green line. The gray region indicates the product as shown below as a function of t, so its area as a function of t is precisely the convolution.”
In a CNN, the convolution operation involves three things: the input image, a feature detector (also called a kernel or filter) that is slid over the image, and the resulting feature map.
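As a rough sketch of how a feature detector turns an input image into a feature map, the Python snippet below slides a small kernel over a tiny image with stride 1 and no padding. The function name feature_map, the 4x4 image, and the 2x2 vertical-edge kernel are all hypothetical choices for illustration. Note that, as deep learning libraries typically do, this sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    At each position, multiply the kernel with the image patch under it
    element by element and sum the products.
    """
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical 4x4 image: dark on the left half, bright on the right half.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Hypothetical feature detector that responds to dark-to-bright vertical edges.
kernel = np.array([
    [-1, 1],
    [-1, 1],
], dtype=float)

fmap = feature_map(image, kernel)
print(fmap)
# [[0. 2. 0.]
#  [0. 2. 0.]
#  [0. 2. 0.]]
```

The feature map is non-zero only in the middle column, which is exactly where the kernel sits on top of the edge between the dark and bright halves.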
How does our brain classify an image?
Whenever we see an image, our brain looks for features in it and uses them to classify the image. We categorize things by recognizing their features. To illustrate this, here are two images that we will try to classify:
In the above image, if we look at the right side, we see a person looking towards the right, while if we look at the center, we perceive the person as looking towards us.
Does our brain struggle to decide between these two readings, unsure whether the person is looking to the right or towards us?
This happens because our brain picks out the features in the image and then decides which interpretation to take.
Consider another image:
The above image depicts both a young girl looking away and an old lady wearing a scarf on her head and looking downwards. Confused? The image was created to confuse you!
What if the features were not clear, like this?
Aren’t we a bit puzzled? Is your brain able to decide what is correct?
No! That’s because the features depicted are inadequate to help the brain classify the image.
All of the above images are meant to show that our brain works from the features of the image it sees and classifies the image accordingly.
Neural networks work in a similar manner. In the image below, the neural network has successfully classified the cheetah and the bullet train but failed to predict the hand glass. This is because the features in that image are unclear.
In simple words, a neural network classifies an image much as the human mind does: by the features it finds in the image.
How does the computer see an image?
As we all know, an image is a matrix of pixels. A black and white (grayscale) image gives us a 2D array, while a color image gives us a 3D array: the extra dimension is the depth, which holds the red, green, and blue (RGB) channels, as shown below. Each pixel value lies between 0 and 255 because each value is stored in one byte (8 bits).
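The 2D and 3D arrays described above can be sketched with NumPy as follows. The image sizes and pixel values here are arbitrary assumptions chosen to keep the example small; a real image would normally be loaded from a file with an image library.

```python
import numpy as np

# Grayscale image: a 2D array of shape (height, width), one byte per pixel.
gray = np.zeros((4, 6), dtype=np.uint8)  # 4 rows, 6 columns, all black (0)
gray[1:3, 2:4] = 255                     # a small white square in the middle

# Color image: a 3D array of shape (height, width, channels).
# The last axis holds the R, G, B values of each pixel.
color = np.zeros((4, 6, 3), dtype=np.uint8)
color[..., 0] = 255                      # set the red channel everywhere

print(gray.shape, gray.dtype)    # (4, 6) uint8
print(color.shape, color.dtype)  # (4, 6, 3) uint8
print(color[0, 0])               # [255   0   0], a pure red pixel
```

The uint8 type stores exactly one byte per value, which is why pixel values are limited to the range 0 to 255.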
We have discussed what a CNN is and how an image is interpreted and classified on the basis of its features. CNNs have wide applications in image and video recognition, recommender systems, and natural language processing.
In the next blog, you will learn about the convolution operation and the ReLU layer. Until then, keep learning.
If you have any questions or suggestions, you may leave a comment below 🙂