Deep Learning has become a trending term in computer science over the last decade. There is always confusion between the terms Artificial Intelligence, Machine Learning, and Deep Learning. The following figure illustrates the relationship neatly.
As the figure shows, Machine Learning is a subset of Artificial Intelligence, and Deep Learning is a subset of Machine Learning.
Why Do We Need Deep Learning?
As we have seen, ML and DL are both subsets of AI. That simply means their ultimate goal is to achieve intelligence like the human brain. In practice, it is extremely difficult for a machine to learn the way a human brain learns. Machine Learning deals with computer algorithms that help computers understand and recognize things the way the human brain can. In Machine Learning, we need domain experts to reduce the complexity of the data and make patterns more visible for learning algorithms to work on.
Everything was fine with machine learning until the "big data era". The volume of data is now huge in every industry, and making predictions from such a volume is difficult. ML algorithms can still work in this scenario, but the high volume of data and the need for domain experts led to the need for, and the innovation of, deep learning.
Of course, deep learning requires far more computing resources than ML algorithms. It uses a deep network to learn from the huge amount of data provided. Deep learning algorithms try to learn high-level features from data in an incremental manner. This removes the need for the domain expertise and hand-crafted feature extraction that machine learning algorithms required.
Who Invented Deep Learning?
The term Deep Learning was introduced in the context of artificial neural networks by Igor Aizenberg in 2000. But it actually became popular in 2012 with the victory in the ImageNet competition, where the winners used deep learning concepts and techniques to build their object recognition solution.
In the year 2000, the Vanishing Gradient Problem was identified: "features" formed in lower layers were not being learned by the upper layers because no learning signal reached those layers. This was not a fundamental problem for all neural networks; it was restricted to gradient-based learning methods. The problem was caused by certain activation functions that squash their input into a very small output range, so large regions of the input get mapped onto an extremely small range.
In 2001, a research report compiled by the Gartner Group described the challenges and opportunities of three-dimensional data growth. The report marked the importance of Big Data and described the increasing volume and velocity of data as well as the widening range of data sources and types.
Fei-Fei Li, an AI professor at Stanford, launched ImageNet in 2009, assembling a free database of more than 14 million labeled images. These images were used as inputs to train neural networks. By 2011, GPU speeds had increased significantly, making it possible to train convolutional neural networks without layer-by-layer pre-training. Deep learning holds significant advantages in efficiency and speed.
How Does Deep Learning Work?
Deep learning is a subset of machine learning; it examines computer algorithms that learn and improve on their own. Doesn't it sound like magic? Of course it does! We often wonder how Facebook recognizes our friends in a picture we post and automatically tags them. Sometimes it feels as if Facebook is reading our minds. How does it understand things automatically? How do Google, Facebook, LinkedIn, and Twitter know which people to suggest we follow? These amazing things are part of AI.
Like a human brain, a neural network has neurons to remember things. These neurons are connected with each other, which is why such systems are called neural networks. In our brain, neurons are specialized cells; in a neural network diagram, a neuron is usually drawn as a circle. Now the main question comes to mind: this is all true, but where does the actual magic of a neural network lie?
To understand that, let's first look at the layers a network's neurons are grouped into. Neurons are generally organized into the following types of layers.
- Input Layer
- Hidden Layer(s)
- Output Layer
Now let us understand those layers with an example. Suppose you want to predict airline ticket prices. Roughly, which parameters will affect the price of a ticket? They are listed below:
- Source Airport
- Destination Airport
- Date
- Airline
The Input Layer receives the input data. In our example, the source and destination airports, the date, and the airline are the input data for the input layer. The input layer passes the input to the first hidden layer.
A Hidden Layer receives input from the previous layer and performs various mathematical computations on it. One of the challenging tasks when creating a neural network is deciding how many hidden layers to use and how many neurons to put in each layer. The word "Deep" in deep learning simply means that there is more than one hidden layer.
The Output Layer gives us the desired output. In our example, it returns the predicted price.
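To make the architecture concrete, here is a minimal sketch of such a network in Python using Keras (the article names no library, so the choice of Keras, the layer sizes, and the assumption that the four inputs have already been encoded as numbers are all illustrative assumptions):

```python
# A minimal sketch of the airline-price network described above.
# We assume the four inputs (source airport, destination airport, date, airline)
# have already been converted into numeric features.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(4,)),             # input layer: source, destination, date, airline
    layers.Dense(16, activation="relu"),  # hidden layer 1
    layers.Dense(8, activation="relu"),   # hidden layer 2 ("deep" = more than one hidden layer)
    layers.Dense(1),                      # output layer: predicted ticket price
])

model.compile(optimizer="adam", loss="mse")
model.summary()
```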
Now that we have learned the basic architecture of deep learning, let us dive into the magic of how it predicts the ticket price.
To answer this question, we should first understand that every neuron is associated with a weight. What is this weight? It indicates the importance of each input parameter. Initially, the weights are set randomly. In our example, the departure date is the heaviest factor in deciding the ticket price, so the departure date will end up with a larger weight value than the other parameters.
One more important thing is that each neuron has an activation function, which will be explained in another post. Once the input data has passed through all the layers, the network calculates the final output and passes it to the output layer.
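Here is a tiny illustrative sketch of what a single neuron does with weights and an activation function. The input values, the random weights, and the choice of ReLU as the activation are assumptions for illustration only:

```python
# One neuron: multiply each input by its weight, sum the results,
# then pass the sum through an activation function.
import numpy as np

rng = np.random.default_rng(0)

x = np.array([2.0, 5.0, 30.0, 1.0])   # encoded source, destination, date, airline (made-up values)
w = rng.normal(size=4)                # weights start out random
b = 0.0                               # bias term

weighted_sum = np.dot(w, x) + b       # importance-weighted combination of the inputs

def relu(z):
    # ReLU activation: passes positive values through and zeroes out negatives
    return np.maximum(0.0, z)

output = relu(weighted_sum)           # this value is what the neuron sends to the next layer
print(output)
```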
How Do We Train the Neural Network?
Training is the most difficult task, as it requires a large dataset and a lot of computational power.
In our example, we need historical data about ticket prices. Because there are many possible combinations of airports and departure dates, we need a huge number of ticket price records. To train the network, we feed in this data as input; at first, it may predict outputs that are far from those in the original dataset.
Once we have gone through the whole dataset, we can create a function that shows us how far the network's outputs were from the real outputs. This function is called the Cost Function. Ideally, we want the cost function to be zero, meaning our predicted output is the same as the output in the dataset.
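A minimal sketch of this idea, using mean squared error as the cost function (one common choice; the article does not name a specific cost function, and the price values below are made up for illustration):

```python
# Compare the network's predicted prices with the real prices from the
# historical dataset; a cost of 0 would mean a perfect match.
import numpy as np

real_prices      = np.array([120.0, 250.0, 90.0, 310.0])   # from the historical dataset
predicted_prices = np.array([150.0, 200.0, 95.0, 280.0])   # what the untrained network guessed

cost = np.mean((predicted_prices - real_prices) ** 2)       # mean squared error
print(cost)
```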
Conclusion
- Deep Learning uses a Neural Network to behave and work like animal intelligence.
- There are mainly 3 types of layers of neurons: Input Layer, Hidden Layer(s), and Output Layer.
- Neurons have connections that are associated with a weight, dictating the importance of the input value.
- Neurons apply an Activation Function on the input data to "standardize" the output of the neuron.
- A large dataset is needed to train a Neural Network.
- The Cost Function is produced by iterating through the dataset and comparing the outputs; it indicates how far the network's outputs are from the real outputs.
- To reduce the cost function after every iteration through the dataset, the weights between neurons are adjusted using Gradient Descent (see the sketch after this list).
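Since gradient descent is only mentioned in passing above, here is a minimal sketch of one gradient-descent update for a single weight. The toy numbers, the learning rate, and the squared-error cost are all illustrative assumptions:

```python
# Gradient descent on one weight: nudge the weight in the direction
# that lowers the cost, repeating over many steps.
import numpy as np

x, target = 3.0, 6.0                     # one input and its true output
w = np.random.default_rng(1).normal()    # weight starts random
learning_rate = 0.01

for step in range(100):
    prediction = w * x
    cost = (prediction - target) ** 2
    gradient = 2 * (prediction - target) * x   # d(cost)/d(w)
    w -= learning_rate * gradient              # move w against the gradient

print(w, cost)   # w approaches 2.0 and the cost approaches 0
```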