Introduction
In the early 1990s, cognitive neuroscientists proposed a class of theories about brain development called 'deep learning'. These theories posited that the brain develops through a process of hierarchical abstraction, in which higher-level concepts are built upon the foundations laid by lower-level ones. For example, a child might first learn to identify individual objects, then groups of objects, then relationships between objects. This process of abstraction is thought to be crucial for the acquisition of complex knowledge.
In recent years, 'deep learning' has emerged as a powerful tool in Artificial Intelligence (AI). Deep Learning is a type of Machine Learning (ML) inspired by the structure and function of the human brain. Deep Learning networks are composed of a large number of interconnected processing nodes, or neurons, which learn to identify patterns in data. By analogy with the human brain, these networks are called Artificial Neural Networks.
Let's have a look at the main difference between ML and Deep Learning through a concrete example. Consider the task of recognising handwritten digits. With a traditional ML algorithm, we would first design features by hand (e.g. the number of loops or straight strokes in a digit) and then train the algorithm on a large set of labelled data: images of handwritten digits, each described by those hand-crafted features, along with the correct answer (e.g. '5'). The algorithm would then recognise digits in new images by comparing their features to those seen during training. A Deep Learning algorithm, in contrast, does not need hand-crafted features. It is trained on the raw images themselves (still paired with their correct labels) and learns on its own which features and patterns distinguish the digits, assigning probabilities to the different digit labels and outputting the most likely one for a given image. In this sense, Deep Learning requires less human input than traditional ML.
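To make the idea of "learning from labelled data" concrete, here is a minimal sketch of supervised training. It uses a single artificial neuron (logistic regression) rather than a deep network, and the four "examples" with two features each are made-up numbers, not real digit data. The point is that the weights are adjusted automatically to fit the labels; no classification rule is written by hand.

```python
import numpy as np

# Toy stand-in for a labelled dataset: two features per example
# (these numbers are invented for illustration, not real digit data)
X = np.array([[0.1, 0.9], [0.2, 0.8], [0.9, 0.2], [0.8, 0.1]])
y = np.array([0, 0, 1, 1])  # correct answers supplied by a human

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)  # weights, initially zero
b = 0.0          # bias
lr = 1.0         # learning rate

# Gradient-descent training: repeatedly nudge w and b so the
# predictions move closer to the labels
for _ in range(500):
    p = sigmoid(X @ w + b)            # current predicted probabilities
    grad_w = X.T @ (p - y) / len(y)   # gradient of the loss w.r.t. w
    grad_b = np.mean(p - y)           # gradient of the loss w.r.t. b
    w -= lr * grad_w
    b -= lr * grad_b

preds = (sigmoid(X @ w + b) > 0.5).astype(int)
print(preds)  # the learned rule now reproduces the labels [0 0 1 1]
```

After training, the model classifies the examples correctly even though we never told it how the features relate to the classes.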
Neural Networks
The majority of Deep Learning algorithms are based on Artificial Neural Networks (Neural Networks for short), which are inspired by the brain's own networks of neurons. They consist of a large number of neurons, where all the processing of information occurs, connected together through channels. The strength of each connection is determined by a number called a weight, hence a weighted channel. Each neuron also has its own number called a bias, which is added to the weighted sum of its inputs and shifts the neuron's activation (and hence whether the information is passed on to the following layer or otherwise). The weights and biases are updated continuously, through repeated cycles of forward propagation (computing predictions) and backpropagation (adjusting the weights and biases to reduce the error), in order to produce a well-trained neural network.
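The computation performed by one neuron can be sketched in a few lines: a weighted sum of its inputs, plus the bias, passed through an activation function. All the numbers below are illustrative, not taken from a trained network.

```python
import numpy as np

def sigmoid(z):
    # Activation function: squashes the weighted sum into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# One neuron with two incoming channels (illustrative values)
inputs  = np.array([0.5, 0.3])   # values arriving on the channels
weights = np.array([0.8, -0.4])  # strength of each channel
bias    = 0.1                    # per-neuron offset added to the sum

z = np.dot(inputs, weights) + bias   # weighted sum of inputs plus bias
activation = sigmoid(z)              # value passed to the next layer
print(round(activation, 3))          # 0.594
```

During training, backpropagation adjusts exactly these weights and this bias so that the neuron's output moves closer to the desired one.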
Although they share the same building blocks, Neural Networks are used to solve distinct types of problems, and a Deep Learning Neural Network may contain any number of layers with any number of neurons in each layer. The image below shows a simple Neural Network built to determine whether a given fruit is an apple or a banana, based on the weight and shape of the fruit. The first layer (the input layer) takes the weight and shape of a fruit as input. The second layer (the hidden layer) contains 2 neurons connected to the input layer; these learn to detect useful combinations of the input features. Finally, the output layer contains a single neuron, which outputs the more likely of the two fruits.
Source: https://victorzhou.com/965173626f97e1e6b497a136d0c14ec1/network2.svg
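A forward pass through a 2-2-1 network like the one pictured can be sketched as follows. The weights, biases, and the mapping of the output to "apple" versus "banana" are all made-up numbers chosen for illustration; in a real network they would be learned from labelled fruit data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 2 inputs (weight, shape) -> 2 hidden neurons -> 1 output neuron.
# All parameters below are invented for the sketch, not trained values.
W1 = np.array([[ 0.9, -0.6],
               [-0.5,  0.7]])   # input -> hidden weights
b1 = np.array([0.1, -0.1])      # hidden-layer biases
W2 = np.array([0.8, -0.9])      # hidden -> output weights
b2 = 0.05                       # output-neuron bias

def predict(fruit_weight, fruit_shape):
    x = np.array([fruit_weight, fruit_shape])
    hidden = sigmoid(W1 @ x + b1)        # hidden-layer activations
    output = sigmoid(W2 @ hidden + b2)   # interpreted as P(apple)
    return "apple" if output > 0.5 else "banana"

print(predict(0.9, 0.2))   # heavy, round-ish fruit
print(predict(0.1, 0.9))   # light, elongated fruit
```

Note how the data flows strictly forward, layer by layer, from inputs to output; this is the forward propagation mentioned above.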
Types of Deep Learning algorithms
Two of the most widely used types of Deep Learning algorithms are Convolutional Neural Networks and Recurrent Neural Networks.
Convolutional Neural Networks (CNNs) are designed to process and identify patterns in image data. A CNN is trained by exposing it to a large number of images containing different objects (e.g. animals, cars, etc.). It learns to identify the features and patterns in these images and assigns a probability to each label, outputting the most likely object for any given image. The broader field of computer image and video analysis and understanding is called Computer Vision.
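The operation that gives CNNs their name is convolution: a small filter (kernel) slides over the image, taking a weighted sum at each position. This toy sketch applies a hand-written vertical-edge kernel to a tiny 4x4 "image"; in a real CNN the kernel values would be learned during training rather than chosen by hand.

```python
import numpy as np

# A tiny 4x4 grayscale "image": dark on the left, bright on the right
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# A 3x3 kernel that responds strongly to vertical edges
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)

def convolve2d(img, k):
    # Slide the kernel over every valid position and take weighted sums
    kh, kw = k.shape
    oh = img.shape[0] - kh + 1
    ow = img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * k)
    return out

response = convolve2d(image, kernel)
print(response)  # strong (non-zero) response: the image contains a vertical edge
```

Early CNN layers learn simple filters like this one (edges, corners); deeper layers combine them into detectors for whole objects.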
Recurrent Neural Networks (RNNs) are designed to process sequences of data, such as text, audio or video. In an RNN, the connections between nodes form a directed graph along a temporal sequence, which allows the network to exhibit dynamic temporal behaviour on time series or sequential data. As a result, an RNN can use its internal state (memory) to recall past inputs and decisions while processing new information. This makes it well suited to Natural Language Processing, which deals with understanding and generating human language that has structure over time, such as sentences, paragraphs and tone. Such algorithms analyse text and extract meaning from it, enabling computers to communicate with humans. A notable application is Sentiment Analysis, a Natural Language Processing technique used to analyse the emotions expressed in text. The goal of Sentiment Analysis is to automatically detect the opinion of a text, whether it be positive, negative, or neutral.
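The "memory" of an RNN can be sketched as a single recurrent step applied repeatedly: the new hidden state depends on both the current input and the previous state, so information from earlier in the sequence influences later steps. The sizes and random weights below are illustrative, not a trained model.

```python
import numpy as np

# Illustrative (untrained) parameters for a tiny RNN cell:
# 2-dimensional inputs, 3-dimensional hidden state
rng = np.random.default_rng(0)
Wx = rng.normal(size=(3, 2)) * 0.5   # input -> hidden weights
Wh = rng.normal(size=(3, 3)) * 0.5   # hidden -> hidden (the memory path)
b  = np.zeros(3)                     # hidden-state biases

def rnn_step(x, h):
    # The new state mixes the current input with the previous state
    return np.tanh(Wx @ x + Wh @ h + b)

sequence = [np.array([1.0, 0.0]),
            np.array([0.0, 1.0]),
            np.array([1.0, 1.0])]

h = np.zeros(3)  # empty memory before reading anything
for t, x in enumerate(sequence):
    h = rnn_step(x, h)
    print(f"state after step {t}: {np.round(h, 3)}")
```

Because each state feeds into the next, the final state summarises the whole sequence; in Sentiment Analysis, for example, a classifier on top of that final state could predict whether the text is positive, negative, or neutral.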