Deep Belief Networks
Geoffrey Hinton
0) Introduction:
Deep Belief Networks (DBNs) are probabilistic graphical models that are generative in nature.
For example, a trained DBN can generate plausible values for all of the variables it models for the case at hand, rather than only predicting a label.
They combine ideas from probability and statistics with machine learning and neural networks.
DBNs consist of multiple layers of stochastic units, with connections between the layers but no connections between the units within a layer.
The main aim is to help the system classify the data into different categories.
1) Methods:
Network Evolution:
The First Generation:
Neural networks of the first generation used perceptrons, which identified a particular object by weighting a set of pre-fed input properties.
However, perceptrons were only effective at a basic level: they could learn only simple decision rules and were not useful for more advanced applications.
The Second Generation:
The second generation introduced the concept of backpropagation, in which the network's output is compared with the desired output and the weights are adjusted to reduce the error.
Support Vector Machines then handled new test cases by comparing them with previously seen training cases.
Next came directed acyclic graphs called belief networks, which helped in solving problems of inference and learning. These were followed by Deep Belief Networks, which make it possible to obtain unbiased values at the leaf nodes.
Restricted Boltzmann Machines:
DBNs are composed of stacked unsupervised networks such as RBMs.
In this stack, the hidden layer of each sub-network serves as the visible layer of the next. Within a hidden layer, the units are not connected to each other and are conditionally independent given the layer below.
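As an illustration of this conditional independence, using standard RBM notation that is not defined in the original text (visible units v_i, hidden units h_j, weights w_ij, biases a_i and b_j, and the logistic sigmoid \sigma), the conditional distributions factorize over the units of each layer:

P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_i v_i w_{ij}\Big), \qquad P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_j w_{ij} h_j\Big)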
The probability of a joint configuration over both the visible and hidden units depends on that configuration's energy compared with the energy of all other joint configurations.
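In the standard energy-based formulation (same assumed notation as above, with Z the partition function), this relationship can be written as

E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j,

P(v, h) = \frac{e^{-E(v, h)}}{Z}, \qquad Z = \sum_{v', h'} e^{-E(v', h')},

so a joint configuration is probable exactly when its energy is low relative to the energy of all other configurations.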
Training a DBN:
The first step is to train a layer of features that receives its input directly from the pixels.
The next step is to treat the activations of this layer as if they were pixels and to learn features of those features in a second hidden layer.
Every time another layer of features is added to the belief network, the lower bound on the log probability of the training data improves.
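The bound in question is the variational lower bound used in the greedy-training argument. In rough form, for a visible vector v and an approximate posterior Q(h \mid v) computed by the layer below,

\log p(v) \;\ge\; \sum_h Q(h \mid v)\,\big[\log p(h) + \log p(v \mid h)\big] \;+\; H\big(Q(h \mid v)\big),

where H denotes entropy; training each new layer to model the hidden activities of the layer below improves the \log p(h) term and therefore the bound.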
Implementation:
The implementation uses the MNIST dataset of handwritten digits.
An important thing to keep in mind is that implementing a Deep Belief Network requires training each RBM layer in turn.
First, the units and parameters are initialized.
This is followed by the two phases of the Contrastive Divergence algorithm: the positive phase and the negative phase.
In the positive phase, the binary states of the hidden units are obtained by computing their activation probabilities from the weights and the visible units. Because this phase increases the probability of the training data, it is called the positive phase.
The negative phase decreases the probability of samples generated by the model.
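As a rough sketch of these two phases, the following NumPy code performs one CD-1 update for a single RBM. The function name, layer sizes, batch size, and learning rate here are illustrative assumptions rather than details taken from the original description.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr=0.1):
    # Positive phase: hidden probabilities and sampled binary states driven by the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible units, then recompute hidden probabilities.
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Update rule: positive statistics minus negative statistics, averaged over the batch.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    a = a + lr * (v0 - pv1).mean(axis=0)
    b = b + lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

# Example: one update on a random binary batch standing in for 28x28 MNIST images.
v0 = (rng.random((64, 784)) > 0.5).astype(float)
W = rng.normal(0.0, 0.01, size=(784, 500))   # small random initial weights
a, b = np.zeros(784), np.zeros(500)          # zero-initialized biases
W, a, b = cd1_update(v0, W, a, b)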
A greedy learning algorithm is used to train the entire Deep Belief Network: it trains one RBM at a time, until all of the RBMs have been trained.
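Building on that sketch (and reusing the np, rng, sigmoid, cd1_update, and v0 definitions from it), a minimal version of the greedy layer-by-layer procedure might look like the following; the hidden-layer sizes and epoch count are again assumptions made for illustration.

def train_dbn(data, hidden_sizes, epochs=10, lr=0.1):
    # Greedily train a stack of RBMs, one at a time, from the bottom up.
    rbms = []
    x = data
    for n_hidden in hidden_sizes:
        n_visible = x.shape[1]
        W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        a, b = np.zeros(n_visible), np.zeros(n_hidden)
        for _ in range(epochs):
            W, a, b = cd1_update(x, W, a, b, lr)
        rbms.append((W, a, b))
        # Treat this layer's hidden activations as the "pixels" seen by the next RBM.
        x = sigmoid(x @ W + b)
    return rbms

# Example: a 784-500-500 stack trained on the stand-in batch from above.
dbn = train_dbn(v0, hidden_sizes=[500, 500])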