MNIST with Multi-Layered Perceptron

MNIST is the "Hello World" of deep learning. Here, I try to understand the basics of TensorFlow by following the tutorial for recognizing handwritten digits. The MNIST dataset can be loaded through TensorFlow's own tutorial module. It contains grayscale images of 28x28 pixels. Each image shows a single handwritten digit (0 to 9), and the task is to recognize that digit.

In [1]:
import tensorflow as tf
In [2]:
import matplotlib.pyplot as plt
%matplotlib inline
In [3]:
from tensorflow.examples.tutorials.mnist import input_data
In [4]:
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)
Extracting /tmp/data/train-images-idx3-ubyte.gz
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
In [7]:
sample = mnist.train.images[2].reshape(28, 28)
In [9]:
plt.imshow(sample, cmap = 'Greys')
Out[9]:
<matplotlib.image.AxesImage at 0x1830f3c1b38>

So this is how each image looks in the dataset.

Parameters

To build our model, we need to first define 3 parameters:

  • The learning rate sets the step size the optimizer uses when it updates the weights to reduce the cost
  • The number of training epochs is the number of full passes the model makes over the training data
  • The batch size is the number of training examples fed to the model in each update step
In [10]:
learning_rate = 0.001
training_epochs = 15
batch_size = 100
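
To make these three values concrete, here is a small arithmetic sketch of how many weight updates they imply. It assumes the training split holds 55,000 images, which is what this loader normally reports through mnist.train.num_examples; the variable names batches_per_epoch and total_updates are only illustrative.

```python
# Assumption: 55,000 training images (the usual size of mnist.train for this
# loader). Replace with mnist.train.num_examples if your copy differs.
num_examples = 55000
batch_size = 100
training_epochs = 15

# One weight update happens per batch, so this is the number of updates per epoch.
batches_per_epoch = num_examples // batch_size

# Total number of optimizer steps over the whole training run.
total_updates = batches_per_epoch * training_epochs

print(batches_per_epoch)  # 550
print(total_updates)      # 8250
```

A smaller batch size gives more (noisier) updates per epoch, while a larger one gives fewer, smoother updates.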

Network Parameters

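These parameters define the shape of the neural network. The two hidden-layer sizes can be adjusted based on the data and user preference, while n_input is fixed at 784 (each 28x28 image flattened into one vector) and n_classes at 10 (the digits 0 to 9).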

In [12]:
n_hidden_1 = 256
n_hidden_2 = 256
n_input = 784
n_classes = 10
samples = mnist.train.num_examples
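
As a rough sanity check on these sizes, the sketch below counts the trainable values in a fully connected network with this layout. The helper count_parameters is a hypothetical name introduced here, not part of TensorFlow.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the width of each layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # weight matrix between two layers
        total += fan_out           # one bias per unit in the receiving layer
    return total


# n_input, n_hidden_1, n_hidden_2, n_classes from the cell above
layer_sizes = [784, 256, 256, 10]
print(count_parameters(layer_sizes))  # 269322
```

Most of the parameters sit in the first weight matrix (784 x 256), so widening the first hidden layer is the most expensive change.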

Inputs

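As in any statistical model, we need inputs and targets. The placeholder x holds the flattened image pixels (784 features per image), and y holds the one-hot labels, which serve as the targets during training and as the ground truth during evaluation. The first dimension is None so that any batch size can be fed in.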

In [13]:
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
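
Because the data was loaded with one_hot=True, each label arrives as a 10-element vector rather than a single digit. The following is a minimal plain-Python sketch of that encoding and of flattening an image; one_hot, decode and flatten are illustrative helper names, not TensorFlow functions.

```python
def one_hot(label, n_classes=10):
    """Encode a digit as a vector with a single 1.0 at index `label`."""
    vec = [0.0] * n_classes
    vec[label] = 1.0
    return vec


def decode(vec):
    """Recover the digit: the index of the largest entry."""
    return max(range(len(vec)), key=lambda i: vec[i])


def flatten(image):
    """Turn a 2-D image (list of rows) into one flat list of pixels."""
    return [pixel for row in image for pixel in row]


print(one_hot(3))          # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(one_hot(3)))  # 3

blank_image = [[0.0] * 28 for _ in range(28)]
print(len(flatten(blank_image)))  # 784
```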

Multi Layer Perceptron

The model consists of 2 hidden layers. The input goes to the first hidden layer, where it is multiplied by a weight matrix (initialized randomly) and a bias is added. The result is passed through an activation function (here, ReLU) and on to the second hidden layer, which does the same. From the second hidden layer, the data reaches the output layer, which produces one raw score (logit) per digit class. The loss is computed from these scores and the true labels. We then apply an optimizer, which adjusts the weights and biases across the network to lower the loss in subsequent steps. Following the tutorial, we use the Adam optimizer.

In [14]:
def multilayer_perceptron(x, weights, biases):
    
    # First Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    
    # Second Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    
    # Last Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer
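
To see what the function above computes without building a TensorFlow graph, here is a plain-Python sketch of the same forward pass on a single input vector. The tiny 2-2-2-2 network and its hand-picked weights are made up purely for illustration; dense, relu and forward are hypothetical helper names.

```python
def relu(values):
    """ReLU activation: negative entries become 0."""
    return [max(0.0, v) for v in values]


def dense(values, weight, bias):
    """Compute values @ weight + bias for one input vector.

    weight has one row per input and one column per output unit.
    """
    return [
        sum(values[i] * weight[i][j] for i in range(len(values))) + bias[j]
        for j in range(len(bias))
    ]


def forward(x, weights, biases):
    """Two ReLU hidden layers followed by a linear output layer."""
    layer_1 = relu(dense(x, weights['h1'], biases['b1']))
    layer_2 = relu(dense(layer_1, weights['h2'], biases['b2']))
    return dense(layer_2, weights['out'], biases['out'])


# Illustrative values only; the real network uses 784-256-256-10.
toy_weights = {
    'h1': [[1.0, -1.0], [0.5, 1.0]],
    'h2': [[1.0, 0.0], [-3.0, 1.0]],
    'out': [[1.0, 1.0], [2.0, -2.0]],
}
toy_biases = {'b1': [0.0, 0.0], 'b2': [0.0, 0.5], 'out': [0.1, 0.0]}

logits = forward([1.0, 2.0], toy_weights, toy_biases)
print(logits)  # roughly [3.1, -3.0]
```

Note that the output layer applies no activation: it returns raw logits, and the softmax is applied later inside the cost function.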

The weights and biases are randomly generated and stored as TensorFlow variables. These values will be updated by the optimizer during training; the activation function only transforms the layer outputs and does not change them.

In [15]:
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
In [16]:
pred = multilayer_perceptron(x, weights, biases)

Cost and Optimization functions

Calculate the cost of each run as the mean cross-entropy over the batch, i.e. a measure of how far the predicted class probabilities (the softmax of the network's outputs) are from the actual one-hot label of each image.
For optimization, we use the built-in Adam Optimizer.

In [33]:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = pred, labels = y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
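
As a worked illustration of the cost, the sketch below computes softmax cross-entropy in plain Python. It is meant to show the formula only and is not how TensorFlow implements tf.nn.softmax_cross_entropy_with_logits internally; the function names are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log of the softmax probabilities."""
    largest = max(logits)
    log_sum = math.log(sum(math.exp(z - largest) for z in logits))
    return [z - largest - log_sum for z in logits]


def cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    log_probs = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(one_hot_label, log_probs))


def mean_cross_entropy(batch_logits, batch_labels):
    """Average the per-example loss over a batch, like tf.reduce_mean."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


label = [0.0] * 10
label[3] = 1.0

# A network that scores every class equally has loss ln(10), about 2.3026.
print(cross_entropy([0.0] * 10, label))

# A confident, correct prediction has a loss close to 0.
confident = [0.0] * 10
confident[3] = 20.0
print(cross_entropy(confident, label))
```

The loss depends only on the probability assigned to the true class, so it shrinks toward 0 as that probability approaches 1.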
In [21]:
init = tf.global_variables_initializer()
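
For intuition about what the optimizer does at each step, here is a sketch of the textbook Adam update applied to a single scalar weight. It uses the commonly quoted defaults (beta1=0.9, beta2=0.999, epsilon=1e-8) and a toy quadratic loss chosen for illustration; TensorFlow's internal formulation differs in small details, so treat this as an approximation of the idea rather than the library's code.

```python
import math


def adam_minimize(grad_fn, w, learning_rate=0.001, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, steps=1000):
    """Run `steps` Adam updates on a scalar weight and return the result."""
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + epsilon)
    return w


def toy_gradient(w):
    """Gradient of the toy loss (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)


# The very first step moves the weight by about one learning rate.
print(adam_minimize(toy_gradient, 0.0, learning_rate=0.1, steps=1))

# After many steps the weight settles near the minimum at 3.
print(adam_minimize(toy_gradient, 0.0, learning_rate=0.1, steps=1000))
```

Because the update is divided by the square root of the squared-gradient average, the step size stays near the learning rate regardless of how large the raw gradient is.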

Running the Session

In [26]:
# Launch the session
sess = tf.InteractiveSession()

# Initialize all the variables
sess.run(init)

# Training epochs
# Each epoch is one full pass over the training data
# No early-stopping condition is set here, so all epochs always run
for epoch in range(training_epochs):

    # Start with cost = 0.0
    avg_cost = 0.0

    # Convert total number of batches to integer
    total_batch = int(samples/batch_size)

    # Loop over all batches
    for i in range(total_batch):

        # Grab the next batch of training data and labels
        batch_x, batch_y = mnist.train.next_batch(batch_size)

        # Run one optimization step and fetch the loss for this batch
        # sess.run returns one value per fetched item, but we only need 'c', the cost
        # So we assign the optimizer's result to an underscore as a "throwaway"
        _, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})

        # Compute average loss
        avg_cost += c / total_batch

    print("Epoch: {} cost={:.4f}".format(epoch+1,avg_cost))

print("Model has completed {} Epochs of Training".format(training_epochs))
Epoch: 1 cost=160.3345
Epoch: 2 cost=41.1100
Epoch: 3 cost=25.7214
Epoch: 4 cost=17.8961
Epoch: 5 cost=12.9697
Epoch: 6 cost=9.6128
Epoch: 7 cost=7.1332
Epoch: 8 cost=5.3724
Epoch: 9 cost=4.1054
Epoch: 10 cost=3.0427
Epoch: 11 cost=2.3493
Epoch: 12 cost=1.7292
Epoch: 13 cost=1.3395
Epoch: 14 cost=1.0423
Epoch: 15 cost=0.8322
Model has completed 15 Epochs of Training

Evaluation

We evaluate the performance of our model by calculating how many images were recognized correctly out of the total images.

In [27]:
correct_predictions = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
In [29]:
correct_predictions = tf.cast(correct_predictions, "float")
In [30]:
accuracy = tf.reduce_mean(correct_predictions)
In [31]:
print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
Accuracy: 0.9421
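
The evaluation cells above boil down to comparing the index of the largest predicted score with the index of the 1 in the one-hot label. Below is a plain-Python sketch of that computation on a made-up batch of four 3-class examples; argmax and accuracy_score are illustrative names.

```python
def argmax(values):
    """Index of the largest entry, like tf.argmax along one row."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy_score(batch_logits, batch_labels):
    """Fraction of rows whose predicted class matches the labeled class."""
    matches = [
        1.0 if argmax(logits) == argmax(label) else 0.0
        for logits, label in zip(batch_logits, batch_labels)
    ]
    return sum(matches) / len(matches)


# Made-up scores for illustration: the last prediction is wrong.
toy_logits = [
    [2.0, 0.1, -1.0],
    [0.3, 1.5, 0.2],
    [-0.5, 0.0, 4.0],
    [1.0, 0.9, 0.8],
]
toy_labels = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

print(accuracy_score(toy_logits, toy_labels))  # 0.75
```

Casting the boolean matches to floats and averaging them, as the notebook does with tf.cast and tf.reduce_mean, gives the same fraction.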

The accuracy may be improved by adding hidden layers or training for more epochs, although neither change is guaranteed to help.