Hello World of Deep Learning - Classifying Handwritten Digits with a Neural Network

Understanding deep learning requires familiarity with mathematical concepts such as tensors, tensor operations, differentiation, and gradient descent.

Before diving deep into these concepts, let's build a simple neural network that can recognize handwritten digits. This task is generally considered the Hello World of deep learning. In this article, we will try to understand the concepts in a practical way, by applying them to the model we build rather than writing out full-length mathematical definitions.

Problem Statement : Classify grayscale images of handwritten digits (28x28 pixels) into 10 categories (0 to 9). We will use the MNIST dataset, which has 60,000 training images and 10,000 test images.

(Sample digits from the dataset)

The dataset comes preloaded in Keras, and the data is in the form of NumPy arrays.

Load the Data :
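The loading code is not reproduced in this text. A minimal sketch, assuming TensorFlow's bundled Keras and its mnist.load_data() helper, could look like the following. The wrapper name load_mnist is hypothetical and only used for illustration.

```python
def load_mnist():
    """Return ((train_images, train_labels), (test_images, test_labels))."""
    # Imported inside the function so this snippet can be loaded even on a
    # machine without TensorFlow (assumption: TensorFlow's bundled Keras).
    from tensorflow.keras.datasets import mnist

    # load_data() downloads the dataset on first use and caches it locally.
    return mnist.load_data()


# Typical usage (needs TensorFlow and, on the first call, network access):
#   (train_images, train_labels), (test_images, test_labels) = load_mnist()
```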

train_images and train_labels are the NumPy arrays we will train our model on. After training, we will test the model on test_images and test_labels to evaluate its performance.

Each image is encoded as a 2D NumPy array of 28x28 values, since the handwritten digits are grayscale images, and the labels are the digits 0 to 9.

Now let's explore the training data:

First, look at the shape of train_images: (60000, 28, 28). It means we have a total of 60000 images, each of size 28x28. We also have 60000 train_labels, digits from 0 to 9, in one-to-one correspondence with train_images.

Now let's also explore the test data:
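The kind of inspection described here can be sketched with NumPy alone. The arrays below are zero-filled stand-ins with the same shapes as the training data (an assumption made so the snippet runs without downloading anything); with the real data, the same attributes apply.

```python
import numpy as np

# Zero-filled stand-ins shaped like the training arrays (not the real data).
train_images = np.zeros((60000, 28, 28), dtype=np.uint8)
train_labels = np.zeros((60000,), dtype=np.uint8)

print(train_images.ndim)    # number of axes: 3
print(train_images.shape)   # (60000, 28, 28)
print(train_images.dtype)   # uint8
print(len(train_labels))    # 60000, one label per image
```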

So we have a total of 10000 images in the test set.

Now it's time to build the network. Let's do it.

Layers: Layers are the core building blocks of neural networks. A layer is a data processing module that you can think of as a filter for data: some data goes in, and it comes out in a more useful form. Layers extract representations out of the data fed into them. A neural network is basically a chain of simple layers that implements a form of progressive data distillation.
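To make the idea of chaining layers concrete, here is a rough NumPy sketch of two fully connected layers applied one after the other. It is not the Keras implementation: the helper names (dense, relu, identity), the layer sizes and the random weights are all assumptions chosen for illustration.

```python
import numpy as np


def relu(x):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(x, 0.0)


def identity(x):
    """Return the input unchanged (no activation)."""
    return x


def dense(inputs, weights, bias, activation):
    """One fully connected layer: activation(inputs @ weights + bias)."""
    return activation(inputs @ weights + bias)


rng = np.random.default_rng(0)
batch = rng.random((4, 784))              # 4 flattened stand-in images

# Hypothetical layer sizes: 784 inputs -> 32 hidden units -> 10 outputs.
w1 = rng.normal(size=(784, 32)) * 0.01
b1 = np.zeros(32)
w2 = rng.normal(size=(32, 10)) * 0.01
b2 = np.zeros(10)

hidden = dense(batch, w1, b1, relu)       # first transformation of the data
scores = dense(hidden, w2, b2, identity)  # second transformation
print(hidden.shape, scores.shape)         # (4, 32) (4, 10)
```

Each layer takes the output of the previous one, which is the "chain" the paragraph above describes.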

Our model consists of a sequence of two dense layers. Dense means that every neuron in the first layer is connected to every neuron in the second layer; such a network is also called fully connected. The second (and last) layer is a 10-way softmax classification layer: it returns an array of 10 probability scores summing up to 1. Each score is the probability that the image belongs to one of our 10 digit classes (0 to 9).
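The model definition is not reproduced in this text. A minimal sketch of such a two-layer model, assuming TensorFlow's bundled Keras, might look as follows. The function name build_model, the hidden layer size of 512 and the relu activation are assumptions; only the two dense layers and the 10-way softmax output come from the description above.

```python
def build_model(hidden_units=512):
    """Two Dense layers: a hidden layer followed by a 10-way softmax."""
    # Assumption: TensorFlow's bundled Keras; imported lazily so the snippet
    # can be loaded without TensorFlow installed.
    from tensorflow import keras
    from tensorflow.keras import layers

    return keras.Sequential([
        # hidden_units=512 and the relu activation are illustrative choices.
        layers.Dense(hidden_units, activation="relu"),
        # One output per digit class; softmax makes the 10 scores sum to 1.
        layers.Dense(10, activation="softmax"),
    ])


# Typical usage (needs TensorFlow):
#   model = build_model()
```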

Now we have built our model: input goes in, and the model tries to find patterns or representations in the input data that map it to one of our 10 classes. To make the model more accurate, we need backpropagation, which adjusts the weights during training. Setting this up is the job of the compilation step.
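The compilation code is not shown in this text. As a sketch, assuming a Keras model object, compilation usually means choosing an optimizer (how the weights are adjusted), a loss function (what is minimized) and metrics to monitor. The helper name compile_model and the rmsprop optimizer are assumptions; sparse_categorical_crossentropy is used here because the labels are plain integers from 0 to 9.

```python
def compile_model(model):
    """Attach an optimizer, a loss and a metric to a Keras model."""
    model.compile(
        # Assumed choice of optimizer; "adam" or "sgd" are common alternatives.
        optimizer="rmsprop",
        # Suitable when the labels are integers (0..9) rather than one-hot.
        loss="sparse_categorical_crossentropy",
        # Report the fraction of correctly classified images.
        metrics=["accuracy"],
    )
    return model
```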

By now we have our input data and our model ready. But before training the model, we need to transform the data into the form the model expects. We do it in two steps. First, we reshape our data into (60000, 28*28). Previously, each of the 60000 images was represented as a 28x28 matrix with 28 rows and 28 columns. Now we flatten each 28x28 matrix into a single row (a 1x784 vector) holding all 28*28 = 784 values, as shown in the image below.

Now our input data has a shape that can be fed to the model's first layer, but before moving on we also need to do scaling. Scaling is a very important step: it avoids issues caused by the distribution of the data. The pixel values are originally in the range 0 to 255; after scaling, all the values will be in the range 0 to 1. We will do the reshaping and scaling with the code below.
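The two preprocessing steps can be sketched in NumPy as follows. To keep the snippet self-contained, it uses a small random stand-in array of 6 images instead of the 60000 training images (an assumption); the same two operations apply to train_images and test_images.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for train_images: 6 random "images" with pixel values 0..255.
images = rng.integers(0, 256, size=(6, 28, 28), dtype=np.uint8)

# Step 1: flatten each 28x28 image into a vector of 28 * 28 = 784 values.
flat = images.reshape((len(images), 28 * 28))

# Step 2: convert to float32 and divide by 255 so all values lie in [0, 1].
scaled = flat.astype("float32") / 255

print(flat.shape)                                # (6, 784)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```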

Training the model :

Now we are ready with everything and can train our model. We do that by calling the fit() method on the model we have created.
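The training call is not reproduced in this text. A sketch of it, assuming a compiled Keras model and preprocessed training data, is shown below. The helper name train and the values epochs=5 and batch_size=128 are assumptions, not the article's settings.

```python
def train(model, train_images, train_labels, epochs=5, batch_size=128):
    """Fit the model on the training data and return what fit() returns."""
    # epochs and batch_size are illustrative defaults (assumptions).
    # train_images is expected to be already reshaped to (n, 784) and scaled.
    return model.fit(
        train_images,
        train_labels,
        epochs=epochs,          # number of passes over the training data
        batch_size=batch_size,  # number of samples per weight update
    )
```

During training, Keras prints the loss and the monitored metrics for each epoch.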

Hurray! We got 96% accuracy, which means our model classifies 96% of the training images correctly.

Time to make Predictions :

Now it's time to test our model on new images that were not part of the training data. We will try to predict the digits for the test images.

In the above code we are predicting for the first 10 images of the test data, and displaying the prediction for the first test image. Now we will check the actual value of the first image using the test labels we have.
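The prediction code is not reproduced in this text. As a sketch, assuming a trained Keras model whose predict() method returns one row of 10 class scores per image, the predicted digit is the index of the highest score in each row. The helper name predict_digits is hypothetical.

```python
import numpy as np


def predict_digits(model, images):
    """Return (probabilities, predicted digit for each image)."""
    # predict() is assumed to return one row of 10 class scores per image.
    probabilities = np.asarray(model.predict(images))
    # The predicted digit is the index of the highest score in each row.
    return probabilities, probabilities.argmax(axis=1)


# Typical usage with a trained model and preprocessed test images:
#   probabilities, digits = predict_digits(model, test_images[:10])
#   print(probabilities[0])   # 10 scores for the first test image
#   print(digits[0])          # predicted digit for the first test image
#   print(test_labels[0])     # actual digit, for comparison
```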

Now let's evaluate our model on the entire test data:
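The evaluation code is not shown in this text. A sketch, assuming a Keras model compiled with a single accuracy metric so that evaluate() returns the loss followed by the accuracy, could be written like this. The helper name evaluate is hypothetical.

```python
def evaluate(model, test_images, test_labels):
    """Return (loss, accuracy) measured on the test set."""
    # test_images must be reshaped and scaled exactly like the training images.
    # Assumption: the model was compiled with metrics=["accuracy"], so
    # evaluate() returns two values: the loss and the accuracy.
    test_loss, test_acc = model.evaluate(test_images, test_labels)
    return test_loss, test_acc


# Typical usage:
#   test_loss, test_acc = evaluate(model, test_images, test_labels)
#   print("test accuracy:", test_acc)
```

Comparing this test accuracy with the training accuracy shows how well the model handles images it has never seen.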

That is it for this article. In the next article, we are going to look at the very important, must-know basics of deep learning, the "Mathematical building blocks of neural networks", where we will understand mathematical concepts such as tensors, tensor operations, differentiation, and gradient descent.

For any queries, you can reach out to me on LinkedIn:

https://www.linkedin.com/in/hariprasad-alluru-9bb6a9183/