Handwritten digits recognition using google tensorflow with python

May 3, 2017 Saloni Samant

Technology has advanced remarkably over the last 10 years. Companies and research groups around the world are using the latest techniques to improve existing products, and they are investing heavily in research on new ones.

Examples include Amazon's Just Walk Out technology, Tesla's Autopilot, and modern spacecraft. Many of these breakthrough products depend heavily on machine learning and deep learning algorithms.

Although the computations involved are complex, frameworks built for machine learning and deep learning handle most of the heavy lifting. This article will help you get started with one of the most popular of them: TensorFlow.

In this article, you will get a taste of deep learning through an interesting application: handwritten digit recognition.

Before we begin, we would like to thank Google for open-sourcing the TensorFlow library.

What is Tensorflow?

TensorFlow is an open source library created by the Google Brain team for heavy numerical computation, geared towards machine learning and deep learning tasks. Its core is written in C++, which keeps its computations fast, and it is available through APIs for Python, C++, Haskell, Java, and Go.

It creates a data flow graph for each model, where a graph is built from two kinds of units: tensors and nodes.

Tensor: A tensor is a multidimensional array.
Node: A node is a mathematical operation that takes tensors as input and produces tensors as output.

A data flow graph maps how information moves between these two components. Once the graph is complete, the model is executed and the output is computed.
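The build-first, run-later idea can be sketched in a few lines of plain Python. The classes below (Constant, Add, Multiply) are hypothetical names invented for this illustration and are not part of the TensorFlow API; they only show that constructing the graph computes nothing until it is executed.

```python
# Hypothetical mini "graph" classes for illustration only; not TensorFlow's API.
class Constant:
    def __init__(self, value):
        self.value = value

    def run(self):
        return self.value


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def run(self):
        # Evaluate both inputs first, then apply this node's operation.
        return self.left.run() + self.right.run()


class Multiply:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def run(self):
        return self.left.run() * self.right.run()


# Building the graph: nothing is computed yet.
a = Constant(2.0)
b = Constant(3.0)
graph = Add(Multiply(a, b), Constant(1.0))

# Executing the graph: values flow through the nodes to the output.
result = graph.run()
print(result)  # 7.0
```

TensorFlow 1.x follows the same two-phase pattern, with the session playing the role of the run step.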

You can learn a lot more from the official TensorFlow documentation.

Now let’s start building the handwritten digit recognition application. To start, we need a dataset of handwritten digits for training and testing the model. MNIST is the most popular dataset of handwritten digits stored as images.

About the MNIST dataset

To begin our journey with TensorFlow, we will use the MNIST database to build an image classification model based on a simple feedforward neural network with no hidden layers.

MNIST is a computer vision database of handwritten digits, with labels identifying each digit. Every MNIST data point has two parts: an image of a handwritten digit and a corresponding label.

We’ll call the images “x” and the labels “y”. Both the training set and test set contain images and their corresponding labels; for example, the training images are mnist.train.images and the training labels are mnist.train.labels.

Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers. We can flatten this array into a vector of 28×28 = 784 numbers.

It doesn’t matter how we flatten the array, as long as we’re consistent between images. From this perspective, the MNIST images are just a bunch of points in a 784-dimensional vector space.
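As a small illustration of the flattening step, the sketch below uses a made-up 3×3 grid in place of a real 28×28 image and concatenates its rows (row-major order). The helper name flatten_row_major is introduced here for illustration only.

```python
# A tiny 3x3 "image" stands in for a 28x28 MNIST image (illustrative values).
image = [
    [0, 1, 2],
    [3, 4, 5],
    [6, 7, 8],
]


def flatten_row_major(img):
    """Concatenate the rows of a 2-D list into one flat list."""
    return [pixel for row in img for pixel in row]


flat = flatten_row_major(image)
print(flat)       # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(len(flat))  # 9; a 28x28 image gives 28 * 28 = 784 values
```

Any other fixed ordering (for example column by column) would work equally well, provided every image is flattened the same way.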

Implementing the Handwritten digits recognition model

We will build a simple feedforward neural network that uses softmax to predict the digit in each image. We begin by opening a Python environment.

Python is a prerequisite for following the implementation below. If you are new to Python or run into environment issues, get some hands-on experience with Python and its environment setup before you start.

Download the MNIST dataset

# Requires TensorFlow 1.x; this tutorial module is not part of TensorFlow 2.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("model_data/", one_hot=True)
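The one_hot=True argument asks the loader to return each label as a 10-element vector with a single 1 at the position of the digit, rather than as a plain integer. A minimal sketch of that encoding, using a helper named one_hot that is introduced here purely for illustration:

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1.0 at index `digit`."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


print(one_hot(3))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

This format lines up with the model's output, which also has one value per digit class.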

Import Tensorflow to your environment

import tensorflow as tf

Initializing parameters for the model

batch = 100
learning_rate = 0.01
training_epochs = 10

In machine learning, an epoch is one full pass over the training samples. Here, we restrict the model to 10 complete epochs, or cycles of the algorithm running through the dataset.

The batch variable determines how much data is fed to the algorithm at a time, in this case 100 images.

The learning rate controls the size of each update to the parameters, thereby affecting the rate at which the model “learns”.
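To see how the batch size and epoch count translate into work, here is a rough calculation. It assumes the training split holds 55,000 images; that figure is an assumption for this sketch, so check mnist.train.num_examples in your own environment.

```python
num_examples = 55000   # assumed size of the training split (verify locally)
batch = 100
training_epochs = 10

# Number of batches needed to see every training image once.
batch_count = num_examples // batch

# Total number of parameter updates over the whole run.
total_updates = batch_count * training_epochs

print(batch_count)    # 550
print(total_updates)  # 5500
```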

Creating Placeholders

x = tf.placeholder(tf.float32, shape=[None, 784]) 
y_ = tf.placeholder(tf.float32, shape=[None, 10])

The tf.placeholder method creates graph nodes that are fed with data at run time. Here, x is a 2-dimensional array holding the MNIST images: None means the batch dimension can be of any size, and 784 is the length of one flattened 28×28 image. y_ holds the target classes as a 2-dimensional array with 10 columns (one for each digit 0-9) that identify which digit is stored in each image.

Creating Variables

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

Here, W holds the weights and b the biases of the model. They are created with tf.Variable because they are the parts of the computational graph whose values change during training as the optimizer updates them.
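To make the shapes concrete, the sketch below computes x·W + b by hand for one input, with 4 features and 3 classes standing in for 784 and 10. The affine helper and the sample values are made up for illustration. With W and b initialized to zeros, every class receives the same score, so the model starts with no preference.

```python
def affine(x, W, b):
    """Return x.W + b for one input row x, weight rows W, and bias list b."""
    return [
        sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
        for j in range(len(b))
    ]


n_features, n_classes = 4, 3   # stand-ins for 784 and 10
W = [[0.0] * n_classes for _ in range(n_features)]
b = [0.0] * n_classes

x = [0.2, 0.0, 0.9, 0.5]       # one made-up input row
logits = affine(x, W, b)
print(logits)  # [0.0, 0.0, 0.0]
```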

Initializing the model

y = tf.nn.softmax(tf.matmul(x,W) + b)

We will use a simple softmax model to implement our network. The softmax function generalizes the logistic function to more than two classes and is usually used in the final layer of a network. It is useful for multi-class classification, where each input belongs to one of several classes.

It outputs values between 0 and 1 that sum to 1, so each value can be read as the probability that the input belongs to a particular class.
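A minimal softmax in plain Python is sketched below. Subtracting the largest score before exponentiating is a common numerical-stability trick; it is a choice made for this sketch and does not change the result.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)                          # shift for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)       # the largest score gets the largest probability
print(sum(probs))  # approximately 1.0
```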

Defining Cost Function

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

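This is the cost function of the model. A cost function measures the difference between the predicted values and the actual values; we minimize it to improve the accuracy of the model. Here it is the cross-entropy between the one-hot labels y_ and the predicted probabilities y, averaged over the batch.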
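The same quantity can be written in plain Python as a rough sketch. The function name and sample values are illustrative. Terms whose label is 0 are skipped, which matches the formula because 0 × log(p) contributes nothing, and it avoids evaluating log(0) for those terms.

```python
import math


def cross_entropy(y_true, y_pred):
    """Mean over examples of -sum(y_true * log(y_pred))."""
    per_example = [
        -sum(t * math.log(p) for t, p in zip(t_row, p_row) if t > 0)
        for t_row, p_row in zip(y_true, y_pred)
    ]
    return sum(per_example) / len(per_example)


labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]          # one-hot targets
predictions = [[0.1, 0.8, 0.1], [0.5, 0.25, 0.25]]   # softmax outputs

loss = cross_entropy(labels, predictions)
print(loss)  # mean of -log(0.8) and -log(0.5)
```

The loss shrinks towards 0 as the probability assigned to the correct class approaches 1.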

Determining the accuracy of the model

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
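These two lines compare the index of the largest predicted probability with the index of the 1 in the one-hot label, then average the matches. A plain-Python sketch of the same idea, with illustrative helper names and made-up values:

```python
def argmax(values):
    """Index of the largest element in a list."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(y_pred, y_true):
    """Fraction of rows where the predicted class matches the label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(y_pred, y_true)]
    return sum(correct) / float(len(correct))


predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]

print(accuracy(predictions, labels))  # 0.75 (the last row is misclassified)
```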

Implementing Gradient Descent Algorithm

train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

TensorFlow ships with many optimization algorithms, one of them being gradient descent. Gradient descent starts from an initial value and repeatedly updates the parameters in the direction that lowers the cost function, moving towards a minimum of the cost. A lower cost usually, though not always, means higher accuracy.

How close it gets obviously depends on the number of iterations permitted for the model.
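The update rule itself is easy to see on a one-parameter problem. The sketch below minimizes the made-up function f(w) = (w - 3)², whose gradient is 2(w - 3); the function, starting point, and step count are assumptions chosen only for illustration.

```python
def gradient(w):
    """Derivative of f(w) = (w - 3) ** 2."""
    return 2.0 * (w - 3.0)


learning_rate = 0.1
w = 0.0  # initial value

for step in range(100):
    # Move against the gradient, scaled by the learning rate.
    w -= learning_rate * gradient(w)

print(w)  # very close to 3.0, the minimum of f
```

With fewer steps or a smaller learning rate, w stops further from 3, which is why the number of iterations matters.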

Initializing the session

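with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

The remaining steps all run inside this with block, because the session is closed as soon as the block ends.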

Creating batches of data for epochs

    for epoch in range(training_epochs):
        batch_count = int(mnist.train.num_examples / batch)
        for i in range(batch_count):
            batch_x, batch_y = mnist.train.next_batch(batch)

Executing the model

            sess.run([train_op], feed_dict={x: batch_x, y_: batch_y})

This call belongs inside the inner batch loop, so one parameter update runs for every batch.
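For intuition about what a single run of train_op does, here is a plain-Python sketch of one gradient descent update for a softmax model. It uses a toy dataset with 2 features and 2 classes in place of MNIST, and all names (predict, train_step, xs, ys) are invented for this illustration. It relies on the standard result that the gradient of the mean cross-entropy with respect to logit j is (p_j - y_j) / n.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, W, b):
    """Compute softmax(x.W + b) for one input row."""
    logits = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(len(b))]
    return softmax(logits)


def train_step(batch_x, batch_y, W, b, learning_rate):
    """Apply one gradient descent update in place; return the pre-update loss."""
    n = len(batch_x)
    grad_W = [[0.0] * len(b) for _ in W]
    grad_b = [0.0] * len(b)
    loss = 0.0
    for x, y in zip(batch_x, batch_y):
        p = predict(x, W, b)
        loss -= sum(t * math.log(q) for t, q in zip(y, p) if t > 0) / n
        for j in range(len(b)):
            err = (p[j] - y[j]) / n   # gradient w.r.t. logit j
            grad_b[j] += err
            for i in range(len(x)):
                grad_W[i][j] += x[i] * err
    for j in range(len(b)):
        b[j] -= learning_rate * grad_b[j]
        for i in range(len(W)):
            W[i][j] -= learning_rate * grad_W[i][j]
    return loss


# Toy data: 2 features and 2 classes instead of 784 pixels and 10 digits.
xs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
ys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

losses = [train_step(xs, ys, W, b, learning_rate=0.5) for _ in range(50)]
print(losses[0], losses[-1])  # the loss falls as training proceeds
```

TensorFlow derives these gradients automatically from the graph, so the tutorial code never has to write them out.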

Print accuracy of the model

        if epoch % 2 == 0:
            print("Epoch:", epoch)
            print("Accuracy:", accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
    print("Model Execution Complete")

Share your accuracy details in the comments; I would love to see them.

Complete Code:

#!/usr/bin/env python
# handwritten_digits_recognition.py
# Date: 03-May-2017
# About: Handwritten digits recognition with Tensorflow


# Required Python Packages
from tensorflow.examples.tutorials.mnist import input_data

# Download the MNIST dataset
mnist = input_data.read_data_sets("model_data/", one_hot=True)

# import tensorflow to the environment
import tensorflow as tf

# initializing parameters for the model
batch = 100
learning_rate = 0.01
training_epochs = 10

# creating placeholders
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

# creating variables
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# initializing the model
y = tf.nn.softmax(tf.matmul(x,W) + b)

# Defining Cost Function
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

# Determining the accuracy of parameters
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Implementing Gradient Descent Algorithm
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

# Initializing the session; training must stay inside this block,
# because the session is closed when the block ends
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Creating batches of data for epochs
    for epoch in range(training_epochs):
        batch_count = int(mnist.train.num_examples / batch)
        for i in range(batch_count):
            batch_x, batch_y = mnist.train.next_batch(batch)

            # Executing the model
            sess.run([train_op], feed_dict={x: batch_x, y_: batch_y})

        # Print accuracy of the model on the test set every second epoch
        if epoch % 2 == 0:
            print("Epoch:", epoch)
            print("Accuracy:", accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

    print("Model Execution Complete")

You can clone the complete code from our GitHub account.

Final Note

Creating a deep learning model can be easy and intuitive with TensorFlow. But to build really interesting things, you need a good grasp of the machine learning principles used in data science.

6 Responses to “Handwritten digits recognition using google tensorflow with python”

Sriram
6 years ago

File “”, line 5
mnist = input_data.read_data_sets(“model_data/”, one_hot=True)
^
SyntaxError: invalid character in identifier

I’m getting error here…
- Saimadhu Polamuri
  4 years ago
  
  Hi Sriram,
  
  Could you please change the code
  
  input_data.read_data_sets("model_data", one_hot=True)
  
  I hope this will resolve the issue, if not please leave the error comment message.
  
  Thanks and happy learning!
Michael
6 years ago

Hi,
Thank you for your tutorial. When I run the sample code, I got the error “RuntimeError: Attempted to use a closed Session.” Do you know what is wrong on it? Thank you.

—————————————————————————
RuntimeError Traceback (most recent call last)
in ()
—-> 1 sess.run([train_op], feed_dict={x: batch_x, y_: batch_y})

~/Documents/github/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
875 try:
876 result = self._run(None, fetches, feed_dict, options_ptr,
–> 877 run_metadata_ptr)
878 if run_metadata:
879 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/Documents/github/tensorflow/lib/python3.5/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1021 # Check session.
1022 if self._closed:
-> 1023 raise RuntimeError(‘Attempted to use a closed Session.’)
1024 if self.graph.version == 0:
1025 raise RuntimeError(‘The Session graph is empty. Add operations to the ‘

RuntimeError: Attempted to use a closed Session.
- Saimadhu Polamuri
  4 years ago
  
  Hi Michael,
  
  Could you please try the TensorFlow 2 way of creating the session, which is much easier than this.
  
  Thanks and happy learning!
Alageshan
6 years ago

session is getting closed as sess.run() is out of tf.session()
- Saimadhu Polamuri
  4 years ago
  
  Hi Alageshan,
  
  In the updated TensorFlow version, session creation is much simpler. Please have a look.
  
  Thanks and happy learning.