TensorFlow (Best Tutorial 2019)


This blog provides an introduction to the most popular programming framework in the deep learning community: TensorFlow. Developed and backed by Google, TensorFlow has been adopted and advanced by a huge open source community. It is therefore essential for deep learning practitioners to master at least the basics. In fact, much of the code that you can find on the Internet is written in TensorFlow.

 

We’ll cover the ingredients of the TensorFlow core library as well as some high-level APIs that are available in the Python ecosystem. Our discussion in this blog should help you understand the basic structures of the framework, allowing you to build your own DL models using TensorFlow.

 

Although we recommend using Keras if you are new to DL, learning the essentials of TensorFlow is quite useful, as Keras is also built on top of TensorFlow.

 

TensorFlow is available both for Python 2 and Python 3. Since we’re using Python 3 in this blog, we briefly cover how to install TensorFlow on your local computer. However, if you’re using the Docker file provided, TensorFlow is already installed for you.

 

Before installing TensorFlow, it is important to make note of the computation units on your machine that can be used by TensorFlow. You have two options to run your TensorFlow code: you can use the CPU or the GPU. Since GPUs are designed to run linear matrix operations faster than the CPUs, data scientists prefer to use GPUs when available.

 

However, the TensorFlow code you write will be the same (except for the statement of your preference regarding the computation units you would like to use).

 

Let’s start with the installation of TensorFlow. In doing so, we make use of pip, the Python package manager. If Python 3 is the only installed version of Python on your machine, then the following command installs TensorFlow for Python 3:


pip install --upgrade tensorflow

 

However, if both Python 2 and Python 3 are installed on your computer, then the command above might install TensorFlow for Python 2. In that case, you can use the following command to install TensorFlow for Python 3:


pip3 install --upgrade tensorflow

 

The TensorFlow framework is now installed for you to explore. In your code, import TensorFlow to use it:


import tensorflow

 

If you wish, you can rename it to “tf”. We will do this throughout the blog because it is the convention in the community:


import tensorflow as tf
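Before importing, it can help to confirm that the package is visible to the interpreter you are actually running, especially when Python 2 and Python 3 live side by side. The following is a minimal sketch using only the standard library; the helper name `is_installed` is our own and is not part of TensorFlow.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running and whether it can see TensorFlow.
print("Python", sys.version_info.major, sys.version_info.minor)
print("tensorflow found:", is_installed("tensorflow"))
```

If the second line prints False, the pip command you used probably installed the package for a different Python version.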

 

First Look at TensorFlow

TensorFlow is an open-source software library for machine intelligence, developed by the Google Brain team (whose first-generation system, DistBelief, dates from 2011). The initial target of TensorFlow was to conduct research in machine learning and in deep neural networks. However, the system is general enough to be applicable to a wide variety of other domains as well.

 

The name is derived from the data model, which is represented by tensors, and from the data flow graph that stands for TensorFlow's execution model. In 2015, Google open-sourced TensorFlow and all of its reference implementation and made all the source code available on GitHub under the Apache 2.0 license.

 

Since then, TensorFlow has achieved wide adoption, from academia and research to industry, and the stable version 1.0 was subsequently released with a unified API.

 

Keeping in mind your needs and based on the latest features of TensorFlow 1.x, this blog describes TensorFlow's main capabilities. The following topics are discussed:

 

General overview

TensorFlow is an open source software library for numerical computation using data flow graphs that enables machine learning practitioners to do more data-intensive computing. It provides some robust implementations of widely used deep learning algorithms. Nodes in the flow graph represent mathematical operations.

 

The edges, on the other hand, represent the multidimensional tensors that carry data between the nodes. TensorFlow offers you a very flexible architecture that enables you to deploy computation to one or more CPUs or GPUs in a desktop, server or mobile device with a single API.

 

What's new with TensorFlow 1.x?

The APIs in TensorFlow 1.0 have changed in ways that are not all backward-compatible. That is, TensorFlow programs that worked on TensorFlow 0.x won't necessarily work on TensorFlow 1.x. These API changes have been made to ensure an internally consistent API. Going forward, Google does not plan to make backward-breaking changes throughout the 1.x lifecycle.

 

In the latest TensorFlow 1.x version, the Python APIs resemble NumPy more closely. This has made the current version more stable for array-based computation. Two experimental APIs for Java and Go have been introduced too. This is very good news for Java and Go programmers.

 

A new tool called TensorFlow Debugger has been introduced. This is a command-line interface and API for debugging live TensorFlow programs.

New Android demos (https://github.com/tensorflow/tensorflow/tree/r1.0/tensorflow/examples/android) for object detection and localization and for camera-based image stylization have been made available.

 

TensorFlow can now also be installed through Anaconda or through a Docker image of TensorFlow. Finally, and most importantly, a new domain-specific compiler for TensorFlow graphs targeting CPU and GPU computing has been introduced. This is called Accelerated Linear Algebra (XLA).

 

How does it change the way people use it?

The main features offered by the latest release of TensorFlow are as follows:

Faster computing: The major version upgrade to TensorFlow 1.0 has made it considerably faster, including a 7.3x speedup on 8 GPUs for Inception v3 and a 58x speedup for distributed Inception v3 training on 64 GPUs.

 

Flexibility: TensorFlow is not just a deep learning or machine learning software library but also a great library full of powerful mathematical functions with which you can solve a wide range of problems.

 

The execution model that uses the data flow graph allows you to build very complex models from simple sub-models. TensorFlow 1.0 introduces high-level APIs, with the tf.layers, tf.metrics, tf.losses and tf.keras modules. These have made TensorFlow very suitable for high-level neural network computing.

 

Portability: TensorFlow runs on Windows, Linux, and Mac machines and on mobile computing platforms (that is, Android).

Easy debugging: TensorFlow provides the TensorBoard tool for the analysis of the developed models.

 

Unified API: TensorFlow offers you a very flexible architecture that enables you to deploy computation to one or more CPUs or GPUs in a desktop, server or mobile device with a single API.

 

Transparent use of GPU computing: TensorFlow automates the management and optimization of the memory and the data used. You can now use your machine for large-scale and data-intensive GPU computing with the NVIDIA cuDNN and CUDA toolkits.

 

Easy use: TensorFlow is for everyone; it's for students, researchers, and deep learning practitioners, and also for readers of this blog.

Production ready at scale: TensorFlow has been used for neural machine translation at production scale. TensorFlow 1.0 promises Python API stability, making it easier to adopt new features without worrying too much about breaking your existing code.

 

Extensibility: TensorFlow is relatively new technology and it's still under active development. However, it is extensible because it was released with the source code available on GitHub. And if you don't see the low-level data operator you need, you can write it in C++ and add it to the framework.

 

Supported: There is a large community of developers and users working together to improve TensorFlow both by providing feedback and by actively contributing to the source code.

 

Wide adoption: Numerous tech giants are using TensorFlow to increase their business intelligence, for example, ARM, Google, Intel, eBay, Qualcomm, SAM, Dropbox, DeepMind, Airbnb, Twitter and so on.

 

Installing and getting started with TensorFlow

You can install and use TensorFlow on a number of platforms such as Linux, Mac OSX, and Windows. You can also build and install TensorFlow from the latest GitHub source. On a Windows machine, you can either install TensorFlow natively with pip or run it inside a Linux virtual machine.

 

The TensorFlow Python API supports Python 2.7 and Python 3.3+, so you need to install Python to start the TensorFlow installation. For GPU support, you must also install the CUDA Toolkit (7.5 or later) and cuDNN v5.1+. In this section, we will show how to install and get started with TensorFlow. Installation on Linux is covered in detail, and a short overview of Windows is provided as well.

 

Note that, for this and the rest of the blog, we provide all the source code in a Python 2.7 compatible form. However, you will find all of it in a Python 3.3+ compatible form on the Packt repository.

 

Installing on Mac OS is more or less similar to Linux. Please refer to https://www.tensorflow.org/install/install_mac for more details.

 

Installing TensorFlow on Linux

In this section, we will show how to install TensorFlow on Ubuntu 14.04 or higher. The instructions presented here also might be applicable to other Linux distros.

 

Which TensorFlow to install on your platform?

However, before proceeding with the formal steps, we need to determine which TensorFlow to install on your platform. TensorFlow has been developed such that you can run the data-intensive tensor application on GPU as well as CPU. Thus, you should choose one of the following types of TensorFlow to install on your platform:

 

TensorFlow with CPU support only: If there is no GPU (such as an NVIDIA card) installed on your machine, you must install and start computing using this version. This is very easy and you can do it in just 5 to 10 minutes.

 

TensorFlow with GPU support: As you might know, a deep learning application typically requires very intensive computing resources. TensorFlow is no exception, and it can typically run data computation and analytics significantly faster on a GPU than on a CPU. Therefore, if there's NVIDIA GPU hardware on your machine, you should ultimately install and use this version.

 

From our experience, even if you have NVIDIA GPU hardware integrated on your machine, it would be worth installing and trying the CPU only version first and if you don't experience good performance you should switch to GPU support then.

 

Requirements for running TensorFlow with GPU from NVIDIA

The GPU-enabled version of TensorFlow has several requirements, such as 64-bit Linux, Python 2.7 (or 3.3+ for Python 3), NVIDIA CUDA 7.5 (CUDA 8.0 required for Pascal GPUs) and NVIDIA cuDNN v4.0 (minimum) or v5.1 (recommended).

 

More specifically, the current development of TensorFlow supports only GPU computing using NVIDIA toolkits and software. Now the following software must be installed on your machine.

 

Step 1: Install NVIDIA CUDA

To use TensorFlow with NVIDIA GPUs, CUDA Toolkit 8.0 and the associated NVIDIA drivers need to be installed.

For more details, refer to NVIDIA's CUDA Toolkit download page. Download and install the required package from https://developer.nvidia.com/cuda-downloads

 

Also, ensure that you have added the CUDA installation path to the LD_LIBRARY_PATH environment variable.

 

Step 2: Installing NVIDIA cuDNN v5.1+

Once the CUDA Toolkit is installed, download the cuDNN v5.1 Library. Once downloaded, uncompress the files and copy them into the CUDA Toolkit directory (assumed here to be in /usr/local/cuda/):


$ sudo tar -xvf cudnn-8.0-linux-x64-v5.1.tgz -C /usr/local

 

Note that, to download the cuDNN v5.1 library, you need to register for the Accelerated Computing Developer Program at https://developer.nvidia.com/accelerated-computing-developer.

Once you have installed the cuDNN v5.1 library, ensure that you create the CUDA_HOME environment variable.
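The two environment variables mentioned in Steps 1 and 2 can be checked from Python before you install TensorFlow. This is only a sketch: the function name `cuda_env_report`, the default root /usr/local/cuda, and the `lib64` subdirectory are assumptions for illustration, so adjust them to match your own installation.

```python
import os
import posixpath


def cuda_env_report(env, cuda_root="/usr/local/cuda"):
    """Report whether CUDA_HOME is set and the CUDA lib dir is on LD_LIBRARY_PATH.

    `env` is any mapping of environment variables (for example os.environ).
    `cuda_root` and the "lib64" subdirectory are assumed locations.
    """
    lib_dir = posixpath.normpath(posixpath.join(cuda_root, "lib64"))
    entries = [posixpath.normpath(p)
               for p in env.get("LD_LIBRARY_PATH", "").split(":") if p]
    return {
        "CUDA_HOME set": bool(env.get("CUDA_HOME")),
        "lib dir on LD_LIBRARY_PATH": lib_dir in entries,
    }


# Inspect the environment of the current process.
for name, ok in cuda_env_report(os.environ).items():
    print(name, "->", ok)
```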

 

Step 3: GPU card with CUDA compute capability 3.0+

Make sure that your machine comes with the GPU card with CUDA compute capability 3.0+, to use the preceding library and tools in Steps 1 and 2.

 

Step 4: Installing the libcupti-dev library

Lastly, you need to have the libcupti-dev library installed on your machine. This is the NVIDIA CUDA Profiling Tools Interface, which provides advanced profiling support. To install this library, issue the following command:


$ sudo apt-get install libcupti-dev

 

Step 5: Installing Python (or Python3)

For those who are new to Python or TensorFlow, we recommend you install TensorFlow using pip. Python 2+ and 3.3+ are automatically installed on Ubuntu. Check that Python is installed using the following commands:


$ python -V

Expected output: Python 2.7.6

$ which python

Expected output: /usr/bin/python

For Python 3.3+ use the following:

$ python3 -V

Expected output: Python 3.4.3

If you want a very specific version:

$ sudo apt-cache show python3

$ sudo apt-get install python3=3.5.1*

 

Step 6: Installing and upgrading PIP (or PIP3)

The pip or pip3 package manager usually comes with Ubuntu. Check that pip or pip3 is installed using the following command:


$ pip -V

Expected output:

pip 9.0.1 from /usr/local/lib/python2.7/dist-packages/pip-9.0.1-py2.7.egg (python 2.7)

For Python 3.3+ use the following:

$ pip3 -V

The expected output is as follows:

pip 1.5.4 from /usr/lib/python3/dist-packages (python 3.4)

 

Note that pip version 8.1+ or pip3 version 1.5+ is strongly recommended. If these versions are not installed, either install pip or upgrade to the latest version:


$ sudo apt-get install python-pip python-dev

For Python 3.3+, use the following command:

$ sudo apt-get install python3-pip python3-dev

 

Step 7: Installing TensorFlow

Refer to the following section for step-by-step guidelines on how to install the latest version of TensorFlow, either with CPU-only support or with GPU support using NVIDIA cuDNN and CUDA.

 

How to install TensorFlow

You can install TensorFlow on your machine in a number of ways, such as using virtualenv, pip, Docker, and Anaconda. However, using Docker and Anaconda is a bit advanced, which is why we have decided to use pip and virtualenv instead.

 

Interested readers can try using Docker and Anaconda by following the instructions at https://www.tensorflow.org/install/.

 

Installing TensorFlow with native pip

If Steps 1 to 6 have been completed, install TensorFlow by invoking one of the following commands. For Python 2.7 with CPU-only support:


$ pip install tensorflow

For Python 3.x with CPU-only support:

$ pip3 install tensorflow

For Python 2.7 with GPU support:

$ pip install tensorflow-gpu

For Python 3.x with GPU support:

$ pip3 install tensorflow-gpu

If the preceding command fails, install the latest version of TensorFlow by issuing this command manually:

$ sudo pip install --upgrade TF_PYTHON_URL

For Python 3.x, use the following command:

$ sudo pip3 install --upgrade TF_PYTHON_URL

In both cases, TF_PYTHON_URL is the URL of the TensorFlow Python package, listed at https://www.tensorflow.org/install/install_linux#the_url_of_the_tensorflow_python_package

For example, to install the latest version with CPU only support (currently v1.0.1), use the following command:

$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp34-cp34m-linux_

 

Installing with virtualenv

We assume that Python 2+ (or 3+) and pip (or pip3) are already installed on your system. If so, the following are the steps to install TensorFlow:


1. Create a virtualenv environment as follows:

$ virtualenv --system-site-packages targetDirectory

Here targetDirectory is the root of the virtualenv tree. Our instructions assume that it is ~/tensorflow (however, you may choose any directory).

2. Activate the virtualenv environment as follows:

$ source ~/tensorflow/bin/activate # bash, sh, ksh, or zsh

$ source ~/tensorflow/bin/activate.csh # csh or tcsh

If the command in Step 2 succeeds, then you should see the following on your terminal:

(tensorflow)$

3. Install TensorFlow:

Use one of the following commands to install TensorFlow in the active virtualenv environment. For Python 2.7 with CPU-only support, use the following command:

(tensorflow)$ pip install --upgrade tensorflow

For Python 3.x with CPU-only support, use the following command:

(tensorflow)$ pip3 install --upgrade tensorflow

For Python 2.7 with GPU support, use the following command:

(tensorflow)$ pip install --upgrade tensorflow-gpu

For Python 3.x with GPU support, use the following command:

(tensorflow)$ pip3 install --upgrade tensorflow-gpu

 

If the command succeeds, skip Step 4. If it fails, perform Step 4.

 

4. (Optional) If Step 3 failed, try to install TensorFlow in the active virtualenv environment by issuing a command in the following format. For Python 2.7 (select the appropriate URL for CPU or GPU support):


(tensorflow)$ pip install --upgrade TF_PYTHON_URL

For Python 3.x (select the appropriate URL for CPU or GPU support):

(tensorflow)$ pip3 install --upgrade TF_PYTHON_URL

In both cases, TF_PYTHON_URL is the URL of the TensorFlow Python package, listed at https://www.tensorflow.org/install/install_linux#the_url_of_the_tensorflow_python_package.

For example, to install the latest version with CPU only support (currently v1.0.1), use the following command:

(tensorflow)$ pip3 install --upgrade

To validate the installation in Step 3, you must activate the virtual environment. If the virtualenv environment is not currently active, issue one of the following commands:


$ source ~/tensorflow/bin/activate # bash, sh, ksh, or zsh

$ source ~/tensorflow/bin/activate.csh # csh or tcsh

To uninstall TensorFlow, simply remove the tree you created. For example:

$ rm -r targetDirectory

 

Installing TensorFlow on Windows

If you can't get a Linux-based system, you can install Ubuntu on a virtual machine; just use a free application called VirtualBox, which lets you create a virtual PC on Windows and install Ubuntu in it. Alternatively, you can install TensorFlow natively on Windows, where it only supports version 3.5.x of Python.

 

Note that Python 3.5.x comes with the pip3 package manager, which is the program you'll use to install TensorFlow. To install TensorFlow, start a command prompt and then issue the appropriate pip3 install command in that terminal. To install the CPU-only version of TensorFlow, enter the following command:


C:> pip3 install --upgrade tensorflow

To install the GPU version of TensorFlow, enter the following command:

C:> pip3 install --upgrade tensorflow-gpu

Installation from source

The pip installation can cause problems when using TensorBoard. For this reason, I suggest you build TensorFlow directly from the source.

 

The steps are as follows:

1. Clone the entire TensorFlow repository:


$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow

 

2. Install Bazel, a tool that automates software builds and tests. To build TensorFlow from source, the Bazel build system must be installed on your machine. If it is not, issue the following commands:


$ sudo apt-get install software-properties-common swig

$ sudo add-apt-repository ppa:webupd8team/java

$ sudo apt-get update

$ sudo apt-get install oracle-java8-installer

$ echo "deb http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/

$ curl https://storage.googleapis.com/bazel-apt/doc/apt-key.pub.gpg | sudo apt-key add -

$ sudo apt-get update

$ sudo apt-get install bazel

 

Alternatively, follow the instructions and guidelines on how to install Bazel on your platform at http://bazel.io/docs/install.html:

1. Make the Bazel installer executable:

$ chmod +x bazel-version-installer-os.sh

2. Then run the following command:

$ ./bazel-version-installer-os.sh --user

3. Install Python dependencies:

$ sudo apt-get install python-numpy swig python-dev

4. Configure the installation (GPU or CPU) by the following command:

$ ./configure

5. Create your TensorFlow package using bazel:

$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

6. To build with GPU support use the following command:

$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

7. Finally, install TensorFlow.

Use the command that matches your Python version:


For Python 2.7:

$ sudo pip install --upgrade /tmp/tensorflow_pkg/tensorflow-1.0.1-*.whl

For Python 3.4:

$ sudo pip3 install --upgrade /tmp/tensorflow_pkg/tensorflow-1.0.1-*.whl

The name of the .whl file will depend on your platform (aka OS).

 


 

Test your TensorFlow installation

Open a Python terminal and enter the following lines of code:


>>> import tensorflow as tf

>>> hello = tf.constant("Hello TensorFlow!")

>>> sess = tf.Session()

To verify your installation just type:

>>> print(sess.run(hello))

If the installation is okay, you'll see the following output (Python 3 shows it as a bytes literal, b'Hello TensorFlow!'):

Hello TensorFlow!

Computational graphs

When performing an operation, for example training a neural network or computing the sum of two integers, TensorFlow internally represents its computation using a data flow graph (or computational graph).

 

This is a directed graph consisting of the following:

  • A set of nodes, each one representing an operation
  • A set of directed arcs, each one representing the data on which the operations are performed

 

TensorFlow has two types of edge:

Normal: These edges are only carriers of data structures between the nodes. The output of one operation (from one node) becomes the input for another operation. The edge connecting two nodes carries the values.

 

Special: This edge doesn't carry values. It represents a control dependency between two nodes A and B. It means that node B will be executed only after the operation in A has ended, whatever the data relationship between the two operations.

 

The TensorFlow implementation defines control dependencies to enforce orderings between otherwise independent operations as a way of controlling peak memory usage.
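To see how a control dependency changes the order of execution without carrying any data, here is a small pure-Python sketch. It is not TensorFlow's scheduler: it simply applies a standard topological sort (Kahn's algorithm) to a graph whose edges are either data edges or control edges, and the function name `execution_order` is our own.

```python
from collections import deque


def execution_order(nodes, data_edges, control_edges):
    """Return one valid execution order for the graph (Kahn's algorithm).

    Both edge lists hold (producer, consumer) pairs. Data edges carry values;
    control edges carry nothing and only say "run the producer first".
    """
    successors = {n: [] for n in nodes}
    pending = {n: 0 for n in nodes}  # number of unfinished predecessors
    for src, dst in list(data_edges) + list(control_edges):
        successors[src].append(dst)
        pending[dst] += 1

    ready = deque(n for n in nodes if pending[n] == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in successors[node]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order


nodes = ["B", "A"]  # B is listed first on purpose
print(execution_order(nodes, data_edges=[], control_edges=[]))            # ['B', 'A']
print(execution_order(nodes, data_edges=[], control_edges=[("A", "B")]))  # ['A', 'B']
```

Without the control edge the two independent operations may run in any order; with it, A is guaranteed to finish before B starts.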

 

A computational graph is basically like a flow chart. Consider the simple computation z=d×c=(a+b)×c: its graph has an addition node that takes a and b and produces d, and a multiplication node that takes d and c and produces z.
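The graph for z = (a + b) × c can be mimicked in a few lines of plain Python. This is a toy sketch, not TensorFlow's implementation: the `Node` class and the `constant` and `evaluate` helpers are hypothetical names chosen for illustration, and the input values are made up.

```python
import operator


class Node:
    """A graph node: either a constant value or an operation on other nodes."""

    def __init__(self, op=None, inputs=(), value=None, name=""):
        self.op = op          # callable applied to the input values, or None
        self.inputs = inputs  # incoming edges: the nodes that feed this one
        self.value = value    # only set for constants
        self.name = name


def constant(value, name):
    return Node(value=value, name=name)


def evaluate(node):
    """Walk backwards from `node` to the constants, then apply each operation."""
    if node.op is None:
        return node.value
    args = [evaluate(parent) for parent in node.inputs]
    return node.op(*args)


a = constant(2.0, "a")
b = constant(3.0, "b")
c = constant(4.0, "c")
d = Node(op=operator.add, inputs=(a, b), name="d")  # d = a + b
z = Node(op=operator.mul, inputs=(d, c), name="z")  # z = d * c

print(z.value)      # None: building the graph computes nothing
print(evaluate(z))  # 20.0
```

Building the graph and evaluating it are separate steps, which is the same split TensorFlow makes between defining a graph and running it.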

 

TensorFlow architecture

TensorFlow is designed as a distributed system by nature, so it is quite easy to run TensorFlow models in distributed settings. The TensorFlow Distributed Execution Engine is responsible for handling this capability of TensorFlow.

 

As we mentioned before, TensorFlow models can be run on top of CPUs and GPUs. However, other computation units are also available to use. Recently, Google announced Tensor Processing Units (TPUs) that are designed to swiftly run TensorFlow models. You can even run TensorFlow directly on Android devices.

 

Although Python is the most commonly used language with TensorFlow, you can use TensorFlow with C++, Java, Julia, Go, R, and more. TensorFlow includes two relatively high-level abstraction modules called layers and datasets.

 

The Layers module provides methods that simplify the creation of fully connected layers, convolutional layers, pooling layers, and more. It also provides methods like adding activation functions or applying dropout regularization. The Datasets module includes capabilities to manage your datasets.

 

Higher-level APIs (like Keras or Estimators) are easier to use, and they provide the same functionality of these lower-level modules. Lastly, we should mention that TensorFlow includes some pre-trained models out of the box.

 

Core components

To understand the core architecture of the TensorFlow framework, we introduce some basic concepts. First, let’s begin with a fundamental design principle of TensorFlow: TensorFlow is designed to work with “static graphs”. The computational flow of your model will be converted to a graphical representation in the framework before execution.

 

The static graph in TensorFlow is the computational graph and not the data. This means that before you run the code, you must define the computational flow of your data. After that, all of the data that is fed to the system will flow through this computational graph, even if the data changes from time to time.

 

Let’s start with the basic concepts of the framework. The first concept you have to understand is the “tensor” which is also included in the name of the framework.

Tensors are the units that hold the data. You can think of tensors as NumPy n-dimensional arrays. The rank of a tensor is its number of dimensions, and the shape gives the length of each dimension as a tuple. For example,


[ [1.0, 2.0, 3.0], [4.0, 5.0, 6.0] ]

is a tensor that has rank 2 and shape (2,3).
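Rank and shape can be worked out by hand for any regular nested list. The sketch below does this in plain Python; `shape_of` and `rank_of` are illustrative helper names, not TensorFlow functions, and they assume every sub-list at a given depth has the same length.

```python
def shape_of(nested):
    """Return the shape of a regular nested list as a tuple (scalars give ())."""
    shape = []
    level = nested
    while isinstance(level, list):
        shape.append(len(level))
        if not level:  # an empty list has no first element to descend into
            break
        level = level[0]
    return tuple(shape)


def rank_of(nested):
    """The rank is simply the number of dimensions."""
    return len(shape_of(nested))


t = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(rank_of(t), shape_of(t))      # 2 (2, 3)
print(rank_of(7.0), shape_of(7.0))  # 0 ()
```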

 

Another crucial concept of TensorFlow is the “directed graph”, which contains operations and tensors. In this graph, operations are represented as nodes; tensors are represented as edges. Operations take tensors as input and produce tensors as output. Let’s give a simple example here:


# first, we have to import tensorflow
import tensorflow as tf

# constants are the most basic type of operations
x = tf.constant(1.0, dtype=tf.float32)
y = tf.constant(2.0, dtype=tf.float32)
z = x + y

 

In the code above, we define two tensors x and y by the tf.constant operation. This operation takes 1.0 and 2.0 as inputs and just produces their tensor equivalents and nothing more. Then using x and y, we created another tensor called z. Now, what do you expect from this code below?


print(z)

You are incorrect if you expect to see 3.0. Instead, you just see:

Tensor("add:0", shape=(), dtype=float32)

 

Defining graphs is different than executing the statements. For now, z is just a tensor object and has no value associated to it. We somehow need to run the graph so that we can get 3.0 from the tensor z. This is where another concept in the TensorFlow comes in: the session.

 

Sessions in TensorFlow are the objects that hold the state of the runtime where our graph will be executed. We need to instantiate a session and then run the operations we have already defined:


sess = tf.Session()

The code above instantiates the session object. Now, using that object, we can run our operations:

print(sess.run(z))

 

and we get 3.0 from the print statement! When we run an operation (namely a node in the graph), TensorFlow executes it by first calculating the tensors that the operation takes as input. This involves working backward through the nodes and tensors until it reaches a natural starting point, just like our tf.constant operations above.

 

As you have already noticed, tf.constant simply provides constants as an operation; it may not be suitable to work with external data. For these kinds of situations, TensorFlow provides another object called the placeholder. You can think of placeholders as arguments to a function.

 

It is something that you’ll provide later on in your code! For example:


k = tf.placeholder(tf.float32)

l = tf.placeholder(tf.float32)

m = k + l

This time we define k and l as placeholders; we will assign some values to them when we run them in the session. Using the session above:


print(sess.run(m, feed_dict={k: 1.0, l: 2.0}))

 

will print 3.0. Here we used a feed_dict object, which is a dictionary used to pass values to the placeholders. Effectively, we pass 1.0 and 2.0 to k and l placeholders, respectively, in the runtime. You can also use the feed_dict parameter of the run method of the session to update values of the tf.constants.
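The idea that placeholders behave like function arguments can be sketched without TensorFlow. In the toy code below, `Placeholder`, `Add`, and `run` are hypothetical stand-ins for tf.placeholder, the + operation, and sess.run; they show only the lookup mechanism, not how TensorFlow is implemented.

```python
class Placeholder:
    """A named slot with no value of its own; the value arrives at run time."""

    def __init__(self, name):
        self.name = name


class Add:
    """An operation node that adds the values of its two inputs."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def run(node, feed_dict):
    """Evaluate `node`, looking placeholder values up in `feed_dict`."""
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("no value fed for placeholder " + node.name)
        return feed_dict[node]
    return run(node.left, feed_dict) + run(node.right, feed_dict)


k = Placeholder("k")
l = Placeholder("l")
m = Add(k, l)

# The same graph can be run many times with different inputs.
print(run(m, feed_dict={k: 1.0, l: 2.0}))   # 3.0
print(run(m, feed_dict={k: 10.0, l: 5.0}))  # 15.0
```

Note that the dictionary keys are the placeholder objects themselves, which is why the feed_dict is written with colons rather than with keyword-style assignments.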

 

We have seen that constants and placeholders are useful TensorFlow constructs to store values. Another useful construct is the TensorFlow variable. One can think of a variable as something that lies between constants and placeholders. Like placeholders, variables do not have an assigned value. However, much like constants, they can have default values. Here is an example of a TensorFlow variable:


v = tf.Variable([0.], dtype=tf.float32)

 

In the above line, we define a TensorFlow variable called v and set its default value as 0. When we want to assign some value different than the default one, we can use the tf.assign method:


w = tf.assign(v, [-1.])

It is crucial to know that TensorFlow variables are not initialized when defined. Instead, we need to initialize them in the session like this:

init = tf.global_variables_initializer()

sess.run(init)

The code above initializes all the variables! As a rule of thumb, you should use tf.constant to define constants, tf.placeholder to hold the data fed to your model, and tf.Variable to represent the parameters for your model.
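The difference between defining a variable and initializing it can be illustrated with a small stand-in class. This is a conceptual sketch only: the `Variable` class below is our own toy and does not reproduce tf.Variable's API beyond the idea that a default value exists before any current value does.

```python
class Variable:
    """Holds an initial value, but has no current value until initialized."""

    def __init__(self, initial_value):
        self.initial_value = initial_value
        self.initialized = False
        self.value = None

    def initialize(self):
        """Copy the default into the current value (the initializer's job)."""
        self.value = self.initial_value
        self.initialized = True

    def assign(self, new_value):
        """Overwrite the current value with something other than the default."""
        self.value = new_value
        self.initialized = True

    def read(self):
        if not self.initialized:
            raise RuntimeError("variable read before initialization")
        return self.value


v = Variable([0.0])
try:
    v.read()
except RuntimeError as err:
    print("error:", err)  # error: variable read before initialization

v.initialize()
print(v.read())  # [0.0]
v.assign([-1.0])
print(v.read())  # [-1.0]
```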

 

Now that we have learned the basic concepts of TensorFlow and demonstrated how to use them, you are all set to use TensorFlow to build your own models.

 

TensorFlow in action

We’ll begin our TensorFlow exercises by implementing a DL classification model, utilizing the elements of TensorFlow we covered in the last section.

 

The datasets we use to demonstrate TensorFlow are the same synthetic datasets we used in the previous section. We use them for classification and regression purposes in this blog. Remember that those datasets, as well as the code we go over in this section, are already provided with the Docker image distributed with this blog. You can run that Docker image to access the datasets and the source code of this blog.

 

Classification

Before we begin to implement our classifier, we need to import some libraries to use them. Here are the libraries we need to import:


import numpy as np

import pandas as pd

import tensorflow as tf

from sklearn.model_selection import train_test_split

 

First, we should load the dataset and do a bit of preprocessing to format the data we’ll use in our model. As usual, we load the data as a list:


# import the data
with open("../data/data1.csv") as f:
    data_raw = f.read()

# split the data into separate lines
lines = data_raw.splitlines()

 

Then, we separate the labels and the three features into lists, called “labels” and “features”:


labels = []
features = []

for line in lines:
    tokens = line.split(',')
    labels.append(int(tokens[-1]))
    x1, x2, x3 = float(tokens[0]), float(tokens[1]), float(tokens[2])
    features.append([x1, x2, x3])

Next, we make dummy variables of the three label categories, using Pandas’ get_dummies function:

labels = pd.get_dummies(pd.Series(labels))

 

After this, labels is a DataFrame with one indicator column per category, marking which class each row belongs to.
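To make the encoding concrete, here is a pure-Python sketch of the same idea applied to a handful of made-up labels (not the blog's dataset). The helper `one_hot` is our own illustration of what dummy variables are; it is not how Pandas implements get_dummies.

```python
def one_hot(labels):
    """Return (sorted categories, rows of 0/1 indicators), one column per category."""
    categories = sorted(set(labels))
    column = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for label in labels:
        row = [0] * len(categories)
        row[column[label]] = 1  # mark the column of this row's class
        rows.append(row)
    return categories, rows


categories, rows = one_hot([0, 2, 1, 0])  # made-up labels
print(categories)  # [0, 1, 2]
for row in rows:
    print(row)
# [1, 0, 0]
# [0, 0, 1]
# [0, 1, 0]
# [1, 0, 0]
```

Each row contains exactly one 1, which is the form the network's three output neurons are trained against.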

The next step is to split our data into train and test sets. For this purpose, we use the scikit-learn’s train_test_split function that we imported before:


X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42)

 

We’re now ready to build up our model using TensorFlow. First, we define the hyperparameters of the model that are related to the optimization process:


# Parameters
learning_rate = 0.1
epoch = 10

Next, we define the hyperparameters that are related to the structure of the model:

# Network Parameters

n_hidden_1 = 16 # 1st layer number of neurons

n_hidden_2 = 16 # 2nd layer number of neurons

num_input = 3 # data input

num_classes = 3 # total classes

Then we need placeholders, through which we feed our data into the graph at run time:

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

We will store the model parameters in two dictionaries:

# weights and biases
weights = {
    "h1": tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    "h2": tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    "out": tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}

biases = {
    "b1": tf.Variable(tf.random_normal([n_hidden_1])),
    "b2": tf.Variable(tf.random_normal([n_hidden_2])),
    "out": tf.Variable(tf.random_normal([num_classes]))
}

We can now define our graph in TensorFlow. To that end, we provide a function:


# Create model
def neural_net(x):
    # Hidden fully connected layer with 16 neurons
    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights["h1"]), biases["b1"]))
    # Hidden fully connected layer with 16 neurons
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights["h2"]), biases["b2"]))
    # Output fully connected layer with a neuron for each class
    out_layer = tf.add(tf.matmul(layer_2, weights["out"]), biases["out"])
    # For visualization in TensorBoard
    tf.summary.histogram("output_layer", out_layer)
    return out_layer

This function takes the input data as an argument. Using this data, it first constructs a hidden layer. In this layer, each input data point is multiplied by the weights of the first layer and added to the bias terms, and the result is passed through the ReLU activation function. Using the output of this layer, the function constructs another hidden layer in the same way: the second layer multiplies the output of the first layer by its own weights, adds its bias terms, and applies ReLU.

Then the output of the second layer is fed into the last layer, which is the output layer of the neural network. The output layer performs the same multiplication and addition as the previous layers but applies no activation function, so the function returns one raw score (logit) per class.
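The same forward pass can be written in a few lines of NumPy, which may make the shapes easier to follow. This is a sketch with randomly drawn weights and a made-up batch of four data points. The names forward, relu, W1, and so on are illustrative and do not appear in the TensorFlow code.

```python
import numpy as np

rng = np.random.RandomState(0)

def relu(z):
    return np.maximum(z, 0.0)

# Shapes mirror the tutorial: 3 inputs -> 16 -> 16 -> 3 classes.
W1, b1 = rng.randn(3, 16), rng.randn(16)
W2, b2 = rng.randn(16, 16), rng.randn(16)
W_out, b_out = rng.randn(16, 3), rng.randn(3)

def forward(x):
    layer_1 = relu(x @ W1 + b1)          # first hidden layer
    layer_2 = relu(layer_1 @ W2 + b2)    # second hidden layer
    return layer_2 @ W_out + b_out       # raw scores (logits), no activation

x = rng.randn(4, 3)   # a hypothetical batch of 4 data points
logits = forward(x)
print(logits.shape)
```

The batch dimension is carried through every layer, so a batch of 4 inputs with 3 features each produces 4 rows of 3 class scores.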

 

After this, we can define our loss function, optimization algorithm, and the metric we will use to evaluate our model:


# Construct model
logits = neural_net(X)

# Define loss and optimizer
loss_op = tf.losses.softmax_cross_entropy(logits=logits, onehot_labels=Y)
# For visualization in TensorBoard
tf.summary.scalar("loss_value", loss_op)

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)

# Evaluate model with test logits
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
# For visualization in TensorBoard
tf.summary.scalar("accuracy", accuracy)

# For TensorBoard
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter("events")

# Initialize the variables (assign their default value)
init = tf.global_variables_initializer()

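As our loss function, we use the cross-entropy loss with softmax. Apart from this, there are other loss functions that are pre-built in TensorFlow. Some of them are: sigmoid_cross_entropy, sparse_softmax_cross_entropy, and mean_squared_error in the tf.losses module, and weighted_cross_entropy_with_logits in tf.nn. Note that softmax, log_softmax, and tanh are activation functions rather than losses.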

 

Adam is one of the most commonly used optimization algorithms in the machine learning community. Some other optimizers available in TensorFlow are: GradientDescentOptimizer, AdadeltaOptimizer, AdagradOptimizer, MomentumOptimizer, FtrlOptimizer, and RMSPropOptimizer.

 

Accuracy is our evaluation metric, as usual.
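To make the loss and the metric concrete, here is a plain-Python sketch that computes softmax cross-entropy and accuracy by hand. The logits and one-hot labels are made up for illustration, the helper names are hypothetical, and averaging the per-example losses over the batch is an assumption about how the loss is reduced.

```python
import math

def softmax(scores):
    shifted = [s - max(scores) for s in scores]   # for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, one_hot):
    probs = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical logits and one-hot labels for three data points.
logits = [[2.0, 0.5, 0.1], [0.2, 0.1, 3.0], [1.0, 1.5, 0.3]]
labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0]]

losses = [cross_entropy(s, t) for s, t in zip(logits, labels)]
mean_loss = sum(losses) / len(losses)
accuracy = sum(argmax(s) == argmax(t) for s, t in zip(logits, labels)) / len(labels)
print(mean_loss, accuracy)
```

The third data point is misclassified (its largest score is in the second position), so two of the three predictions are correct.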

 

Now it’s time to train our model!


with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # For visualization of the graph in TensorBoard
    train_writer.add_graph(sess.graph)

    for step in range(0, epoch):
        # Run optimization
        sess.run(train_op, feed_dict={X: X_train, Y: y_train})
        # Calculate loss and accuracy
        summary, loss, acc = sess.run([merged, loss_op, accuracy],
                                      feed_dict={X: X_train, Y: y_train})
        # Add summary events for TensorBoard
        train_writer.add_summary(summary, step)
        print("Step " + str(step) + ", Loss= " +
              "{:.4f}".format(loss) + ", Training Accuracy= " +
              "{:.3f}".format(acc))

    print("Optimization Finished!")

    # Calculate test accuracy
    acc = sess.run(accuracy, feed_dict={X: X_test, Y: y_test})
    print("Testing Accuracy:", acc)

    # close the FileWriter
    train_writer.close()

After several iterations, you should see an output similar to this:

Step 0, Loss= 0.4989, Training Accuracy= 0.821

Step 1, Loss= 0.2737, Training Accuracy= 0.898

Step 2, Loss= 0.2913, Training Accuracy= 0.873

Step 3, Loss= 0.3024, Training Accuracy= 0.864

Step 4, Loss= 0.2313, Training Accuracy= 0.892

Step 5, Loss= 0.1640, Training Accuracy= 0.933

Step 6, Loss= 0.1607, Training Accuracy= 0.943

Step 7, Loss= 0.1684, Training Accuracy= 0.938

Step 8, Loss= 0.1537, Training Accuracy= 0.944

Step 9, Loss= 0.1242, Training Accuracy= 0.956

Optimization Finished!

Testing Accuracy: 0.95476

Regression

Although today’s deep learning applications are quite successful in challenging classification tasks, TensorFlow also enables us to build regression models in almost the same manner. In this section, we’ll show you how to predict a continuous outcome variable using regression.

 

It is critical to choose a different loss function than we used in the classification model – one that is more suitable to a regression task. We’ll choose the L2 metric, as it is one of the most popular metrics in regression analysis. In terms of evaluation, we’ll use R-squared to assess the performance of our model in the test set.

 

We import the same libraries that we imported for the classification task:


import numpy as np

import pandas as pd

import tensorflow as tf

from sklearn.model_selection import train_test_split

The dataset we use is the same synthetic set provided, with 20 features and 1 outcome variable. Below, we load the dataset and do some pre-processing to format the data we’ll use in our model:


# import the data
with open("../data/data2.csv") as f:
    data_raw = f.read()

# split the data into separate lines
lines = data_raw.splitlines()

 

Instead of calling the outcome variable “labels”, we prefer to call it “outcomes” in this case, as this seems more appropriate for regression models. As usual, we set aside 20% of our dataset as our test data:


outcomes = []

features = []

for line in lines:
    tokens = line.split(",")
    outcomes.append(float(tokens[-1]))
    features.append([float(x) for x in tokens[:-1]])

X_train, X_test, y_train, y_test = train_test_split(
    features, outcomes, test_size=0.2, random_state=42)

We can now set the hyperparameters of the model regarding the optimization process, and define the structure of our model:


# Parameters
learning_rate = 0.1
epoch = 500

# Network Parameters
n_hidden_1 = 64   # 1st layer number of neurons
n_hidden_2 = 64   # 2nd layer number of neurons
num_input = 20    # data input
num_classes = 1   # total classes

# tf Graph input
X = tf.placeholder("float", [None, num_input])
Y = tf.placeholder("float", [None, num_classes])

This time, our outcome is single-value in nature, and we have 20 features. We set the relevant parameters accordingly, above. Next, we store the model parameters in two dictionaries as we did in the classification case:

# weights & biases
weights = {
    "h1": tf.Variable(tf.random_normal([num_input, n_hidden_1])),
    "h2": tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    "out": tf.Variable(tf.random_normal([n_hidden_2, num_classes]))
}

biases = {
    "b1": tf.Variable(tf.random_normal([n_hidden_1])),
    "b2": tf.Variable(tf.random_normal([n_hidden_2])),
    "out": tf.Variable(tf.random_normal([num_classes]))
}

It’s time to define the structure of our model. The graph has the same three-layer structure as the classification model from the previous part, except that this version applies no ReLU activation to the hidden layers and records no TensorBoard summaries:

# Create model
def neural_net(x):
    # Hidden fully connected layer with 64 neurons
    layer_1 = tf.add(tf.matmul(x, weights["h1"]), biases["b1"])
    # Hidden fully connected layer with 64 neurons
    layer_2 = tf.add(tf.matmul(layer_1, weights["h2"]), biases["b2"])
    # Output fully connected layer
    out_layer = tf.matmul(layer_2, weights["out"]) + biases["out"]
    return out_layer

 

The difference between the classification model and the regression model is that the latter uses the L2 loss as its loss function. This is because the outcome of the regression model is continuous; as such, we must use a loss function that is capable of handling continuous values. We again use the Adam optimization algorithm in this regression model:


# Construct model

output = neural_net(X)

# Define loss and optimizer

loss_op = tf.nn.l2_loss(tf.subtract(Y, output))

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)

train_op = optimizer.minimize(loss_op)
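The L2 loss used here is half the sum of the squared errors, which is how tf.nn.l2_loss is documented to behave. The plain-Python sketch below compares it with the mean squared error on a few made-up values. The numbers and the helper name l2_loss are illustrative only.

```python
def l2_loss(errors):
    """Half the sum of squared errors (a sum, not a mean)."""
    return sum(e * e for e in errors) / 2.0

y_true = [3.0, -1.0, 2.0]   # hypothetical outcomes
y_pred = [2.5, -1.0, 4.0]   # hypothetical predictions
errors = [t - p for t, p in zip(y_true, y_pred)]

loss = l2_loss(errors)
mse = sum(e * e for e in errors) / len(errors)
print(loss, mse)
```

Because this loss is a sum rather than an average, its value grows with the number of rows fed in, which is one reason the loss values logged during training on the full training set look large.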

Another difference between our classification and regression models is the metric we use to evaluate our model. For regression models, we prefer to use the R-squared metric; it is one of the most common metrics used to assess the performance of regression models:


# Evaluate model using R-squared

total_error = tf.reduce_sum(tf.square(tf.subtract(Y, tf.reduce_mean(Y))))
unexplained_error = tf.reduce_sum(tf.square(tf.subtract(Y, output)))
R_squared = tf.subtract(1.0, tf.div(unexplained_error, total_error))

# Initialize the variables (assign their default values)

init = tf.global_variables_initializer()
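The R-squared computation defined above can be checked by hand on a tiny example. The following sketch mirrors the same three quantities in plain Python. The outcome values are made up, and r_squared is a hypothetical helper name.

```python
def r_squared(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    total_error = sum((y - mean_y) ** 2 for y in y_true)
    unexplained_error = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - unexplained_error / total_error

y_true = [1.0, 2.0, 3.0, 4.0]             # hypothetical outcomes
perfect = r_squared(y_true, y_true)       # predictions match exactly
baseline = r_squared(y_true, [2.5] * 4)   # always predict the mean
print(perfect, baseline)
```

A perfect model scores 1, a model that always predicts the mean of the outcomes scores 0, and a model that does worse than the mean gets a negative value.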

We are all set to train our model:

# Start training

with tf.Session() as sess:
    # Run the initializer
    sess.run(init)

    # Reshape the outcomes into column vectors to match the placeholder Y
    # (200000 training rows and 50000 test rows in this dataset)
    y_train_col = np.array(y_train).reshape(-1, 1)
    y_test_col = np.array(y_test).reshape(-1, 1)

    for step in range(0, epoch):
        # Run optimization
        sess.run(train_op, feed_dict={X: X_train, Y: y_train_col})
        # Calculate loss and R-squared on the training set
        loss, r_sq = sess.run([loss_op, R_squared],
                              feed_dict={X: X_train, Y: y_train_col})
        print("Step " + str(step) + ", L2 Loss= " +
              "{:.4f}".format(loss) + ", Training R-squared= " +
              "{:.3f}".format(r_sq))

    print("Optimization Finished!")

    # Calculate R-squared on the test set
    print("Testing R-squared:",
          sess.run(R_squared, feed_dict={X: X_test, Y: y_test_col}))

The outcome of the model should look like this:

Step 497, L2 Loss= 81350.7812, Training R-squared= 0.992

Step 498, L2 Loss= 81342.4219, Training R-squared= 0.992

Step 499, L2 Loss= 81334.3047, Training R-squared= 0.992

Optimization Finished!

Testing R-squared: 0.99210745

 

Visualization in TensorFlow: TensorBoard

Visualization of your model’s results is a useful method for investigation, understanding, and debugging purposes. To this end, TensorFlow offers a visualization library called TensorBoard; with that library, you can visualize your models and their outcomes. TensorBoard comes with TensorFlow; once you install TensorFlow on your machine, TensorBoard should be present.

 

TensorBoard reads event files containing the summary data of the TensorFlow model. To generate summary data, TensorFlow provides some functions in the summary module. In this module, there are some functions that operate just like the operations in TensorFlow. This means that we can use tensors and operations as input for these summary operations.

 

In the classification example, we actually used some of these functionalities. Here is a summary of the operations that we used in our example:

tf.summary.scalar:

 

If we want data about how a scalar evolves in time (like our loss function), we can use the loss function node as an input for the tf.summary.scalar function—right after we define the loss, as shown in the following example:


loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    logits=logits, labels=Y))

tf.summary.scalar("loss_value", loss_op)

tf.summary.histogram:

We may also be interested in the distributions of some variables, like the results of a matrix multiplication.

In this case, we use tf.summary.histogram, as follows:

out_layer = tf.matmul(layer_2, weights["out"]) + biases["out"]

tf.summary.histogram("output_layer", out_layer)

tf.summary.merge_all: Summary nodes do not alter the model itself, but they must still be run to produce their data.

The tf.summary.merge_all function merges all of our summary operations so that we do not need to run each operation one by one.

tf.summary.FileWriter: This class is used to store the summary (which was generated using the tf.summary.merge_all function) to the disk. Here is an example of how to do that:

merged = tf.summary.merge_all()

train_writer = tf.summary.FileWriter("events")

Once we define our file writer, we also need to pass it the graph inside the session:

train_writer.add_graph(sess.graph)

After we integrate summary functions to our code, we should write the summaries to the files and visualize them. When we run our model, we also receive the summaries:


summary, loss, acc = sess.run([merged, loss_op, accuracy], feed_dict={X: batch_x, Y: batch_y})

After that we add the summary to our summary file:

train_writer.add_summary(summary,step)

Last, we close the FileWriter:

train_writer.close()

Next, we can visualize the summaries in the browser. To do that, we need to run the following command:

tensorboard --logdir=path/to/log-directory

where the log-directory refers to the directory where we saved our summary files. When you open localhost:6006 in your browser, you should see the dashboard with the summaries of your model.

 

High-level APIs in TensorFlow: Estimators

So far, we’ve discussed the low-level structures of TensorFlow. We saw that we must build our own graph and keep track of the session. However, TensorFlow also provides a high-level API, where this tedious work is handled automatically. This high-level API is called “Estimators”.

The Estimators API also provides pre-made estimators. You can use these estimators quickly, and customize them if needed. Here are some of the advantages of this API, with respect to the low-level APIs of TensorFlow:

  • With fewer lines of code, you can implement the same model.
  • Building the graph, opening and closing the session, and initializing the variables are all handled automatically.
  • The same code runs on CPU, GPU, or TPU.
  • Parallel computing is supported. As such, if multiple servers are available, the code you wrote on your local machine can be run on them without any modification.
  • Summaries of the models are automatically saved for TensorBoard.

When you are writing your code using this API, you basically follow four steps:

  • 1. Reading the dataset.
  • 2. Defining the feature columns.
  • 3. Setting up a pre-defined estimator.
  • 4. Training and evaluating the estimator.

Now we will demonstrate each of these steps using our synthetic data for classification. First, we read the data from our .csv file, as usual:


# import the data
with open("../data/data1.csv") as f:
    data_raw = f.read()

lines = data_raw.splitlines()  # split the data into separate lines

labels = []
x1 = []
x2 = []
x3 = []

for line in lines:
    tokens = line.split(",")
    labels.append(int(tokens[-1]) - 1)
    x1.append(float(tokens[0]))
    x2.append(float(tokens[1]))
    x3.append(float(tokens[2]))

# Stack the three lists as columns, giving shape (250000, 3).
# (Reshaping np.array([x1, x2, x3]) to (250000, 3) would mix up the features.)
features = np.column_stack([x1, x2, x3])
labels = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=42)
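One detail worth checking whenever per-feature lists are combined into a feature matrix is the orientation of the result. The sketch below, using two made-up rows, shows that np.array([x1, x2, x3]) has one row per feature, and that reshaping it to (n, 3) does not transpose it, whereas np.column_stack keeps each data point's three features together.

```python
import numpy as np

# Hypothetical per-feature lists for two rows of data.
x1 = [1.0, 2.0]
x2 = [10.0, 20.0]
x3 = [100.0, 200.0]

stacked = np.array([x1, x2, x3])           # shape (3, 2): one row per feature
reshaped = stacked.reshape(2, 3)           # shape (2, 3), but values are scrambled
columns = np.column_stack([x1, x2, x3])    # shape (2, 3): one row per data point

print(reshaped)
print(columns)
```

In the reshaped array the first row mixes values from two different data points, while each row of the column-stacked array holds x1, x2, and x3 of a single data point.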

Second, we write a function that converts our features to a dictionary, and returns the features and labels for the model:

def inputs(features, labels):
    features = {"x1": features[:, 0],
                "x2": features[:, 1],
                "x3": features[:, 2]}
    return features, labels

Third, we write a function that transforms our data into a Dataset object:

def train_input_fn(features, labels, batch_size):
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # Shuffle, repeat, and batch the examples.
    return dataset.shuffle(1000).repeat().batch(batch_size)
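The shuffle, repeat, and batch chain can be pictured as an endless stream of shuffled examples that is cut into fixed-size groups. The generator below is a rough plain-Python analogue of that idea, not a reimplementation of tf.data: it shuffles each full pass instead of using a 1000-element buffer, and the name batches is hypothetical.

```python
import itertools
import random

def batches(examples, batch_size, seed=0):
    """Endlessly yield shuffled mini-batches of the given examples."""
    rng = random.Random(seed)

    def stream():
        while True:                  # repeat: start over when exhausted
            epoch = list(examples)
            rng.shuffle(epoch)       # shuffle: new order on every pass
            for example in epoch:
                yield example

    source = stream()
    while True:                      # batch: group consecutive examples
        yield list(itertools.islice(source, batch_size))

examples = list(range(10))           # stand-in for (features, label) pairs
first_three = list(itertools.islice(batches(examples, 4), 3))
print(first_three)
```

Because the stream never ends, the number of batches consumed is controlled by the caller, which is the role the steps argument plays when training an estimator. Note also that a batch can straddle two passes over the data, as the third batch does here.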

Defining our feature columns only requires a few lines of code:

# Feature columns describe how to use the input.
my_feature_columns = []
for key in ["x1", "x2", "x3"]:
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))

Before we run our model, we should select a pre-made estimator that suits our needs. Since our task is classification, we use a network with two fully-connected hidden layers, as we did previously. For this, the Estimators API provides a classifier called DNNClassifier:


# Build a DNN with 2 hidden layers and 256 nodes in each hidden layer.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 256 nodes each.
    hidden_units=[256, 256],
    # The model must choose between 3 classes.
    n_classes=3,
    optimizer=tf.train.AdamOptimizer(learning_rate=0.1))

Here we defined two dense layers of size 256 (larger than the 16-unit layers we used earlier) and, as before, we set the learning rate to 0.1 and the number of classes to 3.

 

Now, we are ready to train and evaluate our model. Training is as simple as:


classifier.train(
    input_fn=lambda: train_input_fn(inputs(X_train, y_train)[0],
                                    inputs(X_train, y_train)[1], 64),
    steps=500)

We passed the train() function a lambda that calls the input function we wrote above, which returns the Dataset object for the model. We also set the number of training steps to 500. When you run the code above, you should see something like:


INFO:tensorflow:loss = 43.874107, step = 401 (0.232 sec)

INFO:tensorflow:Saving checkpoints for 500 into /tmp/tmp8xv6svzr/model.ckpt.

INFO:tensorflow:Loss for final step: 34.409817.

<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x7ff14f59b2b0>

After this, we can evaluate the performance of our model in our test set:

# Evaluate the model.
# Note: with steps=1 and a batch size of 64, a single batch of 64 test examples is evaluated.
eval_result = classifier.evaluate(
    input_fn=lambda: train_input_fn(inputs(X_test, y_test)[0],
                                    inputs(X_test, y_test)[1], 64),
    steps=1)

print("Test set accuracy: {accuracy:0.3f}\n".format(**eval_result))

The output should look like this:


INFO:tensorflow:Starting evaluation at 2018-04-07-12:11:21

INFO:tensorflow:Restoring parameters from /tmp/tmp8xv6svzr/model.ckpt-500

INFO:tensorflow:Evaluation [1/1]

INFO:tensorflow:Finished evaluation at 2018-04-07-12:11:21

INFO:tensorflow:Saving dict for global step 500: accuracy = 0.828125, average_loss = 0.6096449, global_step = 500, loss = 39.017273

Test set accuracy: 0.828
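A quick arithmetic check on the numbers in this log helps with reading it. The sketch below only uses values that appear in the evaluation call and its output. The interpretation (that the accuracy is measured on one batch of 64 examples, and that loss is the per-example average multiplied by the batch size) is an inference from those numbers rather than something the log states.

```python
batch_size = 64
steps = 1
reported_accuracy = 0.828125
reported_average_loss = 0.6096449
reported_loss = 39.017273

# Number of test examples seen with one step of batch size 64.
examples_evaluated = batch_size * steps
# Correct predictions implied by the reported accuracy.
correct = reported_accuracy * examples_evaluated
print(examples_evaluated, correct)

# The reported loss is close to average_loss times the batch size.
print(reported_average_loss * batch_size)
```

If that reading is right, the reported test accuracy is based on only 64 examples, so it is a noisy estimate. Evaluating over more steps, or over the whole test set without shuffling and repeating, would give a steadier figure.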

 

Summary

TensorFlow is a deep learning framework initially developed by Google and now backed by a huge open source community.

TensorFlow is one of the most popular deep learning frameworks. Even if you choose to use other frameworks, learning the basics of TensorFlow is beneficial; much of the code you’ll encounter that is written by others is likely to be in TensorFlow.

  • TensorFlow supports distributed computing by nature.
  • TensorFlow models can be run on CPUs, GPUs, and TPUs.
  • You can write TensorFlow code in Python, Java, Julia, C++, R, and more.
  • Although you can use low-level structures of TensorFlow, there are also many high-level APIs that simplify the model building process.
  • You can also use Keras on top of Theano or CNTK, but using it on top of TensorFlow is by far the most common usage in the industry.

 

Building a DL Network Using Keras

Now that you understand the basics of the TensorFlow framework, we’ll explore another very popular framework that is built on top of TensorFlow: Keras. Keras is a framework that reduces the lines of code you need to write by means of its abstraction layers. It provides a simple yet powerful API with which almost anyone can implement even a complicated DL model in just a few lines of code.

 

Our advice is to use Keras if you are new to DL, as you can implement almost anything just using Keras. Nevertheless, being familiar with TensorFlow is also beneficial.

 

You’ll likely encounter models that are written in TensorFlow, and to understand them you’ll need a good grasp of TensorFlow. This is one of the reasons why we introduced TensorFlow before Keras. The other reason is that we can now appreciate the simplicity Keras brings to the table, compared to TensorFlow!

 

We cover the basic structures in Keras and show how you can implement DL models in Keras using our synthetic dataset. Next, we explore the visualization capabilities of the framework. Then, we show you how to transform your models written in Keras into TensorFlow estimators.

 

Keras sits on top of a back-end engine which is either TensorFlow, Theano, or CNTK. So, before installing Keras, one should first install one of these three backends that Keras supports.

 

By default, Keras uses the TensorFlow back-end engine. Since we covered TensorFlow in the last blog, we assume that you’ve already installed TensorFlow on your system. If not, refer to the relevant section of the TensorFlow blog to install TensorFlow first.

 

After installing TensorFlow, Keras can be installed via PyPI. Simply run this command:


pip install keras

After you run the command above, the Keras deep learning framework should be installed in your system. You can then use Keras in your Python code by importing it:


import keras

 

Keras abstracts away the low-level data structures of TensorFlow, replacing them with intuitive, easily integrated, and extensible structures. When designing this framework, the developers followed these guiding principles:

 

1. User friendliness: Keras keeps the user’s attention on the model and builds up the details around this structure. In doing so, it reduces the amount of work needed in common use cases by providing relevant functionality by default.

2. Modularity: In Keras, we can easily integrate the layers, optimizers, activation layers, and other ingredients of a DL model together, as if they were modules.

3. Easy extensibility: We can create new modules in Keras and integrate them to our existing models quite easily. They can be used as objects or functions.

 

Core components

The basic component in Keras is called the model. You can think of the Keras model as an abstraction of a deep learning model. When we start to implement a DL model in Keras, we usually begin by creating a so-called model object. Of the many types of models in Keras, Sequential is the simplest and the most commonly used.

 

Another basic structure in Keras is called the layer. Layer objects in Keras represent the actual layers in a DL model. We can add layer objects to our model object by just defining the type of the layer, the number of units, and the input/output sizes. The most commonly used layer type is the Dense layer.

 

And that is all! You might be surprised that the authors forgot to mention some other critical parts of the Keras framework. However, as you’ll see below, you can now start to build up your model with what you’ve already learned so far!

 

Keras in action

Now it’s time to see Keras in action. Remember that the datasets and the codes examined in this section are available to you via the Docker image provided with the blog.

 

The datasets used to demonstrate Keras are the same synthetic datasets used in the blog on TensorFlow. We’ll again use them for classification and regression purposes.

 

Classification

Before we begin to implement our classifier, we need to import some libraries in order to use them. Here are the libraries we need to import:


import numpy as np

import pandas as pd

from keras.models import Sequential

from keras.layers import Dense

from keras import optimizers

from sklearn.model_selection import train_test_split

 

First, we should load the dataset, and do a bit of pre-processing to format the data we’ll use in our model. As usual, we load the data as a list:


# import the data
with open("../data/data1.csv") as f:
    data_raw = f.read()

lines = data_raw.splitlines()  # split the data into separate lines

 

Then, we separate the labels and the three features into lists, respectively called labels and features:


labels = []

features = []

for line in lines:
    tokens = line.split(",")
    labels.append(int(tokens[-1]))
    x1, x2, x3 = float(tokens[0]), float(tokens[1]), float(tokens[2])
    features.append([x1, x2, x3])

Next, we make dummy variables of the three label categories using pandas’ get_dummies function:

labels = pd.get_dummies(pd.Series(labels))

 

The next step is to split our data into train and test sets. For this purpose, we use the scikit-learn’s train_test_split function that we imported before:


X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=42)

 

We’re now ready to build up our model using Keras. We first define our model and then add three layers; the first two are the dense layers and the third is the output layer:


model = Sequential()

model.add(Dense(units=16, activation="relu", input_dim=3))

model.add(Dense(units=16, activation="relu"))

model.add(Dense(units=3, activation="softmax"))

As you can see, building a graph in Keras is quite an easy task. In the code above, we first define a model object (which is sequential, in this case). Then we add three fully-connected layers (called dense layers).
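As a sanity check on the architecture, the number of trainable parameters in a stack of dense layers can be computed by hand: each layer has one weight per input-unit pair plus one bias per unit. The sketch below does this for the three layers described above. The helper name dense_params is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [(3, 16), (16, 16), (16, 3)]   # (inputs, units) per layer
per_layer = [dense_params(i, u) for i, u in layer_sizes]
total = sum(per_layer)
print(per_layer, total)
```

These are the figures to expect in the parameter column of a textual model summary for this network.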

 

After we define our model and layers, we must choose our optimizer and compile our model. For the optimizer, we use Adam, setting its learning rate to 0.1:


sgd = optimizers.Adam(lr=0.1)

 

Then we compile our model. In doing so, we define our loss function to be categorical cross-entropy, which is one of the pre-defined loss functions in Keras. For the metric to evaluate the performance of our model, we use accuracy, as usual. All these definitions can be implemented in a single line in Keras as seen here:


model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])

Now, it’s time to train our model in a single line! We train our models by calling the fit function of the model object. As parameters, we provide our features and labels as NumPy arrays, along with the batch size and the number of epochs. We set the batch size to 10,000 and the number of epochs to 5:


model.fit(np.array(X_train), np.array(y_train), batch_size=10000, epochs = 5)
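To get a feel for what these settings mean, the number of weight updates can be worked out by hand. The sketch below assumes the classification dataset has 250,000 rows, a figure taken from the reshape call in the Estimators example that reads the same file. With a different dataset size the numbers change accordingly.

```python
import math

n_rows = 250000      # assumed dataset size (see the note above)
batch_size = 10000
epochs = 5

n_test = n_rows * 20 // 100          # 20% held out for testing
n_train = n_rows - n_test
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs
print(n_train, updates_per_epoch, total_updates)
```

Under that assumption, each epoch consists of 20 batches, so the five epochs amount to 100 gradient updates in total.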

Next, we evaluate the performance of the model in our test data:

loss_and_metrics = model.evaluate(np.array(X_test), np.array(y_test), batch_size=100)

print(loss_and_metrics)

It should print out:

[0.03417351390561089, 0.9865800099372863]

So our model’s loss value is approximately 0.03 and the accuracy in the test set is about 0.99!

 

Regression

In Keras, building regression models is as simple as building classification models. We first define our models and the layers. One thing to be aware of is that the output layer of a regression model must produce only a single value.

 

We also must choose a different loss function. As we did in the TensorFlow blog, we use the L2 metric, as it is one of the most popular metrics in regression analysis. Finally, we evaluate the performance of our model using R-squared.

 

Import the following libraries:


import numpy as np

import pandas as pd

from keras.models import Sequential

from keras.layers import Dense

from keras import optimizers

import keras.backend as K

from sklearn.model_selection import train_test_split

 

We’ll again utilize the synthetic dataset from the previous blog. Recall that it includes 20 features and 1 outcome variable. Below, we load the dataset and pre-process the data into the format we’ll use in our model:


# import the data
with open("../data/data2.csv") as f:
    data_raw = f.read()

lines = data_raw.splitlines()  # split the data into separate lines

 

Instead of “label”, we prefer to call the target variable “outcome,” as it is more appropriate for regression models. As usual, we separate 20% of our dataset as our test data.


outcomes = []

features = []

for line in lines:
    tokens = line.split(",")
    outcomes.append(float(tokens[-1]))
    features.append([float(x) for x in tokens[:-1]])

X_train, X_test, y_train, y_test = train_test_split(features, outcomes, test_size=0.2, random_state=42)

We define our model and the layers as follows:

model = Sequential()

model.add(Dense(units=64, activation="relu", input_dim=20))

model.add(Dense(units=64, activation="relu"))

model.add(Dense(units=1, activation="linear"))

This time, our outcome is a single value and we have 20 features. So, we set the relevant parameters accordingly.

 

It’s time to compile our model. First, though, we must define a function that calculates the R-squared metric. Unfortunately, as of this writing, Keras does not provide a built-in R-squared metric in its package. As such, consider our implementation:


def r2(y_true, y_pred):
    SS_res = K.sum(K.square(y_true - y_pred))
    SS_tot = K.sum(K.square(y_true - K.mean(y_true)))
    return 1 - SS_res / (SS_tot + K.epsilon())

After that we choose Adam as our optimizer and set the learning rate to 0.1:

sgd = optimizers.Adam(lr=0.1)

Now we can compile our model. We use the mean-squared error as our loss function, and we feed our r2() function to the model as a metric:


model.compile(optimizer=sgd,
              loss="mean_squared_error",
              metrics=[r2])

Training a model is quite simple in Keras, as we saw earlier with classification. We provide our features and outcomes as NumPy arrays to the fit function of the model object. We also set the batch size to 10,000 and the number of epochs to 10:


model.fit(np.array(X_train), np.array(y_train), batch_size=10000, epochs = 10)

Next we evaluate the performance of our model on the test data:

loss_and_metrics = model.evaluate(np.array(X_test), np.array(y_test), batch_size=100)

print(loss_and_metrics)

So our model achieves 0.97 R-squared in the test data.

 

Model Summary and Visualization

If you don’t need any visuals, Keras can easily provide a textual summary of the layers of the model. For this purpose, Keras provides a summary() method. When called on a model, it prints textual information about the model’s layers, output shapes, and parameter counts. Since the method prints the summary itself and returns None, calling it directly is all it takes to check out the structure of the model:

model.summary()

 

Of course, visualizations are not only more aesthetically pleasing, but also can help you easily explain and share your findings with stakeholders and team members. Graphically visualizing the model in Keras is straightforward.

 

A module named keras.utils.vis_utils includes all the utilities for visualizing the graph using a library called graphviz. Specifically, the plot_model() function is the basic tool for visualizing the model. The code below demonstrates how to create and save the graph visualization for a model:


from keras.utils import plot_model

plot_model(model, to_file = “my_model.png”)

The plot_model() function accepts two optional arguments:

  • show_shapes: if True, the graph shows the output shapes. The default setting is False.
  • show_layer_names: if True, the graph shows the names of the layers. The default setting is True.

 

Converting Keras models to TensorFlow Estimators

As we mentioned in the previous blog, TensorFlow provides a set of pre-made Estimators, along with conveniences that are built around the Estimators abstraction. To make full use of them, it would be nice to convert our Keras models into TensorFlow Estimators.

 

Thankfully, this functionality is available out of the box. With just a single line of code, a compiled Keras model turns into a TensorFlow Estimator, ready to be used. The function is called model_to_estimator(); it lives in the tf.keras.estimator module of TensorFlow (with TensorFlow imported as tf), and looks like this:

estimator_model = tf.keras.estimator.model_to_estimator(keras_model=model)

Once we convert our Keras model into a TensorFlow Estimator, we can use this estimator in TensorFlow code.

 

Before closing the blog, we encourage our readers to read more about the Keras framework. If you are using DL models for research purposes, Keras is probably the most convenient tool for you. Keras will save a lot of time in implementing the many models you’ll try.

 

If you’re a data science practitioner, Keras is one of the best choices for you both in prototyping and production. Hence, enhancing your understanding and expertise in Keras is beneficial regardless of your particular problem.

 

Summary

Keras is a deep learning framework that provides a convenient and easy-to-use abstraction layer on top of the TensorFlow framework.

Keras brings a more user-friendly API to the TensorFlow framework. Along with easy extensibility and modularity, these are the key advantages of Keras over other frameworks.

 

The main structure in Keras is the model object, which represents the deep learning model to be used. The most commonly-used model type is the sequential model. Another important structure in Keras is the layer, which represents the layers in the model; the most common layer is the Dense layer.

 

Visualizing the model structure in Keras is accomplished with a single function call to plot_model().

It is a good idea to start building deep learning models in Keras instead of TensorFlow if you are new to the field.

 

Although Keras provides a very wide range of functionality, one may need to switch to TensorFlow to write some sophisticated functionality for non-standard deep learning models.

 

Advanced TensorFlow Programming

Developing deep learning networks, especially when testing new models, may require rapid prototyping. For this reason, several TensorFlow-based libraries have been developed that abstract many programming concepts and provide higher-level building blocks.

 

In this blog, we'll give an overview of the libraries such as Keras, Pretty Tensor, and TFLearn. For each library, we'll describe its main characteristics, with an application example.

 

Introducing Keras

Keras is a minimalist, high-level neural networks library, capable of running on top of TensorFlow. It was developed with a focus on enabling easy and fast prototyping and experimentation. Keras runs on Python 2.7 or 3.5, and can seamlessly execute on GPUs and CPUs, given the underlying frameworks. It is released under the MIT license.

Keras was developed and maintained by François Chollet, a Google engineer, following these design principles:

 

Modularity: A model is understood as a sequence or a graph of the standalone, fully configurable modules that can be plugged together with as few restrictions as possible. Neural layers, cost functions, optimizers, initialization schemes, and activation functions are all standalone modules that can be combined to create new models.

 

Minimalism: Each module must be short (a few lines of code) and simple. The source code should be transparent upon first reading.

Extensibility: New modules are simple to add (as new classes and functions), and existing modules provide examples. Being able to easily create new modules allows for total expressiveness, making Keras suitable for advanced research.

 

Python: No separate model configuration files in a declarative format. Models are described in Python code, which is compact, easier to debug, and allows for ease of extensibility.

 

Installation

To install Keras, you must also have an installation of TensorFlow on your system already. Keras can be installed easily using pip, as follows:


sudo pip install keras

For Python 3+ use the following command:

sudo pip3 install keras

 

During the writing of this blog, the most recent version of Keras is version 2.0.2. You can check your version of Keras on the command line, using the following snippet:


python -c "import keras; print(keras.__version__)"

Running the preceding script, you will see the following output:

2.0.2

 

You can also upgrade your installation of Keras using the same method:


sudo pip install --upgrade keras

Building deep learning models


The core data structure of Keras is a model, which is a way to organize layers.

There are two types of model:

Sequential: The main type of model. It is simply a linear stack of layers.

Keras functional API: These are used for more complex architectures.

 

You define a sequential model as follows:


from keras.models import Sequential

model = Sequential()

Once a model is defined, you can add one or more layers. The stacking operation is provided by the add() statement:

from keras.layers import Dense, Activation

 

For example, add a first fully connected NN layer and the Activation function:


model.add(Dense(units=64, input_dim=100))

model.add(Activation("relu"))

Then add a second layer with a softmax activation:

model.add(Dense(units=10))

model.add(Activation("softmax"))
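To make the role of the softmax activation concrete, here is a minimal pure-Python sketch of what it computes. It is not Keras code: the function name `softmax` and the sample scores are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # and does not change the result.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of a 3-unit layer.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

The largest score receives the largest probability, which is why a softmax output layer with 10 units can be read as a distribution over 10 classes.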

If the model looks fine, you must compile it with the model.compile function, specifying the loss function and the optimizer to be used:


model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

You may configure your optimizer. Keras tries to make programming reasonably simple, allowing the user to be fully in control when they need to be. Once compiled, the model must be fitted to the data:


model.fit(X_train, Y_train, epochs=5, batch_size=32)

Alternatively, you can feed batches to your model manually:

model.train_on_batch(X_batch, Y_batch)

Once trained, you can use your model to make predictions on new data:

classes = model.predict_classes(X_test, batch_size=32)

proba = model.predict_proba(X_test, batch_size=32)

We can summarize the construction of deep learning models in Keras as follows:

  • Define your model: Create a sequence and add layers.
  • Compile your model: Specify loss functions and optimizers.
  • Fit your model: Execute the model using data.
  • Evaluate the model: Keep an evaluation of your training dataset.
  • Make predictions: Use the model to generate predictions on new data.

The following figure depicts the preceding processes:

 

Figure: Keras programming model

In the following section, we'll look at how to use the Keras sequential model to study the sentiment classification problem of movie reviews.

 

Sentiment classification of movie reviews

Sentiment analysis is the capability to decipher the opinions contained in a written or spoken text. The main purpose of this technique is to identify the sentiment (or polarity) of a lexical expression, which may have a neutral, positive, or negative connotation.

 

The problem we want to resolve is the IMDB movie review sentiment classification problem. Each movie review is a variable sequence of words, and the sentiment (positive or negative) of each movie review must be classified.

This problem is very complex because the sequences can vary in length; they can also be part of a large vocabulary of input symbols.

The solution requires the model to learn long-term dependencies between symbols in the input sequence.

 

The IMDB dataset contains 25,000 highly polarized movie reviews (good or bad) for training and the same amount again for testing. The data was collected by Stanford researchers and was used in a 2011 paper, where a split of 50/50 of the data was used for training and testing. In this paper, an accuracy of 88.89% was achieved.

 

Once we define our problem, we are ready to develop an LSTM model to classify the sentiment of movie reviews. We can quickly develop an LSTM for the IMDB problem and achieve good accuracy.

 

Let's start off by importing the classes and functions required for this model, and initializing the random number generator to a constant value, to ensure we can easily reproduce the results:


import numpy

from keras.datasets import imdb

from keras.models import Sequential

from keras.layers import Dense

from keras.layers import LSTM

from keras.layers.embeddings import Embedding

from keras.preprocessing import sequence

numpy.random.seed(7)

We load the IMDB dataset. We are constraining the dataset to the top 5,000 words. We also split the dataset into training (50%) and testing (50%) sets.

 

Keras provides built-in access to the IMDb dataset (http://www.imdb.com/interfaces). Alternatively, you can download the IMDB dataset from the Kaggle website at https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset.

 

The imdb.load_data() function allows you to load the dataset in a format that is ready for use in a neural network and deep learning models. The words have been replaced by integers, which indicate the ordered frequency of each word in the dataset. The sentences in each review are therefore comprised of a sequence of integers.

 

Here's the code:


top_words = 5000

(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)
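The frequency-based integer encoding described above can be sketched in plain Python. This is a simplified illustration, not the Keras loader: the helper names `build_word_index` and `encode`, the toy reviews, and the use of 0 for out-of-vocabulary words are all assumptions. The real loader also reserves a few low indices for special markers, which this sketch ignores.

```python
from collections import Counter

def build_word_index(texts):
    """Rank words by frequency: the most frequent word gets index 1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}

def encode(text, word_index, top_words, oov_index=0):
    """Replace each word by its rank; ranks of top_words or more become oov_index."""
    encoded = []
    for word in text.lower().split():
        rank = word_index.get(word, oov_index)
        encoded.append(rank if rank < top_words else oov_index)
    return encoded

# Toy corpus, invented for illustration only.
reviews = ["a great great movie", "a bad movie", "a great cast"]
word_index = build_word_index(reviews)
print(word_index)
print(encode("a great bad film", word_index, top_words=3))
```

Rare words ("bad") and unseen words ("film") collapse to the same placeholder index, which is how limiting the vocabulary keeps the input space small.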

Next, we need to truncate and pad the input sequences so that they are all the same length for modeling. The model will learn that the zero values carry no information. Although the sequences differ in length in terms of content, vectors of the same length are required to perform the computation in Keras.

 

The sequence length in each review varies, so we constrained each review to 500 words, truncating long reviews and padding the shorter reviews with zero values:

 

Let's see:


max_review_length = 500

X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)

X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
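To see what padding and truncating do to a single review, here is a small pure-Python sketch. It mimics, as an assumption, the default behavior of pad_sequences, which to my understanding pads and truncates at the beginning of the sequence. The function name `pad_or_truncate` is hypothetical.

```python
def pad_or_truncate(seq, maxlen, value=0):
    """Return a list of exactly maxlen items (maxlen must be positive)."""
    if len(seq) >= maxlen:
        # Too long: keep only the last maxlen items.
        return list(seq[-maxlen:])
    # Too short: fill the front with the padding value.
    return [value] * (maxlen - len(seq)) + list(seq)

short_review = pad_or_truncate([5, 8, 3], maxlen=5)
long_review = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short_review)  # [0, 0, 5, 8, 3]
print(long_review)   # [3, 4, 5, 6, 7]
```

After this step every review has the same length, so a batch of reviews can be stored as one rectangular array.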

We can now define, compile, and fit our LSTM model.

To resolve the sentiment classification problem, we'll use the word embedding technique, which consists of representing words in a continuous vector space, that is, an area in which the words that are semantically similar are mapped in neighboring points. Word embedding is based on the distributional hypothesis, that is, the words that appear in a given context must share the same semantic meaning.

 

Each movie review will then be mapped into a real vector domain, where the similarity between words, in terms of meaning, translates to closeness in the vector space. Keras provides a convenient way to convert positive integer representations of words into word embedding by an embedding layer.

 

Here, we define the length of the embedding vector and the model:


embedding_vector_length = 32

model = Sequential()

The first layer is the embedding layer, which uses vectors of length 32 to represent each word:


model.add(Embedding(top_words,
                    embedding_vector_length,
                    input_length=max_review_length))
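Conceptually, an embedding layer is a lookup table with one row per word index. The sketch below illustrates that idea in plain Python. It is not how Keras implements the layer, and the helper names, the tiny vocabulary of 10 words, and the vector length of 4 are assumptions chosen to keep the output readable.

```python
import random

def make_embedding_table(vocab_size, dim, seed=7):
    """Create one small random vector per word index."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size)]

def embed(sequence, table):
    """Replace each integer in the sequence by its row in the table."""
    return [table[index] for index in sequence]

table = make_embedding_table(vocab_size=10, dim=4)
review = [0, 0, 3, 7, 3]          # a padded, integer-encoded review
vectors = embed(review, table)
print(len(vectors), len(vectors[0]))  # 5 4
```

A review of length 5 becomes 5 vectors of length 4, and the two occurrences of word 3 map to the same vector. During training, the rows of the table are adjusted like any other weights.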

 

The next layer is the LSTM layer, with 100 memory units. Finally, because this is a classification problem, we use a Dense output layer with a single neuron and a sigmoid activation function to make 0 or 1 predictions for the two classes (good and bad) in the problem:


model.add(LSTM(100))

model.add(Dense(1, activation='sigmoid'))

 

Because it is a binary classification problem, the binary_crossentropy function is used as a loss function, while the optimizer function used here is the Adam optimization algorithm (we also encountered it in a previous TensorFlow implementation):


model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

print(model.summary())
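For intuition about the loss, here is a minimal pure-Python sketch of a sigmoid output and the binary cross-entropy averaged over a batch. The function names, the clipping constant `eps`, and the sample predictions are assumptions for illustration, not the library's implementation.

```python
import math

def sigmoid(z):
    """Squash a raw score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Two reviews: the first positive (1), the second negative (0).
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

Confident correct predictions give a small loss, while confident wrong predictions are penalized heavily, which is the behavior the optimizer exploits.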

We fit only three epochs because the problem quickly overfits. A batch size of 64 reviews is used to space out weight updates:


model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=3,
          batch_size=64)
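As a quick sanity check on these settings, the number of weight updates follows directly from the dataset size and the batch size. The helper name `updates_per_epoch` is an assumption. The figures of 25,000 training reviews, a batch size of 64, and three epochs come from the text above.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """One weight update per batch; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)

steps = updates_per_epoch(25000, 64)
print(steps)      # 391
print(steps * 3)  # 1173
```

A larger batch size would mean fewer, smoother updates per epoch. A smaller one would mean more frequent, noisier updates.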

Then, we estimate the model's performance on unseen reviews:

scores = model.evaluate(X_test, y_test, verbose=0)

print("Accuracy: %.2f%%" % (scores[1]*100))

You can see that this simple LSTM, with little tuning, achieves good accuracy on the IMDB problem. Importantly, this is a template that you can use to apply LSTM networks to your own sequence classification problems.

 

Source code for the Keras movie classifier

Here is the complete source code; note how few lines it takes. If you get an error saying that there is no module called keras.datasets (or similar), install the keras package using the following command:


$ sudo pip install keras

 

Alternatively, download the source code of Keras from https://pypi.python.org, unzip the file, and run the source code using Python 3 (within the keras folder) as follows:


python keras_movie_classifier_1.py

import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence

# fix random seed for reproducibility
numpy.random.seed(7)

# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)

# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vector_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                    input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())

# train the model, validating on the test set after each epoch
model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=3, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))


Adding a convolutional layer

We can add one-dimensional CNN and max-pooling layers after the embedding layer, which will then feed the consolidated features to the LSTM.

 

Here is our embedding layer:


model = Sequential()

model.add(Embedding(top_words,\

embedding_vector_length,\

input_length=max_review_length))

We can apply a convolution layer with a small kernel of size 3 (kernel_size) and 32 output feature maps (filters):

model.add(Conv1D(filters=32, kernel_size=3, padding="same", activation="relu"))

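Next, we add a max-pooling layer; the size of the region to which max pooling is applied is equal to 2. We use MaxPooling1D rather than a global pooling layer, because the LSTM that follows still needs a sequence as input:

model.add(MaxPooling1D(pool_size=2))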
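The effect of a size-3 convolution with "same" padding followed by max pooling over regions of size 2 can be sketched on a single-channel sequence in plain Python. This is a simplified illustration under stated assumptions: one input channel, one filter, an odd kernel length, and hand-picked kernel weights. A real convolution layer learns many filters over many channels.

```python
def conv1d_same(signal, kernel):
    """Slide an odd-length kernel over the signal, zero-padding both ends
    so the output has the same length as the input."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]

def relu(values):
    """Replace negative values by zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=2):
    """Keep the maximum of each non-overlapping window of pool_size items."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]

signal = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]
features = relu(conv1d_same(signal, [0.5, 1.0, 0.5]))
pooled = max_pool1d(features, pool_size=2)
print(features)  # [2.5, 4.5, 6.0, 8.0, 6.5, 2.0]
print(pooled)    # [4.5, 8.0, 6.5]
```

The convolution keeps the sequence length at 6, and the pooling step halves it to 3, so the LSTM that follows has fewer time steps to process.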

The next layer is a LSTM layer, with 100 memory units:

model.add(LSTM(100))

 

The final layer is a Dense output layer, with a single neuron and a sigmoid activation function, to make 0 or 1 predictions for the two classes (good and bad) in the problem (that is, binary classification problem):


model.add(Dense(1, activation='sigmoid'))

 

Source code for movie classifier with a convolutional layer

The complete source code for the previous example is as follows:


import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.layers import Conv1D, MaxPooling1D

# fix random seed for reproducibility
numpy.random.seed(7)

# load the dataset but only keep the top n words, zero the rest
top_words = 5000
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=top_words)

# truncate and pad input sequences
max_review_length = 500
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vector_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length,
                    input_length=max_review_length))
model.add(Conv1D(filters=32, kernel_size=3, padding="same",
                 activation="relu"))
# pool over regions of size 2; the output is still a sequence for the LSTM
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())

model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=3, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))


Pretty Tensor

Pretty Tensor allows the developer to wrap TensorFlow operations, to quickly chain any number of layers to define neural networks.

 

The following is a simple example of the Pretty Tensor capabilities. We wrap a standard TensorFlow object, pretty, into a library-compatible object, then we feed it through three fully connected layers, to finally output a softmax distribution:


import tensorflow as tf
import prettytensor

# tf.placeholder takes the dtype first, then the shape
pretty = tf.placeholder(tf.float32, [None, 784])

softmax = (prettytensor.wrap(pretty)
           .fully_connected(256, tf.nn.relu)
           .fully_connected(128, tf.sigmoid)
           .fully_connected(64, tf.tanh)
           .softmax(10))

The Pretty Tensor installation is very simple; just use the pip installer:


sudo pip install prettytensor

 

Chaining layers

Pretty Tensor has three modes of operation, which share the ability to chain methods.

 

Normal mode

In normal mode, every time a method is called, a new Pretty Tensor is created. This allows for easy chaining, and yet you can still use any particular object multiple times. This makes it easy to branch your network.

 

Sequential mode

In sequential mode, an internal variable, the head, keeps track of the most recent output tensor, thus allowing for the breaking of call chains into multiple statements.

Here is a quick example:


seq = prettytensor.wrap(input_data).sequential()

seq.flatten()

seq.fully_connected(200, activation_fn=tf.nn.relu)

seq.fully_connected(10, activation_fn=None)

result = seq.softmax(labels, name=softmax_name)
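The difference between normal mode and sequential mode can be illustrated with a toy model in plain Python. The classes `Chain` and `SequentialChain` below are hypothetical stand-ins invented for this sketch. They are not part of the Pretty Tensor API and only record layer names instead of building a graph.

```python
class Chain:
    """Toy stand-in for a wrapped tensor in normal mode."""

    def __init__(self, ops=()):
        self.ops = tuple(ops)

    def layer(self, name):
        # Normal mode: return a new object and leave this one untouched,
        # so the same object can be reused to branch the network.
        return Chain(self.ops + (name,))

    def sequential(self):
        return SequentialChain(self.ops)


class SequentialChain:
    """Toy stand-in for sequential mode: the head is updated in place."""

    def __init__(self, ops=()):
        self.head = tuple(ops)

    def layer(self, name):
        self.head = self.head + (name,)
        return self


# Normal mode: two branches share the same base.
base = Chain().layer("flatten")
branch_a = base.layer("fc200")
branch_b = base.layer("fc10")

# Sequential mode: separate statements extend one chain.
seq = Chain().sequential()
seq.layer("flatten")
seq.layer("fc200")
seq.layer("fc10")
print(branch_a.ops, branch_b.ops, seq.head)
```

In normal mode, base is unchanged after each call, which makes branching easy. In sequential mode, each statement moves the head forward, so the call chain can be split across several lines.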

 

Branch and join

Complex networks can be built using the first-class methods branch and join:

Branch creates a separate Pretty Tensor object that points to the current head when it is called. This allows the user to define a separate tower, which either ends in a regression target or output, or rejoins the network. Rejoining allows the user to define composite layers, such as inception.

Join is used to join multiple inputs or to rejoin a composite layer.

 

Digit classifier

In this example, we'll define and train either a two-layer model or a convolutional model in the style of LeNet 5:


from six.moves import xrange

import tensorflow as tf

import prettytensor as pt

from prettytensor.tutorial import data_utils

tf.app.flags.DEFINE_string(

'save_path', None, 'Where to save the model checkpoints.')

FLAGS = tf.app.flags.FLAGS

BATCH_SIZE = 50

EPOCH_SIZE = 60000 // BATCH_SIZE

TEST_SIZE = 10000 // BATCH_SIZE

Since we are feeding our data as numpy arrays, we need to create placeholders in the graph. These must then be fed using the feed dict.

image_placeholder = tf.placeholder\

(tf.float32, [BATCH_SIZE, 28, 28, 1])

labels_placeholder = tf.placeholder\

(tf.float32, [BATCH_SIZE, 10])

tf.app.flags.DEFINE_string(
    'model', 'full', 'Choose one of the models, either full or conv')

FLAGS = tf.app.flags.FLAGS

We created the following function, multilayer_fully_connected. The first two layers are fully connected (100 neurons each), and the final layer is a softmax result layer. Note that chaining layers is a very simple operation:


def multilayer_fully_connected(images, labels):
    images = pt.wrap(images)
    with pt.defaults_scope(activation_fn=tf.nn.relu, l2loss=0.00001):
        return (images.flatten()
                .fully_connected(100)
                .fully_connected(100)
                .softmax_classifier(10, labels))

In the following, we'll build a multilayer convolutional network; the architecture is similar to that defined in LeNet 5. Please change this to experiment with other architectures:


def lenet5(images, labels):
    images = pt.wrap(images)
    with pt.defaults_scope(activation_fn=tf.nn.relu, l2loss=0.00001):
        return (images.conv2d(5, 20)
                .max_pool(2, 2)
                .conv2d(5, 50)
                .max_pool(2, 2)
                .flatten()
                .fully_connected(500)
                .softmax_classifier(10, labels))

Since we are feeding our data as numpy arrays, we need to create placeholders in the graph. These must then be fed using the feed dict:

def main(_=None):
    image_placeholder = tf.placeholder(tf.float32, [BATCH_SIZE, 28, 28, 1])
    labels_placeholder = tf.placeholder(tf.float32, [BATCH_SIZE, 10])

Depending on FLAGS.model, we use either the two-layer classifier or the convolutional classifier defined previously (this code continues inside main):

    if FLAGS.model == 'full':
        result = multilayer_fully_connected(image_placeholder,
                                            labels_placeholder)
    elif FLAGS.model == 'conv':
        result = lenet5(image_placeholder, labels_placeholder)
    else:
        raise ValueError('model must be full or conv: %s' % FLAGS.model)

Then we define the accuracy function for the evaluated classifier:

    accuracy = result.softmax.evaluate_classifier(
        labels_placeholder, phase=pt.Phase.test)

Next, we build the training and test sets:

    train_images, train_labels = data_utils.mnist(training=True)

    test_images, test_labels = data_utils.mnist(training=False)

We will use a gradient descent optimizer procedure and apply it to the graph. The pt.apply_optimizer function adds regularization losses and sets up a step counter:


    optimizer = tf.train.GradientDescentOptimizer(0.01)

    train_op = pt.apply_optimizer(optimizer, losses=[result.loss])

We can set save_path in the running session to automatically checkpoint every so often. Otherwise, at the end of the session, the model will be lost:

    runner = pt.train.Runner(save_path=FLAGS.save_path)

    with tf.Session():
        for epoch in xrange(10):
            # Shuffle the training data
            train_images, train_labels = data_utils.permute_data(
                (train_images, train_labels))

            runner.train_model(
                train_op, result.loss, EPOCH_SIZE,
                feed_vars=(image_placeholder, labels_placeholder),
                feed_data=pt.train.feed_numpy(
                    BATCH_SIZE, train_images, train_labels),
                print_every=100)

            classification_accuracy = runner.evaluate_model(
                accuracy, TEST_SIZE,
                feed_vars=(image_placeholder, labels_placeholder),
                feed_data=pt.train.feed_numpy(
                    BATCH_SIZE, test_images, test_labels))

            print('epoch', epoch + 1)
            print('accuracy', classification_accuracy)

if __name__ == '__main__':
    tf.app.run()

Running this example provides the following output:

>>>

Extracting /tmp/data/train-images-idx3-ubyte.gz

Extracting tmp/data/train-labels-idx1-ubyte.gz

Extracting /tmp/data/t10k-images-idx3-ubyte.gz

Extracting /tmp/data/t10k-labels-idx1-ubyte.gz

epoch = 1

Accuracy [0.8994]

epoch = 2

Accuracy [0.91549999]

epoch = 3

Accuracy [0.92259997]

epoch = 4

Accuracy [0.92760003]

epoch = 5

Accuracy [0.9303]

epoch = 6

Accuracy [0.93870002]

epoch = 7

epoch = 8

Accuracy [0.94700003]

epoch = 9

Accuracy [0.94910002]

epoch = 10

Accuracy [0.94980001]

Source code for digit classifier

The following is the full source code for the digit classifier previously described:


from six.moves import xrange
import tensorflow as tf
import prettytensor as pt
from prettytensor.tutorial import data_utils

tf.app.flags.DEFINE_string(
    'save_path', None, 'Where to save the model checkpoints.')
tf.app.flags.DEFINE_string(
    'model', 'full', 'Choose one of the models, either full or conv')
FLAGS = tf.app.flags.FLAGS

BATCH_SIZE = 50
EPOCH_SIZE = 60000 // BATCH_SIZE
TEST_SIZE = 10000 // BATCH_SIZE


def multilayer_fully_connected(images, labels):
    images = pt.wrap(images)
    with pt.defaults_scope(activation_fn=tf.nn.relu, l2loss=0.00001):
        return (images.flatten()
                .fully_connected(100)
                .fully_connected(100)
                .softmax_classifier(10, labels))


def lenet5(images, labels):
    images = pt.wrap(images)
    with pt.defaults_scope(activation_fn=tf.nn.relu, l2loss=0.00001):
        return (images.conv2d(5, 20)
                .max_pool(2, 2)
                .conv2d(5, 50)
                .max_pool(2, 2)
                .flatten()
                .fully_connected(500)
                .softmax_classifier(10, labels))


def main(_=None):
    # placeholders that are fed with numpy arrays at run time
    image_placeholder = tf.placeholder(tf.float32, [BATCH_SIZE, 28, 28, 1])
    labels_placeholder = tf.placeholder(tf.float32, [BATCH_SIZE, 10])

    if FLAGS.model == 'full':
        result = multilayer_fully_connected(image_placeholder,
                                            labels_placeholder)
    elif FLAGS.model == 'conv':
        result = lenet5(image_placeholder, labels_placeholder)
    else:
        raise ValueError('model must be full or conv: %s' % FLAGS.model)

    accuracy = result.softmax.evaluate_classifier(
        labels_placeholder, phase=pt.Phase.test)

    train_images, train_labels = data_utils.mnist(training=True)
    test_images, test_labels = data_utils.mnist(training=False)

    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train_op = pt.apply_optimizer(optimizer, losses=[result.loss])
    runner = pt.train.Runner(save_path=FLAGS.save_path)

    with tf.Session():
        for epoch in xrange(10):
            # shuffle the training data
            train_images, train_labels = data_utils.permute_data(
                (train_images, train_labels))

            runner.train_model(
                train_op, result.loss, EPOCH_SIZE,
                feed_vars=(image_placeholder, labels_placeholder),
                feed_data=pt.train.feed_numpy(
                    BATCH_SIZE, train_images, train_labels),
                print_every=100)

            classification_accuracy = runner.evaluate_model(
                accuracy, TEST_SIZE,
                feed_vars=(image_placeholder, labels_placeholder),
                feed_data=pt.train.feed_numpy(
                    BATCH_SIZE, test_images, test_labels))

            print('epoch', epoch + 1)
            print('accuracy', classification_accuracy)


if __name__ == '__main__':
    tf.app.run()

 

TFLearn

TFLearn is a library that wraps many of the newer TensorFlow APIs in the nice and familiar scikit-learn style of interface. TensorFlow is all about building and executing a graph. This is a very powerful concept, but it is also cumbersome to start with. Under the hood, TFLearn has just three parts:

 

Layers: This is a set of advanced TensorFlow functions that allows you to easily build complex graphs, from fully connected layers, convolution, and batch norm, to losses and optimization.

 

Graph actions: This is a set of tools to perform training and evaluating, and run inference on TensorFlow graphs.

Estimator: This packages everything in a class that follows the scikit-learn interface, and provides a way to easily build and train custom TensorFlow models. Subclasses of Estimator, such as linear classifier, linear regressor, DNN classifier, and so on, are pre-packaged models similar to scikit-learn logistic regression that can be used in one line.

 

TFLearn installation

To install TFLearn, the easiest way is to run the following command, which installs the latest development version:


pip install git+https://github.com/tflearn/tflearn.git

For the latest stable version, use the following command:

pip install tflearn

Alternatively, you can also install from source by running (from the source folder) the following:

python setup.py install

 

Titanic survival predictor

In this tutorial, we will learn to use TFLearn and TensorFlow to model the survival chance of Titanic passengers using their personal information (such as gender, age, and so on). To tackle this classic machine learning task, we are going to build a DNN classifier.

 

Let's take a look at the dataset (TFLearn will automatically download it for you). For each passenger, the following information is provided:


survived   Survived (0 = No; 1 = Yes)
pclass     Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name       Name
sex        Sex
age        Age
sibsp      Number of Siblings/Spouses Aboard
parch      Number of Parents/Children Aboard
ticket     Ticket Number
fare       Passenger Fare

Here are some samples extracted from the dataset:

survived  pclass  name                            sex     age   sibsp  parch  ticket
1         1       Aubart, Mme. Leontine Pauline   female  24    0      0      PC 17477
0         2       Bowenur, Mr. Solomon            male    42    0      0      21153
1         3       Baclini, Miss. Marie Catherine  female  5     2      1      2666
0         3       Youseff, Mr.                    male    45.5  0      0      2628

 

There are two classes in our task: not survived (class = 0) and survived (class = 1), and the passenger data has eight features.

 

The Titanic dataset is stored in a CSV file, so we can use the TFLearn load_csv() function to load the data from the file into a Python list. We specify the target_column argument to indicate that our labels (survived or not) are located in the first column (index 0). The function returns a tuple (data, labels).

 

Let's start by importing the numpy and TFLearn libraries:


import numpy as np

import tflearn

Download the titanic dataset:

from tflearn.datasets import titanic

titanic.download_dataset('titanic_dataset.csv')

Load the CSV file, indicating that the first column represents labels:

from tflearn.data_utils import load_csv

data, labels = load_csv('titanic_dataset.csv', target_column=0, categorical_labels=True, n_classes=2)
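To clarify what the target_column, categorical_labels, and n_classes arguments ask for, here is a rough standard-library sketch of the same idea. It is not the TFLearn implementation: the function name `load_csv_sketch` is hypothetical, the sketch assumes the file starts with a header row, and the two passengers in the sample text are invented for illustration.

```python
import csv
import io

def load_csv_sketch(text, target_column=0, n_classes=2):
    """Split each row into features and a one-hot encoded label."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row (assumed to be present)
    data, labels = [], []
    for row in reader:
        # Remove the label from the row; the rest are the features.
        label = int(row.pop(target_column))
        one_hot = [0.0] * n_classes
        one_hot[label] = 1.0
        labels.append(one_hot)
        data.append(row)
    return data, labels

# Invented sample rows, not taken from the real dataset.
sample = ('survived,pclass,name,sex\n'
          '1,1,"Doe, Miss. Jane",female\n'
          '0,3,"Roe, Mr. John",male\n')
sample_data, sample_labels = load_csv_sketch(sample)
print(sample_data)
print(sample_labels)
```

With two classes, a survivor becomes [0.0, 1.0] and a non-survivor becomes [1.0, 0.0], which matches the two-unit softmax output used later.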

The data needs some preprocessing before it can be used in our DNN classifier. In particular, we must delete the columns that don't help in our analysis. We discard the name and ticket fields because we assume that a passenger's name and ticket are not correlated with their chance of surviving:


def preprocess(data, columns_to_ignore):
    # Delete the ignored columns, starting from the highest index
    for id in sorted(columns_to_ignore, reverse=True):
        [r.pop(id) for r in data]
    for i in range(len(data)):
        # Convert the sex field to a float so it can be used numerically
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

As already described, the fields name and ticket will be ignored by the analysis:

to_ignore=[1, 6]

Here, we call the preprocess procedure:

data = preprocess(data, to_ignore)
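The reason for deleting columns in descending index order can be checked with a short plain-Python sketch. The helper `drop_columns` is hypothetical, and the row below uses the column names themselves as values so that it is easy to see which fields survive.

```python
def drop_columns(row, columns_to_ignore):
    """Remove the given column indices, highest index first."""
    row = list(row)
    for index in sorted(columns_to_ignore, reverse=True):
        row.pop(index)
    return row

# Feature columns after the label has been removed by load_csv.
row = ["pclass", "name", "sex", "age", "sibsp", "parch", "ticket", "fare"]

kept = drop_columns(row, [1, 6])
print(kept)   # ['pclass', 'sex', 'age', 'sibsp', 'parch', 'fare']

# Deleting in ascending order shifts the later indices by one,
# so index 6 no longer points at "ticket".
wrong = list(row)
for index in [1, 6]:
    wrong.pop(index)
print(wrong)  # ['pclass', 'sex', 'age', 'sibsp', 'parch', 'ticket']
```

Six columns remain, which is why the input layer below is declared with six features, and sex ends up at index 1, where the conversion to float expects it.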

First of all, we specify the shape of our input data. The input sample has a total of 6 features, and we will process samples per batch to save memory, so our data input shape is [None, 6]. The None parameter means an unknown dimension, so we can change the total number of samples that are processed in a batch:

net = tflearn.input_data(shape=[None, 6])

Finally, we build a three-layer neural network with this simple sequence of statements:

net = tflearn.fully_connected(net, 32)

net = tflearn.fully_connected(net, 32)

net = tflearn.fully_connected(net, 2, activation='softmax')

net = tflearn.regression(net)

TFLearn provides a model wrapper DNN that can automatically perform neural network classifier tasks:


model = tflearn.DNN(net)

We will run it for 10 epochs, with a batch size of 16:

model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

Running the model, you should have an output as follows:

Training samples: 1309
Validation samples: 0
--
Training Step: 82 | total loss: 0.64003
| Adam | epoch: 001 | loss: 0.64003 - acc: 0.6620 -- iter: 1309/1309
--
Training Step: 164 | total loss: 0.61915
| Adam | epoch: 002 | loss: 0.61915 - acc: 0.6614 -- iter: 1309/1309
--
Training Step: 246 | total loss: 0.56067
| Adam | epoch: 003 | loss: 0.56067 - acc: 0.7171 -- iter: 1309/1309
--
Training Step: 328 | total loss: 0.51807
| Adam | epoch: 004 | loss: 0.51807 - acc: 0.7799 -- iter: 1309/1309
--
Training Step: 410 | total loss: 0.47475
| Adam | epoch: 005 | loss: 0.47475 - acc: 0.7962 -- iter: 1309/1309
--
Training Step: 492 | total loss: 0.51677
| Adam | epoch: 006 | loss: 0.51677 - acc: 0.7701 -- iter: 1309/1309
--
Training Step: 574 | total loss: 0.48988
| Adam | epoch: 007 | loss: 0.48988 - acc: 0.7891 -- iter: 1309/1309
--
Training Step: 656 | total loss: 0.55073
| Adam | epoch: 008 | loss: 0.55073 - acc: 0.7427 -- iter: 1309/1309
--
Training Step: 738 | total loss: 0.50242
| Adam | epoch: 009 | loss: 0.50242 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 820 | total loss: 0.41557
| Adam | epoch: 010 | loss: 0.41557 - acc: 0.8110 -- iter: 1309/1309

 

The final training accuracy is around 81%, which means that the model predicts the correct outcome (survived or not) for about 81% of the passengers it was trained on.

Finally, evaluate the model to get the final accuracy:


accuracy = model.evaluate(data, labels, batch_size=16)

print('Accuracy: ', accuracy)

The following is the output:

Accuracy: [0.78456837289473591]
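The accuracy figure is simply the fraction of passengers whose most probable predicted class matches the true class. A minimal sketch, assuming one-hot labels as produced above and using invented predictions (the function name `accuracy_score_sketch` is hypothetical), looks like this:

```python
def accuracy_score_sketch(probabilities, one_hot_labels):
    """Fraction of rows where the most probable class is the true class."""
    correct = 0
    for probs, label in zip(probabilities, one_hot_labels):
        predicted = probs.index(max(probs))
        actual = label.index(max(label))
        if predicted == actual:
            correct += 1
    return correct / len(one_hot_labels)

# Invented softmax outputs and labels for four passengers.
example_preds = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
example_labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
print(accuracy_score_sketch(example_preds, example_labels))  # 0.75
```

Note that the evaluation above uses the same data the model was trained on, so it says little about how well the model generalizes to unseen passengers.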

Source code for titanic classifier

The full code for the implemented classifier is as follows:

import numpy as np
import tflearn
from tflearn.datasets import titanic
from tflearn.data_utils import load_csv

# Download the Titanic dataset
titanic.download_dataset('titanic_dataset.csv')

# Load the CSV file; the first column holds the labels
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)

def preprocess(data, columns_to_ignore):
    # Delete the ignored columns, highest index first so earlier indices stay valid
    for col_id in sorted(columns_to_ignore, reverse=True):
        [r.pop(col_id) for r in data]
    # Encode the 'sex' field (index 1 after the deletions) as a float
    for i in range(len(data)):
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

# Ignore the 'name' and 'ticket' columns (indices 1 and 6)
to_ignore = [1, 6]
data = preprocess(data, to_ignore)

# Build the network: two hidden layers and a softmax output layer
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Train the model
model = tflearn.DNN(net)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

# Evaluate the model
accuracy = model.evaluate(data, labels, batch_size=16)
print('Accuracy: ', accuracy)
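To see what the preprocessing step does without downloading the dataset or installing TFLearn, the following sketch applies the same logic to two hypothetical rows. The passenger values are invented for illustration, and the string-to-float conversion is written out explicitly here rather than left to NumPy.

```python
import numpy as np

def preprocess(data, columns_to_ignore):
    # Delete the ignored columns, highest index first so earlier indices stay valid
    for col_id in sorted(columns_to_ignore, reverse=True):
        for row in data:
            row.pop(col_id)
    # After the deletions, 'sex' sits at index 1; encode it as 1.0 or 0.0
    for row in data:
        row[1] = 1. if row[1] == 'female' else 0.
    # Convert every remaining field to a float explicitly
    return np.array([[float(v) for v in row] for row in data], dtype=np.float32)

# Two hypothetical rows with the label column already removed:
# pclass, name, sex, age, sibsp, parch, ticket, fare
sample = [
    ['3', 'Jack Dawson', 'male', '19', '0', '0', 'N/A', '5.0'],
    ['1', 'Rose DeWitt Bukater', 'female', '17', '1', '2', 'N/A', '100.0'],
]

features = preprocess(sample, [1, 6])
print(features.shape)  # (2, 6): six numeric features per passenger
print(features)
```

The six remaining columns match the shape=[None, 6] declared for the network's input layer.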

 

Summary

In this blog, we explored three TensorFlow-based libraries for deep learning research and development.

  • We gave an overview of Keras, which is designed for minimalism and modularity, allowing the user to quickly define deep learning models.
  • Using Keras, we learned how to develop a simple single-layer LSTM model for the IMDB movie review sentiment classification problem.
  • Then, we briefly introduced Pretty Tensor, which allows the developer to wrap TensorFlow operations and chain any number of layers.
  • We implemented a convolutional model in the style of LeNet to solve the handwritten digit classification problem.

 

The final library we looked at was TFLearn, which wraps many TensorFlow APIs. In the example application, we learned to use TFLearn to estimate the survival chance of Titanic passengers. To tackle this task, we built a deep neural network classifier.

 

The next blog introduces reinforcement learning. We'll explore the basic principles and algorithms. We'll also look at some example applications, using TensorFlow and the OpenAI Gym framework, which is a powerful toolkit for developing and comparing reinforcement learning algorithms.

 

Advanced Multimedia Programming with TensorFlow

In this blog, we will discuss advanced multimedia programming using TensorFlow. We will look at some emerging research problems, such as deep neural networks for scalable object detection, and at deep learning on Android using TensorFlow, with an example.

 

Image and video analysis is the extraction of meaningful information and patterns from images or videos. A huge number of images and videos are generated every day, and the ability to analyze this data has great potential for building services on top of that analysis.

 

In this blog, we will go through deep learning examples that cover image analysis as well as video analysis (just a demonstration of what video analysis and other complex deep learning examples will look like after the integration of TensorFlow and Keras) and see how we can get meaningful information out of them.
