Five Popular Data Augmentation techniques In Deep Learning

August 31, 2020 Niteesha Balla

As Alan Turing said:

What we want is a machine that can learn from experience.

A machine gains that experience from data: the more data we feed it, the more it can learn. For deep learning models in particular, more data is key to building high-performance models.

If we cannot feed enough data, the deep learning model cannot learn the underlying patterns and performs poorly. The data also needs to be diverse; otherwise, even with a large amount of data, the model will overfit.

So we clearly need large amounts of data to build deep learning models, but we will not always have enough.

Should we stop building the model in such cases?

No. We need to find ways to use the available data to generate more data with more diversity. In classical machine learning, a similar limited-data problem is often handled with oversampling.

In the same way, for deep learning models we use data augmentation methods to create more meaningful data for training.

So let’s dive in.

Below are the concepts you are going to learn in this article.

Table of Contents

What is Data Augmentation?

Why do we need Data Augmentation?

Where do we apply Data Augmentation?

Data Augmentation Techniques

What is Data Augmentation?

Data augmentation is the process of turning a limited dataset into a larger, more diverse, and still meaningful one. In other words, we artificially increase the size of the dataset by creating different versions of the existing data.

The main reason for doing this is that real-world data does not always come in an ideal form.

For example, consider a car in an image. The car may not always be at the center; sometimes it is on the left side of the image, sometimes on the right. The image may have been taken on a bright sunny day or on a cloudy day. It might show the left view of the car or the right view.

All these factors affect how the model evaluates an image. The model should be trained so that it detects the object accurately regardless of these factors.

We can apply data augmentation to different types of data, but this article focuses on commonly used image data augmentation techniques.

Why do we need Data Augmentation?

Most state-of-the-art models have parameters on the order of millions.

To produce accurate results, a model needs many parameters so that it can learn most of the features in the data, and fitting all of those parameters takes a good amount of data. Deep learning models therefore often require more data than is available.

“What do we do if we have too little data, or imbalanced data?”

We do not need to search Google for new images. We can apply a few techniques to generate ten times as many images as we have in our dataset, or even more.

In the case of imbalanced data, we can generate more images for the class that has less data.
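As a rough illustration of the imbalanced case, the sketch below tops up under-represented classes with horizontally flipped copies until every class is as large as the biggest one. The function name `oversample_minority`, the random stand-in data, and the choice of a flip as the only transformation are assumptions made for this example; a real pipeline would usually mix several random transformations to avoid near-duplicates.

```python
import numpy as np

def oversample_minority(images, labels, rng):
    """Add horizontally flipped copies of under-represented classes
    until every class has as many samples as the largest one."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()

    new_images, new_labels = [images], [labels]
    for cls, count in zip(classes, counts):
        missing = int(target - count)
        if missing == 0:
            continue
        # Pick existing samples of this class (with replacement) and flip them.
        idx = rng.choice(np.flatnonzero(labels == cls), size=missing, replace=True)
        new_images.append(images[idx][:, :, ::-1, :])  # reverse the width axis
        new_labels.append(np.full(missing, cls, dtype=labels.dtype))

    return np.concatenate(new_images), np.concatenate(new_labels)

# Hypothetical data: 8 images of class 0 and only 2 images of class 1.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
labels = np.array([0] * 8 + [1] * 2)

balanced_images, balanced_labels = oversample_minority(images, labels, rng)
print(np.bincount(balanced_labels))
```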

Where do we apply Data Augmentation?

We can apply this technique at the data generation stage, after preprocessing and before training.

We apply this technique only to the training dataset. At test time we use the test image directly, without any transformations.

For small datasets, we can generate the transformed images up front and train the model on all the data at once. For large datasets, we can generate unique transformed images for every batch of an epoch.
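The per-batch approach can be sketched with a small generator. This is a NumPy-only illustration, not the Keras implementation: the name `augmented_batches`, the random stand-in images, and the use of a random horizontal flip as the only transformation are assumptions for the example.

```python
import numpy as np

def augmented_batches(images, batch_size, rng):
    """Yield shuffled batches for one epoch, flipping each image
    horizontally with probability 0.5."""
    order = rng.permutation(len(images))
    for start in range(0, len(images), batch_size):
        # Fancy indexing returns a copy, so the originals stay untouched.
        batch = images[order[start:start + batch_size]]
        flip = rng.random(len(batch)) < 0.5
        batch[flip] = batch[flip][:, :, ::-1, :]
        yield batch

rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)

for epoch in range(2):
    for batch in augmented_batches(train_images, batch_size=4, rng=rng):
        pass  # a training step on `batch` would go here
```

Because the flips are drawn again on every call, each epoch sees a different set of transformed images while the stored dataset never grows.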

Data Augmentation Techniques

Below are some of the most popular data augmentation techniques widely used in deep learning.

Random Rotation
Flip (Horizontal and Vertical)
Zoom
Random Shift
Brightness

To get a better understanding of these data augmentation techniques we are going to use a cat image.

The first step is to read it using the matplotlib library.

Below is the code to read the image:

We are going to pass the image to the ImageDataGenerator class from Keras, which applies the transformations and returns the data in batches.

ImageDataGenerator expects input of shape (batch_size, height, width, channels), but our image has shape (height, width, channels).

So, let's reshape our image into the desired shape.
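One common way to add the batch axis is `np.expand_dims`. The snippet below is a minimal sketch that uses a blank array of an assumed size as a stand-in for the loaded cat image.

```python
import numpy as np

# Stand-in for the loaded image: height 64, width 48, 3 color channels.
image = np.zeros((64, 48, 3), dtype=np.uint8)

# Add a leading batch axis so the shape becomes (1, height, width, channels).
batch = np.expand_dims(image, axis=0)

print(image.shape, "->", batch.shape)  # (64, 48, 3) -> (1, 64, 48, 3)
```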

We have to create an instance of ImageDataGenerator and pass the transformations as parameters.

Replace the above code cell with the respective code cells from the techniques below to apply the transformations.

Now we need to pass the image to the generator's flow method, which generates the transformed images.

After that, let’s view our image using matplotlib without any augmentations.

Below is the loaded cat image.

Now, let's dive into the details of the data augmentation techniques and apply them on our image.

Random Rotation

We can rotate the image by some angle. To the model, each rotated image is a new, unique sample. Depending on the object in the image, the rotation can be anywhere up to 360 degrees.

In this example we set rotation_range = 50, which ImageDataGenerator treats as the range [-50, 50] degrees; it picks a random angle from that range and applies it to the image.
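To make the idea concrete, here is a NumPy-only sketch of drawing a random angle from [-50, 50] and rotating an image about its center. It is an illustration, not what Keras does internally: the helper `rotate_nearest` is hypothetical, uses nearest-neighbor sampling, and fills the uncovered corners with zeros.

```python
import numpy as np

def rotate_nearest(image, angle_deg):
    """Rotate an (H, W, C) image about its center using nearest-neighbor
    sampling. Pixels that map outside the source image stay at zero."""
    h, w = image.shape[:2]
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0

    # Inverse mapping: for every output pixel, find the source pixel.
    ys, xs = np.mgrid[0:h, 0:w]
    x0, y0 = xs - cx, ys - cy
    src_x = np.rint(np.cos(theta) * x0 + np.sin(theta) * y0 + cx).astype(int)
    src_y = np.rint(-np.sin(theta) * x0 + np.cos(theta) * y0 + cy).astype(int)

    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    rotated = np.zeros_like(image)
    rotated[valid] = image[src_y[valid], src_x[valid]]
    return rotated

rng = np.random.default_rng(0)
rotation_range = 50
angle = rng.uniform(-rotation_range, rotation_range)  # random angle in [-50, 50]

image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in image
rotated = rotate_nearest(image, angle)
```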

Rotation technique

Flip

The image can be flipped horizontally or vertically, depending on the object in the image.

For example, an image of a car should not be flipped vertically, because the result is an upside-down car. However, it can be flipped horizontally, which produces the left and right views of the car.

For some objects a vertical flip should be avoided, as it changes the meaning of the image entirely. The flip transformations below are only meant to illustrate the concept.
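On the array level, a flip is just a reversal along one axis. The sketch below uses a tiny made-up 2x3 single-channel array in place of a real image so the effect is easy to read.

```python
import numpy as np

# Tiny 2x3 single-channel "image" with shape (height, width, channels).
image = np.array([[1, 2, 3],
                  [4, 5, 6]])[..., np.newaxis]

horizontal = image[:, ::-1, :]  # mirror left-right: reverse the width axis
vertical = image[::-1, :, :]    # mirror top-bottom: reverse the height axis

print(horizontal[..., 0])  # rows: [3 2 1] and [6 5 4]
print(vertical[..., 0])    # rows: [4 5 6] and [1 2 3]
```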

Horizontal Flip

Horizontal flip technique

Vertical flip technique

Zoom

The image can be zoomed in or out with the zoom augmentation.

The ImageDataGenerator class accepts a single float value or a list of two values:

If a single value is given, the zoom range is [1-value, 1+value].
If a list is given, the first value is taken as the lower limit and the second as the upper limit.

The image is randomly zoomed in or out within the given range.
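The two ways of specifying the range can be sketched as follows. Both helpers are hypothetical and written for illustration only. In this sketch the factor scales the source coordinates, so a factor below 1 magnifies the image and a factor above 1 shrinks it, with the exposed border filled with zeros.

```python
import numpy as np

def resolve_zoom_range(zoom_range):
    """Turn a single float or a (lower, upper) pair into explicit limits."""
    if np.isscalar(zoom_range):
        return 1.0 - zoom_range, 1.0 + zoom_range
    lower, upper = zoom_range
    return float(lower), float(upper)

def zoom_nearest(image, factor):
    """Zoom an (H, W, C) image about its center with nearest-neighbor sampling."""
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Scale the distance from the center to find each output pixel's source.
    src_y = np.rint((ys - cy) * factor + cy).astype(int)
    src_x = np.rint((xs - cx) * factor + cx).astype(int)

    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    zoomed = np.zeros_like(image)
    zoomed[valid] = image[src_y[valid], src_x[valid]]
    return zoomed

rng = np.random.default_rng(0)
lower, upper = resolve_zoom_range(0.3)  # single value -> roughly (0.7, 1.3)
factor = rng.uniform(lower, upper)      # random zoom factor within the range

image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in image
zoomed = zoom_nearest(image, factor)
```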

Zoom technique

Random Shift

The pixels of the image can be shifted horizontally or vertically.

The ImageDataGenerator class accepts two types of values (float and int):

If a float value is given, it is treated as a fraction of the width or height by which to shift the image.
If an int value is given, the image is shifted by that many pixels along the width or height.

Width Shift

The width_shift_range shifts the pixels horizontally either to the left or to the right randomly.

Width shift technique

Height Shift

The height_shift_range shifts the pixels vertically either to the top or to the bottom randomly.
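The float-versus-int rule and the shift itself can be sketched in plain NumPy. The helpers `to_pixels` and `shift_image`, the example values 0.2 and 3, and the zero fill for vacated pixels are all assumptions made for this illustration rather than the Keras implementation.

```python
import numpy as np

def to_pixels(shift_range, size):
    """Float: fraction of the image size. Int: number of pixels."""
    if isinstance(shift_range, float):
        return int(round(shift_range * size))
    return int(shift_range)

def shift_image(image, dx, dy):
    """Shift an (H, W, C) image dx pixels right and dy pixels down.
    Negative values shift left or up. Vacated pixels are set to zero."""
    h, w = image.shape[:2]
    dx = int(np.clip(dx, -w, w))
    dy = int(np.clip(dy, -h, h))
    shifted = np.zeros_like(image)
    shifted[max(0, dy):min(h, h + dy), max(0, dx):min(w, w + dx)] = \
        image[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    return shifted

rng = np.random.default_rng(0)
image = rng.integers(1, 256, size=(20, 30, 3), dtype=np.uint8)  # stand-in image

max_dx = to_pixels(0.2, image.shape[1])  # float: 20% of the width (30) -> 6 pixels
max_dy = to_pixels(3, image.shape[0])    # int: 3 pixels

dx = int(rng.integers(-max_dx, max_dx + 1))  # random width shift
dy = int(rng.integers(-max_dy, max_dy + 1))  # random height shift
shifted = shift_image(image, dx, dy)
```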

Height shift technique

Brightness

Brightness is an important factor when training the model. We cannot be sure that images are always taken in good lighting, so our model needs to identify the object even in poor lighting.

The ImageDataGenerator class accepts a range of values and sets the brightness of an image to a random value from that range.
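A simple way to picture this is to scale every pixel by a random factor drawn from the range. The sketch below assumes 8-bit images and an example range of (0.5, 1.5); the helper `adjust_brightness` is hypothetical and only approximates what an image library would do.

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` and clip to the valid 8-bit range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
brightness_range = (0.5, 1.5)  # assumed range: below 1 darkens, above 1 brightens
factor = rng.uniform(*brightness_range)

image = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)  # stand-in image
adjusted = adjust_brightness(image, factor)
```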

Brightness technique

We can apply all of these transformations together, depending on the context of our dataset.

Below is the complete code for the Data Augmentation.

Complete Code

Conclusion

More and more models are being trained every day, but only the ones that give accurate results are truly useful. The augmentation techniques above help the model generalize by reducing overfitting, which in turn improves its accuracy.

These techniques apply only to computer vision problems with image datasets. There are also techniques for generating synthetic data for other types of datasets.

Try the ones that best suit your problem to improve the accuracy of your models.