This article was first published on **Appsilon Data Science Blog** and kindly contributed to R-bloggers.


## Before we start…

We hope you found the first half of this post useful and interesting. Before we dive into the code, I want to explain a few important aspects of data science. Firstly, implementing data science in practice is always a research process. The goals we set strongly influence the methods we choose. Chasing even a marginal gain in accuracy or precision can substantially lengthen a project. Development also depends heavily on the data: reproducing the same results on a different data set is not always straightforward.

Furthermore, I want to describe why we use GPUs rather than CPUs to train our models, which comes down to a difference in design. CPUs have only a few cores, and each core generally works on a single process at a time. GPUs, on the other hand, have hundreds or even thousands of weaker cores.

Technically speaking, training a model consists of thousands of small, largely independent numerical operations. Many of these can run at the same time on a GPU, which greatly reduces training time. The difference is most apparent in Deep Learning.

## The data

Before we start changing our CNN’s architecture, there are some things we can do when preparing our data. As a reminder, we’ve got 2800 satellite images (80 pixels high, 80 pixels wide, 3 color channels in RGB). This isn’t a huge sample, especially in Deep Learning, but it will do for our needs. In situations like this, a common practice is to apply geometric transformations (rotation, translation, thickening, blurring etc.) to enlarge the training set. For example, in R we can use the **rot90** function from the **pracma** package to create images rotated by 90, 180, or 270 degrees. The data-preparation code then has to be modified slightly to add the rotated copies to the training set.
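The rotation step could be sketched as follows. This is a minimal sketch under stated assumptions: the images are held in a 4-D array of shape (n, 80, 80, 3) with a matching label vector, and `rotate_image()`, `augment_with_rotations()`, `train_x` and `train_y` are hypothetical names introduced for illustration. Only `rot90()` comes from **pracma**; because it operates on matrices, each color channel is rotated separately.

```r
library(pracma)  # rot90(A, k) rotates a matrix by k * 90 degrees

# Rotate a single image (height x width x channels) by k * 90 degrees.
rotate_image <- function(img, k) {
  n_channels <- dim(img)[3]
  rotated <- lapply(seq_len(n_channels), function(ch) rot90(img[, , ch], k))
  array(unlist(rotated), dim = c(dim(rotated[[1]]), n_channels))
}

# Build a training set four times the original size: the original images
# plus copies rotated by 90, 180 and 270 degrees. Assumes square images,
# so rotation leaves the array shape unchanged.
augment_with_rotations <- function(images, labels) {
  n <- dim(images)[1]
  augmented <- array(0, dim = c(4 * n, dim(images)[-1]))
  augmented[seq_len(n), , , ] <- images  # unrotated originals come first
  for (k in 1:3) {
    for (i in seq_len(n)) {
      augmented[k * n + i, , , ] <- rotate_image(images[i, , , ], k)
    }
  }
  # Labels are unchanged by rotation, so the label vector is simply repeated.
  list(x = augmented, y = rep(labels, times = 4))
}

# Toy data standing in for the real satellite images (hypothetical values).
set.seed(42)
train_x <- array(runif(5 * 80 * 80 * 3), dim = c(5, 80, 80, 3))
train_y <- c(0, 1, 0, 1, 1)

augmented <- augment_with_rotations(train_x, train_y)
dim(augmented$x)     # 20 80 80 3
length(augmented$y)  # 20
```

Rotations suit this kind of data because a satellite image has no natural "up" direction, so a rotated image is still a plausible training example with the same label.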

## CNN’s architecture

We can change the architecture of our ConvNet in many different ways. The first and simplest thing we can try is to add more layers. Our initial network looks like this:

We will add more of the layer types mentioned earlier (convolutional, pooling, activation), and we can also add some new ones. As the network gets bigger and more complicated, it becomes more prone to overfitting. To prevent this we can use a regularization method called **dropout**. During training, each node is kept with probability **p** and temporarily removed from the network with probability **1-p**. To add dropout to a convolutional neural network in Keras we can use the **layer_dropout()** function, whose **rate** parameter is the fraction of units to drop (that is, **1-p**). Our example architecture could look like this:
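A minimal sketch of such an architecture, assuming the **keras** R package with a working TensorFlow backend, 80x80 RGB inputs, and two output classes. The filter counts, layer sizes, and dropout rates are illustrative assumptions, not the values used for the model reported in this post.

```r
library(keras)

# Illustrative network: two convolution/pooling stages, each followed by
# dropout, then a dense classifier. All sizes and rates are assumptions.
model <- keras_model_sequential() %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu",
                padding = "same", input_shape = c(80, 80, 3)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%   # drops 25% of units during training
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu",
                padding = "same") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%    # heavier dropout before the output layer
  layer_dense(units = 2, activation = "softmax")  # assumed: two classes

summary(model)
```

Dropout is only active during training; at prediction time Keras uses all units, so no change is needed in the evaluation code.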

## Optimizer

After preparing our training set and setting up the architecture, we can choose a loss function and an optimization algorithm. In Keras, you can choose from several algorithms, ranging from simple **Stochastic Gradient Descent** to more adaptive ones such as **Adaptive Moment Estimation** (Adam). Choosing a good optimizer can be crucial. In the Keras R interface, optimizer constructors share the **optimizer_** prefix, for example **optimizer_sgd()** and **optimizer_adam()**.
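As a rough illustration, the optimizer is passed to `compile()` together with the loss. The sketch below assumes the **keras** R package with a TensorFlow backend and uses a deliberately tiny stand-in model (`build_model()` is hypothetical, not the network from this post). Both optimizers are created with their default settings, because the name of the learning-rate argument differs between package versions.

```r
library(keras)

# Tiny stand-in model, only here so that compile() has something to work on.
build_model <- function() {
  keras_model_sequential() %>%
    layer_flatten(input_shape = c(80, 80, 3)) %>%
    layer_dense(units = 2, activation = "softmax")
}

# Plain stochastic gradient descent with default settings.
model_sgd <- build_model()
model_sgd %>% compile(
  loss = "categorical_crossentropy",  # assumes one-hot encoded labels
  optimizer = optimizer_sgd(),
  metrics = c("accuracy")
)

# Adam (Adaptive Moment Estimation) with default settings.
model_adam <- build_model()
model_adam %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_adam(),
  metrics = c("accuracy")
)
```

Compiling the same architecture with different optimizers, as here, is a simple way to compare them on equal terms before committing to one.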

## Results

The figure below shows the accuracy and the loss function (cross-entropy) before (Model 1) and after (Model 2) the modifications. Validation accuracy rises noticeably (from 0.7449 to 0.9828), and the loss falls (from 0.556 to 0.04573).

I also ran both models on CPU and on GPU. The computation times are below:

Machine specifications:

**Processor**: Intel Core i7-7700HQ,

**Memory**: 32GB DDR4-2133MHz,

**Graphics card**: NVIDIA GeForce GTX 1070, 8GB GDDR5 VRAM

Read the original post at Appsilon Data Science Blog.
