Neural networks have become a cornerstone of machine learning in the last decade. Created in the late 1940s with the intention of building computer programs that mimic the way neurons process information, these algorithms were long believed to be only an academic curiosity, deprived of practical use because they required a lot of processing power and were outperformed by other machine learning algorithms. However, since the mid-2000s, the creation of new neural network types and techniques, coupled with the increased availability of fast computers, has made the neural network a powerful tool that every data analyst or programmer must know.
In this series of articles, we’ll see how to fit a neural network with R, learn the core concepts we need to apply these algorithms well, and see how to evaluate whether our model is appropriate to use in production. Today, we’ll practice using the nnet and neuralnet packages to create a feedforward neural network, which we introduced in the last set of exercises. In this type of neural network, all the neurons of the input layer are linked to the neurons of the hidden layer, and all of those neurons are linked to the output layer, as shown in this image. Since there is no cycle in this network, information flows in one direction, from the input layer to the hidden layers to the output layer. For more information about this type of neural network, you can read this page.
Answers to the exercises are available here.
We’ll start by practicing what we’ve seen in the last set of exercises. Load the MASS package and the biopsy dataset, then prepare your data to be fed to a neural network.
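As a rough sketch of what this preparation could look like (one possible approach, not the official answer), the code below drops the ID column, removes rows with missing values, scales the predictors and splits the data. The 70/30 split, the seed and the names `df`, `train` and `test` are assumptions made for illustration.

```r
library(MASS)  # ships the biopsy data set

data(biopsy)

# Drop the ID column (column 1) and the rows that contain missing values
df <- na.omit(biopsy[, -1])

# Scale the nine predictors V1..V9 to mean 0 and standard deviation 1
df[, 1:9] <- scale(df[, 1:9])

# Reproducible train/test split; the 70/30 proportion is an assumption
set.seed(42)
train_idx <- sample(nrow(df), size = round(0.7 * nrow(df)))
train <- df[train_idx, ]
test  <- df[-train_idx, ]
```

Scaling matters here because the weights of a neural network are easier to fit when all inputs live on a comparable range.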
We’ll use the nnet() function from the package of the same name to do a logistic regression on the biopsy data set using a feedforward neural network. If you remember the last set of exercises, you know that we have to choose the number of neurons in the hidden layer of our feedforward neural network. There’s no rule or equation that can tell us the optimal number of neurons to use, so the best way to find a good model is to cross-validate our model with different numbers of neurons in the hidden layer and choose the one that fits the data best. A good range to test with this process is between one neuron and the number of input variables.
Write a function that takes a training data set, a test data set and a range of integers corresponding to the numbers of neurons to try as parameters. For each possible number of neurons in the hidden layer, this function should train a neural network with nnet(), make predictions on the test set and return the accuracy of the predictions.
Use your function on your data set and plot the result. Which number of neurons should you use in the hidden layer of your feedforward neural network?
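One way such a function might be written is sketched below. The helper name `nnet_accuracy`, the 70/30 split and the seed are hypothetical choices, and a single train/test split stands in for a full cross-validation, so treat the resulting curve as indicative only.

```r
library(MASS)
library(nnet)

# Minimal data preparation (assumed: drop ID, drop NAs, scale, 70/30 split)
df <- na.omit(biopsy[, -1])
df[, 1:9] <- scale(df[, 1:9])
set.seed(42)
idx <- sample(nrow(df), round(0.7 * nrow(df)))
train <- df[idx, ]
test  <- df[-idx, ]

# Hypothetical helper: test-set accuracy for each hidden-layer size
nnet_accuracy <- function(train, test, sizes) {
  sapply(sizes, function(n) {
    fit <- nnet(class ~ ., data = train, size = n, trace = FALSE)
    pred <- predict(fit, newdata = test, type = "class")
    mean(pred == test$class)  # share of correct predictions
  })
}

sizes <- 1:9  # from one neuron up to the number of input variables
acc <- nnet_accuracy(train, test, sizes)

plot(sizes, acc, type = "b",
     xlab = "Neurons in the hidden layer", ylab = "Test accuracy")
```

Because nnet() starts from random weights, the curve will change from run to run unless a seed is set, which is one reason repeated cross-validation gives a more trustworthy answer than a single split.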
The nnet() function is easy to use, but doesn’t give us many options to customize our neural network. As a consequence, it’s a good package to use if you have to build a quick model to test a hypothesis, but for more complex models the neuralnet package is a lot more powerful. Documentation for this package can be found here.
Use the neuralnet() function with the default parameters and the number of neurons in the hidden layer set to the answer of the last exercise. Note that this function can only handle numeric values and cannot deal with factors. Then use the compute() function to make predictions on the values of the test set and compute the accuracy of your model.
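A minimal sketch of this workflow follows, assuming the class label is recoded as 0/1 and that three hidden neurons were chosen (a placeholder, not the answer to the previous exercise). The 0.5 cut-off used to turn the network output into a class is also an assumption.

```r
library(MASS)
library(neuralnet)

# Assumed preparation: drop ID, drop NAs, scale predictors
df <- na.omit(biopsy[, -1])
df[, 1:9] <- scale(df[, 1:9])

# neuralnet() needs numeric input, so recode the factor: 1 = malignant
df$class <- as.numeric(df$class == "malignant")

set.seed(42)
idx <- sample(nrow(df), round(0.7 * nrow(df)))
train <- df[idx, ]
test  <- df[-idx, ]

# Build the formula explicitly from the predictor names
f <- as.formula(paste("class ~", paste(paste0("V", 1:9), collapse = " + ")))

# hidden = 3 is a placeholder for the value found in the previous exercise
fit <- neuralnet(f, data = train, hidden = 3)

# compute() returns a list; net.result holds the network output
out <- compute(fit, test[, 1:9])$net.result
pred <- as.numeric(out > 0.5)  # assumed decision threshold
accuracy <- mean(pred == test$class)
print(accuracy)
```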
The nnet() function uses the BFGS algorithm by default to adjust the values of the weights until the output values of our model are close to the values of our data set. The neuralnet package gives us the option to use more efficient algorithms to compute those values, which results in faster processing time and overall better estimation. For example, by default this function uses resilient backpropagation with weight backtracking.
Use the neuralnet() function with the parameter algorithm set to 'rprop-', which stands for resilient backpropagation without weight backtracking.
Then test your model and print the accuracy.
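The sketch below shows how the algorithm argument might be passed, again under the assumptions of a 0/1 class label, a 70/30 split and a placeholder of three hidden neurons. It also reads the number of training steps from the fitted object, which can help when comparing algorithms.

```r
library(MASS)
library(neuralnet)

# Assumed preparation: drop ID, drop NAs, scale predictors, recode class
df <- na.omit(biopsy[, -1])
df[, 1:9] <- scale(df[, 1:9])
df$class <- as.numeric(df$class == "malignant")

set.seed(42)
idx <- sample(nrow(df), round(0.7 * nrow(df)))
train <- df[idx, ]
test  <- df[-idx, ]

f <- as.formula(paste("class ~", paste(paste0("V", 1:9), collapse = " + ")))

# Resilient backpropagation without weight backtracking
fit_minus <- neuralnet(f, data = train, hidden = 3, algorithm = "rprop-")

# Number of steps the algorithm needed to reach the threshold
steps <- fit_minus$result.matrix["steps", 1]

out <- compute(fit_minus, test[, 1:9])$net.result
accuracy <- mean(as.numeric(out > 0.5) == test$class)
print(c(steps = steps, accuracy = accuracy))
```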
Two other algorithms can be used with the neuralnet() function: 'sag' and 'slr'. These two strings tell the function to use the globally convergent algorithm (grprop) and to modify the learning rate associated with either the smallest absolute gradient (sag) or the smallest learning rate (slr). When using these algorithms, it can be useful to pass a vector or list containing the lowest and highest limits for the learning rate to the learningrate.limit parameter.
Again, use the neuralnet() function twice, once with parameter algorithm set to 'sag' and then to 'slr'. In both cases set the learningrate.limit parameter to c(0.1,1) and change the stepmax parameter to
The learning rate determines how much the backpropagation can affect the weights at each iteration. A high learning rate means that during the training of the neural network, each iteration can strongly change the values of the weights or, to put it another way, the algorithm learns a lot from each observation in your data set. This means that outliers could easily affect your weights and make your algorithm diverge from the path toward the ideal weights for your problem. A small learning rate means that the algorithm learns less from each observation in your data set, so your neural network is less affected by outliers, but it also means that you will need more observations to build a good model.
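To make the effect of the learning rate concrete, here is a toy illustration in base R. It is not what neuralnet() does internally: it runs plain gradient descent on the one-dimensional function f(w) = (w - 3)^2, and the helper name `descend` and the three rates are chosen purely for illustration.

```r
# Toy objective: f(w) = (w - 3)^2, which is minimised at w = 3
gradient <- function(w) 2 * (w - 3)

# Hypothetical helper: plain gradient descent starting from w = 0
descend <- function(rate, steps = 50, w = 0) {
  for (i in seq_len(steps)) {
    w <- w - rate * gradient(w)  # step against the gradient
  }
  w
}

rates <- c(small = 0.01, moderate = 0.4, large = 1.1)
final <- sapply(rates, descend)
print(final)
```

After 50 steps the small rate is still well short of 3, the moderate rate has essentially reached it, and the large rate has overshot further at every step and moved far away from the minimum.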
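Use the neuralnet() function with parameter algorithm set to 'rprop+' twice: once with the learningrate parameter set to 0.01 and another time with the learningrate parameter set to 1. Notice the difference in running time in both cases. Keep in mind that the package documentation describes learningrate as being used only by traditional backpropagation (algorithm set to 'backprop'), so any difference you observe with 'rprop+' may simply come from the random starting weights.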
The neuralnet package gives us the ability to make a visual representation of the neural network you made. Use the plot() function to visualize one of the neural networks of the last exercise.
Until now, we’ve used feedforward neural networks with one hidden layer of neurons, but we could use more. In fact, state-of-the-art neural networks often use a hundred or more hidden layers to model complex behavior. For basic regression problems or even basic digit recognition problems, one layer is enough, but if you want to use more, you can do so with the neuralnet() function by passing a vector of integers to the hidden parameter, each integer representing the number of neurons in one layer.
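As an illustration of the hidden parameter, the sketch below fits a network with two hidden layers of five and three neurons. That architecture is an arbitrary example, and the data preparation, split and seed are the same assumptions as before. The fitted network is then drawn with plot().

```r
library(MASS)
library(neuralnet)

# Assumed preparation: drop ID, drop NAs, scale predictors, recode class
df <- na.omit(biopsy[, -1])
df[, 1:9] <- scale(df[, 1:9])
df$class <- as.numeric(df$class == "malignant")

set.seed(42)
idx <- sample(nrow(df), round(0.7 * nrow(df)))
train <- df[idx, ]
test  <- df[-idx, ]

f <- as.formula(paste("class ~", paste(paste0("V", 1:9), collapse = " + ")))

# One integer per hidden layer: 5 neurons, then 3 neurons
fit_deep <- neuralnet(f, data = train, hidden = c(5, 3))

# One weight matrix per connection between consecutive layers
n_matrices <- length(fit_deep$weights[[1]])

out <- compute(fit_deep, test[, 1:9])$net.result
accuracy <- mean(as.numeric(out > 0.5) == test$class)
print(accuracy)

# Draw the repetition with the lowest error on the current device
plot(fit_deep, rep = "best")
```

Each extra layer adds another weight matrix to fit, so training time grows with depth, and for a small data set like this one a deeper network is not guaranteed to be more accurate.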
Create a feedforward neural network with three hidden layers of nine neurons and use it on your data.
Plot the feedforward neural network from the last exercise.