# Classification using a neural net in R

October 9, 2013

(This article was first published on Analytics , Education , Campus and beyond, and kindly contributed to R-bloggers)

This is mostly for my students and myself for future reference.

Classification is a supervised task: we need pre-classified data to learn from, and then we can predict the class of new data.
Generally we hold out a percentage of the available data for testing and use the rest for training; these are called the testing and training data respectively. For example, only if we already know which emails are spam can we use classification to predict whether new emails are spam.

I used the seeds dataset from http://archive.ics.uci.edu/ml/datasets/seeds# . The data set has 7 real-valued attributes and 1 class attribute to predict. Much of this write-up is influenced by http://www.jeffheaton.com/2013/06/basic-classification-in-r-neural-networks-and-support-vector-machines/ ; I am probably just making it more explicit.
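The commands in this post expect a data frame named `seeds` with the class label in the eighth column, `Class`. Here is a minimal sketch of loading such a file, assuming a whitespace-separated text file with no header row. The helper name `read_seeds`, the column names other than `Class`, and the two demo rows are illustrative placeholders of mine, not values from the real dataset.

```r
# Hypothetical helper: read a whitespace-separated seeds file into a data frame.
# The first seven column names are my own labels for the measurements;
# only "Class" is required by the later commands.
read_seeds <- function(path) {
  seeds <- read.table(
    path,
    header = FALSE,
    col.names = c("Area", "Perimeter", "Compactness", "KernelLength",
                  "KernelWidth", "Asymmetry", "GrooveLength", "Class")
  )
  seeds$Class <- factor(seeds$Class)  # treat the label as categorical
  seeds
}

# Demo on a temporary file with two made-up rows (placeholder values only).
demo_file <- tempfile(fileext = ".txt")
writeLines(c("15.0 14.5 0.87 5.7 3.3 2.2 5.2 1",
             "12.0 13.3 0.85 5.2 2.9 4.5 5.0 3"), demo_file)
demo <- read_seeds(demo_file)
str(demo)
```

With the real file downloaded from the UCI page, something like `seeds <- read_seeds("seeds_dataset.txt")` should give the 210 rows used below; the file name is an assumption, so adjust it to whatever you saved.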

The library used is nnet, loaded with library(nnet). Below is the list of commands for your reference.

2.       Set the training set index: 210 is the dataset size, and 147 is 70% of that
`   seedstrain <- sample(1:210,147)`
3.       Setting test set index
`   seedstest <- setdiff(1:210,seedstrain)`
4.       Encode the value to be predicted as class indicators: class.ind builds a 0/1 matrix with one column per class from the attribute of the dataset that you want to predict
`   ideal <- class.ind(seeds$Class)`
5.       Train the model. The -8 leaves out the class attribute: the dataset has a total of 8 attributes, with the last one being the one to predict
`   seedsANN <- nnet(seeds[seedstrain,-8], ideal[seedstrain,], size=10, softmax=TRUE)`
6.       Predict on the test set
`   predict(seedsANN, seeds[seedstest,-8], type="class")`
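7.       Calculate classification accuracy by cross-tabulating the predicted classes against the actual classes; the counts on the diagonal are the correct predictions

`   table(predict(seedsANN, seeds[seedstest,-8], type="class"),seeds[seedstest,]$Class)`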
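Putting steps 2 to 7 together, here is a minimal end-to-end sketch. So that it runs without the downloaded file, it builds a synthetic stand-in for `seeds` (210 rows, 7 numeric columns, a 3-level `Class`). The generated values, the `set.seed` call, and the `maxit` and `trace` settings are my assumptions, not part of the commands above. It also turns the table from step 7 into a single accuracy number.

```r
library(nnet)

set.seed(42)  # assumption: fixed seed so the split and the fit are repeatable

# Synthetic stand-in for the seeds data: 3 classes, 70 rows each, 7 numeric columns.
# Every feature of a row is drawn around that row's class number (made-up data).
n_per_class <- 70
Class <- rep(1:3, each = n_per_class)
features <- matrix(rnorm(210 * 7, mean = rep(Class, times = 7), sd = 0.3), ncol = 7)
seeds <- data.frame(features, Class = factor(Class))

# Steps 2 and 3: 70/30 split by row index
seedstrain <- sample(1:210, 147)
seedstest  <- setdiff(1:210, seedstrain)

# Step 4: 0/1 indicator matrix, one column per class
ideal <- class.ind(seeds$Class)

# Step 5: train on the training rows, dropping column 8 (Class)
seedsANN <- nnet(seeds[seedstrain, -8], ideal[seedstrain, ], size = 10,
                 softmax = TRUE, maxit = 200, trace = FALSE)

# Step 6: predict labels for the held-out rows
predicted <- predict(seedsANN, seeds[seedstest, -8], type = "class")

# Step 7: confusion table, then accuracy = correct predictions / all predictions
confusion <- table(Predicted = factor(predicted, levels = levels(seeds$Class)),
                   Actual    = seeds$Class[seedstest])
accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

Wrapping the predictions in `factor(..., levels = ...)` keeps the table square even if the network never predicts one of the classes, so `diag()` always lines up predicted and actual labels. On the real data, replace the synthetic block with your own `seeds` data frame and keep the rest.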

Happy coding!
