Blog Archives

Simple Text Mining with R

May 31, 2012

I’ve used R for many tasks, and text mining is one of them. Below is a small snippet to get you started with R and text mining.

    require(fortunes)
    require(tm)
    sentences <- NULL
    for (i in 1:10) sentences <- c(sentences, fortune(i)$quote)
    d <- data.frame(textCol = sentences)
    ds <- DataframeSource(d)
    dsc <- Corpus(ds)
    dtm <- DocumentTermMatrix(dsc, control = list(weighting =
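For readers who want something self-contained, here is a minimal sketch of the document-term matrix step. It assumes the tm package is installed, uses three made-up sentences instead of the fortunes quotes, and builds the corpus with VectorSource rather than DataframeSource. The weighting (weightTf, plain term counts) is my assumption, since the snippet above does not show which one was used.

```r
library(tm)

# Hypothetical stand-in sentences (the post draws its text from the fortunes package)
sentences <- c(
  "R is a language for statistical computing",
  "Text mining turns documents into numbers",
  "A document term matrix counts terms per document"
)

# Build a corpus straight from the character vector
corpus <- VCorpus(VectorSource(sentences))

# One row per document, one column per term.
# weightTf (raw term frequency) is an assumed choice of weighting.
dtm <- DocumentTermMatrix(corpus, control = list(weighting = weightTf))

# Convert to an ordinary matrix for inspection
m <- as.matrix(dtm)
print(dim(m))
```

By default tm lowercases terms and drops words shorter than three characters, so short words such as "R", "is" and "a" do not appear as columns.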

Path from root to leaf node in mvpart

December 1, 2011

I was recently asked by an R user how to extract the “rules” in a classification/regression tree. The requirement was to obtain every path traced from the root node to a leaf node, that is, all the paths or “rules”. The path.rpart() function in the mvpart package provides this convenience.

    library(mvpart)
    # Create a
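A minimal sketch of the idea, using the rpart package instead of mvpart (rpart also provides a path.rpart() function, and I am assuming the two behave alike here). The iris data and the model formula are only illustrative choices, not the ones from the post.

```r
library(rpart)

# Fit a small classification tree on a built-in dataset (illustrative choice)
fit <- rpart(Species ~ ., data = iris, method = "class")

# Leaf nodes are the rows of fit$frame whose split variable is "<leaf>";
# the row names of fit$frame are the node numbers
leaves <- rownames(fit$frame)[fit$frame$var == "<leaf>"]

# path.rpart() returns one character vector per requested node,
# listing the splits followed from the root down to that node
rules <- path.rpart(fit, nodes = as.numeric(leaves), print.it = FALSE)

# Print each rule as a single line
for (node in names(rules)) {
  cat("Node", node, ":", paste(rules[[node]], collapse = " -> "), "\n")
}
```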

GUI for sending email in R (using sendEmail)

November 30, 2011

After writing the last post on using sendEmail to send email from R, I decided to create a simple GUI to enable this functionality. A snapshot image of the GUI is shown above. To use this GUI, you will need to install the following packages in R: gWidgets, gWidgetsRGtk2, and the Windows GTK Bundle. More information on

Sending Email from R (using sendEmail)

November 25, 2011

Like a lot of other R users, I’ve felt the need to send email from R. I haven’t surveyed CRAN for such a package, but I did look into sending email from the command line in Windows. I found a nice application called sendEmail that can be found here. Below are code snippets in R that will
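A rough sketch of how such a call could be assembled from R. The helper name build_sendemail_args and all addresses and the server name are hypothetical. I am assuming the sendEmail executable is on the PATH and takes the flags -f (from), -t (to), -u (subject), -m (message) and -s (SMTP server). The actual call is left commented out so that nothing is sent by accident.

```r
# Hypothetical helper: assemble the command-line arguments for sendEmail
build_sendemail_args <- function(from, to, subject, body, server) {
  c("-f", from,
    "-t", to,
    "-u", shQuote(subject),  # quote free text so spaces survive the shell
    "-m", shQuote(body),
    "-s", server)
}

# Placeholder values, not real accounts or servers
args <- build_sendemail_args(
  from    = "me@example.com",
  to      = "you@example.com",
  subject = "Job finished",
  body    = "The model run completed.",
  server  = "smtp.example.com:25"
)

# Show the command that would be run
cat("sendEmail", args, "\n")

# Uncomment to actually send (requires sendEmail to be installed):
# system2("sendEmail", args)
```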

Finding functions in R

November 17, 2011

When looking for functions whose exact name is unknown:

    # Functions related to "shrinkage" methods
    help.search("shrinkage")

The sos package does a great job of finding functions:

    install.packages("sos")
    library(sos)
    shrinkageResults <- findFn("shrinkage", maxPages = 1)
    shrinkageResults # This opens a webpage in your browser with the results

The table in the webpage created above has sortable columns.
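For an offline complement using only base R, here is a small sketch. apropos() matches function names in the attached packages, while help.search() returns an object whose matches can be inspected programmatically. The "^read\\." pattern is just an illustrative choice.

```r
# Name-based search: functions in attached packages whose names start with "read."
hits <- apropos("^read\\.", mode = "function")
print(head(hits))

# Documentation search over installed packages; the result can be
# inspected as data instead of only being printed
res <- help.search("shrinkage")

# Packages that contain at least one matching help page
# (may be empty, depending on what is installed)
print(unique(res$matches$Package))
```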

Missing values and column types when reading data into R

November 17, 2011

This post covers reading data into R when you need to control column types and specify values that should be treated as NA. Below are code snippets that introduce a few arguments of the read.csv function in R.

    # Create sample data
    strVals <- do.call("c", lapply(1:1000, function(x) paste(sample(letters, sample(5:20, 1)), collapse = "")))
    miscVals <- sample(c("", "999", "—-", "MISS"), 100, replace = TRUE)
    numVals <- rnorm(1000)
    # Scenario 1 : Pure numeric and strings
    dataTemp<-data.frame(numericVals
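A small self-contained sketch of the two read.csv arguments that matter most here, na.strings and colClasses. The inline CSV text and its column names are made up for illustration and are not the sample data from the post.

```r
# Hypothetical CSV content with several missing-value markers
csv_text <- "id,score,label
1,3.5,abc
2,999,MISS
3,,def
4,-1.2,
"

dat <- read.csv(
  text       = csv_text,
  na.strings = c("", "999", "MISS"),                   # values to read as NA
  colClasses = c("integer", "numeric", "character")    # one type per column
)

str(dat)
print(dat)
```

Listing "" in na.strings matters for the character column: blank fields become NA automatically in numeric columns, but would otherwise be kept as empty strings in character ones.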

Setting up AWS Cluster to use snow in R

November 8, 2011

Setting up AWS Cluster

I wanted to set up an AWS cluster to take a shot at a Kaggle contest, the DunnHumby Challenge (http://www.kaggle.com/c/dunnhumbychallenge). For this, I found StarCluster to be of great help. It lets you set up AWS nodes in a few lines of code and does much more (choosing AMIs and cluster configurations)
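To give a feel for the R side once the nodes exist, here is a minimal sketch that runs on a single machine. It uses the parallel package that ships with R, which offers snow-style functions (makeCluster, parLapply, stopCluster), rather than snow itself, and it starts two local worker processes as a stand-in for AWS nodes.

```r
library(parallel)

# Two local worker processes stand in for remote nodes.
# On a real cluster, a character vector of node hostnames
# (hypothetical, e.g. c("node001", "node002")) would be passed instead.
cl <- makeCluster(2)

# Distribute a toy computation across the workers
squares <- parLapply(cl, 1:4, function(x) x^2)

# Always release the workers when done
stopCluster(cl)

print(unlist(squares))
```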
