Blog Archives

Text Data Mining with Twitter and R

April 8, 2011

Twitter is a favorite source of text data for analysis: it’s popular (there is a huge volume and variety of tweets on all topics) and easily accessible through Twitter’s free, open APIs, which are easy to consume in JSON and Atom formats. Some …


Compcache on Ubuntu on Amazon EC2

May 4, 2010

The following fully automatic Bash script downloads, compiles, and initializes compcache version 0.6.2 on Ubuntu Karmic Koala (9.10). This script creates two swaps with a maximum of 4GB uncompressed size each. Two swaps are used to take advantage of two CPUs (or two cores in a multicore CPU). Compcache is a fascinating memory compression system. The


Validating credit card numbers in SAS

March 16, 2010

Major credit card issuing networks (including Visa, MasterCard, Discover, and American Express) allow simple credit card number validation using the Luhn Algorithm (also called the “modulus 10” or “mod 10” algorithm). The following code demonstrates an implementation in SAS. The code also validates the credit card number by length and by checking against a short
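
For readers who want to see the checksum itself, here is a minimal sketch of the Luhn check in Python rather than the SAS of the original post. The function name `luhn_valid` is hypothetical, and the sketch covers only the mod 10 checksum, not the length or other checks the post mentions.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) checksum.

    Illustrative sketch only: spaces and hyphens are ignored, and any
    other non-digit character makes the number invalid.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk the digits from right to left; double every second digit.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as adding the two digits of the doubled value.
                digit -= 9
        total += digit

    # A valid number has a checksum that is a multiple of 10.
    return total % 10 == 0
```

Calling `luhn_valid("4111 1111 1111 1111")` on the widely used Visa test number returns `True`, while changing its last digit makes the check fail. Passing the checksum only shows that the number is well formed, not that the account exists.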


Weighting model fit with ctree in party

March 15, 2010

Conditional inference trees (ctree) in package party allow weighting, which is useful when one classification outcome is more important than another. Useful examples are not difficult to imagine: in a marketing direct mailing, a false positive (non-res...


Setting the HTML title tag in SAS ODS (the right way)

January 5, 2010

In our department and in various places on the Intertubes, SAS programmers set the HTML title tag (which sets the title in web browsers and on search engines) in ODS using the headtext option:

    ods html headtext="<title>My great report</title>" /* wrong! */ file="foo.html";

This may work in some situations, but it’s ugly and wrong. To see


R: Memory usage statistics by variable

January 4, 2010

Do you need a way to find out which individual variables in R consume the most memory?

    # create dummy variables for demonstration
    x <- 1:1000
    y <- 1:10000
    z <- 1:100000
    # print aggregate memory usage statistics
    print(paste('R is using', mem...


Error : .onLoad failed in ‘loadNamespace’ for ‘RWeka’

December 24, 2009

After installing Weka/RWeka in R, you may get this error if you try to load RWeka in the same session:

    require(RWeka)
    Cannot create Java virtual machine (-4)
    Error : .onLoad failed in 'loadNamespace' for 'RWeka'

Solution: Just close R and re-open it.
Cause: Apparently the installation requires some initialization.
Tested on R 2.10.1 on Windows


Compare performance of machine learning classifiers in R

December 23, 2009

This tutorial demonstrates to the R novice how to create five machine learning models for classification and compare the performance graphically with ROC curves in one chart. For a simpler introduction, start with Plot ROC curve and lift chart in R. # ...


Plot ROC curve and lift chart in R

December 18, 2009

This tutorial with real R code demonstrates how to create a predictive model using cforest (an implementation of Breiman’s random forests built on conditional inference trees) from the package party, evaluate the predictive model on a separate set of data, and then plot the performance using ROC curves ...
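
As a rough illustration of what an ROC curve measures, the following Python sketch (not the R code from the tutorial) sweeps a threshold over predicted scores and records the false positive and true positive rates. The helper names `roc_points` and `auc` are hypothetical, and labels are assumed to be 1 for the positive class and 0 for the negative class.

```python
def roc_points(scores, labels):
    """Return (fpr, tpr) lists by sweeping a threshold from high to low scores."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank cases from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    fpr, tpr = [0.0], [0.0]
    true_pos = false_pos = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        # Record a point only after the last case of a tied score,
        # so tied scores are treated as a single threshold.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(false_pos / negatives)
            tpr.append(true_pos / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )
```

Plotting `tpr` against `fpr` gives the curve. A model that ranks every positive case above every negative case reaches an area of 1.0, and random ranking gives about 0.5.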


“Outlook cannot open this item.” and tasks missing

October 8, 2009

Recently Microsoft Office Outlook 2007 started giving me the vague error message “Outlook cannot open this item. The item may be damaged.” The message would appear randomly throughout the day. Sometimes five error message boxes would be stacked up on top of each other. OK, but which item? What kind of item? Is it an
