In Crazy RUT, I started to explore why the moving average strategy has failed for the last 2 decades on the Russell 2000. I still do not have an answer, but I thought looking at skewness and kurtosis might help explain some of the challenge of be...
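For readers who want to try the skewness and kurtosis idea themselves, here is a minimal base R sketch. The function names and the `returns` vector are made up for illustration; they are not the data or code from the post, and packages such as moments offer similar measures.

```r
# Sample skewness and excess kurtosis from central moments (base R only).
# Both helpers are illustrative, not the post's own code.
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical returns: several small gains and one large loss.
returns <- c(0.01, 0.02, 0.01, 0.015, -0.12, 0.01, 0.005)

skewness(returns)         # negative here: the left tail dominates
excess_kurtosis(returns)  # positive here: fatter tails than a normal
```

A strongly negative skew with fat tails is the kind of return profile that could make a trend-following rule harder to run, though whether that applies to the Russell 2000 is for the post to settle.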

After writing several hundred lines of R code, I started to pay some attention to my coding style. Fortunately, I found an R style guide on Google Code. Surprisingly, R is among the most popular programming languages, alongside C++, Objective-C, Python, Java and HTML. I didn’t realize …

Seeing as more people were interested in how I created my slides for the R conference than in what was actually on them, I posted my source and commands to GitHub. I used knitr to convert the R Markdown source to Markdown, which then went into pandoc to create Beamer slides. Enjoy! https://gist.github.com/2955183

Here is a quick demonstration of how functionality from the AQP package can be used to answer complex soils-related questions. In these examples the profileApply() function is used to iterate over a collection of soil profiles, and compute severa...

Since library() and require() only accept input with length(input) == 1, it is necessary to make repeated calls, which can be quite annoying. So, here is a little wrapper function for convenient package installation / loading. It installs packages if th...
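A wrapper of this kind might look roughly like the sketch below. The name `load_packages` and its `install` argument are assumptions made for illustration; this is not the function from the linked post.

```r
# Hypothetical helper: load several packages in one call,
# installing any that are missing first.
load_packages <- function(pkgs, install = TRUE) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      if (!install) stop("Package not installed: ", pkg)
      install.packages(pkg)
    }
    # character.only = TRUE lets library() take the name from a variable
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# Usage with packages that ship with R, so nothing gets installed here
load_packages(c("stats", "utils", "tools"))
```

Looping over `library(pkg, character.only = TRUE)` is what works around the one-package-per-call restriction.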

RStudio and knitr are an excellent combination for generating dynamic reports. But in this post, I will show you how to generate an HTML-style presentation using R only. OK, I confess that we still need something else: deck.js, markdown and R.utils. ...

For me, one of the most annoying features of R is that by default, rbind, cbind and data.frame recycle the shorter vector to the length of the longer vector. I still don’t understand why the standard generics don’t have a parameter like cbind(1:10, 1:5, fill = TRUE) to fill up with ‘NA’s. There may be
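One way to get the NA-filling behaviour described here is sketched below. `cbind_fill` is a made-up name, not a base R function and not necessarily the solution the post goes on to give.

```r
# Hypothetical helper: column-bind vectors of unequal length,
# padding the shorter ones with NA instead of recycling them.
cbind_fill <- function(...) {
  cols <- list(...)
  n <- max(lengths(cols))
  padded <- lapply(cols, function(v) {
    length(v) <- n  # extending a vector's length pads it with NA
    v
  })
  do.call(cbind, padded)
}

m <- cbind_fill(1:10, 1:5)
m  # second column holds 1..5 followed by five NAs
```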

A quick heads-up that I'll be hosting a live webinar this Wednesday (June 20) with my colleague Sue Ranney on the new Revolution R Enterprise 6. If you've never taken a look at Revolution R Enterprise and want to know how it's different from open-source R, or just want to learn about the new features, then please join us on...

Dear All, we have released version 0.6 of the igraph package today. This is a major new version, with a lot of new features, and (sadly) it is not completely compatible with code that was written for the previous igraph versions. (See “Major new features” below for details.) I have included below a list of (bigger) changes. Please see...

Arguably, knitr (CRAN link) is the most outstanding R package of this year, and its creator, Yihui Xie, is the star of the useR! 2012 conference. This is because of its ease of use compared to Sweave for making reproducible reports. Integration of knitr and RStudio has made reproducible research much more convenient, intuitive and easier to

A colleague asked for help with randomly choosing a kid within a family. This is for a trial in which families are recruited at well-child visits, but in each family only one of the children having a well-child visit that day can be in the study. The idea is that after recruiting the family, the research assistant...
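The core of such a selection can be sketched in a few lines of base R. Everything below (the `visits` data frame, the column names and `pick_child`) is hypothetical and only illustrates the one-child-per-family idea, not the procedure the post ends up recommending.

```r
# Hypothetical helper: pick one eligible child id at random.
pick_child <- function(child_ids) {
  # Sampling an index avoids the sample() surprise where sample(7, 1)
  # draws from 1:7 instead of returning 7 for a one-child family.
  child_ids[sample.int(length(child_ids), 1)]
}

# Hypothetical data: one row per child with a well-child visit that day
visits <- data.frame(
  family = c("A", "A", "A", "B", "C", "C"),
  child  = c(11, 12, 13, 21, 31, 32)
)

set.seed(2012)  # only so the illustration is repeatable
chosen <- tapply(visits$child, visits$family, pick_child)
chosen
```

Each child in a family has the same chance of being picked, and a family with a single eligible child always yields that child.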

Introduction I recently posted about using the Wikileaks cable corpus to find word use patterns, both over time, and in secret cables vs unclassified cables. I received a lot of good suggestions for further topics to pursue with the corpus, and probably the most interesting was the idea to do sentiment analysis over time on a variety of named entities. Sentiment analysis is the process...

I’ve been playing around with using gWidgets on Windows over the last few weeks as a way of creating front ends for various functions and sets of functions that I’ve created, so that non-R users can have the benefit of these without having to write a single line of code. The likes of 4Dpiecharts …

On June 17 a new version (0.6) of package “igraph” was released. This new version abandoned the old way of indexing graph vertices with consecutive numbers starting from 0. The new version now numbers the vertices starting from 1, which is more consistent with the general R convention of indexing vectors, matrices, etc. Because this change is
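For code that stored vertex ids as plain numbers, the shift from 0-based to 1-based numbering amounts to adding one. The base R sketch below uses a made-up edge list and deliberately calls no igraph functions, so it only shows the renumbering itself.

```r
# Hypothetical edge list written for igraph < 0.6 (vertex ids start at 0)
edges_zero_based <- matrix(c(0, 1,
                             1, 2,
                             2, 0), ncol = 2, byrow = TRUE)

# Shift every id by one to match the 1-based numbering of igraph 0.6
edges_one_based <- edges_zero_based + 1
edges_one_based
```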

A look at a simplistic measure of stock-picking opportunity. Motivation: The interquartile range (the spread of the middle half of the data) has recently been added to the market portrait plots. Putting those numbers into historical context was the original impulse. However, this led to thinking about change in stock-picking opportunity over time. Data: Daily …
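As a rough illustration of the measure, base R's `IQR()` can be applied across stocks for each day. The `returns` matrix is invented, and treating the interquartile range as a per-day, cross-stock quantity is an assumption about how the post uses it.

```r
# Hypothetical daily returns: rows are days, columns are stocks
returns <- matrix(c(-0.02, 0.00, 0.01, 0.03,
                    -0.01, 0.00, 0.00, 0.01),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("day1", "day2"),
                                  paste0("stock", 1:4)))

# Spread of the middle half of the stock returns on each day
daily_iqr <- apply(returns, 1, IQR)
daily_iqr
```

A wider interquartile range on a given day means the stocks behaved less alike, which is one simple reading of more room for stock picking.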

This is a follow-up to my previous post. There is a quicker way to compute the function I created (basic cumulative sum) in R. Instead of: function f(x) { sum = 0; for (i in seq(1,x)) sum = sum + i return(su...
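For comparison, here are three ways to add up the integers from 1 to n in R. The function names are made up, and which shortcut the post goes on to show is not visible in this excerpt, so take this as a sketch of the general idea.

```r
# Loop version, in the spirit of the function quoted above
sum_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# Vectorised version: let sum() do the looping
sum_vec <- function(n) sum(seq_len(n))

# Closed form: n * (n + 1) / 2
sum_formula <- function(n) n * (n + 1) / 2

sum_loop(100)
sum_vec(100)
sum_formula(100)

# If the running totals are wanted rather than just the last one
cumsum(1:5)
```

Using `seq_len(n)` instead of `seq(1, n)` also keeps the loop from running backwards when n is 0.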

Following the post on FREE online R tutorials from universities, I have received many emails suggesting more and more tutorials. However, some tutorials are not hosted at an academic institution, so I decided to create this post to list such tutorials.

Based on Launchpad traffic and mailing list responses, Gabor and Tamas will soon be releasing igraph 0.6. In celebration, I’ll be publishing a number of helpful lists and tables I’ve put together to organize information about igraph. In…

It is quite common in political science for researchers to run statistical models, find that a coefficient for a variable is not statistically significant, and then claim that the variable "has no effect." This is equivalent to proposing a research ...
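A small numeric illustration of why "not significant" and "no effect" differ: the confidence interval of a nonsignificant coefficient can contain zero and large effects at the same time. The helper `coef_ci`, the normal approximation and all numbers below are assumptions chosen for illustration; they are not taken from the post or from any study.

```r
# Hypothetical helper: confidence interval for a coefficient,
# given its estimate and standard error.
coef_ci <- function(estimate, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # normal approximation (assumption)
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# Illustrative numbers only: estimate 0.5 with standard error 0.4
ci <- coef_ci(estimate = 0.5, se = 0.4)
ci

# The interval contains zero, so the coefficient is "not significant" ...
includes_zero <- ci[["lower"]] < 0 && ci[["upper"]] > 0
# ... but it also contains effects above 1, so "no effect" is not shown
includes_large <- ci[["upper"]] > 1
c(includes_zero = includes_zero, includes_large = includes_large)
```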

I just returned from the useR! 2012 conference for developers and users of R. One of the common themes to many of the presentations was integration of R-based statistical systems with other systems, be they other programming languages, web systems, or enterprise data systems. Some highlights for me were an update to Rserve that includes