Poll Shows Open Source Almost Even with Commercial Analytics Software

May 31, 2012

The 2012 results of the annual KDnuggets poll are in. They show R in first place, with 30.7% of users reporting having used it for a real project. Excel is almost as popular. It seems out of place among so … Continue reading →

Read more »

Using R.Net in an Excel Add in

May 31, 2012

I thought I’d try out R.Net, and in doing so I have put together a very simple Excel 2007 add-in that connects Excel to R. I’m using .NET 4.0 in Visual Studio 2010 Pro with the latest commit of R.Net, … Continue reading →

Read more »

Simple Text Mining with R

May 31, 2012

I’ve used R for many use cases, and text mining is one of them. Below is a small snippet to get you started with R and text mining:

    require(fortunes)
    require(tm)
    sentences <- NULL
    for (i in 1:10) sentences <- c(sentences, fortune(i)$quote)
    d <- data.frame(textCol = sentences)
    ds <- DataframeSource(d)
    dsc <- Corpus(ds)
    dtm <- DocumentTermMatrix(dsc, control = list(weighting =

Read more »

Inferno-ish R

May 31, 2012

CambR was nice enough to invite Markus Gesmann and me to speak at their event on Tuesday. My talk was Inferno-ish R. See also The R Inferno.

Read more »

The Facebook Doomsday Watch

May 31, 2012

I've been following the circus of Facebook commentators and bystanders pointing to its horribly failed IPO launch and seemingly inevitable crash to zero. While my focus here isn't really so much on fundamentals or basic TA, I do want to comment ...

Read more »

New Data Science Packages Coming To Computational Journalism Server

May 30, 2012

I’ve just received an announcement from Michael Lang that packages BatchJobs and BatchExperiments have been added to the Comprehensive R Archive Network (CRAN). From the announcement: The package BatchJobs implements the basic objects and procedu...

Read more »

Converting cross sectional data with dates to weekly averages in R.

May 30, 2012

I was recently confronted with a problem where I had to compare two very different data sets. One data set was observed cross-sectional data with dates over the course of three months, and the other was weekly averages during those same three months. After a bit of research, I discovered

Read more »
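
As a rough illustration of the task described in the entry above, here is a minimal sketch that bins dated observations into weeks and averages them. It is not the author's method (the teaser is cut off before the solution); the data frame `obs`, its columns, and the values are made up for the example, and only base R functions are used.

```r
# Hypothetical data: one observation per day for two full weeks.
# 2012-03-05 is a Monday, so the dates cover exactly two Monday-based weeks.
obs <- data.frame(
  date  = as.Date("2012-03-05") + 0:13,
  value = 1:14
)

# Assign each date to the week it falls in (cut.Date starts weeks on Monday
# by default), then average the values within each week.
obs$week <- cut(obs$date, breaks = "week")
weekly <- aggregate(value ~ week, data = obs, FUN = mean)

print(weekly)
```

The resulting `weekly` table has one row per week, which can then be lined up against a data set that is already reported as weekly averages.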

Online Course from Statistics.com: Advanced Programming in R

May 30, 2012

Hadley Wickham teaches “Programming in R – Advanced,” June 15 – July 13 online at Statistics.com. This is the third in a series of courses that cover programming in R, so if you are new to the subject you should start with our Jul 27 course “Introduction to R: Data Handling.” Upcoming Courses: Jun 15: Advanced Programming in R...

Read more »

Predicting the NBA Finals with R

May 30, 2012

This is the initial post about the algorithm. See updates 1, 2, and 3 for more. The algorithm is currently 4-2 in the playoffs! Overview: I was struck by Martin O'Leary's recent post on predicting the Eurovision finals, which led me to decide that I wou...

Read more »

Project Euler — problem 5

May 30, 2012

I spent around 40 minutes on the last post yesterday, which delayed my bedtime and left me sleepy in the morning. So, I’m starting to write earlier tonight. The fifth problem is to calculate the smallest common multiple of the given numbers. 2520 is … Continue reading →

Read more »
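
For readers who want to try the problem from the entry above, here is a minimal sketch of one common approach: fold a least-common-multiple function over the numbers. It is not necessarily the post author's solution; the helper names `gcd`, `lcm`, and `smallest_multiple` are made up for this example.

```r
# Greatest common divisor via Euclid's algorithm
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}

# Least common multiple of two numbers (divide first to keep values small)
lcm <- function(a, b) a / gcd(a, b) * b

# Smallest number evenly divisible by every integer from 1 to n
smallest_multiple <- function(n) Reduce(lcm, seq_len(n))

smallest_multiple(10)  # 2520, the value quoted in the teaser
```

Calling `smallest_multiple(20)` applies the same fold to the range used in the full Project Euler problem.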

R 2.15.1 scheduled for June 22

May 30, 2012

The next release of open-source R, codenamed "Roasted Marshmallows", is scheduled to be released on June 22, according to this announcement on the r-announce mailing list. Don't expect too many changes in this update: despite the fact that "there have been very few issues with 2.15.0 ... some people may be waiting superstitiously for a .1 release". This will...

Read more »

Send emails with attachments from R command line

May 30, 2012

The sendmailR package makes it easy to send emails with attachments from the R command line.

    # load package
    library("sendmailR")
    # use string formatting and your system info to format FROM address
    from <- sprintf("<Project1@%s>", Sys.info()[...

Read more »

Be assertive!

May 30, 2012

assertive, my new package for writing robust code, is now on CRAN. It consists of lots of “is” functions for checking variables, and corresponding “assert” functions that throw an error if the condition doesn’t hold. For example, is_a_number checks that the input is numeric and scalar. In the last two cases, the return value of

Read more »
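
The is/assert pairing described in the entry above can be sketched in base R as follows. These helpers are hypothetical and written only for illustration; they are not the assertive package's implementation, and the `_sketch` suffix is there to avoid any confusion with the package's real functions.

```r
# Hypothetical "is" function: TRUE only for a single numeric value.
# Illustration only; NOT the assertive package's own code.
is_a_number_sketch <- function(x) {
  is.numeric(x) && length(x) == 1L
}

# Hypothetical "assert" counterpart: throws an error if the check fails,
# otherwise returns its input invisibly so it can be used inline.
assert_is_a_number_sketch <- function(x) {
  if (!is_a_number_sketch(x)) {
    stop("input is not a single numeric value", call. = FALSE)
  }
  invisible(x)
}

is_a_number_sketch(3.14)      # TRUE
is_a_number_sketch(c(1, 2))   # FALSE: numeric but not scalar
is_a_number_sketch("a")       # FALSE: scalar but not numeric
```

Splitting the check from the assertion keeps the predicate reusable in ordinary `if` conditions, while the assert wrapper is convenient at the top of a function that should fail fast on bad input.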

Space Time Swing Probability Plot for Ichiro

May 30, 2012

I was having some fun with PITCHf/x data and generalized additive models. PITCHf/x keeps track of the trajectory, path, and location of every pitch in the MLB. It is pretty accurate and opens up baseball to more analyses than ever before. Generalized additi...

Read more »

knitcitations

May 30, 2012

Markdown is becoming an increasingly popular platform for lightweight and online publishing. While traditional publishing tools like LaTeX and word processors have long had i...

Read more »

Review: “Forest Analytics with R: an introduction”

May 29, 2012

Forestry is the province of variability. From a spatial point of view this variability ranges from within-tree variation (e.g. modeling wood properties) to billions of trees growing in millions of hectares (e.g. forest inventory). From a temporal point of view … Continue reading →

Read more »

Google Earth and ocean depth contours

May 29, 2012

Been playing around in R for a while and then a bit in the GIS environment - within R of course. Do not know very much about GIS, but know what I want: Looking at GIS fisheries data across various scales (macro to global). Now we have myriads of websit...

Read more »

Better decision tree graphics for rpart via party and partykit

May 29, 2012

I’ve been using Graphviz to create better decision tree graphics “by hand” for rpart objects created in R (final tree). I stumbled on this post that shows how one could convert an rpart object to a party object via the as.party function in partykit to utilize the plot functions in party. It looks quite nice...

Read more »

Exporting from Win ISI / Importing into R

May 29, 2012

Chemometric software packages have the option to export a matrix to a TXT file (in this case a constituents matrix) in a form that we can easily import into R to work with. It is the first step into the R world. In this case I use Win ISI Soft...

Read more »

Mahalanobis distance with "R" (Exercice)

May 29, 2012

I developed this exercise with Excel in another post; this time I am going to carry out the same calculations with "R". The data table has the columns edad, long., peso, and mg.kg ...

Read more »
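
As a minimal sketch of the calculation named in the entry above, base R's `mahalanobis()` function returns squared Mahalanobis distances given a data matrix, a center, and a covariance matrix. The data below are made up for illustration and are not the values from the original post.

```r
# Hypothetical measurements: five samples, two variables (values are made up)
x <- cbind(
  height = c(150, 160, 165, 172, 181),
  weight = c(52, 61, 58, 70, 83)
)

center <- colMeans(x)   # mean vector of the sample
S <- cov(x)             # sample covariance matrix

# mahalanobis() returns SQUARED distances of each row from the center
d2 <- mahalanobis(x, center = center, cov = S)
d <- sqrt(d2)

print(round(d, 3))
```

Note that the square root is needed to get the distance itself, since `mahalanobis()` reports the squared value.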

Project Euler — problem 4

May 29, 2012

It’s midnight already. I’m going to bed after I type this. Now the fourth Euler problem: A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99. Find … Continue reading →

Read more »
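
The palindrome search described in the entry above can be sketched with a brute-force loop over pairs of factors. This is one straightforward approach, not necessarily the post author's; the function names `is_palindrome` and `largest_palindrome` are made up for this example.

```r
# TRUE if the decimal digits of n read the same forwards and backwards
is_palindrome <- function(n) {
  digits <- strsplit(format(n, scientific = FALSE), "")[[1]]
  identical(digits, rev(digits))
}

# Largest palindrome that is a product of two integers in [lo, hi]
largest_palindrome <- function(lo, hi) {
  best <- 0
  for (a in hi:lo) {
    for (b in a:lo) {
      p <- a * b
      if (p <= best) break          # smaller b can only give smaller products
      if (is_palindrome(p)) best <- p
    }
  }
  best
}

largest_palindrome(10, 99)  # 9009, the 2-digit example from the teaser
```

Passing a different range, such as `largest_palindrome(100, 999)`, runs the same search over 3-digit factors.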

How to Stay Current in Bioinformatics/Genomics

May 29, 2012

A few folks have asked me how I get my news and stay on top of what's going on in my field, so I thought I'd share my strategy. With so many sources of information begging for your attention, the difficulty is not necessarily finding what's interesting...

Read more »

Choosing colour palettes. Part I: Introduction

May 29, 2012

In this series of three posts, we’ll look at colours in R graphics produced with ggplot2: what are the available choices of colour schemes, and how to choose a colour palette most suitable for a particular graphic? In kindergarten, choosing a co...

Read more »

Example 9.33: Multiple imputation, rounding, and bias

May 29, 2012

Nick has a paper in the American Statistician warning about bias in multiple imputation arising from rounding data imputed under a normal assumption. One example where you might run afoul of this is if the data are truly dichotomous or count variables, but you model them as normal (either because your software is unable to model dichotomous...

Read more »

knitr, Slideshows, and Dropbox

May 29, 2012

I just noticed that Markus Gesmann has a nice post on using RStudio, knitr, Pandoc, and Slidy to create slideshows. After my recent attempt to use deck.rb to turn a Markdown/knitr file into a deck.js presentation I caved in and also decided to go with ...

Read more »

Interactive HTML presentation with R, googleVis, knitr, pandoc and slidy

May 29, 2012

Tonight I will give a talk at the Cambridge R user group about googleVis. Following my good experience with knitr and RStudio to create interactive reports, I thought that I should try to create the slides in the same way as well. Christopher Gandrud's...

Read more »

Backtesting Classical Technical Patterns

May 28, 2012

In the last post, Classical Technical Patterns, I discussed the algorithm and pattern definitions presented in the paper Foundations of Technical Analysis by A. Lo, H. Mamaysky, and J. Wang (2000). Today, I want to check how different patterns performed historically using SPY. I will follow the rolling window procedure discussed on pages 14-15 of the

Read more »

End of May flotsam

May 28, 2012

The end is near! At least the semester is coming to an end, so students have crazy expectations like getting marks back for assignments, and administrators want to see exam scripts. Sigh! What has been happening meanwhile in Quantum Forest? … Continue reading →

Read more »