11 Million Yellow Slips – City of Toronto Parking Tickets, 2008-2011

June 2, 2012

Introduction: I don't know about you, but I really hate getting parking tickets. Sometimes I feel like it's all just a giant cash grab. Really? I can't park there between the hours of 11 and 3, but every other time is okay? Well, why the hell not? Bu...

Visualizing car brand choices in ggplot2

June 2, 2012

I always like to read new posts at chartsnthings, as they always inspire me with new ideas for data visualization. Yesterday I read an article in Gazeta.pl on the choices of car brands by members of parliament in Poland. It contains a simple ...

Calling R from SAS IML Studio

June 1, 2012

I am playing around with SAS IML Studio 3.4. For those who do not know, IML (Interactive Matrix Language) is the MATLAB-esque language from SAS. It operates from within normal SAS code through the PROC IML procedure. A new (to me at least) UI called IML Studio has been developed for analysts. IML Studio uses a superset of the IML...

Double Release: Rook 1.0-5 and rApache 1.1.21

June 1, 2012

I wanted to get these releases out the door before the useR conference, especially since I’m giving a tutorial on Rook! There were two bugs found in Rook, not by me, but by users! This is a fantastic step in the right direction as it’s an ...

System from Trend Following Factors

June 1, 2012

As I thought more about Trend Following Factors from Hsieh and Fung, I thought that the trend following factors might indicate a state/regime for the equity markets that could potentially offer momentum-style timing signals for a system on the S&P ...

Distribution of Oft-Used Bash Commands

June 1, 2012

Browsing commandlinefu.com today, I came across this little one-liner to display which commands I use most often. Here’s what I got: Yep, seems legit. I navigate and look at files a whole bunch (ls, cd, cat), and I do a butt tonne of editing (vim). I sudo like a boss, hop onto various servers (ssh),

Life contingencies with R

June 1, 2012

In less than four weeks I will be giving a short course at the 6th R/Rmetrics Meielisalp Workshop & Summer School on Computational Finance and Financial Engineering, organized by ETH Zürich, https://www.rmetrics.org/. The talk will be on actuarial models with R, and the first part will be dedicated to life insurance. A complete set of slides can be downloaded from the...

The influences that shaped R: Inferno-ish R

June 1, 2012

Patrick Burns, author of the excellent R Inferno, gave a presentation about R at the Cambridge R User Group this week. (Revolution Analytics is a proud sponsor of CambR.) I wasn't at the presentation myself, but Pat always gives a great talk, and he's generously provided his slides with copious notes. They're definitely worth a read if you're interested...

Predicting NBA Playoff Games – Results and Update 1

June 1, 2012

Game Results: I recently made a post about developing an algorithm to predict the NBA playoffs, and I concluded with 2 predictions. Although Miami beat the Celtics to make my algorithm 1-0 in terms of predictions, it fell to 1-1 when the Thunder beat th...

Selection in R

June 1, 2012

The design of the statistical programming language R sits in a slightly uncomfortable place between the functional programming and object oriented paradigms. The upside is that you get a lot of the expressive power of both programming paradigms. A downside is the not-always-useful variability of the language’s list and object extraction operators.

Color Tools

June 1, 2012

Colors are a mysterious business, and choosing the right color or color palette for R graphics can sometimes be a chore. To alleviate this burden, I present two functions for choosing color values. To access these functions, install the RSurvey and dichromat packages:

install.packages(c("RSurvey", "dichromat"))
library(RSurvey)

Choose Color

The first function, ChooseColor (source), calls upon a graphical user interface (GUI)...

Computational Journalism Server Version 1.6.5 Released

May 31, 2012

I’ve just released version 1.6.5 of the Computational Journalism Server. This is going to be the last release for a while. Release notes: I removed CoffeeScript and Node.js. I wasn’t using them. I dropped back to Erlang R14B-1.1. Everything tes...

R Tops Data Mining Software Poll

May 31, 2012

For the past 12 years, KDnuggets has conducted an annual poll asking "What analytics/data mining software you used in the past 12 months for a real project (not just evaluation)". In this year's poll, R was the top-ranked data mining solution, selected by 30.7% of poll respondents. Microsoft Excel was second, at 29.8%. RapidMiner, which took the #1 spot...

Term structure of interest rate spread volatility : Unit root test

Recently, I was working on my master's thesis and came across an interesting observation regarding the term structure of interest rate spread volatility that I wish to share. Let me first try and throw some light on the jargon that I have used. To begi...

Conditional Drawdown Exploration

May 31, 2012

After reading Strub, Issam S., Trade Sizing Techniques for Drawdown and Tail Risk Control (May 21, 2012), I thought I should try to tie this with 2 other good R pieces on Conditional Drawdown: http://systematicinvestor.wordpress.com/2011/11/01/minimiz...

Poll Shows Open Source Almost Even with Commercial Analytics Software

May 31, 2012

The 2012 results of the annual KDnuggets poll are in. It shows R in first place with 30.7% of users reporting having used it for a real project. Excel is almost as popular. It seems out of place among so …

Using R.Net in an Excel Add in

May 31, 2012

I thought I’d try out R.Net, and in doing so I have put together a very simple Excel 2007 add-in that connects Excel to R. I’m using .NET 4.0 in Visual Studio 2010 Pro with the latest commit of R.Net, …

Simple Text Mining with R

May 31, 2012

I’ve used R for many purposes, and text mining is one of them. Below is a small snippet to get you started with R and text mining.

require(fortunes)
require(tm)
sentences <- NULL
for (i in 1:10) sentences <- c(sentences, fortune(i)$quote)
d <- data.frame(textCol = sentences)
ds <- DataframeSource(d)
dsc <- Corpus(ds)
dtm <- DocumentTermMatrix(dsc, control = list(weighting =

Inferno-ish R

May 31, 2012

CambR was nice enough to invite Markus Gesmann and me to speak at their event on Tuesday. My talk was Inferno-ish R. See also The R Inferno.

The Facebook Doomsday Watch

May 31, 2012

I've been following the circus of Facebook commentators and bystanders pointing to its horrific failed IPO launch and seemingly inevitable crash to zero. While my focus here isn't really on fundamentals or basic TA, I do want to comment ...

New Data Science Packages Coming To Computational Journalism Server

May 30, 2012

I’ve just received an announcement from Michael Lang that packages BatchJobs and BatchExperiments have been added to the Comprehensive R Archive Network (CRAN). From the announcement: The package BatchJobs implements the basic objects and procedu...

Converting cross sectional data with dates to weekly averages in R.

May 30, 2012

I was recently confronted with a problem where I had to compare two very different data sets. One data set was cross-sectional data observed with dates over the course of three months, and the other was weekly averages during those same three months. After a bit of research, I discovered
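
The excerpt stops before showing the author's approach, so the following is only a minimal base-R sketch of one common way to do this kind of conversion, not the method from the post. The data frame obs, its columns, and its values are made up for illustration. Each dated observation is assigned to the week it falls in with cut(), and the values are then averaged per week with aggregate().

```r
# Hypothetical dated observations spanning three weeks (made-up values).
obs <- data.frame(
  date  = as.Date("2012-03-05") + 0:20,  # 2012-03-05 is a Monday
  value = 1:21
)

# Label each observation with the start date of its week.
# For Date input, cut() starts weeks on Monday by default.
obs$week <- cut(obs$date, breaks = "week")

# Average the values within each week.
weekly <- aggregate(value ~ week, data = obs, FUN = mean)
weekly
```

The resulting weekly data frame has one row per week, which can then be lined up against a data set that is already reported as weekly averages.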

Online Course from Statistics.com: Advanced Programming in R

May 30, 2012

Hadley Wickham teaches “Programming in R – Advanced,” June 15 – July 13 online at Statistics.com. This is the third in a series of courses that cover programming in R, so if you are new to the subject you should start with our Jul 27 course “Introduction to R: Data Handling.” Upcoming Courses: Jun 15: Advanced Programming in R...

Predicting the NBA Finals with R

May 30, 2012

This is the initial post about the algorithm. See updates 1, 2, and 3 for more. The algorithm is currently 4-2 in the playoffs! Overview: I was struck by Martin O'Leary's recent post on predicting the Eurovision finals, which led me to decide that I wou...

Project Euler — problem 5

May 30, 2012

I spent around 40 minutes on the last post yesterday, which delayed my bedtime and left me sleepy in the morning. So, I’m starting to write earlier tonight. The fifth problem is to calculate the smallest common multiple of a given set of numbers. 2520 is …
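
The excerpt is cut off before any code, so this is a minimal R sketch of the idea behind the problem rather than the post's own solution. The smallest number divisible by every integer from 1 to n is the least common multiple of 1..n, which can be built up pairwise using the Euclidean algorithm. The helper names gcd, lcm and smallest_multiple are made up for illustration.

```r
# Greatest common divisor via the Euclidean algorithm.
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}

# Least common multiple of two positive integers.
# Dividing before multiplying keeps intermediate values small.
lcm <- function(a, b) a / gcd(a, b) * b

# Smallest positive number evenly divisible by every integer in 1..n.
smallest_multiple <- function(n) Reduce(lcm, seq_len(n))

smallest_multiple(10)  # 2520, the value quoted in the problem statement
```

Folding lcm over the sequence with Reduce() avoids any brute-force search over candidate numbers.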

R 2.15.1 scheduled for June 22

May 30, 2012

The next release of open-source R, codenamed "Roasted Marshmallows", is scheduled to be released on June 22, according to this announcement on the r-announce mailing list. Don't expect too many changes in this update: despite the fact that "there have been very few issues with 2.15.0 ... some people may be waiting superstitiously for a .1 release". This will...

Send emails with attachments from R command line

May 30, 2012

The sendmailR package makes it easy to send emails with attachments from the R command line.

#load package
library("sendmailR")
#use string formatting and your system info to format FROM address
from <- sprintf("<Project1@%s>", Sys.info()[...
