\verbatim [beamer package]

June 11, 2012

Once again working on my slides for the AMSI Lecture 2012 tour, it took me a while to get the following LaTeX code (about the family reunion puzzle) to work: \begin{frame} \slidetitle{A family meeting} \begin{block}{Random switch of couples} \only<1>{ \begin{itemize} \item Pick two couples at random with probabilities proportional to the

Read more »

Should I adjust the slope?

June 11, 2012

I added a new video, “Should I adjust the slope”, in which a new part of the script is added to the monitor function. I don't recommend adjusting the slope, but there are circumstances where it is necessary: Suppose you have an equation, but not the ca...

Read more »

Do you still have time to sleep ?

June 11, 2012

Last week, @3wen (Ewen) helped me to write nice R functions to extract tweets in R and build datasets containing a lot of information. I've tried a couple of times on my own: once on tweet contents, but it was not convincing, and once on the activit...

Read more »

Time series cross-validation 4: forecasting the S&P 500

June 11, 2012

I finally got around to publishing my time series cross-validation package to GitHub, and I plan to push it out to CRAN shortly. You can clone the repo using GitHub for Mac, for Windows, or Linux, and then run the following script to...

Read more »

Data distillation with Hadoop and R

June 11, 2012

We're definitely in the age of Big Data: today, there are many more sources of data readily available to us to analyze than there were even a couple of years ago. But what about extracting useful information from novel data streams that are often noisy and minutely transactional ... aye, there's the rub. One of the great things about...

Read more »

The effect of blockbuster projects on kickstarter pledges (via…

June 11, 2012

The effect of blockbuster projects on kickstarter pledges (via Blockbuster Effects » The Kickstarter Blog — Kickstarter)

Read more »

Simulating Euro 2012

June 11, 2012

Why settle for just one realisation of this year’s UEFA Euro when you can let the tournament play out 10,000 times in silico? Since I already had some code lying around from my submission to the Kaggle hosted 2010 Take on the Quants challenge, I figured I’d recycle it for the Euro this year. The

Read more »

Autoplot: Graphical Methods with ggplot2

June 11, 2012

Background As of ggplot2 0.9.0 released in March 2012, there is a new generic function autoplot.  This uses R's S3 methods (which is essentially oop for babies) to let you have some simple overloading of functions.  I'm not going to get deep into oop, because honestly we don't need to. The idea is very simple.  If I say "I'm...

Read more »
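
The dispatch idea behind a generic such as autoplot can be sketched outside R. The following is only an analogy, not ggplot2 code: it uses Python's `functools.singledispatch`, and the function bodies and return strings are made up for illustration. One generic name picks an implementation from the class of its first argument, which is roughly what an S3 generic does.

```python
from functools import singledispatch


@singledispatch
def autoplot(obj):
    # Fallback, comparable to having no method for the object's class.
    raise NotImplementedError(f"no autoplot method for {type(obj).__name__}")


@autoplot.register
def _(obj: list):
    # Hypothetical method for list input.
    return f"line plot of {len(obj)} points"


@autoplot.register
def _(obj: dict):
    # Hypothetical method for dict input.
    return f"bar chart of {len(obj)} categories"


print(autoplot([1, 2, 3]))
print(autoplot({"a": 1, "b": 2}))
```

The caller always writes `autoplot(x)`; supporting a new kind of object means registering one more method rather than editing the generic.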

Random regression coefficients using lme4

June 11, 2012

What's the gain over lm()? By Ben Ogorek. Random effects models have always intrigued me. They offer the flexibility of many parameters under a single unified, cohesive and parsimonious system. But with the growing size of data sets and increased ability to estimate many parameters with a high level of accuracy, will the subtleties of the random effects analysis be lost? In this...

Read more »

Binomial Pricing Trees in R

Binomial Tree Simulation The binomial model is a discrete grid generation method from \(t=0\) to \(T\). At each point in time (\(t+\Delta t\)) we can move up with probability \(p\) and down with probability \((1-p)\). As the probability of an … Continue reading →

Read more »
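
The up/down grid described in the binomial teaser can be sketched in a few lines. This is a minimal illustration, not the post's own code. It assumes the Cox-Ross-Rubinstein parameterisation (`u = exp(sigma * sqrt(dt))`, `d = 1/u`, risk-neutral `p`), which the excerpt does not state, and all function and parameter names are hypothetical.

```python
import math


def binomial_lattice(s0, u, d, steps):
    """lattice[i][j] is the price after i steps, j of which were up-moves."""
    return [[s0 * u**j * d**(i - j) for j in range(i + 1)]
            for i in range(steps + 1)]


def european_call_price(s0, strike, rate, sigma, maturity, steps):
    """Price a European call by backward induction on the lattice."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))      # assumed CRR up factor
    d = 1.0 / u                              # assumed CRR down factor
    p = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability
    disc = math.exp(-rate * dt)

    # Payoffs at maturity, indexed by the number of up-moves.
    values = [max(s0 * u**j * d**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Roll back one time step at a time to t = 0.
    for i in range(steps, 0, -1):
        values = [disc * (p * values[j + 1] + (1 - p) * values[j])
                  for j in range(i)]
    return values[0]


print(european_call_price(100, 100, 0.05, 0.2, 1.0, 200))
```

With more steps the price settles toward the continuous-time value; the step count trades accuracy for run time.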

Universal portfolio, part 6

June 10, 2012

The final table in Universal Portfolios introduces leverage. It also indirectly shows the dangers of rebalancing on margin: while Kin Ark increases 4.2 times, at 50% margin it goes to nothing. The code below reproduces Table 8.4, again a...

Read more »

R becomes a critical tool in government departments

June 10, 2012

Situation and Outlook for Primary Industries (2012) just published by New Zealand’s Ministry for Primary Industries (click to download page) demonstrates well that R is a limitless tool for analysis and graphing, and the capability of using R is growing in … Continue reading →

Read more »

An R function for finding coordinates of NZ localities

June 10, 2012

Over the course of my PhD, I will be doing a fair amount of georeferencing. This involves obtaining geographic coordinates for localities where weevil specimens have been collected. When I'm the one who has collected them, this is fairly straightforward—Google Maps has made obtaining coordinates a breeze. When it's a museum specimen, however, things get a little tricky....

Read more »

R/Python Web Apps

June 10, 2012

I have been a little delinquent on this whole blogging thing, but here is a talk I gave to the DC R Group on a Twisted and Rpy2 web application framework that I built for my company. Enjoy: http://bit.ly/NW0Neg

Read more »

FloraWeb Plant Species Report via R

June 10, 2012

For German-speaking users, I added the function floraweb_scrape.R, which lets you conveniently collect species data and print it to a PDF file (see this example output). The function accesses data provided by the website FloraWeb.de (BfN - Bundesamt für Naturschutz). You can use it as an interactive version (RTclTk), which I have put in a GitHub repository

Read more »

Classifying the UCI mushrooms

In my last post, I considered the shifts in two interestingness measures as possible tools for selecting variables in classification problems.  Specifically, I considered the Gini and Shannon interestingness measures applied to the 22 categorical mushroom characteristics from the UCI mushroom dataset.  The proposed variable selection strategy was to compare these values when computed from only edible mushrooms...

Read more »

Testing recommender systems in R

June 10, 2012

Recommender systems are pervasive. You have encountered them while buying a book on barnesandnoble, renting a movie on Netflix, listening to music on Pandora, or finding a bar to visit (FourSquare). Saar, for Revolution Analytics, has demonstrated how to get started with some techniques for R here. We will build some using Michael Hahsler’s excellent package

Read more »

Universal portfolio, part 5

June 9, 2012

The first three tables in Universal Portfolios present the same information in numerical form as some of the plots. The following code generates all three tables by defining a function and then calling it with suitable parameters. Th...

Read more »

ggplot2: Creating a custom plot with two different geoms

June 9, 2012

This past week for work I had to create some plots to show the max, min, and median of a measure across the levels of a qualitative variable, and show the max and min of the same variable within a … Continue reading →

Read more »

LondonR meeting (June 19th)

June 9, 2012

Mango Solutions announces the next LondonR meeting, which will take place on June 19th. The meeting is free and open to anyone interested in R. If you would like to attend, please register in advance via email to [email protected]
Date: Tuesday 19th June 2012
Venue: The Counting House, 50 Cornhill, London EC3V 3PD (note change of usual...

Read more »

Rcpp vs. R implementation of cosine similarity

June 9, 2012

While speeding up some code the other day for a project with a colleague, I ended up trying Rcpp for the first time. I re-implemented the cosine distance function using RcppArmadillo relatively easily, using bits and pieces of code I found scattered around the web. But the speed increase was not as much as I expected comparing the...

Read more »
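
For readers who want the quantity being benchmarked, here is a minimal pure-Python sketch of cosine similarity. It is not the post's R or RcppArmadillo code; the function name and the error handling are illustrative assumptions. Cosine distance is commonly taken as one minus this value.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # parallel vectors
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors
```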

I’m following you in Twitter…are you following me back?

If you spend some time on Twitter, you probably have some followers and some people that you follow... The more time you spend, the more people you're going to interact with... Sometimes you just realize that you're following so many people that might o...

Read more »

Project Euler — problem 8

June 9, 2012

The eighth problem of Project Euler: Find the greatest product of five consecutive digits in the 1000-digit number. … The solution is as straightforward as the problem, although the 1000-digit number needs some format changes before product calculation. … Continue reading →

Read more »
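
The sliding-window idea behind problem 8 can be sketched as follows. This is an illustrative Python version, not the post's R solution. The 1000-digit number is not reproduced here, so the sample string is a short made-up input, and the function name is hypothetical. The "format change" is assumed to mean stripping line breaks and other non-digit characters.

```python
def greatest_window_product(digits, window=5):
    """Largest product of `window` consecutive digits in a digit string."""
    # Format change: drop whitespace and any other non-digit characters.
    cleaned = "".join(ch for ch in digits if ch.isdigit())
    if len(cleaned) < window:
        raise ValueError("not enough digits for one window")

    best = 0
    for start in range(len(cleaned) - window + 1):
        product = 1
        for ch in cleaned[start:start + window]:
            product *= int(ch)
        best = max(best, product)
    return best


sample = "36753\n56291"  # made-up digits, split across lines on purpose
print(greatest_window_product(sample))  # → 3150
```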

Converting Sweave LaTeX to knitr LaTeX: A case study

June 9, 2012

The following post documents the steps I needed to take in order to convert a project using Sweave LaTeX into one using knitr LaTeX. Additional Resources It is fairly straightforward to convert a document from Sweave LaTeX to knitr LaTeX. Yihui Xie on...

Read more »

NBA Playoffs Update 5 (5-4)

June 9, 2012

This is the sixth post in my series on predicting the NBA playoffs with an algorithm. After the Boston loss in their last game, the algorithm is now 5-4 in the playoffs. Hopefully it is correct tonight! Open Sourcing the Code: I have had a couple of re...

Read more »

Visualizing Euro 2012 with ggplot2

June 9, 2012

After scanning this paper by Zeileis, Leitner & Hornik, I thought it would be interesting to see how the victory odds for each team change as Euro 2012 progresses. To do this, I am going to collect the daily inverse odds of a tournament victory offered by a popular betting site for each team. Here

Read more »

knitr Performance Report 4

June 8, 2012

Please see knitR Performance Report 3 (really with knitr) and dprint, knitr Performance Report–Attempt 3, knitr Performance Report-Attempt 2 and knitr Performance Report-Attempt 1. Here is another iteration of the ongoing performance reporting attempt...

Read more »