R exam postprocessing

February 19, 2010

Following my three-fold R exam of last month, I had a depressing afternoon meeting (with other faculty members) some students who had submitted R code that was suspiciously close to other submitted R code… In other words, it looked very likely they had cheated. (A long-term issue with my R course, alas!) During this meeting,

Where did all the bankers go?

February 19, 2010

When Lehman Brothers, Bear Stearns and Merrill Lynch went kablooie in the financial crisis, what happened to all their employees? Thanks to the magic of LinkedIn data, their Chief Scientist DJ Patil can answer that question: they went to the surviving banks. It's a great, if tantalizingly incomplete, visualization -- I'd love to see this with "Other (non-bank) employers"...

How to call C++ from R with ease

February 19, 2010

At last night's meeting of the ACM Student Chapter at the University of Chicago, Dirk Eddelbuettel gave an invited guest lecture, "Programming with Data: Using and Extending R". I wasn't there myself, but Dirk has already posted his slides, and they're a treat. After a backgrounder on R itself (BTW, I'm flattered he referenced my Introduction to R talk...

Newspaper flubs probability calculation

February 19, 2010

That headline's right up there with "Dog Bites Man" for shock value, but the Daily Express in the UK isn't one to let mere probability stand in the way of a sensational headline like "Mum beats odds of 50 million to one to have 3 babies on same date". As Ben Goldacre helpfully explains, the probability is actually a...
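
The arithmetic behind a headline like this can be checked in a few lines. The following is only a rough sketch, assuming 365 equally likely birthdays and independent births (both simplifications; the function names are made up for illustration). It contrasts the chance that three siblings share some date with the chance that all three land on one date fixed in advance, which is roughly where a "50 million" figure would come from.

```python
from fractions import Fraction

DAYS = 365  # assumption: no leap years, every date equally likely


def shared_date_probability(children: int, days: int = DAYS) -> Fraction:
    """Chance that all children share a birthday, whatever the date is."""
    # The first child fixes the date; each later child must match it.
    return Fraction(1, days) ** (children - 1)


def fixed_date_probability(children: int, days: int = DAYS) -> Fraction:
    """Chance that all children are born on one date chosen in advance."""
    return Fraction(1, days) ** children


p_shared = shared_date_probability(3)
p_fixed = fixed_date_probability(3)

print(f"some shared date:   1 in {p_shared.denominator:,}")
print(f"one pre-chosen date: 1 in {p_fixed.denominator:,}")
```

Under these assumptions the two answers differ by a factor of 365, which is the kind of gap that separates a striking coincidence from a sensational headline.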

U of C ACM talk

February 18, 2010

Fellow GSoC mentor and local ACM masterminder Borja Sotomayor had invited me a few months ago to give a talk at the ACM chapter at the University of Chicago. Today was the day, and the slides from the 50-minute talk on R and extending R with Rcpp are ...

Corruption indicators in Mexico

February 18, 2010

As you can see there is only a slight positive correlation between the corruption index of the Mexican chapter of Transparency International and the percentage of students who cheated on the Grade 6 ENLACE test*. What I find surprising is that there is...

Joining R-bloggers

February 18, 2010

Upon request by the blog administrator, Tal Galili, I have joined R-bloggers, which aggregates blog entries about R into a central place. I feel I have much more to learn than to teach about R (as can be seen from earlier comments on my R programs in Introducing Monte Carlo Methods with R). As I

Press Enter in LyX Sweave as You Wish

February 18, 2010

For a long time I’ve been wondering why we are not able to use Enter in the LyX Scrap environment which was set up by Gregor Gorjanc for Sweave. Two weeks ago, I (finally!) could not help asking Gregor about this issue, as I’m using “LyX + Sweave” more and more in my daily work.

R IDE and debugger now available for 64-bit Windows; Webinar Tuesday

February 18, 2010

We've just upgraded REvolution R Enterprise to version 3.1 and expanded the available platforms to include 64-bit Windows. (REvolution R Enterprise is our subscription-based distribution of R.) This means that you can now create R programs on Windows that use all of your available memory, instead of being constrained by the 3 GB limit imposed by 32-bit versions of Windows....

SPSS Co-Founder Tex Hull Joins REvolution Computing

February 18, 2010

We're proud to announce that Tex Hull, who together with REvolution CEO Norman Nie created the first version of SPSS, has joined the REvolution team. Tex will be working with Norman and our CTO David Champagne to take REvolution R Enterprise to the next level, specifically to improve its scalability to handle very large data sets. You can read...

Gas price seasonality

February 18, 2010

Last spring I read “Quantitative Trading” by Ernest P. Chan. In his book, he suggests buying a gas futures contract at the end of February and selling it later, in March. Today, I decided to test this strategy using R. The most important thing for such an investigation is data. For this purpose, I used this

Analysis of Winter Olympic Medal Data Using R

February 18, 2010

The Winter Olympics are on. The Guardian's DataBlog has graciously compiled a database on Winter Olympic Medals. Thus, I thought I'd run a few quick analyses on the data in R. In this post I was hoping to show how one could quickly churn out ...

raster images and RImageJ

February 18, 2010

The next version of R includes support for raster images in standard and grid graphics. The RImageJ package uses ImageJ through rJava to read and manipulate images from various formats. Paul Murrell closed the gap and contributed code that allows...

Genetic Algorithm Systematic Trading Development – Part 2

February 17, 2010

We started by discussing the goal of a genetic algorithm, which is to optimally find the candidate pool of rules that are superior to other potential rules. In our example of moving averages, we are seeking the values of parameters of the rule: if ma(...

R project named in Intelligent Enterprise 2010 Editor’s Choice Awards

February 17, 2010

Intelligent Enterprise has announced its 2010 "Editors Choice" Awards, and the R project is included as one of twelve "Companies to Watch" in the Business Intelligence category. R Project is an open-source statistical programming environment that is winning broad praise and accelerating uptake as a language for in-database analytics. The likes of SAS, SPSS and Information Builders are even...

Real-World, Real-Time Analytics

February 17, 2010

Stop wasting time reading my drivel. You need to head over to the DataWrangling.com blog and read Peter Skomoroch’s interview with Bradford Cross of FlightCaster. Peter wrote up this interview back in August 2009, so I’m a little late to this party. There are some really great quotes in this interview. Here are a few of my fav

hash-1.99.x

February 17, 2010

hash-2.0.0 has been released; please read about it here. Earlier today, hash-1.99.x was released to CRAN. This is a stable release and adds some more functions to an already full-featured hash implementation. This version fixes some bugs, adds some features, and improves performance and stability. You can read about the hash package in

Springer solution manuals on line

February 17, 2010

Springer Verlag has just posted on its webpage both the student and the instructor solution manuals to “Introducing Monte Carlo Methods with R”. Yes, both! Before you rush there, the Catch-22 in this announcement is that the access to the instructor version is restricted to registered instructors. So, if you are registered as an instructor

Visualize dynamic data from R in 3d

February 17, 2010

In this video I demonstrate a nice feature of Bio7 to visualize 3d data created in “R” dynamically. The data for the points is generated in “R” and then transferred to the OpenGL view of Bio7. In the first example a random plot is generated and updated. In the second example 10000 random (lighted) spheres

Generalized linear mixed effect model problem

February 16, 2010

I am trying to compare cohort differences in infant mortality using a generalized linear mixed model. I first estimated the model in Stata:

xi: xtlogit inftmort i.cohort, i(code)

which converged nicely:

Fitting comparison model:
Iteration 0:   log likelih...

How to make a mosaic plot in R

February 16, 2010

Mosaic plots (aka treemaps) are a great way to visualize hierarchical data. A collection of rectangles represents all the elements to be visualized (customers, news items, blog posts), with the size and color of the rectangles coding attribute. But what makes this chart unique is the arrangement of the elements: where there is hierarchy (customer segments, news topics, post...

You can Hadoop it! It’s elastic! Boogie woogie woog-ie!

February 16, 2010

I just came back from the future and let me be the first to tell you this: Learn some Chinese. And more than just cào nǐ niáng (肏你娘) which your friend in grad school told you means “Live happy with many blessings”. Trust me, I’ve been hanging with Madam Wu and she told me

For fun: Correlation of US State with the number of clicks on online banners

February 16, 2010

“Chitika research” published today a fun small dataset (you can download it from here) in a post titled “The Educated are Harder to Advertise To”. In this post I had three goals in mind: suggesting another plot instead of the one used in the original post; emphasizing the “Correlation does not imply causation” rule; and inviting other R lovers (like myself) to find fun...

A Case Study in Optimising Code in R

February 16, 2010

This post presents an experience I had optimising the efficiency of code for a data analysis task in R. I'm not an expert in programming nor code optimisation. However, I thought my experience might make an interesting case study for others at a simila...

Sugar price seasonality

February 16, 2010

Recently, Orion Securities issued a “BUY” recommendation for a sugar ETF. Because I neither follow such recommendations nor am a big fan of TA (I have to admit that I was…), I decided to check sugar price seasonality. Voilà, the mean monthly returns are presented in the graph. February, April and May tend to be negative
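
For readers who want to try a seasonality check of this kind themselves, here is a minimal sketch in Python. It is not the post's own code: the function name, the input layout (month-end prices as `(year, month, close)` tuples) and the toy prices are all assumptions made for illustration. It computes simple month-over-month returns and averages them by calendar month.

```python
from collections import defaultdict
from statistics import mean


def mean_return_by_month(prices):
    """Average simple return per calendar month.

    `prices` is a chronologically ordered list of (year, month, close)
    month-end observations with no gaps (hypothetical input layout).
    """
    returns = defaultdict(list)
    # Pair each observation with the previous one; the return is
    # credited to the calendar month in which the period ends.
    for (_, _, prev_close), (_, month, close) in zip(prices, prices[1:]):
        returns[month].append(close / prev_close - 1.0)
    return {month: mean(values) for month, values in sorted(returns.items())}


# Made-up prices, for illustration only (not real sugar data).
toy_prices = [
    (2008, 12, 100.0),
    (2009, 1, 110.0),
    (2009, 2, 99.0),
    (2009, 3, 99.0),
]

seasonality = mean_return_by_month(toy_prices)
for month, avg in seasonality.items():
    print(f"month {month:2d}: mean return {avg:+.2%}")
```

With only a handful of years per calendar month, such averages are noisy, so a negative mean for a given month is a hint to investigate rather than a tradable signal.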

R Web Application – “Hello World” using RApache (~7min video tutorial)

February 16, 2010

I just noticed a Google Buzz post from Jeroen Ooms, with a YouTube video titled “RApache Hello World + POST arguments + catching errors.” In this ~7 min video tutorial, Jeroen shares with us: how to write “Hello World” in a website using RApache; how to extract arguments from a form submitted by the website visitor (and then inserting them into an “rnorm” function...

Using Google Reader

February 15, 2010

Google Reader is a fantastic way to keep track of new papers that are appearing in many different journals, and also to follow some of the interesting research blogs (and blogs on other topics) that are out there. Google Reader checks websites for you and lets you know of any new material that appears. Instead of