The truncated Poisson

February 21, 2010

A common model for count data is the Poisson. In some cases, however, we record only positive counts, i.e. the zeros are truncated. This is the truncated Poisson model. To study this model we need only the total count and the sample size. This follows from the sufficiency principle, as the
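
As a rough illustration of why the total count and the sample size are enough: for a zero-truncated Poisson, the maximum likelihood estimate of the rate solves mean = lambda / (1 - exp(-lambda)), where mean = total / n. The sketch below is in Python rather than R, and the function name truncated_poisson_mle and the choice of a bisection solver are illustrative assumptions, not taken from the original post.

```python
import math


def truncated_poisson_mle(total, n, tol=1e-12, max_iter=200):
    """Estimate the rate of a zero-truncated Poisson from (total, n).

    total: sum of the observed counts (each count is at least 1)
    n:     number of observations
    The function name and solver are hypothetical, chosen for illustration.
    """
    if n <= 0 or total < n:
        raise ValueError("need n > 0 and total >= n")
    mean = total / n
    if mean == 1.0:
        # Every count equals 1: the likelihood is maximised as lambda -> 0.
        return 0.0
    # g(lam) = lam / (1 - exp(-lam)) is increasing, tends to 1 as lam -> 0
    # and exceeds lam, so the root of g(lam) = mean lies in (0, mean).
    lo, hi = 0.0, mean
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        # -expm1(-mid) equals 1 - exp(-mid), computed accurately for small mid.
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


# Example: 10 positive counts summing to 30 (sample mean 3).
lam_hat = truncated_poisson_mle(total=30, n=10)
print(lam_hat)
```

The estimate comes out a little below the sample mean, as expected: discarding the zeros inflates the observed mean relative to the underlying rate.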

Visual Interpretation of Principal Coordinates (of) Neighbor Matrices (PCNM)

February 21, 2010

Principal Coordinates (of) Neighbor Matrices (PCNM) is an interesting algorithm, developed by D. Borcard and P. Legendre at the University of Montreal, for the multi-scale analysis of spatial structure. This algorithm is typically applied to a distance matrix, computed from the coordinates where some environmental data were collected. The resulting "PCNM vectors" are commonly used to describe...

Uh!

February 20, 2010

Didn't know this... a data 0 2 4 7+ 25 34 12 5 It's becoming clear that I have learned R in the most unstructured way...I always do it in two stages :ashamed:

Design of Experiments – Block Designs

February 20, 2010

In many experiments where the investigator is comparing a set of treatments, there may be one or more sources of variability in the experimental measurements that can be accounted for at the design stage. For example, we might be investigating four different pieces of machinery using, say, two different operators,

Does a Proclamation of Increased Workout Load Matter?

February 20, 2010

I forgot to link this up, but I have a new article (joint with our editor) over at Fantasy Ball Junkie. I run an extremely crude model to see if players who were mentioned in the media as having lost weight, gained muscle, gained speed, got eye surgery...

Genetic Algorithm Systematic Trading Development — Part 3 (Python/VBA)

February 20, 2010

As mentioned in prior posts, it is not possible to use the standard Weka GUI to instantiate a Genetic Algorithm, other than for feature selection. Part of the reason is that there is no generic algorithm to instantiate a fitness function. The same fl...

lme4 stands 4 Linear mixed-effects…

February 19, 2010

There is a certain hype about mixed (and random) effects among statisticians and analysts. You can show some love to Douglas Bates and Martin Maechler for maintaining the lme4 package for our cupid, R. I copy the entirety of the information from the project's page. Doxygen documentation of the underlying C functions is here. The

R exam postprocessing

February 19, 2010

Following my three-fold R exam of last month, I had a depressing afternoon meeting (with other faculty members) some students who had submitted R codes that were suspiciously close to other submitted R codes… In other words, it looked very likely they had cheated. (A long-term issue with my R course, alas!) During this meeting,

Where did all the bankers go?

February 19, 2010

When Lehman Brothers, Bear Stearns and Merrill Lynch went kablooie in the financial crisis, what happened to all their employees? Thanks to the magic of LinkedIn data, their Chief Scientist DJ Patil can answer that question: they went to the surviving banks: It's a great, if tantalizingly incomplete visualization -- I'd love to see this with "Other (non-bank) employers"...

How to call C++ from R with ease

February 19, 2010

At last night's meeting of the ACM Student Chapter at the University of Chicago, Dirk Eddelbuettel gave an invited guest lecture, "Programming with Data: Using and Extending R". I wasn't there myself, but Dirk has already posted his slides, and they're a treat. After a backgrounder on R itself (BTW, I'm flattered he referenced my Introduction to R talk...

Newspaper flubs probability calculation

February 19, 2010

That headline's right up there with "Dog Bites Man" for shock value, but the Daily Express in the UK isn't one to let mere probability stand in the way of a sensational headline like "Mum beats odds of 50 million to one to have 3 babies on same date". As Ben Goldacre helpfully explains, the probability is actually a...

U of C ACM talk

February 18, 2010

Fellow GSoC mentor and local ACM masterminder Borja Sotomayor had invited me a few months ago to give a talk at the ACM chapter at the University of Chicago. Today was the day, and the slides from the 50-minute talk on R and extending R with Rcpp are ...

Corruption indicators in Mexico

February 18, 2010

As you can see there is only a slight positive correlation between the corruption index of the Mexican chapter of Transparency International and the percentage of students who cheated on the Grade 6 ENLACE test*. What I find surprising is that there is...

Joining R-bloggers

February 18, 2010

At the request of the blog administrator, Tal Galili, I have joined R-bloggers, which aggregates blog entries about R in a central place. I feel I have much more to learn than to teach about R (as can be seen from earlier comments on my R programs in Introducing Monte Carlo Methods with R). As I

Press Enter in LyX Sweave as You Wish

February 18, 2010

For a long time I’ve been wondering why we are not able to use Enter in the LyX Scrap environment which was set up by Gregor Gorjanc for Sweave. Two weeks ago, I (finally!) could not help asking Gregor about this issue, as I’m using “LyX + Sweave” more and more in my daily work.

R IDE and debugger now available for 64-bit Windows; Webinar Tuesday

February 18, 2010

We've just upgraded REvolution R Enterprise to version 3.1 and expanded the available platforms to include 64-bit Windows. (REvolution R Enterprise is our subscription-based distribution of R.) This means that you can now create R programs on Windows that use all of your available memory, instead of being constrained by the 3 GB limit imposed by 32-bit versions of Windows....

SPSS Co-Founder Tex Hull Joins REvolution Computing

February 18, 2010

We're proud to announce that Tex Hull, who together with REvolution CEO Norman Nie created the first version of SPSS, has joined the REvolution team. Tex will be working with Norman and our CTO David Champagne to take REvolution R Enterprise to the next level, specifically to improve its scalability to handle very large data sets. You can read...

Gas price seasonality

February 18, 2010

Last spring I read “Quantitative Trading” by Ernest P. Chan. In his book, he suggested buying a gas futures contract at the end of February and selling it later, in March. Today, I decided to test this strategy using R. The most important thing for such an investigation is data. For this purpose, I used this

Analysis of Winter Olympic Medal Data Using R

February 18, 2010

The Winter Olympics are on. The Guardian's DataBlog has graciously compiled a database on Winter Olympic Medals. Thus, I thought I'd run a few quick analyses on the data in R. In this post I was hoping to show how one could quickly churn out ...

raster images and RImageJ

February 18, 2010

The next version of R includes support for raster images in standard and grid graphics. The RImageJ package uses ImageJ through rJava to read and manipulate images from various formats. Paul Murrell closed the gap and contributed code that allows...

Genetic Algorithm Systematic Trading Development – Part 2

February 17, 2010

We started by discussing the goal of a genetic algorithm, which is to optimally find the candidate pool of rules that are superior to other potential rules. In our example of moving averages, we are seeking the values of parameters of the rule: if ma(...

R project named in Intelligent Enterprise 2010 Editor’s Choice Awards

February 17, 2010

Intelligent Enterprise has announced its 2010 "Editors Choice" Awards, and the R project is included as one of twelve "Companies to Watch" in the Business Intelligence category. R Project is an open-source statistical programming environment that is winning broad praise and accelerating uptake as a language for in-database analytics. The likes of SAS, SPSS and Information Builders are even...

Real-World, Real-Time Analytics

February 17, 2010

Stop wasting time reading my drivel. You need to head over to the DataWrangling.com blog and read Peter Skomoroch’s interview with Bradford Cross of FlightCaster. Peter wrote up this interview back in August 2009, so I’m a little late to this party. There are some really great quotes in this interview. Here’s a few of my fav

hash-1.99.x

February 17, 2010

hash-2.0.0 has been released; please read about it here: Earlier today, hash-1.99.x was released to CRAN. This is a stable release and adds some more functions to an already full-featured hash implementation. This version fixes some bugs, adds some features, and improves performance and stability. You can read about the hash package in

Springer solution manuals on line

February 17, 2010

Springer Verlag has just posted on its webpage both the student and the instructor solution manuals to “Introducing Monte Carlo Methods with R”. Yes, both! Before you rush there, the Catch-22 in this announcement is that the access to the instructor version is restricted to registered instructors. So, if you are registered as an instructor

Visualize dynamic data from R in 3d

February 17, 2010

In this video I demonstrate a nice feature of Bio7 to visualize 3d data created in “R” dynamically. The data for the points is generated in “R” and then transferred to the OpenGL view of Bio7. In the first example a random plot is generated and updated. In the second example 10000 random (lighted) spheres

Generalized linear mixed effect model problem

February 16, 2010

I am trying to compare cohort differences in infant mortality using a generalized linear mixed model. I first estimated the model in Stata: xi: xtlogit inftmort i.cohort, i(code) which converged nicely: Fitting comparison model: Iteration 0: log likelih...
