Data mining competition with R

October 8, 2010
By

There is a new data mining competition, announced via dataists.com, aimed at predicting preferred data mining tools in R. The concept of the competition is to try to determine which R packages are preferred in the R community via their CRAN package librarie...

Read more »

Contest for developing an R package recommendation system

October 7, 2010
By

After I spoke tonight at the NYC R meetup, John Myles White and Drew Conway told me about this competition they're administering for developing a recommendation system for R packages. They seem to have already done some work laying out...

Read more »

Nightlights

October 7, 2010
By

Those who follow the discussions about UHI understand that “nightlights” plays a large role in defining whether or not a station is considered Rural or Urban. In the work of GISS, nightlights are determined by looking at the DMSP product. The product is available in 30 arcsecond format. That’s 0.00833 degrees. The following issue arises.

Read more »

Pre-ordinary meeting

October 7, 2010
By

These are the slides for the (basic) introduction of the paper by Mark Girolami and Ben Calderhead at the RSS next week. Not to be confused with my comments on the paper.

Read more »

Students in predominantly ethnic minority classes want segregated education very much. The others don’t.

October 7, 2010
By

I just did this for what will hopefully be a book chapter on our Divided Education – Divided Citizens research project with NEPC. Explanation further below for anyone more interested in the actual topic. About the graphic: I like raw-data plots like this, made possible by Hadley Wickham’s amazing ggplot2 package for the stats package

Read more »

Studying joint effects in a regression

October 7, 2010
By

We've seen in the previous post (here) how important the *-cartesian product is for modelling joint effects in the regression. Consider the case of two explanatory variates, one continuous (the age of the driver) and one qualitative (gasoline ve...

Read more »

Build a Recommendation System for R Packages

October 7, 2010
By

On Dataists, a new collaborative blog for data hackers that I’m contributing to, we’ve just announced a data contest that’s custom made for R users. To win the contest, you need to build a recommendation system for R packages. To find out more, check out the official announcement on Dataists. Then go to GitHub to

Read more »

Using Data Tools to Find Data Tools, the Yo Dawg of Data Hacking

October 7, 2010
By

by John Myles White and Drew Conway. Editors’ Note: One theme likely to recur on dataists.com is that data hackers love using their tools to analyze, visualize, and predict everything. Data hackers also love discovering and learning about new tools. So it should come as no surprise that Dataist contributors John Myles White and Drew

Read more »

R is Hot: Part 1

October 7, 2010
By

This is Part 1 of a five-part article series, with new parts published each Thursday. You can download the complete article from the Revolution Analytics website. How Did a Statistical Programming Language Invented in New Zealand Become a Global Sensation? Much in the same way that social networking, reality TV and craft beer were considered marginal fads before gaining...

Read more »

LondonR Rcpp slides

October 7, 2010
By

I'm just back in London, where I presented on Rcpp at Mango's LondonR event. This was the third time (after Rmetrics and useR!) I presented these slides, so I allowed myself some new metaphors about my long-term relationship with R and my ind...

Read more »

Science is vital – what we don’t know yet

October 6, 2010
By

This post is not about R (for a change). For working UK scientists, science is vital – sign the on-line petition to preserve science funding. For my contribution of what we don’t know yet - We don’t know whether we can use biomarkers of kidney injury to personalise the doses of medications to maximise the

Read more »

Creating GUIs in R with gWidgets

October 6, 2010
By

The gWidgets framework is a way of creating graphical user interfaces in a toolkit-independent way. That means that you can choose between tcl/tk, Gtk, Java or Qt underneath the bonnet. There's also a web version based upon RApache and ExtJS. Since the code is the same in each case, you can change your mind and swap toolkits...

Read more »

R is Hot

October 6, 2010
By

Our mission at Revolution Analytics is to make R the statistical analysis tool of choice in the workplace. But even though R is pervasive in academia and rising in popularity generally, we still sometimes get blank faces when we demonstrate R to potential new clients. Sure, most people have heard of R -- it's been hard to miss in...

Read more »

Belgian Astronomers and Exercise Machines

October 6, 2010
By

In the twisting paths of human discovery, you never quite know what intellectual enterprise is going to result in a world-changing discovery. For instance, the mathematical notion of expected value did not grow up in a sterile, academic environment. In 1654 Blaise Pascal was approached by Chevalier de Méré who was interested...

Read more »

Convert decimal to IEEE-754 in R

October 6, 2010
By

For some theory on the standard IEEE-754, you can read the Wikipedia page. Here I will post only the code of the function to make the conversion in R. First we write some functions to convert decimal numbers to binary numbers: decInt_to_8bit q r xx for(i in 1:precs){xx q r xx }rr return(rr)}devDec_to_8bit nas nbs xxs for(i in 1:precs){xxs...

Read more »

How fast is JAGS?

October 6, 2010
By

From Martyn Plummer, on the JAGS news blog. Key graph below, showing a few outlying cases in which JAGS is substantially slower than OpenBUGS, but generally, JAGS performs quite favorably. Key point from Martyn: Incidentally, these figures are for JAGS with the glm module loaded. The glm module is not loaded by default. If you

Read more »

Typos…

October 5, 2010
By

Edward Kao just sent another typo found both in Monte Carlo Statistical Methods (Problem 3.21) and in Introducing Monte Carlo Methods with R (Exercise 3.17), namely that should be I also got another email from Jerry Sin mentioning that matrix summation in the matrix commands of Figure 1.2 of Introducing Monte Carlo Methods with R

Read more »

The Data Science Venn Diagram

October 5, 2010
By

Whenever I'm asked, "Who uses R?", I usually rattle off a long list of job titles: statistician, analyst, quant, researcher ... and that's before all the domain-specific titles. It would be nice if there were a simple, succinct phrase to describe the process of working with, analyzing, and communicating with real data. At the new blog, "dataists", the inaugural...

Read more »

Arctic Sea Ice Extent Trends: 1979-2010; Update 1

October 5, 2010
By

Now that the 2010 Arctic sea ice melt season is over, we can see how 2010 fits into the long-term trends in Arctic Sea Ice Extent. This post shows an R Climate chart that I have made to look at the …

Read more »

Example 8.8: more Hosmer and Lemeshow

October 5, 2010
By

This is a special R-only entry. In Example 8.7, we showed the Hosmer and Lemeshow goodness-of-fit test. Today we demonstrate more advanced computational approaches for the test. If you write a function for your own use, it hardly matters what it looks l...

Read more »

India Australia test cricket matches over the years

October 5, 2010
By

If you're like me - chewing your nails, sitting in the same position for the last 2 hours to watch India battle Australia in the Mohali test match and enjoying the fascinating duel, watch how closely these two teams have fought in the past. This plot s...

Read more »

Why R is better than Excel for teaching statistics

October 4, 2010
By

This was the topic of a recent conversation on the Australian and New Zealand R mailing list. Here is an edited list of some of the comments made. R is free. R is well-documented. R runs (really well) on *nix as well as Windows and Mac OS. R is open-source. Trust in the R software

Read more »

S4 classes in R: printing function definition and getting help

October 4, 2010
By

I’m not very familiar with S4 classes and methods, but I assume it’s the recommended way to write new packages since it is newer than S3; this of course is open to debate. I’ll outline my experience of programming with S4 classes and methods in a later post, but in the meantime, I want...

Read more »

Use R to Analyze Players for your Fantasy Hockey League

October 4, 2010
By

I am in a fantasy hockey league for the first time this season and I wanted to use R to analyze players. Since I am relatively new to R, I am quite certain this code could be improved. The code below is functional, however, and while this isn’t my complete analysis, I think it outlines

Read more »