More Dabblings With Local Sentencing Data

December 1, 2011

In Accessing and Visualising Sentencing Data for Local Courts, I posted a couple of quick ways into playing with Ministry of Justice sentencing data for the period July 2010-June 2011 at the local court level. At the end of the post, I wondered about how to wrangle the data in R so that I

Read more »

Path from root to leaf node in mvpart

December 1, 2011

I was recently asked by an R user how one could extract the “rule” in a classification/regression tree. The requirement was to obtain the path traced from the root node to each leaf node, and so obtain all the paths or “rules”. The path.rpart() function in the mvpart package provides this convenience: library(mvpart) # Create a

Read more »
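
As a rough illustration of the root-to-leaf "rules" idea in the entry above, here is a minimal sketch. It assumes the rpart package (which also provides path.rpart()) in place of mvpart, and uses the kyphosis data set bundled with rpart; the model formula and the variable names fit, leaf_nodes, rules and rule_text are illustrative choices, not taken from the original post.

```r
library(rpart)  # assumption: rpart is used here instead of mvpart

# Fit a small classification tree on the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Node numbers are the row names of fit$frame; leaf rows are marked "<leaf>"
leaf_nodes <- as.numeric(rownames(fit$frame)[fit$frame$var == "<leaf>"])

# One character vector of split conditions per leaf, ordered from the root down
rules <- path.rpart(fit, nodes = leaf_nodes, print.it = FALSE)

# Drop the leading "root" element and join the conditions into one rule string
rule_text <- vapply(rules, function(p) paste(p[-1], collapse = " & "), character(1))
print(rule_text)
```

Each element of rules is named by its node number, so a rule can be matched back to the corresponding row of fit$frame.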

quantum forest

December 1, 2011

Thanks to a link on R-bloggers, I was introduced to Luis Apiolaza’s blog, Quantum Forest, which covers data analyses and R comments he encounters in his research as a quantitative forester/geneticist. And he works at the University of Canterbury, Christchurch, where I first taught from Bayesian Core in 2006. Which may be why he chose

Read more »

knitr: Elegant, flexible and fast dynamic report generation with R

December 1, 2011

The world has changed. You can feel it on GitHub. You can smell it on Google+. The knitr package, as an alternative tool to Sweave, has features that you have been longing for, and features that you might have never imagined. Thumb through the PDF manu...

Read more »

knitr: Elegant, flexible and fast dynamic report generation with R

December 1, 2011

The world has changed. You can feel it on GitHub. You can smell it on Google+. For those who have been struggling with Sweave, here comes the knitr package. It has features that you have been longing for, and features that you might have never imagined. Thumb through the PDF manual to see some of

Read more »

Review of Distance Course: Graduate Certificate in Statistics offered at Sheffield [completed: 3 June 2012]

December 1, 2011

Recently, on Andrew Gelman's blog there was a discussion about how to get yourself a statistics education (presumably without going through the whole process of becoming a professional statistician). Here's the discussion on Gelman's blog, with lots of...

Read more »

Wicked Webapps with R, err, Wt

November 30, 2011

A few months ago, I had blogged about using R inside of Qt. This used our RInside package for embedding the statistical programming environment and language R inside of a C++ application, and further relies on our Rcpp package for R and C++ integrati...

Read more »

A look at market returns by month

November 30, 2011

I’ve been reading The Big Picture, and again, there was a discussion about seasonality in stock markets (see Fourth Quarter is Da Bomb). I’ve already discussed the two seasonal investment scenarios (Nov. to Apr. vs. May to Oct.) in this post, and was wondering if one could break it down further into a monthly analysis.

Read more »

mean of an absolute Student’s t

November 30, 2011

Having (rather foolishly) involved myself in providing an answer for Cross Validated: “Can the standard deviation of non-negative data exceed the mean?”, I ended up having to derive the mean of the absolute value of a Student’s t variate X. (Well, not really, but then I did.) I think the following is correct: where is the

Read more »

Earthquakes

November 30, 2011

> data(quakes)
> head(quakes)
     lat   long depth mag stations
1 -20.42 181.62   562 4.8       41
2 -20.62 181.03   650 4.2       15
3 -26.00 184.10    42 5.4       43
4 -17.97 181.66   626...

Read more »

Tips for getting started on Kaggle (datamining)

November 30, 2011

Ever since I heard about Kaggle.com at this year's Bay Area Data Mining Camp, I've wanted to participate. But I was feeling somewhat intimidated. Jeremy Howard's "Intro to Kaggle" talk at yesterday's MeetUp (DataMining for a Cause) was exactly what I...

Read more »

rOpenSci won 3rd place in the PLoS-Mendeley Binary Battle!

November 30, 2011

I am part of the rOpenSci development team (along with Carl Boettiger, Karthik Ram, and Nick Fabina). Our website: http://ropensci.org/. Code at GitHub: https://github.com/ropensci. We entered two of our R packages for integrating with ...

Read more »

Free ggplot2 webinar from Hadley Wickham

November 30, 2011

The Orange County R Users Group is hosting a free webinar presented by Hadley Wickham, author of the ggplot2 graphics package for R. The webinar, "Advanced Visualizations in R with Hadley Wickham" is live from 6PM-7PM Pacific Time tomorrow, December 1. You can register at the LinkedIn event page below, as long as there are spaces left (it's limited...

Read more »

rOpenSci is a runner-up in the Mendeley Binary Battle!

November 30, 2011

We just got word that rOpenSci was a runner-up in the first Binary Battle!  Thank you for all the support so far! We entered two of our packages for integrating with PLoS Journals (rplos) and Mendeley (RMendeley) in the Mendeley-PLoS Binary Battle.  Get them at GitHub (rplos; RMendeley). These two packages allow users to search and retrieve

Read more »

GUI for sending email in R (using sendEmail)

November 30, 2011

After writing the last post on using sendEmail to send email from R, I decided to create a simple GUI to enable this functionality. A snapshot image of the GUI is shown above. To use this GUI, you will need to install the following packages in R: gWidgets, gWidgetsRGtk2, Windows GTK Bundle. More information on

Read more »

Alpha decay in portfolios

November 30, 2011

How does the effect of our expected returns change over time? This is not academic curiosity; we want to know in the context of our portfolio if we can. And we can: we visualize the effect of expected returns in situ. First step: the idea is to look at the returns of portfolios that …

Read more »

Job Satisfaction in England – GGPlot #2

November 29, 2011

I’ve recently been scouring the internet for a public opinion data set pertaining to job satisfaction. I was particularly interested in examining how gender, age, and socio-economic status influence how satisfied an individual is with their current employment situation. For example, existing research suggests that women and private-sector employees tend to have higher levels of

Read more »

The art of R programming

November 29, 2011

This is a gem of a book. It will become the book I give PhD students when they are learning how to write good R code. That is, if I ever see it again. I had hoped to write a review of it, but I haven’t seen it since it arrived in the mail a

Read more »

Learning R as a language

November 29, 2011

Books written to teach a general-purpose programming language are usually organized according to the features of the language, and examples often show how a particular language feature is interpreted by a compiler. Books about domain-specific languages are usually organized in a way that makes sense in the corresponding application domain and examples usually

Read more »

Ulam Spirals in R and ggplot

November 29, 2011

Having seen a Twitter post speed by about Ulam Spirals, I started to read up. As the story goes, in 1963 Stanislaw Ulam was bored at a conference and started scribbling numbers in a spiral. What he discovered was a strange diagonal pattern of Prime Nu...

Read more »

Clearing up the sqldf confusion

November 29, 2011

Apparently I have issues with my reading comprehension and with TextMate (initially) when it comes to using the sqldf package. As G. pointed out in the previous comments, I could have just used options(gsubfn.engine = "R") instead of going through the trouble of installing the tcltk binaries. If you’ve got a happy distribution of R that

Read more »

RcppArmadillo 0.2.31

November 29, 2011

Conrad Sanderson just released the second pre-release 2.3.92 of what will be Armadillo 2.4.*. This is now in RcppArmadillo release 0.2.31 which is already on CRAN as of this morning. The NEWS entries summarising the changes for both a...

Read more »

R still the preferred tool of predictive modelers competing at Kaggle

November 29, 2011

As reported on the Kaggle blog No Free Hunch, R remains the preferred tool for data scientists seeking to win the prizes in the predictive modeling competitions: More than 30% of Kaggle competitors report using R for their analysis, up from 22% a year ago. R's flexibility and the breadth of packages for machine learning and predictive modeling make...

Read more »

Relation Between Fires and Distance to the Nearest Road (Recalculated)

November 29, 2011

As you may already know, I'm a proud owner of an AMD FX-8150 8-core CPU. And I purchased it not for gaming, but for science. My previous CPU was painfully slow with such calculations as determination of the relation between fires and distance t...

Read more »

Permanently Setting the CRAN repository

November 29, 2011

Setting the CRAN repository so that it does not ask every time you try to install a package is something that I think few people bother to do, but it is so simple and can save a fair bit of frustration when working. This is accomplished through a setting in one of the Rprofile files. There

Read more »
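
The Rprofile setting described in the entry above might look something like the following sketch. The mirror URL is an assumption (any CRAN mirror could be substituted), and the lines are run directly here for illustration; to make the setting permanent they would be placed in a startup file such as ~/.Rprofile.

```r
# Sketch of lines that could live in ~/.Rprofile (or Rprofile.site)
# Assumption: the cloud mirror is used; replace with the mirror you prefer
local({
  r <- getOption("repos")                  # keep any other repositories already set
  r["CRAN"] <- "https://cloud.r-project.org"
  options(repos = r)
})

# Confirm which CRAN mirror install.packages() will now use
print(getOption("repos")["CRAN"])
```

Wrapping the assignment in local() keeps the temporary variable r out of the global workspace at startup.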

Review of "The Art of R Programming" by Norman Matloff

November 29, 2011

By Joseph Rickert. Anyone seeking to learn R faces two major challenges: (1) learning how to swim in the sea of information (R packages, books, websites, blog posts, message boards, etc.) that threatens to drown a newbie, and (2) coming to grips with the structure, syntax and features of the language itself. Having some idea of what one...

Read more »