Monthly Archives: April 2013

Mortality after paediatric heart surgery using public domain data

April 6, 2013

This post comes with some big health warnings. The recent events in Leeds highlight the difficulties faced in judging the results of surgery by individual hospital. A clear requirement is timely access to data in a form easily digestible by the public. Here I’ve scraped the publicly available data from the central cardiac audit database

Read more »

Retirement : simulating wealth with random returns, inflation and withdrawals – Shiny web application

April 6, 2013

Today, I want to share the Retirement : simulating wealth with random returns, inflation and withdrawals – Shiny web application (code at GitHub). This application was developed and contributed by Pierre Chretien; I only made minor updates. This application is a great example of how easy it is to convert your R script into

Read more »

Worry about correctness and repeatability, not p-values

April 5, 2013

In data science work you often run into cryptic sentences like the following: Age adjusted death rates per 10,000 person years across incremental thirds of muscular strength were 38.9, 25.9, and 26.6 for all causes; 12.1, 7.6, and 6.6 for cardiovascular disease; and 6.1, 4.9, and 4.2 for cancer (all P < 0.01 for linear

Read more »

Reconstructing Principal Component Analysis Matrix

April 5, 2013

PCA is a widely used method for finding patterns in high-dimensional data. Whether you use it to compress a large matrix or to remove one of the principal components in biological datasets, you’ll end up with the task of performing a series of …

Read more »

Organise your data

April 5, 2013

Use R to specify factors, recode variables and begin by-group analyses. The lap_hernia file contains data on pain score after laparoscopic vs. open hernia repair; age, gender and primary/recurrent hernia are also included. The ultimate aim here is to work out which of these factors are associated with more pain after this operation.

Read more »

Properly “internationalized” regular expressions in R

April 5, 2013

We should pay special attention to writing truly portable code that works in the same fashion under different locales and character encodings. Currently, R has two regex engines, ERE (via TRE) and PRE (via PCRE). Surprisingly, they…

Read more »

Security in R: RAppArmor package & paper updates

April 5, 2013

This week, version 0.8.3 of RAppArmor appeared on CRAN. RAppArmor is a package to dynamically enforce security policies and hardware restrictions in R on Linux systems. It currently supports Ubuntu 12.04+, Debian 7 and OpenSuse 12.1+. The readme page has more info and helpful video tutorials to get you started. One important change in the ...

Read more »

Multiple pairwise comparisons for categorical predictors

April 5, 2013

Dale Barr (@datacmdr) recently had a nice blog post about coding categorical predictors, which reminded me to share my thoughts about multiple pairwise comparisons for categorical predictors in growth curve analysis. As Dale pointed out in his post, the R default is to treat the reference level of a factor as a...

Read more »

Interview by DecisionStats

April 5, 2013

Ajay Ohri interviewed me on his popular DecisionStats blog. Topics discussed ranged widely from Fellows Statistics, to Deducer, to statnet, to Poker A.I., to Big Data.    

Read more »

Extending RevoScaleR for Mining Big Data – Hexbins

April 5, 2013

By Derek McCrae Norton, Senior Sales Engineer. It is my job to help potential clients see that the tasks they are used to completing can be completed on big data in Revolution R Enterprise (and that it is easy). Honestly, this is my dream job, and in my eyes it is sort of like playing and getting paid for...

Read more »