Le Monde puzzle #13

April 13, 2011

This week, Le Monde offers not one but three related puzzles: Is it possible to label the twelve edges of a cube by consecutive numbers such that the sum of the edge numbers at any of the eight nodes is constant? Is it possible to label the eight nodes of a cube by consecutive numbers

Read more »
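
A quick counting argument settles the first of these questions, and it can be checked in a few lines. This is only a sketch of that argument, not the approach taken in the post itself (which the excerpt does not show); the function name `node_sum_if_constant` and the range of starting values are illustrative assumptions. Each edge touches exactly two nodes, so adding up the sums at all eight nodes counts every edge label twice, and a constant node sum would have to equal that doubled total divided by eight.

```python
def node_sum_if_constant(start, n_edges=12, n_nodes=8):
    """Return the node sum forced by counting, or None if it is not an integer.

    `start` is the first of the consecutive edge labels (hypothetical parameter).
    """
    labels = range(start, start + n_edges)
    # Every edge joins two nodes, so the sum over all nodes counts each label twice.
    total_over_nodes = 2 * sum(labels)
    if total_over_nodes % n_nodes != 0:
        return None
    return total_over_nodes // n_nodes


# Try a range of starting values for the twelve consecutive labels.
feasible_starts = [s for s in range(-50, 51) if node_sum_if_constant(s) is not None]
print(feasible_starts)  # → []
```

For labels s, s+1, ..., s+11 the doubled total is 24s + 132, which always leaves a remainder of 4 when divided by 8, so no choice of consecutive labels can give a constant node sum on the edges. The check says nothing about the node-labelling question, which is cut off in the excerpt.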

compiler and runiregGibbs (bayesm)

April 13, 2011

So everyone's excited about the new R 2.13 release because of the compiler package. Apparently it is easy to get a 3x speed increase by simply compiling a function. Doing a lot of the MCMC stuff, I am particularly excited about speed in R. I just compile...

Read more »

Rcpp 0.9.4, and a paper in the Journal of Statistical Software

April 13, 2011

A brand new 0.9.4 release of Rcpp is now on CRAN and Debian. This version contains an improvement to loading and initialization of Rcpp modules, a bug fix for vectors of factors, another build issue fix as well as (per common practice with JSS) ...

Read more »

Block diagonal matrices in R

April 13, 2011

As far as I can tell, R doesn’t have a function for building block diagonal matrices, so, as I needed one, I’ve coded it myself. It might save someone some time. Example: Let m1 and m2 be two square matrices. Selec … Continue reading →

Read more »
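
The excerpt stops before the author's R function appears, so the idea is illustrated here with a minimal Python sketch instead. The function name `block_diag`, the list-of-rows matrix representation, and the sample matrices are all assumptions made for illustration; each block is assumed to be non-empty. The blocks are laid out along the diagonal and every other entry is filled with zeros.

```python
def block_diag(*blocks):
    """Place each matrix (a non-empty list of equal-length rows) along the diagonal."""
    total_cols = sum(len(block[0]) for block in blocks)
    result = []
    col_offset = 0  # column where the current block starts
    for block in blocks:
        width = len(block[0])
        for row in block:
            if len(row) != width:
                raise ValueError("every row of a block must have the same length")
            # Zeros to the left, the block's row, then zeros to the right.
            padding_right = total_cols - col_offset - width
            result.append([0] * col_offset + list(row) + [0] * padding_right)
        col_offset += width
    return result


m1 = [[1, 2], [3, 4]]
m2 = [[5, 6, 7], [8, 9, 10], [11, 12, 13]]
for row in block_diag(m1, m2):
    print(row)
# → [1, 2, 0, 0, 0]
# → [3, 4, 0, 0, 0]
# → [0, 0, 5, 6, 7]
# → [0, 0, 8, 9, 10]
# → [0, 0, 11, 12, 13]
```

The sketch accepts any number of blocks, and they do not have to be square or share a size.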

Some comments peer-review and a year of blogging

April 13, 2011

It's been a year since I began keeping a web log. This post presents some thoughts related to the experience. Blogging is Sharing Ideas Blogging is online self-publishing. There is no faster way to share your ideas so broadly. Last year at the useR! conference (in Gaithersburg, MD, just a few months after I started

Read more »

RProtoBuf 0.2.3

April 13, 2011

A maintenance release 0.2.3 of RProtoBuf is now on CRAN. RProtoBuf provides GNU R bindings for the Google Protobuf data encoding library used and released by Google. The NEWS file entry follows below: 0.2.3 2011-04-12 o Protect UINT64...

Read more »

Using R and clinical heuristics to explore the Heritage Health Prize: what do we gain?

April 13, 2011

The recent opening of the Heritage Health Prize both represents a milestone and raises a cautionary flag. On the one hand, crowdsourced analytics prizes have never tackled anything so noble (not to discount predicting movie ratings), but on the other … Continue reading →

Read more »

CoKriging with gstat – Videotutorial

April 13, 2011

This is the last lesson of the R Videotutorial for spatial statistics. It is all about cokriging in gstat. For this lesson I used the meuse dataset, available within gstat; for the references to this dataset, take a look at the script. The videotutorial i...

Read more »

Screening for predictive characteristics … and a mea culpa

In my last post, I considered the UCI mushroom dataset and characterized the variables included there using four different interestingness measures. When I began drafting this post, my intention was to consider the question of how the different m...

Read more »

Journal of Statistical Software. Vol. 40

April 12, 2011

The latest edition of the Journal of Statistical Software is out, with plenty of interesting articles for R users. A must-read is Hadley Wickham's article on "The Split-Apply-Combine Strategy for Data Analysis", which makes a compelling argument for the use of the plyr package to partition datasets and apply sub-group analyses. Also, anyone who hasn't yet purchased a copy...

Read more »

Using tikzDevice with Sweave in R 2.13

April 12, 2011

R 2.13 introduces an option to specify a custom graphics device in an Sweave code chunk. This is really cool and allows you to use tikzDevice output like pgfSweave does. In a pinch, say when you don't have access to any non-core packages, you can use tikzDevice output with the regular Sweave driver (RweaveLatex) more

Read more »

Video Tutorial on Robust Standard Errors

April 12, 2011

Update: I have included a modified version of this summaryR() command as part of my package tonymisc, which extends mtable() to report robust standard errors. The tonymisc package is available on CRAN through the install.packages() command. If you have...

Read more »

Measuring Price Dispersion of Marijuana

April 12, 2011

The intersection of mapping APIs, fast database operations and user engagement offers a lot of very cool crowdsourcing applications ranging from the benign and powerful (Google’s Person Finder) to the minor and questionable (A DUI checkpoints app). Most intriguing in … Continue reading →

Read more »

Using R + Bioconductor to Get Flanking Sequence Given Genomic Coordinates

April 12, 2011

I'm working on a project using next-gen sequencing to fine-map a genetic association in a gene region. Now that I've sequenced the region in a small sample, I'm picking SNPs to genotype in a larger sample. When designing the genotyping assay the lab wi...

Read more »

Example 8.34: lack of robustness of t test with small n

April 12, 2011

Tim Hesterberg has effectively argued for a larger role for resampling-based inference in introductory statistics courses (and statistical practice more generally). While the Central Limit Theorem is a glorious result, and the Student t-test remarkabl...

Read more »

The new R compiler package in R 2.13.0: Some first experiments

April 12, 2011

R 2.13.0 will be released tomorrow. It contains a new package by R Core member Luke Tierney: compiler. What a simple and straightforward name, and something that Luke has been working on for several years. The NEWS file says o Package compiler is now provided as a standard package. See ...

Read more »

Wilcox’s Robust Statistics: A new R package

April 12, 2011

Recently I started to build a new package for R containing Wilcox’s collection of functions for robust statistics. Wilcox provides 700+ functions for robust statistics, including: robust correlations (e.g. percentage bend correlation), robust measures of location and mean differences (e.g. … Continue reading →

Box Plot with ggplot2

April 12, 2011

Hi, these days I'm creating lots and lots of box plots with ggplot2. The look of them is really good, and you can change every bit of the code to customize the plot completely. Here is the code I'm using with a test data file to try it: BoxPlot.z...

Read more »

Historical Sources of Bond Returns with Shiller Data 1919-2011

April 11, 2011

And as usual, I always want a longer data set, so after a little playing with R-Excel, we can extend our historical sources of bond returns to 1919. If nothing else, maybe you can find other uses for the Shiller Dataset in R. From TimelyPort...

Read more »

Rstudio updates to beta 2

April 11, 2011

The folks over at RStudio have released a new update to their open-source R GUI, currently in beta test. This Beta 2 release adds more customizable layouts, editor improvements and new editing themes. Also of interest is a new feature that allows creation of graphics that update under the control of sliders, checkboxes and similar controls. See the full...

Read more »

EC2 AMI for scientific computing in Python and R

April 11, 2011

Like many people who crunch numbers frequently, I have increasingly been integrating Amazon’s cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop for literally pennies on the

Read more »

Historical Sources of Bond Returns

April 11, 2011

As promised in Monitoring Sources of Bond Return, we can show more history if we use CPI instead of expected inflation (from the TIP inflation breakeven yield). Here are the results with history back to 1953. From TimelyPortfolio However, mo...

Read more »

Monday Links: 23andMe, RStudio, PacBio+Galaxy, Data Science One-Liners, Post-Linkage RFA, SSH

April 11, 2011

Lately I haven't written as many full-length posts as usual, but here's a quick roundup of a few links I've shared on Twitter (@genetics_blog) over the last week: First, 23andMe is having a big DNA Day Sale ($108) for the kit + 1 year of their personal ...

Read more »