Monthly Archives: September 2011

About commercial publishers

September 19, 2011

Julien Cornebise has pointed out a recent Guardian article about commercial publishers of academic journals, mainly Elsevier, Springer, and Wiley, which takes a clear stand from its title onward: “Academic publishers make Murdoch look like a socialist”! The valuable argument therein is that academic publishers make hefty profits (a 40% margin for Elsevier!)

Read more »

Using jri to connect JAVA to R

September 19, 2011

The R package rJava allows R to be accessed in Java programs. The part of the package that allows this is jri. The notes on the rJava site about getting jri to work didn’t help me much getting it to …

Read more »

R 2.14 to be released on October 31; R 2.13 patch on September 13

September 19, 2011

The next major release of R has been announced: R 2.14.0 is scheduled for October 31. Details are still coming in about the new features planned for this release, but R core member Luke Tierney has revealed some of the performance improvements expected, and R core member Brian Ripley has spoken of forthcoming low-level support for multi-threaded computing and...

Read more »

Appendable saving in R

September 19, 2011

One of the most crucial problems in HPC is that every error you make has a much greater impact than in normal computing: there is nothing more amusing than finding out that a few-day simulation broke a few minutes before the end because of an unfortunate value thrown by a random generator, a typo in the result-saving code or

Read more »

Three free books for better programming in R (and any other language)

September 19, 2011

Like many users and producers of R packages, I have never had any formal training in computer science. I’ve come to the conclusion that this is a serious omission in a professional researcher’s training. Computer scientists and professional hackers …

Read more »

rgdal + raster + RCurl = My next package

September 18, 2011

This package has been a long time in the making. In the end it’s more of a data package than a functional package, but pulling all the pieces together required me to learn some really cool packages: raster (which I already knew), rgdal, and RCurl. I’ll provide a little bit of an overview

Read more »

DTW: dynamic time warping 动态时间规整

September 18, 2011

Basically, DTW (dynamic time warping) is an algorithm that outputs the cumulative distance between two time sequences; it is widely used, e.g., for classification and clustering. For example, when using k-means for clustering, we can use DTW as the distance function. Here is one such nice instance (using R): http://www.rdatamining.com/examples/ts-mining

Relevant information from Anshul's email:

A review of DTW: http://csdl.ics.hawaii.edu/techreports/08-04/08-04.pdf

Code: Python code: https://mlpy.fbk.eu/R...
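To make the idea of a cumulative DTW distance concrete, here is a minimal sketch in base R. It is not the code from the linked examples: the function name `dtw_distance` is hypothetical, and using the absolute difference as the local distance and the three classic step moves (match, insertion, deletion) with no window constraint are assumptions made for illustration.

```r
# Minimal DTW sketch (illustrative only): cumulative distance between two
# numeric sequences using the textbook dynamic-programming recursion.
dtw_distance <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # cost[i + 1, j + 1] is the cheapest cumulative cost of aligning
  # x[1:i] with y[1:j]; the extra first row/column is the empty prefix.
  cost <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d <- abs(x[i] - y[j])  # local distance (assumed: absolute difference)
      cost[i + 1, j + 1] <- d + min(cost[i, j + 1],   # advance in x only
                                    cost[i + 1, j],   # advance in y only
                                    cost[i, j])       # advance in both
    }
  }
  cost[n + 1, m + 1]
}

a <- c(1, 2, 3, 4)
b <- c(1, 1, 2, 3, 4)   # same shape as a, with the first value repeated
dtw_distance(a, b)      # 0: warping absorbs the repeated point
```

Because the result is a plain number for any pair of series, a pairwise matrix of such values can be handed to any clustering routine that accepts precomputed distances.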

Read more »

Map the distribution of your sample by geolocating ip addresses or zip codes

September 18, 2011

Yesterday I wanted to create a map of participants from a study on social media and partisan selective exposure that Sean Westwood and I ran recently, with participants from Amazon’s Mechanical Turk. We recorded IP addresses for each Turker participant, so …

Read more »

Implementation of the CDC Growth Charts in R

September 17, 2011

I implemented in R a function to re-create the CDC Growth Chart, according to the data provided by the CDC. In order to use this function, you need to download the .rar file available at this megaupload link. Mirror: mediafire link. Then unrar the file, a...

Read more »

Bayesian Models with Censored Data: A comparison of OLS, tobit and bayesian models

September 17, 2011

The following R code models a censored dependent variable (in this case academic aptitude) using traditional least squares, tobit, and Bayesian approaches. As depicted below, the OLS estimates (blue) for censored data are inconsistent and will ...
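The contrast between OLS and tobit on censored data can be sketched with a small base-R simulation. This is not the post's code: the simulated data, the true coefficients (intercept 1, slope 2), the censoring limit `upper`, and the hand-written likelihood `negloglik` are all assumptions chosen for illustration, and the Bayesian model is left out.

```r
# Illustrative simulation (assumed data, not the post's): OLS vs. a
# hand-rolled tobit fit on a right-censored outcome.
set.seed(1)
n      <- 2000
x      <- rnorm(n)
y_star <- 1 + 2 * x + rnorm(n)   # latent, uncensored outcome
upper  <- 2                      # assumed censoring limit
y      <- pmin(y_star, upper)    # observed outcome, censored from above

# OLS treats the censored values as if they were real measurements.
ols <- lm(y ~ x)

# Tobit negative log-likelihood: normal density for uncensored points,
# P(y* >= upper) for censored ones. sigma is optimised on the log scale.
negloglik <- function(par) {
  mu    <- par[1] + par[2] * x
  sigma <- exp(par[3])
  cens  <- y >= upper
  ll <- ifelse(cens,
               pnorm((mu - upper) / sigma, log.p = TRUE),
               dnorm(y, mean = mu, sd = sigma, log = TRUE))
  -sum(ll)
}

# Start the optimiser from the OLS coefficients and log(sigma) = 0.
fit   <- optim(c(unname(coef(ols)), 0), negloglik, method = "BFGS")
tobit <- c(intercept = fit$par[1], slope = fit$par[2])

rbind(ols = unname(coef(ols)), tobit = unname(tobit))
```

In this setup the OLS slope is pulled toward zero, while the tobit slope should land close to the value of 2 used to generate the data. In practice a packaged tobit routine would normally be preferred over a hand-written likelihood.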

Read more »