## Version 1.0 of devtools released!

January 23, 2013

We’re very pleased to announce the release of devtools 1.0. We’ve given devtools the 1.0 marker because it now works with the vast majority of packages in the wild, with this version adding support for S4 and Rcpp. Devtools also has completely revamped code for finding Rtools on Windows, including much better error messages if

## Fake text generation the wrong way, and a contest

January 23, 2013

As part of a bigger project, I needed to simulate a text string based on a source document, but at the character level. Just in case people find the code useful, I’ve uploaded it to MCMCtext.r. In my simulated text, each character is chosen based on the transition probabilities in the source text from one
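
For readers who want a feel for the idea, here is a minimal R sketch of character-level simulation from first-order transition counts. It is not the code in MCMCtext.r; the function name `simulate_text` and the fallback used when a character has no observed successor are assumptions made for illustration.

```r
# Hypothetical helper: simulate text one character at a time, choosing each
# character from the transitions observed after the previous character.
simulate_text <- function(source_text, n_chars = 100) {
  chars <- strsplit(source_text, "")[[1]]
  stopifnot(length(chars) >= 2, n_chars >= 1)

  # Count how often each character is followed by each other character.
  counts <- table(from = head(chars, -1), to = tail(chars, -1))

  out <- character(n_chars)
  out[1] <- chars[1]
  for (i in seq_len(n_chars)[-1]) {
    prev <- out[i - 1]
    if (prev %in% rownames(counts)) {
      # Sample the next character in proportion to the observed counts.
      out[i] <- sample(colnames(counts), 1, prob = counts[prev, ])
    } else {
      # Assumption: a character seen only at the very end of the source has
      # no successor, so restart from a randomly chosen source character.
      out[i] <- sample(chars, 1)
    }
  }
  paste(out, collapse = "")
}

set.seed(1)
simulate_text("the quick brown fox jumps over the lazy dog", 40)
```

Because only the previous character is used, the output keeps the local letter patterns of the source but none of its longer structure.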

## New book announcement: R and Data Mining – Examples and Case Studies

January 23, 2013

R and Data Mining: Examples and Case Studies

Author: Yanchang Zhao; Publisher: Academic Press, Elsevier; Publish date: December 2012; ISBN: 978-0-12-396963-7; Length: 256 pages; URL: http://www.rdatamining.com/books/rdm

This book introduces the use of R for data mining with examples and case studies. … Continue reading →

## Text Decryption Using MCMC

January 22, 2013

The famous probabilist and statistician Persi Diaconis wrote an article not too long ago about the "Markov chain Monte Carlo (MCMC) Revolution." The paper describes how we are able to solve a diverse set of problems with MCMC. The first example he give...

## Reducing Respondent Burden: Item Sampling

January 22, 2013

You received confirmation this morning.  Someone made a mistake programming that battery of satisfaction ratings on your online survey.  Instead of each respondent rating all 12 items using a random rotation, only six randomly selected i...

## Binomial Confidence Intervals

January 22, 2013

This stems from a couple of binomial distribution projects I have been working on recently.  It’s widely known that there are many different flavors of confidence intervals for the binomial distribution.  The reason for this is that there is a coverage problem with these intervals (see Coverage Probability).  A 95% confidence interval isn’t always (actually
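
To make the coverage problem concrete, the following R sketch computes the exact coverage probability of one common interval. The teaser does not say which intervals the post compares, so the choice of the Wald (normal approximation) interval and the name `wald_coverage` are assumptions for illustration.

```r
# Exact coverage probability of the nominal Wald interval for a binomial
# proportion: sum the binomial probabilities of every outcome x whose
# interval contains the true p.
wald_coverage <- function(n, p, conf_level = 0.95) {
  x <- 0:n
  p_hat <- x / n
  z <- qnorm(1 - (1 - conf_level) / 2)
  half_width <- z * sqrt(p_hat * (1 - p_hat) / n)
  covers <- (p_hat - half_width <= p) & (p <= p_hat + half_width)
  sum(dbinom(x[covers], size = n, prob = p))
}

wald_coverage(n = 20, p = 0.1)
wald_coverage(n = 20, p = 0.5)
```

With n = 20 and p = 0.1 the outcome x = 0 alone yields a zero-width interval that misses p, which is enough to pull the actual coverage below the nominal 95%.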

## Life expectancy and retirement age in USA

January 22, 2013

Data Scientist John Myles White used R to compare life expectancy and retirement age in the USA. While male (red) and female (green) life expectancies are rising, average retirement age is going down. What this means for the future of benefit programs like Social Security is the topic of an interesting series of comments on the post.

## A beginner’s guide to sharing and collaboration with R

January 22, 2013

In today's social world, it's important to be able to collaborate with others online when working with data, and to be able to easily share your outputs online. Fortunately, the R language and the broad R community provide a number of facilities for collaboration and sharing, which are summarized in Noam Ross's guide to tools for collaboration with R....

## Reserving based on log-incremental payments in R, part III

January 22, 2013

This is the third post about Christofides' paper on Regression models based on log-incremental payments. The first post covered the fundamentals of Christofides' reserving model in sections A - F; the second focused on the more realistic example and model reduction in sections G - K. Today's post will wrap up the paper with sections...

## from down-under, Lake Menteith upside-down

January 22, 2013

The dataset used in Bayesian Core for the chapter on image processing is a Landsat picture of Lake of Menteith in Scotland (close to Loch Lomond). (Yes, Lake of Menteith, not Loch Menteith!) Here is the image produced in the book. I just got an email from Matt Moores at QUT that the image is

## Shiny Server now available

January 22, 2013

Shiny makes it easy to develop interactive web applications that run on your own machine. But by itself, it isn’t designed to make your applications available to all comers over the internet (or intranet). You can’t run more than one Shiny application on the same port, and if your R process crashes or exits for

## SQL commands in R

January 22, 2013

For a class I'm taking this semester on genomics, we're dealing with some pretty large data, and for this reason we're learning to use MySQL. I decided to be a geek and do the assignments in R as well to demonstrate the ability of R to handle pretty larg...

## R-bloggers

January 22, 2013

As long as I can't find the time to post my newest adventuRes, why don't you check out the great collection of other R-blogs on the web: www.r-bloggers.com. Have fun!

## Parallel Array Computations With SciDB and R

January 22, 2013

R Evangelist Bryan Lewis on a natural integration of the R analytic environment and SciDB's distributed, multidimensional array database.

## Comparing Transformation Styles: attach, transform, mutate and within

January 22, 2013

There are several ways to perform data transformations in R. Each has its own set of advantages and disadvantages. Let’s take one variable, square it and add 100. How many ways might an R beginner screw up such a simple … Continue reading →
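
As a rough illustration of the task described, the R sketch below squares one variable and adds 100 in three base R styles. The data frame, the column names `x` and `y`, and the choice of styles are assumptions; `attach()` is left out, and so is `mutate()`, which comes from an add-on package rather than base R.

```r
# Hypothetical example data; the variable name x is an assumption.
df <- data.frame(x = 1:5)

# 1. Direct assignment with $.
df1 <- df
df1$y <- df1$x^2 + 100

# 2. transform() evaluates the expression with the columns in scope
#    and returns a modified copy.
df2 <- transform(df, y = x^2 + 100)

# 3. within() runs assignments inside the data frame and returns a copy.
df3 <- within(df, y <- x^2 + 100)

# All three styles should produce the same new column.
identical(df1$y, df2$y) && identical(df2$y, df3$y)
```

Note that `transform()` and `within()` return new data frames, so the result has to be assigned for the change to persist.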

## Randomly deleting duplicate rows from a dataframe

January 22, 2013

I use R a lot in my day to day workflow, particularly for manipulating raw data files into a format that can be used for analysis. This is often a brain-taxing exercise and, sometimes, it would be totally quicker to … Continue reading →

## Quick conversion of a list of lists into a data frame

January 22, 2013

Data frames are one of R’s distinguishing features. Exposing a list of lists as an array of cases, they make many formal operations such as regression or optimization easy to represent. The R data.frame operation for lists is quite slow, in large part because it exposes a vast amount of functionality. This sample shows one way to write a much...
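
The teaser is cut off before it shows the author's approach, so the following R sketch is only one plausible way to do the conversion, not the post's method. It assumes every record has the same field names and scalar values; `records` and `records_to_df` are hypothetical names.

```r
# Hypothetical input: a list of records that all share the same field names.
records <- list(
  list(id = 1, name = "a", score = 0.5),
  list(id = 2, name = "b", score = 0.7),
  list(id = 3, name = "c", score = 0.9)
)

# Build one atomic vector per field, then call data.frame() a single time
# instead of once per record.
records_to_df <- function(records) {
  fields <- names(records[[1]])
  cols <- lapply(fields, function(f) {
    unlist(lapply(records, `[[`, f), use.names = FALSE)
  })
  names(cols) <- fields
  data.frame(cols, stringsAsFactors = FALSE)
}

df <- records_to_df(records)
str(df)
```

Working column by column keeps the number of `data.frame()` calls constant regardless of how many records there are.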

## A copper toned publication!

January 21, 2013

At long last (1.5 years since the first submission attempt, to be exact), the research I worked on as a post-doctoral fellow has been published! Click on the image above to head over to the article for some light reading. A lot of work went into this ...

## Data fishing: R and XML part 2

January 21, 2013

I’m constantly amazed at what can be done using free software, such as R, and more importantly, what can be done with data that are available on the internet. In an earlier post, I confessed to my sedentary lifestyle immersed in code, so my opinion regarding the utility of open-source software is perhaps biased. None

## A strained Data Science analogy

January 21, 2013

In the sponsored article Data Science: Buyer Beware at Forbes, SAP's Ray Rivera takes a dim view of Data Science. Rivera calls Data Science a "management fad" in the mold of Business Process Reengineering, and casts data scientists as self-ordained "gurus" whose mission is to stand between the "ignorant masses" that need access to data and a...

## Montreal R User group meetup at Wajam

January 21, 2013

This Thursday (Jan 24th), 5:30pm, the good folks at Wajam are hosting a meetup of the Montreal R User Group. The event will be at Bolidea at 4115 St Laurent, Montréal, QC. Be sure to RSVP. From Benjamin Rollert: This is an opportunity for people interested in R to hang out at our office, eat

## digest 0.6.1

January 21, 2013

digest version 0.6.1 is now on CRAN, and I will push the corresponding version into Debian shortly. Duncan Murdoch added AES support, and helped me fix two issues which (annoyingly) made the Rout.save output differ on another platform. CRANberries...

## El Nino and ggplot2

January 21, 2013

Some days ago I got stuck on what appeared to be a simple task: read some El Nino data, and then plot it. The problem was the data file format, which you can find here: El Nino Data. There's a mix of 'white space' and '.' characters, so read....

## Americans Live Longer and Work Less

January 21, 2013

Today I saw an article on Hacker News entitled, “America’s CEOs Want You to Work Until You’re 70”. I was particularly surprised by this article appearing out of the blue because I take it for granted that America will eventually have to raise the retirement age to avoid bankruptcy. After reading the article, I wasn’t

## Improved evolution of correlations

January 21, 2013

Update June 2013: A systematic analysis of the topic has been published: Schönbrodt, F. D., & Perugini, M. (2013). At what sample size do correlations stabilize? Journal of Research in Personality, 47, 609-612. doi:10.1016/j.jrp.2013.05.009. Check also the supplementary website, where you can find the PDF of the paper. As an update of this post: here’s an

## Clustering and sector strength

January 21, 2013

An exploration of the usefulness of sectors. Previously: This subject was discussed in “S&P 500 sector strengths”. Idea: Stocks are put into groups based on the sector that the company is considered to be in. Cluster analysis is a statistical technique that finds groups. If sectors really move together, then clustering should recover sectors. Will … Continue reading...