In my previous post I showed an animation of Arctic Sea Ice Extent from the 1980′s through August, 2012 (link). In this post, I show how to build this Arctic Sea ice Extent animated chart. Source Data The Arctic Ice … Continue reading →

I work in an environment dominated by SAS, and I am looking to integrate R into our environment. Why would I want to do such a thing? First, I do not want to get rid of SAS. That would not only take away most of our investment in SAS training and hiring good quality SAS programmers, but...

RStudio and knitr are an excellent conbination for generating dynamic reports. But in this blog, I will show you how to generate HTML-style presentaion using R only. OK, I confess that we still need something else: deck.js and markdown and R.utils. ...

R Packages All Well maintained? There are so many R packages, can they all be trusted? or are they well maintained? To answer this question, we just need to take a look of their archive histories. If a package has many versions, we can take that as th...

R Packages growth Curve Why R is so popular? There are a lot of reasons, such as: easy to learn and convenient to use, active community, open source, etc. Another important reason is the numerous contributed packages. Up to yesterday, there are 4033 R...

Most of regression methods assume that response variables follow some exponential distribution families, e.g. Guassian, Poisson, Gamma, etc. However, this assumption was frequently violated in real world by, for example, zero-inflated overdispersion problem. A number of methods were developed to deal with such problem, and among them, Quasi-Poisson and Negative Binomial are the most popular methods perhaps due to that...

My coworkers at Fred Hutchinson regularly use the development version of R (i.e., R-devel) and have urged me to do the same. This post details how I have set up the development version of R on our Linux server, which I use remotely because it is much faster than my Mac. First, I downloaded the R-devel source into ~/local/, which...

In our article How robust is logistic regression? we pointed out some basic yet deep limitations of the traditional full-step Newton-Raphson or Iteratively Reweighted Least Squares methods of solving logistic regression problems (such as in R‘s standard glm() implementation). In fact in the comments we exhibit a well posed data fitting problem that can not Related posts:

The Quick-and-Dirty Summary I was recently asked to participate in a proposed SXSW panel that will debate the question, “Will Data Scientists Be Replaced by Tools?” This post describes my current thinking on that question as a way of (1) convincing you to go vote for the panel’s inclusion in this year’s SXSW and (2)