Blog Archives

Notes on Engineering Data Analysis (with R and ggplot2)

July 8, 2011

Hadley Wickham gave a Google Tech Talk a couple of weeks back titled "Engineering Data Analysis (with R and ggplot2)". These are my notes. The data analysis cycle is to iteratively transform, visualize and model. Leading into the cycle is data access an...

Drawing heatmaps in R

June 24, 2011

A while back, while reading chapter 4 of Using R for Introductory Statistics, I fooled around with the mtcars dataset, which gives mechanical and performance properties of cars from the early '70s. Let's plot this data as a hierarchically clustered heatmap.

    # scale data to mean=0, sd=1 and convert to matrix
    mtscaled <- as.matrix(scale(mtcars))

    # create...

Environments in R

June 4, 2011

One interesting thing about R is that you can get down into the insides fairly easily. You're allowed to see more of how things are put together than in most languages. One of the ways R does this is by having first-class environments. At first glance, environments are simple enough. An environment...
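The excerpt is cut off before any code appears, so the following is only a minimal sketch of what first-class environments look like in base R, not the post's own example. The names `e`, `bump` and `f` are made up for illustration.

```r
# Create a new environment and bind a name inside it
e <- new.env()
assign("x", 42, envir = e)

# Look the name up again, and list what the environment holds
get("x", envir = e)
ls(e)

# Environments have reference semantics: a function can modify one in place
bump <- function(env) env$x <- env$x + 1
bump(e)
e$x   # 43

# Each call to a function runs in a fresh environment whose parent is the
# environment in which the function was defined
f <- function() environment()
identical(parent.env(f()), environment(f))
```

Because environments are not copied when passed to a function, `bump(e)` changes `e` itself, which is different from how vectors and lists behave in R.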

Using R for Introductory Statistics 6, Simulations

March 21, 2011

R can easily generate random samples from a whole library of probability distributions. We might want to do this to gain insight into the distribution's shape and properties. A tricky aspect of statistics is that results like the central limit theore...
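As a rough illustration of the kind of simulation described here (a sketch, not the code from the post), the snippet below draws samples with base R's `r*` functions and looks at the means of repeated small samples from a skewed distribution. The seed, sample sizes and the choice of the exponential distribution are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed so the run is repeatable

# Draw random samples from a couple of built-in distributions
x_norm <- rnorm(1000, mean = 0, sd = 1)
x_exp  <- rexp(1000, rate = 1)

# Means of many small samples drawn from a skewed (exponential) distribution
sample_means <- replicate(2000, mean(rexp(30, rate = 1)))

mean(sample_means)  # should land near the population mean, 1
sd(sample_means)    # should land near 1 / sqrt(30)

# hist(sample_means) would show a roughly bell-shaped histogram
```

Even though individual exponential draws are strongly skewed, the distribution of their sample means is much closer to symmetric, which is the behaviour the central limit theorem predicts.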

Using R for Introductory Statistics, The Geometric distribution

March 13, 2011

We've already seen two discrete probability distributions, the binomial and the hypergeometric. The binomial distribution describes the number of successes in a series of independent trials with replacement. The hypergeometric distribution describes th...
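For the geometric distribution named in the title, a small sketch using base R's `dgeom` and `pgeom` may help. Note that R counts the number of failures before the first success; the success probability `p = 0.2` is a hypothetical value chosen only for illustration.

```r
p <- 0.2   # hypothetical success probability
k <- 0:5   # number of failures before the first success

# dgeom() uses the "failures before the first success" convention
dgeom(k, prob = p)

# The same values from the closed form p * (1 - p)^k
p * (1 - p)^k

# P(at most 5 failures before the first success), two ways
pgeom(5, prob = p)
1 - (1 - p)^6
```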

Using R for Introductory Statistics, Chapter 5, hypergeometric distribution

February 21, 2011

This is a little digression from Chapter 5 of Using R for Introductory Statistics that led me to the hypergeometric distribution. Question 5.13: A sample of 100 people is drawn from a population of 600,000. If it is known that 40% of the population h...
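The question text is cut off here, so the sketch below does not try to answer it. It only uses the numbers that are visible (a population of 600,000, 40% of whom share some property, and a sample of 100) to compare sampling without replacement (`dhyper`) against sampling with replacement (`dbinom`). The variable names are illustrative.

```r
N <- 600000   # population size, from the question
m <- 240000   # members with the property (40% of N)
k <- 100      # sample size
x <- 0:k      # possible counts of sampled members with the property

hyper <- dhyper(x, m = m, n = N - m, k = k)  # sampling without replacement
binom <- dbinom(x, size = k, prob = 0.4)     # sampling with replacement

# Largest difference between the two probability mass functions
max(abs(hyper - binom))
```

With a population this large relative to the sample, the two distributions are nearly indistinguishable, which is why the binomial is often used as an approximation in this situation.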

Using R for Introductory Statistics, Chapter 5, Probability Distributions

February 9, 2011

In Chapter 5 of Using R for Introductory Statistics we get a brief introduction to probability and, as part of that, a few common probability distributions. Specifically, the normal, binomial, exponential and lognormal distributions make an appearance....
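For the four distributions listed, base R follows a common naming scheme: a `d`, `p`, `q` or `r` prefix followed by the distribution's short name. The calls below are a brief sketch with arbitrarily chosen arguments, not examples taken from the book.

```r
# d = density/mass, p = cumulative probability, q = quantile, r = random draws
dnorm(0)       # density of the standard normal at 0
pnorm(1.96)    # P(Z <= 1.96), roughly 0.975
qnorm(0.975)   # inverse of pnorm, roughly 1.96

dbinom(3, size = 10, prob = 0.5)   # P(exactly 3 heads in 10 fair coin flips)
pexp(1, rate = 2)                  # P(X <= 1) for an exponential with rate 2
plnorm(1, meanlog = 0, sdlog = 1)  # P(X <= 1) for a lognormal; same as pnorm(0)
```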

Annotated source code

February 1, 2011

We programmers are told that reading code is a good idea. It may be good for you, but it's hard work. Jeremy Ashkenas has come up with a simple tool that makes it easier: docco. Ashkenas is also behind underscore.js and coffeescript, a dialect of ja...

Using R for Introductory Statistics, Chapter 5

January 23, 2011

Any good stats book has to cover a bit of basic probability. That's the purpose of Chapter 5 of Using R for Introductory Statistics, starting with a few definitions:

Random variable: A random number drawn from a population. A random variable is ...

Using R for Introductory Statistics, Chapter 4, Model Formulae

January 10, 2011

Several R functions take model formulae as parameters. Model formulae are symbolic expressions. They define a relationship between variables rather than an arithmetic expression to be evaluated immediately. Model formulae are defined with the tilde ope...
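A short sketch of the idea, using the built-in mtcars dataset: the formula is stored as a symbolic object and only interpreted when a function such as `lm` evaluates it against a data frame. The particular variables chosen (`mpg`, `wt`, `hp`) are an assumption for illustration, not necessarily the ones used in the post.

```r
# A formula is an unevaluated, symbolic object built with the tilde operator
f <- mpg ~ wt + hp
class(f)      # "formula"
all.vars(f)   # "mpg" "wt" "hp"

# Modelling functions interpret the formula against a data frame
fit <- lm(f, data = mtcars)
coef(fit)

# The same interface drives other functions, for example:
# boxplot(mpg ~ cyl, data = mtcars)
```

Note that `wt + hp` here means "include both terms in the model", not arithmetic addition, which is the distinction the paragraph above draws.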
