**Why trust some supposed laws of statistical sampling and convergence when you can just test them yourself?** If you have a computer with `R` installed (also recommended: `RStudio`), then you can stop dithering about whether the `n=1000` studies cited in the newspapers resemble the truth closely enough.

    # make some people
    # let's say 1e5 one-dimensional people characterised by one parameter
    # like "wealth" or "health" or "support of some particular policy"
    # if you want you can create subsets like "Irish" and "English"
    # ... I'll leave that kind of fun to you
    base <- rnorm(1e5, mean=45, sd=4)
    # a single exp() keeps the values finite; exp(exp(exp(x))) overflows to Inf once x >= 2
    inheritance <- exp( rpois(1e5, 1.1) )
    luck <- base * inheritance * rpois(1e5, 2.1)
    extreme.luck <- rcauchy(1e5, location=45, scale=4)
    # summed on the raw scale: exp() of totals this large overflows to Inf for many people
    people <- base + inheritance + luck + extreme.luck
    # randomly sample the people
    Nielsen <- sample( people[1:1e5], 100, replace=F )
    # take some statistics of each and compare them
    mean(Nielsen)
    mean(people)
    # diff() takes lagged differences within one vector, so subtract directly
    mean(Nielsen) - mean(people)
    # and so on
    # compare histograms, compare medians, compare stdev's, compare kurtoses...

(Notice this is an economy with no geography, no choice, and no response.)
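The "compare medians, compare stdev's" idea can be pushed one step further by repeating the comparison at several sample sizes. What follows is only a rough sketch: the log-normal `population`, the `summarise` helper, and the chosen sample sizes are all assumptions made for illustration, not part of the original simulation.

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# Hypothetical stand-in population: log-normal "wealth" for 1e5 people
population <- rlnorm(1e5, meanlog = 3, sdlog = 1)

# Hypothetical helper: the summaries to compare between sample and population
summarise <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x))
}

sample.sizes <- c(10, 100, 1000, 10000)

# One random sample per size; each row is (sample summary - population summary)
errors <- t(sapply(sample.sizes, function(n) {
  summarise(sample(population, n, replace = FALSE)) - summarise(population)
}))
rownames(errors) <- paste0("n=", sample.sizes)

print(round(errors, 2))
```

Each row shows how far one sample's statistics landed from the population's. A single draw per size is noisy, so rerunning with different seeds gives a better feel for how quickly the errors shrink.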

You could also simulate “biased sampling” by grabbing, for example, `people[1:100]` rather than `sample(people[1:1e5], 100, replace=F)`. (This only biases the result if the order of `people` carries information, for instance after sorting.) Or, to be a little biased but also a little random, you could make an index vector `indexes.to.sample.from <- floor( runif( 100, min=1, max=316) ^2 )` and take `people[indexes.to.sample.from]`. (Squaring disperses the values with a bias towards the earlier indexes. Think about *that* meaning of the parabola picture!)
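Here is a hedged sketch of the three sampling schemes side by side. It assumes a hypothetical log-normal population stored in sorted order (`sorted.people`), because in an unsorted i.i.d. vector the first 100 entries are themselves a random sample and show no bias.

```r
set.seed(1)  # arbitrary seed for repeatability

# Hypothetical population, sorted so that position in the vector carries information
sorted.people <- sort(rlnorm(1e5, meanlog = 3, sdlog = 1))

# Three ways to pick 100 people
first.100   <- sorted.people[1:100]                     # fully biased: the 100 smallest
squared.idx <- floor(runif(100, min = 1, max = 316)^2)  # indexes skewed towards the start
semi.biased <- sorted.people[squared.idx]               # a little biased, a little random
random.100  <- sample(sorted.people, 100, replace = FALSE)  # simple random sample

estimates <- c(population  = mean(sorted.people),
               first.100   = mean(first.100),
               semi.biased = mean(semi.biased),
               random.100  = mean(random.100))
print(round(estimates, 2))
```

Comparing the four means gives a quick read on how much each scheme distorts the estimate; `hist(squared.idx)` shows the skew in the indexes directly.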

Nice way to play around with:

- Different functions for generating (and noising up) a bunch of sims
- Different measures of central tendency or spread (is `median` better than `mean`? You can prove it to yourself.)
- `R`. Not that we need more reasons to play around with R, but we will gladly accept them.
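One way to put the `median` versus `mean` question to the test is to draw many samples and see which statistic wobbles less from sample to sample. This is a sketch under assumptions: the Cauchy population reuses the `location=45, scale=4` parameters from `extreme.luck`, while the 1000 replicates and the use of `IQR` as the stability measure are illustrative choices.

```r
set.seed(7)  # arbitrary seed for repeatability

# Hypothetical heavy-tailed population, Cauchy with the post's parameters
population <- rcauchy(1e5, location = 45, scale = 4)

# Draw many samples of 100 and record both statistics for each
n.replicates   <- 1000
sample.means   <- numeric(n.replicates)
sample.medians <- numeric(n.replicates)
for (i in seq_len(n.replicates)) {
  s <- sample(population, 100, replace = FALSE)
  sample.means[i]   <- mean(s)
  sample.medians[i] <- median(s)
}

# Interquartile range of each estimator across replicates: smaller is more stable
spread <- c(mean = IQR(sample.means), median = IQR(sample.medians))
print(round(spread, 2))
```

On a heavy-tailed population like this one the median should come out far more stable. Swapping `rcauchy` for `rnorm` is a quick way to see that the answer depends on the shape of the population.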

To **leave a comment** for the author, please follow the link and comment on his blog: **Isomorphismes**.

**Tags:** biology, demographics, Economics, Geography, Monte Carlo, R, Simulation, statistics