Preface: Every now and then I will post about my experience with R, a package for statistical analyses. I try to show some solutions for common types of analyses or problems you face when you start working with R. These …
Knitr is a great tool for doing reproducible research. You can produce all kinds of output inside a single knitr chunk, e.g. you can write a loop to produce lots of figures or tables. The only catch comes when you want your figures to have differing captions, heights, etc. (and usually you do). The standard
Now, after reading in data, making plots and organising commands with scripts and Sweave, we’re ready to do some numerical data analysis. If you’re following this introduction, you’ve probably been waiting for this moment, but I really think it’s a good idea to start with graphics and scripting before statistical calculations. We’ll use the silly 
How to control the limits of data values in R plots. R has multiple graphics engines. Here we will talk about the base graphics and the ggplot2 package. We’ll create a bit of data to use in the examples:

    one2ten <- 1:10

ggplot2 demands that you have a data frame:

    ggdat <- data.frame(first=one2ten, second=one2ten)

Seriously
GitHub recently launched a more powerful search feature which has been used on more than one occasion to identify sensitive files that may be hosted in a public GitHub repository. When used innocently, there are all sorts of fun things you can find with this search feature. Inspired by Aldo Cortesi's post documenting his exploration
Following the interest in our Twitter Tongues map for London, Ed Manley and I have teamed up with Trendsmap creator John Barratt to offer this snapshot of New York City’s Twitter languages. We have visualised the geography of about 8.5 million geo-located tweets collected between Jan 2010 and Feb 2013. Each tweet is marked by a slightly transparent dot ...
I was flipping through my copy of William Cleveland’s The Elements of Graphing Data the other day; it’s a book worth revisiting. I’ve always liked Cleveland’s approach to visualization as statistical analysis. His quest to ground visualization principles in the context of human visual cognition (he called it “graphical perception”) generated useful advice for designing
The yhat blog lists 10 R packages they wish they'd known about earlier. Drew Conway calls them "10 reasons to always start your analysis in R". They're all very useful R packages that every data scientist should be aware of. They are: sqldf (for selecting from data frames using SQL), forecast (for easy forecasting of time series), plyr (data...
Installing and changing fonts in your plots is now easy with the extrafont package. There is an excellent tutorial on the extrafont GitHub site; still, I will briefly demonstrate how it worked for me. First, install the package and load it. You can now install the desired system fonts (at the moment only TrueType fonts): The 