First, let's load up our data. The data are available in a gist. You can convert your own GPS data to .csv by following the instructions here, using gpsbabel.
gps <- read.csv("callan.csv", header = TRUE)
Next, we can use the function SMA fr...
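As a rough illustration of the smoothing step, here is a minimal sketch of a trailing simple moving average in base R. It is not the SMA function the excerpt refers to; it uses stats::filter as a stand-in. The data frame, its column names (time, speed) and the window size are hypothetical, since the contents of callan.csv are not shown here.

```r
# Hypothetical stand-in for the GPS data; the real columns of callan.csv
# are not shown in the excerpt, so "time" and "speed" are assumptions.
gps <- data.frame(
  time  = 1:10,
  speed = c(2.0, 2.4, 2.2, 3.0, 2.8, 3.2, 3.1, 2.9, 3.5, 3.3)
)

# Trailing simple moving average: each value is the mean of the current
# observation and the n - 1 observations before it. The first n - 1
# results are NA because the window is not yet full.
simple_moving_average <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

gps$speed_sma <- simple_moving_average(gps$speed, n = 3)
print(gps)
```

Calling stats::filter with its namespace avoids a clash with dplyr::filter if that package happens to be attached.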

Now, after reading in data, making plots and organising commands with scripts and Sweave, we’re ready to do some numerical data analysis. If you’re following this introduction, you’ve probably been waiting for this moment, but I really think it’s a good idea to start with graphics and scripting before statistical calculations. We’ll use the silly

How to control the limits of data values in R plots. R has multiple graphics engines. Here we will talk about the base graphics and the ggplot2 package. We’ll create a bit of data to use in the examples:
one2ten <- 1:10
ggplot2 demands that you have a data frame:
ggdat <- data.frame(first=one2ten, second=one2ten)
Seriously The post Plot...
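A minimal sketch of how the limits might be set in each engine, using the one2ten and ggdat objects from the excerpt. The limit values in lims are illustrative assumptions, not taken from the post, and the ggplot2 part runs only if that package is installed.

```r
one2ten <- 1:10
ggdat <- data.frame(first = one2ten, second = one2ten)

# Illustrative axis limits (an assumption, not from the original post)
lims <- c(0, 12)

# Base graphics: xlim and ylim set the axis ranges directly
plot(one2ten, one2ten, xlim = lims, ylim = lims)

# ggplot2: coord_cartesian() zooms the view without discarding any data
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(ggdat, ggplot2::aes(x = first, y = second)) +
    ggplot2::geom_point() +
    ggplot2::coord_cartesian(xlim = lims, ylim = lims)
  print(p)
}
```

In ggplot2, setting limits on the scales instead (for example with xlim()) removes points outside the range before any statistics are computed, whereas coord_cartesian() only changes the visible window.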

GitHub recently launched a more powerful search feature which has been used on more than one occasion to identify sensitive files that may be hosted in a public GitHub repository. When used innocently, there are all sorts of fun things you can find with this search feature. Inspired by Aldo Cortesi's post documenting his exploration

Last summer, I had some internet connectivity problems. Specifically, I had massive latency issues that affected my conversations on Skype and my efforts at online gaming, which are relatively pathetic under the best of circumstances. It was driving me up a wall and I couldn't figure it out. It hadn't...

The yhat blog lists 10 R packages they wish they'd known about earlier. Drew Conway calls them "10 reasons to always start your analysis in R". They're all very useful R packages that every data scientist should be aware of. They are: sqldf (for selecting from data frames using SQL), forecast (for easy forecasting of time series), plyr (data...

Does what it says on the tin. DOWNLOAD THE CODE
#------------------------------
#-------- INFORMATION ---------
#------------------------------
# Plotting points from Hugh
# Rollinson's "Using Geochemical
# Data" book. Code compiled by
# Darren J. Wilkinson,
# Grant Inst. Earth Science
# The University of Edinburgh
# [email protected]
#------------------------------
# -------- CONTROLS ----------
y.max = 16
x.min

This post describes linear regression as presented in the book Veterinary Epidemiologic Research, working through the book's examples in R. Regression analysis is used for modeling the relationship between a single variable Y (the outcome, or dependent variable), measured on a continuous or near-continuous scale, and one or more predictors (independent or explanatory variables), X. If
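To make the Y and X relationship concrete, here is a minimal sketch of a simple linear regression with lm() in base R. The data are simulated for illustration only and are not from Veterinary Epidemiologic Research; the true intercept (3), slope (2) and noise level are assumptions chosen for the example.

```r
# Simulated data (an assumption for illustration, not the book's data):
# a continuous outcome y that depends linearly on one predictor x
set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20, sd = 1)

# Fit y = b0 + b1 * x by ordinary least squares
fit <- lm(y ~ x)

# Estimated intercept and slope; these should be close to 3 and 2
print(coef(fit))

# Standard errors, t-tests and R-squared for the fitted model
print(summary(fit))
```

Additional predictors can be added on the right-hand side of the formula, for example y ~ x1 + x2, which matches the "one or more predictors" wording above.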