3097 search results for "GIS"

Onepager Now with knitR

February 19, 2013

Since at some point I had trouble with a conflict between knitr and the LaTeX package textpos, I used the lesser Sweave in Another Experiment with R and Sweave. I ran the Sweave2knitr command and discovered that textpos and knitr play well togeth...

Read more »

Better modelling and visualisation of newspaper count data

February 19, 2013

In this post I outline how count data may be modelled using a negative binomial distribution in order to more accurately present trends in time series count data than using linear methods. I also show how to...

Read more »

Sketches Around Twitter Followers

February 19, 2013

I’ve been doodling… Following a query about the possible purchase of Twitter followers for various public figure accounts (I need to get my head round what the problem is with that exactly?!), I thought I’d have a quick look at some stats around follower groupings… I started off with a data grab, pulling down the

Read more »

New Rcpp master class scheduled for New York

February 18, 2013

A new Rcpp master class is scheduled for March 9 in New York. The format will be an updated version of the one-day workshops I have given at the University of Rochester in 2010, in San Francisco in 2011 (organised by Revolution Analytics) and at the UseR...

Read more »

Data fishing: R and XML part 3

February 18, 2013

I’ve recently posted two blog posts about gathering data from web pages using functions in R. Both examples showed how we can create our own custom functions to gather data about Minnesota lakes from the Lakefinder website. The first post was an example showing the use of R to create our own custom functions to get

Read more »

Predictors, responses and residuals: What really needs to be normally distributed?

February 18, 2013

Introduction: Many scientists are concerned about normality or non-normality of variables in statistical analyses. The following and similar sentiments are often expressed, published or taught: "If you want to do statistics, then everything needs to be normally distributed." "We normalized…

Read more »

Run production, one team at a time

February 17, 2013

In a previous post, I used R to process data from the Lahman database to calculate index values that compare a team's run production to the league average for that year.  For the purpose of that exercise, I started the sequence at 1947, but for what follows I re-ran the code with the time period...

Read more »

A look at strucchange and segmented

February 17, 2013

After last week's post, a comment suggested that strucchange and segmented would be more suitable for my purpose. I had a look at both. strucchange can find a jump in a time series, which was what I was looking for. In contrast, segmented is more suitable f...

Read more »

Finding outliers in numerical data

One of the topics emphasized in Exploring Data in Engineering, the Sciences and Medicine is the damage outliers can do to traditional data characterizations. Consequently, one of the procedures to be included in the ExploringData package is FindOutliers, described in this post. Given a vector of numeric values, this procedure supports four different methods for identifying possible outliers. Before...

Read more »

Video: Data Mining with R

February 15, 2013

Yesterday's Introduction to R for Data Mining webinar was a record setter, with more than 2000 registrants and more than 700 attending the live session presented by Joe Rickert. If you missed it, I've embedded the video replay below, and Joe's slides (with links to many useful resources) are also available. During the webinar, Joe demoed several examples of...

Read more »