2185 search results for "ggplot2"

Visualizing Risky Words — Part 2

March 9, 2013

This is a follow-up to my Visualizing Risky Words post. You’ll need to read that for context if you’re just jumping in now. Full R code for the generated images (which are pretty large) is at the end. Aesthetics are the primary reason for using a word cloud, though one can pretty quickly recognize what

Analyzing SimplyStatistics visits info

March 9, 2013

Recently we had to analyze data on the number of visits per day to SimplyStatistics.org. There were two goals: (1) estimate the fraction of visitors retained after a spike in the number of visitors; (2) identify any factors that influence the fraction estimated in (1). For me it was a fun project, in part because I like SimplyStatistics but also...

A bit more on sample size

March 8, 2013

In our article "What is a large enough random sample?" we pointed out that if you wanted to measure a proportion to an accuracy “a” with chance of being wrong of “d” then an idea was to guarantee you had a sample size of at least: [formula not preserved]. This is the central question in designing opinion polls.

Visualizing rOpenSci collaboration

March 8, 2013

We (rOpenSci) have been writing code for R packages for a couple of years, so it is time to take a look back at the data. What data, you ask? The commits data from GitHub: data that records who did what and when. Using the GitHub commits API we can gather data on who committed code to a...

From OpenOffice noob to control freak: A love story with R, LaTeX and knitr

March 8, 2013

Lately I had to write a seminar paper for a class and I decided to overdo it. But let's start at the very beginning. Here is my evolution of how I used to write stuff and how I got from this: to that: School: OpenOffice - I guess everyone has some...

ddply in action

March 7, 2013

Top Batting Averages Over Time. Reference: http://www.baseball-databank.org/ Short: I'm going to use plyr and ggplot2 to look at how top batting averages have changed over time. First load the data: options(width = 100) library(ggplot2) ## Warning message: package 'ggplot2' was built under R version 2.14.2 library(plyr) data(baseball) head(baseball) ## ...

geom_point Legend with Custom Colors in ggplot

March 7, 2013

Previously, I showed how to make line segments using ggplot. Working from that previous example, there are only a few things we need to change to add custom colors to our plot and legend in ggplot. First, we'll add the colors of our choice. I'll do th...

Veterinary Epidemiologic Research: Linear Regression Part 2 – Checking assumptions

March 6, 2013

We continue with the linear regression chapter of the book Veterinary Epidemiologic Research. Using the same data as in the last post and running example 14.12: Now we can create some plots to assess the major assumptions of linear regression. First, let’s have a look at homoscedasticity, or constant variance of residuals. You can run a statistical test, the
