453 search results for "boxplot"

Use box plots to assess the distribution and to identify the outliers in your dataset

August 14, 2015

After you check the distribution of the data by plotting a histogram, the next step is to look for outliers. Identifying outliers is important because an association you find in your analysis might be explained by their presence. The best tool to identify the outliers is

Read more »

Importing the New Zealand Income Survey SURF

August 14, 2015

The quest for income microdata: For a separate project, I've been looking for source data on income and wealth inequality. Not aggregate data like Gini coefficients or the percentage of income earned by the bottom 20% or top 1%, but the sources used to calculate those things. Because it's sensitive personal financial data either from surveys or tax...

Read more »

cricketr adapts to the Twenty20 International!

August 8, 2015

Introduction: This should be the last in the series of posts based on my R package cricketr. That is, unless some bright idea comes trotting along and light bulbs go on around my head. In this post cricketr adapts to the Twenty20 International format. Now cricketr can handle stats from all 3 formats of the game

Read more »

Airline Performance Comparison with R/Shiny

August 2, 2015

In this project I set out to build an interactive app to

Read more »

cricketr plays the ODIs!

August 2, 2015

Introduction: In this post my package ‘cricketr’ takes a swing at One Day Internationals (ODIs). Like Test batsmen who adapt to ODIs with some innovative strokes, the cricketr package has some additional functions and some modified functions to handle the high strike and economy rates in ODIs. As before I have chosen my top 4 ODI

Read more »

15 Questions All R Users Have About Plots

July 30, 2015

R allows you to create different plot types, ranging from basic graph types like density plots, dot plots, bar charts, line charts, pie charts, boxplots and scatter plots, to more statistically complex types of graphs such as probability plots, mosaic plots and correlograms. In addition, R is well known for its data visualization

Read more »

Computing AIC on a Validation Sample

July 29, 2015

This afternoon, we’ve seen in the training on data science that it was possible to use AIC criteria for model selection.

> library(splines)
> AIC(glm(dist ~ speed, data=train_cars, family=poisson(link="log")))
438.6314
> AIC(glm(dist ~ speed, data=train_cars, family=poisson(link="identity")))
436.3997
> AIC(glm(dist ~ bs(speed), data=train_cars, family=poisson(link="log")))
425.6434
> AIC(glm(dist ~ bs(speed), data=train_cars, family=poisson(link="identity")))
428.7195

And I’ve been asked...

Read more »

Why I use Panel/Multilevel Methods

July 24, 2015

I don’t understand why any researcher would choose not to use panel/multilevel methods on panel/hierarchical data. Let’s take the following linear regression as an example: , where is a random effect for the i-th group. A pooled OLS regression model for the above is unbiased and consistent. However, it will be inefficient, unless for all

Read more »
