2419 search results for "ggplot"

In case you missed it: March 2013 Roundup

April 10, 2013

In case you missed them, here are some articles from March of particular interest to R users. Facebook used R to analyze profile photo changes to create a map of same-sex marriage support in the USA. Joe Rickert contrasts random sampling with fitting models directly to large data sets. A presentation by Carlos Somohano summarizes the history, skills and...


R and social media

April 10, 2013

R is a piece of software, but it is also a community. Help community: The most visible aspect of the R community is help. This is also the most useful to new users. The initial sense of cooperation with R was driven mainly by people helping each other. You don’t need to actively participate in...


Behind the NCAA Visualizer: Python, R and JavaScript

April 9, 2013

Rodrigo Zamith's NCAA Tournament Visualizer is a great example of an interactive data visualization. If you want to create something similar, Rodrigo has shared detailed behind-the-scenes information on how it was created. He used a mix of tools: Python was used to scrape team statistics from the NCAA website; R was used to prepare the data for analysis, and...


Changing figure options mid-chunk (in a loop) using the pander package.

April 9, 2013

I have already written about changing figure options mid-chunk in reproducible research. This can be important, e.g., if you are looping through a dataset to produce a graphic for each variable but the figure width or height needs to depend on properties of the variables, e.g., if you are producing histograms and want the figures to


Starting Analysis and Visualisation of Spatial Data with R

April 8, 2013

Last week I ran an introductory workshop on the analysi


Dirichlet Process, Infinite Mixture Models, and Clustering

April 7, 2013

The Dirichlet process provides a very interesting approach to understanding group assignments and models for clustering effects. Oftentimes we encounter the k-means approach. However, it requires a fixed number of clusters. Often we encounter situations where we don’t know how many clusters we need. Suppose we’re trying to identify


Sync

April 7, 2013

I am listening to the audiobook Sync: How Order Emerges from Chaos in the Universe, Nature, and Daily Life by Steven Strogatz, which I got from Audible. Obviously a mathematical book is not ideal to listen to, but lacking illustrations I can ma...


Mortality after paediatric heart surgery using public domain data

April 6, 2013

This post comes with some big health warnings. The recent events in Leeds highlight the difficulties faced in judging the results of surgery by individual hospital. A clear requirement is timely access to data in a form easily digestible by the public. Here I’ve scraped the publicly available data from the central cardiac audit database


Worry about correctness and repeatability, not p-values

April 5, 2013

In data science work you often run into cryptic sentences like the following: Age adjusted death rates per 10,000 person years across incremental thirds of muscular strength were 38.9, 25.9, and 26.6 for all causes; 12.1, 7.6, and 6.6 for cardiovascular disease; and 6.1, 4.9, and 4.2 for cancer (all P < 0.01 for linear


Multiple pairwise comparisons for categorical predictors

April 5, 2013

Dale Barr (@datacmdr) recently had a nice blog post about coding categorical predictors, which reminded me to share my thoughts about multiple pairwise comparisons for categorical predictors in growth curve analysis. As Dale pointed out in his post, the R default is to treat the reference level of a factor as a...
