2224 search results for "ggplot"

Sentiment analysis finds trouble in the Enron emails

May 24, 2013

The Enron email dataset, collected during the FERC investigation of the Enron financial scandal, represents the largest publicly available set of emails. This makes them an ideal testbed for sentiment analysis algorithms. Ikanow's Andrew Strite used the open-source Infinit.e framework and a Hadoop cluster to generate sentiment scores for all of the Enron emails, and then used R to manipulate...

Read more »

Veterinary Epidemiologic Research: Modelling Survival Data – Non-Parametric Analyses

May 23, 2013

Next topic from Veterinary Epidemiologic Research: chapter 19, modelling survival data. We start with non-parametric analyses where we make no assumptions about either the distribution of survival times or the functional form of the relationship between a predictor and survival. There are 3 non-parametric methods to describe time-to-event data: actuarial life tables, Kaplan-Meier method, and

Read more »

xkcd Style Bubble Plot

May 23, 2013

A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. So I set out to produce something like one of Hans Rosling’s bubble plots. First I needed some data. Spoilt for choice. I scraped some population data broken

Read more »

My Prime Sieve – Homage to Yitang Zhang

May 22, 2013

# As a homage to Yitang Zhang, who has proven a mind-bending property of prime pairs, I have written a prime sieve to detect all of the prime numbers from 1 to N. # There might very well be a function in the base package that already does this. No...

Read more »

Analytical and simulation-based power analyses for mixed-design ANOVAs

May 21, 2013

In this post I show some R examples of how to perform power analyses for mixed-design ANOVAs. The first example is analytical, adapted from formulas used in G*Power (Faul et al., 2007), and the second example is a Monte Carlo simulation.

Read more »

Mining the last French presidential debate

May 18, 2013

After reading this post (thanks to its author), I thought it would be interesting to replicate it with a setup specific to the French language, and to see whether we can get a quick overview of the debate between Sarkozy and Hollande in the second round of the last presidentia...

Read more »

Analyzing a simple experiment with heterogeneous variances using asreml, MCMCglmm and SAS

May 17, 2013

I was working with a small experiment which includes families from two Eucalyptus species and thought it would be nice to code a first analysis using alternative approaches. The experiment is a randomized complete block design, with species as a fixed effect and family and block as random effects, while the response variable is growth

Read more »

1.5 percent of doctors, a quarter of malpractice reports

May 14, 2013

Some doctors receive more malpractice reports than others. Just how unequal is the distribution of malpractice reports? The post 1.5 percent of doctors, a quarter of malpractice reports appeared first on Decision Science News.

Read more »

SIR Model – The Flu Season – Dynamic Programming

May 14, 2013

# The SIR (susceptible, infected, and recovered) model is a common and useful tool in epidemiological modelling. # In this post and in future posts I hope to explore how this basic model can be enriched by including different population group...

Read more »

Visualizing your websites’ ecommerce performance with R

May 14, 2013

In this blog post, I want to dive deeper into the relationship of Frequency and Recency of Visits with the Conversion Rate and Average Order Value. I have used the RGA package for data extraction and Dr. Hadley Wickham’s ggplot2 package to create the visualizations. Here’s the data aggregation script: #transactions dataframe

Read more »