Timeline graph with ggplot2

July 7, 2012
By
Timeline graph with ggplot2

This post shows how to create a timeline graph by using ggplot2. Let’s start by loading the ggplot2 library. Next let’s create a dataset which we will use to feed the graph. In the last column (y), I create random positive values for the first three rows (which will be  Read more...

Read more »

Graphical insights from the 2012 UseR! Meeting

About this time last month, I attended the 2012 UseR! Meeting.  Now an annual event, this series of conferences started in Europe in 2004 as an every-other-year gathering that now seems to alternate between the U.S. and Europe.  This year’s meeting was held on the VanderbiltUniversity campus in Nashville, TN, and it was attended by about 500 R aficionados,...

Read more »

R, knitr & markdown = HTML

July 7, 2012
By
R, knitr & markdown = HTML

Welcome to this demo of how R code and results can be combined into an HTML report. This entire blogpost was generated by using a combination of R, knitr and markdown. Beforehand, make sure you have the following libraries installed (latest version); knitr markdown ggplot2 (to run the example script)  Read more »

Project Euler — problem 12

July 7, 2012
By

Going to supper in 20 minutes. I’d like to type down my solution to the 12th Euler problem, just make my time count. The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number … Continue reading →

Read more »

ggplot2 – much easier with JGR and Deducer

July 7, 2012
By
ggplot2 – much easier with JGR and Deducer

In the last R-User meeting in Cologne, we had a discussion about using ggplot2 – and I gave a short introduction of how to use ggplot2 with JGR and Deducer. Basically, JGR is a Graphical User Interface for R, and Deducer is a kind of “data analysis plugin”, that also comes with a so-called “plot

Read more »

The R Journal Volume 4/1, June 2012

July 6, 2012
By

As first reported by Paolo, the new R journal is out! You can Download the complete issue from here.  Refereed articles may be downloaded individually using the links below. Table of Contents Editorial 3   Contributed Research Articles   Analysing Seasonal Data  Adrian G Barnett, Peter Baker and Annette J Dobson 5 MARSS: Multivariate Autoregressive State-space...

Read more »

Three hours of pure soccer emotion, visualized with R

July 6, 2012
By
Three hours of pure soccer emotion, visualized with R

The biggest prize in UK soccer, the Premier League Championship, is decided by a points system. Unlike most sports competitions, there's no final round or playoff series: once the regular round of games is complete, the team that has accumulated the most points (three for a win, and one for a draw) is the champion of English football. In...

Read more »

Soda vs. Pop with Twitter

July 6, 2012
By
Soda vs. Pop with Twitter

One of the great things about Twitter is that it’s a global conversation anyone can join anytime. Eavesdropping on the world, what what! Of course, it gets even better when you can mine all this chatter to study the way humans live and interact. For example, how do people in New York City differ from those in Silicon Valley? We...

Read more »

Error metrics for multi-class problems in R: beyond Accuracy and Kappa

July 6, 2012
By
Error metrics for multi-class problems in R: beyond Accuracy and Kappa

The caret package for R provides a variety of error metrics for regression models and 2-class classification models, but only calculates Accuracy and Kappa for multi-class models.  Therefore, I wrote the following function to allow caret:::train t...

Read more »

RSAP, Rook and ERP

RSAP, Rook and ERP

As I wrote in my blog Analytics with SAP and R (Windows version) we can use RSAP to connect to our ERP system and play with the data. This time I wanted of course, to keep exploring the capabilities of RSAP, but using something else. As everybody kno...

Read more »

Fix Overplotting with Colored Contour Lines

July 6, 2012
By
Fix Overplotting with Colored Contour Lines

I saw this plot in the supplement of a recent paper comparing microarray results to RNA-seq results. Nothing earth-shattering in the paper - you've probably seen a similar comparison many times before - but I liked how they solved the overplotting...

Read more »

Interest Differencing: Folk Commonly Followed by Tweeting MPs of Different Parties

July 6, 2012
By
Interest Differencing: Folk Commonly Followed by Tweeting MPs of Different Parties

Earlier this year I doodled a recipe for comparing the folk commonly followed by users of a couple of BBC programme hashtags (Social Media Interest Maps of Newsnight and BBCQT Twitterers). Prompted in part by a tweet from Michael [email protected] about generating an ESP map for UK politicians (something I’ve also doodled before – Sketching

Read more »

A practical introduction to garch modeling

July 6, 2012
By
A practical introduction to garch modeling

We look at volatility clustering, and some aspects of modeling it with a univariate GARCH(1,1) model. Volatility clustering Volatility clustering — the phenomenon of there being periods of relative calm and periods of high volatility — is a seemingly universal attribute of market data.  There is no universally accepted explanation of it. GARCH (Generalized AutoRegressive … Continue reading...

Read more »

The R Journal Volume 4/1

July 6, 2012
By
The R Journal Volume 4/1

The 'Summer edition' of the R Journal is out! Get it from here.

Read more »

automated cell phenotyping — R package “EBImage”

July 5, 2012
By
automated cell phenotyping — R package “EBImage”

Counting cells under microscope is always laborious and null. Those in the art would be relieved with assistance of a powerful image processing package, EBImage. Images are treated as “Image” objects, essentially multi-dimensional arrays. The class “Image” contains spatial information, pixel … Continue reading →

Read more »

More Exploration of Crazy RUT

July 5, 2012
By
More Exploration of Crazy RUT

Unintentionally while playing with the lawstat package in R, I started trying to build systems (STANDARD DISCLAIMER: NOT INVESTMENT ADVICE AND WILL LOSE LOTS OF MONEY SO PROCEED WITH CAUTION) based on the Jarque Bera test of normality (entry in Wikiped...

Read more »

A better ‘nls’ (?)

July 5, 2012
By
A better ‘nls’ (?)

Those that do a lot of nonlinear regression will love the nls function of R. In most of the cases it works really well, but there are some mishaps that can occur when using bad starting values for the parameters. One of the most dreaded is the “singular gradient matrix at initial parameter estimates” which

Read more »

Health Care Costs – Part 1, "The Problem"

July 5, 2012
By
Health Care Costs – Part 1, "The Problem"

The Problem In the United States, health care costs have been going up for a number of years, even when adjusted for inflation. Not unlike a runaway freight train, this rampant inflation cannot continue indefinitely without crashing. ...

Read more »

New R User Group in Leipzig, Germany

July 5, 2012
By

Leipzig R Statistical Computing is the sixth local R user group in Germany, and has been holding meetings since February. In the next meeting on July 12, member Claudia Beleites will talk about her pacakges softclassval (for classifier performance measures) and hyperspec (for hyperspectral data). meetup.com: Leipzig R Statistical Computing

Read more »

Validating email adresses in R

July 5, 2012
By

I currently program an automated report generation in R – participants fill out a questionnaire, and they receive a nicely formatted pdf with their personality profile. I use knitr, LaTex, and the sendmailR package. Some participants did not provide valid email addresses, which caused the sendmail function to crash. Therefore I wanted some validation of

Read more »

A tiny RCurl headache ;)

July 4, 2012
By

As more and more data go online (plus we love Google Drive) we are forced to connect to our data over the net. We mostly do this via RCurl (but we could do this using RGoogleDocs as well).In that case all that is required to get the data into R is the two lines of ...read more

Read more »

A new open journal on Data Science

July 4, 2012
By

Springer has introduced a new open, peer-reviewed journal focused on Data Science: EPJ Data Science. What makes this a Data Science journal is novel uses of statistics, data analysis, computer techniques and public data sources to research a topic in another domain, rather than methodological research. Here are a few examples of the papers you'll find in the journal:...

Read more »

Alternative to Monte Carlo Testing

July 4, 2012
By
Alternative to Monte Carlo Testing

When we backtest a strategy on a portfolio, it is a simple analysis of a single period in time. There are ways to “stress test” a strategy such as monte carlo, random portfolios, or shuffling the returns in a random order. I could never really wrap my head around monte carlo and shuffling the returns … Continue reading...

Read more »

Three Questions about a Matrix of Coefficient Plots

July 4, 2012
By

It's Independence Day in the U.S., so I am taking the day off, but I received the following request for advice and thought I'd pass it along to my readers. I wonder if you could help – I am trying to create 9 different coefficient plots , which repr...

Read more »

A tutorial on outlier detection techniques

July 4, 2012
By
A tutorial on outlier detection techniques

by Yanchang Zhao, RDataMining.com There is an excellent tutorial on outlier detection techniques, presented by Hans-Peter Kriegel et al. at ACM SIGKDD 2010. It presents many popular outlier detection algorithms, most of which were published between mid 1990s and 2010, … Continue reading →

Read more »

The Higgs boson: 5-sigma and the concept of p-values

July 4, 2012
By
The Higgs boson: 5-sigma and the concept of p-values

Why are physicists talking about 5-sigma, and what's it got to do with statistics? In this short post I'll explain what 5-sigma is and why it's not a measure of how certain scientist are that they've found the Higgs boson

Read more »

Glmnet_1.8 uploaded to CRAN

July 4, 2012
By

(by Trevor Hastie) Glmnet_1.8 uploaded to CRAN – This is a major revision, with two additional models included. 1) Multiresponse regression – family=”mgaussian” Here we have a matrix of M responses, and we fit a series of linear models in parallel. We use a group-lasso penalty on the set of M coefficients for each variable. This means they are...

Read more »

To the Basics: Bayesian Inference on A Binomial Proportion

July 4, 2012
By
To the Basics: Bayesian Inference on A Binomial Proportion

Think of something observable – countable – that you care about with only one outcome or another. It could be the votes cast in a two-way election in your town, or the free throw shots the center on your favorite...

Read more »

Example of Factor Attribution

July 3, 2012
By
Example of Factor Attribution

In the prior post, Factor Attribution 2, I have shown how Factor Attribution can be applied to decompose fund’s returns in to Market, Capitalization, and Value factors, the “three-factor model” of Fama and French. Today, I want to show you a different application of Factor Attribution. First, let’s run Factor Attribution on each the stocks

Read more »

Sponsors

Mango solutions



plotly webpage

dominolab webpage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training

datasociety

http://www.eoda.de





ODSC

ODSC

CRC R books series





Six Sigma Online Training









Contact us if you wish to help support R-bloggers, and place your banner here.