Using Eigen for eigenvalues

January 11, 2013

A previous post showed how to compute eigenvalues using the Armadillo library via RcppArmadillo. Here, we do the same using Eigen and the RcppEigen package. The code begins: #include <RcppEigen.h> // [[Rcpp::depends(RcppEigen)]] using Eigen::Map; ...

Seasonal Trend Decomposition in R

January 11, 2013

Seasonal Trend Decomposition using Loess (STL) is an algorithm developed to divide a time series into three components: trend, seasonality and remainder. The methodology was presented by Robert Cleveland, William Cleveland, Jean McRae and Irma Terpenning in the Journal of Official Statistics in 1990. The STL is ...
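
As a rough illustration of the three components named in this excerpt, the sketch below runs the stl() function from R's built-in stats package on the bundled nottem series (average monthly temperatures at Nottingham). The choice of dataset and the s.window = "periodic" setting are assumptions made for this example, not details taken from the post.

```r
# nottem ships with base R: monthly temperatures at Nottingham (assumed example data)
fit <- stl(nottem, s.window = "periodic")

# The fitted components are stored as a multivariate time series
# with columns "seasonal", "trend" and "remainder"
components <- fit$time.series

# The decomposition is additive: the components sum back to the original series
recomposed <- components[, "seasonal"] + components[, "trend"] + components[, "remainder"]
max_gap <- max(abs(as.numeric(nottem) - as.numeric(recomposed)))
```

Calling plot(fit) afterwards draws the original series and the three components as stacked panels.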

Rmathlib and kdb+ part 3: Utility Functions

January 11, 2013

In the first two parts of this series, I looked at the basics of the interface I created between rmathlib and kdb+. In this post, I’ll go through some of the convenience functions I wrote to emulate some basic R …

Shading regions under a curve

January 11, 2013

Over on the Clastic Detritus blog, Brian Romans posted a nice introduction to plotting in R. At the end of his post, Brian mentioned he would like to colour in areas under the data curve corresponding to particular ranges of grain sizes. The comment area on a blog isn’t really amenable to giving a full answer to the...
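
One common way to colour in part of the area under a curve in base R is to build a closed polygon from the curve and the baseline and draw it with polygon(). The sketch below is only an illustration under assumed inputs: the normal density stands in for the grain-size data, the range from -1 to 1 is arbitrary, and shade_vertices() is a hypothetical helper, not a function from the post.

```r
# Hypothetical stand-in data: a smooth curve on a regular grid
x <- seq(-3, 3, length.out = 201)
y <- dnorm(x)

# Vertices of the region under the curve between lower and upper,
# closed off along the baseline y = 0 (hypothetical helper)
shade_vertices <- function(x, y, lower, upper) {
  keep <- x >= lower & x <= upper
  xs <- x[keep]
  list(x = c(xs[1], xs, xs[length(xs)]),
       y = c(0, y[keep], 0))
}

v <- shade_vertices(x, y, lower = -1, upper = 1)

plot(x, y, type = "l")                           # draw the curve
polygon(v$x, v$y, col = "grey70", border = NA)   # fill the chosen range
lines(x, y)                                      # redraw the curve over the fill
```

Several ranges can be shaded by calling the helper once per range and passing a different col each time.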

DC R Meetup: “Analyze US Government Survey Data with R”

January 10, 2013

I really enjoyed tonight’s DC R Meetup, presented by the prolific Anthony Damico. I’ve met Anthony before to discuss whether the Census Bureau could either… publish R-readable …

Gauge Chart in R

January 10, 2013

How do you replicate a Google gauge chart in R? Google Charts has several options for producing nice graphics. Most of them have an equivalent function in R and can be quickly replicated, but some require a bit of programming. For instance, take the Google gauge charts, which I really like: A gauge is …

Webinar Jan 24: Using R with Hadoop

January 10, 2013

In two weeks (on January 24), Think Big Analytics' Jeffrey Breen will present a new webinar on using R with Hadoop. Here's the webinar description: R and Hadoop are changing the way organizations manage and utilize big data. Think Big Analytics and Revolution Analytics are helping clients plan, build, test and implement innovative solutions based on the two technologies...

Rmathlib and kdb+, part 2 – Probability Distribution Functions

January 10, 2013

Following on from the last post on integrating some rmathlib functionality with kdb+, here is a sample walkthrough of how some of the functionality can be used, including some of the R-style wrappers I wrote to emulate some of the …

Simulation of landmine clearing with Massoud Hassani’s Mine Kafon

January 10, 2013

Code used: MineClearingSimulationWithKafons.r TRANSCRIPT OF VIDEO: Hello, I’m Matt Asher with StatisticsBlog.com. This video is about my attempt to simulate a landmine clearing device built by Massoud Hassani called the Mine Kafon. I’ve put a link to his webpage at StatisticsBlog.com, I highly recommend checking out the video. Hassani’s device looks like this: It’s a

My Personal Intro to F1 Race Statistics

January 10, 2013

One of the many things I keep avoiding is statistics. I’ve never really been convinced about the 5% significance level thing; as far as I can tell, hardly anything that’s interesting normally distributes; all the counting that’s involved just confuses me; and I never really got to grips with confidently combining probabilities. I find a

Formulae in R: ANOVA and other models, mixed and fixed

January 10, 2013

R’s formula interface is sweet but sometimes confusing. ANOVA is seldom sweet and almost always confusing. And random (a.k.a. mixed) versus fixed effects decisions seem to hurt people’s heads too. So, let’s dive into the intersection of these three. I’m aware that there are lots of packages for running ANOVA models that make things nicer ...

R for actuarial science

January 10, 2013

As mentioned in the Appendix of Modern Actuarial Risk Theory, “R (and S) is the ‘lingua franca’ of data analysis and statistical computing, used in academia, climate research, computer science, bioinformatics, pharmaceutical industry, customer analytics, data mining, finance and by some insurers. Apart from being stable, fast, always up-to-date and very versatile, the chief advantage of R is that...

Stacked Bar Charts in R

January 10, 2013

Reshape Wide to Long

Let's use the Loblolly dataset from the datasets package. These data track the growth of some loblolly pine trees.

> Loblolly
   height age Seed
1    4.51   3  301
15  10.89   ...

Vector fields with streamlines

January 10, 2013

A new version of rasterVis is available on CRAN. This version includes several bug fixes and a new method to …

Install R in Ubuntu 12.04 Precise Pangolin

January 10, 2013

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. One of my main motivations for installing R is Sweave. Sweave is a literate programming tool that integrates LaTeX and R code. The main idea of Sweave is to combine data analysis code...

Lomb-Scargle periodogram for unevenly sampled time series

January 10, 2013

In the natural sciences, it is common to have incomplete or unevenly sampled time series for a given variable. Determining cycles in such series is not directly possible with methods such as Fast Fourier Transform (FFT) and may require some degree of interpolation to fill in gaps. An alternative is the Lomb-Scargle method (or...

Optimizing parameters for an oscillator – Video

January 10, 2013

Here’s a video of how the modFit function from the FME package optimizes parameters for an oscillation. A Nelder-Mead optimizer (R function optim) finds the best-fitting parameters for an undamped oscillator. The minimum was found after 72 iterations; the true parameter eta was -.05: Evolution of parameters in optimization process from Felix Schönbrodt on Vimeo. More on estimating ...

A first lambda function with C++11 and Rcpp

January 10, 2013

Yesterday’s post started to explore the nice additions which the new C++11 standard is bringing to the language. One particularly interesting feature is lambda functions, which resemble the anonymous functions R programmers have enjoyed all along...

Reading Codebook Files in R

January 10, 2013

One issue I continually encounter when starting to work with a new dataset is that of the codebook. In general, I prefer to load a codebook into R like any other data source, specifically as a data frame. And ideally, one data frame provides the variable names with descriptions and any other metadata available, and a separate...

Maps in R: Plotting data points on a map

January 10, 2013

In the introductory post of this series I showed how to plot empty maps in R. Today I'll begin to show how to add data to R maps. The topic of this post is the visualization of data points on …

Elements of Statistical Learning: free book download

January 9, 2013

The go-to bible for this data scientist and many others is The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Each of the authors is an expert in machine learning / prediction, and in some cases invented the techniques we turn to today to make sense of big data: ensemble...

Every NFL punt since 2002

January 9, 2013

The site reddit told us about data on every single NFL (U.S. National Football League) play since 2002. We read it in and did an analysis of punting. The results are beautiful. The post Every NFL punt since 2002 appeared first on Decision Science News.

Getting Access data into R

January 9, 2013

Revisiting Cronbach 1951 via Simulation with Shiny

January 9, 2013

At the time of the creation of this blog, Cronbach’s 1951 piece on coefficient alpha had 18,132 citations according to Google Scholar. The main use of coefficient alpha is to assess internal consistency reliability of a test or survey. Although it may have been forgotten, the proof Cronbach demonstrated established that coefficient alpha is the mean of all split...

Factor Analysis of Baseball’s Hall of Fame Voters

January 9, 2013

Recently, Nate Silver wrote a post which analyzed how voters who voted for and against Barry Bonds for Baseball's Hall of Fame differed. Not surprisingly, those who voted for Bonds were more likely to vote for other suspected steroid users (like Roger Clemens). This got...

WordPress Stats in R

January 9, 2013

A trackback from Martin Hawksey’s recent post on Analysing WordPress post velocity and momentum stats with Google Sheets (Spreadsheet), which demonstrates how to pull WordPress stats into a Google Spreadsheet and generates charts and reports therein, reminded me of the WordPress stats API. So here’s a quick function for pulling WordPress reports into R. (Code
