Fumblings with Ranked Likert Scale Data in R

July 9, 2012

The code is horrible and the visualisations quite possibly misleading, but I’m dead tired and there are a couple of tricks in the following R code that I want to remember, so here’s a contrived bit of fumbling with some data of the form: enjoyCompany tooMuchFamily 1 strongly agree strongly disagree 2 strongly agree strongly

Read more »

A Napa Valley wine tasting map, made with R and ggmap

July 9, 2012

R has had a maps package available since the very early days. It's great for simple geographic maps, but the political boundaries can be out of date. For more detailed maps, you can also download shape files and use the sp package to draw borders directly. But for accurate and attractive maps of countries, roads and satellite imagery, nothing...

Read more »

Map biodiversity records with rgbif, maps and ggplot2 packages in R

July 9, 2012

The Global Biodiversity Information Facility (GBIF) is an international consortium working towards making biodiversity information available to everyone through a single portal. GBIF and its partners are working towards mobilizing data, developing data and metadata standards, developing a distributed database system and making the data accessible through APIs. At this point this is the largest single-window data source covering a wide spectrum of taxa and

Read more »

Example 9.37: (Mis)behavior of binomial confidence intervals

July 9, 2012

While traditional statistics courses teach students to calculate intervals and test for binomial proportions using a normal or t approximation, this method does not always work well. Agresti and Coull ("Approximate is better than 'exact' for interval estimation of binomial proportions", The American Statistician, 1998; 52:119-126) demonstrated this and reintroduced an...
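The excerpt is cut off before it names the alternative method, so the following is only a sketch under an assumption: it compares the textbook normal-approximation (Wald) interval with the adjusted interval commonly named after Agresti and Coull. The function name `binom_ci` and its arguments are hypothetical and are not taken from the original post.

```r
# Sketch: Wald vs. Agresti-Coull intervals for a binomial proportion.
# binom_ci() is a hypothetical helper, not code from the original post.
binom_ci <- function(x, n, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)          # normal quantile for the level

  # Wald: centre on the sample proportion.
  p_hat <- x / n
  wald_half <- z * sqrt(p_hat * (1 - p_hat) / n)

  # Agresti-Coull: add z^2 / 2 pseudo-successes and z^2 pseudo-trials,
  # then apply the same normal-approximation formula.
  n_tilde <- n + z^2
  p_tilde <- (x + z^2 / 2) / n_tilde
  ac_half <- z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)

  rbind(
    wald          = c(lower = p_hat - wald_half,  upper = p_hat + wald_half),
    agresti_coull = c(lower = p_tilde - ac_half,  upper = p_tilde + ac_half)
  )
}

# With zero successes the Wald interval collapses to a single point,
# while the adjusted interval keeps a positive width.
ci <- binom_ci(x = 0, n = 10)
print(ci)
```

The case of zero successes in ten trials is one simple way to see the misbehaviour the title refers to: the Wald interval has zero width even though ten trials say little about the true proportion.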

Read more »

Trend and Spatial Pattern of Poverty in the Philippines

July 9, 2012

In a teaching demo that I conducted, I discussed how R can be used to analyze trends and spatial patterns of poverty incidence in the Philippines. Playing with the data I got from the National Statistical Coordination Board, below is what I got...

Read more »

leaf area measuring — R package “EBImage”

July 9, 2012

Besides microscopic images in our routine, common photos are frequently taken to measure quantitative plant features, such as leaf area, root length, branch numbers, etc. Scientific software is available for manual processing. For example, to measure the root length, one needs to use the … Continue reading →

Read more »

Network Visualization of Key Driver Analysis

July 8, 2012

Whatever happened to those evaluations that your airline asked you to complete after taking a flight? They ask you for a number of ratings about buying your ticket, attributes of the plane, the service you received, and if you were satisfied, if you wo...

Read more »

Bubble Plots (ggplot2)

July 8, 2012

1 Introduction. Rarely have I seen a three-dimensional graph including time, value, and volatility. It is essenti

Read more »

New package RcppCNPy with release 0.1.0 (and 0.0.1 earlier last week)

A few days ago I had blogged about getting NumPy data in R by using a simple converter script. That works fine, but it is a little annoying to have to write an entire file only to read from it again. So I kept looking around for a better solution---and soon found the cnpy library by Carl Rogers which provides simple C++...

Read more »

Representation of numerical NA’s in R and the 1954 enigma

July 8, 2012

I've always wondered how exactly the missing value (NA) in R is represented under the hood. Last weekend I was working on a little project that gave me enough excuse to spend some time on finding this out. So, I … Continue reading →
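As a rough illustration of where the number in the title shows up, the sketch below inspects the raw bytes of R's double NA. It assumes doubles are stored in IEEE 754 format; the variable names are made up for this example and do not come from the original post.

```r
# Sketch: look at the bit pattern behind NA_real_.
# Serialise the double NA to 8 raw bytes, least significant byte first.
na_bytes <- writeBin(NA_real_, raw(), endian = "little")

# Read the low 32 bits back as an integer.
low_word <- readBin(na_bytes[1:4], what = "integer", size = 4,
                    endian = "little")
print(low_word)

# NA is stored as a NaN bit pattern, but R keeps the two apart:
print(is.na(NA_real_))   # NA counts as missing
print(is.nan(NA_real_))  # but is not reported as NaN
```

The low word is what lets R tell a missing value apart from an ordinary NaN produced by arithmetic such as `0 / 0`.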

Read more »

Fitting a dynamic model, and determining the number of parameters that can be fitted.

July 8, 2012

Let's suppose that we have the same dynamic model we presented before - that is, the Lorenz system of differential equations. Remember? In order to perform a fitting we need to define an objective function of sorts: this will then be minimised. Now,...

Read more »

Universal portfolio, part 7

July 7, 2012

After reproducing all original figures and tables from Universal Portfolios, R coupled with modern processors allows us to perform some more analysis. First we calculate the final wealth of the universal portfolio for all possible pairs of stocks, and...

Read more »

SMS analysis (coming from an Android smartphone or an iPhone)

July 7, 2012

At first, this post was intended to describe how to manipulate dates with R but, as the idea was coming from the question of one of my students who wanted to analyze his SMS, I thought that I might as well also explain the whole analysis process... Using my new smartphone (that I started to

Read more »

The Actuary Puzzle 508 – Square numbers

Author: Matt Malin. From the puzzle pages of The Actuary, June 2012, I attempt to solve the following, making use of R: This square contains exactly 21 smaller squares. Each of these smaller squares has sides of integer length, with no two smaller squares having sides of the same length. Can you find a solution for...

Read more »

Timeline graph with ggplot2

July 7, 2012

This post shows how to create a timeline graph by using ggplot2. Let’s start by loading the ggplot2 library. Next let’s create a dataset which we will use to feed the graph. In the last column (y), I create random positive values for the first three rows (which will be...

Read more »

Graphical insights from the 2012 UseR! Meeting

About this time last month, I attended the 2012 UseR! Meeting. Now an annual event, this series of conferences started in Europe in 2004 as an every-other-year gathering that now seems to alternate between the U.S. and Europe. This year’s meeting was held on the Vanderbilt University campus in Nashville, TN, and it was attended by about 500 R aficionados,...

Read more »

R, knitr & markdown = HTML

July 7, 2012

Welcome to this demo of how R code and results can be combined into an HTML report. This entire blogpost was generated by using a combination of R, knitr and markdown. Beforehand, make sure you have the following libraries installed (latest version): knitr, markdown, and ggplot2 (to run the example script).

Read more »

Project Euler — problem 12

July 7, 2012

Going to supper in 20 minutes. I’d like to type down my solution to the 12th Euler problem, just make my time count. The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number … Continue reading →
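The excerpt stops before the solution, so what follows is not the author's code. It is a minimal sketch of one straightforward approach, assuming the task is to find the first triangle number with more than a given number of divisors; all three function names are hypothetical.

```r
# Sketch: triangle numbers and divisor counting (hypothetical helpers).

# n-th triangle number: 1 + 2 + ... + n.
triangle <- function(n) n * (n + 1) / 2

# Count divisors of m by testing candidates up to sqrt(m).
# Each divisor k <= sqrt(m) pairs with m / k; a perfect square's
# root would otherwise be counted twice.
count_divisors <- function(m) {
  k <- seq_len(floor(sqrt(m)))
  small <- k[m %% k == 0]
  2 * length(small) - sum(small^2 == m)
}

# First triangle number with more than `min_divisors` divisors.
first_triangle_with <- function(min_divisors) {
  n <- 1
  while (count_divisors(triangle(n)) <= min_divisors) n <- n + 1
  triangle(n)
}

print(triangle(7))             # 28
print(count_divisors(28))      # 6: 1, 2, 4, 7, 14, 28
print(first_triangle_with(5))  # 28
```

The same call with a larger threshold follows the identical loop; the brute-force divisor count is the part that would benefit most from a faster method.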

Read more »

ggplot2 – much easier with JGR and Deducer

July 7, 2012

In the last R-User meeting in Cologne, we had a discussion about using ggplot2 – and I gave a short introduction of how to use ggplot2 with JGR and Deducer. Basically, JGR is a Graphical User Interface for R, and Deducer is a kind of “data analysis plugin”, that also comes with a so-called “plot

Read more »

The R Journal Volume 4/1, June 2012

July 6, 2012

As first reported by Paolo, the new R journal is out! You can download the complete issue from here. Refereed articles may be downloaded individually using the links below. Table of Contents: Editorial. Contributed Research Articles: Analysing Seasonal Data (Adrian G Barnett, Peter Baker and Annette J Dobson); MARSS: Multivariate Autoregressive State-space...

Read more »

Three hours of pure soccer emotion, visualized with R

July 6, 2012

The biggest prize in UK soccer, the Premier League Championship, is decided by a points system. Unlike most sports competitions, there's no final round or playoff series: once the regular round of games is complete, the team that has accumulated the most points (three for a win, and one for a draw) is the champion of English football. In...

Read more »

Soda vs. Pop with Twitter

July 6, 2012

One of the great things about Twitter is that it’s a global conversation anyone can join anytime. Eavesdropping on the world, what what! Of course, it gets even better when you can mine all this chatter to study the way humans live and interact. For example, how do people in New York City differ from those in Silicon Valley? We...

Read more »

Error metrics for multi-class problems in R: beyond Accuracy and Kappa

July 6, 2012

The caret package for R provides a variety of error metrics for regression models and 2-class classification models, but only calculates Accuracy and Kappa for multi-class models.  Therefore, I wrote the following function to allow caret:::train t...

Read more »

RSAP, Rook and ERP

As I wrote in my blog Analytics with SAP and R (Windows version), we can use RSAP to connect to our ERP system and play with the data. This time I wanted, of course, to keep exploring the capabilities of RSAP, but using something else. As everybody kno...

Read more »

Fix Overplotting with Colored Contour Lines

July 6, 2012

I saw this plot in the supplement of a recent paper comparing microarray results to RNA-seq results. Nothing earth-shattering in the paper - you've probably seen a similar comparison many times before - but I liked how they solved the overplotting...

Read more »

Interest Differencing: Folk Commonly Followed by Tweeting MPs of Different Parties

July 6, 2012

Earlier this year I doodled a recipe for comparing the folk commonly followed by users of a couple of BBC programme hashtags (Social Media Interest Maps of Newsnight and BBCQT Twitterers). Prompted in part by a tweet from Michael Smethurst/@fantasticlife about generating an ESP map for UK politicians (something I’ve also doodled before – Sketching

Read more »

A practical introduction to garch modeling

July 6, 2012

We look at volatility clustering, and some aspects of modeling it with a univariate GARCH(1,1) model. Volatility clustering Volatility clustering — the phenomenon of there being periods of relative calm and periods of high volatility — is a seemingly universal attribute of market data.  There is no universally accepted explanation of it. GARCH (Generalized AutoRegressive … Continue reading...
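To make the idea of volatility clustering concrete, here is a small base-R sketch that simulates a GARCH(1,1) series. It is not the post's code: the function name `simulate_garch11` and the parameter values for `omega`, `alpha` and `beta` are illustrative assumptions, chosen only so that `alpha + beta < 1`.

```r
# Sketch: simulate returns from a GARCH(1,1) process.
# sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1]
# r[t]      = sqrt(sigma2[t]) * z[t],  z[t] ~ N(0, 1)
# All names and parameter values here are illustrative assumptions.
simulate_garch11 <- function(n, omega = 1e-5, alpha = 0.10, beta = 0.85,
                             seed = 1) {
  stopifnot(n >= 2, alpha + beta < 1)
  set.seed(seed)
  z <- rnorm(n)
  sigma2 <- numeric(n)
  r <- numeric(n)

  # Start at the unconditional variance of the process.
  sigma2[1] <- omega / (1 - alpha - beta)
  r[1] <- sqrt(sigma2[1]) * z[1]

  for (t in 2:n) {
    # Today's variance reacts to yesterday's squared return and variance.
    sigma2[t] <- omega + alpha * r[t - 1]^2 + beta * sigma2[t - 1]
    r[t] <- sqrt(sigma2[t]) * z[t]
  }
  data.frame(r = r, sigma2 = sigma2)
}

sim <- simulate_garch11(1000)
head(sim)
```

Plotting `sim$r` against time should show calm stretches interrupted by bursts of large moves, because a large squared return feeds into the next period's variance.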

Read more »

The R Journal Volume 4/1

July 6, 2012

The 'Summer edition' of the R Journal is out! Get it from here.

Read more »

automated cell phenotyping — R package “EBImage”

July 5, 2012

Counting cells under a microscope is always laborious and dull. Those in the field would be relieved by the assistance of a powerful image processing package, EBImage. Images are treated as “Image” objects, essentially multi-dimensional arrays. The class “Image” contains spatial information, pixel … Continue reading →

Read more »