RStudio Development Environment

March 23, 2012

Compared to many other languages of equal popularity, there are relatively few development environments for R. In fact, the total number of production-ready R IDEs could probably be counted on one hand. That deficiency is a small price to pay to use R, and if you’re not already accustomed to using IDEs for other...

Read more »

R, Twitter and McDonald’s

March 23, 2012

Ed Chen is a data scientist at Twitter, so he's accustomed to working with big data and complex models. In an interview with MIT Technology Review, he describes his data science toolbox: A common pattern for me is that I'll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python...

Read more »

This graph shows that President Obama’s proposed budget treats the NIH even worse than G.W. Bush – Sign the petition to increase NIH funding!

March 23, 2012

The NIH provides financial support for a large percentage of biological and medical research in the United States. This funding supports a large number of US jobs, creates new knowledge, and improves healthcare for everyone. So I am signing this petiti...

Read more »

Low (and high) volatility strategy effects

March 23, 2012

Does minimum variance act differently from low volatility? Does either of them act like low beta? What about high volatility versus high beta? Inspiration: Falkenblog had a post investigating differences in results when using different strategies for low volatility investing. Here we look not at a single portfolio of a given strategy over time, but …

Read more »

Forecasts and ggplot

March 22, 2012

The forecast package uses the base R graphics for all plots, but some people may prefer to use the nice graphics available using the ggplot2 package. In the following two posts, Frank Davenport shows how it can be done: Plotting forecast() objects in ...

Read more »

Project Euler: Problem 20

March 22, 2012

n! means n x (n - 1) x ... x 3 x 2 x 1. For example, 10! = 10 x 9 x ... x 3 x 2 x 1 = 3628800, and the sum of the digits in the number 10! is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27. Find the sum of the di...

Read more »
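
The digit-sum calculation in the excerpt can be sketched in base R. Factorials quickly exceed what a double can represent exactly, so this sketch stores the number as a vector of decimal digits and multiplies it schoolbook-style. The helper names multiply_digits and factorial_digit_sum are hypothetical, and the only input tried here is the 10! example quoted above.

```r
# Multiply a big number, stored as a vector of decimal digits
# (least significant digit first), by a small positive integer k.
multiply_digits <- function(digits, k) {
  carry <- 0
  for (i in seq_along(digits)) {
    p <- digits[i] * k + carry
    digits[i] <- p %% 10
    carry <- p %/% 10
  }
  # Append any remaining carry, one digit at a time.
  while (carry > 0) {
    digits <- c(digits, carry %% 10)
    carry <- carry %/% 10
  }
  digits
}

# Sum of the decimal digits of n! (hypothetical helper name).
factorial_digit_sum <- function(n) {
  digits <- 1
  for (k in seq_len(n)) {
    digits <- multiply_digits(digits, k)
  }
  sum(digits)
}

factorial_digit_sum(10)  # 27, matching the worked example in the excerpt
```

Keeping the digits in a plain numeric vector avoids any dependency on an arbitrary-precision package, at the cost of speed for very large n.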

Do we appreciate sunbathing in Spring ?

March 22, 2012

We are currently experiencing an extremely hot month in Montréal (and more generally in North America). Looking at people having a beer, and starting the first barbecue of the year, I was wondering: if we asked people if global warming was a good ...

Read more »

Using Ggplot2 to plot last.fm top 100 albums

March 22, 2012

I found out that last.fm had made data files available for their Best of 2011 artist list, and I thought it’d be a great opportunity to learn some more about data management in R and Ggplot2. I began by downloading and importing the tab-separated (TSV) data file from last.fm:

    # read data
    lastfm <- read.delim("~/Downloads/bestof_2011_tsv/bestof_2011_releases.tsv")

Then I did some data cleanup, because...

Read more »

Tracking SFO Airport’s Performance Using R, HANA and D3

March 22, 2012

This is my first introduction to D3 and I am simply blown away. Mike Bostock (@mbostock), you are a genius and thanks for creating D3! With HANA, R, D3, HTML5 and iPad, you've got yourself a KILLER combo! I have been burning the midnight oil o...

Read more »

Comparing Banzhaf and Shapley-Shubik power indices

March 22, 2012

Last week I analyzed the Shapley-Shubik power index in R. I got several requests to write code calculating the Banzhaf power index. Here is the proposed code. Again I use data from the Warsaw School of Economics rector elections (the details are in my last post)....

Read more »
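
As a rough illustration of what a Banzhaf calculation involves, here is a minimal base R sketch for a weighted voting game. It is not the post's code: the function name banzhaf_index, the weights, and the quota are all made up for illustration, and the election data mentioned in the excerpt is not used. The sketch enumerates every coalition, so it is only practical for a small number of voters.

```r
# Normalized Banzhaf index for a weighted voting game (hypothetical helper).
# weights: voting weight of each player; quota: total needed to win.
banzhaf_index <- function(weights, quota) {
  n <- length(weights)
  swings <- numeric(n)
  # Enumerate every non-empty coalition as a number from 1 to 2^n - 1,
  # reading its binary digits as membership flags.
  for (mask in seq_len(2^n - 1)) {
    members <- (mask %/% 2^(seq_len(n) - 1)) %% 2 == 1
    total <- sum(weights[members])
    if (total >= quota) {
      # A member is critical if the coalition would lose without it.
      critical <- members & (total - weights < quota)
      swings <- swings + critical
    }
  }
  swings / sum(swings)
}

# Made-up example: three voters with weights 2, 1, 1 and a quota of 3.
banzhaf_index(c(2, 1, 1), quota = 3)  # 0.6 0.2 0.2
```

In this toy game the first voter is critical in three winning coalitions and each of the others in one, which gives the 3/5, 1/5, 1/5 split.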

Exponentiation of a matrix (including pseudoinverse)

March 22, 2012

The following function "exp.mat" allows for the exponentiation of a matrix (i.e. calculation of a matrix to a given power). The function follows three steps:
1) Singular Value Decomposition (SVD) of the matrix
2) Exponentiation of the singular values
3) Re-calculation of the matrix with the new singular values
The most common case where the method is applied...

Read more »
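
The three steps listed in the excerpt can be sketched as follows. This is not the post's exp.mat function; mat_power_svd, its arguments, and the tolerance rule are assumptions made for illustration. The sketch assumes a symmetric positive semi-definite matrix when computing general powers; with p = -1 it returns the Moore-Penrose pseudoinverse, which is defined for any matrix.

```r
# Raise a matrix to the power p via its SVD (hypothetical helper).
# Assumes M is symmetric positive semi-definite for general p;
# p = -1 gives the Moore-Penrose pseudoinverse of any matrix.
mat_power_svd <- function(M, p, tol = 1e-10) {
  # Step 1: singular value decomposition, M = U D V'
  s <- svd(M)
  # Drop singular values that are numerically zero (assumed tolerance rule).
  keep <- s$d > tol * max(s$d)
  # Step 2: exponentiate the remaining singular values.
  d_p <- s$d[keep]^p
  # Step 3: rebuild the matrix as V D^p U'.
  s$v[, keep, drop = FALSE] %*%
    diag(d_p, nrow = length(d_p)) %*%
    t(s$u[, keep, drop = FALSE])
}

M <- matrix(c(2, 1, 1, 2), nrow = 2)
mat_power_svd(M, 2)    # compare with M %*% M
mat_power_svd(M, -1)   # compare with solve(M)
```

Dropping near-zero singular values before exponentiating is what keeps negative powers from blowing up on rank-deficient input.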

Pareto Charts in R

March 22, 2012

A Pareto Chart is a sorted bar chart that displays the frequency (or count) of occurrences that fall in different categories, from greatest frequency on the left to least frequency on the

Read more »
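
A minimal base R sketch of the sorted bar chart described in the excerpt. The defect categories and counts are made-up illustration data, and the cumulative-percentage line is a common addition to Pareto charts that the excerpt itself does not mention.

```r
# Hypothetical defect counts by category (made-up illustration data).
defects <- c(Scratch = 12, Dent = 30, Misprint = 5, Crack = 18)

# Sort from greatest to least frequency, as a Pareto chart requires.
sorted <- sort(defects, decreasing = TRUE)
cum_pct <- cumsum(sorted) / sum(sorted) * 100

# Bars show counts; barplot() returns the bar midpoints on the x axis.
mids <- barplot(sorted, ylim = c(0, sum(sorted)), ylab = "Count",
                main = "Pareto chart (illustrative data)")

# Overlay the cumulative percentage, rescaled to the count axis.
lines(mids, cum_pct / 100 * sum(sorted), type = "b", pch = 19)
axis(4, at = seq(0, 1, 0.25) * sum(sorted),
     labels = paste0(seq(0, 100, 25), "%"))
```

Setting the y-axis limit to the total count lets the bars and the cumulative line share one scale, with the right-hand axis relabelled in percent.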

Example 9.24: Changing the parameterization for categorical predictors

March 22, 2012

In our book, we discuss the important question of how to assign different parameterizations to categorical variables when fitting models (section 3.1.3). We show code in R for use in the lm() function, as follows: lm(y ~ x, contrasts=list(x,"contr.trea...

Read more »
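
To make the idea of changing the parameterization concrete, here is a small sketch on simulated data. It is not the book's example: the data are made up, and the sketch uses the named-list form list(x = "...") for the contrasts argument of lm(), comparing treatment (reference-level) coding with sum-to-zero coding.

```r
# Simulated data: a three-level factor and a response (illustration only).
set.seed(1)
x <- factor(rep(c("a", "b", "c"), each = 10))
y <- rnorm(30, mean = c(1, 2, 4)[as.integer(x)])

# Treatment coding: the intercept is the mean of the reference level "a".
fit_treatment <- lm(y ~ x, contrasts = list(x = "contr.treatment"))

# Sum-to-zero coding: with balanced groups, the intercept is the
# average of the group means.
fit_sum <- lm(y ~ x, contrasts = list(x = "contr.sum"))

coef(fit_treatment)
coef(fit_sum)
```

Both fits give identical fitted values; only the meaning of the individual coefficients changes with the coding.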

as.character() for rownames()

March 22, 2012

Rainer pointed out, in response to my post, Row names in data frames: Beware of 1:nrow, that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen. If you type rownames(x) you see the same result as rownames(z), and is.character(rownames(x)) and is.character(rownames(z)) both return TRUE, but

Read more »

Montreal R Workshop: Introduction to Bayesian Methods

March 22, 2012

Monday, March 26, 2012, 14h-16h, Stewart Biology N4/17. Corey Chivers, Department of Biology, McGill University. This is a meetup of the Montreal R User Group. Be sure to join the group and RSVP. More information about the workshop here. Topics: Why would we want to be Bayesian in the first place? In this workshop we

Read more »

Plotting forecast() objects in ggplot part 2: Visualize Observations, Fits, and Forecasts

March 21, 2012

In my last post I presented a function for extracting data from a forecast() object and formatting the data so that it can be plotted in ggplot.  The scenario is that you are fitting a model to a time series object with training data, then forecas...

Read more »

Metadata Dubuque and UHI

March 21, 2012

I’m in the process of remaking all the metadata from scratch and looking once again at the question of UHI. There are not any global conclusions we can draw from the data yet; I’m just in the process of checking out everything that is available that could be used to illuminate the problem. The problem,

Read more »

How to Generate Exponential Delays

March 21, 2012

This question arose while addressing comments on a previous blog post about exponentially distributed delays. One of my ongoing complaints is that many, if not most, popular load-test generation tools do not provide exponential variates as part of a library of time delays or think-time distributions. Not only is this situation bizarre, given that all load tests...

Read more »
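
One standard way to produce exponential delays when only a uniform random number generator is available is inverse-transform sampling. The sketch below is a generic illustration, not necessarily the method the post goes on to describe; exp_delay and mean_delay are hypothetical names, and the mean of 2.5 is an arbitrary example value.

```r
# Inverse-transform sampling: if U is uniform on (0, 1),
# then -mean_delay * log(U) is exponential with that mean.
exp_delay <- function(n, mean_delay) {
  u <- runif(n)
  -mean_delay * log(u)
}

set.seed(42)
delays <- exp_delay(10000, mean_delay = 2.5)
mean(delays)  # should be close to 2.5
```

R users can get the same distribution directly with rexp(n, rate = 1 / mean_delay); the explicit formula is mainly useful for tools that expose only uniform variates.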

R/Finance 2012 program announced, registration open

March 21, 2012

Registration is now open for R/Finance 2012 in Chicago, the conference devoted to applications of R in the financial sector. The program has also been announced, with topics including: modelling insurance claim reserves; risk management in power markets; peer performance of hedge funds; hedging event risk; operational risk measurement with R and RevoScaleR; and many other applications of R....

Read more »

Row names in data frames: beware of 1:nrow

March 21, 2012

I spent some time puzzling over row names in data frames in R this morning. It seems that if you make the row names for a data frame, x, as 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind. Here’s an illustration: As

Read more »
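
The behaviour described in the excerpt can be explored with a short sketch. The data frames below are made up and are not the post's own illustration. The sketch compares integer row names assigned as 1:nrow(x) with the same names assigned as character strings, then looks at what rbind does in each case.

```r
# Two small data frames with the same visible row names (made-up data).
x <- data.frame(a = 1:3)
rownames(x) <- 1:nrow(x)               # integer row names

z <- data.frame(a = 1:3)
rownames(z) <- as.character(1:nrow(z)) # character row names

# Both report the same character row names when queried.
identical(rownames(x), rownames(z))

# With integer 1:nrow row names, rbind renumbers the rows 1 to 6,
# as if no row names had been assigned.
rownames(rbind(x, x))

# With character row names, rbind keeps them and has to make the
# duplicates unique instead of renumbering.
rownames(rbind(z, z))
```

Running the last two lines side by side is the quickest way to see whether a given data frame carries "real" row names or automatic ones.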

knitr, github, and a new phase for the lab notebook

March 21, 2012

I have recently modified the basic workflow of my lab notebook since discovering knitr. Before, I would write code files which I could track on github, push figures created by the code to flickr, and then write a notebook entry on wordpress describing what I was doing. I’d embed each figure I wanted into the

Read more »

Bare-bones intro to Plotting options in R

March 21, 2012

If you’re using base::plot in R for the first time (for example if you do plot(pima) or plot(faithful) (use ??pima if you can’t find the dataset)) you may have looked at ?plot (2 page help file) or ?par (12 page help file) to figure out what’s go...

Read more »

Nordic Countries Dominate the World in Internet Penetration

March 21, 2012

Something about that cold weather... The number of internet users in the Nordic countries has greatly outpaced the rest of the world. Denmark, Iceland, Norway, Sweden, Finland - all in the elite echelon. These countries share a common ancestry -...

Read more »

When Big Data matters

March 21, 2012

Big can be a qualitative as well as a quantitative difference. The gas in the ill-fated Hindenburg airship, the gas that formed our Sun, and the gas that formed the Milky Way galaxy were just lumps of hydrogen atoms (with varying impurities). The difference was in the number of atoms. But that difference in numbers made the three structures...

Read more »

Simple ROC plots with ggplot2 – Part 2

March 21, 2012

In the first part of this article we built a function (rocdata) to calculate the co-ordinates for the ROC plot and its summary statistics. Now we need to actually produce the plot. I make most of my plots in ggplot2 because of its versatility. However, there’s no reason why these plots couldn’t be produced

Read more »
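
The kind of coordinate calculation the excerpt refers to can be sketched in base R. This is not the post's rocdata function: roc_points and its arguments are hypothetical names, the scores and labels are made up, labels are assumed to be coded 1 (positive) and 0 (negative), and scores are assumed to contain no ties. The post plots with ggplot2; base plot() is used here only to keep the sketch self-contained.

```r
# ROC coordinates and AUC from scores and 0/1 labels (hypothetical helper).
# Assumes no tied scores and at least one positive and one negative label.
roc_points <- function(score, label) {
  # Sweep the threshold from the highest score downwards.
  ord <- order(score, decreasing = TRUE)
  label <- label[ord]
  tpr <- c(0, cumsum(label == 1) / sum(label == 1))
  fpr <- c(0, cumsum(label == 0) / sum(label == 0))
  # Trapezoidal rule for the area under the curve.
  auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
  list(fpr = fpr, tpr = tpr, auc = auc)
}

# Made-up example: four scored cases, two positive and two negative.
r <- roc_points(score = c(0.9, 0.8, 0.7, 0.6), label = c(1, 0, 1, 0))
r$auc  # 0.75

plot(r$fpr, r$tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # chance line for reference
```

The returned fpr and tpr vectors could equally be placed in a data frame and handed to ggplot2.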

Using R for a salary negotiation–an extension of decision tree models

March 21, 2012

Let’s say you are in the middle of a salary negotiation, and you want to know whether you should be aggressive in your offering or conservative. One way to help with the decision is to make a decision tree. We’ll work with the following assumptions: You are at a job currently making $50k. You have the choices between...

Read more »
