The R-Podcast Episode 4: Data Structures-Introduction

March 25, 2012

In this episode: Site updates, additional screencasts about R from other sites, listener feedback, and discussion on the fundamental data structures for R: vectors, matrices, lists, and data frames. The R code discussed in this episode is available in our GitHub repository; see the show notes for details. Leave us a voicemail at +1-269-849-9780, or

Read more »

Popularity of Programming Languages

March 25, 2012

As you can see, R is relatively popular (but more so on StackOverflow than GitHub): For the original graph, click here. This scatter plot is a reminder that R is useful to learn not only for statistical modeling (since there are so many excellent packag...

Read more »

Canonical Correlation Analysis for finding patterns in coupled fields

March 25, 2012

First CCA pattern of Sea Level Pressure (SLP) and Sea Surface Temperature (SST) monthly anomalies for the region between -180 °W to -70 °W and +30 °N to -30 °S. The following post demonstrates the use of Canonical Correlation Analysis (CCA) for diagnosing coupled patterns in climate fields....

Read more »

Classification Trees and Spatial Autocorrelation

March 25, 2012

I'm currently trying to model species presence / absence data (N = 523) that were collected over a geographic area and are possibly spatially autocorrelated. Samples come from preferential sites (sea level > 1200 m, obligatory presence of permanent ...

Read more »

VIDEO: Applying "MSC" math-treatment to our raw spectra in "R".

March 25, 2012

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

Read more »

Levenshtein distance in C++ and code profiling in R

March 25, 2012

At work, the client asked whether the existing search engine could treat singular and plural forms equally, e.g. “partner” and “partners” would lead to the same result. The first option is stemming: the search engine would use the root of a word, e.g. “partn”. However, stemming has many weaknesses: two different words might have the same root, a

Read more »

Disproportionality Data

March 25, 2012

So I was hunting around for some data on disproportional electoral outcomes (when the proportion of votes cast for political parties is not close to the proportion of legislative seats that they win). Michael Gallagher keeps an updated version of his L...

Read more »

Citations in markdown using knitr

March 24, 2012

I am finding myself more and more drawn to markdown rather than tex/Rnw as my standard format (not least because of the ease of displaying the files on GitHub, particularly now that we have automatic image uploading). One thing I miss from LaTeX is the citation commands. (I understand these can be provided to

Read more »

Initial release 0.1.0 of package RcppSMC

March 24, 2012

Hm, I realized that I announced this on Google+ (via Rcpp) as well as on Twitter, on the r-packages list, wrote a new and simple web page for it, but had not put it on my blog. So here is some catching up. Sequential Monte Carlo / Particle Filter is ...

Read more »

Custom Summary Stats as Dataframe or List

March 24, 2012

On Stackoverflow I found this useful example on how to apply custom statistics on a dataframe and return the results as list or dataframe: somedata <- data.frame( ...

Read more »

VIDEO: Looking to "NIR" Spectra in "R": (Import and organize)

March 24, 2012

Read more »

Linking apple liking to sensory

March 24, 2012

Previously it was seen that apple liking was related to consumers' scores for juiciness and sweetness. It would be nice if these scores could be linked to sensory scores. Thus a three block model would result: A block with sensory data describing how ...

Read more »

Video: R at Work and at Home

March 24, 2012

The following video was filmed at Melbourne R Users. The description of the talk from the meetup site: Eu Jin is a Senior Analyst with Deloitte Analytics in Melbourne. He has over four years' experience in data mining and statistical …

Read more »

R script to calculate QIC for Generalized Estimating Equation (GEE) Model Selection

March 23, 2012

Generalized Estimating Equations (GEE) can be used to analyze longitudinal count data; that is, repeated counts taken from the same subject or site. This is often referred to as repeated measures data, but longitudinal data often has more repeated observations. …

Read more »

Gini Efficient Frontier

March 23, 2012

David Varadi has recently written two posts about the Gini coefficient: I Dream of Gini, and Mean-Gini Optimization. I want to show how to use the Gini risk measure to construct an efficient frontier and compare it with the alternative risk measures I discussed previously. I will use the Gini mean difference risk measure – the mean of the difference

Read more »

Serious stats – free statistics resources

March 23, 2012

The companion web site for Serious Stats is now live: http://www.palgrave.com/psychology/baguley/
The web site includes:
- a free sample chapter (Chapter 15: Contrasts)
- data sets
- R scripts
- 5 online supplements (for meta-analysis, multiple imputation, r...

Read more »

Serious stats companion web site now live: sample chapter, data and R scripts

March 23, 2012

The companion web site for Serious stats is now live: http://www.palgrave.com/psychology/Baguley/ It includes a sample chapter (Chapter 15: Contrasts), data sets, R scripts for all the examples and supplementary material.

Read more »

Dissimilarity Between Soil Profiles: A Closer Look

March 23, 2012

Continuing the previous discussion of pair-wise dissimilarity between soil profiles, the following demonstration (code, comments, and figures) further elaborates on the method. A more in-depth discussion of this example will be included as a vignette w...

Read more »

Launching iButton Thermochrons with the help of R

March 23, 2012

Maxim's iButton Thermochron temperature dataloggers are little silver doo-dads the size of a large watch battery that can record up to 2048 time-stamped temperature values. The internal battery is usually good for a few years of use. Maxim supplies a J...

Read more »

R in Google Summer of Code 2012

March 23, 2012

This post is a slightly revised (and "blogified") version of the message Brian Peterson has sent to various R mailing lists. Once again, R has been accepted as a mentoring organization for the Google Summer of Code (2012). We invite students interested in this program to learn more about it. A good starting point...

Read more »

RStudio Development Environment

March 23, 2012

Compared to many other languages of equal popularity, there are relatively few development environments for R. In fact, the total number of production-ready R IDEs could probably be counted on one hand. That deficiency is a small price to pay to use R and if you’re not already accustomed to using IDEs for other ...

Read more »

R, Twitter and McDonald’s

March 23, 2012

Ed Chen is a data scientist at Twitter, so he's accustomed to working with big data and complex models. In an interview with MIT Technology Review, he describes his data science toolbox: A common pattern for me is that I'll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python...

Read more »

This graph shows that President Obama’s proposed budget treats the NIH even worse than G.W. Bush – Sign the petition to increase NIH funding!

March 23, 2012

The NIH provides financial support for a large percentage of biological and medical research in the United States. This funding supports a large number of US jobs, creates new knowledge, and improves healthcare for everyone. So I am signing this petiti...

Read more »

Low (and high) volatility strategy effects

March 23, 2012

Does minimum variance act differently from low volatility? Does either of them act like low beta? What about high volatility versus high beta? Inspiration: Falkenblog had a post investigating differences in results when using different strategies for low volatility investing. Here we look not at a single portfolio of a given strategy over time, but …

Read more »

Forecasts and ggplot

March 22, 2012

The forecast package uses the base R graphics for all plots, but some people may prefer to use the nice graphics available using the ggplot2 package. In the following two posts, Frank Davenport shows how it can be done: Plotting forecast() objects in ...

Read more »

Project Euler: Problem 20

March 22, 2012

n! means n x (n - 1) x ... x 3 x 2 x 1
For example, 10! = 10 x 9 x ... x 3 x 2 x 1 = 3628800,
and the sum of the digits in the number 10! is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27.
Find the sum of the di...
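
The 10! example in this excerpt can be checked with a few lines of R. This is only a minimal sketch, not the post's own solution: the function name `digit_sum_factorial` is hypothetical, and the `n <= 18` limit is an assumption made here so that n! stays exactly representable as a double.

```r
# Sum of the decimal digits of n!, for small n only.
# Assumption: n <= 18, so that every intermediate product is below 2^53
# and is therefore stored exactly in a double.
digit_sum_factorial <- function(n) {
  stopifnot(n >= 0, n <= 18)
  value <- prod(seq_len(n))                        # n! (prod of an empty vector is 1)
  digits <- strsplit(sprintf("%.0f", value), "")[[1]]  # decimal digits as characters
  sum(as.integer(digits))
}

digit_sum_factorial(10)  # 27, the value given in the worked example
```

Larger arguments would need arbitrary-precision arithmetic, which this sketch deliberately leaves out.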

Read more »

Do we appreciate sunbathing in Spring?

March 22, 2012

We are currently experiencing an extremely hot month in Montréal (and more generally in North America). Looking at people having a beer, and starting the first barbecue of the year, I was wondering: if we asked people if global warming was a good ...

Read more »