ABC in Roma [R lab #1]

February 29, 2012

Here is the R code for the R labs organised by Serena Arima to supplement my lectures. This is quite impressive and helpful to the students, as illustrated by the first example below (using the abc software). I am having a great time teaching this “ABC in Roma” course, in particular because of the...

Statistics project ideas for students

February 29, 2012

Here are a few ideas that might make for interesting student projects at all levels (from high-school to graduate school). I’d welcome ideas/suggestions/additions to the list as well. All of these ideas depend on free or scraped data, which means tha...

XYZ geographic data interpolation, part 2

February 29, 2012

Having recently received a comment on a post regarding geographic xyz data interpolation, I decided to return to my original "xyz.map" function and open it up for easier interpretation. This should make the method easier to adapt and follow. The above graph shows the distance to Mecca as interpolated from 1000 randomly generated lat/lon...

A minimum variance portfolio in 2011

February 29, 2012

2011 was a good vintage for minimum variance, at least among stocks in the S&P 500. Previously: the post “Realized efficient frontiers” included, of course, a minimum variance portfolio. That portfolio seemed interesting enough to explore some more. “What does ‘passive investing’ really mean” suggests that minimum variance should be considered a form of passive …

Custom Amazon EC2 config for Rstudio

February 29, 2012

Introduction: This post is a work in progress building on the previous post. It's my attempt to simultaneously learn Amazon's AWS tools and set up R and Rstudio Server on a customized "cloud" instance. I look forward to testing some R jobs that have la...

Expanding Visualization of published system edges (R)

February 28, 2012

I was looking over a revised text by a systems author I follow. I will be a bit vague about specifics, as the system itself is based on well-known ideas, but I'll leave the reader to research related systems. The basic message...

Parsing R code: Freedom of expression is not always a good idea

February 28, 2012

With my growing interest in R it was inevitable that I would end up writing a parser for it. The fact that the language is relatively small (the add-on packages do the serious work) hastened the event because it did not look like much work; famous last words. I knew about R’s design and implementation

Webinar tomorrow: Big-data statistics with Revolution R with IBM Netezza

February 28, 2012

As explained in detail by Michele Chambers at the IBM Netezza blog, there are two keys to getting fast performance with statistical analysis on massive data sets with R: (1) massive parallelization: break the problem down into small pieces, and run them in parallel; (2) bring the R engine to the data (not the other way around), to avoid data transfer...

PCA for NIR Spectra_part 006: "Mahalanobis"

February 28, 2012

Outliers have an important influence on the PCs; for this reason they must be detected and examined. We have just the spectra, without lab data, and we have to check whether any of the sample spectra is an outlier (a noisy spectrum, a sample which belongs ...

People voice about Lynas Malaysia through Twitter Analysis with R CloudStat

February 28, 2012

CloudStat Analysis: This is a Twitter analysis report for “Lynas” from 21 to 28 February 2012, generated by CloudStat Twitter Application. Lynas was a hot topic, espec...

useR! 2012 Early Registration Ending Tomorrow!

February 28, 2012

The early registration deadline for useR! 2012 is tomorrow! Visit the Online Registration Website. The fees for registration increase March 1st.

Apprentice Piece with Lattice Graphs

February 28, 2012

Lattice graphs can be quite tedious. I don't use them too often, and when I need them I usually have to dig through the archives for the parameter details. The example presented here may serve as a welcome template for the use of panel functions, panel o...

First Milano R net meeting to be held in May 2012

February 28, 2012

We are organising the first Milano R net meeting in May 2012. Please visit this page in a few days for news.

Adventures in R Studio Server: Apache2, Https, Security, and Amazon EC2.

February 28, 2012

I just put a fresh install of Ubuntu Server (10.04.4 LTS) on one of our machines.  As I was doing some post-install config, I accidentally installed Rstudio Server.  And subsequently fell down an exciting little rabbit-hole of server configur...

Exponential smoothing and regressors

February 28, 2012

I have thought quite a lot about including regressors (i.e. covariates) in exponential smoothing (ETS) models, and I have done it a couple of times in my published work. See my 2008 exponential smoothing book (chapter 9) and my 2008 Tourism Management paper. However, there are some theoretical issues with these approaches, which have come to light through the research of...

All Your Source Code Are Belong to… Nature?

February 28, 2012

The journal Nature recently published an interesting op-ed discussing the need to make source code available for scientific articles that require statistical computation to produce their results. http://www.nature.com/nature/journal/v482/n7386/ful...

Project Euler in R: Problem 26

February 27, 2012

I have been posting R solutions to Project Euler problems as a way of polishing my R skills. Here is the next problem in the series, problem 26. The problem is stated as follows: A unit fraction contains 1 in the numerator. The decimal representati...

how to read csv files into r

February 27, 2012

(This article was first published on twotorials by anthony damico, and kindly contributed to R-bloggers)

how to use a function and read the help files in r

February 27, 2012

(This article was first published on twotorials by anthony damico, and kindly contributed to R-bloggers)

PCA for NIR Spectra_part 005: "Reconstruction"

February 27, 2012

We saw how to plot the raw spectra (X), how to calculate the mean spectrum, and how to center the spectra (subtracting the mean spectrum from every spectrum of the original matrix X). After that we developed the PCAs with the NIPALS algorithm, getting...

R integrated throughout the enterprise analytics stack

February 27, 2012

The past couple of years have seen a dramatic growth in the use of the R language in the enterprise. R has always been pervasive in academia for research and teaching in statistics and data science, and as new graduates trained in R have migrated to the workplace the demand for R in corporations has become more and more...

RHadoop updated: improved performance and more control

February 27, 2012

Revolution Analytics' open-source RHadoop project, which provides integration between R and Hadoop, has been updated with the release of version 1.2 of the "rmr" package. New in this version: support for binary I/O formats, which improves on the text-only interface by allowing use of faster and more space-efficient data formats like R's native serialization format. This version also improves...

Revolution Analytics at Strata 2012

February 27, 2012

One of my favourite conferences, Strata: Making Data Work, starts tomorrow in Santa Clara, CA. Revolution Analytics is a proud sponsor, and I'll be there with the team to listen to some great talks and to meet other R users at our booth in the exhibition hall. There will be several R-related talks and tutorials during the conference, including...

Subsetting made easy

February 27, 2012

Calculating characteristics such as the median or mean of a subset of data is quite straightforward in R. For a data set containing results from several “models”, a subset for the model “base” is created by […]. Then, the median of the variable …

Testing the Effect of a Factor within each Level of another Factor with R-Package {contrast}

February 27, 2012

This is a small example of how custom contrasts can easily be applied with the contrast package. The package manual has several useful explanations, and the example below was actually grabbed from there. This example can also be applied to a GLM but I ch...

Realized efficient frontiers

February 27, 2012

A look at the distortion from predicted to realized. The idea: The efficient frontier is a mainstay of academic quant. I’ve made fun of it before. This post explores the efficient frontier in a slightly less snarky fashion. Data: The universe is 474 stocks in the S&P 500. The predictions are made using data from …

Show me the data! Or how to digitize plots

February 27, 2012

I had mentioned the Guardian's data blog and the need for more data journalism earlier here. What I really like about the Guardian's approach in particular is that they share the data of their articles and encourage readers to use it. Of course there ar...

The R-Podcast Episode 2: Getting Ready to Use R

February 26, 2012

In this episode: A couple of site updates, our first listener feedback, an overview of installing R on each major platform, and an overview of R IDEs and helpful resources for getting started with R. If you would like to provide feedback, please send an email or audio comment to [email protected] or leave us a

Portfolio Optimization – Why do we need a Risk Model

February 26, 2012

In the last post, Multiple Factor Model – Building Risk Model, I showed how to build a multiple factor risk model. In this post I want to explain why we need a risk model and how it is used during the portfolio construction process. The covariance matrix is used during the mean-variance portfolio optimization...
