Oracle R Distribution and Open Source R

March 13, 2012

Oracle provides the Oracle R Distribution, an Oracle-supported distribution of open source R. Support for Oracle R Distribution is provided to customers of the Oracle Advanced Analytics option and the Oracle Big Data Appliance. The Oracle R Distribu...

R code for Chapter 2 of Non-Life Insurance Pricing with GLM

March 13, 2012

We continue working our way through the examples, case studies, and exercises of what is affectionately known here as “the two bears book” (Swedish björn = bear) and more formally as Non-Life Insurance Pricing with Generalized Linear Models by Esbjörn Ohlsson and Björn Johansson (Amazon UK | US). At...

In case you missed it: February Roundup

March 13, 2012

In case you missed them, here are some articles from February of particular interest to R users. February 29 marked the 12th anniversary of the release of R 1.0.0, and the release of R 2.14.2. A list of commercial vendors who have integrated R with their products for data, analysis, and presentation. The rmr package (part of the RHadoop...

Data Science – learn the lessons of software

March 13, 2012

We're starting to see a deluge of companies whose businesses are all about making data analysis/science/insight "easy for the non-expert". We've been here before, quite a few times sadly. When I started writing software 12 years ago, there was...

Example 9.23: Demonstrating proportional hazards

March 13, 2012

A colleague recently asked after a slide suitable for explaining proportional hazards. In particular, she was concerned that her audience not focus on the time to event or probability of the event. An initial thought was to display the cumulative haz...

Anthromes and UHI

March 12, 2012

With BerkeleyEarth 1.6 posted to CRAN, I figured it was time to do some sample programs to explain how the package works and integrates with other packages. Also, I have some issues to check out with the metadata, and in the long run I want to reformulate my metadata package to include some new resources.

how to read spss, stata, and sas files into r

March 12, 2012

(This article was first published on twotorials by anthony damico, and kindly contributed to R-bloggers)

New R User Group in Milan

March 12, 2012

Italy now has three local R user groups, thanks to the recent formation of MilanoR in the city of Milan. Founded by R consulting company Quantide's Andrea Spanò, R-core member and University of Milan professor Stefano Iacus, and quantitative consultant Daniele Amberti, MilanoR will be a forum to "exchange knowledge, learn and share tricks and techniques, and provide R...

NIT: Fatty acids study in R – Part 006

March 12, 2012

In one of the columns, for constituent C16_0, one sample (57) has a value of “zero” (we could see this in the histogram). The reason is that the laboratory did not supply this value. The PLS regression will consider the lab value as zero, s...

Kony 2012: How to weaken your argument with charts

March 12, 2012

If the goal of the Invisible Children campaign, which has received millions of dollars of contributions since the Kony 2012 video went viral, is to convince us that the money is being put to humanitarian efforts, they could do a lot better than this chart: Putting 37% of expenses into programs in Africa is a decent result -- many...

R index between two products is somewhat dependent on other products

March 12, 2012

I explained earlier how the R-index is used in sensory analysis to examine ranking data. The justification for using the R-index lies in its link with d' and with the Mann-Whitney statistic. In this post I show there is a dependence on the number of products and p...

That’s Not How the “Law of Large Numbers” Works

March 12, 2012

Breaking my dissertation- and administrata-induced silence for a small rant combining two of my favorite things – Apple Computer Inc, and simulation. Recently, the New York Times featured the article ‘Apple Confronts the Law of Large Numbers’. The fundamental assertion? That the earnings growth and stock price of Apple cannot continue their rapid rise.

Japan Trade by Geographic Region

March 12, 2012

To further the analysis presented in Japanese Trade and the Yen, I thought I would take the more granular data provided by the Japanese Ministry of Finance on trade by geographic region.  Of course, I will use R to read, analyze, and plot the .csv...

A Julia version of the multinomial sampler

March 12, 2012

In the previous post on RcppEigen I described an example of sampling from a collection of multinomial distributions represented by a matrix of probabilities. In the timing example the matrix was 100000 by 5 with each of the 100000 rows summing...
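As a rough illustration of the sampling task described in this excerpt, here is a minimal R sketch that draws one category per row of a probability matrix by inverting each row's cumulative sum. It is not the code from the post or from RcppEigen: the function name sample_rows and the small matrix P are made up for the example, and it assumes a matrix with at least two columns whose rows each sum to 1.

```r
# Hypothetical helper: one multinomial draw (size 1) per row of P.
# Assumes P has at least two columns and every row sums to 1.
sample_rows <- function(P) {
  u <- runif(nrow(P))                  # one uniform draw per row
  cum <- t(apply(P, 1, cumsum))        # row-wise cumulative probabilities
  idx <- rowSums(u > cum) + 1          # first column whose cumulative sum reaches u
  pmin(idx, ncol(P))                   # guard against rounding in the last cumulative sum
}

# Made-up example: three distributions over five categories
P <- matrix(c(0.2, 0.2, 0.2, 0.2, 0.2,
              0.5, 0.1, 0.1, 0.1, 0.2,
              0.0, 0.0, 1.0, 0.0, 0.0),
            nrow = 3, byrow = TRUE)
draws <- sample_rows(P)
print(draws)
```

The comparison `u > cum` relies on R recycling the vector u down each column of the matrix, so row i of cum is always compared with u[i].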

Basic Introduction to ggplot2

March 12, 2012

This is a very basic introduction to the ggplot2 package. A much more detailed description of the package can be found in this book ggplot2: Elegant Graphics for Data Analysis. On his website (http://had.co.nz/ggplot2/) package author Hadley Wickham describes ggplot2 as a plotting system for R, based on the grammar of graphics, which tries to take...

The R-Podcast Screencast 1: Basic Interaction with R

March 12, 2012

Here is the inaugural R-Podcast Screencast: Basic Interaction with R. This screencast contains audio from episode 3 of the R-Podcast. In this screencast I demonstrate how to create a vector of numerical data, calculate means, install and load packages, and get help for a function. You can find the R code demonstrated in this episode
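For readers who want to try those steps before watching, a minimal sketch in R might look like the following. The vector values and the package name are invented for illustration, and this is not the code from the episode.

```r
# Create a numeric vector (values are made up for this example)
x <- c(2.5, 3.1, 4.7, 5.0)

# Arithmetic mean of the vector
m <- mean(x)
print(m)

# Installing a package needs network access, so it is left commented out;
# "ggplot2" is only an example package name
# install.packages("ggplot2")

# Load a package that ships with R
library(stats)

# Open the help page for a function (only in an interactive session)
if (interactive()) help("mean")
```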

XYZ geographic data interpolation, part 3

March 12, 2012

This will probably be a final posting on interpolation of xyz data, as I believe I have come to some conclusions to my original issues. I show three methods of xyz interpolation:
1. The quick and dirty method of interpolating projected xyz points (bi-linear)
2. Interpolation using Cartesian coordinates (bi-linear)
3. Interpolation using spherical coordinates and...

useR! 2012 Abstract Submission Deadline Today!

March 12, 2012

useR! 2012 is just around the corner. The deadline for talk and poster abstract submissions is today! Submit your abstract here.

The quality of variance matrix estimation

March 12, 2012

A bit of testing of the estimation of the variance matrix for S&P 500 stocks in 2011. Previously: there was a plot in “Realized efficient frontiers” showing the realized volatility in 2011 versus a prediction of volatility at the beginning of the year for a set of random portfolios. A reader commented to me privately …

Compiling government positions from the Manifesto Project data with R

March 12, 2012

The Manifesto Project (formerly the Manifesto Research Group, Comparative Manifestos Project) has assembled a database of ‘quantitative content analyses of parties’ election programs from more than 50 countries covering all free, democratic elections since 1945’, which is freely accessible online. The …

Change in life expectancy animated with geo charts

March 12, 2012

The data of the World Bank is absolutely amazing. I have said this before, but their updated iPhone App gives me a reason to return to this topic. Version 3 of the DataFinder App allows you to visualise the data on your phone, including motion maps, see...

Generating a lag/lead variables

March 11, 2012

A few days ago, a friend asked me whether there is any function in R that generates lag/lead variables in a data.frame, or does something similar to _n in Stata. He would like to use it to clean up his dataset in R. From the Stata help manual: _n contains the number of the current observation. Here’s an
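One way to get lag and lead columns in base R, sketched here under simple assumptions, is to shift a vector and pad the gap with NA. This is not the code from the post: the helper shift_vec and the toy data frame df are hypothetical names used only for illustration, and the rows are assumed to be already sorted in the order that matters.

```r
# Hypothetical helper: shift x by k positions, padding with NA.
# k > 0 gives a lag (earlier values), k < 0 gives a lead (later values).
shift_vec <- function(x, k = 1L) {
  n <- length(x)
  if (k == 0L) return(x)
  pad <- rep(NA, min(abs(k), n))
  if (k > 0L) {
    c(pad, head(x, max(n - k, 0L)))
  } else {
    c(tail(x, max(n + k, 0L)), pad)
  }
}

# Toy data, invented for this example
df <- data.frame(id = 1:5, x = c(10, 20, 30, 40, 50))

df$obs    <- seq_len(nrow(df))      # observation number, similar in spirit to Stata's _n
df$x_lag  <- shift_vec(df$x, 1L)    # value from the previous row
df$x_lead <- shift_vec(df$x, -1L)   # value from the next row
print(df)
```

For grouped data the same helper could be applied within each group, for example through ave(), but that is left out of this sketch.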

The R-Podcast Episode 3: Basic Interaction with R

March 11, 2012

In this episode: New versions of R and ggplot2 available, listener feedback, and an interactive session with R. The R code discussed in this episode will be available in our GitHub repository, see the show notes for details. There will be a companion screencast to accompany this episode which will be posted shortly. As always,

Hindi/Devanagari presentations using orgmode, R, latex and beamer

March 11, 2012

I recently had to prepare a beamer presentation in Hindi/Devanagari. I usually use emacs-orgmode with a lot of R source code embedded in it to prepare my beamer presentations. To adapt the entire setup to work with Devanagari, this is what I needed to do: Make orgmode export to LaTeX using xetex rather than

IS vs. self-normalised IS

March 11, 2012

I was grading my Master projects this morning and came upon this graph: which compares the variability of an importance-sampling estimator versus its self-normalised alternative… This is an interesting case in that self-normalisation does considerably degrade the quality of the approximation in that setting. In other cases, self-normalisation may bring a clear improvement. (This reminded
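To make the comparison concrete, here is a minimal R sketch of how the variability of the two estimators can be compared by simulation. It is not the setting from the graph mentioned in the excerpt: the target N(0, 1), the proposal N(0, 2^2), the integrand h(x) = x^2, and the function name is_estimates are all assumptions chosen for the example, and which estimator comes out ahead depends on those choices.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# One replication: both estimators computed from the same n proposal draws
is_estimates <- function(n) {
  x <- rnorm(n, mean = 0, sd = 2)              # draws from the proposal q = N(0, 2^2)
  w <- dnorm(x) / dnorm(x, mean = 0, sd = 2)   # importance weights p(x) / q(x), target p = N(0, 1)
  h <- x^2                                     # integrand; E_p[X^2] = 1
  c(plain = mean(w * h),                       # standard importance-sampling estimator
    self_norm = sum(w * h) / sum(w))           # self-normalised estimator
}

reps <- replicate(500, is_estimates(1000))     # 2 x 500 matrix of estimates
print(apply(reps, 1, sd))                      # spread of each estimator across replications
```

Both estimators use the same draws within each replication, so any difference in spread comes from the normalisation step alone.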

Plotting stuff on an image

March 11, 2012

Recently, I needed to figure out how many extension cords I was going to need to buy in order to reach parts of my field site. Wandering around in the field with a surveyor's tape was an option, but so was plotting distances on an aerial image I had of...

Interactive function for distances in plots

March 11, 2012

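The following R function returns the distance between two points located on a plot. The distance is returned in the same units as the plot's axes.

interDist <- function() {
    aa <- locator(2)    # assumption: the right-hand sides are missing from this excerpt; locator(2) reads two clicked points
    dx <- diff(aa$x)    # assumed: horizontal separation of the two points
    dy <- diff(aa$y)    # assumed: vertical separation of the two points
    sqrt(sum(c(dx^2, dy^2)))
}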

A Julia version of the multinomial sampler

March 11, 2012

In the previous post on RcppEigen I described an example of sampling from a collection of multinomial distributions represented by a matrix of probabilities. In the timing example the matrix was 100000 by 5 with each of the 100000 rows summing...
