ThinkStats … in R :: Example/Chapter 2 :: Example 2.1-2.3

March 14, 2012
By

As promised, this post is a bit more graphical, but I feel the need to stress the importance of the first few points in chapter 2 of the book (i.e. the difference between mean and average and why variance is meaningful). These are fundamental concepts for future work. The “pumpkin” example (2.1) gives us an

Read more »

Simple plots reveal interesting artifacts

March 14, 2012
By

I’ve recently been working with methylation data; specifically, from the Illumina Infinium HumanMethylation450 bead chip. It’s a rather complex array which uses two types of probes to determine the methylation state of DNA at ~ 485 000 sites in the genome. The Bioconductor project has risen to the challenge with a (somewhat bewildering) variety of

Read more »

More Anthromes !

March 13, 2012
By

First off, let me thank folks for all the comments and suggestions. I’m just starting to explore this data, so perhaps I should explain how I go about doing this. To begin, I am looking for a global bias in the record from UHI. It is well known that you can look through the data

Read more »

Japan Trade More Specifically with Korea

March 13, 2012
By

Macro analysis of Japanese trade in posts Japanese Trade and the Yen and Japan Trade by Geographic Region revealed some very interesting changes.  Since the Korean Won is so undervalued versus the Japanese Yen on a Purchasing Power Parity (PPP) ba...

Read more »

Plotting forecast() objects in ggplot part 1: Extracting the Data

March 13, 2012
By

Lately I've been using Rob J Hyndman's excellent forecast package. The package comes with some built in plotting functions but I found I wanted to customize and make my own plots in ggplot. In order to do that, I need a generalizable function that will...

Read more »

SNA with R workshop at Sunbelt XXXII in Redondo Beach

March 13, 2012
By

I am currently in Redondo Beach, CA at the Sunbelt XXXII social networks conference. The program is packed with interesting talks, so the event promises to be a good one. This morning I gave the workshop “Introduction to Social Network Analysis with R”. Over 50 people registered. I am grateful to all the

Read more »

Scatter Plot Matrix in R

March 13, 2012
By

Stata has a large number of graphics capabilities (and I highly recommend Stata over other statistical packages for a variety of reasons), but in a few instances R is more useful. In particular, I find R useful for creating beautiful scatter plot ...

Read more »

Shapley-Shubik Power Index in R

March 13, 2012
By

This spring we have Rector Elections at Warsaw School of Economics. One of my colleagues, Tomasz Szapiro, agreed to stand in the elections. This prompted me to write a Shapley-Shubik Power Index calculation snippet in R. Rector elections in Warsaw School...
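
The author's snippet is not shown in this excerpt. As a rough illustration of what a Shapley-Shubik calculation involves, here is a minimal base-R sketch for a weighted voting game. It enumerates every ordering of the voters and counts how often each one is the pivot (the voter whose weight first brings the running total up to the quota). The function names `permutations` and `shapley_shubik`, the weights, and the quota are all assumptions made for this example, and brute-force enumeration is only practical for a handful of voters.

```r
# All permutations of a vector, built recursively (base R only).
# Hypothetical helper, not taken from the original post.
permutations <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (p in permutations(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], p)
    }
  }
  out
}

# Shapley-Shubik power index for a weighted voting game.
# weights: non-negative voting weight of each player
# quota:   total weight needed for a coalition to win
shapley_shubik <- function(weights, quota) {
  n <- length(weights)
  pivots <- numeric(n)
  perms <- permutations(seq_len(n))
  for (p in perms) {
    running <- cumsum(weights[p])            # cumulative weight in this ordering
    pivot <- p[which(running >= quota)[1]]   # first player to reach the quota
    if (!is.na(pivot)) pivots[pivot] <- pivots[pivot] + 1
  }
  pivots / length(perms)                     # share of orderings in which each player is pivotal
}

# Made-up example: three voters with weights 50, 49 and 1, quota 51.
print(shapley_shubik(c(50, 49, 1), quota = 51))
```

With these made-up weights, the first voter is pivotal in four of the six orderings and the other two in one each, so the indices are 4/6, 1/6 and 1/6: the voter with weight 1 has as much power as the voter with weight 49.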

Read more »

Video: Using R in Academic Finance

March 13, 2012
By

The slides and replay for Dr Sanjiv Das's webinar, Using R for Analyzing Loans, Portfolios and Risk: From Academic Theory to Financial Practice are now available. I've embedded the slides below: they tell a great story of how Das, after being mistaken for the then-CEO of Citibank (with whom he shares a name) was then led to research (using...

Read more »

R-Function to Read Data from Google Docs Spreadsheets

March 13, 2012
By

I used this idea posted on Stack Overflow to plug together a function for reading data from Google Docs spreadsheets into R. google_ss <- function(gid = NA, key = NA) { if (is.na(gid)) {stop("\nWorksheetnumber (gid) is missing\n")} if (is....

Read more »

Oracle R Distribution and Open Source R

March 13, 2012
By

Oracle provides the Oracle R Distribution, an Oracle-supported distribution of open source R. Support for Oracle R Distribution is provided to customers of the Oracle Advanced Analytics option and the Oracle Big Data Appliance. The Oracle R Distribu...

Read more »

R code for Chapter 2 of Non-Life Insurance Pricing with GLM

March 13, 2012
By

We continue working our way through the examples, case studies, and exercises of what is affectionately known here as “the two bears book” (Swedish björn = bear) and more formally as Non-Life Insurance Pricing with Generalized Linear Models by Esbjörn Ohlsson and Björn Johansson (Amazon UK | US). At...

Read more »

In case you missed it: February Roundup

March 13, 2012
By

In case you missed them, here are some articles from February of particular interest to R users. February 29 marked the 12th anniversary of the release of R 1.0.0, and the release of R 2.14.2. A list of commercial vendors who have integrated R with their products for data, analysis, and presentation. The rmr package (part of the RHadoop...

Read more »

Data Science – learn the lessons of software

March 13, 2012
By

We're starting to see a deluge of companies whose businesses are all about making data analysis/science/insight "easy for the non-expert". We've been here before, quite a few times sadly. When I started writing software 12 years ago, there was...

Read more »

Example 9.23: Demonstrating proportional hazards

March 13, 2012
By

A colleague recently asked after a slide suitable for explaining proportional hazards. In particular, she was concerned that her audience not focus on the time to event or probability of the event. An initial thought was to display the cumulative haz...

Read more »

Anthromes and UHI

March 12, 2012
By

With BerkeleyEarth 1.6 posted to CRAN I figured it was time to do some sample programs to explain how the package worked and integrated with other packages. Also, I have some issues to check out with the metadata; and in the long run I want to reformulate my metadata package to include some new resources.

Read more »

how to read spss, stata, and sas files into r

March 12, 2012
By

(This article was first published on twotorials by anthony damico, and kindly contributed to R-bloggers)

Read more »

New R User Group in Milan

March 12, 2012
By

Italy now has three local R user groups, thanks to the recent formation of MilanoR in the city of Milan. Founded by R consulting company Quantide's Andrea Spanò, R-core member and University of Milan professor Stefano Iacus, and quantitative consultant Daniele Amberti, MilanoR will be a forum to "exchange knowledge, learn and share tricks and techniques, and provide R...

Read more »

NIT: Fatty acids study in R – Part 006

March 12, 2012
By

In one of the columns, for constituent C16_0, one sample (57) has a value of zero (we could see this in the histogram). The reason is that the laboratory did not supply this value. The PLS regression will treat the lab value as zero, s...

Read more »

Kony 2012: How to weaken your argument with charts

March 12, 2012
By

If the goal of the Invisible Children campaign, which has received millions of dollars of contributions since the Kony 2012 video went viral, is to convince us that the money is being put to humanitarian efforts, they could do a lot better than this chart: Putting 37% of expenses into programs in Africa is a decent result -- many...

Read more »

R index between two products is somewhat dependent on other products

March 12, 2012
By

I explained earlier how the R-index is used in sensory analysis to examine ranking data. The justification for using the R-index lies in its link with d' and with the Mann-Whitney statistic. In this post I show there is a dependence on the number of products and p...

Read more »

That’s Not How the “Law of Large Numbers” Works

March 12, 2012
By

Breaking my dissertation- and administrata-induced silence for a small rant combining two of my favorite things: Apple Computer Inc. and simulation. Recently, the New York Times featured the article ‘Apple Confronts the Law of Large Numbers’. The fundamental assertion? That the earnings growth and stock price of Apple cannot continue their rapid rise.

Read more »

Japan Trade by Geographic Region

March 12, 2012
By

To further the analysis presented in Japanese Trade and the Yen, I thought I would take the more granular data provided by the Japanese Ministry of Finance on trade by geographic region.  Of course, I will use R to read, analyze, and plot the .csv...

Read more »

A Julia version of the multinomial sampler

March 12, 2012
By

In the previous post on RcppEigen I described an example of sampling from a collection of multinomial distributions represented by a matrix of probabilities. In the timing example the matrix was 100000 by 5 with each of the 100000 rows summing...

Read more »

Basic Introduction to ggplot2

March 12, 2012
By

This is a very basic introduction to the ggplot2 package. A much more detailed description of the package can be found in the book ggplot2: Elegant Graphics for Data Analysis. On his website (http://had.co.nz/ggplot2/), package author Hadley Wickham describes ggplot2 as a plotting system for R, based on the grammar of graphics, which tries to take...

Read more »

The R-Podcast Screencast 1: Basic Interaction with R

March 12, 2012
By

Here is the inaugural R-Podcast Screencast: Basic Interaction with R. This screencast contains audio from episode 3 of the R-Podcast. In this screencast I demonstrate how to create a vector of numerical data, calculate means, install and load packages, and get help for a function. You can find the R code demonstrated in this episode
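
The code from the episode is not reproduced in this excerpt. For readers who want a feel for the four steps mentioned, the following is a minimal sketch rather than the screencast's actual code. The vector values and the package name `ggplot2` are placeholders chosen for illustration.

```r
# 1. Create a vector of numerical data (values are made up).
x <- c(2, 4, 6, 8)

# 2. Calculate its mean.
avg <- mean(x)
print(avg)

# 3. Install and load a package. "ggplot2" is only an example name;
#    installation needs an internet connection, so it is left commented out.
# install.packages("ggplot2")
# library(ggplot2)

# 4. Get help for a function (opened only in an interactive session).
if (interactive()) help("mean")
```

At the R console, `?mean` is the usual shorthand for `help("mean")`.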

Read more »

XYZ geographic data interpolation, part 3

March 12, 2012
By

This will probably be a final posting on interpolation of xyz data, as I believe I have come to some conclusions on my original issues. I show three methods of xyz interpolation: 1. The quick and dirty method of interpolating projected xyz points (bi-linear) 2. Interpolation using Cartesian coordinates (bi-linear) 3. Interpolation using spherical coordinates and...

Read more »