Sampling for Monte Carlo simulations with R

October 31, 2011

I've knocked together a quick function for generating efficient Monte Carlo samples. It takes a bit of the legwork out of running Monte Carlo simulations.

Using IUCN-Data, ArcMap 9.3 and R to Map Species Diversity

October 31, 2011

I'm overwhelmed by the ever-growing loads of data made available via the web. For example, IUCN collects and hosts spatial species data which is free for download. I'm itching to play with all this data... And, in the end there may arise ...

Reading Excel data is easy with JGR and XLConnect

October 30, 2011

Despite the fact that Excel is the most widespread application for data manipulation and (perhaps) analysis, R's support for the xls and xlsx file formats has left a lot to be desired. Fortunately, the XLConnect package has been created to fill this void, and now JGR 1.7-8 includes integration with the XLConnect package to load .xls

Learning R: Project 1, Part 2

October 30, 2011

So it's been a week since I started down this path. I worked most of this out over last weekend, went to a conference, had a hectic week at work, and then realized I lost my work. Gah. I'll be posting my general thoughts on R later. Most...

Bayesian ideas and data analysis

October 30, 2011

Here is another Bayesian textbook that appeared recently. I read it in the past few days and, despite my obvious biases and prejudices, I liked it very much! It has a lot in common (at least in spirit) with our Bayesian Core, which may explain why I feel so benevolent towards Bayesian ideas and

The present and future of the R blogosphere (~7 minute video from useR2011)

October 30, 2011

This is (roughly) the lightning talk I gave at useR2011. If you are a reader of R-bloggers.com then this talk is not likely to tell you anything new. However, if you have a friend, colleague or student who is a new useR, this talk will offer him a decent introduction to what the R...

Modelling with R: part 5

October 30, 2011

In our exercise of learning modelling in R, we have so far succeeded in doing the following: importing the data, preparing and transforming the data, running a logistic regression, and creating a decision tree. Specifically, we created a decision tree using the r...

Proc report for simple statistics

October 30, 2011

Ken Beath, of Macquarie University, commented on an earlier entry that the best way to generate summary statistics is using proc report. While the best tools might differ, depending on the purpose, we wanted to share Ken's code demonstrating how to re...

Rcpp reverse dependency graph

October 30, 2011

I played around with reverse dependencies of Rcpp. At the moment, 44 packages depend on Rcpp and the number goes up to 53 when counting recursive reverse dependencies. I've used graphviz for the representation of the directed graph. Here is the c...

Installing R 2.14.0 on an iBook G4 running Mac OS 10.4.11

October 30, 2011

My 12" iBook G4 is celebrating its 8th birthday today! Time for a little present. How about R 2.14.0? The iBook is still in daily use, mostly for browsing the web, writing e-mails and this blog; and I still use it for R as well. For a long time it ran R...

Anarchy Golf! And that’s your Sunday gone.

October 29, 2011

I like to follow good practice when I program. I want my code to be readable, properly indented, modular and re-usable. And I want my variables to have descriptive names. There’s nothing that I moderately dislike more than arbitrary …

Plotting gain chart

October 29, 2011

A gain chart is a popular method to visually inspect model performance in binary prediction. It presents the percentage of captured positive responses as a function of the selected percentage of a sample. It is easy to obtain using the ROCR package plott...

SabreR

October 29, 2011

SabreR just released an update. It is another software package that can estimate multivariate multilevel models (other options are aML, MCMCglmm, etc.). They seem to also have a book dedicated to the software, which may be worth checking out. It will be grea...

Migrating from SPSS/Excel to R, Part 3: Preparing your Data

October 29, 2011

In this post, I describe how to prepare your data for migrating between SPSS/Excel and R. This is the third …

Dennis Ritchie 1941-2011

October 28, 2011

I just got the “news” that Dennis Ritchie died, although this happened on October 12… The announcement was surprisingly missing from my information channels and certainly got little media coverage, compared with Steve Jobs’ demise. (I did miss the obituaries in the New York Times and in the Guardian. The Economist has the most appropriate

Comparison of ave, ddply and data.table

October 28, 2011

This is a copy of a post by me on the R-statistics blog. Fortran and C programmers often say that interpreted languages like R are nice and all, but lack in terms of speed. How fast something works in R…

New R User Group in Dublin, Ireland

October 28, 2011

There have been several requests for an R User Group in Ireland, so thanks to Kevin O'Brien for stepping up to co-ordinate the Dublin-R group. Kevin invites all R users in the area to the first meeting on November 17: The Dublin R users group will be holding a series of monthly meetings. On the agenda is the development...

My little presentation on getting web data through R

October 28, 2011

With examples from rOpenSci R packages. P.S. I am no expert at this...

R-TreeBASE Tutorial

October 28, 2011

My treebase package is now up on the CRAN repository. (Source code is up, the binaries should appear soon). Here are a few introductory examples to illustrate some of the functionality of the package. Thanks in part to new data deposition …

Copulas made easy

October 28, 2011

Every day, a poor soul tries to understand copulas by reading the corresponding Wikipedia page, and gives up in despair. The incomprehensible mess that one finds there gives the impression that copulas are about as accessible as tensor theory, which is a shame, because they are actually a very nice tool. The only prerequisite is knowing

R versus SAS/SPSS in corporations

October 28, 2011

A recent question on one of the LinkedIn groups about the advantages of using R over commercial tools like SAS or IBM SPSS Modeller drew lots of comments for R. We like R a lot and we use it extensively, but I also wanted to balance the discussion. R is great, but looking at commercial organizations near...

Creating an R package, using developer/productivity tools

October 27, 2011

A couple of R programming (mainly infrastructure/workflow) related topics were discussed at the Los Angeles R users group in a tutorial/demo-like form (targeted mainly at beginners) by Szilard Pafka and Jeroen Ooms: how easy it is to create a simple package for …

Building diversified portfolios with R

October 27, 2011

A common approach to reducing risk associated with financial portfolios is diversification. A portfolio made of components that are all highly correlated with each other -- a portfolio composed solely of financial stocks, for example -- is risky, because if there's a wide-spread crisis that affects the banking sector, all components of the portfolio will tank at once, together....

Predictability of stock returns : Using acf()

October 27, 2011

In my previous post, I employed a rather crude and non-parametric approach to see if I could predict the direction of stock returns using the function runs.test(). Let's go a step further and try modelling this with a parametric econometric approach. The company that I chose for the study is INFOSYS (NSE code INFY). Let's start...

Copy all the files in a directory to a new directory using R

October 27, 2011

Someone asked me how to move a directory full of files from one place to another using R. The easiest way I've found is as follows (where "oldpath" is the existing directory and "newpath" is the new directory): file.copy(list.files(oldpath, full.names = TRUE), newpath)
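
A minimal, self-contained sketch of this copy step, assuming both directories already exist. The temporary paths and the two sample files are hypothetical, created only so the example can run anywhere. By default list.files() returns bare file names, so full.names = TRUE is used here to make the copy independent of the current working directory.

```r
# Hypothetical scratch directories standing in for "oldpath" and "newpath".
oldpath <- file.path(tempdir(), "oldpath")
newpath <- file.path(tempdir(), "newpath")
dir.create(oldpath, showWarnings = FALSE)
dir.create(newpath, showWarnings = FALSE)

# Two small sample files to copy (illustration only).
writeLines("a", file.path(oldpath, "a.txt"))
writeLines("b", file.path(oldpath, "b.txt"))

# full.names = TRUE prepends oldpath to every file name, so file.copy()
# can find the files no matter what the working directory is.
files <- list.files(oldpath, full.names = TRUE)

# When the destination is an existing directory, each file is copied into it.
# The result is a logical vector with one entry per source file.
copied <- file.copy(files, newpath)
```

Note that this copies the files and leaves the originals in place. Subdirectories are not copied by this call; file.copy() has a recursive argument for that case.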

A New Dimension to Principal Components Analysis

October 27, 2011

In general, the standard practice for correcting for population stratification in genetic studies is to use principal components analysis (PCA) to categorize samples along different ethnic axes.  Price et al. published on this in 20...

The Most Diversified or The Least Correlated Efficient Frontier

October 27, 2011

The “Minimum Correlation Algorithm” is a term I stumbled upon at the CSS Analytics blog. This is an interesting risk measure that, in my interpretation, means minimizing the average portfolio correlation with each asset class for a given level of return. One might try to use the correlation matrix instead of the covariance matrix in mean-variance optimization, but this approach,
