## Meet cloudnumbers.com at the UseR2011 conference in UK

June 21, 2011
cloudnumbers.com provides researchers and companies with the resources to perform high performance calculations in the cloud. We currently focus on the well-known open-source statistics program R (http://www.r-project.org). R is a strongly functional language and environment for statistical computing. You can explore data sets, make graphical displays of data, run statistical simulations and much more. This

## ProjectEuler-Problem 46

June 21, 2011
It was proposed by Christian Goldbach that every odd composite number can be written as the sum of a prime and twice a square. $9 = 7 + 2 \times 1^2$, $15 = 7 + 2 \times 2^2$, ...
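
The conjecture quoted above can be checked numerically. The following is a minimal R sketch, not the code from the original post; the function names `is_prime`, `goldbach_split`, and `first_counterexample` and the search limit are hypothetical choices made for illustration.

```r
# Trial-division primality test; adequate for the small numbers searched here.
is_prime <- function(n) {
  if (n < 2) return(FALSE)
  if (n < 4) return(TRUE)
  if (n %% 2 == 0) return(FALSE)
  d <- 3
  while (d * d <= n) {
    if (n %% d == 0) return(FALSE)
    d <- d + 2
  }
  TRUE
}

# Return c(prime = p, k = k) with n = p + 2 * k^2, or NULL if no such pair exists.
# The first split found is returned, which may differ from the one quoted in the text.
goldbach_split <- function(n) {
  k <- 1
  while (2 * k^2 < n) {
    p <- n - 2 * k^2
    if (is_prime(p)) return(c(prime = p, k = k))
    k <- k + 1
  }
  NULL
}

# Scan odd composite numbers upward until one has no decomposition.
# Returns NA if none is found up to `limit` (an assumed search bound).
first_counterexample <- function(limit = 10000) {
  n <- 9
  while (n <= limit) {
    if (!is_prime(n) && is.null(goldbach_split(n))) return(n)
    n <- n + 2
  }
  NA
}

goldbach_split(9)
first_counterexample()
```

Trial division is slow for large inputs, but the search space here is small enough that a sieve is not needed.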

## RghcnV3: A new package

June 21, 2011
It’s been a long journey and there are some people to thank for helping me along the way: Steve McIntyre, Ron Broberg, Jeff Id, Ryan O'Donnell, RomanM, Nick Stokes, Robert Hijmans, Gabor Grothendieck, Hadley Wickham, David Winsemius, and countless others on the R Help list. The package is done. Package: RghcnV3 Type: Package Title: Global

## Statistics.com Review

June 20, 2011
Disclaimer: All prices and classes are approximate and should be confirmed at www.statistics.com as they can change. A comment from my previous post asked me about the experience I had in taking courses from statistics.com (www.statistics.com). To help...

## Testing Hurst with Multiple Indexes

June 20, 2011
DO NOT TRADE THIS SYSTEM.  YOU VERY EASILY COULD LOSE LARGE AMOUNTS OF MONEY. I am not necessarily recommending the system that I presented in Exploring the Market with Hurst, but I thought it would provide a nice platform to illustrate some backtesti...

## Visualization Meetup at the Googleplex

June 20, 2011
Earlier this month we were excited to host a joint meetup with the Bay Area visualization group and the Bay Area R user group at Google’s headquarters in Mountain View. Hadley Wickham from Rice University came and gave a talk on Interactive Graphics in R. Earlier in the day we...

## Population pyramids in R

Some friends asked me how to build population pyramids with confidence intervals in R. When I did my own probabilistic population projections, I had the same trouble; unfortunately, there is no library to do that. So here I share the code I wrote:...

## Calling R lovers and bloggers – to work together on “The R Programming wikibook”

June 20, 2011
This post is a call for both R community members and R-bloggers to come and help make The R Programming wikibook amazing. The R Programming wikibook is not just another one of the many free books about statistics/R; it is a community project which aims to create a cross-disciplinary practical guide to the R programming language.  Here is

## Example 8.41: Scatterplot with marginal histograms

June 20, 2011
The scatterplot is one of the most ubiquitous and useful graphics. It's also very basic. One of its shortcomings is that it can hide important aspects of the marginal distributions of the two variables. To address this weakness, you can add a histo...
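
One way to show the marginal distributions next to a scatterplot in base R graphics is sketched below. This is not the code from the original post; the function name `scatter_with_marginals` and its `breaks` argument are hypothetical, and the histogram bars are only roughly aligned with the scatterplot axes.

```r
# Draw a scatterplot with a histogram of x above it and a histogram of y beside it.
# Invisibly returns the two histogram objects so the bin counts can be inspected.
scatter_with_marginals <- function(x, y, breaks = 20) {
  old_par <- par(no.readonly = TRUE)
  on.exit(par(old_par))

  xh <- hist(x, breaks = breaks, plot = FALSE)
  yh <- hist(y, breaks = breaks, plot = FALSE)

  # Panel 1: scatterplot (bottom left); panel 2: x histogram (top);
  # panel 3: y histogram (right); the top-right cell stays empty.
  layout(matrix(c(2, 0, 1, 3), nrow = 2, byrow = TRUE),
         widths = c(3, 1), heights = c(1, 3))

  par(mar = c(4, 4, 1, 1))
  plot(x, y, xlim = range(xh$breaks), ylim = range(yh$breaks))

  par(mar = c(0, 4, 1, 1))
  barplot(xh$counts, axes = FALSE, space = 0)

  par(mar = c(4, 0, 1, 1))
  barplot(yh$counts, axes = FALSE, space = 0, horiz = TRUE)

  invisible(list(x = xh, y = yh))
}

set.seed(841)
scatter_with_marginals(rnorm(200), rexp(200))
```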

## Today’s Assignment: Assignment

June 20, 2011
A new R user quickly discovers that there are multiple ways to store information into an object -- the technical term for this is assignment. There's `=` as in: `x = c(1,2,3)` and there's `<-` as in: `x <- c(1,2,3)`. R help on assignOps offers this explanati...
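
One place where the two operators behave differently is inside a function call. The sketch below is an illustration added here, not taken from the original post: within a call, `=` names an argument, while `<-` performs an assignment in the calling environment as a side effect.

```r
# Start from a clean slate so the demonstration is unambiguous.
if (exists("x", inherits = FALSE)) rm(x)

# Inside a call, `=` binds the value to the argument named x; no variable is created.
m1 <- mean(x = c(1, 2, 3))
created_by_equals <- exists("x", inherits = FALSE)

# Inside a call, `<-` assigns x in the calling environment, then passes the value on.
m2 <- mean(x <- c(4, 5, 6))
created_by_arrow <- exists("x", inherits = FALSE)

c(equals = created_by_equals, arrow = created_by_arrow)
```

At the top level of a script the two forms are interchangeable for simple assignments.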

## Summer school in Gran Paradiso

June 19, 2011
The Parco Nazionale Gran Paradiso and the Università di Pavia are organising a summer school on “Advances in species distribution modelling in ecological studies and conservation” in Pavia and Cogne, 12-18 September 2011. This school includes R and Winbugs tutorials, regular classes, plus a field trip to the park, so this sounds quite exciting (at

## Making GUIs using C# and R with the help of R.NET

June 19, 2011
In this post, I’ll take a look at using R.NET to create an application written in C#. As …

## On Dirichlet’s approximation theorem

June 19, 2011
This is one of my favourites: in 1840 the German mathematician Dirichlet proved an elegant theorem, known as “Dirichlet’s approximation theorem”. The proof is surprisingly simple, but the usefulness of the proposition in some fields of mathematics, such as Diophantine analysis, is remarkable. It goes as follows: Let a be a real number and N
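
The excerpt above breaks off before the statement is complete. In its commonly cited form, the theorem says that for a real number α and a positive integer N there are integers p and q with 1 ≤ q ≤ N and |qα − p| < 1/N. The R sketch below searches for such a pair by brute force; it assumes that standard statement, and the function name `dirichlet_approx` is hypothetical.

```r
# Find integers p, q with 1 <= q <= N and |q * alpha - p| < 1 / N.
# For each q, the best candidate p is the integer nearest to q * alpha.
dirichlet_approx <- function(alpha, N) {
  stopifnot(N >= 1)
  for (q in seq_len(N)) {
    p <- round(q * alpha)
    if (abs(q * alpha - p) < 1 / N) return(c(p = p, q = q))
  }
  NULL  # not expected in exact arithmetic; could occur through rounding error
}

dirichlet_approx(pi, 10)
```

The function returns the first qualifying q, so the pair it finds is not necessarily the best approximation available within the bound.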

## R-bloggers

June 19, 2011
This humble blog has proudly been part of R-bloggers for a couple of weeks. I have had this website as my homepage for some months now, and I have found really inspiring and informative things there. So I wish all the best to Tal Galili and his great job with...

## A Little Sampling Puzzle

June 18, 2011
Suppose you have 10 objects from which you take a sample of size 20 (with replacement, or you're in trouble). What's the probability that each object was chosen at least once? Getting an answer via simulation is pleasantly easy: `f <- function(n=10,...`
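
The author's function is cut off in the excerpt, so the following is an independent sketch rather than a reconstruction. It estimates the probability by simulation and compares it with an exact value from inclusion-exclusion; the names `coverage_sim` and `coverage_exact` and the number of trials are assumptions.

```r
# Simulation estimate: fraction of trials in which all n objects appear in the sample.
coverage_sim <- function(n = 10, size = 20, trials = 20000) {
  hits <- replicate(
    trials,
    length(unique(sample.int(n, size, replace = TRUE))) == n
  )
  mean(hits)
}

# Exact value by inclusion-exclusion over the number k of objects that are missed:
# sum over k of (-1)^k * choose(n, k) * ((n - k) / n)^size.
coverage_exact <- function(n = 10, size = 20) {
  k <- 0:n
  sum((-1)^k * choose(n, k) * ((n - k) / n)^size)
}

set.seed(1)
coverage_sim()
coverage_exact()
```

The two numbers should agree to within simulation error, which shrinks as `trials` grows.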

## Efficient loops in R — the complexity versus speed trade-off

June 18, 2011
I've written before about the up- and downsides of the plyr package -- I love its simplicity, but it can't be mindlessly applied, no pun intended. This week, I started building an agent-based model for a large population, and I figured I'd use something like a binomial per-timestep birth-death process for between-agent connections. My ballpark...

## Two textbooks on probability using R

June 18, 2011
This fall, I’ll be teaching a second-year course on Probability with Computer Applications, which is required for Computer Science majors.  I’ve taught this before, but that was five years ago, so I’ve been looking to see what new textbooks would be suitable.  The course aims not just to use computer science applications as examples, but

## Performance ratios, bootstrapping and infinite variances

June 18, 2011
If returns had infinite variance, would there be a problem bootstrapping information ratios? Background There is a discussion on the Quant Finance group of LinkedIn with the title: “How do you measure the confidence intervals of performance ratios?” One suggestion was to use the statistical bootstrap. This resulted in a discussion of the efficacy of …
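
To make the bootstrap suggestion concrete, here is a minimal sketch of a percentile bootstrap interval for an information ratio, taken as the mean active return divided by its standard deviation. It is not the analysis from the original post: the simulated returns, the function names, and the number of resamples are assumptions, and simple resampling ignores any serial dependence in real return series.

```r
# Information ratio: mean active return over the standard deviation of active returns.
information_ratio <- function(active) mean(active) / sd(active)

# Percentile bootstrap confidence interval from B resamples drawn with replacement.
bootstrap_ir_ci <- function(active, B = 2000, level = 0.95) {
  stats <- replicate(B, information_ratio(sample(active, replace = TRUE)))
  alpha <- 1 - level
  quantile(stats, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(42)
# Simulated daily active returns; purely illustrative data.
active <- rnorm(250, mean = 0.0004, sd = 0.01)

information_ratio(active)
bootstrap_ir_ci(active)
```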

## Speeding Up MLE Code in R

June 18, 2011
Recently, I’ve been fitting some models from the behavioral economics literature to choice data. Most of these models amount to non-linear variants of logistic regression in which I want to infer the parameters of a utility function. Because several of these models aren’t widely used, I’ve had to write my own maximum likelihood code to
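
As a baseline for hand-written maximum likelihood code of this kind, the sketch below fits a plain logistic regression by minimising a negative log-likelihood with `optim()` and compares the result with `glm()`. It is not the author's model: the simulated data, the true coefficients, and the name `neg_loglik` are assumptions for illustration.

```r
# Negative log-likelihood of a logistic regression with design matrix X and 0/1 response y.
# plogis(..., log.p = TRUE) computes log-probabilities in a numerically stable way.
neg_loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  -sum(y * plogis(eta, log.p = TRUE) + (1 - y) * plogis(-eta, log.p = TRUE))
}

set.seed(123)
n <- 500
x <- rnorm(n)
X <- cbind(1, x)                            # intercept column plus one predictor
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))   # simulated choices

fit <- optim(c(0, 0), neg_loglik, X = X, y = y, method = "BFGS")
glm_coef <- coef(glm(y ~ x, family = binomial))

fit$par
glm_coef
```

Writing the likelihood with vectorised operations like this, rather than looping over observations, is usually the first step toward faster MLE code in R.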

## Binary Installation Now Available

The biggest complaint we had during the installation process was that Xcode (account required) and Rtools were required for MacOS X and Windows. Today we released universal binaries (PPC/i386/x86_64) for MacOS 10.5+ as well as binaries (i386/x86_64) for Windows. This addition wil

## R progress indicators

June 18, 2011
Complicated calculations usually take a lot of time. So how can you track progress and estimate how much time the program still needs to finish?
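
One option that ships with R is the text progress bar in the `utils` package. The sketch below is not from the original post; the loop body is a stand-in for an expensive step, and `eta_seconds` is a hypothetical helper that extrapolates the remaining time linearly from the time elapsed so far.

```r
# Estimate remaining seconds, assuming every iteration takes about the same time.
eta_seconds <- function(start_time, done, total) {
  elapsed <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))
  elapsed / done * (total - done)
}

n <- 50
results <- numeric(n)
start_time <- Sys.time()
pb <- txtProgressBar(min = 0, max = n, style = 3)

for (i in seq_len(n)) {
  results[i] <- mean(rnorm(1e4))  # stand-in for a long-running calculation
  setTxtProgressBar(pb, i)        # redraw the bar at the current position
}
close(pb)

eta_seconds(start_time, done = n, total = n)  # nothing left once the loop has finished
```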

## Tracking execution paths

June 18, 2011
Earlier this week, I was trying to figure out the path of execution through a big chunk of code. Once you reach a certain size of codebase, tracking which function gets called when can be tricky. My first thought for dealing with this was to add a message line at the start of each function
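
The idea of announcing each function as it starts can be sketched without editing the function bodies by wrapping them. The wrapper below is an illustration, not the author's solution; `with_entry_message`, `step_one`, and `pipeline` are hypothetical names. Base R also provides `trace()` for inserting code into existing functions.

```r
# Wrap a function so that it emits a message each time it is entered.
with_entry_message <- function(f, name) {
  force(f)
  force(name)
  function(...) {
    message("Entering ", name)
    f(...)
  }
}

step_one <- function(x) x + 1
pipeline <- function(x) step_one(x) * 2

# Replace both functions with their announcing versions.
step_one <- with_entry_message(step_one, "step_one")
pipeline <- with_entry_message(pipeline, "pipeline")

pipeline(3)
```

Because `pipeline` looks up `step_one` at call time, the wrapped version is picked up automatically and the messages appear in call order.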

## A Brief Introduction to Mixture Distributions

Last time, I discussed some of the advantages and disadvantages of robust estimators like the median and the MADM scale estimator, noting that certain types of datasets – like the rainfall dataset discussed last time – can cause these estimators to fail spectacularly.  An extremely useful idea in working with datasets like this one is that of mixture distributions,...
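
As a small illustration of the idea, the sketch below draws from and evaluates the density of a two-component Gaussian mixture. It is not the example from the original post; the mixing weight, means, standard deviations, and the names `rmixnorm` and `dmixnorm` are assumptions.

```r
# Draw n values: each comes from component 1 with probability p, otherwise component 2.
rmixnorm <- function(n, p = 0.8, mu = c(0, 5), sigma = c(1, 1)) {
  comp <- ifelse(runif(n) < p, 1L, 2L)
  rnorm(n, mean = mu[comp], sd = sigma[comp])
}

# Mixture density: the weighted sum of the two component densities.
dmixnorm <- function(x, p = 0.8, mu = c(0, 5), sigma = c(1, 1)) {
  p * dnorm(x, mu[1], sigma[1]) + (1 - p) * dnorm(x, mu[2], sigma[2])
}

set.seed(7)
x <- rmixnorm(10000)
mean(x)    # compare with the theoretical mean p * mu[1] + (1 - p) * mu[2]
median(x)  # stays near the dominant component
```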

## Exploring the Market with Hurst

June 17, 2011
Randomly trudging through PerformanceAnalytics source code, I was intrigued by the Hurst Index calculation, which I discovered is more commonly called Hurst Exponent.  After quickly satisfying myself that I could actually do the rolling Hurst calculat...

## Raster, CMSAF and solaR

The Satellite Application Facility on Climate Monitoring (CMSAF) generates, archives and distributes widely recognised high-quality satellite-derived products and services relevant for climate monitoring in operational mode. The data is freely accessible here after a registration process. I have asked them for several files with monthly averages of global solar radiation over the Iberian Peninsula (download).

## Big-Data PCA: 50 years of stock data

June 17, 2011
In this post, Revolution engineer Sherry LaMonica shows us how to use the RevoScaleR big-data package in Revolution R Enterprise to do principal components analysis on 50 years of stock market data -- ed. Principal components analysis, or PCA, seeks to find a set of orthogonal axes such that the first axis, or first principal component, accounts for as...
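
The original post uses RevoScaleR; as a small stand-in, the sketch below runs PCA with base R's `prcomp()` on simulated data to show the two properties described above: the axes are orthogonal and the components are ordered by the variance they explain. The data and column names are invented for illustration.

```r
set.seed(2011)
n <- 200
market <- rnorm(n)  # a common factor shared by three of the four simulated series

returns <- cbind(
  a = market + rnorm(n, sd = 0.3),
  b = market + rnorm(n, sd = 0.3),
  c = market + rnorm(n, sd = 0.5),
  d = rnorm(n)      # unrelated to the common factor
)

pca <- prcomp(returns, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each component, in decreasing order.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)

# The loading vectors are orthonormal, so this is (numerically) the identity matrix.
round(crossprod(pca$rotation), 10)
```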

## solaR 0.24 at CRAN

Version 0.24 of solaR is on CRAN and R-Forge. Version 0.23 was uploaded some days earlier, but I had to make a quick fix to readMAPA: the url of http://www.mapa.es/siar has been changed to http://www.marm.es/siar. Moreover, this function has been renamed to readSIAR, although it is still available as readMAPA. Consequently the mode

## Engineering Data Analysis (with R and ggplot2) – a Google Tech Talk given by Hadley Wickham

June 17, 2011
It appears that just days ago, Google Tech Talk released a new, one-hour-long video of a presentation (from June 6, 2011) made by one of the R community's more influential contributors, Hadley Wickham. This seems to be one of the better talks to send a programmer friend who is interested in getting into R.