## littler 0.1.5

September 17, 2011

Brown-bag release time for littler. One of the minor cleanups in the 0.1.4 release from Thursday actually introduced a nasty little bug as you can't call Rf_KillAllDevices() when you do not have any graphics device. Doh. So with apologies for the l...

## UK R Courses – 2012

September 17, 2011

The School of Mathematics & Statistics at Newcastle University (UK) is again running some R courses. In January 2012, we will run: January 16th: Introduction to R; January 17th: Programming with R; January 18th & 19th: Advanced graphics with R. The courses aren’t aimed at teaching statistics; rather, they aim to go through the fundamental

## Introduction to Beamer

September 17, 2011

A friend of mine, who is quite smart by the way (she is a PhD student in Physics at Cambridge), recently asked me for some help with Beamer. Well, most of my knowledge and code came from Utkarsh when I started about a year ago. Initially, I ha...

## Elements of Bayesian Econometrics

September 16, 2011

posterior = (likelihood × prior) / integrated likelihood

The combination of a prior distribution and a likelihood function is utilized to produce a posterior distribution. Incorporating information from both the prior distribution and the likelihood function leads to a reduction in variance and an improved estimator. As n→...
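Written out with symbols, the identity in this excerpt is Bayes' rule. The notation is an assumption made here, not taken from the post: θ stands for the parameters and y for the observed data.

```latex
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta)\, p(\theta)\, d\theta}
```

The denominator is the integrated (marginal) likelihood. It does not depend on θ, so it only rescales the product of likelihood and prior into a proper density.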

## ifelse function in R only returns the first element

September 16, 2011

If you also like to use the `ifelse` function, be aware of the value it returns. For example: `ifelse(1>0, 3, 4)` returns 3, but `ifelse(1>0, c(2, 3), c(4, 5))` returns 2 (only the first element is returned). `ifelse(c(1:10)>5, 'on', 'off')` returns "off" "off...
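A minimal sketch of why this happens, using base R only: `ifelse()` returns a result with the same length as its test argument, so a length-one test yields a length-one result. The variable names are illustrative and not from the original post.

```r
# The test has length 1, so only the first element of `yes` is used.
single <- ifelse(1 > 0, c(2, 3), c(4, 5))

# The test has length 10, so the result has length 10.
vectorised <- ifelse(1:10 > 5, "on", "off")

# To pick one whole vector based on a single condition, use if/else instead.
whole <- if (1 > 0) c(2, 3) else c(4, 5)

print(single)
print(vectorised)
print(whole)
```

In short, `ifelse()` is for element-wise choices, while `if (...) ... else ...` is for choosing between two whole objects.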

## R in the insurance industry

September 16, 2011

Let's talk about R in the insurance industry today. David Smith's blog entry reminded me about our poster at the R user conference in Warwick in August 2011: Using R in Insurance. We presented examples on how R can be used in the insu...

## How to extract time series from large timestamped logs with R

September 16, 2011

Revolution Analytics' Joe Rickert has a new post on inside-R.org, demonstrating how you can use R and the RevoScaleR package to extract time series data from time-stamped logs (in this case, the "US Domestic Flights From 1990 to 2009" dataset on Infochimps): Analyzing time series data of all sorts is a fundamental business analytics task to which the R...

## Backtesting Part 2: Splits, Dividends, Trading Costs and Log Plots

September 16, 2011

Note: This post is NOT financial advice!  This is just a fun way to explore some of the capabilities R has for importing and manipulating data.   In my last post, I demonstrated how to backtest a simple momentum-based stock trading strategy ...

## Beta and expected returns

September 16, 2011

Some pictures to explore the reality of the theory that stocks with higher beta should have higher expected returns. Figure 2 of “The effect of beta equal 1” shows the return-beta relationship as downward sloping. That’s a sample of size 1. In this post we add six more datapoints. Data: The exact same betas of …

## A multidimensional “which” function

September 16, 2011

Update: Henrik Bengtsson commented that `which(x, arr.ind=TRUE)` gives the same result, rendering the post below academic (thanks for the comment!). So, for academic interest, I'll leave it. In my defense, I implemented this kind of functionality in C some time …

## A multidimensional "which" function

September 16, 2011

The well-known which function accepts a logical vector and returns the indices where its value equals TRUE. Actually, which also accepts matrices or multidimensional arrays. Internally, R uses a single index to run through such two- or higher-dimension...
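A small illustration of the single index versus array indices, using only base R. The example matrix is made up for this sketch. `which(x, arr.ind = TRUE)` is the built-in route to row and column positions, and `arrayInd()` converts single indices after the fact.

```r
# A 2 x 3 logical matrix, filled column by column.
m <- matrix(c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE), nrow = 2)

# Single (column-major) indices of the TRUE entries.
linear_idx <- which(m)

# Row and column positions of the same entries.
array_idx <- which(m, arr.ind = TRUE)

# Converting the single indices by hand gives the same positions.
converted <- arrayInd(linear_idx, dim(m))

print(linear_idx)
print(array_idx)
```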

## Soil-Landscape Block Diagrams in SoilWeb

September 16, 2011

Users of our Google Earth interface to USDA-NCSS soils information will now see links to soil-landscape block diagrams listed within map unit descriptions. Automated Linking to NCSS Block Diagrams

September 16, 2011

I have a paper which I wrote some years ago, which has not been finished, and which should be accompanied by an R package. So far nothing special, but at that time, I was only at the beginning of my affair with R, and so I made several mistakes (OK – I did also some

## Soil Series Query for SoilWeb

September 16, 2011

A map depicting the spatial distribution of a given soil series can be very useful when working on a new soil survey, updating an old one, or searching for specific soil characteristics. We have recently added a soil series query facility to SoilWeb, w...

## Simulation studies in R – Using all cores and other tips

September 16, 2011

After working more seriously with simulations, I noticed that some updates to my previous setup were necessary. Most notable are the following three: it is very handy to explicitly call the different scenarios instead of using nested loops; storing intermediate results in single files obviates the need to rerun an almost finished but crashed analysis; and
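The first two tips can be sketched in base R as follows. This is an illustration under assumptions, not the author's setup: the scenario parameters `n` and `sd`, the helper `run_scenario`, and the file naming are all hypothetical.

```r
# Hypothetical scenario grid: one row per scenario, no nested loops needed.
scenarios <- expand.grid(n = c(10, 100), sd = c(1, 2))

# Results go to one file per scenario (a temporary directory in this sketch).
out_dir <- file.path(tempdir(), "sim_results")
dir.create(out_dir, showWarnings = FALSE)

run_scenario <- function(i) {
  out_file <- file.path(out_dir, sprintf("scenario_%02d.rds", i))
  # Reuse a stored result so a crashed run can resume where it stopped.
  if (file.exists(out_file)) return(readRDS(out_file))
  set.seed(i)
  s <- scenarios[i, ]
  res <- data.frame(s, mean_est = mean(rnorm(s$n, sd = s$sd)))
  saveRDS(res, out_file)
  res
}

results <- do.call(rbind, lapply(seq_len(nrow(scenarios)), run_scenario))
print(results)
```

Because each scenario is an independent function call, `lapply` could be swapped for `parallel::mclapply` on Unix-alike systems to spread the scenarios over several cores.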

## Beeswarm Plot with ggplot2

September 16, 2011

A colleague showed me results of his study project with beeswarm plots made by GraphPad. I was wondering if it could be implemented in R and more specifically with ggplot2. There is an R package for drawing such graphs, the beeswarm package (beeswa...

## Performance with ggplot2

September 16, 2011

Now after Reporting Good Enough to Share, let’s use ggplot2 and PerformanceAnalytics to turn this into this. I have been notified that the colors aren’t great. How does everyone like this?

## Statistics and Data Analysis in Python with pandas and statsmodels

September 16, 2011

Wes McKinney is a prominent figure in the scientific Python community, and has made tremendous contributions to several core statistical computing libraries in that language. This month, Wes will be speaking specifically about two packages he has crea...

## Datasets to Practice Your Data Mining

September 16, 2011

There are many datasets available online for free for research use. Some of them are listed below. The R Datasets Package: there are around 90 datasets available in the package. Most of them are small and easy to feed …

## Project Euler: problem 2

September 16, 2011

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be: 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ... By considering the terms in the Fibonacci sequence whose values do not exc...
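The excerpt is cut off before the question itself, so the sketch below rests on two assumptions: that the task is to sum the even-valued terms, and that the upper bound is passed in as a parameter (`limit`). The function name `even_fib_sum` is made up for this illustration.

```r
# Sum the even-valued Fibonacci terms (sequence starting 1, 2) up to `limit`.
even_fib_sum <- function(limit) {
  a <- 1
  b <- 2
  total <- 0
  while (a <= limit) {
    if (a %% 2 == 0) total <- total + a
    nxt <- a + b   # next term is the sum of the previous two
    a <- b
    b <- nxt
  }
  total
}

# Among the ten terms listed above, the even ones are 2, 8 and 34.
print(even_fib_sum(100))
```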

## How Lloyd’s of London uses R for Insurance

September 15, 2011

Lloyd's is the world's leading specialist insurance market, and is often the first to insure new, unusual or complex risks. So it's no surprise that Lloyd's is one of the many companies that use R and its advanced capabilities for data analysis to help manage its insurance risks. At the useR! conference last month, Lloyd's analysts Markus Gesmann, Viren...

September 15, 2011

Prompted by a rush of visitors from Andrew Gelman's blog, I went back and updated the details of my post from 2009 on reading data from Google Spreadsheets into R. Since then, Google had switched to using a secure (https) connection for Google Docs, which required some tweaks to the code. If you haven't seen it before, it's a...

## Correlations among US Stocks: Is it really time to fire your adviser?

September 15, 2011

Note: This post is NOT financial advice!  This is just a fun way to explore some of the capabilities R has for importing and manipulating data. The Financial Times says it's time to "Fire your Adviser" because correlations among US stocks ar...

## Reporting Good Enough to Share

September 15, 2011

Sorry to all my faithful readers for my absence recently. I started a new job at a new firm, so my blogging has moved down the priority list but only temporarily. I am still committed to documenting my thoughts, especially finance and R thoughts as dis...

## littler 0.1.4

September 15, 2011

Matthias Klose, the tireless force behind the Debian / Ubuntu gcc, python, and what have you packages, sent me a minimal patch to let littler build when the ld linker uses the --as-needed option (as Ubuntu builds now do): all it took was a little reor...

## Project Euler: problem 1

September 15, 2011

To be fairly honest (assuming there are degrees of honesty), I do know a little about math and programming but I don't know much math or any programming. I've loved math for a long time, but started to learn and understand fairly recently. So during th...

## Recent Updates in the aqp (Algorithms for Quantitative Pedology) Package for R

September 14, 2011

A new version of our 'aqp' package for quantitative soils investigations is available on CRAN (version 0.99-5) and R-Forge (0.99-8). Some of the major changes are listed below: aqp 0.99-8 (2011-09-14) ...

## R Fork Bomb

September 14, 2011

So maybe I’m a strange guy, but I think fork bombs are really funny.  What’s a fork bomb?  The basic premise is that you spawn a process that spawns a process that spawns a process…, ad infinitum. The most beautiful example of a fork bomb, and really one of the most beautiful lines of code

## Shortest paths to/from nodes of a certain type

September 14, 2011

Elijah asked the following via the SOCNET mailing list: I was wondering if anyone knew of a script or tool which would give me the network distance of nodes to a particular class of nodes. I think of this as an Erdos number, except instead of getting the distance to one node, I want the distance