A New plot.xts

August 15, 2012

The Google Summer of Code (2012) project to extend xts has produced a very promising new plot.xts function. Michael Weylandt, the project's student, wrote to R-SIG-Finance to request impressions, feedback, and bug reports. The function is hous...

Probit Models with Endogeneity

August 15, 2012

Dealing with endogeneity in a binary dependent variable model requires more consideration than the simpler continuous dependent variable case. For some, the best approach to this problem is to use the same methodology used in the continuous case, i.e. two-stage least squares. Thus, the equation of interest becomes a linear probability model (LPM). The

Project Euler — problem 18

August 15, 2012

The 18th Euler problem is a kind of route-finding problem. It occupied my mind for two days. Finally I came up with a clever solution. Find the maximum total from top to bottom of the triangle below: 75 95 64 17 …
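
The teaser does not show the author's solution, so the following is only a sketch of one common approach (a bottom-up dynamic programme), not necessarily the "clever solution" mentioned above. The function name `max_path_total` is hypothetical, and the usage note relies on the small four-row sample triangle from the Project Euler problem statement rather than the full triangle, which is cut off here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest top-to-bottom total in a number triangle,
// where each step moves to one of the two adjacent numbers on the row below.
// The triangle is taken by value so the caller's copy is left untouched.
int max_path_total(std::vector<std::vector<int>> rows) {
    assert(!rows.empty());
    // Work upward from the second-to-last row: every cell absorbs the better
    // of its two children, so the apex ends up holding the best total.
    for (int r = static_cast<int>(rows.size()) - 2; r >= 0; --r) {
        for (std::size_t c = 0; c < rows[r].size(); ++c) {
            rows[r][c] += std::max(rows[r + 1][c], rows[r + 1][c + 1]);
        }
    }
    return rows[0][0];
}
```

Called on the sample triangle `{{3}, {7, 4}, {2, 4, 6}, {8, 5, 9, 3}}`, this returns 23 (the path 3 + 7 + 4 + 9). Each cell is visited once, so the work grows with the number of cells rather than with the number of possible routes.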

Predicting the memory usage of an R object containing numbers

August 15, 2012

To estimate whether a certain vector of numbers will fit into memory, you can quite easily predict the memory usage based on the size of the vector. An integer vector will use 4 bytes per number, and a numeric vector…

Processing sample labels using regular expressions in R

August 15, 2012

I am often found in possession of palaeo core data where the sample identifiers contain a core code or label plus the sample depth. Often these are things generated by colleagues who have used other software where for one reason or another they don’t want to store the depth information as a separate numeric variable. I also generate such...

Chapter 2 Solutions – Statistical Methods in Bioinformatics

August 14, 2012

As I have mentioned previously, I have begun reading Statistical Methods in Bioinformatics by Ewens and Grant and working selected problems for each chapter. In this post, I will give my solution to two problems. The first problem is pretty straightforward. Problem 2.20 Suppose that a parent of genetic type Mm has three children. Then the parent transmits...

Some Quirks of the R Language

August 14, 2012

R is my favorite programming language. It's just so useful for getting work done. Sometimes people will complain that R is a difficult language. To me, this raises the questions: difficult for what? And for whom? I personally think R is just about the easiest thing in the world for prototyping. Meaning if you want to quickly crank out...

Textbook – Statistical Methods in Bioinformatics

August 14, 2012

As part of my effort to acquaint myself more with biology, bioinformatics, and statistical genetics, I am trying to find as many resources as I can that provide a solid foundation. For instance, I am wading through Molecular Biology of the Cell at a pa...

Minimum Expected Shortfall, Part 2

August 14, 2012

Previously, we set up the problem of constructing a minimum expected shortfall portfolio. We exported the portfolio weights from each quarterly rebalancing into R objects. This post will process those weights and compare the portfolio s...

The Statistical Sleuth (second edition) in R

August 14, 2012

For those of you who teach, or are interested in seeing an illustrated series of analyses, there is a new compendium of files to help describe how to fit models for the extended case studies in the Second Edition of the Statistical Sleuth: A Course in...

Is gas cheaper than it used to be?

August 14, 2012

Biostatistician and R user Matt Cooper noticed recently that the price he pays for petrol (gasoline) at the pump in Perth, Australia was about the same as he was paying four years ago. Nonetheless, inflation has marched on over the years, so does that mean petrol is effectively cheaper now than it used to be? And how does the...

Math Constants in C++

August 14, 2012

Some of my colleagues didn't know that you can use the mathematical constants that are part of "cmath". Here is a small snippet that shows how to use PI from the cmath library. Be aware that you need to write "#define _USE_MATH_DEFINES" before you include cm...
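
The original snippet is not included in this teaser, so here is a minimal sketch of the idea under stated assumptions. `M_PI` is not part of the C++ standard; it is an extension that many implementations provide through `<cmath>`, and the `_USE_MATH_DEFINES` macro is what Microsoft's compiler needs to expose it. The function `circle_area` is a hypothetical example, not taken from the post.

```cpp
// The define must appear before <cmath> is included for the first time.
#define _USE_MATH_DEFINES
#include <cassert>
#include <cmath>

// Hypothetical example: area of a circle using the M_PI constant.
double circle_area(double radius) {
    assert(radius >= 0.0);
    return M_PI * radius * radius;
}
```

For a radius of 1 the function simply returns the value of `M_PI`. Code that can rely on C++20 may prefer the standard constant `std::numbers::pi` from `<numbers>`, which needs no macro.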

Bank of America 1% Cash Rewards Aren’t Really 1%

August 14, 2012

Bank of America (BoA) has a "Cash Rewards" credit card that pays "1% cash back everywhere, every time". But if you read the fine print, it's clear that the reward is almost always less than 1%. Here's the relevant sentence from the terms and conditions: Fractions are truncated at the 100th decimal place, and are
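
The quoted sentence is cut off, so the arithmetic below is only an illustration under an assumption: that "truncated at the 100th decimal place" means the 1% reward is cut down to whole cents rather than rounded. The function names `reward_cents` and `effective_rate_percent` are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: 1% reward in whole cents for a purchase given in
// cents, ASSUMING the fraction of a cent is truncated rather than rounded.
long long reward_cents(long long purchase_cents) {
    assert(purchase_cents >= 0);
    return purchase_cents / 100;  // integer division discards the fraction
}

// Effective reward rate in percent; only defined for positive purchases.
double effective_rate_percent(long long purchase_cents) {
    assert(purchase_cents > 0);
    return 100.0 * static_cast<double>(reward_cents(purchase_cents)) /
           static_cast<double>(purchase_cents);
}
```

Under this assumption a $9.99 purchase (999 cents) earns 9 cents, which is slightly less than 1%, while a $100.00 purchase earns exactly 1%. Only purchases that are whole multiples of one dollar reach the full rate.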

Custom axis transformations in ggplot2

August 14, 2012

To apply a data transformation on an axis in a ggplot, you can use coordinate transformations. For more detail see the ggplot2 documentation. A number of coordinate transformations are available, including log10 and sqrt. However, if you want to perform…

How to branch/fork a (StatET) project with SVN

August 14, 2012

I was introduced to version control at the 2011 Belgrade R+OSGeo in higher education summer school. I’ve been using it in my daily work ever since. Recently the need to branch my project came up, and this post describes how I satisfied that need after a few hours of reading the internet. In a nutshell, you

Random and fixed effects in sensory profiling

August 14, 2012

I am reading Introduction to Mixed Modelling by N.W. Galwey. It is partly a repeat of things I know, but I expect to use mixed models quite a lot in the coming time, so it is good to repeat these things. My problem with this book is a sensory exampl...

London 2012 Olympics — medal statistics

August 14, 2012

The 2012 Olympic Games officially ended this Sunday in London. Although I missed most of the games, I still entertained myself with some hilarious news, such as Thomas’s re-diving. So much fun. I will remember this for years :) Games ended. …

The essence of a handwritten digit

August 13, 2012

If you haven’t yet discovered the competitive machine learning site kaggle.com, please do so now. I’ll wait. Great – so, you checked it out, fell in love and have made it back. I recently downloaded the data for the getting started competition. It consists of 42,000 labelled images (28×28) of handwritten digits 0-9. The

Adaptive Asset Allocation

August 13, 2012

Today I want to highlight a whitepaper about Adaptive Asset Allocation by Butler, Philbrick and Gordillo and the discussion by David Varadi on the robustness of parameters of the Adaptive Asset Allocation algorithm. In this post I will follow the steps of the Adaptive Asset Allocation paper, and in the next post I will show

RInside 0.2.7

August 13, 2012

A new version 0.2.7 of RInside is now available via CRAN. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ integrati...

Missouri: Comparison of Registered Voter Counts to Census Voting Age Population

August 13, 2012

By Earl F Glynn | Franklin Center

A comparison of US Census voting age population data in Missouri to voter registration data shows a number of Missouri counties have bloated voter registration lists. Charts by county for the years 2000 to 2012 show how counties are maintaining their voter lists. Voter fraud potential is higher

Cleaning sentences by recursively merging words using R

August 13, 2012

A question on StackOverflow really caught my attention. The aim was to clean up a dataset of inappropriately spaced words. For example: My approach was to create what I call a wordpair object. The wordpair object for the…

Videos on Using R

August 13, 2012

In this post on his blog some months ago, Ethan Fosse drew attention to Anthony Damico's collection of over 90 videos on using the R software environment. Definitely worth looking at! © 2012, David E. Giles

User Input using tcl/tk

August 13, 2012

I was inspired by Kay Cichini's recent post on creating a tcl/tk dialog box for users to enter variable values. I am going to have a use for this very soon, so I took some time to make it a bit more generic. What I wanted was a function that takes a vector (of variable names)

Quick SAP HANA and R usecase

DISCLAIMER: I'm not an SAP HANA expert or an R expert, not even a Python expert. I'm just a guy with a lot of ideas who loves to write blogs. The other day I was thinking about making something nice with SAP HANA and R, because people don't seem to be enou...

New R User Groups in San Antonio, Milwaukee, Nicaragua

August 13, 2012

We have three new local R user groups to announce this month. The Alamo City R Users Group in San Antonio becomes the fifth R user group in Texas. The group's just getting started, and volunteers are always welcome. Although not a dedicated R group, the Milwaukee Chapter of the ASA hosts occasional R workshops. In May next year,...

The fanplot package for R

August 13, 2012

My fanplot package has gone up on CRAN. Here is an online version of the vignette. Introduction: The fanplot package contains a collection of R (R Development Core Team, 2012) functions to effectively display plots of sequential distributions such as …

Highlights of R in Finance 2012

August 13, 2012

I unfortunately was not there, but we can vicariously enjoy it via the presentations that are posted on the conference website. Below is my take on the highlights (in chronological order). Peter Carl and Brian Peterson “Constructing Strategic Hedge Fund Portfolios” is wonderful from my perspective. Promoting random portfolios is sure to win my heart. …
