Even More JGB Yield Charts with R lattice

May 15, 2013

See the last post for all the details. I just could not help creating a couple more. Variations on Favorite Plot - Time Series Line of JGB Yields by Maturity p2 <- xyplot(value ~ date | indexname, data = jgb.melt, type = "l", layout = c(length(unique(jgb.melt$indexname)), ...

Read more »

Exponential Cache Behavior

May 15, 2013

Guerrilla alumnus Gary Little observed certain fixed-point behavior in simulations where disk IO blocks are updated randomly in a fixed-size cache. For his Python simulation with 10 million entries (corresponding to an allocation of about 400 MB of memory), the following results were obtained: Hit ratio (i.e., occupied) = 0.3676748 Miss ratio...

Read more »

R code for generating multi-site stochastic precipitation

In water resource management, climate change, hydrology and related disciplines, long time series of precipitation/rainfall data are required. Since historical records are relatively short, typically 50 years or less, mathematical/statistical models are ...

Read more »

Automated Archival and Visual Analysis of Tweets Mentioning #bog13, Bioinformatics, #rstats, and Others

May 15, 2013

Automatically Archiving Twitter Results. Ever since Twitter gamed its own API and killed off great services like IFTTT triggers, I've been looking for a way to automatically archive tweets containing certain search terms of interest to me. Twitter's buil...

Read more »

Japan – JGB Yields–More Lattice Charts

May 15, 2013

This blog is littered with posts about Japan. In one sentence, I think Japan presents opportunity and is a very interesting real-time test of much of my macro thinking. Proper visualization is absolutely essential for me to understand all of the dynami...

Read more »

Big News! “Practical Data Science with R” MEAP launched!

May 15, 2013

Nina Zumel and I (John Mount) have been working very hard on producing an exciting new book called “Practical Data Science with R.” The book has now entered the Manning Early Access Program (MEAP), which allows you to subscribe to chapters as they become available and give us feedback before the book goes into ...

Read more »

Variance matrix differences

May 15, 2013

Torturing portfolios to give different volatilities between a factor model and Ledoit-Wolf shrinkage. Previously: there have been posts on “What the hell is a variance matrix?”, factor models, and Ledoit-Wolf shrinkage. Question: two of the several ways to produce an estimate of the variance matrix of asset returns are a statistical factor model and Ledoit-Wolf shrinkage. …

Read more »

Forecasting annual totals from monthly data

May 15, 2013

This question was posed on crossvalidated.com: I have a monthly time series (for 2009–2012 non-stationary, with seasonality). I can use ARIMA (or ETS) to obtain point and interval forecasts for each month of 2013, but I am interested in forecasting the total for the whole year, including prediction intervals. Is there an easy way in R to obtain interval...

Read more »

Easier confidence interval estimation with matrices and similar arrays in R

May 15, 2013

When dealing with survey data in particular, social scientists often want to produce proportions from the data, along with associated confidence intervals. The prop.test command in R can be used to generate the desired results. When dealing with small ...

Read more »
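
As a rough illustration of the prop.test usage mentioned in the excerpt above, here is a minimal sketch. The survey counts (45 "yes" answers out of 100 respondents) are made up for the example and do not come from the original post; only the base R function prop.test itself is taken from the excerpt.

```r
# Hypothetical survey counts (an assumption for illustration only):
# 45 respondents out of 100 answered "yes".
yes_count <- 45
n_respondents <- 100

# prop.test() from the stats package returns the estimated proportion
# and a confidence interval (with continuity correction by default).
res <- prop.test(x = yes_count, n = n_respondents, conf.level = 0.95)

p_hat <- unname(res$estimate)  # sample proportion, x / n
ci <- res$conf.int             # lower and upper confidence limits

print(p_hat)
print(ci)
```

The same call accepts vectors for x and n, which is presumably where the matrix-oriented approach in the post title comes in, although the truncated excerpt does not show it.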

From a random generator to a sample function

May 14, 2013

This weekend, I wrote a post because I had some trouble generating a random sample with R to reproduce one obtained by a co-author with SAS (generated using Fishman and Moore (1982), as used in the function RANUNI). I was lucky, since another contributor to that book, Christophe Dutang, got the answer to the last question I asked: is it...

Read more »

Top 3 R resources for beginners

May 14, 2013

The community team at Revolution Analytics has just updated this list of resources to learn about R on the Web. Included is this list of the top 3 resources for absolute beginners getting started with R: An Introduction to R – the free, “official” CRAN R manual; Try R – a short course that lets you jump right in...

Read more »

Le Monde puzzle [#820]

May 14, 2013

The current puzzle is… puzzling: Given the set {1,…,N} with N<61, one iterates the following procedure: take (x,y) within the set and replace the pair with the smallest divisor of x+y (other than 1). What are the values of N such that the final value in the set is 61? I find it puzzling because the

Read more »

1.5 percent of doctors, a quarter of malpractice reports

May 14, 2013

Some doctors receive more malpractice reports than others. Just how unequal is the distribution of malpractice reports? The post 1.5 percent of doctors, a quarter of malpractice reports appeared first on Decision Science News.

Read more »

SIR Model – The Flu Season – Dynamic Programming

May 14, 2013

The SIR (susceptible, infected, and recovered) model is a common and useful tool in epidemiological modelling. In this post and in future posts I hope to explore how this basic model can be enriched by including different population group...

Read more »

Much more efficient bubble sort in R using the Rcpp and inline packages

May 14, 2013

Recently I wrote a blogpost showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that it was awfully slow. And by slow, I mean really slow, as in “a 100…

Read more »

Beware: 2 is not always 2 in R

May 14, 2013

This post is minimalistic. Consider this: Now let's have a look at what's inside x: But is it really true? Here you go. A colleague of mine was once ruined by this for an entire day before we realized what was…

Read more »

Forecast Update: Will 2014 be the Beginning of the End for SAS and SPSS?

May 14, 2013

I recently updated my plots of the data analysis tools used in academia in my ongoing article, The Popularity of Data Analysis Software. I repeat those here and update my previous forecast of data analysis software usage. Learning to use …

Read more »

Projection Pursuit Classification Trees

May 14, 2013

I've been looking at this article for a new tree-based method. It uses other classification methods (e.g. LDA) to find a single variable to use in the split and builds a tree in that manner. The subtleties of the model are: The model does not prune but ...

Read more »

PhyloTempo

May 14, 2013

Summary: Two new measures of tree topology are introduced: temporal clustering (TC), and staircase-ness. Several other existing statistics are also implemented for the purpose of comparison: Aldous's graphical test and likelihood test to decide if a tree fits the Yule …

Read more »

The rbinding race: for vs. do.call vs. rbind.fill

May 14, 2013

Which function rbinds dataframes together fastest? First competitor: classic rbind in a for loop over a list of dataframes. Second competitor: do.call("rbind", <list of dataframes>). Third competitor: rbind.fill(<list of dataframes>) f...

Read more »
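
The first two competitors named in the excerpt above can be sketched in base R as follows. This is only an illustrative setup, not the original post's benchmark: the list of 50 small data frames and the helper names bind_loop and bind_docall are assumptions made for this example. The third competitor, rbind.fill, comes from the plyr package and is left out here to keep the sketch free of dependencies.

```r
# Hypothetical test data: 50 small data frames with identical columns.
set.seed(42)
dfs <- lapply(1:50, function(i) data.frame(id = i, value = rnorm(3)))

# Competitor 1: classic rbind inside a for loop (grows the result step by step).
bind_loop <- function(lst) {
  out <- lst[[1]]
  for (d in lst[-1]) {
    out <- rbind(out, d)
  }
  out
}

# Competitor 2: a single rbind call over the whole list via do.call.
bind_docall <- function(lst) {
  do.call("rbind", lst)
}

res_loop <- bind_loop(dfs)
res_docall <- bind_docall(dfs)

# Timings vary by machine and data size, so no expected values are given here.
t_loop <- system.time(bind_loop(dfs))
t_docall <- system.time(bind_docall(dfs))
print(t_loop)
print(t_docall)
```

Both functions should return the same 150-row data frame; only the time taken to build it differs.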

Visualizing your websites’ ecommerce performance with R

May 14, 2013

In this blogpost, I want to dive deeper into the explanation of the relationship between Frequency and Recency of Visits with the Conversion Rate and Average Order Value. I have used the RGA package for data extraction and Dr. Hadley Wickham’s ggplot2 package to achieve the visualizations. Here’s the data aggregation script: #transactions dataframe

Read more »

Claims Inflation – a known unknown

May 14, 2013

Over the last year I worked with two colleagues of mine on the subject of inflation and claims inflation in particular. I didn't expect it to be such a challenging topic, but we ended up with more questions than answers. The key question and biggest ch...

Read more »

RcppArmadillo 0.3.820

Conrad rolled up a new Armadillo release 3.820 (following two minor fix releases in the 0.3.810 series, of which we packaged the one that was relevant for us). This new version is now out in a release 0.3.820 of RcppArmadillo which is already on CRAN a...

Read more »

Stan!

May 13, 2013

Guy Freeman writes: I thought you’d all like to know that Stan was used and referenced in a peer-reviewed Rapid Communications paper on influenza. Thank you for this excellent modelling language and sampler, which made it possible to carry out this work quickly! I haven’t actually read the paper, but I’m happy to see Stan The post Stan!...

Read more »

Rによるモンテカルロ法入門

May 13, 2013

Here is the cover of the Japanese translation of our Introducing Monte Carlo Methods with R book, a few years after the French translation. It actually appeared last year in August but I was not informed of this till a few weeks ago. The publisher is Maruzen, with an associated webpage if you want to

Read more »

Integration take two – Shiny application

May 13, 2013

My last post discussed a technique for integrating functions in R using a Monte Carlo or randomization approach. The mc.int function (available here) estimated the area underneath a curve by multiplying the proportion of random points below the curve by the total area covered by points within the interval: The estimated integration (bottom plot) is

Read more »
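
The randomization approach described in the excerpt above (proportion of random points below the curve, multiplied by the total area covered by the points) can be sketched as follows. This is not the post's mc.int function: the name mc_area, its arguments, and the x^2 test integrand are all assumptions made for this illustration, and the sketch only handles non-negative functions with a known upper bound y_max on the interval.

```r
# Hypothetical helper (not the original mc.int): estimates the area under a
# non-negative function f on [lower, upper], given an upper bound y_max on f.
mc_area <- function(f, lower, upper, y_max, n = 1e5) {
  # Scatter n random points uniformly over the bounding box.
  x <- runif(n, lower, upper)
  y <- runif(n, 0, y_max)

  # Total area covered by the points within the interval.
  box_area <- (upper - lower) * y_max

  # Proportion of points falling below the curve, scaled by the box area.
  mean(y <= f(x)) * box_area
}

set.seed(1)
# Test integrand chosen for illustration: the exact integral of x^2 on [0, 1] is 1/3.
est <- mc_area(function(x) x^2, lower = 0, upper = 1, y_max = 1)
print(est)
```

The estimate is random, so it fluctuates around the true value; increasing n tightens it at the usual one-over-square-root-of-n rate.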

In case you missed it: April 2013 Roundup

May 13, 2013

In case you missed them, here are some articles from April of particular interest to R users: A critique of a SAS whitepaper comparing the performance of SAS, R and Mahout. A video presentation from statistician Tess Nesbitt at UpStream, who uses GAM survival models in R for marketing attribution analysis. The April edition of the Revolution Analytics newsletter....

Read more »

Shiny App for CRAN packages

May 13, 2013

Over the past few days, I have been introduced to a few new-to-me R packages, via some comments from the Shiny guys and the R-bloggers site. This seems a rather haphazard way of acquiring knowledge and I cannot be alone in thinking that this is not the most productive way to become aware of new/better

Read more »

Stack Exchange: Why I dropped out

May 13, 2013

Stack Exchange is a series of question-and-answer sites, including Stack Overflow for programming and Cross Validated for statistics. I was introduced to these sites at a short talk by Barry Rowlingson at the 2011 UseR! meeting, “Why R-help must die!” These sites have a lot of advantages over R-help: The format is easier to read,

Read more »
