## Stone Flakes V, networks again

July 6, 2014
Last week I tried pcalg. This week: deal (Learning Bayesian Networks with Mixed Variables). In this post I want to try something new, a causal graphical model. The aim here is just as much to get myself a feel for what these things do as to underst...

## Estimating Required Coinage

July 5, 2014
I would like to code up a simple method of minimizing the number of coins required to give change. Then I would like to see which coins are most likely to be called into use if the change required is a uniform draw between 1 cent and 499 cents...
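The idea in this excerpt can be sketched in a few lines. The following is a minimal illustration, not the original post's code (which was written in R and is truncated here): the denomination set `DENOMINATIONS` and the function names `min_coins` and `expected_usage` are assumptions made for this example. It uses dynamic programming, so the result is minimal even for coin systems where a greedy choice would fail.

```python
from collections import Counter

# Assumed denominations in cents; the original post's list is not shown.
DENOMINATIONS = (1, 5, 10, 25, 100)


def min_coins(amount, denominations=DENOMINATIONS):
    """Return a Counter {coin: count} that pays `amount` with the fewest coins."""
    # best[a] is the fewest coins needed for a cents (None if unreachable);
    # last[a] is one coin used in an optimal solution for a cents.
    best = [0] + [None] * amount
    last = [0] * (amount + 1)
    for a in range(1, amount + 1):
        for coin in denominations:
            if coin <= a and best[a - coin] is not None:
                candidate = best[a - coin] + 1
                if best[a] is None or candidate < best[a]:
                    best[a] = candidate
                    last[a] = coin
    if best[amount] is None:
        raise ValueError(f"cannot make {amount} from {denominations}")
    # Walk back through `last` to recover the coins actually used.
    used = Counter()
    while amount > 0:
        used[last[amount]] += 1
        amount -= last[amount]
    return used


def expected_usage(low=1, high=499, denominations=DENOMINATIONS):
    """Average count of each coin when the amount is uniform on [low, high]."""
    totals = Counter()
    for amount in range(low, high + 1):
        totals += min_coins(amount, denominations)
    n = high - low + 1
    return {coin: totals[coin] / n for coin in denominations}


if __name__ == "__main__":
    for coin, avg in expected_usage().items():
        print(f"{coin:>3} cent coin: {avg:.3f} per transaction on average")
```

Swapping in a different tuple for `DENOMINATIONS` is enough to compare coin systems under the same uniform draw.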

July 5, 2014
The latest version of rNOMADS is now available on CRAN.  This update resolves several minor bugs and one major one involving multiple variable/level selections when using the ModelGrid function.  I have also added support for two more models on NOMADS:  Climate Forecast System Flux Products and Climate Forecast System 3D Pressure Products.  This brings the

## RDataMining group having 6000 members today

July 4, 2014
RDataMining Group: http://group.rdatamining.com Twitter: @RDataMining Website: http://www.RDataMining.com The RDataMining group has 6000 members today, 5 July 2014. Created in August 2011, this group has developed into a big community with 6000 members within three years. Since its creation, many members …

## Automatic bias correction doesn’t fix omitted variable bias

July 4, 2014
Page 94 of Gelman, Carlin, Stern, Dunson, Vehtari, Rubin “Bayesian Data Analysis” 3rd Edition (which we will call BDA3) provides a great example of what happens when common broad frequentist bias criticisms are over-applied to predictions from ordinary linear regression: the predictions appear to fall apart. BDA3 goes on to exhibit what might be considered...

## Finding the distance from ChIP signals to genes

July 4, 2014
I’ve had a couple of months off from blogging. Time for some computer-assisted biology! Robert Griffin asks on Stack Exchange about finding the distance between HP1 binding sites and genes in Drosophila melanogaster.  We can get a rough idea with some public chromatin immunoprecipitation data, R and the wonderful BEDTools. Finding some binding sites There

## Two handy documents for making good UK maps

July 4, 2014
Everybody loves a good map. Even if you don’t have any reason to make one, your boss will love it when you do, so check this out and get yourself a pay rise (possibly). First, this set of diagrams via …

## The dendextend package for visualizing and comparing trees of hierarchical clusterings (slides from useR!2014)

July 3, 2014
This week at useR! 2014 I presented my package dendextend (also on GitHub), for easily manipulating, visualizing, and comparing dendrograms. Put simply, it is a package designed to make it easy to create figures of hierarchical clustering trees. You are also invited to take a look at the current version of the package vignettes: https://github.com/talgalili/dendextend/blob/master/vignettes/dendextend-tutorial.pdf I

## Women Graduates in Math, Statistics, and Computer Information Systems

July 3, 2014
One of the more interesting talks at this year’s useR! Conference was the heR Panel discussing the role of women in the R community. They estimate that fewer than 15% of package authors are women. One of the points brought up was that this is less than the percentage of women in statistics. Perhaps this is more...

## Efficient Ragged Arrays in R and Rcpp

July 3, 2014
When is R Slow, and Why? Computational speed is a common complaint lodged against R. Some recent posts on r-bloggers.com have compared the speed of R with some other programming languages, and showed the favorable impact of the new compiler package on run-times. I and others have written about using Rcpp to easily write C++...

## useR! 2014 Highlights

July 3, 2014
My talk went well; here are the slides and a link to the paper pre-print. Hadley Wickham gave an excellent tutorial on dplyr. Based on the talk I saw, I think I will take the data sets from the book and make some public visualizations on the Plotly we...

## Currency Exchange Rate Forecasting with ARIMA and STL

July 3, 2014
I have made an example of time series forecasting with R, demonstrating currency exchange rate forecasting with the ARIMA and STL models. The example is easy to understand and follow. R source files are provided to run the example. The …

## How to Remember the Poisson Distribution

July 3, 2014
The Poisson cumulative distribution function (CDF) $$F(\alpha,n) = \sum_{k=0}^n \dfrac{\alpha^k}{k!} \; e^{-\alpha} \label{eqn:pcdf}$$ is the probability of at most $n$ events occurring when the average number of events is $\alpha$, i.e., $\Pr(X \le n)$. Since \eqref{eqn:pcdf} is a probability function, it cannot have a value greater than 1. In R, the CDF is given by the...
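As a quick check on the formula, the sum can be evaluated term by term. This is a minimal sketch in Python rather than the R function the excerpt goes on to mention (that part is truncated here); the name `poisson_cdf` is an assumption for this example. Each term is built from the previous one, which avoids computing large factorials directly.

```python
import math


def poisson_cdf(alpha, n):
    """Pr(X <= n) for X ~ Poisson(alpha), summing alpha^k / k! * exp(-alpha)."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if n < 0:
        return 0.0
    term = math.exp(-alpha)  # k = 0 term
    total = term
    for k in range(1, int(n) + 1):
        term *= alpha / k  # alpha^k / k! from alpha^(k-1) / (k-1)!
        total += term
    # Guard against floating-point overshoot, since a probability cannot exceed 1.
    return min(total, 1.0)


if __name__ == "__main__":
    print(poisson_cdf(2.0, 3))
```

For large `n` relative to `alpha` the value approaches 1, matching the bound stated above.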

## Beer and Pie | rCharts pie charts with d3pie

July 3, 2014
In honor of the 4th of July, I thought a quick example of a pie chart on beer using the wonderful new d3pie library would be appropriate.  The rCharts binding with d3pie is simply an experiment now, but expect more in the near future.   Using slidify...

## UseR! 2014 Tutorials

July 3, 2014
by Joseph Rickert. useR! 2014 got under way this past Monday with a very impressive array of tutorials delivered on the day that the conference organizers were struggling to cope with a record-breaking crowd. My guess is that conference attendance is somewhere in the 700 range. Moreover, this is the first year that I can remember that tutorials were...

## useR 2014 Slides for PSAboot and Version 1.1 on CRAN

July 3, 2014
PSAboot is an R package to assist with bootstrapping propensity score methods. I gave a talk today at the useR! 2014 Conference. The slides can be downloaded from the PSAboot Github page or directly here. The package is described at jason.bryer.org/PSA...

## FRAMA Part III: Avoiding Countertrend Trading — A First Attempt

July 2, 2014
This post will begin to experiment with long-term directional detection using relationships between two FRAMA indicators. By observing the relationship …

## F1 Doing the Data Visualisation Competition Thing With Tata?

July 2, 2014
Sort of via @jottevanger, it seems that Tata Communications announces the first challenge in the F1® Connectivity Innovation Prize to extract and present new information from Formula One Management’s live data feeds. (The F1 site has a post Tata launches F1® Connectivity Innovation Prize dated “10 Jun 2014”? What’s that about then?) Tata Communications are

## Revolution Analytics: the R company since 2007

July 2, 2014
Revolution Analytics, founded in 2007, was the first company devoted to the R project. Since then, we've been behind several R initiatives, including the RHadoop project and the network of R user groups around the world. I gave this short presentation today at the useR! 2014 conference in Los Angeles with some of the highlights from Revolution Analytics from...

## Using Biplots to Map Cluster Solutions

July 2, 2014
FactoMineR is a quick and easy R package for generating biplots, such as the following plot showing the columns as arrows with the rows to be added later as points. As you might recall from a previous post, a biplot maps a data matrix by plotting both ...

## Short course: Bayesian methods in health economics

July 2, 2014
Chris, Richard and I tested this last March in Canada (see also here) and things seem to have gone quite well. So we have decided to replicate the experiment (so that we can get a bigger sample size!) and do the short course this coming November (...

## 2014 UseR conference, days 1-2

July 2, 2014
I’m at UCLA for the UseR Conference. I attended once before, and I really enjoyed it. And I’m really enjoying this one. I’m learning a ton, and I find the talks very inspiring. In my comments below, I give short shrift to some speakers (largely by not having attended their talks), and I’m critical in

July 1, 2014
translateR is the new service from German-based R specialist eoda, which helps users to translate SPSS® code to R automatically. Today we presented translateR at useR! 2014 in L.A., the world’s most popular conference for the R statistical language. translateR allows a fast and easy migration from SPSS® to R. The manual translation of

## recycling accept-reject rejections (#2)

July 1, 2014
Following yesterday’s post on Rao’s, Liu’s, and Dunson’s paper on a new approach to intractable normalising constants, and taking advantage of being in Warwick, I tested the method on a toy model, namely the posterior associated with n Student’s t observations with unknown location parameter μ and a flat prior, which is “naturally” bounded by

## Parallel computing in R

July 1, 2014
Roughly a year ago I published an article about parallel computing in R here, in which I compared computation performance among 4 packages that provide R with parallel features, since R is essentially single-threaded. Parallel computing is incredibly useful, but not everything is worth distributing across as many cores as possible. Actually,

## Win a free copy of a new video course on ggplot2 and Shiny!

July 1, 2014
Noticed all these posts on r-bloggers about ggplot2 and shiny? Do you want in? My course “Building Interactive Graphs with ggplot2 and Shiny” (published by Packt Publishing) covers those 2 packages in a series of 40 videos, each one dedicated …

## How To: 20 Minute Guide to Get Started with PivotalR

July 1, 2014
In this article, Pivotal engineer and predictive analytics expert Hai Qian explains how someone new to R can get started performing statistical analysis on data stores in Greenplum Database, Pivotal HD and PostgreSQL in just 20 minutes using PivotalR. First, there is some background on R’s popularity, then the article dives into important topics such as installation, data loading,...

## Quantitative Finance applications in R – 7: Constructing a Term Structure of Interest Rates Using R (part 2 of 2)

July 1, 2014
by Daniel Hanson Recap and Introduction Last time in part 1 of this topic, we used the xts and lubridate packages to interpolate a zero rate for every date over the span of 30 years of market yield curve data. In this article, we will look at how we can implement the two essential functions of a term structure:...