## Building a custom database of country time-series data using Quandl

May 8, 2013

Encouraged by this post, I had another look at Quandl for collecting datasets from different agencies. Right now I need to get data for four countries on a couple of dozen indicators. This graphic is just a quick example, with only two indicators, of what I am aiming to be able to do. The process...

## An accept-reject sampler using RcppArmadillo::sample()

May 8, 2013

The recently added RcppArmadillo::sample() functionality provides the same algorithm used in R’s sample() to Rcpp-level code. Because R’s own sample() is written in C with minimal work done in R, writing a wrapper around RcppArmadillo::sample() to then call in R won’t get you much of a performance boost. However, if you need to repeatedly call sample(), then calling a...
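
As a rough illustration of the accept-reject idea in plain R (not the Rcpp code from the post, which is not shown in this excerpt), the sketch below draws uniform proposals with `sample()` and keeps each one with probability proportional to its target weight. The function name `accept_reject` and the weights are hypothetical, chosen only for this example.

```r
# Accept-reject sampling from a discrete target on 1..k.
# Proposals come from a uniform sample(); `accept_reject` is a hypothetical helper.
accept_reject <- function(n, weights) {
  k <- length(weights)
  bound <- max(weights)                      # envelope constant for a uniform proposal
  out <- integer(0)
  while (length(out) < n) {
    proposal <- sample(k, size = n, replace = TRUE)  # uniform proposals on 1..k
    u <- runif(n)
    keep <- u < weights[proposal] / bound            # accept with probability w / max(w)
    out <- c(out, proposal[keep])
  }
  out[seq_len(n)]                            # trim to exactly n accepted draws
}

set.seed(1)
draws <- accept_reject(10000, weights = c(1, 2, 3, 4))
round(prop.table(table(draws)), 2)           # should be close to 0.1, 0.2, 0.3, 0.4
```

Proposals are drawn in batches of `n` so that `sample()` is called a handful of times rather than once per draw, which is the kind of repeated-call pattern the teaser suggests moving to compiled code.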

## Gambler’s Run With Shiny

May 8, 2013

I finally had an opportunity to play with Shiny, and I am very impressed. I have created a Github Project so head over there for the source code. There are a number of ways to distribute Shiny apps. If you are running R (and most likely you are if you are reading this), you can download and...

## heatmaps with p-values (2)… coloured according to odds ratio

May 7, 2013

I like heatplots with p-values (or frequencies, or whatever). They are not very conclusive, but pretty anyway. And when it comes to graphs, pretty makes our neurons fire in more interesting ways: neurons like “pretty” graphs. Moreover, observing your data can … Continue reading →

## CAISN

May 7, 2013

Reblogged from Zero to R Hero: Canadian Aquatic Invasive Species Networks Annual General Meeting in Kananaskis, Alberta. May 03, 3:25-5:30. This 2-hour workshop will focus on how and why we do numerical simulation in R. Time permitting, we will also look at how to build and fit likelihood-based statistical models. We ask that you bring your

## New geomorph function to digitize multiple 2d images

Hi Morphometricians! We've enhanced geomorph's ability to continuously digitize multiple specimens' images in 2d, if these are within the same directory. This new function allows one to digitize 2d images without interruption. Thanks to Samuel Brown and Karl Fetter for suggesting the improvement. We will incorporate this function in our next package update. I'm including demonstration code...

## Poisson regression on non-integers

May 7, 2013

In the course on claims reserving techniques, I did mention the use of Poisson regression, even if incremental payments were not integers. For instance, we did consider incremental triangles > source("http://perso.univ-rennes1.fr/arthur.charpentier/bases.R") > INC=PAID > INC=PAID-PAID > INC 3209 1163 39 17 7 21 3367 1292 37 24 10 NA 3871...
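
The teaser's own code is truncated, so the following is only a minimal sketch on simulated data (not the claims triangle from the post). It shows that `glm()` with a Poisson family still fits when the response is positive but not integer-valued (R emits "non-integer" warnings, suppressed here), and that `quasipoisson` gives the same coefficients without those warnings. The variable names and the simulated model are assumptions made for illustration.

```r
set.seed(42)
x <- seq(1, 10, length.out = 50)
# Positive, non-integer response with a log-linear mean (simulated, illustrative)
y <- exp(0.5 + 0.2 * x) * runif(50, min = 0.8, max = 1.2)

# Poisson family: the fit proceeds, but R warns about non-integer values
fit_pois <- suppressWarnings(glm(y ~ x, family = poisson(link = "log")))

# Quasi-Poisson: same mean-variance structure, no integer requirement
fit_quasi <- glm(y ~ x, family = quasipoisson(link = "log"))

rbind(poisson = coef(fit_pois), quasipoisson = coef(fit_quasi))
```

The point estimates agree because both families use the same link and variance function; only the dispersion, and hence the standard errors, differ.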

## R in Insurance: Programme and Abstracts published

May 7, 2013

I am delighted to announce that the programme and abstracts for the first R in Insurance conference at Cass Business School in London, 15 July 2013, have been published. The conference committee received strong abstracts from academia and the industry,...

## SAS Big Data Analytics Benchmark (Part Two)

May 7, 2013

By Thomas Dinsmore. On April 26, SAS published on its website an undated Technical Paper entitled Big Data Analytics: Benchmarking SAS, R and Mahout. In the paper, the authors (Allison J. Ames, Ralph Abbey and Wayne Thompson) describe a recent project to compare model quality, product completeness and ease of use for two SAS products together with open source...

## Eigen-analysis of Linear Model Behavior in R

May 7, 2013

This post is actually about replicating the figures in Otto and Day: A Biologist’s Guide to Mathematical Modeling in Ecology and Evolution. The figures I’m interested in for this post are Figures 9.1 and 9.2 in the chapter ‘General Solutions … Continue reading →
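
For readers unfamiliar with the technique, here is a small generic sketch of eigen-analysis for a discrete-time linear model n(t+1) = M n(t). It does not reproduce the book's figures; the matrix `M` is an arbitrary example chosen so the eigenvalues are easy to verify by hand.

```r
# Arbitrary 2x2 transition matrix (illustrative, not from Otto and Day)
M <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

e <- eigen(M)
e$values     # eigenvalues; the largest one governs long-run growth
e$vectors    # columns are the corresponding eigenvectors

# Iterate the model and compare the realised growth rate with the leading eigenvalue
n <- c(1, 0)
for (t in 1:20) n <- M %*% n
growth <- sum(M %*% n) / sum(n)
growth
```

After enough iterations the per-step growth of the total settles at the leading eigenvalue, and the composition of `n` aligns with the leading eigenvector.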

## DataMind & The R Service Bus @ RBelgium

In two weeks, on Friday, May 24, the RBelgium R user group is holding its next regular meeting in Leuven. The schedule: Jonathan Cornelissen - DataMind. Discover DataMind, a new online learning platform for d...

## Subsetting data

May 6, 2013

At School we use R across many courses, because students are supposed to use statistics in a variety of contexts. Imagine their disappointment when they pass stats and discover that R and statistics haven’t gone away! When students start working with real data sets one of their first stumbling blocks is subsetting data. We have
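
As a quick reminder of what subsetting looks like in practice (a generic sketch using the built-in `mtcars` data, not the examples from the post), the same rows and columns can be selected with bracket indexing or with `subset()`:

```r
# Bracket indexing: logical condition for rows, names for columns
cars4 <- mtcars[mtcars$cyl == 4, c("mpg", "wt")]

# subset(): the condition and the columns are written without quotes or `mtcars$`
cars4_alt <- subset(mtcars, cyl == 4, select = c(mpg, wt))

nrow(cars4)
all.equal(cars4, cars4_alt)
```

`subset()` is convenient interactively, while bracket indexing is usually the safer choice inside functions because it does not rely on non-standard evaluation.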

## Passing columns of a dataframe to a function without quotes

May 6, 2013

I love the syntax of calls to lm and ggplot, wherein the dataframe is specified as a variable and specific columns are referenced as though they were separate variables. While developing some of my functions, I’d wanted to introduce something similar. I often find that I have a single large dataframe and want to execute
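
One common way to get this behaviour, sketched here under the assumption that the post uses something similar (its code is not shown in this excerpt), is to capture the unevaluated argument with `substitute()` and evaluate it inside the data frame with `eval()`. The function name `col_mean` is hypothetical.

```r
# Hypothetical helper: compute the mean of a column (or expression of columns)
# given without quotes, in the style of lm() or ggplot()
col_mean <- function(data, col) {
  expr <- substitute(col)                     # capture the expression, unevaluated
  values <- eval(expr, data, parent.frame())  # look up names in `data` first
  mean(values)
}

col_mean(mtcars, mpg)        # a bare column name
col_mean(mtcars, mpg / wt)   # an expression built from columns
```

Because names are resolved in the data frame first and then in the calling environment, a variable in the caller with the same name as a column will be shadowed by the column.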

## xkcd: Visualized

May 6, 2013

Introduction: It's been said that the ideal job is one you love enough to do for free but are good enough at that people will pay you for it. That if you do what you love no matter what others may say, and if you work at it hard enough, and long enough, eventually people will recognize it and...

## Explaining real-time predictive analytics with big data (video)

May 6, 2013

In my presentation to the Strata Santa Clara 2013 conference earlier this year, my goal was to give a succinct (under 20 minutes!) explanation of three terms that are too often used as mere buzzwords: predictive analytics, real time, and big data. You can download the slides for my presentation, Real-time Big Data Analytics: From Deployment to Production, from...

## Veterinary Epidemiologic Research: Count and Rate Data – Zero Counts

May 6, 2013

Continuing with the examples from the book Veterinary Epidemiologic Research, we look today at modelling counts when the number of zeros may be higher or lower than expected from a Poisson or negative binomial distribution. When there’s an excess of zero counts, you can fit either a zero-inflated model or a hurdle model. If zero

## When the “reorder” function just isn’t good enough…

May 6, 2013

The reorder function, in R 3.0.0, is behaving strangely (or I’m really not understanding something). Take the following simple data frame: `df = data.frame(a1 = c(4,1,1,3,2,4,2), a2 = c("h","j","j","e","c","h","c"))`. I expect that if I call the reorder function on the … Continue reading →
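
For reference, this is what `reorder()` is documented to do with the data frame above: it returns the factor with its levels sorted by a summary (the mean by default) of the second argument within each level. The sketch below only illustrates that documented behaviour; it does not reproduce or resolve the issue the post goes on to describe, which is cut off in this excerpt.

```r
df <- data.frame(a1 = c(4, 1, 1, 3, 2, 4, 2),
                 a2 = factor(c("h", "j", "j", "e", "c", "h", "c")))

levels(df$a2)                        # default alphabetical order: "c" "e" "h" "j"

# Mean of a1 within each level of a2: c = 2, e = 3, h = 4, j = 1
tapply(df$a1, df$a2, mean)

reordered <- reorder(df$a2, df$a1)   # levels sorted by those means
levels(reordered)                    # "j" "c" "e" "h"
```

Note that `reorder()` changes only the order of the levels, not the order of the rows; sorting the data frame itself still requires `order()`.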

## Oracle R Distribution for R 2.15.2 available on public-yum

May 6, 2013

Oracle R Distribution (ORD) for R 2.15.2 on Linux is now available for download from Oracle's public-yum repository.  R 2.15.2 is a maintenance update that includes improved performance and reduced memory usage for some commonly-used functions, increased memory available for data on 64-bit systems, enhanced localization for Polish language users, and a number of bug fixes.  Detailed updates...

## Bayesian and Frequentist Approaches: Ask the Right Question

May 6, 2013

It occurred to us recently that we don’t have any articles about Bayesian approaches to statistics here. I’m not going to get into the “Bayesian versus Frequentist” war; in my opinion, which style of approach to use is less about philosophy, and more about figuring out the best way to answer a question. Once you...

## Incomplete Data by Design: Bringing Machine Learning to Marketing Research

May 6, 2013

Survey research deals with the problem of question wording by always asking the same question. Thus, the Gallup Daily Tracking is filled with examples of moving averages for the exact same question asked precisely the same way every day. ...

## New fixed.angle() Function

Hello morphometricians, below you can find a new fixed angle function addressing the problem discovered by Fabio Machado in the morphmet mail archive. We will include this function in our next scheduled update to geomorph. Cheers, Erik CODE: ...

## Mixed Model Example — Wagner et al. (2006)

May 6, 2013

I am preparing for a workshop on mixed models and looked at the paper “Accounting for multilevel data structures in fisheries data using mixed models” by Wagner et al. (2006) (PDF available here).  Wagner et al. (2006) used two examples, with the … Continue reading →

## Media monitoring 2

May 6, 2013

(This article was first published on Learning Data Science, and kindly contributed to R-bloggers) A quick monitoring update from our media observatory on Twitter. Covered: Mediapart, Le Monde, Le Figaro, Le Parisien, and an overall view. The code used to produce this post: ...

## Creating a QGIS-Style (qml-file) with an R-Script

May 6, 2013

How to get from a txt-file with short names and labels to a QGIS-Style (qml-file)? I used the below R-script to create a style for this legend table where I copy-pasted the parts I needed to a txt-file, like for the WRB-FULL (WRB-FULL: Full soil code o...

## The half variance approximation for mean returns

May 6, 2013

What’s that thing about arithmetic and geometric returns and the variance? Previously: an introduction to the difference between simple and log returns is “A tale of two returns”. Issue: suppose you are predicting the mean annual return of an asset for some number of years. To simplify the discussion, let’s buy into the fantasy that … Continue reading
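
A small numerical sketch of the approximation, on simulated returns rather than anything from the post: the mean log return is roughly the mean simple return minus half the variance. Using the variance of the simple returns here is an assumption, since the teaser does not say which variance the post uses, and the parameter values are arbitrary.

```r
set.seed(123)
# Simulated annual log returns (illustrative parameters)
log_ret <- rnorm(1e5, mean = 0.05, sd = 0.2)
simple_ret <- exp(log_ret) - 1

# Half variance approximation: mean log return ~ mean simple return - variance / 2
approx_log_mean <- mean(simple_ret) - var(simple_ret) / 2

c(actual = mean(log_ret), approx = approx_log_mean)
```

The two numbers should agree closely for moderate volatility; the approximation degrades as the variance grows, because higher-order terms are ignored.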

## analyze the social security administration public use microdata files (ssapumf) with r

May 5, 2013

the social security administration (ssa) must be overflowing with quiet heroes, because their public-use microdata files are as inconspicuous as they are thorough.  sure, ssa publishes enough great statistical research of their own that outside re...

## Google Analytics + R = FUN!

May 5, 2013

The scope of this post is to show how simple it is to get data out of Google Analytics and create your own reports (which you hope can be at least semi-automated) and your favourite statistical graphs (those that GA is currently missing). As you already know, R is a favourite tool

