Estimating Ages from First Names Part 2 – Using Some Morbid Test Data

July 31, 2013

In my last post, I wrote about how I compiled a US Social Security Administration data set into something usable in R, and mentioned some issues scaling it up to be usable for bigger datasets. I also mentioned the need …

Read more »

A dirty hack for importing packages that use Depends

July 31, 2013

2013-05-27. Scope: This article is about R package development. Motivation: As stated in the Writing R Extensions manual and the Software for Data Analysis book (aka the R bible), packages should whenever possible use Imports instead of Depends, to avoid name collisions (masking) and ensure trustworthy computations. See

Read more »

I made a mistake, please don’t shoot me

July 31, 2013

The major difference between commercially and academically written software is the handling of user mistakes or, to be more exact, what is considered to be a user mistake. In the commercial world the emphasis is on keeping the customer happy, which translates into trying hard to gracefully handle any ‘mistake’ the user makes. Academic software is generally

Read more »

Butler Analytics: Real Analysts use R

July 31, 2013

In an overview of several predictive analytics platforms (including SPSS, Oracle and SAS), Butler Analytics offers this 4.5/5 star review of Revolution R Enterprise: Real analysts use R – well it sounds a bit macho, but actually there is some truth in it. R is the most widely used, and arguably the most powerful analysis software on the planet....

Read more »

Measuring Bias in Published Work

July 31, 2013

In a series of previous posts, I’ve spent some time looking at the idea that the review and publication process in political science—and specifically, the requirement that a result must be statistically significant in order to be scientifically notable or publishable—produces a very misleading scientific literature. In short, published studies of some relationship will tend

Read more »

Trivial, but useful: sequences with defined mean/s.d.

July 31, 2013

OK, the following post may be (mathematically) trivial, but it could be somewhat useful for people who do simulations or testing of statistical methods. Let’s say we want to test the dependence of p-values derived from a t-test on a) the ratio of means between two groups, b) the standard deviation or c) the sample size(s) of the

Read more »
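
The teaser above does not show how the post builds its sequences, so the following is only a minimal sketch of one common approach: standardize any numeric vector, then rescale it so that its sample mean and standard deviation hit the requested targets. The function name `rescale_to` and its arguments are hypothetical and not taken from the original post.

```r
# Sketch (assumption, not the original author's code): force a vector to have
# an exact sample mean and sample standard deviation.
rescale_to <- function(x, target_mean, target_sd) {
  stopifnot(is.numeric(x), length(x) >= 2, sd(x) > 0)
  # (x - mean(x)) / sd(x) has mean 0 and sd 1; shift and stretch from there.
  target_mean + target_sd * (x - mean(x)) / sd(x)
}

set.seed(1)
x <- rescale_to(rnorm(20), target_mean = 10, target_sd = 2)

# Both values match the targets up to floating-point error.
c(mean = mean(x), sd = sd(x))
```

With groups built this way, the ratio of means, the standard deviation and the sample size can each be varied on their own before passing the vectors to `t.test()`.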

R in Insurance: Presentations are online

July 31, 2013

The programme and the presentation files of the first R in Insurance conference have been published on GitHub. In addition to the slides, many presenters have made their R code available as well: Alexander McNeil shared the examples of the CreditRisk+ model he presented. Lola Miranda made a...

Read more »

R ecology workshop

July 31, 2013

After my presentation yesterday to a group of grad students on R resources, I did a presentation today on intro to R data manipulation, visualizations, and analyses/visualizations of bipartite networks and community-level analyses (diversity, rarefactio...

Read more »

bdvis development version available for early feedback

July 30, 2013

Google Summer of Code 2013 is halfway through. Midterm evaluations are underway. I thought this is a good logical point for us to share what we have been doing for the Biodiversity Data Visualizations in R project and open up the package for testing and some early feedback. We have named the package bdvis.

Read more »

LondonR lightning talk: Audiblization / sonification of data

July 30, 2013

It looks like the next LondonR meeting on 10 September 2013 will involve a series of five-minute lightning talks rather than a few half-hour slots. I have proposed “Audiblization / sonification of data: what are people doing and …

Read more »

Occupancy model fit & AUC

July 30, 2013

Occupancy models are used to understand species distributions while accounting for imperfect detection. In this post, I’ll demonstrate a method to evaluate the performance of occupancy models based on the area under a receiver operating characteristic curve (AUC), as published last year by Elise Zipkin and colleagues in Ecological Applications. Suppose we are to fit a multi-year occupancy...

Read more »
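
As a rough illustration of the AUC idea mentioned in the teaser above, here is a minimal base-R sketch that computes the area under the ROC curve from predicted occupancy probabilities and known presence/absence states, using the rank (Mann-Whitney) formulation. It is not the procedure from the cited paper or the original post; the function name `auc_rank` and the toy vectors `psi_hat` and `z` are hypothetical.

```r
# Sketch (assumption): AUC as the probability that a randomly chosen occupied
# site receives a higher predicted probability than a randomly chosen
# unoccupied site. Ties are handled through midranks.
auc_rank <- function(score, label) {
  stopifnot(length(score) == length(label), all(label %in% c(0, 1)))
  n1 <- sum(label == 1)  # number of occupied sites
  n0 <- sum(label == 0)  # number of unoccupied sites
  stopifnot(n1 > 0, n0 > 0)
  r <- rank(score)       # average ranks for tied scores
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy data: predicted occupancy probabilities and true occupancy states.
psi_hat <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
z       <- c(1,   1,   1,    0,   0,   0)

auc_rank(psi_hat, z)  # 8 of the 9 occupied/unoccupied pairs are ordered correctly
```

An AUC of 0.5 corresponds to predictions no better than chance, and 1 to perfect separation of occupied from unoccupied sites.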

A Chart of Recent Comrades Marathon Winners

July 30, 2013

Continuing on my quest to document the Comrades Marathon results, today I have put together a chart showing the winners of both the men’s and ladies’ races since 1980. The analysis started off with the same data set that I was working with before, from

Read more »

Basic Proof of Concept: Save an R dataframe as a Tableau Data Extract

July 30, 2013

I love Tableau, as it is a huge part of my data workflow. Not only is it super easy to slice and dice data, but the company recently released an API for developers. In short, we can use a few languages (Python and C++/Java) to build Data Extracts, the super fast back-end that makes using

Read more »

Crowdfunding the Ubuntu Edge will fail

July 30, 2013

Tuesday 30 July 2013 - 14:14. Although the Ubuntu Edge has been record-breaking in its crowdfunding efforts, the fundraising campaign will fall well short of its 32 million dollar target. Just to be clear, I hope to be proven wrong! I backed the Ubuntu Edge Indiegogo fundraiser at the $625 level...

Read more »

Interfacing R and Google maps

July 30, 2013

A couple of weeks ago I had an idea for a website where people can collaborate to create the first real Audio Atlas, using the power of the Google Maps API. The problem was that I do some programming in R but knew very little about HTML and JavaScript. However, I knew that having...

Read more »

Visualizing Book Sentiments

July 30, 2013

Sentiment analysis of social media content has become pretty popular of late, and a few days ago, as I lay in bed, I wondered if we could do the same thing to books - and see how sentiments vary through the story. The answer, of course, was that yes, we could. And if you’d rather just...

Read more »

R resources

July 30, 2013

I'm doing a presentation today to grad students on R resources. I have been writing HTML presentations recently, but some great tools are now available to convert text that is easy to read and write to presentations. RStudio has something called R pr...

Read more »

How divided is the Senate?

July 29, 2013

I very seldom pay attention to politics directly, because politics have always seemed a bit circular and cyclical to me. Most of the political news that I take in ends up worming its way into the news sources that I do consume, like the excellent longform.org. Even given my limited intake of political news, one trend that I...

Read more »

Stop Loss

July 29, 2013

Today I want to share and present an example of the flexible Stop Loss functionality that I added to the Systematic Investor Toolbox. Let’s examine a simple Moving Average Crossover strategy: a buy is triggered once the fast moving average crosses above the slow moving average, and a sell is triggered once the fast moving average crosses below the slow

Read more »
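
The crossover rules in the teaser above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: it does not use the Systematic Investor Toolbox, it does not implement the stop loss itself, and the helper names `sma` and `crossover_signal`, the window lengths and the toy prices are all hypothetical.

```r
# Sketch (assumption): trailing simple moving average over the last n prices.
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

# Label each bar "buy", "sell" or "hold" according to the crossover rules.
crossover_signal <- function(price, n_fast, n_slow) {
  stopifnot(n_fast < n_slow, length(price) >= n_slow)
  fast  <- sma(price, n_fast)
  slow  <- sma(price, n_slow)
  above <- fast > slow                 # NA until the slow average is defined
  prev  <- c(NA, head(above, -1))      # state on the previous bar
  signal <- rep("hold", length(price))
  signal[!is.na(prev) & !prev & above] <- "buy"   # fast crosses above slow
  signal[!is.na(prev) & prev & !above] <- "sell"  # fast crosses below slow
  signal
}

price  <- c(10, 9, 8, 7, 8, 10, 12, 11, 8, 6)
signal <- crossover_signal(price, n_fast = 2, n_slow = 3)
data.frame(price, signal)
```

In this toy series the buy lands on bar 6 and the sell on bar 9. A stop loss would add a further exit rule on top of these signals.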

“I like this concept of “low volatility, interrupted by occasional periods of high volatility”. I…”

July 29, 2013

“I like this concept of “low volatility, interrupted by occasional periods of high volatility”. I think I will call it “volatility”.” - Daniel Davies via nonergodic (PS: If you didn’t see

Read more »

Quandl.com for Time Series Datasets

July 29, 2013

If you want to dig in with both feet on time series data, then quandl.com is a good choice. The website claims to have several million datasets, all of them available for free download. It also allows you to upload data to the site with an a...

Read more »
