A Note on Tweedie

October 9, 2014

by Joseph Rickert. In a recent post I talked about the information that can be developed by fitting a Tweedie GLM to a 143 million record version of the airlines data set. Since I started working with them about a year or so ago, I now see Tweedie models everywhere. Basically, any time I come across a histogram that...

In case you missed it: September 2014 Roundup

October 8, 2014

In case you missed them, here are some articles from September of particular interest to R users. Norm Matloff argues that T-tests shouldn't be part of the Statistics curriculum and questions the "star system" for p-values in R. A nice video introduction to the dplyr package and the %>% operator, presented by Kevin Markham. An animation of police militarization...

Data analysis the data.table way: introducing DataCamp’s newest course

October 8, 2014

Together with the key people behind the data.table package, Matt Dowle and Arun Srinivasan, DataCamp developed a brand new interactive course to bring your data analysis skillset up to date with the essentials of the powerful data.table package. Learn more… The popularity of the data.table package is increasing and with good reason. Not only is the number

Structural “Arbitrage”: a Working Long-History Backtest

October 8, 2014

For this post, I would like to give my sincere thanks to Mr. Helmuth Vollmeier, for providing the long history … Continue reading →

Responsive SVG in Your RStudio Browser

October 8, 2014

For those readers who are unaware, SVG is absolutely amazing, and if you need some convincing see this 2009 paper/talk from David Dailey, Why is SVG Going to Be REALLY BIG? Most R users should be very well acquainted with graphics and plots magically ...

Slice bivariate densities, or the Joy Division “waterfall plot”

October 8, 2014

This has been on my to-do list for a long old time. Lining up slices through a bivariate density seems a much more intuitive way of depicting it than contour plots or some ghastly rotating 3-D thing (urgh). Of course, … Continue reading →

Julia style string literal interpolation in R

October 8, 2014

I feel like a sculptor who has been using the same metal tools for the last four years and happened to have looked at my comrades and found them sporting new, sleek electric tools. Suddenly all of the hard work put into maintaining and adapting my meta...

Plot Me Like a Hurricane (a.k.a. animating historical North Atlantic basin tropical storm tracks)

October 7, 2014

Markus Gesmann (@MarkusGesmann) did a beautiful job Visualising the seasonality of Atlantic windstorms using small multiples, which was inspired by both a post by Arthur Charpentier (@freakonometrics) on using Markov spatial processes to “generate” hurricanes—which was tweaked a bit by Robert Grant (@robertstats)—and Gaston Sanchez’s Visualizing Hurricane Trajectories RPub. I have some history with hurricane

Fitting Lasso with Julia

October 7, 2014

Julia Code R Code

Predicting Monthly Car Sales: The Residuals are the Story

October 7, 2014

I'll produce predictions for US car sales by manufacturer every month. There are already several blogs that describe the industry and sales that do a great job. Autoblog by the Numbers and Counting Cars are some to mention. Unli...

Efficiently Adding Gabelhouse Lengths and Relative Weights to a data.frame (using dplyr)

October 7, 2014

In this post on RPubs, I demonstrate how to use new functions (psdAdd() and wrAdd()) in the FSA package, along with functions in the dplyr package, to efficiently add Gabelhouse length category and relative weight variables for all species in … Continue reading →

The Generalized Lambda Distribution and GLDEX Package: Fitting Financial Return Data

October 7, 2014

by Daniel Hanson, with contributions by Steve Su (author of the GLDEX package). Part 1 of a series. Introduction As most readers are well aware, market return data tends to have heavier tails than that which can be captured by a normal distribution; furthermore, skewness will not be captured either. For this reason, a four parameter distribution such as...

Lot of reports with a single click!

October 7, 2014

Suppose you want to create a huge number of pdf files t

Part 2 of Who We Are: Society for Judgment and Decision Making (SJDM)

October 7, 2014

An analysis of where the SJDM members are from in the world. The post Part 2 of Who We Are: Society for Judgment and Decision Making (SJDM) appeared first on Decision Science News.

randomness in coin tosses and last digits of prime numbers

October 7, 2014

A rather intriguing note that was arXived last week: it is essentially one page long and it compares the power law of the frequency range for the Bernoulli experiment with the power law of the frequency range for the distribution of the last digits of the first 10,000 prime numbers to conclude that the power

Visualising the seasonality of Atlantic windstorms

October 7, 2014

Last week Arthur Charpentier sketched out a Markov spatial process to generate hurricane trajectories. Here, I would like to take another look at the data Arthur used, but focus on its time component. According to the Insurance Information Institute, a normal season, based on averages from 1980 to 2010, has 12 named storms, six hurricanes and...

Popular Mutual Funds Decomposed With Ekholm (2014)

October 6, 2014

While we have a foundation and momentum from the last post “SelectionShare & TimingShare | Masterfully Written by Delightfully Responsive Author”, we can run the Ekholm calculations on some popular funds to see how they have evolved since the early 1980s. Remember these are my opinions and not investment advice. I chose these four funds for ...

The World We Live In #1: Obesity And Cells

October 6, 2014

Lesson learned, and the wheels keep turning (The Killers – The world we live in) I discovered this site with a huge amount of data waiting to be analyzed. The first thing I’ve done is this simple graph, where you can see the relationship between cellular subscribers and obese people. Bubbles are countries and its size

The winds of Winter [Bayesian prediction]

October 6, 2014

A surprising entry on arXiv this morning: Richard Vale (from Christchurch, NZ) has posted a paper about the characters appearing in the yet hypothetical next volume of George R.R. Martin’s Song of ice and fire series, The winds of Winter. Using the previous five books in the series

R as a general-purpose language for creating DSLs

October 6, 2014

As a computer scientist, RStudio's Joe Cheng has some great insights into the R language and how it compares with other programming languages. In the interview with DataScience.LA below, he notes that while R is often thought about as a domain-specific language (or DSL), the combination of a functional language with deferred evaluation of functional arguments actually makes it...

New version of pqR with faster variable lookup, faster subset replacement, and more

October 6, 2014

I’ve released a new version, pqR-2014-09-30, of my speedier, “pretty quick”, implementation of R, with some major performance improvements, and some features from recent R Core versions. It also has fixes for bugs (some also in R-3.1.1) and installation glitches. Details are in pqR NEWS. Here I’ll highlight some of the more interesting improvements. Faster variable lookup. In both pqR

7 new R jobs (for October 6th 2014)

October 6, 2014

This is the bimonthly R Jobs post (for 2014-10-06), based on the R-bloggers’ sister website: R-users.com. If you are an employer who is looking to hire people from the R community, please visit this link to post a new R job (it’s free, and registration takes less than 10 seconds). If you are a job seeker, please follow the links below to learn more and apply for your job of interest (or visit previous...

A Conversation with Joe Cheng at useR! 2014

October 6, 2014

Joe Cheng is a software engineer. Unfortunately the term gets thrown around pretty lightly, so...

A bit more fragmented

October 6, 2014

This year's election renders an even more fragmented legislature. The way political scientists measure this is by applying an algorithm to calculate the Effective Number of Parties, which is a measure that helps to go beyond the simple number of parties. A widely accepted algorithm was proposed by M. Laakso and R. Taagepera … Read More...

Building a DGA Classifier: Part 3, Model Selection

October 6, 2014

This is part three of a three-part blog series on building a DGA classifier and it is split into the three phases of building a classifier: 1) Data preparation 2) Feature engineering and 3) Model selection (this post) Back in part 1, we prepared the data and we are starting with a nice clean list of domains labeled as either legitimate (“legit”) or generated by an algorithm (“dga”)....

TBATS with regressors

October 5, 2014

I’ve received a few emails about including regression variables (i.e., covariates) in TBATS models. As TBATS models are related to ETS models, tbats() is unlikely to ever include covariates as explained here. It won’t actually complain if you include an xreg argument, but it will ignore it. When I want to include covariates in a

Monte Carlo simulation and resampling methods for social science [book review]

October 5, 2014

Monte Carlo simulation and resampling methods for social science is a short paperback written by Thomas Carsey and Jeffrey Harden on the use of Monte Carlo simulation to evaluate the adequacy of a model and the impact of assumptions behind this model. I picked it up in the library the other day and browsed through the

Bayes of thrones

October 5, 2014

My friend and colleague Andreas sent me a link to a working paper published by a statistician at the University of Christchurch (New Zealand) and discussed here. The main idea of the paper was to use a Bayesian model to predict the number of futur...

Bayes models from SAS PROC MCMC in R, post 2

October 5, 2014

This is my second post on converting SAS's PROC MCMC examples to R. The task this week is determining the transformation parameter in a Box-Cox transformation. SAS only determines Lambda, but I am not so sure about that. What I used to do was get an ...
