## Bayesian computational tools

June 17, 2013

I just updated the short review on Bayesian computational tools that I first wrote in April for the Annual Review of Statistics and Its Application. The coverage is quite restricted, as I took advantage of two phantom papers I had started a while ago, one with Jean-Michel Marin, on hierarchical Bayes methods and on ABC. (As

## Dave Harris on Maximum Likelihood Estimation

June 17, 2013

At our last Davis R Users’ Group meeting of the quarter, Dave Harris gave a talk on how to use the bbmle package to fit mechanistic models to ecological data. Here’s his script, which I ran through the spin function in knitr: # Load data library(emdbook) ## Loading required package: MASS Loading required package: lattice library(bbmle) ## Loading required package:...

## Oracle R Connector for Hadoop 2.1.0 released

June 17, 2013

(This article was first published on Oracle R Enterprise, and kindly contributed to R-bloggers) Oracle R Connector for Hadoop (ORCH), a collection of R packages that enables Big Data analytics using HDFS, Hive, and Oracle Database from a local R environment, continues to make advancements. ORCH 2.1.0 is now available, providing a flexible framework while remarkably improving performance and...

## Model Selection in Bayesian Linear Regression

June 17, 2013

Previously I wrote about performing polynomial regression and also about calculating marginal likelihoods. The data in the former and the calculations of the latter will be used here to exemplify model selection. Consider data generated by and suppose we wish to fit a polynomial of degree 3 to the data. There are then 4 regression The post Model...

## Stashing and playing with raw data locally from the web

June 17, 2013

It is getting easier to get data directly into R from the web. Often R packages that retrieve data from the web return useful R data structures to users like a data.frame. This is a good thing of course to make things user friendly. However, what if you want to drill down into the data that's returned from a query...

## analyze the pesquisa de orcamentos familiares (pof) with r

June 17, 2013

for the unlucky among us born without a portuguese mother tongue, the pesquisa de orcamentos familiares (pof) translates to survey of household budgets.  this data set captures brazilian family consumption habits, allocation of expenses, and incom...

## Annotating select points on an X-Y plot using ggplot2

June 16, 2013

or, Is the Seattle Mariners outfield a disaster? The Backstory: Earlier this week (2013-06-10), a blog post by Dave Cameron appeared at USS Mariner under the title “Maybe It's Time For Dustin Ackley To Play Some Outfield”. In the first paragraph, Cameron describes the Seattle Mariners outfield this season as “a complete disaster” and Raul Ibanez as...

## Exploratory Data Analysis: Combining Box Plots and Kernel Density Plots into Violin Plots for Ozone Pollution Data

Introduction: Recently, I began a series on exploratory data analysis (EDA), and I have written about descriptive statistics, box plots, and kernel density plots so far. As previously mentioned in my post on box plots, there is a way to combine box plots and kernel density plots. This combination results in violin plots, and I

## Dynamic Data Visualizations in the Browser Using Shiny

June 16, 2013

After being busy the last two weeks teaching and attending academic conferences, I finally found some time to do what I love, program data visualizations using R. After being interested in Shiny for a while, I finally decided to pull the trigger and build my first Shiny app! I wanted to make a proof of

## General Regression Neural Network with R

June 16, 2013

Similar to the back-propagation neural network, the general regression neural network (GRNN) is a good tool for function approximation in the modeling toolbox. Proposed by Specht in 1991, the GRNN has the advantages of instant training and easy tuning. A GRNN is formed instantly with just a single pass over the development data.
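
To make the one-pass idea concrete, here is a minimal base-R sketch; it is not the code from the original post. It assumes the standard GRNN formulation, in which a prediction is a Gaussian-kernel weighted average of the training targets. The function name `grnn_predict` and the smoothing parameter `sigma` are hypothetical names chosen for illustration.

```r
# Minimal GRNN sketch: "training" only stores the data, and each prediction
# is a Gaussian-kernel weighted average of the stored targets.
# grnn_predict and sigma are illustrative names, not from the original post.
grnn_predict <- function(x_train, y_train, x_new, sigma = 0.5) {
  x_train <- as.matrix(x_train)
  x_new <- as.matrix(x_new)
  apply(x_new, 1, function(x) {
    d2 <- rowSums(sweep(x_train, 2, x)^2)  # squared distance to each stored pattern
    w <- exp(-d2 / (2 * sigma^2))          # Gaussian kernel weights
    sum(w * y_train) / sum(w)              # weighted average of the targets
  })
}

# Toy data: a noisy sine curve
set.seed(1)
x <- seq(0, 2 * pi, length.out = 50)
y <- sin(x) + rnorm(50, sd = 0.1)

pred <- grnn_predict(x, y, x_new = c(pi / 2, pi), sigma = 0.3)
print(pred)
```

The only quantity to tune in this sketch is `sigma`: smaller values follow the training data more closely, larger values smooth more heavily.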

## Scenario analysis and trading options using R

June 16, 2013

I present you with my restructured project on options trading and scenario analysis. You are more than welcome to try it out. Firstly, I will give a small presentation that will reveal what you can do with it and whether you need to continue reading. T...

## The scaling of Expected Shortfall

June 16, 2013

Getting Expected Shortfall given the standard deviation or Value at Risk. Previously: There have been a few posts about Value at Risk and Expected Shortfall. Properties of the stable distribution were discussed. Scaling: One way of thinking of Expected Shortfall is that it is just some number times the standard deviation, or some other number … Continue reading...
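
As a rough illustration of the "some number times the standard deviation" idea, the sketch below assumes zero-mean, normally distributed returns. That is an assumption made here for simplicity and is not taken from the post. The names `alpha`, `sigma`, `var_mult`, and `es_mult` are illustrative.

```r
# Under a zero-mean normal assumption, both Value at Risk and Expected
# Shortfall are fixed multiples of the standard deviation.
alpha <- 0.05   # tail probability (illustrative choice)
sigma <- 0.02   # standard deviation of returns (illustrative value)

var_mult <- qnorm(1 - alpha)             # VaR = var_mult * sigma
es_mult  <- dnorm(qnorm(alpha)) / alpha  # ES  = es_mult  * sigma

value_at_risk      <- var_mult * sigma
expected_shortfall <- es_mult * sigma

# The ratio ES / VaR does not depend on sigma under this assumption
es_over_var <- es_mult / var_mult
print(c(VaR = value_at_risk, ES = expected_shortfall, ratio = es_over_var))
```

For heavier-tailed distributions the multipliers would differ, so these numbers should be read as a normal-distribution baseline only.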

## Distribution of car weights

June 16, 2013

Two weeks ago I described car data, including the weight distribution of cars in the Netherlands. At that time it was purely plots. In the meantime I decided I wanted to model trends. As a first step of that, I decided to fit distributions for these da...

## Modeling an Infant’s Feeding Schedule with Periodic Smoothing Splines

June 15, 2013

While on paternity leave I had an opportunity to test out periodic smoothing splines (within the framework of generalized additive models) on an interesting time series: an infant's feeding schedule.

Some days ago H. Wickham (Chief Scientist of the RStudio company) posted an article about the RStudio CRAN mirror with … Continue reading »

## Simulating Map-Reduce in R for Big Data Analysis Using Flights Data

June 14, 2013

At datadolph.in, we are constantly crunching through large amounts of data.  We are designing unique and innovative ways to process large datasets on a single node and use distributed computing only when single node computing become...

## Sudoku Automation Solver Challenge – R

June 14, 2013

On a recent flight I was bored waiting for the plane to land and I tried out the electronic sudoku game that they had offered.  I found the game surprisingly interesting as I realized that it is far more entertaining when you cannot use paper or p...

## The equivalence of the ellipsis argument and an infinite set of closures

June 14, 2013

This post is about a practical application of a topic I discuss in my book. In my book, I prove …Continue reading »

## A list of R packages, by popularity

June 14, 2013

R package developer (and R-bloggers editor) Tal Galili just published the answers to a question many R users have asked: which are the most popular R packages? He wrote some R code to rank the top 100 packages by number of downloads. Here's the top 10: The source data are the download logs from the RStudio CRAN mirror, whose...

## Latent Class Modeling Election Data

June 14, 2013

Latent class analysis is a useful tool for identifying groups within multivariate categorical data, such as responses on a Likert scale. In categorical language these groups are known as latent classes. A simple point of comparison is k-means multivariate cluster analysis. There are several key differences between the

## Interval Estimation of the Population Mean

June 14, 2013

Interval estimation of the population mean can be computed from the functions of the following R packages: stats (contains t.test), TeachingDemos (contains z.test), and BSDA (contains zsum.test and tsum.test). The t.test of the stats package is a stud...
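
As a small illustration using only the stats package, the sketch below computes a 95% confidence interval for the mean with `t.test` and compares it with the textbook formula. The simulated sample and the variable names are made up for this example.

```r
# 95% confidence interval for a population mean, two ways.
# The data are simulated purely for illustration.
set.seed(42)
x <- rnorm(30, mean = 10, sd = 2)

# 1) Using t.test from the stats package
ci <- t.test(x, conf.level = 0.95)$conf.int

# 2) By hand: mean +/- t quantile * standard error
n <- length(x)
manual <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)

print(ci)
print(manual)
```

Both approaches should give the same interval, since `t.test` uses the Student t quantile with `n - 1` degrees of freedom.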

## Modeling an Infant’s Feeding Schedule with Periodic Smoothing Splines

June 13, 2013

While on paternity leave I had an opportunity to test out periodic smoothing splines (within the framework of generalized additive models) on an interesting time series: an infant's feeding schedule. Load / format data and fit GAMs

## Practicing static typing in R: Prime directive on trusting our functions with object oriented programming

June 13, 2013

John Chambers, the creator of the S language from which R is derived, wrote in his book Software for Data Analysis: Programming with R: ...This places an obligation on all creators of software to program in such a way that the computations ca...

## Win Your Fantasy Football Auction Draft: Calculate the Optimal Players to Draft with this Shiny App in R

June 13, 2013

In this post, I use a Shiny app in R to determine the best possible players to pick in a fantasy football auction draft.  The app takes projections from FantasyPros, The post Win Your Fantasy Football Auction Draft: Calculate the Optimal Players to Draft with this Shiny App in R appeared first on Fantasy Football Analytics.

## How big data and statistical modeling are changing video games

June 13, 2013

Bill Grosso presented a fascinating webinar about the video gaming industry today, Knowing How People are Playing Your Game Gives You the Winning Hand. He described how over the past three years, game studios have switched from viewing analytics as a primarily descriptive tool to deploying modern data collection practices, machine learning toolkits, and statistical methods to gain a...

## ANOVA and Tukey’s test on R

June 13, 2013

Note: This is a full translation of a Portuguese version. In many different types of experiments, with one or more treatments, one of the most widely used statistical methods is analysis of variance, or simply ANOVA. The simplest ANOVA can be called “one way” or “single-classification” and involves the analysis of data sampled from The post ANOVA...
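
For readers who want to try a one-way ANOVA followed by Tukey's test right away, here is a minimal sketch. It uses the `PlantGrowth` data set that ships with R rather than the data from the original post, and the object names `fit` and `tuk` are illustrative.

```r
# One-way ANOVA followed by Tukey's honest significant difference test.
# PlantGrowth (built into R) has one response, weight, and one factor,
# group, with three levels; it stands in for the post's own data.
fit <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit))   # overall F test for differences among group means

tuk <- TukeyHSD(fit, conf.level = 0.95)
print(tuk)            # pairwise differences with adjusted p-values
```

The ANOVA table answers whether any group means differ; the Tukey output then shows which pairs of groups differ, with intervals adjusted for multiple comparisons.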

## Big in Japan

June 13, 2013

Inspired by this post on R-bloggers, I decided to check how BCEA was doing. Unfortunately, it does not feature in the top 100 most downloaded R packages. However, I think it's doing well, considering the book (which is the main medium of advertising of the package) has...

## Getting started with twitteR in R

June 13, 2013

I have been asked by a few people lately to help walk them through using the Twitter API in R, and I’ve always just directed them to the blog post I wrote last year during the US presidential debates, not knowing that Twitter had changed a few things. Having my interest piqued through a potential project at