## Evaluating Optimization Algorithms in MATLAB, Python, and R

June 18, 2013

As those of you who read my last post know, I’m at the NIMBioS-CAMBAM workshop on linking mathematical models to biological data here at UT Knoxville. Day 1 (today) was on parameter estimation and model identifiability. Specifically, we (quickly) covered … Continue reading →

## googleVis 0.4.3 released with improved Geocharts

June 18, 2013

The Google Charts Tools provide two kinds of heat map charts for geographical data, the Flash based Geomap and the HTML5/SVG based Geochart. I prefer the Geochart as it doesn't require Flash, but so far there have been two shortcomings with it: I couldn't add additional tooltip information and the default Mercator projection shows Greenland the...

## Software Packages for Graphs and Charts

June 17, 2013

Graphs can be an important feature of analysis. A graph that has been well designed and put together can make summary statistics much more readable and increase the interpretability. It also makes reports and articles look more professional. There are many software packages that are available to design great graphs and charts. This seems to

## Computerworld’s Beginners Guide to R

June 17, 2013

Sharon Machlis is not only the online managing editor at Computerworld, she's also a budding data scientist who recently started learning the R language. To the benefit of all other new R users, she's shared her learnings in an excellent 6-part beginners guide to R, published by Computerworld. It's jam-packed with useful information for anyone getting started with R,...

## Zombie Apocalypse Survival Test – R-Powered (using Concerto)

June 17, 2013

This test is the first attempt to seriously assess the ability of individuals to survive a zombie apocalypse.  This test is administered using the R powered open-source testing platform Concerto developed at the University of Cambridge. The t...

## Bayesian computational tools

June 17, 2013

I just updated my short review on Bayesian computational tools I first wrote in April for the Annual Review of Statistics and Its Application. The coverage is quite restricted, as I took advantage of two phantom papers I had started a while ago, one with Jean-Michel Marin, on hierarchical Bayes methods and on ABC. (As

## Dave Harris on Maximum Likelihood Estimation

June 17, 2013

At our last Davis R Users’ Group meeting of the quarter, Dave Harris gave a talk on how to use the bbmle package to fit mechanistic models to ecological data. Here’s his script, which I ran through the spin function in knitr: # Load data library(emdbook) ## Loading required package: MASS Loading required package: lattice library(bbmle) ## Loading required package:...

## Oracle R Connector for Hadoop 2.1.0 released

June 17, 2013

(This article was first published on Oracle R Enterprise, and kindly contributed to R-bloggers) Oracle R Connector for Hadoop (ORCH), a collection of R packages that enables Big Data analytics using HDFS, Hive, and Oracle Database from a local R environment, continues to make advancements. ORCH 2.1.0 is now available, providing a flexible framework while remarkably improving performance and...

## Model Selection in Bayesian Linear Regression

June 17, 2013

Previously I wrote about performing polynomial regression and also about calculating marginal likelihoods. The data in the former and the calculations of the latter will be used here to exemplify model selection. Consider data generated by and suppose we wish to fit a polynomial of degree 3 to the data. There are then 4 regression The post Model...

## Stashing and playing with raw data locally from the web

June 17, 2013

It is getting easier to get data directly into R from the web. Often R packages that retrieve data from the web return useful R data structures to users like a data.frame. This is a good thing of course to make things user friendly. However, what if you want to drill down into the data that's returned from a query...

## analyze the pesquisa de orcamentos familiares (pof) with r

June 17, 2013

for the unlucky among us born without a portuguese mother tongue, the pesquisa de orcamentos familiares (pof) translates to survey of household budgets.  this data set captures brazilian family consumption habits, allocation of expenses, and incom...

## Annotating select points on an X-Y plot using ggplot2

June 16, 2013

or, Is the Seattle Mariners outfield a disaster? The Backstory: Earlier this week (2013-06-10), a blog post by Dave Cameron appeared at USS Mariner under the title “Maybe It's Time For Dustin Ackley To Play Some Outfield”. In the first paragraph, Cameron describes the Seattle Mariners outfield this season as “a complete disaster” and Raul Ibanez as...

## Exploratory Data Analysis: Combining Box Plots and Kernel Density Plots into Violin Plots for Ozone Pollution Data

Recently, I began a series on exploratory data analysis (EDA), and I have written about descriptive statistics, box plots, and kernel density plots so far. As previously mentioned in my post on box plots, there is a way to combine box plots and kernel density plots. This combination results in violin plots, and I

## Dynamic Data Visualizations in the Browser Using Shiny

June 16, 2013

After being busy the last two weeks teaching and attending academic conferences, I finally found some time to do what I love, program data visualizations using R. After being interested in Shiny for a while, I finally decided to pull the trigger and build my first Shiny app! I wanted to make a proof of

## General Regression Neural Network with R

June 16, 2013

Similar to the back-propagation neural network, the general regression neural network (GRNN) is also a good tool for function approximation in the modeling toolbox. Proposed by Specht in 1991, GRNN has the advantages of instant training and easy tuning. A GRNN would be formed instantly with just a 1-pass training with the development data.
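To make the "instant training" point concrete, here is a minimal sketch in Python, assuming the usual description of a GRNN as a Gaussian-kernel-weighted average of the training targets (one-dimensional input for brevity). The function name `grnn_predict`, the smoothing parameter `sigma`, and the toy data are illustrative assumptions and are not taken from the original post, which works in R.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """Predict y at x as a Gaussian-kernel-weighted average of train_y.

    Assumption: 1-D inputs and a single smoothing parameter sigma.
    """
    # One kernel weight per stored training point.
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in train_x]
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed (x is far from all data): use the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * yi for w, yi in zip(weights, train_y)) / total


# "Training" is a single pass: the network simply stores the development data.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical inputs
train_y = [xi ** 2 for xi in train_x]  # hypothetical targets, y = x^2

prediction = grnn_predict(2.0, train_x, train_y, sigma=0.3)
print(prediction)
```

The only quantity left to tune in this sketch is `sigma`: a small value makes the prediction follow the nearest stored point, while a large value pulls it toward the overall mean of the targets.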

## Scenario analysis and trading options using R

June 16, 2013

I present you with my restructured project on options trading and scenario analysis. You are more than welcome to try it out. Firstly, I will give a small presentation that will reveal what you can do with it and whether you need to continue reading. T...

## The scaling of Expected Shortfall

June 16, 2013

Getting Expected Shortfall given the standard deviation or Value at Risk. Previously, there have been a few posts about Value at Risk and Expected Shortfall. Properties of the stable distribution were discussed. One way of thinking of Expected Shortfall is that it is just some number times the standard deviation, or some other number … Continue reading...
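As a rough illustration of the "some number times the standard deviation" idea, the sketch below computes that number under the assumption of zero-mean, normally distributed returns. The normal assumption, the function names, and the example volatility are all choices made here for illustration; the original post may use other distributions, and the multipliers change when it does.

```python
from statistics import NormalDist


def normal_var_multiplier(alpha):
    """Value at Risk at level alpha, in units of standard deviation (normal case)."""
    return NormalDist().inv_cdf(alpha)


def normal_es_multiplier(alpha):
    """Expected Shortfall at level alpha, in units of standard deviation.

    For a zero-mean normal loss: ES = sigma * pdf(z_alpha) / (1 - alpha).
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(alpha)
    return std_normal.pdf(z) / (1.0 - alpha)


sigma = 0.02   # hypothetical daily standard deviation of returns
alpha = 0.95   # probability level

value_at_risk = sigma * normal_var_multiplier(alpha)
expected_shortfall = sigma * normal_es_multiplier(alpha)
print(value_at_risk, expected_shortfall)
```

Under these assumptions, Expected Shortfall is also a fixed multiple of Value at Risk at the same level, because both scale linearly with `sigma`.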

## Distribution of car weights

June 16, 2013

Two weeks ago I described car data, including the weight distribution of cars in the Netherlands. At that time it was purely plots. In the meantime I decided I wanted to model trends. As a first step, I decided to fit distributions for these da...

## Modeling an Infant’s Feeding Schedule with Periodic Smoothing Splines

June 15, 2013

While on paternity leave I had an opportunity to test out periodic smoothing splines (within the framework of generalized additive models) on an interesting time series: an infant's feeding schedule.

Some days ago H. Wickham (Chief Scientist of the RStudio company) posted an article about the RStudio CRAN mirror with … Continue reading »

## Simulating Map-Reduce in R for Big Data Analysis Using Flights Data

June 14, 2013

At datadolph.in, we are constantly crunching through large amounts of data.  We are designing unique and innovative ways to process large datasets on a single node and use distributed computing only when single node computing become...

## Sudoku Automation Solver Challenge – R

June 14, 2013

On a recent flight I was bored waiting for the plane to land and I tried out the electronic sudoku game that they had offered.  I found the game surprisingly interesting as I realized that it is far more entertaining when you cannot use paper or p...

## The equivalence of the ellipsis argument and an infinite set of closures

June 14, 2013

This post is about a practical application of a topic I discuss in my book. In my book, I prove …Continue reading »

## A list of R packages, by popularity

June 14, 2013

R package developer (and R-bloggers editor) Tal Galili just published the answers to a question many R users have asked: which are the most popular R packages? He wrote some R code to rank the top 100 packages by number of downloads. Here's the top 10: The source data are the download logs from the RStudio CRAN mirror, whose...

## Latent Class Modeling Election Data

June 14, 2013

Latent class analysis is a useful tool for identifying groups within multivariate categorical data. An example of such data is responses on a Likert scale. In categorical language these groups are known as latent classes. As a simple comparison, this is similar to k-means multivariate cluster analysis. There are several key differences between the

## Interval Estimation of the Population Mean

June 14, 2013

Interval estimation of the population mean can be computed from the functions of the following R packages:

- stats - contains the t.test
- TeachingDemos - contains the z.test
- BSDA - contains the zsum.test and tsum.test

The t.test of the stats package is a stud...
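For readers who want to see the arithmetic behind such an interval, here is a minimal sketch of a z-based confidence interval for the mean, written in Python with only the standard library. It assumes the population standard deviation is known, which is the setting a z test addresses; the function name `z_interval` and the sample values are hypothetical and are not taken from the original post or from any of the R packages listed.

```python
import math
from statistics import NormalDist, mean


def z_interval(sample, sigma, conf_level=0.95):
    """Two-sided confidence interval for the mean with known sigma.

    Interval: mean(sample) +/- z * sigma / sqrt(n), where z is the
    standard normal quantile at 0.5 + conf_level / 2.
    """
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + conf_level / 2.0)
    half_width = z * sigma / math.sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Hypothetical measurements with an assumed known standard deviation of 0.2.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
lower, upper = z_interval(sample, sigma=0.2)
print(lower, upper)
```

When the standard deviation has to be estimated from the sample, a t-based interval replaces the normal quantile with a Student's t quantile, which gives a wider interval for small samples.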

## Modeling an Infant’s Feeding Schedule with Periodic Smoothing Splines

June 13, 2013

While on paternity leave I had an opportunity to test out periodic smoothing splines (within the framework of generalized additive models) on an interesting time series: an infant's feeding schedule.

## Practicing static typing in R: Prime directive on trusting our functions with object oriented programming

June 13, 2013

John Chambers, the creator of the S language from which R is derived, said in one of his books, Software for Data Analysis: Programming with R: ...This places an obligation on all creators of software to program in such a way that the computations ca...