## A million ways to connect R and Excel

February 11, 2014
In quantitative finance both R and Excel are the basic tools for any type of analysis. Whenever one has to use Excel in conjunction with R, there are many ways to approach the problem and many solutions. It depends on what you really want to do and the size of the dataset you’re dealing with. I

February 11, 2014
The “medals” R post by TRInker and re-blogged by Revolutions were both spiffy and a live example why there’s no point in not publishing raw data. You don’t need to have R (or any other language) do the scraping, though. The “IMPORTHTML” function (yes, function names seem to be ALL CAPS now over at Google

## There is no Such Thing as Biomedical "Big Data"

February 11, 2014
At the moment, the world is obsessed with “Big Data” yet it sometimes seems that people who use this phrase don’t have a good grasp of its meaning.  Like most good buzz-words, “Big Data” sparks the idea of something grand and complicated, while sounding ordinary enough that listeners feel like they have an intuitive understanding of the concept.  However...

## Literate Statistical Programming with knitr – Creating Reproducible Analysis in R

February 11, 2014
This video was created by Dr. Roger Peng, professor at the Johns Hopkins Bloomberg School of Public Health and author at Simply Statistics. Dr. Peng is not affiliated with Stats Make Me Cry in any way and written consent was ob...

## R jobs (February 2014)

February 11, 2014
R-bloggers is offering an “R jobs” post, that will be published once every (“couple of”) months. If you are interested in offering a job to be posted on r-bloggers, please e-mail me at: [email protected] (please write in the subject line the text “R job” so I could have easy e-mail filtering on it). Harvard — R/C++ programmer to develop climate-related R software/package...

## The Sound Of Mandelbrot Set

February 11, 2014
“Music is the pleasure the human soul experiences from counting without being aware that it is counting” (Gottfried Leibniz). I like the concept of sonification: translating data into sounds. There is a huge amount of content on the Internet about this technique and there are several packages in R to help you to sonificate your

## Efficiency of Importing Large CSV Files in R

February 10, 2014
## RcppSMC 0.1.2

February 10, 2014
Late last week, and just before leaving to participate in this crazy thing, I managed to get a new version of RcppSMC onto CRAN. RcppSMC combines the SMCTC template classes for Sequential Monte Carlo and Particle Filters (Johansen, 2009, JSS) with t...

## Winter Olympic Medal Standings, presented by R

February 10, 2014
There's no shortage of web sites listing the current medal standings at Sochi, not least the official Winter Olympics Medal Tally. And here's the same tally, rendered with R: Click through to see a real-time version of the chart, created with RStudio's Shiny by Ramnath Vaidyanathan. (By the way, does anyone know if it's possible to embed a live...

## Using Dates and Times in R

February 10, 2014
Today at the Davis R Users’ Group, Bonnie Dixon gave a tutorial on the various ways to handle dates and times in R. Bonnie provided this great script which walks through essential classes, functions, and packages. Here it is piped through knitr::spin. The original R script can be found as a gist here. Date/time classes Three...

## Unprincipled Component Analysis

February 10, 2014
As a data scientist I have seen variations of principal component analysis and factor analysis so often blindly misapplied and abused that I have come to think of the technique as unprincipled component analysis. PCA is a good technique often used to reduce sensitivity to overfitting. But this stated design intent leads many to (falsely)

## Bullet Graph in R

February 10, 2014
Stephen Few designed the Bullet Graph

## Scraping Pro-Football Data and Interactive Charts using rCharts, ggplot2, and shiny

February 10, 2014
This post uses pro-football (American) boxscore data from 1966 through 2013 and generates a few interactive charts using rCharts, ggplot2 and shiny. It also provided a first time exposure to the power of dplyr. Data for these charts were scraped fr...

## Three ways to call C/C++ from R

February 10, 2014
By Ben Ogorek Introduction I only recently discovered the fundamental connection between the C and R languages. It was during a Bay Area useR Group meeting, where presenter J.J. Allaire shared two points to motivate his talk on Rcpp. The first explained just how much of modern R really is C and C++. For...

## Spatial autocorrelation of errors in JAGS

February 10, 2014
In the core of kriging, Generalized-Least Squares (GLS) and geostatistics lies the multivariate normal (MVN) distribution – a generalization of normal distribution to two or more dimensions, with the option of having non-independent variances (i.e. autocorrelation). In this post I will show: (i) how to use exponential decay and the …

## Installing MatLab vs Installing R

I retweeted this a few days ago: "1. Open MATLAB for first time in a few years after using #rstats. 2. Site license doesn't work right. 3. F*** MATLAB, I'll try to do it in R" (Andrew D. Steen, @drdrewsteen, February 6, 2014). And as I have started the process of installing MatLab on my...

## Cleaning Data and Graphing in R and Python

February 10, 2014
Python has some pretty awesome data-manipulation and graphing capabilities. If you’re a heavy R-user who dabbles in Python like me, you might wonder what the equivalent commands are in Python for dataframe manipulation. Additionally, I was curious to see how …

## DailyMeteo.org – 2014 Conference

February 10, 2014
Our friend Stefan has been participating in MilanoR since the beginning, and was one of the people who started using R intensively after the "Introduction to R" Quantide course. Since he is from Belgrade (Serbia), and takes part in the …

## rMaps released

February 10, 2014
Ramnath Vaidyanathan just released his new R interactive package, rMaps (Vaidyanathan, 2014). The package relies on the development version of his widely known rCharts package (Vaidyanathan, 2013) as well as JavaScript libraries...

## rOpenSci developer meeting in March

February 10, 2014
Our team has been cranking out a large number of tools over the past several months. As regular readers are aware, our software packages provide programmatic access to a diverse and extensive trove of scientific data. More recently we’ve expanded our efforts to build more general purpose and cross-domain tools. These include tools for reading, writing, integrating and publishing...

## Data Driven Security Roundup: betaPERT, Shiny, Honeypots, Passwords & Reproducible Research

February 9, 2014
Jay Jacobs (@jayjacobs)—my co-author of the soon-to-be-released book Data-Driven Security—& I have been hard at work over at the book’s sister-blog cranking out code to help security domain experts delve into the dark art of data science. We’ve covered quite a bit of ground since January 1st, but I’m using this post to focus more

## Sochi Olympic Medals

February 9, 2014
For those who are addicted to R and haven’t the time to click the mouse on a web browser you too can still be informed about the results of the 2014 Sochi Winter Olympics. I was inspired by a SO …

## After 1st semester of Statistics PhD program

February 9, 2014
Have you ever wondered whether the first semester of a PhD is really all that busy? My complete lack of posts last fall should prove it. Some thoughts on the Fall term, now that Spring is well under way: The …

## The Statsguys on Data Analytics

February 9, 2014
It's good to see that more and more students of econometrics are taking an interest in "Data Analytics" / "Big Data" / "Data Science" literature. As I've commented previously, there's a lot that we can all learn from each other. Moreover, many of the "boundaries" are very soft, and are more perceived than real. So, I was delighted to see the...

## A complicated answer to a simple correlation question

February 9, 2014
A data analysis surprise party. Simple question If I have correlation matrices each estimated with a month of daily returns, how much worse is the average of six of those compared to the estimate with six months of daily data? Expected answer Do a statistical bootstrap with the returns and compare the standard deviations across …

## UofT R session went well. Thanks RStudio Server!

February 9, 2014
Apart from going longer than I had anticipated, very little of any significance went wrong during my R session at UofT on Friday! It took a while at the beginning for everyone to get set up. Everyone was connecting to …

## Bayesian analysis of sensory profiling data, part 2

February 9, 2014
Last week I made the core of a Bayesian model for sensory profiling data. This week the extras need to be added. That is, there are a bunch of extra interactions and the error is dependent on panelists and descriptors. Note that where last week I pointe...

## Another skewed normal distribution

February 8, 2014
At the CLRS last year, Glenn Meyers talked about something very near to my heart: a skewed normal distribution. In loss reserving (and I'm sure, many other contexts) standard linear regression is less than ideal as it presumes that deviations from the mean are equally distributed. We rarely expect this assumption to hold (though we