The following is mostly based on our arXived paper with Andrew Gelman and the references mentioned there. Koopman, Shephard, and Creal (2009) proposed making a sample-based estimate of the existence of the moments using generalized Pareto

“In this article it is shown that in a fairly general setting, a sample of size approximately exp(D(μ|ν)) is necessary and sufficient for accurate estimation by importance sampling.” Sourav Chatterjee and Persi Diaconis arXived yesterday an exciting paper where they study the proper sample size in an importance sampling setting with no variance. That’s right,

In a previous post, we described how we performed exploratory data analysis (EDA) in real-world log files, as provided by Skroutz.gr, the leading online company in Greece for online price comparison, in the context of Athens Datathon 2015. In the present post we will have a look at the same job as performed with Oracle Big Data Discovery (v....

In this follow-up to This R Data Import Tutorial Is Everything You Need-Part One, DataCamp continues its comprehensive yet easy tutorial on quickly importing data into R, going from simple, flat text files to the more advanced SPSS and SAS files. As a lot of our readers correctly noticed from the first post, The post

The quest for income microdata For a separate project, I've been looking for source data on income and wealth inequality. Not aggregate data like Gini coefficients or the percentage of income earned by the bottom 20% or top 1%, but the sources used to calculate those things. Because it's sensitive personal financial data either from surveys or tax...

We better keep an eye on this one: she is tricky (Michael Banks, talking about Mary Poppins) Professor Bertrand teaches Simulation and one day asks his students: Given a circle, what is the probability that a chord chosen at random is longer than a side of the equilateral triangle inscribed in the circle? Since they must reach the … Continue reading...

Why an R Tutorial on Reading and Importing Excel Files into R is necessary As most of you know, Excel is a spreadsheet application developed by Microsoft. It is an easily accessible tool for organizing, analyzing and storing data in tables and is widely used in many different application fields all over the world. The post

Taking the first step is often the hardest: getting data from Excel into R. Suppose you would like to use the ChainLadder package to forecast future claims payments for a run-off triangle that you have stored in Excel. How do you get the triangle into R and execute a reserving function, such as MackChainLadder? Well, there are...
