Do not believe anything: what artists really do is to hang around all day (Paco de Lucia) Andy Warhol was a mathematician. At least, he knew how clustering algorithms work. I am pretty sure of this after doing this experiment. First of all, let me introduce you to the breathtaking Grace ... [Read more...]
Quandl is a “wikipedia” for numerical data that allows you to search rapidly through 8 million ready-to-use data sets. At DataCamp we created a free in-browser coding tutorial on how to use the corresponding R package to access Quandl data from within R. As every real world data analyst knows, finding ... [Read more...]
Sometimes I need to reduce the number of images for a presentation or an article. A good way of doing it is putting multiple plots on the same tif or jpg file. R has multiple functions to achieve this objective and a nice tutorial for this topic... [Read more...]
This last post will give a quick overview of two other methods I used to try to understand Watch the Throne using text analysis with R. Last post about this album, I promise! The first analysis I'll present here has to do with clustering of songs. I was very inspired ... [Read more...]
In my last post I talked about using rCharts to create interactive graphics for my interview presentations. They seemed to go over pretty well in my interviews and helped me greatly, as I did not need to remember or write down specific numbers to talk about. I use slidy to create ... [Read more...]
Seventh Torino R net meeting on 27 Mar 2014, exceptionally hosted at Polo Universitario di Asti, will have three presentations: Processing and analysis methods for DNA methylation array data, Giovanni Fiorito, Complex Systems for Life Sciences, University of Turin; Temporal Dominance of Sensations (TDS) … Continue reading →
Presentations of the sixth Torino R net meeting are now available online, section Downloads. Thank you to all who attended the meeting on Thursday 21st November and special thanks to presenters. … Continue reading →
A historian, a data scientist, a programmer, a mathematician, and a philosopher discuss the question of how likely it is that a lottery draw (6 out of 49) contains two consecutive numbers.
The historian
The historian argues that from 1955 up to 2011, there were 5026 lottery draws in Germany, every Saturday, and from 2000 on, two ... [Read more...]
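For readers who want a number before clicking through, here is a minimal sketch of the counting argument, written in Python rather than R. It relies on the standard combinatorial fact that a draw of k numbers out of n with no two consecutive can be counted as C(n - k + 1, k); the function names are my own and are not taken from the post. A brute-force enumeration on a small case is included as a sanity check.

```python
from itertools import combinations
from math import comb


def prob_consecutive(n=49, k=6):
    """Probability that a k-out-of-n draw has at least one consecutive pair."""
    # Draws with no two consecutive numbers: choose k of n - k + 1 positions.
    no_consecutive = comb(n - k + 1, k)
    return 1 - no_consecutive / comb(n, k)


def prob_consecutive_brute(n, k):
    """Same probability by enumerating every draw (only feasible for small n)."""
    draws = list(combinations(range(1, n + 1), k))  # each draw is sorted
    hits = sum(any(b - a == 1 for a, b in zip(d, d[1:])) for d in draws)
    return hits / len(draws)


print(round(prob_consecutive(), 4))  # 0.4952
```

So under the assumption that every draw is equally likely, just under half of all 6-out-of-49 draws contain at least one consecutive pair.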
In November 2013, PACKT Publishing launched Introduction to R for Quantitative Finance. The book, which is around 164 pages (including cover page and back pages), discusses the implementation of different quantitative methods used in financ... [Read more...]
R squared is a widely used measure of model fit; in General Linear Models (GLM) it can be interpreted as the percent of variance in the response variable explained by the model. This measure is unitless, which makes it useful for comparing models between studies in meta-analysis. Generalized Linear ... [Read more...]
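As a small illustration of the "variance explained" reading, the sketch below (in Python, not R) fits an ordinary least-squares line with a single predictor and computes R squared as 1 - SS_res / SS_tot. It covers only the plain linear case described above, not the generalized models the post goes on to discuss, and the function name and toy data are made up for this example.

```python
def r_squared(x, y):
    """R squared of a simple least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least-squares slope and intercept for one predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Toy data: the fitted line explains about 64% of the variance in y.
print(round(r_squared([1, 2, 3, 4], [1, 3, 2, 4]), 2))  # 0.64
```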
It’s Oscars season again so why not explore how predictable (my) movie tastes are. This has literally been a million dollar problem and obviously I am not gonna solve it here, but it’s fun and slightly educational to do some number … Continue reading →
In this post I want to analyze a first order pharmacokinetics problem: the data of study problem 9, chapter 3 of Rowland and Tozer (Clinical pharmacokinetics and pharmacodynamics, 4th edition) with JAGS. It is a surprisingly simple set of data, but still there is enough to play around with. Data, model and ... [Read more...]
A new version of the jsonlite package was released to CRAN. In addition to adding small new features, this release cleans up code and documentation. Some annoying compiler warnings inherited from RJSONIO are fixed and the reference manual is a bit more concise. Also some new examples of public JSON ...
We are happy to inform you that abstract submission for useR! 2014 is now available online, see http://user2014.stat.ucla.edu/ The R User Conference, useR! 2014 is scheduled for July 1-3, 2014 at the University of California, Los Angeles. Before the official program, half-day tutorials will be offered on Monday, June 30. ... [Read more...]
OECD.Stat is a commonly used statistics portal in the research world but there are no easy ways (that I know of) to query it straight from R. There are two main benefits of querying OECD.Stat straight from R: 1. Create reproducible analysis (something that is easily lost if you ... [Read more...]
There’s a new post up at the ninazumel.com blog that looks at the statistics of “verification by multiplicity” — the statistical technique that is behind NASA’s announcement of 715 new planets that have been validated in the data from the Kepler Space Telescope. We normally don’t write about ... [Read more...]
I just wanted to plug three classic books on statistical graphics that I really enjoyed reading. The books are old (that is, older than me) but still relevant and together they give a sense of the development of exploratory graphics in general and the graphics system in R specifically ... [Read more...]
When working with raster datasets I often encounter limitations caused by the large size of the files. I thus wrote up a little R function that invokes gdal_translate to split the raster into parts. The screenshot to the left shows a raster in QGIS that was split into ... [Read more...]
If you are writing some C++ code with the intent of calling it from R or even developing it into a package you might wonder whether it is better to use the pseudo random number library native to C++11 or the R standalone library. On the one hand users of ... [Read more...]