This is my first blog since joining R-bloggers. I’m quite excited to be part of this group and apologize if I bore any experienced R users with my basic blogs for learning R or offend programmers with my inefficient, sloppy … Continue reading →

Fig. 4 – Boxplots of the evolution of ABC approximations to the Bayes factor. The representation is made in terms of frequencies of visits to models MA(1) and MA(2) during an ABC simulation when ε corresponds to the 10%, 1%, 0.1%, and 0.01% quantiles of the simulated autocovariance distances. The data is a time

In my last few posts, I have been discussing some of the consequences of the slow decay rate of the tail of the Pareto type I distribution, along with some other, closely related notions, all in the context of continuously distributed data. Today’s post considers the Zipf distribution for discrete data, which has come to be extremely popular as...

Arthur Charpentier used R to denote a broken record of the CAC 40 when it went 11 consecutive days with negative returns. Question: What happens to the market after runs of positive or negative returns? Will the market tank or soar after n days of gains/losses? First, a little dissection of historical data (S&P 500
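To make the question above concrete, here is a minimal sketch of how runs of gains and losses might be tabulated. It is written in plain Python rather than the R used in the post, and it is not the author's code: the function names and the tiny return series are hypothetical, made up purely for illustration.

```python
from itertools import groupby


def _sign(x):
    """Return 1 for a gain, -1 for a loss, 0 for a flat day."""
    return (x > 0) - (x < 0)


def sign_runs(returns):
    """Collapse a return series into (sign, run_length) pairs."""
    return [(s, len(list(g))) for s, g in groupby(_sign(r) for r in returns)]


def next_return_after_runs(returns, n, sign=-1):
    """Collect the returns that immediately follow n or more
    consecutive days with the given sign (-1 = losses, 1 = gains)."""
    out = []
    streak = 0
    for today, tomorrow in zip(returns, returns[1:]):
        # Extend the streak if today matches the sign, otherwise reset it.
        streak = streak + 1 if _sign(today) == sign else 0
        if streak >= n:
            out.append(tomorrow)
    return out


# Hypothetical daily returns, NOT real market data.
example_returns = [0.01, -0.02, -0.01, -0.03, 0.02, 0.01, -0.01, -0.02, 0.005]

runs = sign_runs(example_returns)
after_two_down = next_return_after_runs(example_returns, n=2, sign=-1)
print(runs)
print(after_two_down)
```

With real index data, averaging the values returned by `next_return_after_runs` for several choices of `n` would give a rough first look at whether long losing or winning streaks tend to be followed by a rebound or a continuation.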

I have a bunch of time series whose power spectra (FFT via R's spectrum() function) I've been trying to visualize in an intuitive, aesthetically appealing way. At first, I just used lattice's bwplot, but the spacing of the X-axis here really matters. ...

How much predictability is there for these higher moments? Data: The data consist of daily returns from the start of 2007 through mid 2011 for almost all of the S&P 500 constituents. Estimates were made over each half year of data. Hence there are 8 pairs of estimates where one estimate immediately follows the other. … Continue reading...

In my last post, I described three situations where the average of a sequence of numbers is not representative enough to be useful: in the presence of severe outliers, in the face of multimodal data distributions, and in the face of infinite-variance distributions. The post generated three interesting comments that I want to respond to here. First and foremost, I...
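The first of those three situations, a severe outlier, is easy to illustrate. The sketch below uses Python's standard `statistics` module instead of R, and the numbers are invented for illustration; it is not taken from the original post.

```python
from statistics import mean, median

# Hypothetical measurements clustered near 10 (made-up values).
clean = [9.8, 10.1, 10.0, 9.9, 10.2]

# The same data with one severe outlier appended.
with_outlier = clean + [1000.0]

summary = {
    "clean": (mean(clean), median(clean)),
    "with_outlier": (mean(with_outlier), median(with_outlier)),
}

for label, (avg, med) in summary.items():
    print(f"{label}: mean={avg:.2f}, median={med:.2f}")
```

A single extreme value drags the mean far away from where most of the data sit, while the median barely moves. That is the sense in which the average stops being a representative "typical value."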

Of all possible single-number characterizations of a data sequence, the average is probably the best known. It is also easy to compute and in favorable cases, it provides a useful characterization of “the typical value” of a sequence of numbers. It is not the only such “typical value,” however, nor is it always the most useful one: two other...