Update to my pscl package, now on CRAN. Biggest change: fixing a bug in the way MCMC draws for item parameters were being stored and summarized by ideal.

During a short if profitable visit to Dublin for an SFI meeting on Tuesday/Friday, I had the opportunity to visit the National Gallery of Ireland in my sole hour of free time (as my classy hotel was very close). The building itself is quite nice, being well-inserted between brick houses from the outside, while providing

In this post, Portfolio Probe explores a way to decide whether market kurtosis and skewness are predictable. Market skewness, in naive financial modeling, is a measure of the asymmetry of the distribution of (daily) returns around the average market return. A higher skewness would tend to indicate a denser distribution of higher returns, compared to lower
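As a rough illustration of the two quantities named in this teaser, here is a minimal Python sketch of moment-based sample skewness and excess kurtosis. It is not taken from the Portfolio Probe post: the function name `skewness_and_kurtosis` and the sample data are hypothetical, and the simple population-moment estimators (no small-sample correction) are an assumption.

```python
def skewness_and_kurtosis(returns):
    """Return (skewness, excess kurtosis) from central moments.

    Assumption: plain population moments, no bias correction.
    """
    n = len(returns)
    mean = sum(returns) / n
    # Second, third and fourth central moments.
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # zero for a normal distribution
    return skew, excess_kurt


# Hypothetical daily returns, for illustration only.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_tailed = [0.0, 0.0, 0.0, 0.0, 10.0]

sym_skew, sym_kurt = skewness_and_kurtosis(symmetric)
tail_skew, tail_kurt = skewness_and_kurtosis(right_tailed)
```

With these estimators, a sample that is symmetric around its mean has zero skewness, while a sample with one large positive outlier has positive skewness.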

Since we explored some statistics of an abstract painting with Pierre (we even have an article in the latest issue of Variances!), I have become more sensitive to art linked to randomness. Here are some pointers to related websites I have dug up. Random.org, mentioned here by Pierre, is, as it reads, a true random number service that

When constructing a typical efficient frontier, risk is usually measured by the standard deviation of the portfolio’s return. Maximum Loss and Mean-Absolute Deviation are alternative measures of risk that I will use to construct an efficient frontier. I will use methods presented in Comparative Analysis of Linear Portfolio Rebalancing Strategies: An Application to Hedge Funds by
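To make the two alternative risk measures concrete, here is a minimal Python sketch that computes them for a fixed portfolio over a set of return scenarios. It is not the post's implementation. The function names and scenario data are hypothetical, and the definitions used (Mean-Absolute Deviation as the average absolute deviation from the mean return, Maximum Loss as the worst scenario loss) are the common textbook forms, assumed here rather than quoted from the paper.

```python
def portfolio_returns(scenarios, weights):
    """Portfolio return in each scenario: weighted sum of asset returns."""
    return [sum(w * r for w, r in zip(weights, row)) for row in scenarios]


def mean_absolute_deviation(returns):
    """Average absolute deviation of returns from their mean."""
    mean = sum(returns) / len(returns)
    return sum(abs(r - mean) for r in returns) / len(returns)


def maximum_loss(returns):
    """Largest loss across scenarios (a loss is the negative of a return)."""
    return max(-r for r in returns)


# Hypothetical scenario returns for two assets, for illustration only.
scenarios = [
    [0.02, -0.01],
    [-0.04, 0.03],
    [0.01, 0.00],
]
weights = [0.5, 0.5]

port = portfolio_returns(scenarios, weights)
mad = mean_absolute_deviation(port)
max_loss = maximum_loss(port)
```

Both measures are piecewise linear in the weights, which is why they are typically handled with linear programming rather than the quadratic programming used for standard deviation.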

At the Bay Area R User Group meeting this week, Antonio Piccolboni gave an overview of the design goals and implementation of the RHadoop Project packages that connect Hadoop and R: rhdfs, rhbase and rmr. (The image above was captured from Antonio's slides.) The most revealing part of the talk for me was the comparison of implementing the K-means...

If you're in the Bay Area, tomorrow would be a great day to head down to San José for the ACM Data Mining Camp. Hundreds of data scientists, data hackers and data miners will be there for a fun "unconference", with talks and practical sessions organized on the spot according to demand. Revolution Analytics is proud to be a...

I have a few friends who keep bragging about their 14% annual returns from investing their money with Lending Club, a peer-to-peer lending service that cuts out the complexities and difficulties of getting approved for a loan through a bank. To give you an idea of the sheer volume Lending Club has been

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2 formatted file is incompatible with the

“Bayes Theorem is a simple consequence of the axioms of probability, and is therefore accepted by all as valid. However, some who challenge the use of personal probability reject certain applications of Bayes Theorem.” J. Kadane, p.44. Principles of Uncertainty by Joseph (“Jay”) Kadane (Carnegie Mellon University, Pittsburgh) is a profound and mesmerising book on

I once heard John Chambers (the inventor of the S language, and member of the R Core Group) say, "Show me a programming language no-one complains about, and I'll show you a language no-one uses". The R language has its fair share of complainants, to be sure -- and that's to be expected for a language with more than...

In part 3, we ran a logistic model to determine the probability of default of a customer. We also saw the outputs and tried to judge the performance of the model by plotting the ROC curve. Let's try a different approach today. How about a decision tree?...

This is the first post in the series about Asset Allocation, Risk Measures, and Portfolio Construction. I will use simple and naive historical input assumptions for illustration purposes across all posts. In this series I plan to discuss: Maximum Loss, MAD, CVaR, CDaR, and Omega Risk Measures; 130:30 Long/Short portfolios and Cardinality Constraints; Arithmetic and Geometric

In this article, Hans Gilde presents a clever use of a heatmap hidden in the Bioconductor library. In his example, he describes a way to show different ‘observations’ on subjects over time. Financial indices, like the S&P 500 or the Dow Jones indices, are mathematically some kind of measure of overall market

To me, this post by Christophe Ladroue epitomizes what data doodlers do. They take a dataset that is of interest to them (in his case, his triathlon results) and then they manipulate the numbers to see what insights can be drawn. Most bloggers only sho...

The two translators of our book into Japanese, Kazue & Motohiro Ishida, contacted me about some R code mistakes in the book. The translation is nearly done and they checked every piece of code in the book, an endeavour for which I am very grateful! Here are the two issues they have noticed (after incorporating

Impressive. You are not alone!