Monthly Archives: September 2012

Video: Analyzing Big Data using Oracle R Enterprise

September 23, 2012

Learn how Oracle R Enterprise is used to generate new insight and new value for business, answering not only what happened, but why ...

Football model; plots and usage

September 23, 2012

After reading the data, making a predictions display, and building a football data model, it is time to validate the model a bit more (regression plots) and put it to use. It appears that the regression plots in the car package were not ...

Project Euler — problem 20

September 23, 2012

It’s been quite a while since my last post on Euler problems. Today a visitor posted a nice solution to the second problem, which encouraged me to keep solving these problems. Just for fun! 10! = 10 * 9 * … * 3 * 2 * 1 …
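As a rough illustration of the factorial in the excerpt, here is a minimal base R sketch. It assumes the task is to sum the decimal digits of n! (which is what Project Euler problem 20 asks for 100!). The function name factorial_digit_sum is hypothetical and this is not the post author's solution. Digits are kept in a vector so that large factorials do not lose precision.

```r
# Sum of the decimal digits of n!, computed with a digit vector
# (least significant digit first) to avoid floating-point rounding.
# NOTE: factorial_digit_sum is an illustrative name, not from the post.
factorial_digit_sum <- function(n) {
  digits <- 1                       # the number 1, as a single digit
  for (k in seq_len(n)) {
    carry <- 0
    # Multiply every digit by k, propagating the carry upward.
    for (i in seq_along(digits)) {
      prod <- digits[i] * k + carry
      digits[i] <- prod %% 10
      carry <- prod %/% 10
    }
    # Append whatever carry is left as new leading digits.
    while (carry > 0) {
      digits <- c(digits, carry %% 10)
      carry <- carry %/% 10
    }
  }
  sum(digits)
}

# 10! = 3628800, so the digit sum is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27.
print(factorial_digit_sum(10))
```

Calling factorial_digit_sum(100) would then give the quantity the problem asks for. A big-integer package could replace the manual digit arithmetic, at the cost of an extra dependency.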

The infamous apply function

September 23, 2012

For R beginners, the apply() function seems like a secret doorway into programming bliss. It seems so powerful, and yet, beyond reach. For those just starting out, examples of how to use apply() can really help with the intuition of how to h...
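A small example of the kind the excerpt has in mind might look like the sketch below. The matrix and variable names are made up for illustration and are not taken from the post. It shows apply() working over the rows (MARGIN = 1) and the columns (MARGIN = 2) of a matrix.

```r
# A 3 x 4 matrix filled column by column with the values 1..12.
m <- matrix(1:12, nrow = 3, ncol = 4)

# MARGIN = 1: call the function once per row.
row_sums <- apply(m, 1, sum)

# MARGIN = 2: call the function once per column.
col_means <- apply(m, 2, mean)

# An anonymous function works as well: spread (max - min) of each row.
row_ranges <- apply(m, 1, function(x) max(x) - min(x))

print(row_sums)    # 22 26 30
print(col_means)   # 2 5 8 11
print(row_ranges)  # 9 9 9
```

The second argument is the part beginners most often trip over: 1 means "by row" and 2 means "by column". The function passed in receives one row or one column at a time as a plain vector.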

Text Analysis Tutorial on Spam Email in R

September 23, 2012

Hi everyone – I just wrote a tutorial on text analysis in R using the tm and wordcloud packages. Thought some of you here might be interested in it: text-analysis-75-925

Maximum likelihood estimates for multivariate distributions

September 22, 2012

Consider our loss-ALAE dataset and, as in Frees & Valdez (1998), let us fit a parametric model in order to price a reinsurance treaty. The dataset is the following:

    library(evd)
    data(lossalae)
    Z=lossalae
    X=Z;Y=Z ...

Spacing measures: heterogeneity in numerical distributions

Numerically coded data sequences can exhibit a very wide range of distributional characteristics, including near-Gaussian (historically, the most popular working assumption), strongly asymmetric, light- or heavy-tailed, multi-modal, or discrete (e.g., count data).  In addition, numerically coded values can be effectively categorical, either ordered or unordered.  A specific example that illustrates the range of distributional behavior often seen in a collection...

Good programming practices in R

September 22, 2012

I write sloppy R scripts. It is a byproduct of working with a high-level language that allows you to quickly write functional code on the fly (see this post for a nice description of the problem in Python code) and the result of my limited formal training in computer programming. The lack of formal training

KLEMS (1)

September 22, 2012

This post is actually a homework assignment I did. The data file contains input use, output, quantities, costs, and prices for total U.S. nondurable manufacturing for 1949-2001. The data are defined as follows: K, L, E, M, S = inputs corresponding to capital, labor, energy, materials, and purchased services, with further variables for total output, the respective quantity indexes, ...

Core [still] minus one…

September 22, 2012

Another full day spent working with Jean-Michel Marin on the new edition of Bayesian Core (soon to be Bayesian Essentials with R!) and the remaining hierarchical Bayes chapter… I have reread and completed the regression and GLM chapters and sent them to very friendly colleagues for a last round of comments. Now, I am essentially idle, waiting