A few hours ago Peter Dalgaard (of the R Core Team) announced the release of R 3.0.0! Below you can read the changes in this release. One of the features worth noticing is the introduction of long vectors in R 3.0.0. As David Smith …
The 11th Australasian Data Mining Conference (AusDM 2013), Canberra, Australia, 13-15 November 2013, http://ausdm13.togaware.com. Join us on LinkedIn: http://www.linkedin.com/groups/AusDM-4907891. Data mining, the art and science of intelligent analysis of (usually large) data sets for meaningful (and previously unknown) …
Last week, I posted about statisticians’ constant battle against the belief that the p-value associated (for example) with a regression coefficient is equal to the probability that the null hypothesis is true, for a null hypothesis that beta is zero or negative. I argued that (despite our long pedagogical practice) there are, in fact, many 
Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale-invariant distributions. Heuristically, scale (or unit) invariance means that the density of the measure (or probability function) should be proportional to...
This morning, Mathieu had a nice experience in his course on computational methods in actuarial science. But let us start with some formal mathematical definitions. First, recall that is – somehow – a standard expression. No one should be surprised to see such an expression. Generally (as explained in http://en.wikipedia.org/…), this function is defined only when . The...
Third part on logistic regression (first here, second here). There are two steps in assessing the fit of the model: the first is to determine whether the model fits, using summary measures of goodness of fit or by assessing the predictive ability of the model; the second is to determine whether there are any observations that do not fit the 
Although I suffer from complete ignorance of typography, with a little help from a post from Hyndsight and a post from mages' blog, I wanted to try a different font on the one-pager performance report that we created in Onepager Now with knitR. I do not think Open Sans Light is the best choice for this...
Second part on logistic regression (first one here). In the previous post we used a likelihood ratio test to compare a full and a null model. The same can be done to compare a full and a nested model, to test the contribution of any subset of parameters: Interpretation of coefficients Note: Dohoo do not report the 
We continue to explore the book Veterinary Epidemiologic Research and today we’ll have a look at generalized linear models (GLM), specifically logistic regression (chapter 16). In veterinary epidemiology, the outcome is often dichotomous (yes/no), representing the presence or absence of disease or mortality. We code 1 for the presence of the outcome and 0 