## Evaluation of Orthogonal Signal Correction for PLS modeling (OSC-PLS and OPLS)

March 15, 2013
Partial least squares projection to latent structures, or PLS, is one of my favorite modeling algorithms. PLS is an optimal algorithm for predictive modeling using wide data, or data with rows << variables. While there is a wealth of literature regarding the application of PLS to various tasks, I find it especially useful for biological

## How Did I Miss “The Golden Dilemma”?

March 15, 2013
I am ashamed to admit that I am way behind (about 10,127 downloads) in discovering this wonderful paper: The Golden Dilemma (January 8, 2013), Erb, Claude B. and Harvey, Campbell R. Available at SSRN: http://ssrn.com/abstract=2078535. Here are the authors presenting the concept in July 2012 if you prefer slideshow format (thanks...

## Veterinary Epidemiologic Research: GLM – Logistic Regression

March 14, 2013
We continue to explore the book Veterinary Epidemiologic Research and today we’ll have a look at generalized linear models (GLM), specifically logistic regression (chapter 16). In veterinary epidemiology, the outcome is often dichotomous (yes/no), representing the presence or absence of disease or mortality. We code 1 for the presence of the outcome and 0

March 14, 2013
Nomen Est Omen? Lately, the terms "data science" and "data scientist" turn up at an increasing pace in the R-blog-sphere. Since its first occurrence (to my knowledge, "data scientist" was coined by DJ Patil and Jeff Hammerbacher in 2008), th...

## On ENAR, or Statistical Meetings in General

March 14, 2013
Last year I accepted an invitation from Ben to go to ENAR 2013 -- my first ENAR. I used to go to JSM and useR!, and apparently I enjoy useR! most. The reason is not, or not only, because I'm more of a technical person. It is just hard to concentrate at large statistical conferences. I want...

## qdap 0.2.1 Released

March 13, 2013
I’m very pleased to announce the release of qdap 0.2.1. This is the second installment of the qdap package available on CRAN. The qdap package automates many of the tasks associated with quantitative discourse analysis of transcripts containing discourse, including …

## John Snow’s Cholera data in more formats

March 13, 2013
In honour of the bicentenary of John Snow’s birth – and because I was asked to by someone via email – I have now released my digitisation of John Snow’s Cholera data in a few other formats: KML and as Google Fusion Tables. To save you reading my previous blog posts on the subject, I’ll

## New package for ensembling R models

March 13, 2013
I've written a new R package called caretEnsemble for creating ensembles of caret models in R. It currently works well for regression models, and I've written some preliminary support for binary classification models. At th...

## R to Latex packages: Coverage

March 12, 2013
There are now quite a few R packages that turn cross-tables and fitted models into nicely formatted LaTeX. In a previous post I showed how to use one of them to display regression tables on the fly. In this post I summarise what types of R object each of the major packages can deal with.