Blog post: R-squared can mislead us. Here are two related statistics for a better assessment of regression models.

Following last week's short examination, I wanted to drill down a bit further into the voting behaviour of MEPs, using data from votewatch.eu. Votewatch's data describe how often each MEP voted which way in the European Parliament. For each MEP the number of votes, percentages Yes, No, Abstain, number of elections and...

In previous posts I have looked at how generalized additive models (GAMs) can be used to model non-linear trends in time series data. At the time a number of readers commented that they were interested in modelling data that had more than just a trend component; how do you model data collected throughout the year over many years with...

The latest issue of the American Statistician has a set of thought-provoking point/counterpoint papers on Simpson’s Paradox, with a tie-in to the controversial issue of causality. (I will not address the causality issue here.) Since I have long had my own thoughts about Simpson’s, I’ll postpone the topic I had planned to post this week,

Interactions are the fun, interesting part of ecology; the most enjoyable part of data analysis is trying to understand and derive explanations from the estimated coefficients of your model. However, you do need to know what is behind these estimates: there is a mathematical foundation behind them that you need to be aware

Scraping organism metadata for Treebase repositories from GOLD using Python and R. I recently wanted to get hold of habitat/phenotype/sequencing metadata for the individual organisms of an archived Treebase project. The GOLD database holds more than 18000 full genomes. For many of these it provides pretty good metadata (GOLDcards) which are indirectly linked to...

By Joseph Rickert. Worldwide R user group activity for the first quarter of 2014 appears to be way up compared to previous years, as the following plot shows. The plot was built by counting the meetings on Revolution Analytics' R Community Calendar. R users continue to value the live, in-person events and face-to-face meetings with their peers. Moreover,...

As discussed in the MAT8181 course, there are – at least – two kinds of non-stationary time series: those with a trend, and those with a unit root (the latter are called integrated). Unit root tests cannot be used to assess whether a time series is stationary or not. They can only detect integrated time series. And the same holds...

