Posts Tagged ‘ Statistical Modelling ’

Classification Trees using the rpart function

September 21, 2010
By
Classification Trees using the rpart function

In a previous post on classification trees we considered using the tree package to fit a classification tree to data divided into known classes. In this post we will look at the alternative function rpart that is available within the base R distribution. Fast Tube by Casper A classification tree can be fitted using the rpart function

Read more »

Classification Trees

September 18, 2010
By
Classification Trees

Decision trees are applied to situation where data is divided into groups rather than investigating a numerical response and its relationship to a set of descriptor variables. There are various implementations of classification trees in R and the some commonly used functions are rpart and tree. Fast Tube by Casper To illustrate the use of the tree

Read more »

10w2170, Banff

September 11, 2010
By
10w2170, Banff

Yesterday night, we started the  Hierarchical Bayesian Methods in Ecology workshop by trading stories. Everyone involved in the programme discussed his/her favourite dataset and corresponding expectations from the course. I found the exchange most interesting, like the one we had two years ago in Gran Paradiso, because of the diversity of approaches to Statistics reflected

Read more »

Variable selection using automatic methods

May 22, 2010
By

When we have a set of data with a small number of variables we can easily use a manual approach to identifying a good set of variables and the form they take in our statistical model. In other situations we may have a large number of potentially important variables and it soon becomes a time

Read more »

Linear regression models with robust parameter estimation

May 15, 2010
By

There are situations in regression modelling where robust methods could be considered to handle unusual observations that do not follow the general trend of the data set. There are various packages in R that provide robust statistical methods which are summarised on the CRAN Robust Task View. As an example of using robust statistical estimation in

Read more »

Manual variable selection using the dropterm function

May 12, 2010
By
Manual variable selection using the dropterm function

When fitting a multiple linear regression model to data a natural question is whether a model can be simplified by excluding variables from the model. There are automatic procedures for undertaking these tests but some people prefer to follow a more manual approach to variable selection rather than pressing a button and taking what comes

Read more »

Using the update function during variable selection

May 9, 2010
By

When fitting statistical models to data where there are multiple variables we are often interested in adding or removing terms from our model and in cases where there are a large number of terms it can be quicker to use the update function to start with a formula from a model that we have already

Read more »

Analysis of Covariance – Extending Simple Linear Regression

April 28, 2010
By
Analysis of Covariance – Extending Simple Linear Regression

The simple linear regression model considers the relationship between two variables and in many cases more information will be available that can be used to extend the model. For example, there might be a categorical variable (sometimes known as a covariate) that can be used to divide the data set to fit a separate linear

Read more »

Simple Linear Regression

April 23, 2010
By
Simple Linear Regression

One of the most frequent used techniques in statistics is linear regression where we investigate the potential relationship between a variable of interest (often called the response variable but there are many other names in use) and a set of one of more variables (known as the independent variables or some other term). Unsurprisingly there

Read more »

Two-way Analysis of Variance (ANOVA)

February 15, 2010
By
Two-way Analysis of Variance (ANOVA)

The analysis of variance (ANOVA) model can be extended from making a comparison between multiple groups to take into account additional factors in an experiment. The simplest extension is from one-way to two-way ANOVA where a second factor is included in the model as well as a potential interaction between the two factors. As an example

Read more »