Posts Tagged ‘Statistical Modelling’

Generalized Linear Models – Poisson Regression

June 26, 2011

The Generalized Linear Model (GLM) allows us to model responses with distributions other than the Normal distribution; Normality is one of the assumptions underlying linear regression as it is used in many cases. When the data are counts of events (or items), a discrete distribution is usually more appropriate than approximating with a...

Read more »

Classification Trees using the rpart function

September 21, 2010

In a previous post on classification trees we considered using the tree package to fit a classification tree to data divided into known classes. In this post we will look at the alternative function rpart, from the rpart package that ships as a recommended package with the standard R distribution. A classification tree can be fitted using the rpart...

Read more »

Classification Trees

September 18, 2010

Decision trees are applied to situations where the data are divided into groups, rather than to investigating a numerical response and its relationship to a set of descriptor variables. There are various implementations of classification trees in R, and some commonly used functions are rpart and tree. To illustrate the use of the...

Read more »

10w2170, Banff

September 11, 2010

Last night, we started the Hierarchical Bayesian Methods in Ecology workshop by trading stories. Everyone involved in the programme discussed his/her favourite dataset and corresponding expectations from the course. I found the exchange most interesting, like the one we had two years ago in Gran Paradiso, because of the diversity of approaches to Statistics...

Read more »

Variable selection using automatic methods

May 22, 2010

When we have a set of data with a small number of variables, we can easily use a manual approach to identify a good set of variables and the form they take in our statistical model. In other situations we may have a large number of potentially important variables and it soon becomes a...

Read more »

Linear regression models with robust parameter estimation

May 15, 2010

There are situations in regression modelling where robust methods could be considered to handle unusual observations that do not follow the general trend of the data set. Various packages in R provide robust statistical methods, and these are summarised on the CRAN Robust Task View. As an example of using robust statistical estimation...

Read more »

Manual variable selection using the dropterm function

May 12, 2010

When fitting a multiple linear regression model to data, a natural question is whether the model can be simplified by excluding variables. There are automatic procedures for undertaking these tests, but some people prefer to follow a more manual approach to variable selection rather than pressing a button and taking what...

Read more »

Using the update function during variable selection

May 9, 2010

When fitting statistical models to data where there are multiple variables, we are often interested in adding or removing terms from our model. In cases where there is a large number of terms, it can be quicker to use the update function to start with a formula from a model that we have...

Read more »

Analysis of Covariance – Extending Simple Linear Regression

April 28, 2010

The simple linear regression model considers the relationship between two variables, and in many cases more information will be available that can be used to extend the model. For example, there might be a categorical variable (sometimes known as a covariate) that can be used to divide the data set to fit a separate...

Read more »

Simple Linear Regression

April 23, 2010

One of the most frequently used techniques in statistics is linear regression, where we investigate the potential relationship between a variable of interest (often called the response variable, but there are many other names in use) and a set of one or more variables (known as the independent variables or some other term). Unsurprisingly...

Read more »