Statistical Modelling

Classification Trees using the rpart function

September 21, 2010 | Ralph

In a previous post on classification trees we considered using the tree package to fit a classification tree to data divided into known classes. In this post we will look at the alternative function rpart, which is available in the rpart package shipped with the standard R distribution. A classification tree ...
[Read more...]

Classification Trees

September 18, 2010 | Ralph

Decision trees are applied to situations where data is divided into groups rather than investigating a numerical response and its relationship to a set of descriptor variables. There are various implementations of classification trees in R and some commonly used functions are rpart and tree. ...
[Read more...]

10w2170, Banff

September 11, 2010 | xi'an

Last night, we started the Hierarchical Bayesian Methods in Ecology workshop by trading stories. Everyone involved in the programme discussed his/her favourite dataset and corresponding expectations from the course. I found the exchange most interesting, like the one we had two years ago in Gran Paradiso, because of the ... [Read more...]

Variable selection using automatic methods

May 22, 2010 | Ralph

When we have a set of data with a small number of variables we can easily use a manual approach to identifying a good set of variables and the form they take in our statistical model. In other situations we may have a large number of potentially important variables and ... [Read more...]

Linear regression models with robust parameter estimation

May 15, 2010 | Ralph

There are situations in regression modelling where robust methods could be considered to handle unusual observations that do not follow the general trend of the data set. There are various packages in R that provide robust statistical methods which are summarised on the CRAN Robust Task View. As an example ... [Read more...]

Manual variable selection using the dropterm function

May 12, 2010 | Ralph

When fitting a multiple linear regression model to data a natural question is whether a model can be simplified by excluding variables from the model. There are automatic procedures for undertaking these tests but some people prefer to follow a more manual approach to variable selection rather than pressing a ...
[Read more...]

Using the update function during variable selection

May 9, 2010 | Ralph

When fitting statistical models to data where there are multiple variables, we are often interested in adding or removing terms from our model. In cases where there are a large number of terms, it can be quicker to use the update function to start with a formula from a ... [Read more...]

Analysis of Covariance – Extending Simple Linear Regression

April 28, 2010 | Ralph

The simple linear regression model considers the relationship between two variables and in many cases more information will be available that can be used to extend the model. For example, there might be a categorical variable (sometimes known as a covariate) that can be used to divide the data set ...
[Read more...]

Simple Linear Regression

April 23, 2010 | Ralph

One of the most frequently used techniques in statistics is linear regression, where we investigate the potential relationship between a variable of interest (often called the response variable, but there are many other names in use) and a set of one or more variables (known as the independent variables or ...
[Read more...]

Two-way Analysis of Variance (ANOVA)

February 15, 2010 | Ralph

The analysis of variance (ANOVA) model can be extended from making a comparison between multiple groups to take into account additional factors in an experiment. The simplest extension is from one-way to two-way ANOVA where a second factor is included in the model as well as a potential interaction between ...
[Read more...]

One-way ANOVA (cont.)

February 12, 2010 | Ralph

In a previous post we considered using R to fit one-way ANOVA models to data. In this post we consider a few additional ways that we can look at the analysis. In the analysis we made use of the linear model function lm and the analysis could be conducted using ...
[Read more...]

One-way Analysis of Variance (ANOVA)

February 3, 2010 | Ralph

Analysis of Variance (ANOVA) is a commonly used statistical technique for investigating data by comparing the means of subsets of the data. The base case is the one-way ANOVA, which is an extension of the two-sample t test for independent groups covering situations where there are more than two groups being ...
[Read more...]
