
Logistic Regression in R – Part One

September 1, 2015

Please note that an earlier version of this post had to be retracted because it contained some content generated at work. I have since chosen to rewrite the material as a series of posts; please recognize that this may take some time, and apologies for any inconvenience. Logistic regression is used to analyze the

Read more »

Bayesian regression models using Stan in R

September 1, 2015

It seems the summer is coming to an end in London, so I shall take a final look at my ice cream data, which I have been playing around with for the last couple of weeks to predict sales statistics based on temperature. Here I will use the new brms (GitHub, CRAN) package...

Read more »

Kickin’ it with elastic net regression


With the kind of data that I usually work with, overfitting regression models can be a huge problem if I'm not careful. Ridge regression is a really effective technique for thwarting overfitting. It does this by penalizing the L2 norm… Continue reading →
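The shrinkage the post describes is easy to see in miniature. The post works in R, but as a language-agnostic sketch, here is the ridge estimate for a single centered predictor with no intercept, where the closed form is β̂ = Σxᵢyᵢ / (Σxᵢ² + λ); the data and λ values are illustrative only:

```python
# Ridge estimate for one centered predictor, no intercept:
#   beta_hat = sum(x*y) / (sum(x^2) + lam)
# As lam grows, the coefficient shrinks toward zero -- this is how
# penalizing the squared (L2) norm of the weights combats overfitting.
def ridge_slope(x, y, lam):
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]   # already centered
y = [-4.0, -2.0, 0.0, 2.0, 4.0]   # exact slope 2, no noise

print(ridge_slope(x, y, 0.0))    # lam = 0 recovers OLS: 2.0
print(ridge_slope(x, y, 10.0))   # lam = 10 shrinks it: 20/20 = 1.0
```

Elastic net, the post's actual topic, adds an L1 penalty on top of this L2 term, trading some of ridge's shrinkage for lasso-style variable selection.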

Read more »

Evaluating Logistic Regression Models

August 17, 2015

Logistic regression is a technique that is well suited for examining the relationship between a categorical response variable and one or more categorical or continuous predictor variables. The model is generally presented in the following format, where β refers to the parameters and x represents the independent variables: log(odds) = β₀ + β₁x₁ + ... + βₙxₙ. The log(odds), or log-odds ratio, is defined
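Inverting the model form in the excerpt turns a linear predictor back into a probability via the logistic function, p = 1 / (1 + e^(−(β₀ + β₁x₁ + ... + βₙxₙ))). A minimal sketch with hypothetical coefficients (the values are made up for illustration):

```python
import math

# Invert log(odds) = b0 + b1*x to get a probability:
#   p = 1 / (1 + exp(-(b0 + b1*x)))
def predicted_probability(b0, b1, x):
    log_odds = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-log_odds))

# With hypothetical coefficients b0 = -1.5 and b1 = 0.8, at x = 2
# the log-odds are 0.1, so the probability sits just above one half.
p = predicted_probability(-1.5, 0.8, 2.0)
print(round(p, 3))  # 0.525
```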

Read more »

R, Python, and SAS: Getting Started with Linear Regression

August 16, 2015

Consider the linear regression model, $$ y_i=f_i(\boldsymbol{x}|\boldsymbol{\beta})+\varepsilon_i, $$ where $y_i$ is the response or the dependent variable at the $i$th case, $i=1,\cdots,N$, and the predictor or the independent variable is the $\boldsymbol{x}$ term defined in the mean function $f_i(\boldsymbol{x}|\boldsymbol{\beta})$. For simplicity, consider the following simple linear regression (SLR) model, $$ y_i=\beta_0+\beta_1x_i+\varepsilon_i. $$ To obtain the (best) estimate...
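For the SLR model above, the least-squares estimates have the familiar closed form β̂₁ = Sxy/Sxx and β̂₀ = ȳ − β̂₁x̄. Since the post covers Python alongside R and SAS, here is a plain-Python sketch of those formulas (the toy data are illustrative):

```python
# Closed-form least-squares estimates for y_i = b0 + b1*x_i + e_i:
#   b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
#   b0 = ybar - b1 * xbar
def slr_fit(x, y):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

b0, b1 = slr_fit([1, 2, 3, 4], [3, 5, 7, 9])  # data lie exactly on y = 1 + 2x
print(b0, b1)  # 1.0 2.0
```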

Read more »

Bivariate Linear Regression

August 13, 2015

Regression is one of the most important, perhaps even the single most important, fundamental tools for statistical analysis in quite a large number of research areas. It forms the basis of many of the fancy statistical methods currently en vogue in the social sciences. Multilevel analysis and structural equation modeling are perhaps the most widespread and

Read more »

Empirical bias analysis of random effects predictions in linear and logistic mixed model regression

July 30, 2015

In the first technical post in this series, I conducted a numerical investigation of the biasedness of random effect predictions in generalized linear mixed models (GLMM), such as the ones used in the Surgeon Scorecard. I decided to undertake two explorations: first, the behavior of these estimates as more and more data are gathered for each

Read more »

Regression with Multicollinearity Yields Multiple Sets of Equally Good Coefficients

July 6, 2015

The multiple regression equation represents the linear combination of the predictors with the smallest mean-squared error. That linear combination is a factorization of the predictors with the factors equal to the regression weights. You may see the wo...
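The "equally good coefficients" claim is simple to demonstrate: when two predictors are perfectly collinear, infinitely many coefficient pairs yield identical fitted values and identical error. A small sketch with made-up data (the post itself uses R):

```python
# When x2 is an exact copy of x1, only the SUM w1 + w2 is identified:
# every pair with the same sum gives identical fitted values.
def sse(w1, w2, x1, x2, y):
    return sum((yi - (w1 * a + w2 * b)) ** 2
               for a, b, yi in zip(x1, x2, y))

x1 = [1.0, 2.0, 3.0]
x2 = [1.0, 2.0, 3.0]          # perfectly collinear with x1
y  = [2.0, 4.0, 6.0]          # y = 2 * x1 exactly

print(sse(1.0, 1.0, x1, x2, y))   # 0.0
print(sse(2.0, 0.0, x1, x2, y))   # 0.0 -- equally good
print(sse(5.0, -3.0, x1, x2, y))  # 0.0 -- any pair summing to 2 works
```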

Read more »

Heteroscedasticity in Regression — It Matters!

June 7, 2015

R’s main linear and nonlinear regression functions, lm() and nls(), report standard errors for parameter estimates under the assumption of homoscedasticity, a fancy word for a situation that rarely occurs in practice. The assumption is that the (conditional) variance of the response variable is the same at any set of values of the predictor variables. … Continue reading...
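One standard remedy the post is heading toward is a heteroscedasticity-consistent (sandwich) standard error. As a rough sketch of the idea, here is White's HC0 variance for the SLR slope, Var(β̂₁) = Σ(xᵢ−x̄)²eᵢ² / Sxx², next to the classical σ̂²/Sxx that lm() reports; the data are made up so the two estimates visibly diverge:

```python
# Classical vs. HC0 (White) standard error for the SLR slope.
# Classical assumes one common error variance; HC0 lets each
# observation contribute its own squared residual.
def slr_se(x, y):
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * yi for xi, yi in zip(x, y)) / sxx
    b0 = sum(y) / n - b1 * xbar
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    var_classical = (sum(ei ** 2 for ei in e) / (n - 2)) / sxx
    var_hc0 = sum(((xi - xbar) ** 2) * ei ** 2
                  for xi, ei in zip(x, e)) / sxx ** 2
    return var_classical ** 0.5, var_hc0 ** 0.5

# Toy data whose scatter grows with x (heteroscedastic by construction):
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 2.1, 2.7, 4.5, 4.2, 6.9]
se_classical, se_hc0 = slr_se(x, y)
print(se_classical, se_hc0)  # the two disagree, which is the point
```

In R the same correction is available via sandwich::vcovHC() together with lmtest::coeftest().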

Read more »

Simulation-based power analysis using proportional odds logistic regression

May 22, 2015

Consider planning a clinical trial where patients are randomized in permuted blocks of size four to either a 'control' or 'treatment' group. The outcome is measured on an 11-point ordinal scale (e.g., the numerical rating scale for pain). It may be reasonable to evaluate the results of this trial using a proportional odds cumulative logit
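The permuted-block scheme in the excerpt guarantees exact 2:2 balance within every block of four while keeping the order unpredictable. A minimal sketch of that randomization step (the simulation-driven power analysis itself is in the full post):

```python
import random

# Permuted-block randomization: within each block of four, exactly two
# patients are assigned to 'control' and two to 'treatment', in a
# shuffled order, so the arms stay balanced throughout enrollment.
def permuted_blocks(n_blocks, rng):
    allocation = []
    for _ in range(n_blocks):
        block = ['control', 'control', 'treatment', 'treatment']
        rng.shuffle(block)
        allocation.extend(block)
    return allocation

rng = random.Random(42)          # fixed seed for a reproducible sequence
arms = permuted_blocks(3, rng)   # 12 patients in 3 blocks of 4
print(arms)
```

Whatever the seed, every consecutive block of four contains exactly two of each arm, which is the property the trial design relies on.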

Read more »