Blog Archives

Build a search engine in 20 minutes or less

March 27, 2013

…or your money back.

The

author = "Ben Ogorek"
Twitter = "@baogorek"
email = paste0(sub("@", "", Twitter), "@gmail.com")

Setup

Pretend this is Big Data:

doc1 <- "Stray cats are running all over the place. I see 10 a day!"
doc2 <- "Cats are killers. They...

A simple web application using Rook

December 21, 2012

by Ben Ogorek

Rook

I'm grateful to Rook for helping me, a simple statistician, learn a few fundamentals of web technology. For R web application development, there are increasingly polished methods available (most notably Shiny), but you can build one...

Hierarchical linear models and lmer

October 31, 2012

Article by Ben Ogorek
Graphics by Bob Forrest

Background

Hierarchical Linear Models
My last article featured linear models with random slopes. For estimation and prediction, we used the lmer function from the lme4 package.

Today we'll consider another level in the hierarchy, one...

Random regression coefficients using lme4

June 11, 2012

What's the gain over lm()?
By Ben Ogorek

Random effects models have always intrigued me. They offer the flexibility of many parameters under a single unified, cohesive and parsimonious system. But with the growing size of data sets and increased ability to estimate many parameters with a high level of accuracy, will the subtleties of the random effects analysis be lost?

In this...

The lm() function with categorical predictors

April 8, 2012

What's with those estimates?
By Ben Ogorek


In R, categorical variables can be added to a regression using the lm() function without a hint of extra work. But have you ever looked at the resulting estimates and wondered exactly what they were?

First, let's define a data set.
set.seed(12255)
n = 30
sigma = 2.0

AOV.df <- data.frame(category = c(rep("category1", n)
     ...
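
The question the excerpt raises can be previewed with a minimal sketch. It does not reconstruct the article's truncated AOV.df: the data frame `toy`, its three levels, and the group means below are made up for illustration. It assumes R's default treatment contrasts, under which the intercept is the mean of the first factor level and each remaining coefficient is that level's mean minus the first level's mean.

```r
# Hypothetical toy data (NOT the article's AOV.df): three groups of 10.
set.seed(1)
toy <- data.frame(
  category = factor(rep(c("a", "b", "c"), each = 10)),
  y = c(rnorm(10, mean = 5), rnorm(10, mean = 7), rnorm(10, mean = 10))
)

# One-factor regression; lm() builds the dummy variables itself.
fit <- lm(y ~ category, data = toy)
est <- coef(fit)

# Plain per-group sample means, as a named numeric vector.
group_means <- vapply(split(toy$y, toy$category), mean, numeric(1))

# With default treatment contrasts, each comparison should be TRUE:
# intercept = mean of level "a"; other coefficients = differences from "a".
all.equal(est[["(Intercept)"]], group_means[["a"]])
all.equal(est[["categoryb"]], group_means[["b"]] - group_means[["a"]])
all.equal(est[["categoryc"]], group_means[["c"]] - group_means[["a"]])
```

If the contrasts option has been changed from its default (for example to sum-to-zero contrasts), the coefficients take on a different meaning and these comparisons no longer hold.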