Posts Tagged ‘Pragmatic Machine Learning’

Modeling Trick: Impact Coding of Categorical Variables with Many Levels

July 23, 2012

One of the shortcomings of regression (both linear and logistic) is that it doesn’t handle categorical variables with a very large number of possible values (for example, postal codes). You can get around this, of course, by going to another modeling technique, such as Naive Bayes; however, you lose some of the advantages of regression ...


Modeling Trick: Masked Variables

July 1, 2012

A primary problem data scientists face again and again is how to properly adapt or treat variables so they are the best possible components of a regression. Some analysts at this point delegate control to a shape-choosing system like neural nets. I feel such a choice gives up far too much statistical rigor, transparency and ...


Selection in R

June 1, 2012

The design of the statistical programming language R sits in a slightly uncomfortable place between the functional and object-oriented programming paradigms. The upside is that you get a lot of the expressive power of both paradigms. A downside is the not-always-useful variability of the language’s list and object extraction operators.
