Coefplot: New Package for Plotting Model Coefficients

Posted on January 3, 2012 by Joseph Rickert in R bloggers | 0 Comments

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]


By Joseph Rickert

Even to the practiced eye, looking at coefficients in R model summaries can be tedious. And capturing information about the significance of coefficients from scores, or maybe even hundreds, of models in a way that makes writing the final report a bit easier is a time-consuming and thankless task. Of course, once you know what you are looking for, it only takes a few lines of code to select coefficients and plot them. Nevertheless, it would be nice to have a function that just plots the coefficients with error bars. coefplot, a relatively recent package by Jared Lander, does exactly this and has the potential to become a very useful tool. Built on top of ggplot2 graphics, coefplot plots coefficients from lm and glm models as well as from the big data models generated by RevoScaleR's rxLinMod and rxLogit functions.

A small example from Revolution Analytics’ Saar Golde illustrates the use of coefplot. The R code reads in credit data (see table) from 10 separate csv files, concatenates them into a single file,

creditScore	houseAge	yearsEmploy	ccDebt	year	default
691	16	9	6725	2000	0
691	4	4	5077	2000	0
743	18	3	3080	2000	0
728	22	1	4345	2000	0
745	17	3	2969	2000	0
539	15	3	4588	2000	0

and uses RevoScaleR’s rxLinMod function to perform the linear regression:

default ~ F(year) + yearsEmploy + ccDebt + creditScore

Note that the F function makes year a factor on the fly so that the regression will produce a coefficient for each year. Running coefplot on the model object produces the graph of the coefficients.
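As a rough illustration of what such a plot involves, the sketch below uses base R's lm as a stand-in for rxLinMod, with factor(year) playing the role of F(year). The data are simulated: the column names follow the table above, but the values and the default mechanism are assumptions, not the original credit data. The code first extracts estimates and standard errors by hand (the interval of plus or minus two standard errors is an illustrative choice), then calls coefplot on the model object if the package happens to be installed.

```r
# Simulated stand-in for the credit data; all values are made up for illustration.
set.seed(42)
n <- 2000
credit <- data.frame(
  creditScore = round(rnorm(n, mean = 700, sd = 50)),
  yearsEmploy = rpois(n, lambda = 4),
  ccDebt      = round(rnorm(n, mean = 4500, sd = 1500)),
  year        = sample(2000:2009, n, replace = TRUE)
)

# Hypothetical default mechanism: more debt and lower scores raise the risk.
p <- plogis(-2 + 0.0004 * (credit$ccDebt - 4500) -
              0.01 * (credit$creditScore - 700))
credit$default <- rbinom(n, size = 1, prob = p)

# lm() stands in for rxLinMod(); factor(year) stands in for F(year).
fit <- lm(default ~ factor(year) + yearsEmploy + ccDebt + creditScore,
          data = credit)

# The manual route: pull estimates and standard errors from the summary table.
est <- coef(summary(fit))
coefTable <- data.frame(
  term      = rownames(est),
  estimate  = est[, "Estimate"],
  lower     = est[, "Estimate"] - 2 * est[, "Std. Error"],
  upper     = est[, "Estimate"] + 2 * est[, "Std. Error"],
  row.names = NULL
)
print(coefTable)

# The one-line route, only attempted if coefplot is installed.
if (requireNamespace("coefplot", quietly = TRUE)) {
  print(coefplot::coefplot(fit))
}
```

The hand-built table is exactly the kind of bookkeeping that a single call on the model object spares you.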

This is slick, but to be really useful, coefplot should be able to handle models with thousands of coefficients. I spoke with Jared about this. He said that he is well aware of the problem and is working on it:

“The big issue is identifying levels that belong to factors, which I solved, even for interactions. But how do people specify levels that might belong to different factors, or how to handle a specified level and its interactions, etc….”
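To make the difficulty concrete: in base R, the model matrix carries an "assign" attribute that maps each coefficient column back to the term that generated it. The sketch below shows one way to recover that mapping for an lm fit. It is not necessarily how coefplot does it, and the data and variable names (toy, grp, x, y) are hypothetical.

```r
# Hypothetical toy data: one factor, one numeric predictor, and their interaction.
set.seed(1)
toy <- data.frame(
  grp = factor(sample(c("a", "b", "c"), 60, replace = TRUE)),
  x   = rnorm(60)
)
toy$y <- rnorm(60)

fit <- lm(y ~ grp * x, data = toy)

# "assign" holds, for each model-matrix column, the index of its term
# (0 means the intercept).
termIndex  <- attr(model.matrix(fit), "assign")
termLabels <- c("(Intercept)", attr(terms(fit), "term.labels"))

# One row per coefficient, paired with the term it belongs to.
coefToTerm <- data.frame(
  coefficient = names(coef(fit)),
  term        = termLabels[termIndex + 1]
)
print(coefToTerm)
```

This tells you which term a coefficient came from, but it does not by itself answer the harder questions in the quote, such as how a user should specify a level that appears in several factors.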

It is difficult to build useful tools, and an amazing feature of the open source R project is that so many people are willing to try. Jared also said that he is open to suggestions.
