Estimating Finite Mixture Models with Flexmix Package

[This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I’ve shown how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with FMM and NLMIXED procedure in SAS. Today, I am going to demonstrate how to achieve the same results with flexmix package in R.

R Code

library(flexmix)

data <- read.table('../data/credit_count.txt', header = TRUE, sep = ',')

set.seed(2013)
class <- FLXPmultinom(~ AGE + ACADMOS + MINORDRG + LOGSPEND)
formula <- "MAJORDRG ~ AGE + ACADMOS + MINORDRG + LOGSPEND"
control <- list(verbose = 10, iter.max = 500, minprior = 0.1, tol = 0.01)

cat("\n### TWO-CLASS FINITE MIXTURE POSSION MODEL ###\n")
mdl1 <- FLXMRglm(family = "poisson")
fit1 <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2, model = mdl1, concomitant = class, control = control)
refit1 <- refit(fit1, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit1, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit1, which = 'concomitant') 

cat("\n### ZERO-INFLATED POSSION MODEL ###\n")
mdl2 <- FLXMRziglm(family = "poisson")
fit <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2 , model = mdl2, concomitant = class, control = control)
refit2 <- refit(fit, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit2, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit2, which = 'concomitant')

R Output for 2-Class Finite Mixture Poisson

### TWO-CLASS FINITE MIXTURE POSSION MODEL ###
Classification: weighted 
   2 Log-likelihood :   -4303.5967 
converged

=== MODEL THE RESPONSE ===
$Comp.1
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -8.0940843  1.6102947 -5.0265 4.996e-07 ***
AGE          0.0114988  0.0129009  0.8913  0.372759    
ACADMOS      0.0045677  0.0020299  2.2502  0.024438 *  
MINORDRG     0.2641256  0.6769000  0.3902  0.696390    
LOGSPEND     0.6826690  0.2224763  3.0685  0.002151 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.44490331  0.34951344 -6.9952 2.650e-12 ***
AGE          0.02214164  0.00662479  3.3422 0.0008311 ***
ACADMOS      0.00052922  0.00077757  0.6806 0.4961209    
MINORDRG     0.05054178  0.04048630  1.2484 0.2118965    
LOGSPEND     0.21398000  0.04127917  5.1837 2.175e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -1.4274523  0.5275366 -2.7059  0.006812 ** 
AGE         -0.0027648  0.0100981 -0.2738  0.784246    
ACADMOS      0.0016143  0.0014455  1.1168  0.264096    
MINORDRG     1.5865202  0.1791074  8.8579 < 2.2e-16 ***
LOGSPEND    -0.0695020  0.0745171 -0.9327  0.350976    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R Output for Zero-Inflated Poisson

### ZERO-INFLATED POSSION MODEL ###
Classification: weighted 
   3 Log-likelihood :   -4175.7431 
converged

=== MODEL THE RESPONSE ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.27782892  0.30030188 -7.5851 3.322e-14 ***
AGE          0.01955236  0.00602142  3.2471  0.001166 ** 
ACADMOS      0.00024907  0.00067394  0.3696  0.711698    
MINORDRG     0.11764062  0.02711666  4.3383 1.436e-05 ***
LOGSPEND     0.16441222  0.03531969  4.6550 3.240e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -1.91132158  0.41706481 -4.5828 4.588e-06 ***
AGE         -0.00081716  0.00841240 -0.0971  0.922617    
ACADMOS      0.00293407  0.00109997  2.6674  0.007644 ** 
MINORDRG     1.44243620  0.13613625 10.5955 < 2.2e-16 ***
LOGSPEND     0.09561048  0.05080464  1.8819  0.059846 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

To leave a comment for the author, please follow the link and comment on their blog: Yet Another Blog in Statistical Computing » S+/R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)