Estimating Finite Mixture Models with Flexmix Package

June 9, 2013
By

(This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers)

In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I’ve shown how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with FMM and NLMIXED procedure in SAS. Today, I am going to demonstrate how to achieve the same results with flexmix package in R.

R Code

library(flexmix)

data <- read.table('../data/credit_count.txt', header = TRUE, sep = ',')

set.seed(2013)
class <- FLXPmultinom(~ AGE + ACADMOS + MINORDRG + LOGSPEND)
formula <- "MAJORDRG ~ AGE + ACADMOS + MINORDRG + LOGSPEND"
control <- list(verbose = 10, iter.max = 500, minprior = 0.1, tol = 0.01)

cat("\n### TWO-CLASS FINITE MIXTURE POSSION MODEL ###\n")
mdl1 <- FLXMRglm(family = "poisson")
fit1 <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2, model = mdl1, concomitant = class, control = control)
refit1 <- refit(fit1, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit1, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit1, which = 'concomitant') 

cat("\n### ZERO-INFLATED POSSION MODEL ###\n")
mdl2 <- FLXMRziglm(family = "poisson")
fit <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2 , model = mdl2, concomitant = class, control = control)
refit2 <- refit(fit, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit2, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit2, which = 'concomitant')

R Output for 2-Class Finite Mixture Poisson

### TWO-CLASS FINITE MIXTURE POSSION MODEL ###
Classification: weighted 
   2 Log-likelihood :   -4303.5967 
converged

=== MODEL THE RESPONSE ===
$Comp.1
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -8.0940843  1.6102947 -5.0265 4.996e-07 ***
AGE          0.0114988  0.0129009  0.8913  0.372759    
ACADMOS      0.0045677  0.0020299  2.2502  0.024438 *  
MINORDRG     0.2641256  0.6769000  0.3902  0.696390    
LOGSPEND     0.6826690  0.2224763  3.0685  0.002151 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.44490331  0.34951344 -6.9952 2.650e-12 ***
AGE          0.02214164  0.00662479  3.3422 0.0008311 ***
ACADMOS      0.00052922  0.00077757  0.6806 0.4961209    
MINORDRG     0.05054178  0.04048630  1.2484 0.2118965    
LOGSPEND     0.21398000  0.04127917  5.1837 2.175e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -1.4274523  0.5275366 -2.7059  0.006812 ** 
AGE         -0.0027648  0.0100981 -0.2738  0.784246    
ACADMOS      0.0016143  0.0014455  1.1168  0.264096    
MINORDRG     1.5865202  0.1791074  8.8579 < 2.2e-16 ***
LOGSPEND    -0.0695020  0.0745171 -0.9327  0.350976    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R Output for Zero-Inflated Poisson

### ZERO-INFLATED POSSION MODEL ###
Classification: weighted 
   3 Log-likelihood :   -4175.7431 
converged

=== MODEL THE RESPONSE ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.27782892  0.30030188 -7.5851 3.322e-14 ***
AGE          0.01955236  0.00602142  3.2471  0.001166 ** 
ACADMOS      0.00024907  0.00067394  0.3696  0.711698    
MINORDRG     0.11764062  0.02711666  4.3383 1.436e-05 ***
LOGSPEND     0.16441222  0.03531969  4.6550 3.240e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -1.91132158  0.41706481 -4.5828 4.588e-06 ***
AGE         -0.00081716  0.00841240 -0.0971  0.922617    
ACADMOS      0.00293407  0.00109997  2.6674  0.007644 ** 
MINORDRG     1.44243620  0.13613625 10.5955 < 2.2e-16 ***
LOGSPEND     0.09561048  0.05080464  1.8819  0.059846 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

To leave a comment for the author, please follow the link and comment on his blog: Yet Another Blog in Statistical Computing » S+/R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.