Estimating Finite Mixture Models with Flexmix Package

June 9, 2013
By

(This article was first published on Yet Another Blog in Statistical Computing » S+/R, and kindly contributed to R-bloggers)

In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I’ve shown how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with FMM and NLMIXED procedure in SAS. Today, I am going to demonstrate how to achieve the same results with flexmix package in R.

R Code

library(flexmix)

data <- read.table('../data/credit_count.txt', header = TRUE, sep = ',')

set.seed(2013)
class <- FLXPmultinom(~ AGE + ACADMOS + MINORDRG + LOGSPEND)
formula <- "MAJORDRG ~ AGE + ACADMOS + MINORDRG + LOGSPEND"
control <- list(verbose = 10, iter.max = 500, minprior = 0.1, tol = 0.01)

cat("\n### TWO-CLASS FINITE MIXTURE POSSION MODEL ###\n")
mdl1 <- FLXMRglm(family = "poisson")
fit1 <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2, model = mdl1, concomitant = class, control = control)
refit1 <- refit(fit1, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit1, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit1, which = 'concomitant') 

cat("\n### ZERO-INFLATED POSSION MODEL ###\n")
mdl2 <- FLXMRziglm(family = "poisson")
fit <- flexmix(as.formula(formula), data = data[data$CARDHLDR == 1, ], k = 2 , model = mdl2, concomitant = class, control = control)
refit2 <- refit(fit, method = 'optim')
cat("\n=== MODEL THE RESPONSE ===\n")
summary(refit2, which = 'model')
cat("\n=== MODEL THE MIXTURE DISTRIBUTION ===\n")
summary(refit2, which = 'concomitant')

R Output for 2-Class Finite Mixture Poisson

### TWO-CLASS FINITE MIXTURE POSSION MODEL ###
Classification: weighted 
   2 Log-likelihood :   -4303.5967 
converged

=== MODEL THE RESPONSE ===
$Comp.1
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -8.0940843  1.6102947 -5.0265 4.996e-07 ***
AGE          0.0114988  0.0129009  0.8913  0.372759    
ACADMOS      0.0045677  0.0020299  2.2502  0.024438 *  
MINORDRG     0.2641256  0.6769000  0.3902  0.696390    
LOGSPEND     0.6826690  0.2224763  3.0685  0.002151 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.44490331  0.34951344 -6.9952 2.650e-12 ***
AGE          0.02214164  0.00662479  3.3422 0.0008311 ***
ACADMOS      0.00052922  0.00077757  0.6806 0.4961209    
MINORDRG     0.05054178  0.04048630  1.2484 0.2118965    
LOGSPEND     0.21398000  0.04127917  5.1837 2.175e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
              Estimate Std. Error z value  Pr(>|z|)    
(Intercept) -1.4274523  0.5275366 -2.7059  0.006812 ** 
AGE         -0.0027648  0.0100981 -0.2738  0.784246    
ACADMOS      0.0016143  0.0014455  1.1168  0.264096    
MINORDRG     1.5865202  0.1791074  8.8579 < 2.2e-16 ***
LOGSPEND    -0.0695020  0.0745171 -0.9327  0.350976    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R Output for Zero-Inflated Poisson

### ZERO-INFLATED POSSION MODEL ###
Classification: weighted 
   3 Log-likelihood :   -4175.7431 
converged

=== MODEL THE RESPONSE ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -2.27782892  0.30030188 -7.5851 3.322e-14 ***
AGE          0.01955236  0.00602142  3.2471  0.001166 ** 
ACADMOS      0.00024907  0.00067394  0.3696  0.711698    
MINORDRG     0.11764062  0.02711666  4.3383 1.436e-05 ***
LOGSPEND     0.16441222  0.03531969  4.6550 3.240e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


=== MODEL THE MIXTURE DISTRIBUTION ===
$Comp.2
               Estimate  Std. Error z value  Pr(>|z|)    
(Intercept) -1.91132158  0.41706481 -4.5828 4.588e-06 ***
AGE         -0.00081716  0.00841240 -0.0971  0.922617    
ACADMOS      0.00293407  0.00109997  2.6674  0.007644 ** 
MINORDRG     1.44243620  0.13613625 10.5955 < 2.2e-16 ***
LOGSPEND     0.09561048  0.05080464  1.8819  0.059846 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

To leave a comment for the author, please follow the link and comment on their blog: Yet Another Blog in Statistical Computing » S+/R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Mango solutions



plotly webpage

dominolab webpage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training

datasociety

http://www.eoda.de





ODSC

ODSC

CRC R books series





Six Sigma Online Training









Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)