Variational Gaussian Mixtures for Face Detection

[This article was first published on --Jean Arreola--, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Mixture model

A Gaussian mixture model is a probabilistic way of representing
subpopulations within an overall population. We only observe the data,
not the subpopulation from which observation belongs.

We have $N$ random variables observed, each distributed according to a
mixture of K gaussian components. Each gaussian has its own
parameters, and we should be able to estimate the category using
Expectation Maximization, as we are using a latent variables model.

Now, in a bayesian scenario, each parameter of each gaussian is also a
random variable, as well as the mixture weights. To estimate the
distributions we use Variational Inference, which can be seen as a
generalization of the EM algorithm. Be sure to check this
book
to learn all the theory behind gaussian
mixtures and variational inference.

Here is my implementation for Variational Gaussian Mixture Model.

<span class="c1">#Variational Gaussian Mixture Model</span><span class="w">


</span><span class="c1">#Constant for Dirichlet Distribution</span><span class="w">
</span><span class="n">dirConstant</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">alpha</span><span class="p">){</span><span class="w">
  </span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="w">
  </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="nf">length</span><span class="p">(</span><span class="n">alpha</span><span class="p">)){</span><span class="w">
    </span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">gamma</span><span class="p">(</span><span class="n">alpha</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w">
  </span><span class="p">}</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="nf">gamma</span><span class="p">(</span><span class="nf">sum</span><span class="p">(</span><span class="n">alpha</span><span class="p">))</span><span class="o">/</span><span class="n">res</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">


</span><span class="n">BWishart</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">W</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">){</span><span class="w">
  </span><span class="n">D</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">W</span><span class="p">)</span><span class="w">
  </span><span class="n">elem1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="n">det</span><span class="p">(</span><span class="n">W</span><span class="p">))</span><span class="o">^</span><span class="p">(</span><span class="o">-</span><span class="n">v</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w">
  </span><span class="n">elem2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="m">2</span><span class="o">^</span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">D</span><span class="o">/</span><span class="m">2</span><span class="p">))</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="nb">pi</span><span class="o">^</span><span class="p">(</span><span class="n">D</span><span class="o">*</span><span class="p">(</span><span class="n">D</span><span class="m">-1</span><span class="p">)</span><span class="o">/</span><span class="m">4</span><span class="p">))</span><span class="w">
  </span><span class="n">elem3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="w">
  </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">D</span><span class="p">){</span><span class="w">
    </span><span class="n">elem3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">elem3</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">gamma</span><span class="p">((</span><span class="n">v</span><span class="m">+1</span><span class="o">-</span><span class="n">i</span><span class="p">)</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w">
  </span><span class="p">}</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">elem1</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="p">(</span><span class="n">elem2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">elem3</span><span class="p">))</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1">#Log precision expected value</span><span class="w">
</span><span class="n">espLnPres</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">W</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">){</span><span class="w">
  </span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">
  </span><span class="n">D</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">W</span><span class="p">)</span><span class="w">
  </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">D</span><span class="p">){</span><span class="w">
    </span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">digamma</span><span class="p">((</span><span class="n">v</span><span class="m">+1</span><span class="o">-</span><span class="n">i</span><span class="p">)</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w">
  </span><span class="p">}</span><span class="w">
  </span><span class="n">res</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">D</span><span class="o">*</span><span class="nf">log</span><span class="p">(</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">det</span><span class="p">(</span><span class="n">W</span><span class="p">))</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="n">res</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1">#Wishart distribution entropy</span><span class="w">
</span><span class="n">entropyWishart</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">W</span><span class="p">,</span><span class="w"> </span><span class="n">v</span><span class="p">){</span><span class="w">
  </span><span class="n">D</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">W</span><span class="p">)</span><span class="w">
  </span><span class="nf">return</span><span class="p">(</span><span class="o">-</span><span class="nf">log</span><span class="p">(</span><span class="n">BWishart</span><span class="p">(</span><span class="n">W</span><span class="p">,</span><span class="n">v</span><span class="p">))</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="p">((</span><span class="n">v</span><span class="o">-</span><span class="n">D</span><span class="m">-1</span><span class="p">)</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">espLnPres</span><span class="p">(</span><span class="n">W</span><span class="p">,</span><span class="n">v</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="n">v</span><span class="o">*</span><span class="n">D</span><span class="p">)</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">

</span><span class="c1"># Estimating mixture parameters</span><span class="w">

</span><span class="n">vgmm</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">X</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">iter</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">100</span><span class="p">,</span><span class="w"> </span><span class="n">eps</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">0.001</span><span class="p">){</span><span class="w">
  </span><span class="n">D</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ncol</span><span class="p">(</span><span class="n">X</span><span class="p">)</span><span class="w">
  </span><span class="n">N</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">nrow</span><span class="p">(</span><span class="n">X</span><span class="p">)</span><span class="w">

  </span><span class="c1">#Hyperparameters initialization</span><span class="w">
  </span><span class="n">m</span><span class="m">0</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">D</span><span class="p">)</span><span class="w">  </span><span class="c1"># mean</span><span class="w">
  </span><span class="n">W</span><span class="m">0</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">diag</span><span class="p">(</span><span class="n">D</span><span class="p">)</span><span class="w">  </span><span class="c1"># precision</span><span class="w">
  </span><span class="n">v</span><span class="m">0</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">D</span><span class="w">  </span><span class="c1"># degrees of freedom:  n > p-1</span><span class="w">
  </span><span class="n">alpha0</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="o">/</span><span class="n">K</span><span class="w"> </span><span class="c1"># Dirichlet parameter</span><span class="w">
  </span><span class="n">beta0</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">1</span><span class="w">  </span><span class="c1"># Variance for mean</span><span class="w">

  </span><span class="c1">#For each category</span><span class="w">
  </span><span class="c1">#Initialize the means with centroids from k-means</span><span class="w">
  </span><span class="n">mk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">kmeans</span><span class="p">(</span><span class="n">X</span><span class="p">,</span><span class="n">K</span><span class="p">)</span><span class="o">$</span><span class="n">centers</span><span class="w">

  </span><span class="c1">#Initialize presicions with diagonal  matrix</span><span class="w">
  </span><span class="n">Wk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">array</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="n">D</span><span class="p">,</span><span class="w"> </span><span class="n">D</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">))</span><span class="w">

  </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">)</span><span class="w">
    </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">W</span><span class="m">0</span><span class="w">

  </span><span class="n">vk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">v</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">)</span><span class="w">

  </span><span class="c1">#Initialize hyperparameters</span><span class="w">
  </span><span class="n">betak</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">beta0</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">)</span><span class="w">
  </span><span class="n">alphak</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">alpha0</span><span class="p">,</span><span class="n">K</span><span class="p">)</span><span class="w">

  </span><span class="c1"># Necessary terms for calculate responsabilities</span><span class="w">
  </span><span class="n">ln_pres</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="n">K</span><span class="p">)</span><span class="w">
  </span><span class="n">ln_pi</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="n">K</span><span class="p">)</span><span class="w">
  </span><span class="n">E_mu_pres</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">matrix</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">)</span><span class="w">


  </span><span class="c1"># Iterate</span><span class="w">

  </span><span class="k">for</span><span class="p">(</span><span class="n">it</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">iter</span><span class="p">){</span><span class="w">

    </span><span class="c1">#Responsabilities</span><span class="w">
    </span><span class="n">r</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">matrix</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="n">N</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">)</span><span class="w">

    </span><span class="c1">#####################  Variational E-Step  ##########################33</span><span class="w">

    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">

      </span><span class="c1">#Log precision</span><span class="w">
      </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">

      </span><span class="k">for</span><span class="p">(</span><span class="n">j</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">D</span><span class="p">){</span><span class="w">
        </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">digamma</span><span class="p">((</span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">1</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">j</span><span class="p">)</span><span class="w"> </span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w">
      </span><span class="p">}</span><span class="w">

      </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">D</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">det</span><span class="p">(</span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]))</span><span class="w">

      </span><span class="n">alpha</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">alphak</span><span class="p">)</span><span class="w">

      </span><span class="n">ln_pi</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">digamma</span><span class="p">(</span><span class="n">alphak</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="nf">digamma</span><span class="p">(</span><span class="n">alpha</span><span class="p">)</span><span class="w">

      </span><span class="c1">#E[mu,pres] (expected value of joint distribution of mu and pres)</span><span class="w">
      </span><span class="k">for</span><span class="p">(</span><span class="n">k</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">N</span><span class="p">){</span><span class="w">
        </span><span class="n">E_mu_pres</span><span class="p">[</span><span class="n">k</span><span class="p">,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="n">D</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">X</span><span class="p">[</span><span class="n">k</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w"> </span><span class="o">%*%</span><span class="w">
          </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="p">(</span><span class="n">X</span><span class="p">[</span><span class="n">k</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w">  </span><span class="c1">#10.64</span><span class="w">

        </span><span class="n">r</span><span class="p">[</span><span class="n">k</span><span class="p">,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ln_pi</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="p">(</span><span class="n">D</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="nf">log</span><span class="p">(</span><span class="m">2</span><span class="o">*</span><span class="nb">pi</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w">
          </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">E_mu_pres</span><span class="p">[</span><span class="n">k</span><span class="p">,</span><span class="n">i</span><span class="p">]</span><span class="w">

      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="c1"># Exp-log-sum trick for numerical stability</span><span class="w">
    </span><span class="n">rho</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">apply</span><span class="p">(</span><span class="n">r</span><span class="p">,</span><span class="w"> </span><span class="m">1</span><span class="p">,</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">){</span><span class="w">
      </span><span class="n">offset</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">max</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="w">
      </span><span class="n">y</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">x</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">offset</span><span class="w">
      </span><span class="nf">return</span><span class="p">(</span><span class="nf">exp</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="o">/</span><span class="nf">sum</span><span class="p">(</span><span class="nf">exp</span><span class="p">(</span><span class="n">y</span><span class="p">)))</span><span class="w">
    </span><span class="p">})</span><span class="w">

    </span><span class="n">rho</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">rho</span><span class="p">)</span><span class="w">

    </span><span class="c1">########################### Variational M-Step  ##################################</span><span class="w">

    </span><span class="c1"># Auxiliary statistics</span><span class="w">

    </span><span class="n">Nk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">apply</span><span class="p">(</span><span class="n">rho</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="n">sum</span><span class="p">)</span><span class="w">

    </span><span class="c1"># Update means</span><span class="w">

    </span><span class="n">xBark</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">matrix</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="n">K</span><span class="p">,</span><span class="w"> </span><span class="n">D</span><span class="p">)</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">colSums</span><span class="p">(</span><span class="n">rho</span><span class="p">[,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">X</span><span class="p">)</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="c1"># Update covariances</span><span class="w">

    </span><span class="n">Sk</span><span class="w"> </span><span class="o"><-</span><span class="w">  </span><span class="n">array</span><span class="p">(</span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="n">D</span><span class="p">,</span><span class="n">D</span><span class="p">,</span><span class="n">K</span><span class="p">))</span><span class="w">

    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">sum_sk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">
      </span><span class="k">for</span><span class="p">(</span><span class="n">j</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">N</span><span class="p">){</span><span class="w">
        </span><span class="n">sum_sk</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sum_sk</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">rho</span><span class="p">[</span><span class="n">j</span><span class="p">,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">X</span><span class="p">[</span><span class="n">j</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">X</span><span class="p">[</span><span class="n">j</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w">
      </span><span class="p">}</span><span class="w">
      </span><span class="n">Sk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sum_sk</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="c1"># Update hyperparameters</span><span class="w">

    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">beta0</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
      </span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="p">(</span><span class="m">1</span><span class="o">/</span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">beta0</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">m</span><span class="m">0</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w">
      </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">solve</span><span class="p">(</span><span class="n">solve</span><span class="p">(</span><span class="n">W</span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Sk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w">
                         </span><span class="p">((</span><span class="n">beta0</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="p">(</span><span class="n">beta0</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]))</span><span class="w"> </span><span class="o">*</span><span class="w">
                         </span><span class="p">(</span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">m</span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">m</span><span class="m">0</span><span class="p">))</span><span class="w">
      </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">v</span><span class="m">0</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="c1">#ELBO (Evidence Lower Bound)</span><span class="w">

    </span><span class="c1"># ELBO is a sum of seven terms</span><span class="w">

    </span><span class="n">term1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">  </span><span class="c1">#10.71</span><span class="w">

    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">term1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">Nk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="p">(</span><span class="n">D</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">-</span><span class="w">
                                  </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">diag</span><span class="p">(</span><span class="n">Sk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]))</span><span class="w"> </span><span class="o">-</span><span class="w">
                                  </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="w"> </span><span class="n">t</span><span class="p">(</span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,])</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">%*%</span><span class="w">
                                              </span><span class="p">(</span><span class="n">xBark</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,]))</span><span class="w"> </span><span class="o">-</span><span class="w">
                                  </span><span class="n">D</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nb">pi</span><span class="p">))</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="n">term1</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">term1</span><span class="w">

    </span><span class="n">term2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">    </span><span class="c1">#10.72</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">N</span><span class="p">){</span><span class="w">
      </span><span class="k">for</span><span class="p">(</span><span class="n">j</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
        </span><span class="n">term2</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term2</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="n">rho</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">ln_pi</span><span class="p">[</span><span class="n">j</span><span class="p">])</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="n">term3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">    </span><span class="c1">#10.73</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">term3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term3</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">ln_pi</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="n">term3</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term3</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">(</span><span class="n">alpha0</span><span class="w"> </span><span class="m">-1</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">dirConstant</span><span class="p">(</span><span class="n">alpha0</span><span class="p">))</span><span class="w">

    </span><span class="n">term4</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">  </span><span class="c1">#10.74</span><span class="w">
    </span><span class="n">sub</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">term4</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term4</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">D</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">beta0</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nb">pi</span><span class="p">))</span><span class="w">  </span><span class="o">+</span><span class="w"> </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">-</span><span class="w">
        </span><span class="p">((</span><span class="n">D</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">beta0</span><span class="p">)</span><span class="o">/</span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">beta0</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w">
        </span><span class="n">t</span><span class="p">(</span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="o">-</span><span class="n">m</span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="p">(</span><span class="n">mk</span><span class="p">[</span><span class="n">i</span><span class="p">,]</span><span class="o">-</span><span class="n">m</span><span class="m">0</span><span class="p">)</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="n">term4</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">term4</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">K</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">BWishart</span><span class="p">(</span><span class="n">W</span><span class="m">0</span><span class="p">,</span><span class="n">v</span><span class="m">0</span><span class="p">))</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">sub</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sub</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">diag</span><span class="p">(</span><span class="n">solve</span><span class="p">(</span><span class="n">W</span><span class="m">0</span><span class="p">)</span><span class="w"> </span><span class="o">%*%</span><span class="w"> </span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">]))</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="n">term4</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term4</span><span class="w">  </span><span class="o">+</span><span class="w"> </span><span class="nf">sum</span><span class="p">(</span><span class="n">ln_pres</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="p">((</span><span class="n">v</span><span class="m">0</span><span class="o">-</span><span class="n">D</span><span class="m">-1</span><span class="p">)</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">sub</span><span class="w">

    </span><span class="n">term5</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">  </span><span class="c1">#10.75</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">N</span><span class="p">){</span><span class="w">
      </span><span class="k">for</span><span class="p">(</span><span class="n">j</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
        </span><span class="n">stand</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">rho</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">]</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">rho</span><span class="p">[</span><span class="n">i</span><span class="p">,</span><span class="n">j</span><span class="p">])</span><span class="w">
        </span><span class="k">if</span><span class="p">(</span><span class="o">!</span><span class="nf">is.finite</span><span class="p">(</span><span class="n">stand</span><span class="p">))</span><span class="w">
          </span><span class="n">stand</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">
        </span><span class="n">term5</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term5</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">stand</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="n">term6</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">  </span><span class="c1">#10.76</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">term6</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term6</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="n">alphak</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="m">-1</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">ln_pi</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="n">term6</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term6</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">dirConstant</span><span class="p">(</span><span class="n">alphak</span><span class="p">))</span><span class="w">

    </span><span class="n">term7</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">   </span><span class="c1">#10.77</span><span class="w">
    </span><span class="k">for</span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="m">1</span><span class="o">:</span><span class="n">K</span><span class="p">){</span><span class="w">
      </span><span class="n">term7</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term7</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="m">0.5</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="n">ln_pres</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="p">(</span><span class="n">D</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nf">log</span><span class="p">(</span><span class="n">betak</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">/</span><span class="p">(</span><span class="m">2</span><span class="w"> </span><span class="o">*</span><span class="w"> </span><span class="nb">pi</span><span class="p">))</span><span class="w"> </span><span class="o">-</span><span class="w">
        </span><span class="p">(</span><span class="n">D</span><span class="o">/</span><span class="m">2</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">entropyWishart</span><span class="p">(</span><span class="n">Wk</span><span class="p">[,,</span><span class="n">i</span><span class="p">],</span><span class="w"> </span><span class="n">vk</span><span class="p">[</span><span class="n">i</span><span class="p">])</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="k">if</span><span class="p">(</span><span class="n">it</span><span class="w"> </span><span class="o">></span><span class="w"> </span><span class="m">1</span><span class="p">){</span><span class="w">
      </span><span class="n">prevELBO</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ELBO</span><span class="w">
    </span><span class="p">}</span><span class="w">

    </span><span class="n">ELBO</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">term1</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">term2</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">term3</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">term4</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">term5</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">term6</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">term7</span><span class="w">

    </span><span class="c1"># Convergence criteria</span><span class="w">

    </span><span class="k">if</span><span class="p">(</span><span class="n">it</span><span class="w"> </span><span class="...

To leave a comment for the author, please follow the link and comment on their blog: --Jean Arreola--.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)