[This article was first published on Malith's Perspective
, and kindly contributed to R-bloggers
].
The project aimed to improve the postCP package and make it available on CRAN again. The snag that prevented the package from being updated was the recent requirement that .C() calls in the R code use DUP=TRUE, with .Call() suggested in place of .C(). The package therefore had to be reimplemented in the most R-compliant way, separating the model-specific implementation from the core implementation. The core part is implemented in C++ for computational speed. The project page can be found here.
To improve usability, a glm-style “formula” and “family” specification was added. These commits are in the feature-glmsyntax branch. The model-specific part now covers four models (Gaussian, Poisson, Binomial and Gamma), whereas the previous package included only three (Gaussian, Poisson and Binomial). These commits are found here.
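To make the four families concrete, here is a minimal C++ sketch of the per-observation log-densities that a model-specific layer could hand to the core. It is not the postCP code: the `Family` enum, the `log_density` function and the meaning given to `theta` and `aux` are all assumptions made for illustration.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical family tag; these names are illustrative, not postCP's API.
enum class Family { Gaussian, Poisson, Binomial, Gamma };

// Log-density of a single observation x under the chosen family.
// Parameter meaning (assumed for this sketch):
//   Gaussian: theta = mean,                aux = standard deviation
//   Poisson : theta = rate,                aux unused
//   Binomial: theta = success probability, aux = number of trials
//   Gamma   : theta = shape,               aux = scale
double log_density(Family f, double x, double theta, double aux) {
    const double pi = std::acos(-1.0);
    switch (f) {
    case Family::Gaussian: {
        const double z = (x - theta) / aux;
        return -0.5 * std::log(2.0 * pi) - std::log(aux) - 0.5 * z * z;
    }
    case Family::Poisson:
        // lgamma(x + 1) is log(x!)
        return x * std::log(theta) - theta - std::lgamma(x + 1.0);
    case Family::Binomial:
        // log binomial coefficient + x log p + (n - x) log(1 - p)
        return std::lgamma(aux + 1.0) - std::lgamma(x + 1.0)
             - std::lgamma(aux - x + 1.0)
             + x * std::log(theta) + (aux - x) * std::log1p(-theta);
    case Family::Gamma:
        return (theta - 1.0) * std::log(x) - x / aux
             - std::lgamma(theta) - theta * std::log(aux);
    }
    throw std::invalid_argument("unknown family");
}
```

For example, `log_density(Family::Poisson, 0.0, 1.0, 0.0)` evaluates the Poisson log-probability of observing 0 with rate 1. Working on the log scale keeps long products of densities from underflowing.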
The model-specific part was integrated with the core part (the C++ forward-backward algorithm) to give the required results. Following the change point model, a separate section was also added for parameter calculations and parameter updates based on the updated log evidences. Standard error checks were included to catch incorrect user input as far as possible and to give meaningful error messages. Vignettes have been included to provide long-form documentation. The commits are found here.
The development has been completed as agreed in the project proposal. All the branches have been merged into the master branch, which reflects the latest package.
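For readers unfamiliar with the forward-backward step, the following is a rough, self-contained C++ sketch of how it can be run on a change point model. It is not the package's implementation. It assumes a left-to-right chain in which each observation either stays in the current segment or moves to the next one, with equal weight on both moves (a uniform prior over segmentations), and it takes a precomputed matrix `le` of log emission values. The names `Matrix`, `Posterior`, `log_add` and `forward_backward` are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// log(exp(a) + exp(b)), safe when either input is -infinity.
double log_add(double a, double b) {
    const double neg_inf = -std::numeric_limits<double>::infinity();
    if (a == neg_inf) return b;
    if (b == neg_inf) return a;
    const double m = a > b ? a : b;
    return m + std::log(std::exp(a - m) + std::exp(b - m));
}

struct Posterior {
    Matrix state;        // state[i][k] = P(observation i lies in segment k | data)
    double log_evidence; // log of the summed weight of all valid segmentations
};

// le[i][k] = log-density of observation i under segment k (n rows, K columns).
// Assumes n >= K >= 1; the path starts in segment 0 and ends in segment K - 1.
Posterior forward_backward(const Matrix& le) {
    const double neg_inf = -std::numeric_limits<double>::infinity();
    const std::size_t n = le.size(), K = le[0].size();
    Matrix F(n, std::vector<double>(K, neg_inf));
    Matrix B(n, std::vector<double>(K, neg_inf));

    // Forward pass: segment k is reached by staying in k or arriving from k - 1.
    F[0][0] = le[0][0];
    for (std::size_t i = 1; i < n; ++i) {
        for (std::size_t k = 0; k < K; ++k) {
            double prev = F[i - 1][k];
            if (k > 0) prev = log_add(prev, F[i - 1][k - 1]);
            F[i][k] = prev + le[i][k];
        }
    }

    // Backward pass: from segment k the path stays in k or moves on to k + 1.
    B[n - 1][K - 1] = 0.0;
    for (std::size_t i = n - 1; i-- > 0;) {
        for (std::size_t k = 0; k < K; ++k) {
            double next = B[i + 1][k] + le[i + 1][k];
            if (k + 1 < K) next = log_add(next, B[i + 1][k + 1] + le[i + 1][k + 1]);
            B[i][k] = next;
        }
    }

    // Combine both passes into posterior segment-membership probabilities.
    Posterior out{Matrix(n, std::vector<double>(K, 0.0)), F[n - 1][K - 1]};
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < K; ++k)
            out.state[i][k] = std::exp(F[i][k] + B[i][k] - out.log_evidence);
    return out;
}
```

Each pass visits every (observation, segment) pair once, so the cost of this sketch is proportional to n × K, which is linear in the number of observations for a fixed number of segments.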
The completed source code of the package has been included in the following public Google folder for viewing.
Run time analysis
The following table shows a run time analysis of the postCP package, done on a 2.7 GHz machine with 4 GB of RAM running Ubuntu 14.04. It shows linear time complexity, O(N).
[Table: n (iterations) against CPU time (in seconds)]
The mean model and the slope model are given below.
Building the Package
2. Navigate into the postCP folder
3. Build using R / RStudio. (In RStudio, use Ctrl + Shift + B.)
I would like to thank my mentors Gregory Nuel and Guillem Rigaill for all their support and guidance, and for bearing with me for the duration of the project. 😀 I would also like to thank Minh Luong (original creator of the package) for the additional documentation provided while I was trying to understand the previous implementation.
Finally, thank you to GSoC for the great learning opportunity! 😀
The official GSoC Project page can be found here.
Linked Blog posts