Timer progress bar added to pbapply package

[This article was first published on Peter Solymos - R related posts, and kindly contributed to R-bloggers.]

pbapply
is a lightweight R extension package
that adds a progress bar to vectorized R functions (*apply).
The latest addition in version 1.2-0
is the timerProgressBar function, which adds a text-based
progress bar with a timer. It all started with
this pull request.
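As a minimal sketch of calling the timer bar directly in a plain loop, assuming the pbapply package is installed and that `timerProgressBar` mirrors the `txtProgressBar` interface (create, update, close). The loop body, `n_steps`, and `total` are hypothetical stand-ins for real work:

```r
library(pbapply)

n_steps <- 20                        # hypothetical number of iterations
pb <- timerProgressBar(min = 0, max = n_steps)
total <- 0
for (i in seq_len(n_steps)) {
    Sys.sleep(0.01)                  # stand-in for a real computation
    total <- total + i
    setTimerProgressBar(pb, i)       # move the bar to step i
}
close(pb)                            # finish the bar and end the line
```

The pb*apply functions take care of these steps for you, so a loop like this is only needed outside the *apply family.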

This package is the least scientifically sophisticated piece of software
that I have worked on, but it still seems to be popular based on
reverse dependencies and download statistics.
The reason for the buzz is probably that the package
solves a common frustration. The frustration stems from the
fact that (1) vectorized functions do not provide any feedback
about how long the process is going to take,
and (2) there is no unified interface to progress bars.

Hadley Wickham’s plyr package came to the rescue. But for my taste that was overkill. And honestly,
what is the fun in using a package that someone else wrote?
So I decided to integrate the available progress bar types in a single
lightweight package, with options to manipulate the type and style.

Let us see an example from the package help pages:

```r
library(pbapply) # load package
set.seed(1234) # for reproducibility
n <- 200 # sample size
x <- rnorm(n) # predictor
y <- rnorm(n, model.matrix(~x) %*% c(0,1), sd=0.5) # observations
d <- data.frame(y, x) # data
mod <- lm(y ~ x, d) # call to lm
ndat <- model.frame(mod)
B <- 100 # number of bootstrap samples
## bootstrap IDs
bid <- sapply(1:B, function(i) sample(nrow(ndat), nrow(ndat), TRUE))
## bootstrap function
fun <- function(z) {
    if (missing(z))
        z <- sample(nrow(ndat), nrow(ndat), TRUE)
    coef(lm(mod$call$formula, data=ndat[z,]))
}
```

The function takes a resampling vector as its argument (here we use
columns from the pre-defined bid matrix). When the argument is missing,
it generates the vector itself. This way we can use the same
function in different vectorized functions.
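The optional-argument pattern can be seen in isolation with a toy sketch in base R. The names `vals`, `boot_mean`, and `ids` are hypothetical and not part of the package; the point is only that one function serves both the "indices supplied" and the "indices drawn inside" call styles:

```r
set.seed(42)                      # for reproducibility
vals <- c(2, 4, 6, 8, 10)         # hypothetical toy data
n <- length(vals)

## statistic on a resample; draws its own indices when z is missing
boot_mean <- function(z) {
    if (missing(z))
        z <- sample(n, n, TRUE)
    mean(vals[z])
}

B <- 4                                              # number of resamples
ids <- sapply(1:B, function(i) sample(n, n, TRUE))  # n x B index matrix
m1 <- apply(ids, 2, boot_mean)                      # indices supplied by column
m2 <- sapply(1:B, function(i) boot_mean(ids[, i]))  # same indices, same result
m3 <- replicate(B, boot_mean())                     # indices drawn inside
```

Because `m1` and `m2` use the same pre-defined indices, they agree exactly, while `m3` draws fresh indices on every call.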

First let’s look at the standard *apply functions, printing out
system time for comparison.

```r
system.time(res1 <- lapply(1:B, function(i) fun(bid[,i])))
##   user  system elapsed
##  0.123   0.008   0.095
system.time(res2 <- sapply(1:B, function(i) fun(bid[,i])))
##   user  system elapsed
##  0.095   0.000   0.096
system.time(res3 <- apply(bid, 2, fun))
##   user  system elapsed
##  0.097   0.002   0.099
system.time(res4 <- replicate(B, fun()))
##   user  system elapsed
##  0.091   0.001   0.092
```

Here is the pb*apply implementation, trying different types and
styles of progress bar. Available progress bar types are timer, text,
Windows (on Windows only), TclTk, or none.

```r
## the default is the shiny new timer progress bar
op <- pboptions(type="timer")
system.time(res1pb <- pblapply(1:B, function(i) fun(bid[,i])))
##   |++++++++++++++++++++++++++++++++++++++++++++++++++| 100% ~00s
##   user  system elapsed
##  0.163   0.010   0.173
pboptions(op) # reset defaults

## text progress bar with percentages
pboptions(type="txt")
system.time(res2pb <- pbsapply(1:B, function(i) fun(bid[,i])))
##  |++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
##   user  system elapsed
##  0.164   0.007   0.174
pboptions(op)

## alternative style with '=' as character
pboptions(type="txt", style=1, char="=")
system.time(res3pb <- pbapply(bid, 2, fun))
##==================================================
##   user  system elapsed
##  0.144   0.006   0.155
pboptions(op)

## now we use ':' isn't it nice?
pboptions(type="txt", char=":")
system.time(res4pb <- pbreplicate(B, fun()))
##  |::::::::::::::::::::::::::::::::::::::::::::::::::| 100%
##   user  system elapsed
##  0.152   0.007   0.162
pboptions(op)
```
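The set-then-reset pattern used with pboptions can be wrapped in a small helper so the previous options are restored even if the computation fails. This is only a sketch: `with_pb_type` is a hypothetical name and not part of pbapply, and it assumes the package is installed.

```r
library(pbapply)

## hypothetical helper: evaluate expr under a given progress bar type,
## then restore the previous options on exit (even on error)
with_pb_type <- function(type, expr) {
    op <- pboptions(type = type)   # returns the former option values
    on.exit(pboptions(op))         # reset when the function exits
    force(expr)                    # the lazy argument is evaluated here
}

old_type <- pboptions()$type
## "none" switches the progress bar off for this call only
res <- with_pb_type("none", pbsapply(1:5, function(i) i^2))
```

Since R evaluates function arguments lazily, the pbsapply call runs only after the option has been changed inside the helper.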

There is clearly an overhead when comparing system times,
which is not a surprise: more calculations take more time.
The good news is that the overhead does not increase
with the size of the problem, so it only takes an extra second or so.

Install the package from your nearest
CRAN mirror
with install.packages("pbapply") and
let me know about any issues you might run into
on the GitHub development site.

UPDATE

Elapsed and remaining time are now shown with the progress bar or throbber.
Version 1.2-1 is now on CRAN.
