2018 through {cranlogs}

[This article was first published on Colin Fay, and kindly contributed to R-bloggers].

2018 at a glance with {cranlogs}.

Let’s load the necessary packages.

library(cranlogs)
library(data.table)
library(lubridate)
## 
## Attaching package: 'lubridate'

## The following objects are masked from 'package:data.table':
## 
##     hour, isoweek, mday, minute, month, quarter, second, wday,
##     week, yday, year

## The following object is masked from 'package:base':
## 
##     date
library(ggplot2)
library(magrittr)

All downloads

We’ll use {cranlogs} to retrieve the data from the RStudio CRAN
mirror.

First, the number of package downloads by day in 2018.

total_dl <- cran_downloads(from = "2018-01-01", to = "2018-12-31")

# Turn to a data.table
setDT(total_dl)

# Round the date to month and week
total_dl[, `:=`(
  round_week = floor_date(date, "week"),
  round_month = floor_date(date, "month")
)]
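A side note on the weekly rounding: with lubridate's default settings, `floor_date(date, "week")` snaps each date back to the preceding Sunday, so the first and last weekly buckets of the year are partial. Here is a minimal offline sketch of that effect. The `toy` table and its constant `count` are made up for illustration; they are not the real download data.

```r
library(data.table)
library(lubridate)

# Toy data (hypothetical): one row per day of 2018 with a constant count
toy <- data.table(
  date  = seq(as.Date("2018-01-01"), as.Date("2018-12-31"), by = "day"),
  count = 1L
)

# Same rounding as used for the real data
toy[, `:=`(
  round_week  = floor_date(date, "week"),
  round_month = floor_date(date, "month")
)]

# How many days fall into each weekly bucket?
days_per_week <- toy[, .(n_days = .N), by = round_week]

head(days_per_week, 2)
tail(days_per_week, 2)
```

Since 2018-01-01 was a Monday, the first bucket is labelled 2017-12-31 and holds only six days, and the last one holds only two. Keep that in mind when reading the first and last bars of a weekly plot.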

How many downloads in total?

total_dl[, .(total = sum(count))]
##        total
## 1: 614548197
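
To put that total in perspective, here is a small sketch that turns it into a daily average. The total is copied from the output above; the variable names are only illustrative.

```r
# Total package downloads for 2018, copied from the output above
total_2018 <- 614548197

# Number of days in the queried range (both ends included)
n_days <- as.integer(as.Date("2018-12-31") - as.Date("2018-01-01")) + 1L

# Average downloads per day
daily_mean <- total_2018 / n_days
round(daily_mean)
```

That works out to roughly 1.68 million package downloads per day.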

Let’s plot this:

random_viridis <- function(n){
  sample(viridis::viridis(100), n)
}
total_dl[, .(count = sum(count)), round_week] %>%
  ggplot(aes(round_week, count)) +
  geom_col(fill = random_viridis(1)) +
  labs(
    title = "Packages downloads by Week on RStudio CRAN mirror",
    subtitle = "data via {cranlogs}",
    x = "week"
  ) +
  theme_minimal()

total_dl[, .(count = sum(count)), round_month] %>%
  ggplot(aes(round_month, count)) +
  geom_col(fill = random_viridis(1)) +
  labs(
    title = "Packages downloads by Month on RStudio CRAN mirror",
    subtitle = "data via {cranlogs}",
    x = "month"
  ) +
  theme_minimal()

R downloads

Let’s now have a look at the number of downloads for R itself:

total_r <- cran_downloads("R", from = "2018-01-01", to = "2018-12-31")

setDT(total_r)

total_r[, `:=`(
  round_week = floor_date(date, "week"),
  round_month = floor_date(date, "month")
)]

How many downloads in total?

total_r[, .(total = sum(count))]
##      total
## 1: 1041727

Plotting this:

total_r[, .(count = sum(count)), round_week] %>%
  ggplot(aes(round_week, count)) +
  geom_col(fill = random_viridis(1)) +
  labs(
    title = "R downloads by Week on RStudio CRAN mirror",
    subtitle = "data via {cranlogs}",
    x = "week"
  ) +
  theme_minimal()

total_r[, .(count = sum(count)), round_month] %>%
  ggplot(aes(round_month, count)) +
  geom_col(fill = random_viridis(1)) +
  labs(
    title = "R downloads by Month on RStudio CRAN mirror",
    subtitle = "data via {cranlogs}",
    x = "month"
  ) +
  theme_minimal()

Let’s have a look at the number of downloads by R
version:

total_r[, .(count = sum(count)), version][order(count, decreasing = TRUE)] %>%
  head(10)
##          version  count
##  1:        3.5.1 464837
##  2:        3.4.3 174665
##  3:        3.5.0 137886
##  4:        3.4.4 107124
##  5:       latest  32642
##  6: 3.5.1patched  32119
##  7:        3.3.3  27992
##  8:        3.5.2  21645
##  9:        devel   8543
## 10:        3.2.4   4814
total_r[, .(count = sum(count)), version][order(count)] %>%
  head(10)
##        version count
##  1:    3.5.1rc     2
##  2:  3.5.2beta     2
##  3:    3.5.2rc     6
##  4:  3.5.0beta     8
##  5:    3.4.4rc    11
##  6: 3.5.0alpha    11
##  7:    3.5.0rc    12
##  8:      2.6.1    13
##  9:      2.8.0    16
## 10:      2.2.1    17
total_r[, .(count = sum(count)), version][order(count, decreasing = TRUE)] %>%
  head(10) %>%
  ggplot(aes(reorder(version, count), count)) +
  coord_flip() +
  geom_col(fill = random_viridis(1)) +
  labs(
    title = "10 most downloaded R versions in 2018 on RStudio CRAN mirror",
    subtitle = "data via {cranlogs}",
    x = "version"
  ) +
  theme_minimal()
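
If you would rather compare release series than individual versions, one possible approach is to collapse each version string to its "major.minor" prefix. The sketch below only uses the ten rows printed in the first table above, so its sums cover the top 10 and not the full data; the `series` column and the regular expression are my own assumptions, not something {cranlogs} provides.

```r
library(data.table)

# The ten most downloaded versions, copied from the table above
top10 <- data.table(
  version = c("3.5.1", "3.4.3", "3.5.0", "3.4.4", "latest",
              "3.5.1patched", "3.3.3", "3.5.2", "devel", "3.2.4"),
  count   = c(464837, 174665, 137886, 107124, 32642,
              32119, 27992, 21645, 8543, 4814)
)

# Keep the leading "major.minor" part; labels without one
# (such as "latest" or "devel") are left unchanged
top10[, series := ifelse(
  grepl("^[0-9]+\\.[0-9]+", version),
  sub("^([0-9]+\\.[0-9]+).*$", "\\1", version),
  version
)]

# Downloads per series, largest first
by_series <- top10[, .(count = sum(count)), by = series][order(-count)]
by_series
```

Within these ten rows, the 3.5 series (including `3.5.1patched`) comes out well ahead of 3.4.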

And by OS:

total_r[, .(total = sum(count)), os]
##     os  total
## 1: osx 228573
## 2: win 767319
## 3: src  42725
## 4:  NA   3110
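
Raw counts are a bit hard to compare at a glance, so here is a small sketch that converts the table above into percentage shares. The numbers are copied from the output; the `share` column is an addition for illustration.

```r
library(data.table)

# Totals by OS, copied from the output above
by_os <- data.table(
  os    = c("osx", "win", "src", NA),
  total = c(228573, 767319, 42725, 3110)
)

# Share of each OS in percent, rounded to one decimal
by_os[, share := round(100 * total / sum(total), 1)]

by_os[order(-share)]
```

The four totals add up to 1041727, the same figure as the overall R download count above, and Windows accounts for close to three quarters of it.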

And a happy new year 🎉🎉
