First commit or initial commit?
When I create a new Git repository, my first commit message tends to be “1st commit”. I’ve been wondering what other people use as their initial commit message. Today I used the gh package to get the first commits of all repositories of the ropensci and ropenscilabs organizations.
The sample might seem a bit small, but I just wanted to start exploring my question. I agree that it means my answer won’t be very conclusive.
Getting all repos for an organization
I’ve come up with a rather inelegant solution to paging: I keep querying the API, one page at a time, until it returns nothing.
```r
library("gh")
library("dplyr")
library("purrr")

get_repos <- function(org){
  ropensci_repos_names <- NULL
  page <- 1
  geht <- TRUE
  while(geht){
    # try() keeps the loop alive if a request fails
    ropensci_repos <- try(gh("/orgs/:org/repos",
                             org = org,
                             page = page))
    # stop on a failed request or when a page comes back empty
    geht <- is.list(ropensci_repos) && length(ropensci_repos) > 0
    if(geht){
      ropensci_repos_names <- c(ropensci_repos_names,
                                vapply(ropensci_repos, "[[", "", "name"))
      page <- page + 1
    }
  }
  return(ropensci_repos_names)
}
head(get_repos(org = "ropenscilabs"))

## [1] "webmockr"      "vcr"           "seasl"         "plater"
## [5] "rnaturalearth" "convertr"
```
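The "query until nothing comes back" idea can be tried without touching the network. Below is a minimal sketch: `fetch_all_pages()`, `fake_fetch()` and `fake_repos` are hypothetical names introduced for illustration, and the fake data merely stands in for the API response.

```r
# Collect records page by page until a page comes back empty.
# `fetch_page` is a hypothetical function: it takes a page number and
# returns a list of records (an empty list when nothing is left).
fetch_all_pages <- function(fetch_page, max_pages = 1000) {
  results <- list()
  for (page in seq_len(max_pages)) {
    records <- fetch_page(page)
    if (length(records) == 0) break
    results <- c(results, records)
  }
  results
}

# Offline stand-in for the API: 5 made-up repositories, served 2 per page.
fake_repos <- lapply(c("a", "b", "c", "d", "e"), function(x) list(name = x))
fake_fetch <- function(page, per_page = 2) {
  start <- (page - 1) * per_page + 1
  if (start > length(fake_repos)) return(list())
  fake_repos[start:min(start + per_page - 1, length(fake_repos))]
}

repo_names <- vapply(fetch_all_pages(fake_fetch), "[[", "", "name")
```

With gh, `fetch_page` could be something like `function(page) gh::gh("/orgs/:org/repos", org = "ropenscilabs", page = page)`. The `gh()` function also has a `.limit` argument, and `.limit = Inf` should request all records without a manual loop.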
Get first commit for a repository
Here I’m doing something quite inefficient. Since the API returns the most recent commits first, I page through all commits and keep the last one. I could instead have used the creation date of the repository to query only the commits made shortly after it.
```r
first_commit <- function(repo, org){
  messages <- NULL
  page <- 1
  geht <- TRUE
  while(geht){
    commits <- try(gh("/repos/:owner/:repo/commits",
                      owner = org,
                      repo = repo,
                      page = page))
    # stop on a failed request or when a page comes back empty
    geht <- is.list(commits) && length(commits) > 0
    if(geht){
      now <- lapply(commits, "[[", "commit")
      now <- lapply(now, "[[", "message")
      messages <- c(messages, unlist(now))
      page <- page + 1
    }
  }
  # commits arrive newest first, so the last message is the first commit
  messages[length(messages)]
}
first_commit("ropenaq", "ropensci")

## [1] "Everything"
```
I’m a bit surprised I chose “Everything” as the first commit message for my ropenaq package, actually. Not because I expect my commit history to be particularly smart, but simply because it’s not “1st commit”.
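The creation-date idea mentioned earlier could look roughly like the sketch below. It is untested against the real API and rests on several assumptions: `first_commit_by_date()`, `oldest_message()`, `parse_gh_time()` and the 30-day `window_days` default are hypothetical, and the approach would return `NA` for a repository whose first commit was pushed later than that window. It relies on the `created_at` field of a repository and on the `until` parameter of the commits endpoint.

```r
# Parse the ISO 8601 UTC timestamps used by the GitHub API
parse_gh_time <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# Message of the oldest commit in a list shaped like the API response
oldest_message <- function(commits) {
  if (length(commits) == 0) return(NA_character_)
  dates <- vapply(commits, function(x) x$commit$author$date, "")
  commits[[which.min(as.numeric(parse_gh_time(dates)))]]$commit$message
}

# Hypothetical alternative to first_commit(); window_days is an assumption
first_commit_by_date <- function(repo, org, window_days = 30) {
  info <- gh::gh("/repos/:owner/:repo", owner = org, repo = repo)
  until <- parse_gh_time(info$created_at) + window_days * 24 * 60 * 60
  commits <- gh::gh("/repos/:owner/:repo/commits",
                    owner = org, repo = repo,
                    until = format(until, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
                    .limit = Inf)
  oldest_message(commits)
}

# Offline check with made-up commits, newest first as the API returns them
fake_commits <- list(
  list(commit = list(message = "Add tests",
                     author = list(date = "2016-03-02T10:00:00Z"))),
  list(commit = list(message = "Everything",
                     author = list(date = "2016-01-15T08:30:00Z")))
)
oldest <- oldest_message(fake_commits)
```

Only `oldest_message()` is exercised here, on fake data. `first_commit_by_date()` needs network access and does not wrap its requests in `try()`, so a failed request would stop it with an error.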
Get all the first commits
```r
first_commits <- get_repos("ropenscilabs") %>%
  map(first_commit, org = "ropenscilabs")
save(first_commits, file = "data/2017-02-21_ropenscilabs_first_commits.RData")

first_commits <- get_repos("ropensci") %>%
  map(first_commit, org = "ropensci")
save(first_commits, file = "data/2017-02-21_ropensci_first_commits.RData")
```
What are the most frequent first commits?
```r
load("data/2017-02-21_ropenscilabs_first_commits.RData")
ropenscilabs <- first_commits
load("data/2017-02-21_ropensci_first_commits.RData")
ropensci <- first_commits

# all_commits rather than all, to avoid masking base::all()
all_commits <- c(unlist(ropenscilabs),
                 unlist(ropensci))
firstc <- tibble::tibble(commit = all_commits)
# lower-case the messages so that "First commit" and "first commit" match
firstc <- mutate(firstc, commit = tolower(commit))
firstc %>%
  group_by(commit) %>%
  summarize(n = n()) %>%
  arrange(desc(n)) %>%
  head(n = 15) %>%
  knitr::kable()
```
commit | n |
---|---|
first commit | 117 |
initial commit | 76 |
added readme | 19 |
added files | 9 |
1st commit | 3 |
create readme.md | 3 |
init | 3 |
added readme file | 2 |
code extracted from mikabr/devtools | 2 |
first comit | 2 |
first commit, added files | 2 |
initial | 2 |
initial import | 2 |
package infrastructure | 2 |
rstudio new package project | 2 |
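As a small base R sketch of the counting step and of the share held by the two most common messages: the `toy` vector is made up for illustration, while the last line uses the two top counts from the table together with the total of 362 repositories in this sample.

```r
# Made-up messages for illustration only (not the real data)
toy <- c("First commit", "first commit", "Initial commit",
         "init", "first commit", "Initial Commit")

# Lower-case first so that capitalisation variants are counted together
counts <- sort(table(tolower(toy)), decreasing = TRUE)
shares <- prop.table(counts)
top_two_share <- sum(shares[c("first commit", "initial commit")])

# Same arithmetic with the real counts: 117 + 76 out of 362 repositories
real_share <- (117 + 76) / 362
round(real_share, 2)
```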
Out of the 362 repositories, 117 used “first commit” as their first commit message and 76 used “initial commit”. In total, about 53% of all repos used one of these two messages, which is fewer than I expected. But maybe rOpenSci repositories are unusual as regards first-commit originality? And you, what is your favourite initial commit message, if you have one?