Facts About R Packages (2)

June 6, 2012
(This article was first published on Category: R | Huidong Tian, and kindly contributed to R-bloggers)

Are All R Packages Well Maintained?

There are so many R packages. Can they all be trusted, and are they well maintained? To answer this question, we just need to look at their archive histories. If a package has many versions, we can take that as a sign that its authors have spent a lot of time improving it, so such packages can be considered well maintained.

From the above pie chart, we can see that half of the packages have 4 or more versions, and 7% of them have more than 20 versions, suggesting that at least half of R packages are well maintained.

R code for the above pie chart:

Packages Maintenance
library(googleVis)  # provides gvisPieChart() and gvisColumnChart()

# PKG is not defined in this post; it is assumed to be a named list with one
# element per package, holding a "Description" data frame (columns V1, V2)
# and, for packages with archived versions, a "History" data frame.

# Extract the date of the first version, the date of the last version and
# the total number of versions for each package.
pkg.update <- data.frame(pkg.name = names(PKG), Date.1 = NA_character_,
                         Date.2 = NA_character_, Num = NA_integer_,
                         stringsAsFactors = FALSE)
for (i in seq_len(nrow(pkg.update))) {
  pkg <- pkg.update$pkg.name[i]
  pkg.des <- PKG[[pkg]]
  # Publication date of the current version.
  published <- as.character(
    pkg.des$Description$V2[which(pkg.des$Description$V1 == "Published:")])
  if ("History" %in% names(pkg.des)) {
    pkg.update$Date.1[i] <- as.character(min(pkg.des$History$Date))
    pkg.update$Num[i] <- nrow(pkg.des$History) + 1  # archived versions + current one
  } else {
    pkg.update$Date.1[i] <- published
    pkg.update$Num[i] <- 1
  }
  pkg.update$Date.2[i] <- published
}

# Aggregate package maintenance: number of packages per version count.
dat.1 <- with(pkg.update, aggregate(list(pkg.num = pkg.name), list(Num = Num), length))
# Collapse everything past the 20th row into a single "20+" bucket
# (this assumes the first 20 rows are the version counts 1 to 20).
dat.1$pkg.num[21] <- sum(dat.1$pkg.num[-(1:20)])
dat.1$Num <- as.character(dat.1$Num)
dat.1$Num[21] <- "20+"
dat <- dat.1[1:21, ]

# Display the pie chart in googleVis.
Pie <- gvisPieChart(dat)
plot(Pie)
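
The shares read off the pie chart can also be computed directly from the version counts, without going through the chart. The following is only a minimal sketch: `version.shares` is a hypothetical helper, and the vector `num` is made-up toy data standing in for `pkg.update$Num`, not real CRAN counts.

```r
# Hypothetical helper: share of packages falling into each version bucket.
# `num` stands in for pkg.update$Num.
version.shares <- function(num) {
  c(one.version  = mean(num == 1),   # packages never updated
    four.or.more = mean(num >= 4),   # packages with 4 or more versions
    more.than.20 = mean(num > 20))   # packages with more than 20 versions
}

num <- c(1, 1, 2, 3, 4, 5, 8, 12, 21, 30)  # toy data, not CRAN counts
shares <- version.shares(num)
print(round(100 * shares, 1))  # shares in percent
```

Taking the mean of a logical vector gives the proportion of `TRUE` values, which is why no explicit counting is needed here.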

There are 49 packages with more than 50 versions. Take a look and see which of them you have used or heard of.

R code for the above figure:

Packages updated frequently
Column <- gvisColumnChart(pkg.update[pkg.update$Num > 50, ],
                          xvar = "pkg.name", yvar = "Num")
plot(Column)
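
With 49 bars, the chart is easier to scan if the packages are sorted by version count first. A small sketch of that step follows; the data frame `toy` and its package names and counts are invented for illustration and merely mimic the `pkg.name` and `Num` columns of `pkg.update`.

```r
# Toy stand-in for pkg.update; names and counts are made up.
toy <- data.frame(pkg.name = c("pkgA", "pkgB", "pkgC", "pkgD"),
                  Num = c(12, 75, 51, 60),
                  stringsAsFactors = FALSE)

# Keep packages with more than 50 versions, most-versioned first.
frequent <- toy[toy$Num > 50, ]
frequent <- frequent[order(frequent$Num, decreasing = TRUE), ]
print(frequent)
```

Passing a subset sorted this way to `gvisColumnChart` in place of the unsorted one should draw the bars in the same descending order.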

21.3% of the packages have only one version, which suggests that these packages need more maintenance if they are not perfect, or perhaps that they were only recently uploaded to CRAN. Of the packages with no more than 3 versions, most (71.6%) were uploaded in the last two years, and only 15 were last updated before 2007.

R code for the above pie chart:

Packages having no more than 3 versions
dat.2 <- pkg.update[pkg.update$Num < 4, ]
dat.2$Date <- substr(dat.2$Date.2, 1, 4)
dat.3 <- with(dat.2, aggregate(list(Num = Date), list(Year = Date), length))
Pie2 <- gvisPieChart(dat.3)
plot(Pie2)
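
The percentage and the count quoted for this chart can be derived from the same year column. The sketch below uses a made-up vector `date.2` in place of `dat.2$Date.2`, and it assumes that "the last two years" means 2011 and 2012, which the post does not state explicitly.

```r
# Toy stand-in for dat.2$Date.2: last publication dates of packages with
# at most 3 versions. The dates are made up, not CRAN data.
date.2 <- c("2006-03-01", "2009-11-20", "2011-05-02",
            "2011-12-31", "2012-01-15", "2012-04-30")
year <- as.integer(substr(date.2, 1, 4))

# Share of packages per year of last publication.
year.share <- prop.table(table(year))

# Assumption: "the last two years" means 2011 and 2012.
recent.share <- mean(year >= 2011)

# Number of packages last published before 2007.
old.count <- sum(year < 2007)

print(round(100 * year.share, 1))
```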
