#Altmetrics on CiteULike entries in R

September 19, 2015

(This article was first published on chem-bla-ics, and kindly contributed to R-bloggers)

I wanted to know when the publications I was aggregating on CiteULike were published: the number of publications per year, for example. A quick Google search did not turn up an R client package for the CiteULike API, and because I wanted to play with JSON in R anyway, I created a citeuliker package. Because I’m a liker of CiteULike (see these posts). Well, to me that makes sense.

citeuliker uses jsonlite, plyr, and curl (and testthat for testing). The first converts the JSON returned by the API to an R data structure. The package unfolds the “published” field, so that I can more easily plot things by year. I use this code for that:

    data[,"year"] <- laply(data[,"published"], function(x) {
      if (length(x) < 1) return(NA) else return(x[1])
    })
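The same unfolding can be sketched in base R without plyr. This is only an illustration: the `published` list below is made-up stand-in data, assuming each entry holds the date parts with the year first (the real structure returned by the CiteULike API may differ), and `first_or_na` is a hypothetical helper name.

```r
# Made-up stand-in for the "published" column: one vector per entry,
# year first; an empty vector means the entry has no date.
published <- list(c("2015", "9", "19"), character(0), c("1777"))

# Hypothetical helper: first element if present, NA otherwise.
first_or_na <- function(x) if (length(x) < 1) NA_character_ else x[1]

# vapply() guarantees a character vector with one value per entry.
year <- vapply(published, first_or_na, character(1))
print(year)
```

Using `vapply()` rather than `sapply()` makes the result type explicit, which helps when some entries have no date at all.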

The laply() function comes from the plyr package. For example, to see when the publications I collected in my CiteULike library were published, I type:

    barplot(table(citeuliker::getData(user="egonw")[,"year"]))

That then looks like the plot in the top-right of this post. And, yes, I have a publication from 1777 in my library 🙂 See the reference at the bottom of this page.
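To see what barplot() is actually fed, here is a small sketch of the counting step on its own. The years are made up for illustration; the point is that table() drops missing years unless told otherwise.

```r
# Made-up years, including one entry without a publication year.
year <- c("2015", "2014", "2015", NA, "1777")

# table() silently drops NA by default ...
counts <- table(year)
# ... unless asked to keep it as its own category.
counts_with_na <- table(year, useNA = "ifany")

print(counts)
print(counts_with_na)
# barplot(counts) would then draw one bar per year.
```

So entries without a year do not show up in the bar plot; `useNA = "ifany"` makes them visible as a separate bar.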

Getting all the DOIs from my library is trivial too now:

    data <- citeuliker::getData(user="egonw")
    dois <- as.vector(na.omit(data[,"doi"]))

I guess the as.vector() call to remove attributes can be done more efficiently; suggestions welcome.
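One possible alternative, as a sketch only: index with !is.na() so that no attributes are attached in the first place. The data frame and DOIs below are made up and merely stand in for what getData() returns.

```r
# Made-up stand-in for the data frame returned by getData().
data <- data.frame(
  doi = c("10.1000/a", NA, "10.1000/b"),
  stringsAsFactors = FALSE
)

# na.omit() attaches an "na.action" attribute to its result ...
with_attrs <- na.omit(data[, "doi"])
# ... whereas logical indexing returns a plain vector directly.
dois <- data$doi[!is.na(data$doi)]

print(dois)
```

Both routes give the same values; the second just skips the extra as.vector() step.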

Now, this makes it really easy to aggregate #altmetrics, because the rOpenSci people provide the rAltmetric package, and I can simply do (continuing from the above):

    library(rAltmetric)
    acuna <- altmetrics(doi=dois[6])
    acuna_data <- altmetric_data(acuna)
    plot(acuna)

And then I get something like this:

Following the tutorial, I can easily get #altmetrics for all my DOIs, and plot a histogram of my Altmetric scores (make sure you have the plyr library loaded):

    raw_metrics <- lapply(dois, function(x) altmetrics(doi = x))
    metric_data <- ldply(raw_metrics, altmetric_data)
    hist(metric_data$score, main="Altmetric scores", xlab="score")

That gives me the following distribution:
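As an aside on the fetch-and-combine pattern used above: a loop over many DOIs may hit one for which the lookup fails. How altmetrics() behaves in that case is not covered here, so the sketch below uses a hypothetical stand-in, `fetch_metrics`, with made-up DOIs and fake scores, just to show one defensive way to collect per-DOI results into a single data frame in base R.

```r
# Hypothetical stand-in for a per-DOI lookup; it fails for one input to
# mimic a DOI that a remote service does not know. Scores are fake.
fetch_metrics <- function(doi) {
  if (doi == "10.1000/unknown") stop("no metrics for ", doi)
  data.frame(doi = doi, score = nchar(doi), stringsAsFactors = FALSE)
}

dois <- c("10.1000/a", "10.1000/unknown", "10.1000/bcd")

# Return NULL instead of aborting the whole loop on a failed lookup.
safe_fetch <- function(doi) {
  tryCatch(fetch_metrics(doi), error = function(e) NULL)
}

raw <- lapply(dois, safe_fetch)
# Drop the failed lookups, then stack the remaining one-row data frames.
ok <- Filter(Negate(is.null), raw)
metric_data <- do.call(rbind, ok)
print(metric_data)
```

The `do.call(rbind, ...)` step plays the role that ldply() plays in the plyr version: it turns a list of per-DOI results into one table.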

The percentile statistics are also useful to me. After all, there is a clear pressure to have impact with your research. Getting your research known is a first step there. That’s why we submit abstracts for orals and posters too. Advertisement. Anyway, there is enough to be said about how useful #altmetrics are, and my main interest is in using them to see what people say about that, but I don’t have time now to do anything with that (it’s about time for dinner and Dr. Who).

But, as a last plot, and happy that my online presence is useful for something, here is a plot of the percentile of each of my papers within the journal it was published in against its percentile in the full Altmetric.com corpus:

    plot(
      as.vector(metric_data$context.all.pct),
      as.vector(metric_data$context.journal.pct),
      xlab="pct all", ylab="pct journal"
    )
    abline(0,1)

This is the result:

This figure shows that my social campaign puts many of my publications in the top 10%. That’s a start. Of course, these scores do not map one-to-one to citations, which many value more, even though citation counts do not reflect true impact well either. Sadly, scientists here commonly ignore that the citation count also includes cito:disagreesWith and cito:citesAsAuthority.
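What the percentile plot shows can also be put into numbers. The sketch below uses made-up percentiles in place of the two columns plotted above, and assumes they are numeric on a 0 to 100 scale (the real columns may need as.numeric() first).

```r
# Made-up percentiles (0-100) standing in for the two plotted columns.
pct_all     <- c(95, 40, 91, 75, 99)
pct_journal <- c(80, 55, 97, 60, 92)

# Share of papers at or above the 90th percentile in each context.
top_all     <- mean(pct_all >= 90)
top_journal <- mean(pct_journal >= 90)

# Points above the abline(0, 1) diagonal rank higher within their
# journal than within the full corpus.
above_diagonal <- sum(pct_journal > pct_all)

print(c(all = top_all, journal = top_journal))
print(above_diagonal)
```

Taking mean() of a logical vector gives the fraction of TRUE values, which is a compact way to get the share of papers above a threshold.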

Anyways… I think I need other R packages for getting citation counts from Google Scholar, Web of Science, and Scopus.

Scheele, C. W., 1777. Chemische Abhandlung von der Luft und dem Feuer.
Mietchen, D., Others, M., Anonymous, Hagedorn, G., Jan. 2015. Enabling open science: Wikidata for research. http://dx.doi.org/10.5281/zenodo.13906
