Unshorten URLs in R

April 15, 2013

(This article was first published on Stats and things, and kindly contributed to R-bloggers)

Well, of course, this tip comes out one week after I needed it. The author uses the RCurl package to request the header of the shortened URL and then parses the “location” field in the response. This sort of operation is needed frequently, especially when working with data from Twitter. Twitter now wraps every link, even already-shortened ones, with its own t.co service, so every link in every tweet has to go through a process like this to resolve it to the full URL.
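For reference, a single step of that header-based approach might look roughly like the sketch below. This is not the code from the original tip: the function names `extract_location` and `resolve_once` are hypothetical, and it assumes the RCurl package is installed and that the server answers a header-only request with a standard “Location” header.

```r
# Pull the redirect target out of raw HTTP response headers.
# Returns NA_character_ when there is no Location header.
extract_location <- function(header_text) {
  lines <- unlist(strsplit(header_text, "\r?\n"))
  hits <- grep("^location:", lines, ignore.case = TRUE, value = TRUE)
  if (length(hits) == 0) return(NA_character_)
  trimws(sub("^location:\\s*", "", hits[1], ignore.case = TRUE))
}

# Hypothetical helper: ask for the headers only (nobody = TRUE) and do not
# let curl follow the redirect, so the Location header can be read directly.
resolve_once <- function(url) {
  headers <- RCurl::getURL(url, header = TRUE, nobody = TRUE,
                           followlocation = FALSE)
  extract_location(headers)
}
```

Calling `resolve_once()` on a shortened link would return the next URL in the chain, or `NA` once there is nothing left to follow.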

My solution is very similar to the RLangTip, but instead of using RCurl, I make a system call to “curl” and repeatedly request the header for each URL returned until no location attribute is found. At that point, the last URL requested is the final one. It’s a little ugly, and I’m sure it can be sped up and improved upon, but it works well enough.
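The loop described above could be sketched along these lines. Treat it as an illustration under assumptions rather than the original code: the names `curl_headers` and `unshorten`, the `fetch` argument, and the `max_hops` cap are all hypothetical additions, and it assumes a command-line `curl` is available on the PATH and that each “Location” header holds an absolute URL.

```r
# Fetch only the response headers using the command-line curl.
# -s hides progress output and -I sends a HEAD request; redirects are
# not followed because -L is not passed.
curl_headers <- function(url) {
  system2("curl", c("-s", "-I", shQuote(url)), stdout = TRUE)
}

# Follow Location headers one hop at a time until none is returned.
# max_hops (an assumption, not in the original) guards against redirect loops.
unshorten <- function(url, fetch = curl_headers, max_hops = 10) {
  for (i in seq_len(max_hops)) {
    headers <- fetch(url)
    loc <- grep("^location:", headers, ignore.case = TRUE, value = TRUE)
    if (length(loc) == 0) break  # no redirect left: this is the final URL
    url <- trimws(sub("^location:\\s*", "", loc[1], ignore.case = TRUE))
  }
  url
}
```

Something like `sapply(short_urls, unshorten)` would then resolve a whole vector of links, one system call per hop. Passing the fetcher in as an argument also makes it possible to try the loop with canned headers instead of live requests.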

Find me on Twitter… https://twitter.com/corynissen

