“The replicability of social science research is becoming more demanding in the age of big data. First, researchers aiming to replicate a study based on massive data face substantial computational costs. Second, and probably more challenging, they are often confronted with ‘highly unique’ data sets derived and compiled from sources with different and unusual formats (as they are originally generated and recorded for purposes other than data analysis or research). This holds in particular for Internet data from social media, new e-businesses, and digital government. More and more social scientists attempt to exploit these new data sources following ad hoc procedures in the compilation of their data sets.”
The entire post can be found here.
In this context, I also want to point explicitly to the many relevant contributions listed in the CRAN Task View on Web Technologies and Services and the CRAN Open Data Task View, as well as the contributions by rOpenSci.