(This article was first published on **Daniel Marcelino » R**, and kindly contributed to R-bloggers)

I’d like to explore the capability of my statistical packages to fetch data online and hold it in memory, instead of downloading each dataset by hand. The task turned out to be pretty easy, but it still kept me out of bed for one night looking for the most efficient way to loop over the files and store them the right way. So, let’s start. You can find one file here with a list of web addresses where each file we are about to download is located. These files contain all registered details about the revenues and expenditures of each candidate in the last election in Brazil. That means more than 22 thousand .csv files; each file represents one candidate (i). For this task, I’ll use just the revenue data. Below, I show the steps using R.

```r
require(xlsx)

# Pick the .xlsx file containing the list of URLs (first sheet).
web <- read.xlsx(file.choose(), 1)
mysites <- web$web
rm(web)  # remove it because I need a lot of memory

# Run this code and relax for three or four hours.
big.data <- NULL

for (i in mysites) {
  base <- NULL  # reset each time, so a failed download is not appended twice
  try(base <- read.table(i, sep = ";", header = TRUE, as.is = TRUE,
                         fileEncoding = "windows-1252"), TRUE)
  if (!is.null(base)) big.data <- rbind(big.data, base)
}
```
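Growing `big.data` with `rbind()` inside the loop copies the whole object on every iteration, which gets slower as the table grows. A usually faster pattern, sketched here under the assumption that the same `mysites` vector is available, collects each table in a list and binds everything once at the end:

```r
# Sketch: accumulate tables in a pre-sized list, then bind once.
# Assumes `mysites` is a character vector of URLs, as above.
tables <- vector("list", length(mysites))
for (i in seq_along(mysites)) {
  tables[[i]] <- tryCatch(
    read.table(mysites[i], sep = ";", header = TRUE, as.is = TRUE,
               fileEncoding = "windows-1252"),
    error = function(e) NULL  # leave NULL for files that fail to download/parse
  )
}
# rbind() ignores NULL entries, so failed downloads drop out here.
big.data <- do.call(rbind, tables)
```

This does one large allocation instead of thousands of incremental ones, and the `tryCatch()` keeps a single broken URL from stopping the whole run.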

```r
# ... half a day later
names(big.data)     # variable names
head(big.data, 10)  # first rows
tail(big.data, 10)  # last rows
fix(base)           # inspect the last table read in a spreadsheet-like editor
str(big.data)       # structure of the combined data
```
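After a half-day download it is worth persisting the combined table so the loop never has to run again. A minimal sketch, assuming `big.data` is still in memory (the file names are illustrative):

```r
# Save the combined table once, so the scrape need not be repeated.
saveRDS(big.data, "big_data.rds")  # compact binary format, fast to reload
write.csv2(big.data, "big_data.csv",
           row.names = FALSE)      # semicolon-separated, like the source files

# A later session can reload it instantly:
big.data <- readRDS("big_data.rds")
```

`write.csv2()` uses `;` as separator and `,` as decimal mark, matching the Brazilian source files; `saveRDS()`/`readRDS()` preserve column types exactly.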
