Read Big Text Files Column by Column

April 27, 2012

(This article was first published on Econometrics_Help, and kindly contributed to R-bloggers)

Dear R Programmers,

There is a new package, “colbycol”, on CRAN that makes our job easier when we have to read large files (say, more than a GB) into R, especially when we don’t need all of the columns/variables for our analysis. Kudos to the author, Carlos J. Gil Bellosta.

I tried it on a 1.72 GB data file with more than 300 columns and 500,000 rows, of which I was interested in only a few columns. Since it is easy to find out how many columns exist by reading a few lines of the data (also refer to my earlier post and ?readLines), getting what I wanted in R took just a few lines, as below, and less time as well:

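library(colbycol)
# cbc.data and my.df are placeholder object names
cbc.data <- cbc.read.table("D:/XYZ/filename.csv", just.read = c(1, 3, 21, 34, 108, 205, 227), sep = ",")
# then one can convert simply to a data.frame as follows
my.df <- as.data.frame(cbc.data, columns = 1:7, rows = 1:50000)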
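
The column-counting step mentioned above (reading only a few lines rather than the whole file) can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the tiny CSV, its column names, and the `wanted` vector are made up so the snippet runs on its own, and splitting the header on commas assumes no quoted fields contain the separator.

```r
# Write a tiny stand-in CSV so the sketch is self-contained (hypothetical data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,age,income,region",
             "1,34,52000,north",
             "2,41,61000,south"), path)

# Read only the header line instead of loading the whole file.
header <- readLines(path, n = 1)

# Split on the separator to get the column names and their count.
# Assumption: no quoted field contains a comma.
col_names <- strsplit(header, ",", fixed = TRUE)[[1]]
n_cols <- length(col_names)

# Look up the positions of the columns of interest (hypothetical names).
wanted <- c("id", "income")
col_index <- match(wanted, col_names)
```

The resulting positions in `col_index` are the kind of column indices passed to colbycol in the call above.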

Also, refer to the author's quick intro to the package.

Happy programming with R.
