More Command-Line Text Munging Utilities

May 19, 2011

(This article was first published on Getting Genetics Done, and kindly contributed to R-bloggers)

In a previous post I linked to gcol as a quick and intuitive alternative to awk. I just stumbled across another set of handy text-file manipulation utilities from the creators of the BEAGLE software for GWAS data imputation and analysis. Alongside several command-line utilities for converting and formatting BEAGLE files, there are tools for basic text-processing tasks on the command line:

  • changecolumn.jar - replace values in a column of an input file.
  • changeline.jar - replace values in a line of an input file.
  • cut.jar - extract columns from a file.
  • filtercolumns.jar - filter columns of input data according to the values in a line.
  • filterlines.jar - filter lines of input data according to the values in a column.
  • paste.jar - paste together files that have shared initial columns followed by data columns.
  • transpose.jar - transpose rows and columns of an input file.

Much of what these tools do can probably be emulated with some creative use of standard Unix commands and pipes. But because they are distributed as Java archives, they should work on any platform with a Java runtime, not just Unix/Linux. Follow the link below for the full list and documentation.

BEAGLE Utilities for text manipulation
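
To give a feel for the kind of task tools like transpose.jar and filterlines.jar handle, here is a minimal Python sketch of both operations. It is not the BEAGLE implementation and does not mirror the utilities' actual options; the function names `transpose` and `filter_lines`, the 0-based column index, and the whitespace-delimited input are all assumptions made for illustration.

```python
def transpose(lines, sep=None):
    """Swap rows and columns of delimited text.

    Hypothetical helper: assumes every non-blank line has the same
    number of fields (sep=None splits on any run of whitespace).
    """
    rows = [line.split(sep) for line in lines if line.strip()]
    if len({len(row) for row in rows}) > 1:
        raise ValueError("rows have differing numbers of fields")
    # zip(*rows) pairs up the i-th field of every row, i.e. column i.
    return [list(column) for column in zip(*rows)]


def filter_lines(lines, column, keep, sep=None):
    """Keep only lines whose value in `column` (0-based) is in `keep`.

    Hypothetical helper: lines too short to have that column are dropped.
    """
    keep = set(keep)
    kept = []
    for line in lines:
        fields = line.split(sep)
        if len(fields) > column and fields[column] in keep:
            kept.append(line.rstrip("\n"))
    return kept


if __name__ == "__main__":
    data = ["rs1 A G\n", "rs2 C T\n", "rs3 A T\n"]
    for row in transpose(data):
        print(" ".join(row))
    for line in filter_lines(data, column=1, keep={"A"}):
        print(line)
```

Both functions read every line into memory, which is fine for small files; for very large genotype files a streaming approach (or the dedicated utilities) would likely be a better fit.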
