Useful functions in R!

May 29, 2016

(This article was first published on R Language – the data science blog, and kindly contributed to R-bloggers)

I have listed some useful functions below:


The with( ) function applies an expression to a dataset. It is similar to DATA= in SAS.

# with(data, expression)
# example applying a t-test to a data frame mydata 
with(mydata, t.test(y ~ group))
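
For a version that runs without a user-supplied data frame, here is a small sketch using the built-in mtcars dataset. The choice of mtcars and of its mpg and am columns is an assumption made for illustration and is not part of the original example.

```r
# mtcars ships with base R; am is 0 (automatic) or 1 (manual)

# without with(): every column has to be qualified with the data frame name
res1 <- t.test(mtcars$mpg ~ mtcars$am)

# with with(): column names are looked up inside mtcars
res2 <- with(mtcars, t.test(mpg ~ am))

# both calls run the same test, so the p-values agree
all.equal(res1$p.value, res2$p.value)
```

The benefit grows with the number of columns an expression touches, since each one would otherwise need the mtcars$ prefix.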

Please look at other examples here and here.


The by( ) function applies a function to each level of a factor or factors. It is similar to BY processing in SAS.

# by(data, factorlist, function)
# example obtain variable means separately for
# each level of byvar in data frame mydata 
by(mydata, mydata$byvar, function(x) colMeans(x[sapply(x, is.numeric)]))
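
As a runnable sketch of the same idea, the snippet below groups the built-in mtcars dataset by number of cylinders. The dataset and the column names (cyl, mpg, hp) are assumptions chosen for illustration.

```r
# mean mpg for each level of cyl (4, 6 and 8 cylinders)
mpg_by_cyl <- by(mtcars, mtcars$cyl, function(x) mean(x$mpg))
print(mpg_by_cyl)

# several columns at once: colMeans() accepts a data frame of numeric columns
means_by_cyl <- by(mtcars[, c("mpg", "hp")], mtcars$cyl, colMeans)
print(means_by_cyl)
```

The function passed to by( ) receives the subset of rows for one group as a data frame, so it can use any of that subset's columns.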

Please look here for more details.


do.call calls a function with a list of arguments; lapply applies a function to each element of a list.

# sum is used here for illustration
do.call(sum, list(c(1,2,4,1,2), na.rm = TRUE))
lapply(c(1,2,4,1,2), function(x) x + 1)
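
The contrast between do.call and lapply is easier to see side by side. The following is a minimal sketch; the variable names and the data frame built at the end are hypothetical and only serve the illustration.

```r
x <- list(1, 2, 4, 1, 2)

# lapply: the function is called once per element, so the result is a list of 5
plus_one <- lapply(x, function(v) v + 1)

# do.call: the function is called once, with the list elements as its arguments,
# so this is equivalent to sum(1, 2, 4, 1, 2)
total <- do.call(sum, x)

# a common combination: build the pieces with lapply, then bind them with do.call
rows <- lapply(1:3, function(i) data.frame(id = i, square = i^2))
squares <- do.call(rbind, rows)
print(squares)
```

In short, lapply returns one result per element, while do.call returns whatever the single function call returns.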

More examples here.


more() is a user-defined function that is helpful in printing out a large object. Taken from here.

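#to print out an object such as data.frame mydf 20 lines at a time, use:

more(mydf)

#where more() is defined as

more <- function(expr, lines=20) {
  out <- capture.output(expr)
  n <- length(out)
  i <- 1
  while( i <= n ) {
    j <- 0
    # print the next block of up to `lines` lines
    while( j < lines && i <= n ) {
      cat(out[i], "\n")
      j <- j + 1
      i <- i + 1
    }
    # output remains: wait for input (Enter continues with the next block)
    if( i <= n ) {
      rl <- readline()
      if( grepl('^ *q', rl, ignore.case=TRUE) ) i <- n + 1                  # q: quit
      if( grepl('^ *t', rl, ignore.case=TRUE) ) i <- max(1, n - lines + 1)  # t: jump to the tail
      if( grepl('^ *[0-9]', rl) ) i <- floor(as.numeric(rl)/10*n) + 1       # digit d: jump d/10 of the way in
    }
  }
  invisible(out)
}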


options() can be used to increase the limit for max.print in R. More info here.
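
A short sketch of how this might look in practice. The value 5000 is an arbitrary illustrative choice, and saving and restoring the old setting is a habit assumed here rather than something the original post prescribes.

```r
# options() returns the previous values, so they can be restored later
old <- options(max.print = 5000)   # 5000 is an arbitrary illustrative value

getOption("max.print")             # now 5000

# restore the previous setting
options(old)
```

max.print limits the number of entries print() shows, so raising it only helps when output is being cut off with a "reached getOption("max.print")" message.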


To check which columns in the data frame df have missing values

colnames(df)[colSums(is.na(df)) > 0]
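
To see how such a check works step by step, here is a small sketch on a toy data frame. The data frame and its column names (id, score, label) are hypothetical and exist only for this illustration.

```r
# toy data frame: score and label contain NA, id does not
df <- data.frame(
  id    = 1:3,
  score = c(1.5, NA, 3.0),
  label = c(NA, "x", "y")
)

# is.na(df) is a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(df))

# keep only the names of columns with at least one missing value
cols_with_na <- colnames(df)[na_counts > 0]
print(cols_with_na)
```

The intermediate na_counts vector is also useful on its own, since it shows how many values are missing in each column.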
