R Commander – data manipulation and summaries

June 13, 2010

(This article was first published on Software for Exploratory Data Analysis and Statistical Modelling, and kindly contributed to R-bloggers)

Previously we considered the R Commander interface as a simple GUI for the R statistical software system. Here we will look at how to manipulate data and create basic statistical summaries of data sets.

The R Commander GUI has two menus, “Data” and “Statistics”, that are used for manipulating data sets, calculating descriptive statistics and applying various commonly used statistical techniques. In the “Data” menu there is a sub-menu “Manage variables in active data set” that has some useful features. These include:

  • Compute new variables – used for transforming variables, e.g. converting to a logarithmic scale.
  • Standardise variables – centre each variable on its mean and scale it by its standard deviation.
  • Convert numeric variables to factors – this is useful for categorical data recorded as numbers, where we want to treat the values as factor levels rather than as quantities.
  • Bin numeric variable – in some situations converting a continuous measurement to groups can make exploratory analysis easier.
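
The same four operations can also be sketched directly in R. This is a minimal base R illustration, not the code that R Commander itself writes to its script window (which may use different helper functions). It assumes the built-in mtcars data frame as a stand-in for the active data set, and the new column names and bin labels are made up for the example.

```r
# Assumption: the built-in mtcars data frame stands in for the active data set.
dat <- mtcars

# Compute new variable: miles per gallon on a logarithmic scale.
dat$log_mpg <- log(dat$mpg)

# Standardise variable: centre on the mean, divide by the standard deviation.
dat$z_hp <- as.numeric(scale(dat$hp))

# Convert numeric variable to factor: cylinders are recorded as 4, 6 and 8.
dat$cyl_f <- factor(dat$cyl)

# Bin numeric variable: three equal-width intervals of weight
# (the labels are hypothetical, chosen for this example).
dat$wt_bin <- cut(dat$wt, breaks = 3, labels = c("light", "medium", "heavy"))

# Inspect the new columns.
str(dat[, c("log_mpg", "z_hp", "cyl_f", "wt_bin")])
```

After standardising, the new column has mean zero and standard deviation one, which is a quick way to check that the transformation did what was intended.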

The “Statistics” menu provides access to various descriptive and summary statistics via the “Summaries” sub-menu including:

  • Numerical summaries – mean, standard deviation and quantiles for a variable.
  • Frequency distributions – used to create tables that summarise the number of times each level of a factor occurs.
  • Table of statistics – a statistic such as the mean or standard deviation of a numeric variable, calculated for each group of a categorical variable.
  • Correlation matrix – the correlation between a set of numeric variables in a data frame.
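
For readers who want to see what these summaries look like in code, here is a rough base R sketch of each one. It again assumes the built-in mtcars data frame in place of the active data set, and the object names are illustrative only. R Commander's own generated commands may differ from these calls.

```r
# Assumption: mtcars stands in for the active data set.
dat <- mtcars
dat$cyl_f <- factor(dat$cyl)

# Numerical summaries: mean, standard deviation and quartiles of one variable.
num_summary <- c(mean = mean(dat$mpg),
                 sd = sd(dat$mpg),
                 quantile(dat$mpg, c(0.25, 0.5, 0.75)))

# Frequency distribution: how often each factor level occurs.
freq <- table(dat$cyl_f)

# Table of statistics: mean of a numeric variable within each group.
group_means <- tapply(dat$mpg, dat$cyl_f, mean)

# Correlation matrix for a set of numeric variables.
cor_mat <- cor(dat[, c("mpg", "hp", "wt")])

print(num_summary)
print(freq)
print(group_means)
print(round(cor_mat, 2))
```

Swapping mean for sd in the tapply() call gives the group standard deviations instead.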

There are other data manipulation options and summary functions available from these two menus.

Other useful resources are provided on the Supplementary Material page.
