Data Frames and Transactions

September 24, 2012

(This article was first published on Statistical Research » R, and kindly contributed to R-bloggers)

Transactions are a very useful data structure in data mining.  They provide a way to mine itemsets or rules from datasets.

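In R, the arules package expects the data to be in transactions form before mining.  If the data are only available in a data.frame, the researcher can coerce the data frame to transactions with as().  This example uses the “AdultUCI” data available in the arules package, which originate from the “Census Income” database.  Coercion works on logical and factor columns, so, depending on the arules version, the numeric columns in AdultUCI (such as age) may first need to be discretized.  After loading the package with library(arules) and the data with data("AdultUCI"), the data can be coerced to transactions using the following command: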



Adult = as(AdultUCI, "transactions");
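To make the coercion step concrete, here is a minimal sketch on a small, made-up data frame (the data frame `df`, its columns, and the age bins are assumptions for illustration and are not part of the Adult data).  The numeric column is binned with base R's cut() so that every column is a factor before as() is called.

```r
library(arules)

# Hypothetical toy data frame, invented for this illustration
df <- data.frame(
  age    = c(23, 45, 67),
  sex    = factor(c("Male", "Female", "Female")),
  income = factor(c("small", "large", "small"))
)

# Bin the numeric column into a factor; the break points are arbitrary
df$age <- cut(df$age, breaks = c(0, 30, 60, Inf),
              labels = c("young", "middle", "senior"))

# Each row becomes one transaction; each column=level pair becomes an item
trans <- as(df, "transactions")
inspect(trans)
```

Each item label has the form column=level (for example age=young), so the origin of every item stays visible in any itemsets or rules mined later.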

Transaction data stored in a file can be in either a normalized (single) form or a flat file (basket) form.  In basket form, each record represents one transaction and lists the items in that basket, separated by a delimiter.  In ‘single’ form, each record represents one single item together with the id of the transaction it belongs to.  The following snippet of code shows the read.transactions() function and how basket data is set up.

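library(arules);

# Three transactions, one per line: {1,2}, {1} and {2,3}
my_data = paste("1,2","1","2,3", sep="\n");

write(my_data, file = "my_basket");

# Read the file back as transactions; items on a line are separated by commas
trans = read.transactions("my_basket", format = "basket", sep=",");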
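For comparison, the ‘single’ form can be read in much the same way.  The sketch below is only an illustration: the item names and the temporary file are made up, and the cols argument tells read.transactions() which column holds the transaction id and which holds the item.

```r
library(arules)

# Hypothetical data in 'single' form: one "transaction id,item" pair per line
single_data <- paste("1,bread", "1,milk", "2,bread", "3,milk", "3,eggs",
                     sep = "\n")

# Write to a temporary file so nothing is left in the working directory
single_file <- tempfile()
write(single_data, file = single_file)

# Column 1 holds the transaction id, column 2 holds the item
trans_single <- read.transactions(single_file, format = "single",
                                  sep = ",", cols = c(1, 2))
inspect(trans_single)
```

The five records collapse into three transactions, because rows sharing the same transaction id are grouped together.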


Once the data has been coerced to transactions, it is ready for mining itemsets or rules.  Association rule learning in R operates on these transaction objects.  A very popular algorithm for association rules is the Apriori algorithm, which arules provides as apriori().  I have previously discussed approaches to using Association Rule Learning and the Apriori Algorithm.
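As a rough sketch of that last step, the code below mines rules from a tiny set of transactions with apriori().  The baskets and the support and confidence thresholds are arbitrary assumptions chosen so that a three-transaction example returns something; real data would call for different values.

```r
library(arules)

# Hypothetical baskets, built directly from a list for illustration
baskets <- list(c("bread", "milk"), c("bread"), c("milk", "eggs"))
trans <- as(baskets, "transactions")

# Mine association rules; the thresholds here are arbitrary
rules <- apriori(trans,
                 parameter = list(support = 0.3, confidence = 0.5,
                                  target = "rules"),
                 control = list(verbose = FALSE))

# Show each rule with its quality measures
inspect(rules)
```

Setting target = "frequent itemsets" in the parameter list would return itemsets instead of rules.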

