Datasets to Practice Your Data Mining

[This article was first published on RDataMining, and kindly contributed to R-bloggers.]

Many datasets are freely available online for research use. Some of them are listed below.

The R Datasets Package:
There are around 90 datasets available in the package. Most of them are small and easy to feed into functions in R.
List the datasets it contains with the statement below:
> library(help="datasets")
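
As a minimal sketch that goes one step beyond the help page, the snippet below lists the datasets programmatically and then loads one of them. The choice of `iris` is only an example; any name in the `Item` column could be used in its place.

```r
# List the datasets shipped with the 'datasets' package.
# data() returns an object whose $results element is a character matrix
# with the columns "Package", "LibPath", "Item" and "Title".
ds <- data(package = "datasets")$results
head(ds[, c("Item", "Title")])

# Load one dataset by name and inspect it (iris is just an example choice)
data(iris)
str(iris)
summary(iris)
```

Because the package is attached by default in a standard R session, the datasets can also be used directly by name without calling `data()` first.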

Frequent Itemset Mining Dataset Repository:
click-stream data, retail market basket data, traffic accident data and web HTML document data (large size!).
The website also provides implementations of many algorithms for frequent itemset and association rule mining.
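
To give a feel for what such data looks like in R, here is a rough base-R sketch that computes the support of single items. It assumes the files use one transaction per line with space-separated integer item IDs; the four sample transactions, the file name in the comment and the 0.5 threshold are all made up for illustration, and this is only the first counting step, not a full frequent itemset algorithm.

```r
# Hypothetical sample, assuming one transaction per line with items
# given as space-separated integer IDs.
lines <- c("1 2 3", "1 2", "2 3 4", "1 2 4")
# With a downloaded file you would instead read it from disk, e.g.
# lines <- readLines("transactions.dat")   # hypothetical file name

# Split each line into a vector of item IDs
transactions <- lapply(strsplit(lines, " +"), as.integer)

# Support of an item = fraction of transactions that contain it
item_counts <- table(unlist(lapply(transactions, unique)))
support <- item_counts / length(transactions)

# Keep the items that meet a minimum support threshold (assumed value)
min_support <- 0.5
frequent_items <- support[support >= min_support]
print(frequent_items)
```

For real mining tasks, a dedicated package or one of the implementations linked from the repository is a better fit than this hand-rolled count.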

ACM KDD Cup:
the annual Data Mining and Knowledge Discovery competition organized by ACM SIGKDD, targeting real-world problems

UCI KDD Archive:
an online repository of large data sets which encompasses a wide variety of data types, analysis tasks, and application areas

UCI Machine Learning Repository:
a collection of databases, domain theories, and data generators

CMU StatLib Datasets Archive

Time Series Data Library:
a collection of about 800 time series drawn from many different fields

EconData:
a source of economic time series data from Inforum, at the University of Maryland

UCR Time Series Data Archive:
data for time series classification and clustering

GeoDa Center:
a collection of spatial data

Links to the datasets above are provided on the RDataMining website, and more datasets will be added there later.

Yanchang Zhao

RDataMining: http://www.rdatamining.com
Twitter: http://www.twitter.com/RDataMining
Group: http://group.rdatamining.com

