10 R packages every data scientist should know about

February 18, 2013

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

The yhat blog lists 10 R packages they wish they'd known about earlier. Drew Conway calls them "10 reasons to always start your analysis in R". They're all very useful R packages that every data scientist should be aware of. They are:

  1. sqldf (for selecting from data frames using SQL)
  2. forecast (for easy forecasting of time series)
  3. plyr (data aggregation)
  4. stringr (string manipulation)
  5. Database connection packages: RPostgreSQL, RMySQL, RMongo, RODBC, RSQLite
  6. lubridate (time and date manipulation)
  7. ggplot2 (data visualization)
  8. qcc (statistical quality control and QC charts)
  9. reshape2 (data restructuring)
  10. randomForest (random forest predictive models)
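
To give a flavor of what a few of these packages do, here is a minimal sketch that touches four of them (sqldf, plyr, stringr and lubridate). It is not taken from the yhat post: the choice of the built-in mtcars data set, the variable names and the example date are all assumptions made for illustration, and the packages are assumed to be installed already.

```r
# Assumes the packages are installed, e.g.
# install.packages(c("sqldf", "plyr", "stringr", "lubridate"))
library(sqldf)
library(plyr)
library(stringr)
library(lubridate)

# sqldf: run a SQL query directly against a data frame
# (mtcars ships with R; the query is illustrative only)
avg_sql <- sqldf("SELECT cyl, AVG(mpg) AS avg_mpg FROM mtcars GROUP BY cyl")

# plyr: the same per-group average, written as split-apply-combine
avg_plyr <- ddply(mtcars, "cyl", summarise, avg_mpg = mean(mpg))

# stringr: vectorised pattern matching on a character vector
has_plot <- str_detect(c("ggplot2", "plyr"), "plot")

# lubridate: parse a date string, then add 30 calendar days
due <- ymd("2013-02-18") + days(30)

print(avg_sql)
print(avg_plyr)
print(has_plot)
print(due)
```

The sqldf and plyr calls compute the same aggregate in two styles, which is a reasonable way to decide which one suits a given analysis.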

You can find links to all of these packages, along with tips on how to use them, at the link below.

yhat blog: 10 R packages I wish I knew about earlier
