Kaleidoscope IIb (useR! 2011)

August 17, 2011

(This article was first published on Why? » R, and kindly contributed to R-bloggers)

L Collingwood – RTextTools

RTextTools. A machine learning library for automated text classification. This package builds on previous packages such as tm and random forests. Use case: undergrad labels congressional bills but then quits. Using the previously labelled data, automatically classify the remaining documents. The speaker gave a nice overview of machine learning techniques, but I was familiar with them so didn’t bother making notes.


  1. Read data;
  2. Missed opps;
  3. Create Corpus;
  4. Train Models – SVM, SLDA, TREE, etc;
  5. Classify models;
  6. Analyze data.

Jason Waddel – The Role of R in Lab Automation

License: free as in free beer and speech!

Summary: a scientist repeats the same experiment multiple times. How can we automate analysis.

R service bus allows a scientist to email/upload data and the results are automatically generated.

High level view

Various inputs such as pop, xml, REST WS. Each input is added to the queue. A pool of R servers handles the job. A simple configuration file handles the set-up.


Please note that the notes/talks section of this post is merely my notes on the presentation. I may have made mistakes: these notes are not guaranteed to be correct. Unless explicitly stated, they represent neither my opinions nor the opinions of my employers. Any errors you can assume to be mine and not the speaker’s. I’m happy to correct any errors you may spot – just let me know!

To leave a comment for the author, please follow the link and comment on their blog: Why? » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training





CRC R books series

Six Sigma Online Training

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)