Monthly Archives: May 2013

Revisiting text processing with R and Python

May 25, 2013

Back in 2011, I covered the relative performance difference of the most popular libraries for text processing in R and Python. In case you can’t guess the answer, Python and NLTK won by a significant margin over R and…

Speed trick: Assigning large object NULL is much faster than using rm()!

May 25, 2013

When processing large data sets in R you often also end up creating large temporary objects. In order to keep the memory footprint small, it is always good to remove those temporary objects as soon as possible. When done, removed objects will be deallocated from memory (RAM) the next time the garbage collection runs. Better: Use rm(list="x")...

HOWTO: X11 Forwarding for Oracle R Enterprise

May 25, 2013

Sentiment analysis finds trouble in the Enron emails

May 24, 2013

The Enron email dataset, collected during the FERC investigation of the Enron financial scandal, represents the largest publicly available set of emails. This makes them an ideal testbed for sentiment analysis algorithms. Ikanow's Andrew Strite used the open-source Infinit.e framework and a Hadoop cluster to generate sentiment scores for all of the Enron emails, and then used R to manipulate...

Down and Dirty Forecasting: Part 2

May 24, 2013

This is the second part of the forecasting exercise, where I am looking at a multiple regression. To keep it simple I chose the states that border WI and the US unemployment information for the regression. Again this is a down and dirty analysis, I wo...

What is probabilistic truth? Part 2 – Everything is conditional

May 24, 2013

Read Part 1. When making a statement of the form “1/2 is the correct probability that this coin will land tails”, there are a few things which are left unsaid, but which are typically implied. The statement is one about the probability of an unknown event occurring, and it would seem reasonable to write this

Down and Dirty Forecasting: Part 1

May 24, 2013

I wanted to see what I could do in a hurry using the commands found at Forecasting: Principles and Practice. I chose a simple enough data set of Wisconsin Unemployment from 1976 to the present (April 2013). I kept the last 12 months' worth of...

Compiling R from Source with OpenMP, Accelerate and MKL in OS X

May 24, 2013

I set out to find out whether I could speed up R by compiling it from source and: using Apple's Accelerate Framework; enabling OpenMP (which is disabled under OS X and Windows by default, but enabled under Linux); and using Intel's Math Kernel Library. I also wanted to know how an implicit parallel library,...

Shiny + Concerto = YES !!!

May 23, 2013

So I have finally gotten beta access to the two most powerful R-controlled web application makers in existence and produced very exciting experimental products. A few posts ago I posted a Visual Reasoning Test that I had made by hand and powered wit...

Robert Hijmans on Spatial Data Analysis

May 23, 2013

Last week at the Davis R Users’ Group, Robert Hijmans gave a talk about spatial data analysis in R. Robert is a professor of biogeography at UC Davis and the author of the raster (analysis of gridded data), dismo (species distribution modeling), and geosphere (spherical trigonometry) packages. Robert’s presentation spanned topics including basic...
