Monthly Archives: February 2015

Max Kuhn’s Talk on Predictive Modeling

February 25, 2015

Max Kuhn, Director of Nonclinical Statistics at Pfizer and author of Applied Predictive Modeling, joined us on February 17, 2015, and shared his experience with Data Mining

Read more »

Announcing: Introduction to Data Science video course

February 25, 2015

Win-Vector LLC’s Nina Zumel and John Mount are proud to announce that their new data science video course, Introduction to Data Science, is now available on Udemy. We designed the course as an introduction to an advanced topic. The course description is: Use the R Programming Language to execute data science projects and become a data …

Read more »

Using Hadoop Streaming API to perform a word count job in R and C++

February 25, 2015

By Marek Gagolewski, Maciej Bartoszuk, Anna Cena, and Jan Lasek (Rexamine). Introduction: In a recent blog post, we explained how we managed to set up a working Hadoop environment on a few CentOS 7 machines. To test the installation, let’s play…

Read more »

How Big Is The Vatican City?

February 24, 2015

“You say that the river finds its way to the sea, and like the river you will come to me” (Miss Sarajevo, U2). One way to approximate the area of a place is to enclose it in a polygon whose area you know. After that, generate coordinates inside the polygon and count how many of them fall into …

Read more »

Visualizing Clusters

February 24, 2015

Consider the following dataset, with (only) ten points:

    x=c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
    y=c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
    plot(x,y,pch=19,cex=2)

We want to get, say, two clusters. Or, more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to get an exhaustive list of all partitions, and to minimize some criterion, such...

Read more »

RStudio v0.99 Preview: Data Viewer Improvements

February 24, 2015

RStudio’s data viewer provides a quick way to look at the contents of data frames and other column-based data in your R environment. You invoke it by clicking on the grid icon in the Environment pane, or at the console by typing View(mydata). As part of the RStudio Preview Release, we’ve completely overhauled RStudio’s data

Read more »

Monitoring progress of a foreach parallel job

February 24, 2015

By Andrie de Vries. R has strong support for parallel programming, both in base R and in additional CRAN packages. For example, we have previously written about foreach and parallel programming in the articles Tutorial: Parallel programming with foreach and Intro to Parallel Random Number Generation with RevoScaleR. The foreach package provides simple looping constructs in R, similar to lapply()...

Read more »

Rare snowmelt estimation (GB)

February 24, 2015

I recently read Hough and Hollis’ 1997 paper, which uses Met Office synoptic stations to estimate a magnitude-recurrence relationship for snowmelt in the UK, i.e. how often do we get how…

Read more »

Minimal examples help

February 24, 2015

The other day I got stuck working with a huge data set using data.table in R. It took me a little while to realise that I had to produce a minimal reproducible example to actually understand why I got stuck in the first place. I know, this is the mantra I should follow before I reach out to R-help,...

Read more »

Strata 2015: Keynote roundup

February 23, 2015

I spent last week at the Strata 2015 Conference in San José, California. As always, Strata was a wonderful conference for catching up on the latest developments in big data and data science, and for connecting with colleagues and friends old and new. Having been to every Strata conference since the first in XXXX, it's been interesting to...

Read more »
