798 search results for "parallel"

Webinar tomorrow: Big-data statistics with Revolution R with IBM Netezza

February 28, 2012

As explained in detail by Michele Chambers at the IBM Netezza blog, there are two keys to getting fast performance with statistical analysis on massive data sets with R. Massive parallelization: break the problem down into small pieces, and run them in parallel. Bring the R engine to the data (not the other way around), to avoid data transfer...

R integrated throughout the enterprise analytics stack

February 27, 2012

The past couple of years have seen dramatic growth in the use of the R language in the enterprise. R has always been pervasive in academia for research and teaching in statistics and data science, and as new graduates trained in R have migrated to the workplace, the demand for R in corporations has become more and more...

Revolution Analytics at Strata 2012

February 27, 2012

One of my favourite conferences, Strata: Making Data Work, starts tomorrow in Santa Clara, CA. Revolution Analytics is a proud sponsor, and I'll be there with the team to listen to some great talks and to meet other R users at our booth in the exhibition hall. There will be several R-related talks and tutorials during the conference, including...

Large-scale Inference

February 23, 2012

Large-scale Inference by Brad Efron is the first IMS Monograph in this new series, coordinated by David Cox and published by Cambridge University Press. Since I read this book immediately after Cox and Donnelly’s Principles of Applied Statistics, I was thinking of drawing a parallel between the two books. However, while neither of them can

doSMP removed from CRAN

February 17, 2012

If you do parallel processing in R on Windows, then you have probably heard of the doSMP package. However, it was recently removed from the CRAN repository with the terse message: Package ‘doSMP’ was removed from the CRAN repository. Revolution …

Elegant & fast data manipulation with data.table

February 12, 2012

Just learned about the R data.table package (ht @recology_), which makes R data frames into ultra-fast, SQL-like objects. One thing we get is some very nice and powerful syntax. Consider some simple data of replicate time series: To apply a function to each set of replicates, instead of We can use: Note that we could have

Revolution R and Fedora: Revisited

February 10, 2012

A previous post of mine reported that Revolution R 5.0, which does support Red Hat Enterprise Linux, refused to work on Fedora 16, even though the two operating systems are extremely similar and there was no clear reason for the failure. The installation failed, dependencies could not be installed, tech support was singularly unhelpful because I wasn’t

Monitoring Progress Inside a Foreach Loop

February 9, 2012

The foreach package for R is excellent, and makes it easy to run code in parallel. One problem with foreach is that it creates new Rscript instances for each iteration of the loop, which prevents status messages from being logged to the console output. This is particularly frustrating during long-running tasks, when we are often unsure...

discrimination between CpG islands and random sequences using Markov chains

February 8, 2012

A major part of modern research is trying to find patterns in a given dataset using learning methods. One of the methods that can use a priori information for this purpose is Markov chains, in which the probability of symbol occurrence …
