# 895 search results for "parallel"

## Visualizing rOpenSci collaboration

March 8, 2013

We (rOpenSci) have been writing code for R packages for a couple of years, so it is time to take a look back at the data. What data, you ask? The commits data from GitHub: data that records who did what and when. Using the GitHub commits API we can gather data on who committed code to a...

## resizing plot panels to fit data distribution

March 3, 2013

I am a big fan of lattice/latticeExtra. In fact, nearly all visualisations I have produced so far make use of this great package. The possibilities for customisation are endless and the amount of flexibility it provides is especially valuable for … Continue reading →

## visualising diurnal wind climatologies

March 3, 2013

In this post I want to highlight the second core function of the metvurst repository (https://github.com/tim-salabim/metvurst): the windContours function. It is intended to provide a compact overview of the wind field climatology at a location and plots wind direction and … Continue reading →

## visualising large amounts of hourly environmental data

March 3, 2013

It is Sunday, it's raining and I have a few hours to spend before I am invited for lunch at my parents' place. Hence, I thought I'd use the time to produce another post. It has been a while since … Continue reading →

## R 2.15.3 is released

March 1, 2013

What follows is today's announcement from Peter Dalgaard, for the R Core Team: The build system rolled up R-2.15.3.tar.gz (codename “Security Blanket”) at 9:00 this morning. This is intended to be the final round-up release of the 2.15 series, and in fact of the entire 2.x.y series which started 2004-10-04. The list below details the changes in this release. You can get...

February 27, 2013

On Revolution Analytics partner Cloudera's blog, Uri Laserson has posted an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to implementing resampling methods using RHadoop. He provides the complete map-reduce code in the R...

## the BUGS Book [guest post]

February 24, 2013

(My colleague Jean-Louis Fouley, now at I3M, Montpellier, kindly agreed to write a review on the BUGS book for CHANCE. Here is the review, en avant-première! Watch out, it is fairly long and exhaustive! References will be available in the published version. The additions of book covers with BUGS in the title and of the corresponding

## The Wisdom of Crowds – Clustering Using Evidence Accumulation Clustering (EAC)

February 24, 2013

Today’s blog post is about a problem known by most of the people using cluster algorithms on datasets without given true labels (unsupervised learning). The challenge here is the “freedom of choice” over a broad range of different cluster algorithms and how to determine the right parameter values. The difficulty is the following: Every clustering algorithm and even...

## bigcor: Large correlation matrices in R

February 22, 2013

As I am working with large gene expression matrices (microarray data) in my job, it is sometimes important to look at the correlation in gene expression of different genes. It has been shown that by calculating the Pearson correlation between genes, one can identify (by high values, i.e. > 0.9) genes that share a common