drat 0.0.2: Improved Support for Lightweight R Repositories

March 1, 2015

A few weeks ago we introduced the drat package. Its name stands for drat R Archive Template, and it makes repositories for R packages easy to create and easy to use. Two early blog posts describe drat: First Steps Towards Lightweight Repositorie...

Should I use premium Diesel? Setup

March 1, 2015

Since I drive quite a lot, I have some interest in getting the most km out of every Euro spent on fuel. One thing I can change is the fuel itself. The...

DOSE: an R/Bioconductor package for Disease Ontology Semantic and Enrichment analysis

February 28, 2015

My R/Bioconductor package, DOSE, has been published in Bioinformatics. Summary: Disease ontology (DO) annotates human genes in the context of disease. DO is an important annotation in translating molecular findings from high-throughput data...

Book Review: Mastering Scientific Computing with R

February 28, 2015

PACKT's marketing team contacted me again to review their new book, Mastering Scientific Computing with R. The 432-page book (including covers) consists of 10 chapters which...

One weird trick to compile multipartite dynamic documents with Rmarkdown

February 28, 2015

This afternoon I stumbled across this one weird trick: an undocumented part of the YAML headers that get processed when you click the ‘knit’ button in...

Playing around with #rstats twitter data

As a bit of weekend fun, I decided to briefly look into the #rstats twitter data that Stephen Turner collected and made available (thanks!). Essentially, this data set contains...

Tools in Tandem – SQL and ggplot. But is it Really R?

February 28, 2015

Increasingly I find that I have fallen into using not-really-R whilst playing around with Formula One stats data. Instead, I seem to be using a hybrid of SQL to...

Scalable Machine Learning for Big Data Using R and H2O

February 28, 2015

H2O is an open source parallel processing engine for machine learning on Big Data. This prediction engine is designed by H2O, a Mountain View-based startup...

RcppEigen 0.3.2.4.0

February 28, 2015

A new release of RcppEigen is now on CRAN and in Debian. It synchronizes the Eigen code with the 3.2.4 upstream release, and updates the RcppEigen.package.skeleton() package creation...

John Snow, and Google Maps

February 27, 2015

In my previous post, I discussed how to use OpenStreetMap (and the standard plotting functions of R) to visualize John Snow’s dataset. But it is also possible to use Google...

John Snow, and OpenStreetMap

February 27, 2015

While I was preparing a training session on data visualization, I wanted to get a nice visual for John Snow’s cholera dataset. This dataset can actually be found in...

Data Science/Statistics/R @Google

February 27, 2015

This meetup will be hosted by Google and we’ll have Peter Lipman and Pete Meyer...

Career NBA: The Road Least Traveled

February 27, 2015

The bell rings - time to go to practice. Jarnell Stokes heads over to the gym, changes, and starts warming up with his teammates. It's his Junior year in high school....

Does Balancing Classes Improve Classifier Performance?

February 27, 2015

It’s a folk theorem I sometimes hear from colleagues and clients: that you must balance the class prevalence before training a classifier. Certainly, I believe that classification tends to...

John Chambers Statistical Software Award 2015

February 27, 2015

In 1998 John M. Chambers (now a member of R-core) won the ACM Software System Award for the S Language, which (in the words of the committee) "forever altered...

RcppArmadillo 0.4.650.1.1 (and also 0.4.650.2.0)

February 26, 2015

A new Armadillo release, 4.650.1, was published by Conrad a few days ago. Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards...

Compiling CoffeeScript in R with the js package

February 26, 2015

A new release of the js package has made its way to CRAN. This version adds support for compiling CoffeeScript....

reshape: from long to wide format

February 26, 2015

This post continues the topic of using the melt/cast functions in reshape to convert a data frame between long and wide formats. Here is the example I found...

Why I think twice before editing plots in Powerpoint, Illustrator, Inkscape, etc.

February 26, 2015

Thanks to a nice post by Meghan Duffy on the Dynamic Ecology blog (How do you make figures?), we have some empirical evidence that many figures made in...

Using and Abusing Data Visualization: Anscombe’s Quartet and Cheating Bonferroni

February 26, 2015

Anscombe’s quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed. Each dataset consists of eleven (x,y) points. They were

Announcing shinyapps.io General Availability

February 26, 2015

RStudio is excited to announce the general availability (GA) of shinyapps.io. Shinyapps.io is an easy-to-use, secure, and scalable hosted service already being used by thousands of professionals...

Aggregation

February 26, 2015

Aggregation splits data into subsets, computes summary statistics on each subset, and reports the results in a conveniently summarized form. The aggregate function is one of the most capable...

The Downside of Rankings-Based Strategies

February 26, 2015

This post will demonstrate a downside to rankings-based strategies, particularly when using data of questionable quality (which, unless one …

Collaborative Computing with distcomp

February 26, 2015

by Joseph Rickert. Distcomp, a new R package available on GitHub from a group of Stanford researchers, has the potential to significantly advance the practice of collaborative computing with...

Fuzzy String Matching – a survival skill to tackle unstructured information

February 26, 2015

“The amount of information available on the internet grows every day.” Thank you, Captain Obvious! By now even my grandma is aware of that! Actually, the internet...

R: How to Layout and Design an Infographic

February 26, 2015

As promised in my recent article, here's my tutorial on how to lay out and design an infographic in R. This article will serve as a template for more infographic...

Generating ANOVA-like table from GLMM using parametric bootstrap

February 26, 2015

This article may also be found on RPubs (http://rpubs.com/hughes/63269). In ranking the ways to test for effects in a GLMM from worst to best, the list at http://glmm.wikidot.com/faq states that...

Adobe Sitecatalyst API and R: integrate reports with the SAINT classification file

February 26, 2015

From original post @ http://analyticsblog.mecglobal.it/

RMySQL version 0.10.2: Full SSL Support

February 25, 2015

RMySQL version 0.10.2 has appeared on CRAN. This is a maintenance release to streamline the build process on various platforms. Most importantly,...
