Effective Applications of the R Language Conference 2014

September 25, 2014

By Chris Campbell - Senior Consultant, UK. What struck me first was how few sandals I could see, none of which were paired with socks. The energy in the room was electric as introductions were made and business cards were exchanged. The inaugural Effective Applications of the R Language (EARL) had started strongly with two sold-out workshops. As Matt Aldridge...

RMOA package for running streaming classification & regression models now at CRAN

Last week, we released the RMOA package on CRAN (http://cran.r-project.org/web/packages/RMOA). It is an R package for building streaming classification and regression models on top of MOA. MOA stands for 'Massive Online Analysis'; it is the most popular open source framework for data stream mining and is being developed at the University of Waikato: http://moa.cms.waikato.ac.nz....

Joint Models for Longitudinal and Survival Data

September 25, 2014

What are joint models for longitudinal and survival data? In this post we will introduce, in layman's terms, the framework of joint models for longitudinal and time-to-event data. These models are applied in settings where the sample units are followed up over time; for example, we may be interested in patients suffering...

“R for Developers” course – Oct 16-17 @ Milano, Italy

September 25, 2014

R for Developers, Milano, October 16 and 17, 2014. Course description: This two-day course provides an overview of several advanced R topics, such as R environments, object-oriented programming, functional programming and debugging. Who should attend this course: Anyone …

Become an effective data hacker with the R-Hadoop stack

September 24, 2014

In discussion with several data scientists, Will Stanton (a data scientist with Return Path) learned that a common concern is: what software should I be using? There are many options out there, but what is the best platform to be an effective "data hacker"? Will recommends using a technology stack with R and Hadoop, which allows data scientists "to...

Nuts and Bolts of Quantstrat, Part IV

September 24, 2014

This post will provide an introduction to the way that rules work in quantstrat. It will detail market orders along …

Multiple Tests, an Introduction

September 24, 2014

Last week, a student asked me about multiple tests. More precisely, she ran an experiment over, say, 20 weeks, with the same cohort of, say, 100 patients. And we observe some size=100; nb=20; set.seed(1); X=matrix(rnorm(size*nb),size,nb) (here, I just generate some fake data). I can visualize some trajectories, over the 20 weeks, library(RColorBrewer); cl1=brewer.pal(12,"Set3"); cl2=brewer.pal(8,"Set2"); cl=c(cl1,cl2)...

Adding Google Drive Times and Distance Coefficients to Regression Models with ggmap and sp

September 24, 2014

Space, a wise man once said, is the final frontier. Not the Buzz Aldrin/Lightyear, Neil deGrasse Tyson kind (but seriously, have you seen Cosmos?). Geographic space. Distances have been finding their way into metrics since the cavemen (probably). GIS seems to make nearly every science way more fun…and accurate! Most of my research deals with

Data Science Toolbox Survey Results… Surprise! R and Python win

September 24, 2014

This is a re-publication of a blog post from a blog I created not long before...

DVI Performance

September 24, 2014

This is the next post in the DVI indicator series. After the first two (here and here) analyzed in detail the post-entry returns and the entry power of this indicator, it’s time to take a look at the trading performance. Using the Systematic Investor Toolbox, we get some pretty decent results: CAGR of 16.15% and

PageRank For SQL Lovers

September 24, 2014

If you’re changing the world, you’re working on important things. You’re excited to get up in the morning. (Larry Page, CEO and Co-Founder of Google) This is my personal tribute to one of the most important, influential and life-changing R packages I have discovered lately: the sqldf package. Because of my job, transforming

Changing the Light Azimuth in Shaded Relief Representation by Clustering Aspect

September 24, 2014

Some time ago I published an article in "The Cartographic Journal" regarding a method to automatically change the light azimuth in shaded relief representations. This method was based on clustering the aspect derivative of the DTM. The method was develo...

Post 10: Multicore parallelism in MCMC

September 24, 2014

MCMC is by its very nature a serial algorithm -- each iteration depends on the results of the last iteration. It is, therefore, rather difficult to parallelize MCMC code so that a single chain will run more quickly by splitting …

PubMed Publication Date: what is it, exactly?

September 23, 2014

File this one under “has troubled me (and others) for some years now, let’s try to resolve it.” Let’s use the excellent R/rentrez package to search PubMed for articles that were retracted in 2013. 117 articles. Now let’s fetch the records in XML format. Next question: which XML element specifies the “Date of publication” (PDAT)?

In-depth introduction to machine learning in 15 hours of expert videos

September 23, 2014

In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as "machine learning"), largely...

a weird beamer feature…

September 23, 2014

As I was preparing the slides for my third-year undergraduate stat course, I got a weird error that took a search on the Web to unravel. It was related to a fragile environment, but not directly to the verbatim part: the reason for the bug was that the \end{frame} command did not have a line

Seeing the (day)light with R

September 23, 2014

The arrival of the autumnal equinox foreshadows the reality of longer nights and shorter days here in the northeast US. We can both see that reality and distract ourselves from it at the same time by firing up RStudio (or your favorite editor) and taking a look at the sunrise & sunset times based on

Factors are not first-class citizens in R

September 23, 2014

The primary user-facing data types in the R statistical computing environment behave as vectors. That is: one-dimensional arrays of scalar values that have a nice operational algebra. There are additional types (lists, data frames, matrices, environments, and so on) but the most common data types are vectors. In fact vectors are so common in R

How to publish R and ggplot2 to the web

September 23, 2014

by Matt Sundquist, Plotly Co-founder. It's delightfully smooth to publish R code, plots, and presentations to the web. For example: Shiny makes interactive apps from R; Pretty R highlights R code for HTML; Slidify makes slides from R Markdown; Knitr and RPubs let you publish R Markdown docs; GitHub and devtools let you quickly release packages and collaborate. Now,...

Hands-on dplyr tutorial for faster data manipulation in R

September 23, 2014

I love dplyr. It's my "go-to" package in R for data exploration, data manipulation, and feature engineering. I use dplyr because it saves me time: its performance is blazing fast on data frames, but even more importantly, I can write dplyr code faster ...

testthat 0.9

September 23, 2014

testthat 0.9 is now available on CRAN. Testthat makes it easy to turn the informal testing that you’re already doing into formal automated tests. Learn more at http://r-pkgs.had.co.nz/tests.html. This version of testthat has four important new features that bring testthat up to speed with unit testing frameworks in other languages: You can skip() tests with

NCEAS Codefest Follow-up

September 23, 2014

The week after Labor Day, we had the pleasure of attending the NCEAS open science codefest event in Santa Barbara. It was great to meet folks like the new arrivals at the expanding Mozilla Science Lab, Bill Mills and Abby Cabunoc (Bill even already has a great post up about the codefest), and see...

Managing R package dependencies

September 23, 2014

One of my takeaways from last week's EARL conference was that R is increasingly growing out of its academic roots and into the enterprise. And with that come some challenges, e.g. how do I ensure consistent and systematic access to a set of R packages in an organisation, in particular when one team is providing...

Lazy load with archivist

September 22, 2014

Version 1.1 of the archivist package reached CRAN a few days ago. This package supports operations on a disk-based repository of R objects. It makes storing, restoring and searching for R objects easy (searching uses meta information). Want to share your object with article reviewers or collaborators? This package should help.

RcppArmadillo 0.4.450.1.0

September 22, 2014

Continuing with his standard pace of approximately one new version per month, Conrad released a new minor release of Armadillo a few days ago. As before, I had created a GitHub-only pre-release which was tested against all eighty-seven (!!) CRAN dependents of our RcppArmadillo package and then uploaded RcppArmadillo 0.4.450.0 to CRAN. The CRAN...

Interesting high contrast plots in R

September 22, 2014

I was inspired by this blog post and thought I could do the same thing in R. Well, I posted the code on Google+. Here are my results. Not bad.

Newcastle R course, a write-up

September 22, 2014

I recently attended a week-long R course in Newcastle, taught by Colin Gillespie. It went from “An Introduction to R” to “Advanced Graphics” via a day each on modelling, efficiency and programming. Suffice to say it was an intense 5 days! Overall it was the best R course I’ve been on so far. I’d recommend it to others,...

Twitter’s REST API v1.1 with R (for Linux and Windows)

September 22, 2014

In this tutorial I am going to describe a straightforward way to use Twitter’s REST API v1.1. For that purpose I wrote a little package (RTwitterAPI), so that requesting data just needs the API URL, the API parameters …

Around the world in 80k miles

September 22, 2014

You're probably familiar with the classic Travelling Salesman problem: given (say) 20 cities, what is the shortest route you can take that passes through all 20 cities and returns to the starting point? It's a difficult problem to solve, because you need to try all possible routes to find the minimum, and there are a LOT of possibilities. For a...
