Meet us at R Day and at the Strata+Hadoop World NYC Oct 15-17, 2014

September 30, 2014

Are you headed to Strata? It’s just around the corner! We particularly hope to see you at R Day on October 15, where we will cover a raft of current topics that analysts and R users need to pay attention to. The R Day tutorials come from Hadley Wickham, Winston Chang, Garrett Grolemund, J.J. Allaire, and

Additional tips for structuring an individual-based model in R

September 30, 2014

I had a reader ask me recently to help understand how to modify the code of an individual-based model (IBM) that I posted a while back. It was my first attempt at an IBM in R, and I realized that I have made some significant changes to the way th...

Why are we still teaching T-tests?

September 30, 2014

The following post by Norm Matloff originally appeared on his blog, Mad(Data)Scientist, on September 15th. We rarely republish posts that have appeared on other blogs; however, the questions that Norm raises with respect to the teaching of statistics, and his assertion that "R's statistical procedures are centered far too much on significance testing", deserve a second look. Moreover,...

Building a DGA Classifier: Part 1, Data Preparation

September 30, 2014

This will be a three-part blog series on building a DGA classifier, split into three logical phases: 1) Data preparation (this post), 2) Feature engineering, and 3) Model selection. Before I get too far into this, I want to give a huge thank you to Click Security for releasing a DGA classifier in Python as part of...

Example 2014.11: Contrasts the basic way for R

September 30, 2014

As we discuss in section 6.1.4 of the second edition, R and SAS handle categorical variables and their parameterization in models quite differently. SAS treats them on a procedure-by-procedure basis, which leads to some odd differences in capabilities and default parameterizations. For example, in the logistic procedure, the default is effect cell coding, while in the genmod...

Structural Arb Analysis and Portfolio Management Functionality in R

September 30, 2014

I want to use this post to replicate an article I found on SeekingAlpha, along with demonstrating PerformanceAnalytics’s ability to …

Syrian Refugee Settlement Clinic Locations

September 30, 2014

Previously I posted about the locations of refugee settlements and how they had grown in density as well as in number over time. As many NGOs and non-profits work in the area, they are providing much-needed assistance to the people living around ...

Running RStudio via Docker in the Cloud

September 30, 2014

Deploying applications via Docker containers is the current talk of the town. I had heard about Docker and played around with it a little, but when Dirk Eddelbuettel posted his R and Docker talk last Friday I got really excited and had to have a go myself....

seeking altruistic social scientists, demographers, survey researchers

September 30, 2014

hi everyone, please share this: if you are an experienced user of a publicly-available survey data set from any country or international organization, let's work together on some user-friendly code and a short blog post for http://asdfree.com. ...

Rcpp 0.11.3

September 29, 2014

A new release 0.11.3 of Rcpp is now on the CRAN network for GNU R, and an updated Debian package has been uploaded too. Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 273 packages on CRAN depend on Rcpp for making...

Data management with ShinyApps.io

September 29, 2014

Some of the most innovative Shiny apps share data across user sessions. Some apps share the results of one session to use in future sessions, others track user characteristics over time and make them available as part of the app. This level of sophistication creates tricky design choices when you host your app on a

Video introduction to data manipulation with dplyr

September 29, 2014

Hadley Wickham's dplyr package is a great toolkit for getting data ready for analysis in R. If you haven't yet taken the plunge to using dplyr, Kevin Markham has put together a great hands-on video tutorial for his Data School blog, which you can see below. The video covers the five main data-manipulation "verbs" that dplyr provides: filter, select,...

TURF Analysis: A Bad Answer to the Wrong Question

September 29, 2014

Now that R has a package performing Total Unduplicated Reach and Frequency (TURF) Analysis, it might be a good time to issue a warning to all R users. DON'T DO IT! The technique itself is straight out of media buying from the 1950s. Given some number of...

Registration now open for Master R Developer workshop in San Francisco

September 29, 2014

Registration is now open for the next Master R Developer workshop led by Hadley Wickham, author of over 30 R packages and the Advanced R book. The workshop will be held on January 19 and 20 in the San Francisco Bay Area. The workshop is a two-day course on advanced R practices and package

A majority victory is not that impossible

September 29, 2014

By this time next week, we'll already know the true vote intentions of Brazilians towards the candidates running for president. Not long ago a runoff was taken for granted, but the latest polls have been converging on the feeling that Brazilians are about to reward the Workers' Party government with another term. Marina Silva (PSB), who …

Ramarro: “R for Developers” free (web) book

September 29, 2014

What is Ramarro? Ramarro is a book about advanced R pro

R package for Computational Actuarial Science

September 29, 2014

A webpage for the book is now hosted on http://cas.uqam.ca/. So far, it is a very basic page, but information regarding the package can be found there. For instance, to install the package, with all the datasets, the R code is > install.packages("CAS...

Machine learning with R & Advanced R programming course

This year, BNOSAC offers two R courses in cooperation with the Leuven Statistics Research Center. The courses are part of the Leuven STATistics STATe of the Art Training Initiative and are given in Leuven (Belgium). For R users and data scientists...

jsonlite 0.9.12: now even lighter and faster

September 28, 2014

The jsonlite package implements a robust, high-performance JSON parser and generator for R, optimized for statistical data and the web. This week version 0.9.12 appeared on CRAN, which includes a completely rewritten JSON parser and more optimized C code for JSON generation. The new parser is based on yajl...

future of computational statistics

September 28, 2014

I am currently preparing a survey paper on the present state of computational statistics, reflecting on the massive evolution of the field since my early Monte Carlo simulations on an Apple //e, which would take a few days to return a curve of approximate expected squared error losses… It seems to me that MCMC is

Row Search in Parallel

September 28, 2014

I’ve always wondered whether the efficiency of row search can be improved if the whole data.frame is split into chunks and the row search is then conducted within each chunk in parallel. In the R code below, a comparison is done between the standard row search and the parallel row search with the FOREACH

Back to square one – R and RStudio installation

September 28, 2014

I remember my first experience installing R. Basic installation can be humbling for someone not familiar with mirror networks or file binaries. I remember not knowing the difference between base and contrib… which one to select? The concept of CRAN and mirrors was also new to me. Which location do I choose and are they

Deep Down Below – Using in-database analytics from within Tableau (with MADlib)

September 28, 2014

Introduction: Using Tableau for visualizing all kinds of data is quite a joy, but it’s not that strong on built-in analytics or predictive features. Tableau’s integration of R was a huge step in the right direction (and I love it very much - see here, here and here) but still has some limitations (e.g. no RAWSQL...

Updated dplyr Examples

September 28, 2014

Over the summer I made two posts about using the dplyr package. The first was an example of the dplyr verbs applied to fish data. The second was an example of modifications that I had made to lencat() to work better …

Stage abundances, eigenvector of population matrix

September 28, 2014

The previous article introduced the seasonal matrices and the population growth rate λ of an imaginary annual plant. This article focuses first on the meaning of the eigenvector, and then …

Bayesian models in R

September 28, 2014

There are many ways to run general Bayesian calculations in or from R. The best known are JAGS, OpenBUGS and Stan. Some time ago Rasmus Bååth wrote a post, "Three ways to run Bayesian models in R", in which he mentioned LaplacesDemon (not on CRAN) on top of those. A check of the Bayes task view...

Exploring Mangalyaan tweets with R

September 27, 2014

Mangalyaan is the spacecraft of the Indian Space Research Organisation’s Mars Orbiter Mission that entered the orbit of Mars last week. There were several tweets on Twitter with the hashtag #Mangalyaan about it last week. I wanted to use R to explore …

Recognizing Patterns in the Purchase Process by Following the Pathways Marked By Others

September 27, 2014

Herbert Simon's "ant on the beach" does not search for food in a straight line because the environment is not uniform with pebbles, pools and rough terrain. At least the ant's decision making is confined to the 3-dimensional space defining the beach. C...

A book about some important bits of R

September 27, 2014

I see that Hadley Wickham’s new book, “Advanced R”, is being published in dead tree form and will be available in a month or so. Hadley has generously made the material available online; I quickly skimmed the material a few months ago when I first heard about it and had another skim this afternoon. The main
