JSM 2015 | Check out these talks and visit RStudio!

August 3, 2015

The Joint Statistical Meetings, starting August 8, is the biggest meetup for statisticians in the world. Navigating the sheer quantity of interesting talks is challenging – there can be up to 50 sessions going on at a time! To prepare for Seattle, we asked RStudio’s Chief Data Scientist Hadley Wickham for his top session picks. Here are 9 talks, admittedly biased towards R,

11 new R jobs (from R-users.com ; 2015-08-04)

August 3, 2015

This is the bimonthly post (for 2015-08-04) for new R Jobs from R-users.com. Employers: visit this link to post a new R job to the R community (it’s free and quick). Job seekers: please follow the links below to learn more and apply for your job of interest (or visit previous R jobs posts). Full-Time Data Scientist AWF Leading On Line company – Posted by Ortal Katsva Netanya Center District, Israel...

Wikipedia and the Fashion Weeks: A Look at Usage Patterns

August 3, 2015

Unlike many of the entries on Wikipedia relating to statistics or computer science, fashion-related topics have not been thoroughly documented. For example, the entries on Martin Margiela and Rei Kawakubo pale in comparison to the breadth of content on Thomas Bayes, structural equation modeling, or R. In light of this, I wanted to investigate whether

Watch 18 years of R development in 15 minutes

August 3, 2015

R is an incredibly active software project. Since the first code was checked into Subversion back on 18 September 1997, there have been more than 100,000 updates to the R sources by the 20-some members of the R Core Team. As you can see from the activity graphs from the GitHub mirror of the R sources, the pace of...

Multivariate Techniques in Python: EcoPy Alpha Launch!

August 3, 2015

I’m announcing the alpha launch of EcoPy: Ecological Data Analysis in Python. EcoPy is a Python module that contains a number of techniques (PCA, CA, CCorA, nMDS, MDS, RDA, etc.) for exploring complex multivariate data. For those of you familiar …

Sensemaking in R: A Plenitude of Models Makes for Good Storytelling

August 3, 2015

"Sensemaking is a motivated, continuous effort to understand connections (which can be among people, places, and events) in order to anticipate their trajectories and act effectively."- Gary Klein, Brian Moon & Robert HoffmanMaking Sense of Sensema...

Feature Engineering versus Feature Extraction: Game On!

August 3, 2015

"Feature engineering" is a fancy term for making sure that your predictors are encoded in the model in a manner that makes it as easy as possible for the model to achieve good performance. For example, if your have a date field as a predictor and there are larger differences in response for the weekends versus the weekdays, then...

Going Bananas #2: A Needle In A Haystack

August 3, 2015

Now I’m gonna tell my momma that I’m a traveller, I’m gonna follow the sun (The Sun, Parov Stelar). Inspired by this book I read recently, I decided to do this experiment. The idea is to compare how easy it is to find sequences of numbers inside Pi, e, the Golden Ratio (Phi) and a randomly generated number. …

Ternary Interpolation / Smoothing

August 3, 2015

For a long time, people have been sending me requests for a suitable smoothing / contouring / interpolation geometry to be made available via ggtern, over and above the Kernel Density function. I am very pleased to say that the recent version 1.0.6 has this feature added. Let me demonstrate how it works. This geometry &

Producing a Control Chart in R – An Application in Analytical Chemistry

Introduction: Many processes in chemistry, especially in synthesis, require attaining a certain target value for a property of interest. For example, when synthesizing drug capsules that contain a medicine, a chemist has to ensure that the concentration of the medicine meets a target value. If the concentration is too high or too low, then the patient

Interview with a Data Scientist (Hadley Wickham)

August 2, 2015

Originally posted on Models are illuminating and wrong: I recently interviewed Hadley Wickham, the creator of ggplot2 and a famous R Stats person. He works for RStudio and his job is to work on Open Source software aimed at Data Geeks. Hadley is famous for his contributions to Data Science tooling and inspires a lot…

Survival Analysis – 1

August 2, 2015

I was recently looking for methods to apply to time-to-event data and started exploring Survival Analysis Models. In this post, I'm exploring the basic Kaplan-Meier (KM) estimator, which is a nonparametric estimator of the survival function, using a real dataset (on time t...
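
For readers new to the KM estimator mentioned above: at each observed event time it multiplies the running survival probability by the fraction of at-risk subjects who did not have the event. The base-R sketch below computes this by hand on made-up toy data (not the post's dataset); the function name `km_estimate` and the vectors `times` and `events` are hypothetical and introduced only for illustration.

```r
# Hypothetical toy data: follow-up times and event indicators
# (1 = event observed, 0 = censored). Not the dataset used in the post.
times  <- c(2, 3, 3, 5, 7, 8)
events <- c(1, 1, 0, 1, 0, 1)

km_estimate <- function(times, events) {
  # Distinct times at which at least one event was observed.
  event_times <- sort(unique(times[events == 1]))
  # Number of subjects still at risk just before each event time.
  at_risk <- sapply(event_times, function(s) sum(times >= s))
  # Number of events observed at each event time.
  n_events <- sapply(event_times, function(s) sum(times == s & events == 1))
  # Product-limit formula: S(t) = product over event times <= t of (1 - d_i / n_i).
  data.frame(time = event_times, at_risk = at_risk, n_events = n_events,
             survival = cumprod(1 - n_events / at_risk))
}

km <- km_estimate(times, events)
print(km)
```

In practice one would more likely call an established implementation such as the one in the survival package; the hand-rolled version is only meant to make the product-limit formula concrete.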

Two New R Packages – qrencoder & passwordrandom

August 2, 2015

Believe it or not, there are two questions on @StackOverflowR about how to make QR codes in R. I personally think QR codes are kinda hokey, but who am I to argue with pressing needs of the #rstats community? I found libqrencode and it’s highly brew-able and apt-able (probably yum-able, too, if you

Playing with leafletR

August 2, 2015

Coming back from a bicycle ride in Tarn, I wanted to have a look at the trip. This gave me the opportunity to learn how to use the R package leafletR to obtain an HTML map using leaflet and the great projects OpenStreetMap or Thunderforest (among others). Introduction To use leaflet over R, two packages

Seattle’s Fremont Bridge Bicyclists Again in the News

August 2, 2015

Back in 2013, David Smith did an analysis of bicycle trips across Seattle’s Fremont bridge. More recently, Jake Vanderplas (a contributor to Python’s very popular Scikit-learn package) wrote a nice blog post on “Learning Seattle Work habits from bicycle counts” at …

[ggtree] annotate phylogenetic tree with local images

August 1, 2015

In ggtree, we provide a function annotation_image for annotating a phylogenetic tree with images. To demonstrate the usage, I created a tree view from a random tree as shown below:

Streamgraphs in R

July 31, 2015

It's not easy to visualize a quantity that varies over time and which is composed of more than two subsegments. Take, for example, this stacked bar chart of religious affiliation of the Australian population, by time: While it's easy to see how the share of Anglicans (at the bottom of the chart) has changed over time, it's much...

Rendering LaTeX Math Equations in GitHub Markdown

July 31, 2015

The Problem: GitHub README.md won't render LaTeX. I have often wondered about getting LaTeX math to render in a README file on GitHub. Apparently, many others (1, 2, 3) have asked the same question. The common answers are: It cannot (and in some cases, shouldn't) be done. GitHub parsing is done by

Building a user interface for spatstat

July 30, 2015

Introduction: RStudio developed shiny, an R package that, quoting from their website, “makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use”. It leverages the power of R and its vast collection of packages to allow users to efficiently perform...

15 Questions All R Users Have About Plots

July 30, 2015

R allows you to create different plot types, ranging from the basic graph types like density plots, dot plots, bar charts, line charts, pie charts, boxplots and scatter plots, to the more statistically complex types of graphs such as probability plots, mosaic plots and correlograms. In addition, R is well known for its data visualization

MRAN’s Packages Spotlight

July 30, 2015

by Joseph Rickert. New R packages just keep coming. The following plot, constructed with information from the monthly files on Dirk Eddelbuettel's CRANberries site, shows the number of new packages released to CRAN by month between January 1, 2013 and July 27, 2015 (not quite 31 months). This is amazing growth! The mean rate is about...

The little mixed model that could, but shouldn’t be used to score surgical performance

July 30, 2015

The Surgeon Scorecard: Two weeks ago, the world of medical journalism was rocked by the public release of ProPublica’s Surgeon Scorecard. In this project ProPublica “calculated death and complication rates for surgeons performing one of eight elective procedures in Medicare, carefully adjusting for differences in patient health, age and hospital quality.” By making the dataset

Empirical bias analysis of random effects predictions in linear and logistic mixed model regression

July 30, 2015

In the first technical post in this series, I conducted a numerical investigation of the biasedness of random effect predictions in generalized linear mixed models (GLMM), such as the ones used in the Surgeon Scorecard. I decided to undertake two explorations: firstly, the behavior of these estimates as more and more data are gathered for each

#rstats Make arrays into vectors before running table

July 29, 2015

Setup of Problem: While working with nifti objects from the oro.nifti package, I tried to table the values of the image. The table took a long time to compute. I thought this was due to the added information about a medical image, but I found that the same sluggishness happened when coercing the nifti object to
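
The title's advice can be sketched without any imaging packages. The example below uses a plain 3D integer array as a hypothetical stand-in for image data (it is not a real nifti object) and shows that dropping the dimensions with c() before calling table() yields the same counts. It makes no claim about timings, which will depend on the R version and the object involved.

```r
# Hypothetical stand-in for image data: a 20 x 20 x 20 integer array.
set.seed(1)
arr <- array(sample(0:5, 20 * 20 * 20, replace = TRUE), dim = c(20, 20, 20))

# c() drops the dim attribute, leaving a plain integer vector.
vec <- c(arr)

# Tabulate both forms; the counts should agree.
counts_from_array  <- table(arr)
counts_from_vector <- table(vec)
print(counts_from_vector)
```

To compare speed on your own data, wrapping each table() call in system.time() is a simple way to check whether the coercion helps.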

R Oddities: Strings in DataFrames

July 29, 2015

Have you ever read a file into R and then encountered strange problems filtering and sorting because the strings were converted to factors? For instance, you might think the two data frames, df and df2 below, contain the same data:
> df <- data...
> write.csv(df, 'df.csv')
> df2 <- read...
But look, the dimensions are different:
> dim(df)
[1] 1 1
> dim(df2)
[1] 1 2
And...
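
The teaser's console session is truncated, so the sketch below is a reconstruction under assumptions rather than the post's own code: it uses a made-up one-cell data frame and a temporary file. It shows one common reason a round trip through write.csv() and read.csv() changes the dimensions: write.csv() writes row names as an extra first column unless row.names = FALSE is given.

```r
# Hypothetical one-cell data frame (the post's actual data is not shown).
df <- data.frame(x = "a", stringsAsFactors = FALSE)

# Temporary file so the example leaves nothing behind in the working directory.
path <- tempfile(fileext = ".csv")

# Default write.csv() includes row names, which read.csv() reads back as a column.
write.csv(df, path)
df2 <- read.csv(path, stringsAsFactors = FALSE)
print(dim(df))
print(dim(df2))

# Suppressing row names on write keeps the shape intact.
write.csv(df, path, row.names = FALSE)
df3 <- read.csv(path, stringsAsFactors = FALSE)
print(dim(df3))
```

Passing stringsAsFactors = FALSE explicitly also sidesteps the string-to-factor conversion the post asks about, which was the default behaviour of read.csv() in R versions before 4.0.0.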

But I Don’t Want to Be a Statistician!

July 29, 2015

"For a long time I have thought I was a statistician.... But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt.... All in all, I have come to feel that my central interest is in data analysis...."Opening paragrap...

Mapping the past and the future with Leaflet

July 29, 2015

I have been working on mapping things for a while and I must say that I really like the Leaflet package from RStudio. It makes it very easy and straightforward to make leaflet maps. A while back I stumbled upon an interactive graphic from The Times that used census data to compare each US...

Player Value Gap Assessment

July 29, 2015

Looking at fantasy football projections, we have a group of experts providing their views on how a player will do during the football season. We have collected projections from several

Predict Social Network Influence with R and H2O Ensemble Learning

July 29, 2015

What is H2O? H2O is an awesome machine learning framework. It is really great for data scientists and business analysts “who need scalable and fast machine learning”. H2O is completely open source and what makes it important is that it works right out of the box. There seems to be no easier way to start with scalable
