analyze the demographic and health surveys (dhs) with r

July 8, 2014
professors of public health 101 probably cite the results of the demographic and health surveys (dhs) more than all other data sources combined.  funded by the united states agency for international development (usaid) and administered by the tech...

Reflections on useR! 2014

July 7, 2014
UseR! 2014, the R user conference held last week in LA, was the most successful yet. Around 700 R users from around the world converged on the UCLA campus to share their experiences with the R language and to socialize with other data scientists, statisticians and others using R. The week began with a series of 3-hour tutorials on...

Chillin’ at UseR! 2014

July 7, 2014
This year’s UseR! conference was held at the University of California in Los Angeles. Despite the great weather and a nearby beach, most of the conference was spent in front of projector screens in 18 °C (64 °F) rooms because there were so many interesting presentations and tutorials going on. I was lucky to present my R package...

Sometimes Table is not the Answer – a Faster 2×2 Table

July 7, 2014
The table command is great in its simplicity for cross tabulations. I have run into some settings where it is slow and I wanted to demonstrate one simple example here of why you may want to use other functions or write your own tabler. This example is a specific case where, for some examples and
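The excerpt above is cut off before it shows the author's replacement for table, so the following is only a minimal sketch of one way a hand-written 2×2 tabler could look. The helper name fast_2x2 and the restriction to two logical vectors are assumptions made for illustration. It relies on base R's tabulate, which counts integer codes directly and skips the factor conversion that table performs.

```r
# Hypothetical helper (not from the original post): cross-tabulate two
# logical vectors of equal length into a 2x2 matrix of counts.
fast_2x2 <- function(x, y) {
  stopifnot(is.logical(x), is.logical(y), length(x) == length(y))
  # Map each (x, y) pair to a cell index 1..4 in column-major order:
  # 1 = (FALSE, FALSE), 2 = (TRUE, FALSE), 3 = (FALSE, TRUE), 4 = (TRUE, TRUE)
  idx <- 1L + x + 2L * y
  # tabulate() counts the integer codes; pairs containing NA are ignored
  counts <- tabulate(idx, nbins = 4L)
  matrix(counts, nrow = 2L,
         dimnames = list(x = c("FALSE", "TRUE"), y = c("FALSE", "TRUE")))
}

# Simulated data, purely for illustration
set.seed(1)
x <- runif(1e5) > 0.5
y <- runif(1e5) > 0.5
res <- fast_2x2(x, y)
```

Whether this is actually faster than table(x, y) depends on the data size and R version; wrapping both calls in system.time is a simple way to check on your own data.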

What are the names of the school principals in Mexico? If your name is Maria, this post will probably interest you. Trends and cool plots from the national education census of Mexico in 2013

I will start this post with a disclaimer: The main intention of the post is to show the distribution of school principal names in Mexico, for example, to show basic trends regarding the most common nation-wide first name and so ...

DSC 2014. Day 1

July 7, 2014
This is a report of the first day of the Directions in Statistical Computing (DSC) conference that took place in Brixen, Italy (See here for an introduction). Performance enhancements were the main theme of the day, covering not just improvements … Continue reading →

July 7, 2014
How to pick up 3 numbers from a uniform distribution in a transparent manner?

July 7, 2014
Over in my previous post, I’m giving away 3 copies of my video course on ggplot2 and shiny. To win a copy, you just need to leave a comment and I will select 3 winners among the n participants at … Continue reading →
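The excerpt stops before it describes the selection procedure, so this is only a sketch of one common way to make such a draw reproducible: announce a seed in advance and draw with base R's sample.int. The seed value and the number of participants below are hypothetical placeholders, not values from the post.

```r
# Assumed values for illustration only
seed <- 20140707   # hypothetical seed, announced before the draw
n    <- 25         # hypothetical number of participants (comments)

# Fixing the seed lets anyone re-run the draw and obtain the same winners
set.seed(seed)
winners <- sample.int(n, size = 3)   # 3 distinct indices from 1..n

# Re-running with the same seed reproduces the result exactly
set.seed(seed)
check <- sample.int(n, size = 3)
```

Because sample.int draws without replacement by default, no participant can win twice.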

Introduction to R for Life Scientists: Course Materials

July 7, 2014
Last week I taught a three-hour introduction to R workshop for life scientists at UVA's Health Sciences Library. I broke the workshop into three sections: In the first half hour or so I presented slides giving an overview of R and why R is so awesome. Du...

Four Simple Turtle Graphs To Play With Kids

July 7, 2014
Technology is just a tool: in terms of getting the kids working together and motivating them, the teacher is the most important (Bill Gates). Some days ago I read this article in R-bloggers and I discovered the TurtleGraphics package. I knew about turtle graphics a long time ago and I was thinking of writing a post

Identify Fantasy Football Sleepers with this Shiny App

July 6, 2014
This post describes a Shiny app that identifies fantasy football sleepers.  The app allows you to modify your league settings, and calculates robust averages of projections across numerous sources.  Best of all,

Competitive balance and home court advantage in the NBA

July 6, 2014
Two years ago, the entire NBA season went into lockout for mostly financial reasons. However, one central point was also about keeping a competitive balance within the NBA, so that large and small-market teams alike would have a chance to compete for a championship. This brings us to the obvious question “Is there competitive

Stone Flakes V, networks again

July 6, 2014
Last week I tried pcalg. This week: deal (Learning Bayesian Networks with Mixed Variables). In this post I want to try something new, a causal graphical model. The aim here is just as much to get myself a feel for what these things do as to underst...

Estimating Required Coinage

July 5, 2014
I would like to code up a simple method of minimizing the number of coins required to give change.  Then I would like to see what coins are most likely to be called into usage if change is required from a uniform draw between 1 cent and 499 cents. # Define denominations to search through den...
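The original code is truncated right after its first comment, so what follows is only a sketch of the idea, not the author's implementation. It assumes US-style denominations (100, 25, 10, 5 and 1 cents) and uses a greedy rule: always take as many of the largest remaining coin as fit. Greedy change-making gives the minimum number of coins for this particular set of denominations, but not for arbitrary ones. The function name make_change is hypothetical.

```r
# Assumed denominations in cents; the original list is not visible in the excerpt
denominations <- c(100, 25, 10, 5, 1)

# Hypothetical helper: greedy change-making for a whole number of cents.
# Returns a named vector with the count of each denomination used.
make_change <- function(amount, denoms = denominations) {
  denoms <- sort(denoms, decreasing = TRUE)
  counts <- integer(length(denoms))
  names(counts) <- denoms
  for (i in seq_along(denoms)) {
    counts[i] <- amount %/% denoms[i]   # how many of this coin fit
    amount    <- amount %% denoms[i]    # remainder still to be paid
  }
  counts
}

# Total number of each coin needed across every amount from 1 to 499 cents,
# i.e. the expected usage (up to a constant) under a uniform draw
usage <- rowSums(sapply(1:499, make_change))
```

Dividing usage by 499 gives the average number of each coin per transaction under the uniform assumption stated in the post.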

July 5, 2014
The latest version of rNOMADS is now available on CRAN.  This update resolves several minor bugs and one major one involving multiple variable/level selections when using the ModelGrid function.  I have also added support for two more models on NOMADS:  Climate Forecast System Flux Products and Climate Forecast System 3D Pressure Products.  This brings the

RDataMining group having 6000 members today

July 4, 2014
RDataMining Group: http://group.rdatamining.com Twitter: @RDataMining Website: http://www.RDataMining.com The RDataMining group has 6000 members today, 5 July 2014. Created in August 2011, this group has developed into a big community with 6000 members within three years. Since its creation, many members … Continue reading →

Automatic bias correction doesn’t fix omitted variable bias

July 4, 2014
Page 94 of Gelman, Carlin, Stern, Dunson, Vehtari, Rubin “Bayesian Data Analysis” 3rd Edition (which we will call BDA3) provides a great example of what happens when common broad frequentist bias criticisms are over-applied to predictions from ordinary linear regression: the predictions appear to fall apart. BDA3 goes on to exhibit what might be considered

Finding the distance from ChIP signals to genes

July 4, 2014
I’ve had a couple of months off from blogging. Time for some computer-assisted biology! Robert Griffin asks on Stack Exchange about finding the distance between HP1 binding sites and genes in Drosophila melanogaster.  We can get a rough idea with some public chromatin immunoprecipitation data, R and the wonderful BEDTools. Finding some binding sites There

Two handy documents for making good UK maps

July 4, 2014
Everybody loves a good map. Even if you don’t have any reason to make one, your boss will love it when you do, so check this out and get yourself a pay rise (possibly). First, this set of diagrams via … Continue reading →

The dendextend package for visualizing and comparing trees of hierarchical clusterings (slides from useR!2014)

July 3, 2014
This week at useR! 2014 I presented my package dendextend (also on github), for easily manipulating, visualizing, and comparing dendrograms. Put simply, it is a package designed to easily create figures like these: Here is my presentation from useR: You are also invited to take a look at the current version of the package vignettes: https://github.com/talgalili/dendextend/blob/master/vignettes/dendextend-tutorial.pdf I

Women Graduates in Math, Statistics, and Computer Information Systems

July 3, 2014
One of the more interesting talks at this year’s useR! Conference was the heR Panel discussing the role of women in the R community. They estimate that fewer than 15% of package authors are women. One of the points brought up was that this is less than the percentage of women in statistics. Perhaps this is more...

Efficient Ragged Arrays in R and Rcpp

July 3, 2014
When is R Slow, and Why? Computational speed is a common complaint lodged against R. Some recent posts on r-bloggers.com have compared the speed of R with some other programming languages, and showed the favorable impact of the new compiler package on run-times. I and others have written about using Rcpp to easily write C++...

useR! 2014 Highlights

July 3, 2014
My talk went well; here are the slides and a link to the paper pre-print. Hadley Wickham gave an excellent tutorial on dplyr. Based on the talk I saw, I think I will take the data sets from the book and make some public visualizations on the Plotly we...

Currency Exchange Rate Forecasting with ARIMA and STL

July 3, 2014
I have made an example of time series forecasting with R, demonstrating currency exchange rate forecasting with the ARIMA and STL models. The example is easy to understand and follow. R source files are provided to run the example. The … Continue reading →

How to Remember the Poisson Distribution

July 3, 2014
The Poisson cumulative distribution function (CDF) $$F(α,n) = \sum_{k=0}^n \dfrac{α^k}{k!} \; e^{-α} \label{eqn:pcdf}$$ is the probability of at most $n$ events occurring when the average number of events is α, i.e., $\Pr(X \le n)$. Since \eqref{eqn:pcdf} is a probability function, it cannot have a value greater than 1. In R, the CDF is given by the...
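As a quick check of the formula above, the sum can be evaluated directly and compared with base R's ppois, which computes the Poisson CDF. The excerpt is cut off before it names the R function, so using ppois here is an assumption about what the post goes on to say; the function name poisson_cdf and the example values of α and n are also chosen only for illustration.

```r
# Direct evaluation of F(alpha, n) = sum_{k=0}^{n} alpha^k / k! * exp(-alpha)
# (hypothetical helper name; fine for small n, where factorial() does not overflow)
poisson_cdf <- function(alpha, n) {
  k <- 0:n
  sum(alpha^k / factorial(k)) * exp(-alpha)
}

# Example values, chosen arbitrarily
alpha <- 2.5
n     <- 4

manual  <- poisson_cdf(alpha, n)
builtin <- ppois(n, lambda = alpha)   # Pr(X <= n) for X ~ Poisson(alpha)

# The two agree up to floating-point tolerance, and neither exceeds 1
agree <- isTRUE(all.equal(manual, builtin))
```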

Beer and Pie | rCharts pie charts with d3pie

July 3, 2014
In honor of the 4th of July, I thought a quick example of a pie chart on beer using the wonderful new d3pie library would be appropriate.  The rCharts binding with d3pie is simply an experiment now, but expect more in the near future.   Using slidify...

UseR! 2014 Tutorials

July 3, 2014
by Joseph Rickert UseR! 2014 got under way this past Monday with a very impressive array of tutorials delivered on the day that the conference organizers were struggling to cope with a record-breaking crowd. My guess is that conference attendance is somewhere in the 700 range. Moreover, this is the first year that I can remember that tutorials were...

useR 2014 Slides for PSAboot and Version 1.1 on CRAN

July 3, 2014
PSAboot is an R package to assist with bootstrapping propensity score methods. I gave a talk today at the useR! 2014 Conference. The slides can be downloaded from the PSAboot Github page or directly here. The package is described at jason.bryer.org/PSA...