# Monthly Archives: October 2014

## Vector Search vs. Binary Search

October 1, 2014

## Cross Validation for Kernel Density Estimation

October 1, 2014
$\mathbb{E}\left[\int [\widehat{f}_h(x)-f(x)]^2dx\right]$

In a post published in July, I mentioned the so-called Goldilocks principle in the context of kernel density estimation and bandwidth selection. The bandwidth should not be too small (the variance would be too large) and it should not be too large (the bias would be too large). Another standard method to select the bandwidth, as mentioned...
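The trade-off described in this excerpt can be illustrated with a small sketch. The excerpt is truncated, so the exact procedure used in the original post is not known; what follows assumes one common variant, least-squares cross-validation with a Gaussian kernel, and is written in plain Python rather than the R the post most likely used. The function names (`normal_pdf`, `lscv_score`, `select_bandwidth`), the simulated sample, and the bandwidth grid are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(u, sd):
    """Density of a N(0, sd^2) distribution evaluated at u."""
    return math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def lscv_score(data, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel KDE.

    Estimates (up to a constant not depending on h) the integrated
    squared error: integral of fhat^2 minus twice the average
    leave-one-out density at the observations.
    """
    n = len(data)
    sq_term = 0.0   # closed form of the integral of fhat^2 for Gaussian kernels
    loo_term = 0.0  # sum of leave-one-out kernel contributions
    for i in range(n):
        for j in range(n):
            d = data[i] - data[j]
            # Convolving two N(0, h^2) kernels gives a N(0, 2 h^2) kernel.
            sq_term += normal_pdf(d, h * math.sqrt(2.0))
            if i != j:
                loo_term += normal_pdf(d, h)
    return sq_term / n ** 2 - 2.0 * loo_term / (n * (n - 1))


def select_bandwidth(data, candidates):
    """Return the candidate bandwidth with the smallest LSCV score."""
    return min(candidates, key=lambda h: lscv_score(data, h))


# Assumed toy data: 100 draws from a standard normal, fixed seed.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
grid = [0.05 * k for k in range(1, 31)]  # candidate bandwidths 0.05 .. 1.50
best_h = select_bandwidth(sample, grid)
print("selected bandwidth:", best_h)
```

A very small bandwidth inflates the first term of the criterion (a spiky estimate, high variance), while a very large one shrinks the leave-one-out term (an oversmoothed estimate, high bias), so the minimum tends to sit somewhere in between.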

## New York Times approachably describes Bayesian Statistics

October 1, 2014
The New York Times published an article of interest to statisticians the other day: "The Odds, Continually Updated". Surprisingly for a general-audience newspaper, this article goes into the distinctions between Bayesian and frequentist statistics, and does so in a very approachable way. Here's an excerpt: The essence of the frequentist technique is to apply probability to data. If...

## Got a ticket for the runoff?

October 1, 2014
This is one of the very last postings before the election next Sunday. So far, the only certainty is the runoff ticket of the incumbent candidate, Dilma Rousseff (PT). The runner-up candidates, the environmentalist Marina Silva (PSB) and the Social Democrat Aecio Neves, are heading into a neck-and-neck contest in the final stretch. Although …

## Working with NIfTI images in R

October 1, 2014
The oro.nifti package is awesome for NeuRoimaging (couldn't help myself). It has functions to read/write images, introduces the S4 nifti class, and has useful plotting functions. There are some limitations and some gotchas that are important to discuss if you are working with these objects in R. Dataset Creation We'll read in some data (a

## multiple annotation in ChIPseeker

October 1, 2014
Nearest gene annotation: Almost all annotation software calculates the distance of a peak to the nearest TSS and assigns the peak to that gene. This can be misleading, as binding sites might be located between two start sites of different genes or hit different genes which have the same TSS location in the genome.

## Transparent hurricane paths in R

October 1, 2014
Arthur Charpentier has written a really nice blog post about obtaining hurricane tracks and plotting them. He then goes on to build clever Markov process models, but as a dataviz guy who knows almost nothing about meteorology, I want to …

## New fiscal sponsorship agreement with NumFocus foundation

October 1, 2014
I’m very pleased to announce that rOpenSci has signed a comprehensive fiscal sponsorship agreement with the NumFocus foundation, a 501(c)(3) nonprofit that supports R&D for open source scientific software projects. We are delighted to be in the company of esteemed projects such as IPython and Julia that share our goal of promoting reproducible research practices across many scientific communities...