# 1073 search results for "latex"

## Visualizing Clusters

February 24, 2015

Consider the following dataset, with (only) ten points: `x = c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)`, `y = c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)`, `plot(x, y, pch = 19, cex = 2)`. We want to get, say, two clusters; or more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to get an exhaustive list of all partitions, and to minimize some criterion, such...
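
The exhaustive enumeration the excerpt alludes to fits in a few lines of base R. The within-cluster sum of squares below is an assumed criterion, since the excerpt truncates before naming the post's actual one.

```r
# The ten points from the post
x <- c(.4, .55, .65, .9, .1, .35, .5, .15, .2, .85)
y <- c(.85, .95, .8, .87, .5, .55, .5, .2, .1, .3)
pts <- cbind(x, y)

# Enumerate every 2-set partition (point 1 fixed in cluster 1 to avoid
# counting label swaps twice) and keep the one with the smallest
# within-cluster sum of squares.
best <- list(score = Inf, grp = NULL)
for (code in 1:(2^9 - 1)) {
  grp <- c(1L, as.integer(intToBits(code))[1:9] + 1L)
  wss <- sum(sapply(1:2, function(k) {
    ck <- pts[grp == k, , drop = FALSE]
    sum(scale(ck, scale = FALSE)^2)   # squared distances to the cluster mean
  }))
  if (wss < best$score) best <- list(score = wss, grp = grp)
}
best$grp
```

With ten points there are only 2^9 - 1 = 511 candidate two-set partitions, so brute force is entirely feasible; for larger samples this blows up exponentially, which is exactly why heuristics like k-means exist.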

## k-means clustering and Voronoi sets

February 22, 2015

In the context of $k$-means, we want to partition the space of our observations into $k$ classes: each observation belongs to the cluster with the nearest mean. Here "nearest" is in the sense of some norm, usually the $\ell_2$ (Euclidean) norm. Consider the case where we have 2 classes, the means being respectively the 2 black dots. If we partition based...
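
A minimal base-R sketch of the nearest-mean rule; the two centers and test points below are made up, since the post's "black dots" appear only in its figure.

```r
# Nearest-mean (Voronoi) assignment in the Euclidean norm;
# centers and observations are illustrative, not the post's.
centers <- rbind(c(0.25, 0.25), c(0.75, 0.75))

nearest <- function(p, centers) {
  # squared Euclidean distance from point p to each center
  d2 <- rowSums((centers - matrix(p, nrow(centers), 2, byrow = TRUE))^2)
  which.min(d2)   # index of the closest mean
}

obs <- rbind(c(0.1, 0.2), c(0.9, 0.8), c(0.4, 0.7))
apply(obs, 1, nearest, centers = centers)  # → 1 2 2
```

The boundary between the two assignment regions is the perpendicular bisector of the segment joining the centers, which is what makes the regions Voronoi cells.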

## 12 nifty tips for scientists who use computers

February 16, 2015

Simple things are good. Here is a list of 12 things that I find simple and useful, yet not many of my colleagues use them. The list is R-biased. Knitr: an intuitive tool to integrate R and text to make reports with fancy fonts, figures, syntax-highlighted R code and equations. If …

## Getting a statistics education: Review of the MSc in Statistics (Sheffield)

February 14, 2015

Some background: I started using statistics for my research sometime in 1999 or 2000. I was a student in Linguistics at Ohio State, and I had just gotten interested in psycholinguistics. I knew almost nothing ...

## Inequalities and Quantile Regression

February 6, 2015

In the course on inequality measures, we've seen how to compute various (standard) inequality indices, based on some sample of incomes (which can be binned into various categories). On Thursday, we discussed the fact that incomes can be related to different variables (e.g. experience), and that comparing income inequalities between countries can be biased if they have very different...
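
As a sketch of the idea (the course's own data are not in the excerpt), quantile regression minimizes the check loss rho_tau(u) = u * (tau - 1{u < 0}); in base R this can be done with `optim()`, though in practice `quantreg::rq()` is the standard tool. The income/experience data here are simulated.

```r
# Quantile regression via the check loss, fitted with optim();
# simulated data with dispersion growing in experience.
set.seed(1)
experience <- runif(200, 0, 40)
income <- 20 + 1.5 * experience + rnorm(200, sd = 5 + 0.3 * experience)

qreg <- function(tau) {
  loss <- function(b) {
    u <- income - b[1] - b[2] * experience
    sum(u * (tau - (u < 0)))   # pinball / check loss
  }
  optim(c(0, 0), loss, control = list(maxit = 5000))$par  # (intercept, slope)
}
rbind(tau10 = qreg(0.1), tau50 = qreg(0.5), tau90 = qreg(0.9))
```

Because the noise scale grows with experience, the fitted slopes spread apart across quantiles, which is the kind of inequality pattern a single mean regression would hide.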

## What does a Bayes factor feel like?

January 29, 2015

A Bayes factor (BF) is a statistical index that quantifies the evidence for a hypothesis, compared to an alternative hypothesis (for introductions to Bayes factors, see here, here or here). Although the BF is a continuous measure of evidence, humans love verbal labels, categories, and benchmarks. Labels give interpretations of the objective index – and
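
To make the "continuous measure of evidence" concrete, here is a toy Bayes factor of my own (not from the post) for two point hypotheses: 60 heads in 100 flips, comparing p = 0.6 against p = 0.5.

```r
# BF = likelihood under H1 / likelihood under H0; for point hypotheses
# no prior integration is needed.
bf <- dbinom(60, 100, 0.6) / dbinom(60, 100, 0.5)
bf  # ~2.76: the kind of in-between value a verbal label tries to name
```

A value near 3 is exactly where benchmark schemes start disagreeing about labels like "anecdotal" versus "moderate" evidence, which is the post's point about verbal categories.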

## the density that did not exist…

January 26, 2015

On Cross Validated, I had a rather extended discussion with a user about a probability density, as I thought it could be decomposed into two manageable conditionals and simulated by Gibbs sampling. The first component led to a Gumbel-like density, with y being restricted to either (0,1) or (1,∞) depending on β. The density
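
The density in question is not reproduced in the excerpt, so as a generic illustration of the two-conditional Gibbs scheme described, here is a sampler for a standard bivariate normal with correlation rho, alternating draws from the two full conditionals.

```r
# Gibbs sampling for a bivariate normal with correlation rho:
# each full conditional is a univariate normal.
set.seed(1)
rho <- 0.8
n <- 5000
xs <- ys <- numeric(n)
x_cur <- 0
for (i in 1:n) {
  y_cur <- rnorm(1, rho * x_cur, sqrt(1 - rho^2))  # draw y | x
  x_cur <- rnorm(1, rho * y_cur, sqrt(1 - rho^2))  # draw x | y
  xs[i] <- x_cur
  ys[i] <- y_cur
}
cor(xs, ys)  # close to rho
```

The scheme works precisely because both conditionals are "manageable" (here, closed-form normals); when one conditional is only known up to a constant, a Metropolis-within-Gibbs step is the usual fallback.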

## stringdist 0.9: exercise all your cores

January 26, 2015

The latest release of the stringdist package for approximate text matching has two performance-enhancing novelties. First of all, encoding conversion got a lot faster, since this is now done from C rather than from R. Secondly, stringdist now employs multithreading …
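
A quick look at the interface, assuming stringdist ≥ 0.9 is installed from CRAN; the `nthread` argument is, per the package documentation, the multithreading knob the release notes mention.

```r
# stringdist is assumed installed: install.packages("stringdist")
if (requireNamespace("stringdist", quietly = TRUE)) {
  # optimal string alignment distances, computed on 2 threads
  print(stringdist::stringdist(c("latex", "lattes"), "lattice",
                               method = "osa", nthread = 2))
}
# base R's adist() gives plain Levenshtein distances for comparison
adist(c("latex", "lattes"), "lattice")  # → 4 and 3
```

For edit-distance methods without transpositions the two agree; stringdist's value is the wider menu of metrics (Jaro-Winkler, q-grams, …) and the C-level speed.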

## Comparing the contribution of NBA draft picks

January 25, 2015

When it comes to the NBA draft, experts tend to argue about a number of things: at which position will a player be selected? What is the best draft class ever? etc. Luckily, the wealth of data made available by the great people of http://www.basketball-reference.com/draft/ makes it possible to address a number of these and other questions. To…

## Introducing: Orthogonal Nonlinear Least-Squares Regression in R

January 17, 2015

With this post I want to introduce my newly bred ‘onls’ package, which conducts Orthogonal Nonlinear Least-Squares Regression (ONLS): http://cran.r-project.org/web/packages/onls/index.html. Orthogonal nonlinear least squares (ONLS) is a not-so-frequently applied and maybe overlooked regression technique that comes into play when one encounters an “error in variables” problem. While classical nonlinear least squares (NLS) aims
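
A sketch of what a call might look like, assuming `onls()` mirrors the familiar `nls()` interface (an assumption based on the package page, not the post itself); the data are simulated with noise in x as well as y, i.e. the "error in variables" situation, with base `nls()` as a fallback comparison.

```r
set.seed(1)
x <- seq(1, 10, length.out = 50) + rnorm(50, sd = 0.2)  # noise in the predictor too
y <- 2 * exp(0.25 * x) + rnorm(50, sd = 1)              # noise in the response
df <- data.frame(x = x, y = y)

# onls is assumed installed: install.packages("onls")
if (requireNamespace("onls", quietly = TRUE)) {
  # minimizes orthogonal (perpendicular) residuals
  fit <- onls::onls(y ~ a * exp(b * x), data = df, start = list(a = 1, b = 0.1))
  print(coef(fit))
} else {
  # classical NLS for comparison: minimizes vertical residuals only
  print(coef(nls(y ~ a * exp(b * x), data = df, start = list(a = 1, b = 0.1))))
}
```

When the predictor is measured without error, the two fits should nearly coincide; the orthogonal fit earns its keep precisely when x itself is noisy.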