# Blog Archives

## Hierarchical Linear Model

July 22, 2013

Linear regression is probably the most familiar technique in data analysis, but its application is often hamstrung by model assumptions. For instance, if the data has a hierarchical structure, quite often the assumptions of linear regression are feas...

## Bayesian Classification with Gaussian Process

January 6, 2013

Despite the prowess of the support vector machine, it is not specifically designed to extract features relevant to the prediction. For example, in network intrusion detection, we need to learn relevant network statistics for the network defense. In consu...

## Bayesian Inference Using OpenBUGS

July 22, 2012

In our previous statistics tutorials, we have treated population parameters as fixed values, and provided point estimates and confidence intervals for them. An alternative approach is Bayesian statistics. It treats population parameters as random...

## Significance Test for Kendall’s Tau-b

April 15, 2012

A variation of the standard definition of the Kendall correlation coefficient is necessary in order to deal with data samples with tied ranks. It is known as Kendall’s tau-b coefficient and is more effective in determining whether two non-parametric data samples with ties are correlated.
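As a rough illustration of how tied ranks enter the coefficient, here is a minimal Python sketch of tau-b using the common pairwise definition. The function name `kendall_tau_b` is hypothetical, the code is not taken from the tutorial itself, and it only computes the coefficient, not the significance test.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length samples (O(n^2) pairwise sketch)."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")

    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied in both samples: drops out of both factors
        elif dx == 0:
            ties_x_only += 1      # tied in x only
        elif dy == 0:
            ties_y_only += 1      # tied in y only
        elif dx * dy > 0:
            concordant += 1       # pair ordered the same way in x and y
        else:
            discordant += 1       # pair ordered in opposite ways

    untied = concordant + discordant
    denom = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when one sample is constant")
    return (concordant - discordant) / denom


# Example with one tie in each sample.
tau = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
```

Without ties, the denominator reduces to the total number of pairs and the value matches the standard Kendall coefficient.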

## Support Vector Machine with GPU, Part II

October 21, 2011

In our last tutorial on SVM training with GPU, we mentioned a necessary step to pre-scale the data with rpusvm-scale, and to reverse the scaling of the prediction outcome. This cumbersome procedure is now simplified with the latest RPUSVM.
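To make the scale-then-reverse round trip concrete, here is a minimal Python sketch of linear min-max scaling and its inverse. It is an assumption-laden illustration only: the helper names are hypothetical, the target range of [-1, 1] is assumed, and nothing here reflects how rpusvm-scale is actually implemented.

```python
def fit_min_max(values, lo=-1.0, hi=1.0):
    """Record the parameters needed to scale values linearly into [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmin == vmax:
        raise ValueError("cannot scale a constant column")
    return vmin, vmax, lo, hi


def scale(value, params):
    """Map a raw value into the target range."""
    vmin, vmax, lo, hi = params
    return lo + (value - vmin) * (hi - lo) / (vmax - vmin)


def unscale(scaled, params):
    """Map a scaled value (e.g. a prediction) back to the original units."""
    vmin, vmax, lo, hi = params
    return vmin + (scaled - lo) * (vmax - vmin) / (hi - lo)


# Hypothetical response values: scale before training, unscale predictions after.
params = fit_min_max([10.0, 20.0, 30.0])
scaled_mid = scale(20.0, params)
restored = unscale(scale(25.0, params), params)
```

The same stored parameters must be used in both directions, which is what makes the manual procedure easy to get wrong.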

## Support Vector Machine with GPU

August 27, 2011

Most elementary statistical inference algorithms assume that the data can be modeled by a set of linear parameters with a normally distributed noise component. A new class of algorithms called support vector machines (SVM) removes this constraint.

## Kendall Rank Coefficient by GPU

December 7, 2010

The correlation coefficient is a measure of correlation between two random variables. While its computation is straightforward, it is not readily applicable to non-parametric statistics.

## Hierarchical Cluster Analysis

November 25, 2010

With the distance matrix found in the previous tutorial, we can use various techniques of cluster analysis for relationship discovery. For example, in the data set mtcars, we can run the distance matrix with hclust, and plot a dendrogram that displays a ...

## GPU Computing with R

August 16, 2010

Statistics is computationally intensive. Routine statistical tasks such as data extraction, graphical summary, and technical interpretation all require heavy use of modern computing machinery. Obviously, these tasks can benefit greatly from a paralle...