Blog Archives

Practical Kullback-Leibler (KL) Divergence: Discrete Case

August 5, 2015

KL divergence (Kullback-Leibler), or KL distance, is a non-symmetric measure of the difference between two probability distributions. It is related to mutual information and can be used to measure the association between two random variables. In this short tutorial, I show how to compute KL divergence and mutual information for two categorical variables, interpreted as discrete random variables. Definition: Kullback-Leibler (KL) Distance...

Read more »
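The teaser above is cut off before the definition, so here is a minimal sketch of the discrete case using the standard textbook form $D(P \| Q) = \sum_i p_i \log(p_i / q_i)$. It is written in Python rather than R, and the function names `kl_divergence` and `mutual_information` are illustrative assumptions, not taken from the post. Mutual information is computed as the KL divergence between a joint distribution and the product of its marginals.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same categories.
    Hypothetical helper for illustration only.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of categories")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # by convention, 0 * log(0 / q) contributes 0
        if q_i == 0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


def mutual_information(joint):
    """Mutual information of two categorical variables.

    joint is a list of rows holding the joint probabilities P(X=i, Y=j).
    Computed as D(joint || product of marginals).
    """
    row_marginal = [sum(row) for row in joint]
    col_marginal = [sum(col) for col in zip(*joint)]
    flat_joint = [value for row in joint for value in row]
    # Same row-major order as flat_joint
    flat_product = [r * c for r in row_marginal for c in col_marginal]
    return kl_divergence(flat_joint, flat_product)
```

Swapping the arguments of `kl_divergence` generally gives a different value, which is the non-symmetry mentioned in the teaser.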

Scale back or transform back multiple linear regression coefficients: Arbitrary case with ridge regression

April 10, 2015

Summary: In data science and machine learning applications, different features or predictors commonly appear on different scales. This can make the resulting coefficients of a linear regression difficult to interpret, such as one featur...

Read more »

Euclid Algorithm for Set of Integers: ‘Reduce’ vs. trees in R

May 7, 2014

The Euclid Algorithm computes the greatest common divisor (GCD) of two natural numbers $x_{1}$ and $x_{2}$, denoted by $GCD(x_{1}, x_{2})$. This is the largest integer that divides both $x_{1}$ and $x_{2}$. Solution is proposed by ...

Read more »
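As a rough illustration of the 'Reduce' idea in the title, the pairwise Euclid algorithm can be folded over a collection of integers, since the GCD of a set equals the GCD of the running result with each next element. The sketch below is in Python rather than R, and `gcd_pair` and `gcd_set` are hypothetical names chosen for this example. It does not attempt the tree-based variant the post compares against.

```python
from functools import reduce


def gcd_pair(a, b):
    """Euclid algorithm for two integers: replace (a, b) by (b, a mod b)
    until the remainder is zero."""
    while b:
        a, b = b, a % b
    return abs(a)


def gcd_set(values):
    """GCD of a non-empty collection of integers, folding gcd_pair
    from left to right."""
    values = list(values)
    if not values:
        raise ValueError("gcd_set needs at least one integer")
    return reduce(gcd_pair, values)
```

For example, `gcd_set([12, 18, 24])` folds as `gcd_pair(gcd_pair(12, 18), 24)`.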

Particle approximation to probability density functions: Dirac delta function representation

January 17, 2014

In the previous post, I briefly showed the idea of using the Dirac delta function for discrete data representation. In the second example there, the histogram locations for a given set of points are presented as spike trains, whereas the heights are somehow...

Read more »

Demystify Dirac delta function for data representation on discrete space

November 20, 2013

The Dirac delta function is an important tool in Fourier analysis. It is used routinely, especially in electrodynamics and signal processing. A function over a set of data points is often shown with a delta function representation. A novice reader relyin...

Read more »

A technique for doing parametrized unit testing in R: Case study with stock price data analysis

September 13, 2013

Ensuring the quality and correctness of statistical or scientific software in general constitutes one of the main responsibilities of scientific software developers and scientists who provide code to solve a specific computational task. Sometimes t...

Read more »

Metaprogramming in R with an example: Beating lazy evaluation

September 5, 2013

Functional languages allow us to treat functions as types. This brings the distinct advantage of being able to write code that generates further code; this practice is generally known as metaprogramming. As a functional language, the R project provides ...

Read more »

Practicing static typing in R: Prime directive on trusting our functions with object oriented programming

June 13, 2013

John Chambers, the creator of the S language, from which R is derived, said in one of his books, Software for Data Analysis: Programming with R: ...This places an obligation on all creators of software to program in such a way that the computations ca...

Read more »

Ripley Facts

June 10, 2013

Normally, this blog contains only technical and scientific posts. But this time I would like to share with you a very interesting phenomenon I came across on the R mailing list(s). I call it 'Ripley Facts' after the prolific statistician, ...

Read more »

Matrix Cumulative Coherence: Fourier Bases, Random and Sensing Matrices

April 9, 2013

Compressive sampling (CS) is revolutionizing the way we approach analog-to-digital conversion, our understanding of linear systems, and the limits of information theory. One of the key concepts in CS is that a signal can be represented in a sparse basis o...

Read more »