Blog Archives

Understanding the empirical law of large numbers and the gambler’s fallacy

August 1, 2016

One common misconception in statistics, a counter-intuitive fallacy, is the assumption that a law of averages exists. Imagine we toss a fair coin many times: most people would think that the numbers of heads and tails would balance out as the number of trials increases, which is wrong....

Read more »
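The distinction in the excerpt above can be illustrated with a small simulation. This is only a sketch, not code from the post: the function name `coin_toss_summary`, the seed, and the checkpoints are assumptions chosen for illustration. It tracks the fraction of heads, which settles near 0.5, next to the absolute gap between heads and tails, which has no tendency to shrink and typically grows on the order of the square root of the number of tosses.

```python
import random


def coin_toss_summary(n_tosses, checkpoints, seed=42):
    """Simulate fair coin tosses and report statistics at given checkpoints.

    Returns a list of (n, fraction_of_heads, abs(heads - tails)) tuples.
    The seed is an arbitrary choice so that runs are repeatable.
    """
    rng = random.Random(seed)
    wanted = set(checkpoints)
    heads = 0
    rows = []
    for i in range(1, n_tosses + 1):
        if rng.random() < 0.5:  # a head with probability 1/2
            heads += 1
        if i in wanted:
            tails = i - heads
            rows.append((i, heads / i, abs(heads - tails)))
    return rows


if __name__ == "__main__":
    for n, frac, gap in coin_toss_summary(100_000, [100, 1_000, 10_000, 100_000]):
        print(f"n={n:>7}  fraction of heads={frac:.4f}  |heads - tails|={gap}")
```

The fraction converging to 0.5 is what the empirical law of large numbers describes; expecting the raw counts to even out is the gambler's fallacy.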

Economy and dynamic modelling: Haavelmo’s approach

July 25, 2016

Econometrics aims at estimating observables in the economy and their inter-dependencies, and at testing the estimates against economic reality. A quantitative way to express these inter-dependencies is as simultaneous equations, i.e. a system of linear equations. This mathematical structure of economic relationships was made possible by the pioneering work of the Nobel Prize-winning economist...

Read more »

S-shaped data: Smoothing with quasibinomial distribution

January 16, 2016

Figure 1: Synthetic data and fitted curves.

S-shaped data can be found in many applications. Such data can be approximated with the logistic distribution function. The cumulative distribution function of the logistic distribution is a...

Read more »
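For readers who want to see the S-shape mentioned in the excerpt above, here is a minimal sketch of the logistic cumulative distribution function, F(x) = 1 / (1 + exp(-(x - mu) / s)). It does not reproduce the quasibinomial smoothing from the post; the function name `logistic_cdf` and the parameter names `mu` (location) and `s` (scale) are assumptions made for illustration.

```python
import math


def logistic_cdf(x, mu=0.0, s=1.0):
    """Logistic CDF with location mu and scale s (s must be positive).

    The two branches are algebraically identical; they only avoid
    overflow in exp() for arguments of large magnitude.
    """
    if s <= 0:
        raise ValueError("scale s must be positive")
    z = (x - mu) / s
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    # Values rise from near 0 to near 1, passing 0.5 at x = mu.
    for x in (-6, -3, 0, 3, 6):
        print(x, round(logistic_cdf(x), 4))
```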

Practical Kullback-Leibler (KL) Divergence: Discrete Case

August 5, 2015

KL divergence (Kullback-Leibler57) or KL distance is a non-symmetric measure of the difference between two probability distributions. It is related to mutual information and can be used to measure the association between two random variables. In this short tutorial, I show how to compute KL divergence and mutual information for two categorical variables, interpreted as discrete random variables. ${\bf Definition}$: Kullback-Leibler (KL) Distance...

Read more »
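The discrete case described in the excerpt above can be sketched as follows. This is not the code from the post; the function names `kl_divergence` and `mutual_information` are assumptions made for illustration. It uses the standard definitions: D(p || q) = sum_i p_i log(p_i / q_i), and mutual information as the KL divergence between the joint distribution and the product of its marginals. Results are in nats (natural logarithm).

```python
import math
from collections import Counter


def kl_divergence(p, q):
    """D(p || q) for two discrete distributions given as equal-length sequences.

    Terms with p_i == 0 contribute 0; if q_i == 0 where p_i > 0 the
    divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def mutual_information(xs, ys):
    """Mutual information of two categorical samples of equal length.

    Empirical frequencies are used as probability estimates.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        pxy = count / n
        mi += pxy * math.log(pxy / ((px[x] / n) * (py[y] / n)))
    return mi


if __name__ == "__main__":
    p, q = [0.5, 0.5], [0.9, 0.1]
    # The two directions differ, which is the non-symmetry noted above.
    print(kl_divergence(p, q), kl_divergence(q, p))
```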

Scale back or transform back multiple linear regression coefficients: Arbitrary case with ridge regression

April 10, 2015

Summary: In a common case in data science or machine learning applications, different features or predictors manifest themselves on different scales. This can make it difficult to interpret the resulting coefficients of a linear regression, such as one featur...

Read more »

Euclid Algorithm for Set of Integers: ‘Reduce’ vs. trees in R

May 7, 2014

The Euclid Algorithm provides a solution to the greatest common divisor (GCD) of two natural numbers $x_{1}$ and $x_{2}$, denoted by $GCD(x_{1}, x_{2})$. This will produce the largest integer that divides both $x_{1}$ and $x_{2}$. The solution is proposed by ...

Read more »
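The idea in the excerpt above extends from two numbers to a set because GCD is associative: GCD(x1, x2, x3) = GCD(GCD(x1, x2), x3). A minimal sketch follows. It is not the post's implementation: it uses Python's `functools.reduce` as a stand-in for R's `Reduce`, and the function names `gcd_pair` and `gcd_set` are assumptions made for illustration.

```python
from functools import reduce


def gcd_pair(a, b):
    """Euclid's algorithm for two non-negative integers."""
    while b:
        a, b = b, a % b  # replace (a, b) with (b, a mod b) until b is 0
    return a


def gcd_set(numbers):
    """GCD of a non-empty collection, folding gcd_pair from left to right."""
    numbers = list(numbers)
    if not numbers:
        raise ValueError("need at least one integer")
    return reduce(gcd_pair, numbers)


if __name__ == "__main__":
    print(gcd_pair(48, 18))       # 6
    print(gcd_set([12, 18, 30]))  # 6
```

A left fold handles one pair at a time in sequence; a tree-shaped evaluation, as the post title suggests, would instead combine pairs independently, which gives the same result by associativity.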

Particle approximation to probability density functions: Dirac delta function representation

January 17, 2014

In the previous post, I briefly showed the idea of using the Dirac delta function for discrete data representation. In the second example there, the histogram locations for a given set of points are presented as spike trains, whereas the heights are somehow...

Read more »

Demystify Dirac delta function for data representation on discrete space

November 20, 2013

The Dirac delta function is an important tool in Fourier analysis. It is used routinely, especially in electrodynamics and signal processing. A function over a set of data points is often shown with a delta function representation. A novice reader relyin...

Read more »

A technique for doing parametrized unit testing in R: Case study with stock price data analysis

September 13, 2013

Ensuring the quality and correctness of statistical or scientific software in general constitutes one of the main responsibilities of scientific software developers and scientists who provide code to solve a specific computational task. Sometimes t...

Read more »

A technique for doing parameterized unit test in R: Case study with stock price data analysis

September 13, 2013

Ensuring the quality and correctness of statistical or scientific software in general constitutes one of the main responsibilities of scientific software developers and scientists who provide code to solve a specific computational task. Sometimes tasks can be mission critical. For example, in drug trials, clinical research or designing an aviation-related component, a wrong outcome would risk...

Read more »
