1543 search results for "regression"

Classifying the UCI mushrooms

In my last post, I considered the shifts in two interestingness measures as possible tools for selecting variables in classification problems.  Specifically, I considered the Gini and Shannon interestingness measures applied to the 22 categorical mushroom characteristics from the UCI mushroom dataset.  The proposed variable selection strategy was to compare these values when computed from only edible mushrooms...

Read more »

R, the master troll of statistical languages

June 8, 2012
By
R, the master troll of statistical languages

Warning: what follows is a somewhat technical discussion of my love-hate relationship with the R statistical language, in which I somehow manage to waste 2,400 words talking about a single line of code. Reader discretion is advised. I’ve been using R to do most of my statistical analysis for about 7 or 8 years now–ever

Read more »

Simulation in the profiling model

June 7, 2012
By
Simulation in the profiling model

In this post I try to make a small simulation of the sensory (flavour) profiling data, and examine if the parameters of simulated data can be retrieved by the Bayesian model build in the previous posts.The conclusion is that it is difficult, the amount...

Read more »

Simulating the Birthday Problem with data derived probabilities

June 6, 2012
By
Simulating the Birthday Problem with data derived probabilities

You've probably heard of the Birthday Paradox: it only takes a small gathering of people before it's quite likely that two of them share the same birthday. You can solve the problem analytically or with simulation, but usually in either case simplifying assumptions are made (no-one born on February 29, for example). Joe Rickert uses Revolution R Enterprise 6...

Read more »

Let’s Party!

June 6, 2012
By
Let’s Party!

Exploring whether regression coefficients differ between groups is an important part of applied econometric research, and particularly for research with a policy based objective. For example, a government in a developing country may decide to introduce free school lunches in an effort to improve childhood health. However, if this treatment is known to only improve

Read more »

Facts About R Packages (1)

June 6, 2012
By

R Packages growth Curve Why R is so popular? There are a lot of reasons, such as: easy to learn and convenient to use, active community, open source, etc. Another important reason is the numerous contributed packages. Up to yesterday, there are 3854 R packages on CRAN. The following figure shows the growth curve of R package:

Example 9.34: Bland-Altman type plot

June 5, 2012
By
Example 9.34: Bland-Altman type plot

The Bland-Altman plot is a visual aid for assessing differences between two ways of measuring something. For example, one might compare two scales this way, or two devices for measuring particulate matter. The plot simply displays the difference between the measures against their average. Rather than a statistical test, it is intended...

Read more »

Announcing Revolution R Enterprise 6.0

June 5, 2012
By

Revolution Analytics is proud to announce the latest update to our enhanced, production-grade distribution of R, Revolution R Enterprise. This update expands the range of supported computation platforms, adds new Big Data predictive models, and updates to the latest stable release of open source R (2.14.2), which improves performance of the R interpreter by about 30%. This release expands...

Read more »

Generate Quasi-Poisson Distribution Variable

June 4, 2012
By

Most of regression methods assume that the response variables follow some exponential distribution families, e.g. Guassian, Poisson, Gamma, etc. However, this assumption was frequently violated in real world data by, for example, zero-inflated overdispersion problem. A number of methods were developed to deal with such problem, and among them, Quasi-Poisson and Negative Binomial are the most popular methods perhaps due...

Read more »

Fama-MacBeth and Cluster-Robust (by Firm and Time) Standard Errors in R

June 2, 2012
By
Fama-MacBeth and Cluster-Robust (by Firm and Time) Standard Errors in R

Ever wondered how to estimate Fama-MacBeth or cluster-robust standard errors in R? It can actually be very easy. First, for some background information read Kevin Goulding’s blog post, Mitchell Petersen’s programming advice, Mahmood Arai’s paper/note and code (there is an earlier version of the code with some more comments in it). For more formal references you may want to

Read more »