# Monthly Archives: November 2011

## Regression via Gradient Descent in R

November 27, 2011

In a previous post I derived the least squares estimators using basic calculus, algebra, and arithmetic, and also showed how the same results can be achieved using the canned functions in SAS and R or via the matrix programming capabilities offered by ...

## Basic Econometrics in R and SAS

November 27, 2011

Regression basics: $y = b_0 + b_1 X$ is the regression line we want to fit. The method of least squares minimizes the squared distance between the line $y$ and the individual data observations $y_i$. That is, minimize: $\sum e_i^2 = \sum (y_i - b_0 - b_1 X_i \dots$
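
The excerpt above is cut off mid-equation, so what follows is the standard textbook completion of that minimization rather than the post's own text. It assumes the truncated sum is the usual sum of squared residuals over $n$ observations, with $\bar{X}$ and $\bar{y}$ denoting sample means (notation introduced here for illustration).

$$
S(b_0, b_1) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 X_i)^2
$$

Setting both partial derivatives to zero gives the normal equations:

$$
\frac{\partial S}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 X_i) = 0,
\qquad
\frac{\partial S}{\partial b_1} = -2 \sum_{i=1}^{n} X_i \, (y_i - b_0 - b_1 X_i) = 0
$$

Solving them yields the least squares estimators:

$$
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(y_i - \bar{y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{X}
$$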

November 27, 2011

In a previous post I discussed the concept of gradient descent. Given some recent work in the online machine learning course offered at Stanford, I'm going to extend that discussion with an actual example using R code (the actual code...

## Dealing with Non-Positive Definite Matrices in R

November 27, 2011

Last time we looked at the Matrix package and dug a little into chol(), the Cholesky decomposition function. I noted that in finance we often do not have a positive definite (PD) matrix. The chol() function in both the Base and Matrix...

## Cleaning time-series and other data streams

The need to analyze time-series or other forms of streaming data arises frequently in many different application areas.  Examples include economic time-series like stock prices, exchange rates, or unemployment figures, biomedical data sequences like electrocardiograms or electroencephalograms, or industrial process operating data sequences like temperatures, pressures or concentrations.  As a specific example, the figure below shows four data sequences:...

## GTA R Users Group – Using R for Data Mining Competitions

November 27, 2011

Here are the presentation slides I used for my talk on “Using R for Data Mining Competitions” at Ryerson University as part of the Greater Toronto Area (GTA) R User’s Meetup Group. Links: Presentation (Prezi), Presentation (PDF), Meetup Event page. Special thanks to Anthony Goldbloom from Kaggle and various competition winners for sharing their code and...

## Analytics using R: Most active in my Twitter list

November 27, 2011

I follow some 80-odd people and news sources on my Twitter account. For a while I wondered which of these sources are most active on Twitter. I picked a simple metric, the number of status messages posted to Twitter, as the measure of activity. Using R, I quickly wrote a program to generate my top 10 most active...

## Putting it all together: concise code to make dotplots with weighted bootstrapped standard errors

November 27, 2011

I analyze a lot of experiments and there are many times when I want to quickly look at means and standard errors for each cell (experimental condition), or the same for each cell and individual-level attribute level (e.g., Democrat, Independent, …

## ..A Quick Geo-Trick for GoogleMaps in R (using dismo)

November 26, 2011

... I thought this geocoding bit might be worth sharing (found HERE when searching the web for dismo documentation).

## Comparing StackOverflow and the R-help mailing list

November 26, 2011

I only recently discovered StackOverflow. I know, for a nerd who has been programming for many years, that is quite late. For those who are not familiar with StackOverflow (aka SO), it is a question-and-answer site for programmers. It is…