1491 search results for "regression"

Solving easy problems the hard way

March 17, 2012

There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this: This problem can be solved by pre-school children in 5-10 minutes, by programer – in 1 hour, by people with higher education … well, check it yourself!  8809=6 7111=0 2172=0 6666=4 1111=0 3213=0 7662=2 9313=1 0000=4 2222=0 3333=0 5555=0 8193=3 8096=5 7777=0 9999=4 7756=1 6855=3 9881=5 5531=0 2581=? SPOILER ALERT… The answer has to do with how many

NIT: Fatty acids study in R – Part 006

March 12, 2012

In one of the columns, for constituent C16_0, one sample (57) has a value of “zero” (we could see this in the histogram). The reason is that the laboratory did not supply this value. The PLS regression will consider the lab value as zero, s...
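
One way to keep a placeholder zero from being treated as a real measurement is to set it aside before fitting. The following is a minimal sketch under stated assumptions: the sample records and their values are invented, and only the column name C16_0 and sample number 57 come from the post. The original series works in R; Python is used here purely for illustration.

```python
# Hypothetical records: (sample id, reference lab value for C16_0).
# The numbers are invented for illustration only.
samples = [(55, 23.1), (56, 24.8), (57, 0.0), (58, 22.4)]


def split_missing(records, placeholder=0.0):
    """Separate records whose lab value is a placeholder from usable ones."""
    usable = [(sid, v) for sid, v in records if v != placeholder]
    missing = [sid for sid, v in records if v == placeholder]
    return usable, missing


usable, missing = split_missing(samples)
print("calibration samples:", [sid for sid, _ in usable])
print("excluded (no lab value):", missing)
```

Only the usable records would then be passed to the regression, so the missing lab value cannot pull the calibration toward zero.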

NIT: Fatty acids study in R – Part 005

March 9, 2012

There are several algorithms to run a PLS regression (I recommend consulting the books “Introduction to Multivariate Analysis in Chemometrics - Kurt Varmuza & Peter Filzmoser” and “Chemometrics with R – Ron Wehrens”). We are going to use ...

Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com. Building prediction/classification models is one of the most common data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion at the RDataMining group on LinkedIn with the …

Futures price prediction using the order book data

March 5, 2012

It has been a couple of months since my last post; I have been busy with lots of projects. I had some fun playing around with data from the Interactive Brokers API. It turns out that it is relatively easy to get hold of the raw market data relating to both trades and order book changes for CME/NYMEX commodity futures. For the purposes of...

Gastwirth’s location estimator

The problem of outliers – data points that are substantially inconsistent with the majority of the other points in a dataset – arises frequently in the analysis of numerical data.  The practical importance of outliers lies in the fact that even a few of these points can badly distort the results of an otherwise reasonable data analysis.  This outlier-sensitivity...
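
The excerpt stops before the estimator itself is defined. As usually stated, Gastwirth's estimator is a weighted sum of three order statistics: 0.3 times the 1/3 quantile, 0.4 times the median, and 0.3 times the 2/3 quantile. The sketch below assumes that definition and a linear-interpolation quantile rule (other quantile conventions give slightly different values); the data and the helper names `quantile` and `gastwirth` are hypothetical.

```python
import math


def quantile(sorted_x, p):
    """Quantile by linear interpolation between order statistics."""
    h = (len(sorted_x) - 1) * p
    lo = math.floor(h)
    hi = min(lo + 1, len(sorted_x) - 1)
    return sorted_x[lo] + (h - lo) * (sorted_x[hi] - sorted_x[lo])


def gastwirth(x):
    """0.3 * Q(1/3) + 0.4 * median + 0.3 * Q(2/3)."""
    s = sorted(x)
    return (0.3 * quantile(s, 1 / 3)
            + 0.4 * quantile(s, 0.5)
            + 0.3 * quantile(s, 2 / 3))


# Invented data: the second list replaces the largest value with an outlier.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
contaminated = clean[:-1] + [1000]

for name, data in [("clean", clean), ("contaminated", contaminated)]:
    mean = sum(data) / len(data)
    print(f"{name}: mean={mean:.2f}, gastwirth={gastwirth(data):.2f}")
```

Because the estimator only looks at the central order statistics, a single extreme point moves the mean substantially while leaving the Gastwirth estimate unchanged in this example.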

Modeling Trick: the Signed Pseudo Logarithm

March 1, 2012

Much of the data that the analyst uses exhibits extraordinary range. For example: incomes, company sizes, popularity of books and any “winner takes all” process (see: Living in A Lognormal World). Tukey recommended the logarithm as an important “stabilizing transform” (a transform that brings data into a more usable form prior to generating exploratory statistics, ...
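
The excerpt is cut off before the transform is defined, so the following is only a sketch of one form that fits the name, assumed here rather than taken from the text: `asinh(x / 2) / ln(10)`. It is defined at zero and for negative values, is antisymmetric in `x`, and approaches `log10(x)` for large positive `x`. The function name `pseudo_log10` is hypothetical.

```python
import math


def pseudo_log10(x):
    """Signed pseudo logarithm (assumed form): asinh(x / 2) / ln(10)."""
    return math.asinh(x / 2.0) / math.log(10.0)


# Behaves like log10 for large magnitudes, stays finite at zero,
# and keeps the sign of its argument.
for value in [-1_000_000, -10, 0, 10, 1_000_000]:
    print(value, round(pseudo_log10(value), 4))
```

Unlike a plain logarithm, this can be applied to columns that mix gains and losses or contain exact zeros without adding an arbitrary offset.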

A Direct Marketing In-flight Forecasting System

February 29, 2012

This is an edited version of A Direct Marketing In-flight Forecasting System. The original article was written by Shannon Terry and Ben Ogorek, Nationwide Insurance, in order to enter the “Applications of R in Business” contest organised by Revolution Analytics. This is the winning entry of the contest. I added some notes in the third

Exponential smoothing and regressors

February 28, 2012

I have thought quite a lot about including regressors (i.e. covariates) in exponential smoothing (ETS) models, and I have done it a couple of times in my published work. See my 2008 exponential smoothing book (chapter 9) and my 2008 Tourism Management paper. However, there are some theoretical issues with these approaches, which have come to light through the research of...

R and Salesforce

February 25, 2012

Introduction: R is widely used among scientists and statisticians to perform statistical analysis, while Salesforce.com is one of the leading CRM software packages used for marketing and sales force automation. Salesforce.com contains vital information regarding Leads, Customers, Contacts, Opportunities and Cases. Currently this data is mainly used for operational purposes by sales and marketing professionals. How
