1412 search results for "regression"

Survey in R course taught by Thomas Lumley at Statistics.com

March 19, 2012

Statistics.com is offering a new online course, “Survey Analysis in R,” running March 23 – April 20 with Dr. Thomas Lumley. Dr. Lumley is the creator of the R package “survey” and the author of the course text, “Complex Surveys: A Guide to Analysis Using R.” The course is suitable for those with some familiarity with R and...

Liking of apples – more than juiciness

March 18, 2012

In a previous blog post it was shown, using literature data, that liking of apples was related to juiciness. However, some questions remained. Is the relation linear or slightly curved? The variation in liking around CJuiciness is large. Are more explana...

Predicting Marketing Campaign with R

March 17, 2012

In my last blog post I created a mechanism to fetch data from Salesforce using rJava and SOQL. In this post I am going to use that mechanism to fetch ad campaign data from Salesforce and predict future ad campaign sales using R. Let us assume that Salesforce has campaign data for the last eight quarters. This

Solving easy problems the hard way

March 17, 2012

There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this: This problem can be solved by pre-school children in 5-10 minutes, by programmers – in 1 hour, by people with higher education … well, check it yourself! 8809=6 7111=0 2172=0 6666=4 1111=0 3213=0 7662=2 9313=1 0000=4 2222=0 3333=0 5555=0 8193=3 8096=5 7777=0 9999=4 7756=1 6855=3 9881=5 5531=0 2581=? SPOILER ALERT… The answer has to do with how many

NIT: Fatty acids study in R – Part 006

March 12, 2012

In one of the columns, for constituent C16_0, one sample (57) has a value of “zero” (we could see this in the histogram). The reason is that the laboratory did not supply this value. The PLS regression will treat the lab value as zero, s...

NIT: Fatty acids study in R – Part 005

March 9, 2012

There are several algorithms to run a PLS regression (I recommend consulting the books “Introduction to Multivariate Statistical Analysis in Chemometrics” by Kurt Varmuza and Peter Filzmoser and “Chemometrics with R” by Ron Wehrens). We are going to use ...

Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com. Building prediction/classification models is one of the most common data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion in the RDataMining group on LinkedIn with the …

Futures price prediction using the order book data

March 5, 2012

It has been a couple of months since my last post; I have been busy with lots of projects. I had some fun playing around with data from the Interactive Brokers API. It turns out that it is relatively easy to get hold of the raw market data relating to both trades and order book changes for CME/NYMEX commodity futures. For the purposes of...

Gastwirth’s location estimator

The problem of outliers – data points that are substantially inconsistent with the majority of the other points in a dataset – arises frequently in the analysis of numerical data.  The practical importance of outliers lies in the fact that even a few of these points can badly distort the results of an otherwise reasonable data analysis.  This outlier-sensitivity...

Modeling Trick: the Signed Pseudo Logarithm

March 1, 2012

Much of the data that the analyst uses exhibits extraordinary range. Examples include incomes, company sizes, popularity of books and any “winner takes all” process (see: Living in A Lognormal World). Tukey recommended the logarithm as an important “stabilizing transform” (a transform that brings data into a more usable form prior to generating exploratory statistics, ...
