Monthly Archives: June 2013

Creating Catch Data from Individual Length Measurements

June 6, 2013

This example has been updated in this post. I came across a “problem” today where I needed to create catch data for individual nets from length measurements made on individual fish in those nets. In other words, I had data …

Data Class Conversion

June 6, 2013

Data in R can be converted from one class to another. The conversion function is named with the prefix as. followed by the name of the data class that we wish to convert to. The data classes in R are the following:
numeric - as.numeric
vector - as.vector
character - as.cha...

How likely is the NSA PRISM program to catch a terrorist?

June 6, 2013

Recent revelations about PRISM, the NSA’s massive program of surveillance of civilian communications, have caused quite a stir. And rightfully so, as it appears that the agency has been granted warrantless direct access to just about any form of digital communication engaged in by American citizens, and that their access to such data has been

Feature Selection 3 – Swarm Mentality

June 6, 2013

"Bees don't swarm in a mango grove for nothing. Where can you see a wisp of smoke without a fire?" - Hla Stavhana

In the last two posts, genetic algorithms were used as feature wrappers to search for more effective subsets of predictors. Here, I will do the same with another type of search algorithm: particle swarm optimization....

Intro to Parallel Random Number Generation with RevoScaleR

June 6, 2013

by Joseph Rickert

Random number generation is fundamental to computational statistics. As you might expect, R is very rich in random number resources. The R base code provides several high-quality random number generators, including Wichmann-Hill, Marsaglia-Multicarry, Super-Duper, Mersenne-Twister, Knuth-TAOCP-2002 and L’Ecuyer-CMRG. (See Random for details.) There are also at least three packages, rsprng, rlecuyer, and rstream for...

Box-plot with R – Tutorial

June 6, 2013

Uncertain Demand Forecasting and Inventory Optimizing for Short-life-cycle Products

June 6, 2013

For short-life-cycle products such as newspapers and fashion, it is important to match supply with demand. However, because demand is uncertain, we sometimes order too little from the supplier and sometimes too much. We would lose sales and leave customers unsatisfied if we ordered too little, or we would let the

Inputting Data in Matrix Format

June 6, 2013

A matrix in R is created with the matrix, rbind, or cbind function. These functions are described as follows:
matrix - transforms concatenated data into a matrix of compatible dimensions.
rbind - short for row bind, that binds a conca...

At what sample size do correlations stabilize?

June 6, 2013

Maybe you have encountered this situation: you run a large-scale study over the internet, and out of curiosity, you frequently check the correlation between two variables. My experience with this practice is usually frustrating, as in small sample sizes (and we will see what “small” means in this context) correlations go up and down, change sign,

KDNuggets 2013 software poll results

June 5, 2013

The results of the 2013 KDNuggets software poll are in, with RapidMiner and R in a near-tie for first place. Of a record 1880 respondents, 737 reported using Rapid-I RapidMiner/RapidAnalytics, and 704 reported using R. Excel came in third: with 527 respondents, it was the lone commercial tool in the top 5. You can see the top 10 responses...
