Creating Catch Data from Individual Length Measurements

June 6, 2013

This example has been updated in this post. I came across a “problem” today where I needed to create catch data for individual nets from length measurements made on individual fish in those nets. In other words, I had data …

Read more »
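
For the catch-data entry above, here is a minimal sketch of the general idea, assuming one row per measured fish with a net identifier and a species code. The data frame `fish`, its column names, and the species codes are all hypothetical, and this is not the author's code.

```r
# Hypothetical length data: one row per measured fish
fish <- data.frame(
  net       = c(1, 1, 1, 2, 2, 3),
  species   = c("BLG", "BLG", "LMB", "BLG", "LMB", "LMB"),
  length_mm = c(120, 135, 310, 128, 295, 330)
)

# Count fish per net and species; the cross-tabulation keeps empty
# net-by-species combinations as zero catches
catch <- as.data.frame(xtabs(~ net + species, data = fish),
                       responseName = "catch")
catch
```

Using `xtabs` rather than a plain row count means a net that caught none of a species still appears with a catch of zero.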

Data Class Conversion

June 6, 2013

Data in R can be converted from one class to another. The conversion function is named as. followed by the name of the target data class. The data classes in R include the following: numeric - as.numeric; vector - as.vector; character - as.cha...

Read more »
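
As an illustration of the conversion functions named in the entry above, the following sketch uses only base R. The example values are made up for illustration.

```r
# Character to numeric: values that cannot be parsed become NA (with a warning)
x_chr <- c("1", "2.5", "abc")
x_num <- suppressWarnings(as.numeric(x_chr))

# Numeric to character
y_chr <- as.character(c(1, 2.5))

# Matrix to plain vector: the dim attribute is dropped and the values
# are read column by column
m <- matrix(1:4, nrow = 2)
v <- as.vector(m)

class(x_num)  # "numeric"
class(y_chr)  # "character"
```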

How likely is the NSA PRISM program to catch a terrorist?

June 6, 2013

Recent revelations about PRISM, the NSA’s massive program of surveillance of civilian communications, have caused quite a stir. And rightfully so, as it appears that the agency has been granted warrantless direct access to just about any form of digital communication engaged in by American citizens, and that their access to such data has been

Read more »

Feature Selection 3 – Swarm Mentality

June 6, 2013

"Bees don't swarm in a mango grove for nothing. Where can you see a wisp of smoke without a fire?" - Hla Stavhana In the last two posts, genetic algorithms were used as feature wrappers to search for more effective subsets of predictors. Here, I will do the same with another type of search algorithm: particle swarm optimization....

Read more »

Intro to Parallel Random Number Generation with RevoScaleR

June 6, 2013

by Joseph Rickert. Random number generation is fundamental to computational statistics. As you might expect, R is very rich in random number resources. Base R provides several high-quality random number generators, including Wichmann-Hill, Marsaglia-Multicarry, Super-Duper, Mersenne-Twister, Knuth-TAOCP-2002 and L’Ecuyer-CMRG. (See Random for details.) And there are at least three packages, rsprng, rlecuyer, and rstream for...

Read more »
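
The entry above concerns RevoScaleR, but the idea of independent random number streams can be sketched with base R and the bundled parallel package alone. This is an illustration under that assumption, not the RevoScaleR API; the helper `draw_from` is a hypothetical name.

```r
library(parallel)  # ships with R; provides nextRNGStream()

old_kind <- RNGkind()              # remember the current generators

RNGkind("L'Ecuyer-CMRG")           # generator designed for multiple streams
set.seed(2013)
stream1 <- .Random.seed            # seed vector for the first stream
stream2 <- nextRNGStream(stream1)  # seed vector for a second, separate stream

# Hypothetical helper: install a stream's seed, then draw n uniforms from it
draw_from <- function(seed, n) {
  assign(".Random.seed", seed, envir = .GlobalEnv)
  runif(n)
}

x1       <- draw_from(stream1, 3)
x2       <- draw_from(stream2, 3)
x1_again <- draw_from(stream1, 3)  # same seed, so the same draws as x1

do.call(RNGkind, as.list(old_kind))  # restore the original generators
```

Giving each worker its own stream seed in this way keeps parallel draws reproducible without the streams overlapping.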

Box-plot with R – Tutorial

June 6, 2013

Uncertain Demand Forecasting and Inventory Optimizing for Short-life-cycle Products

June 6, 2013

For short-life-cycle products such as newspapers and fashion, it is important to match supply with demand. However, because demand is uncertain, we sometimes order too little from the supplier and sometimes too much. We would lose sales and leave customers unsatisfied if we ordered too little, or we would let the

Read more »

Inputting Data in Matrix Format

June 6, 2013

A matrix in R is formed using the matrix, rbind, or cbind function. These functions are described as follows: matrix - transforms concatenated data into a matrix of compatible dimensions. rbind - short for row bind, that binds a conca...

Read more »
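
The three functions named in the entry above can be compared side by side. This is a small base R sketch with made-up values.

```r
v <- 1:6

# matrix() reshapes a vector; it fills column by column unless byrow = TRUE
m_col <- matrix(v, nrow = 2, ncol = 3)
m_row <- matrix(v, nrow = 2, byrow = TRUE)

# rbind() stacks its arguments as rows, cbind() places them side by side as columns
by_rows <- rbind(1:3, 4:6)
by_cols <- cbind(1:2, 3:4, 5:6)

dim(m_col)  # 2 3
```

Here `by_rows` holds the same values as `m_row`, and `by_cols` the same values as `m_col`.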

At what sample size do correlations stabilize?

June 6, 2013

Maybe you have encountered this situation: you run a large-scale study over the internet, and out of curiosity, you frequently check the correlation between two variables. My experience with this practice is usually frustrating, as in small sample sizes (and we will see what “small” means in this context) correlations go up and down, change sign,

Read more »
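
The behaviour described in the entry above can be reproduced with a small simulation. This is a sketch under assumed values, not the author's analysis: the true correlation `rho`, the maximum sample size, and the step size are all chosen here for illustration.

```r
set.seed(1)

n_max <- 1000
rho   <- 0.3  # assumed true correlation

# Simulate two standard normal variables with population correlation rho
x <- rnorm(n_max)
y <- rho * x + sqrt(1 - rho^2) * rnorm(n_max)

# Recompute the sample correlation as more "participants" arrive
ns        <- seq(20, n_max, by = 10)
running_r <- sapply(ns, function(n) cor(x[1:n], y[1:n]))

# plot(ns, running_r, type = "l") shows large early swings that settle near rho
```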

Hillslope Position by Soil Series

June 5, 2013

Soil survey data are typically built upon a foundation of soil-landscape relationships that have been verified in the field. SSURGO data contain several geomorphic descriptions of landscape, landform, hillslope position, and surface shape for each...

Read more »

KDNuggets 2013 software poll results

June 5, 2013

The results of the 2013 KDNuggets software poll are in, with RapidMiner and R in a near-tie for first place. Of a record 1880 respondents, 737 reported using Rapid-I RapidMiner/RapidAnalytics, and 704 reported using R. Excel came in third: with 527 respondents, it was the lone commercial tool in the top 5. You can see the top 10 responses...

Read more »

Running R Scripts Directly From Dropbox

June 5, 2013

I have written a little function that allows users to run R scripts out of Dropbox directly from any location.  It was aided by this post on biobucket.  The reason I am particularly interested in this feature is because I am often using a ser...

Read more »

Oracle R Distribution for R 2.15.3 is released

June 5, 2013

We are pleased to announce that Oracle R Distribution (ORD) for R 2.15.3 is available for download today. This update consists of mostly minor bug fixes, and is the final release of the R 2.x series. Oracle recommends using yum to install ORD from our public yum server.  To install...

Read more »

The Frisch–Waugh–Lovell Theorem for Both OLS and 2SLS

June 5, 2013

The Frisch–Waugh–Lovell (FWL) theorem is of great practical importance for econometrics. FWL establishes that it is possible to re-specify a linear regression model in terms of orthogonal complements. In other words, it permits econometricians to partial out right-hand-side, or control, variables. This is useful in a variety of settings. For example, there may be cases

Read more »
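
The OLS case of the theorem in the entry above can be checked numerically. The sketch below uses simulated data with made-up coefficients and covers OLS only, not 2SLS: the coefficient on `x1` from the full regression should match the one obtained after partialling `x2` out of both `y` and `x1`.

```r
set.seed(1)

n  <- 200
x2 <- rnorm(n)                        # control variable
x1 <- 0.5 * x2 + rnorm(n)             # regressor of interest, correlated with x2
y  <- 1 + 2 * x1 - 1 * x2 + rnorm(n)  # made-up data-generating process

# Full regression
full <- lm(y ~ x1 + x2)

# Partial x2 (and the intercept) out of y and x1, then regress residual on residual
y_res   <- resid(lm(y ~ x2))
x1_res  <- resid(lm(x1 ~ x2))
partial <- lm(y_res ~ x1_res - 1)

b_full    <- unname(coef(full)["x1"])
b_partial <- unname(coef(partial)["x1_res"])
all.equal(b_full, b_partial)
```

The intercept is dropped in the last regression because both residual series already have mean zero.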

RcppArmadillo 0.3.900.0

Armadillo release 3.900.0 was provided by Conrad yesterday. It has been rolled into a new RcppArmadillo release 0.3.900.0 which is now on CRAN and in Debian. It has a number of nice changes, mostly on the performance side of things (see below) an...

Read more »

A Big Data introduction

June 5, 2013

Since R works in the computer's RAM, it may handle only rather small sets of data. Nevertheless, some packages make it possible to work with larger volumes, and the best solution is to connect R with a Big Data environment. This …

Read more »

Major League Baseball run scoring trends with R’s Lahman package

June 4, 2013

The statistical software R has an ever-expanding array of packages that provide pre-programmed functions and datasets. One such package is named Lahman, bundling the contents of the Lahman database into a quick-and-easy resource for R users. In addition to the data tables, the package resources also contain a variety of analyses and graphics undertaken using...

Read more »

A Graphical Approach to Showing the Result of Classification Models

June 4, 2013

This is one of my favorite charts: it makes it easy to see how many predictions are right, and where the wrong ones are as well. It is the equivalent of a confusion matrix, but sometimes a picture is worth a thousand words. Some sample code is included below.

Read more »

Veterinary Epidemiologic Research: Modelling Survival Data – Semi-Parametric Analyses

June 4, 2013

Next on modelling survival data from Veterinary Epidemiologic Research: semi-parametric analyses. With non-parametric analyses, we could only evaluate the effect of one or a small number of variables. To evaluate multiple explanatory variables, we analyze data with a proportional hazards model, the Cox regression. The functional form of the baseline hazard is not specified, which make

Read more »
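
A Cox regression of the kind described in the entry above can be fitted with the survival package, which is distributed with R as a recommended package. The sketch below uses that package's bundled `lung` data set rather than the veterinary data from the post, and the choice of covariates is an assumption made for illustration.

```r
library(survival)

# Proportional hazards model with two explanatory variables;
# the baseline hazard is left unspecified
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Hazard ratios are the exponentiated coefficients
hazard_ratios <- exp(coef(fit))
summary(fit)
```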

How old is the oldest person you know?

June 4, 2013

Last week, we had a discussion with some colleagues about the fact that – in order to prepare for the SOA exams – we did not have time (so far) to mention results on extreme values in our actuarial program. I did give an introduction in my nonlife actuarial models class, but it was only an introduction, in three...

Read more »

Webinar: Managing Data with R

June 4, 2013

Before you can analyze data, it must be in the right form. Join Revolution Analytics and me this June 21st for a 4-hour webinar that shows how to perform the most commonly used data management tasks in R. We will work through …

Read more »

Collecting geocoded tweets with R and Java

June 4, 2013

Number of tweets in different languages posted around Germany. There are many things one can do with tweets (sentiment analysis, maps, ...). This entry shows you how you can access the publicly available API using Java and how to analyse the data using R....

Read more »

IntR – Interactive GUI for performing geostatistical analysis in R

June 4, 2013

In 2011 I presented at the UseR conference, held in Warwick (UK), a piece of software for easing the learning curve of R for geostatistical analysis. It is a very simple attempt to create an interactive interface in R, by using Python as GUI. F...

Read more »

PluginR v0.80 released & 2 new trainings in July 2013

June 4, 2013

I'm pleased to announce that the new version 0.80 of PluginR has just been released (as usual, available as a Tiki Mod) with a few minor bugs fixed, and a couple of new interesting features added: PluginR now uses a caching mechanism extending the ...

Read more »

UseR 2013

June 4, 2013

Although the programme is quite interesting, I am not really involved in this conference (in fact, I'm not even going, even though, sometimes, I think it would be nice to spend the whole summer away at conferences!). But Vir has just put this pic...

Read more »

Value at Risk and Expected Shortfall, and other upcoming events

June 4, 2013

Highlighted: Value at Risk and Expected Shortfall. A two-day course exploring Value at Risk and Expected Shortfall, and their role in risk management. 2013 June 25 & 26, London. Led by Patrick Burns. Details at the CFP Events site. New Events: Thalesians, San Francisco, 2013 June 5. Jesse Davis on “Risk Model Imposed Manager-to-Manager …

Read more »

Interactive slides with googleVis on shiny

June 4, 2013

Following on from last week's post, here are my slides on using googleVis on shiny from the Advanced R workshop at Lancaster University, 21 May 2013. Again, I wrote my slides in RMarkdown and I used slidify to create the HTML5 presentation. Unfortunately...

Read more »
