Monthly Archives: June 2011

K-Means Clustering on Big Data

June 7, 2011
By
K-Means Clustering on Big Data

In this post Joseph Rickert demonstrates how to build a classification model on a large data set with the RevoScaleR package. A script file for use with Revolution R Enterprise to recreate the analysis below is at the end of the post, and can also be downloaded here -- ed. The k-means (Lloyd) algorithm, an intuitive way to explore...

Read more »

The pros and cons of robust data characterizations

The pros and cons of robust data characterizations

Over the years, I have looked at a lot of data contaminated with outliers, the subject of Chapter 7 of Exploring Data in Engineering, the Sciences, and Medicine.  That chapter adopts the definition of an outlier presented by Barnett and Lewis in their book Outliers in Statistical Data 2nd Edition

Read more »

Fittesmodel.com: A user-friendly way to conduct empirical research together

June 6, 2011
By

(A guest post by Camiel de Koning) ————– When trying to replicate, verify or extend empirical research of others, a researcher generally encounters many time-consuming barriers and there are often many prerequisites. Fittestmodel has the objective to overcome many of these problems, by presenting a webapplication that allows users to: use but not having to install R. quickly incorporate...

Read more »

R for Data Mining

June 6, 2011
By

Statistics and data mining often get bundled together, but (in my opinion), they're generally different practices with different goals. As a language designed for statistics, much of R's core functionality is focused on exploring and understanding data: model design, inference, and visualization. But when your goal is simply to get the best predictions from a big data set (without...

Read more »

In case you missed it: May Roundup

June 6, 2011
By

In case you missed them, here are some articles from May of particular interest to R users. A review of "R Cookbook", a new how-to book for R programmers. A detailed example of using the RevoScaleR package to analyze a large airline data set. A new guide for R beginners, "How to Learn R", provides links to R resources,...

Read more »

Shared Ecological Modelling References

June 6, 2011
By

05.06.2011 Today i started to create a list of books and articles about ecological modelling. In this list you will not only find general books about modelling but also books about spatial analysis, image analysis and other (in my opinion) important techniques useful in the context of ecological modelling. For the collection i use “Zotero”

Read more »

10 R One Liners to Impress Your Friends

June 5, 2011
By

Following the trend of one liners for various languages (Haskell, Scala, Python), here's some examples in RMultiply Each Item in a List by 2#listslapply(list(1:4),function(n){n*2})# otherwise(1:4)*2 Sum a List of Numbers#listslapply(list(1:4),sum)# oth...

Read more »

Conway’s Game of Life in R with ggplot2 and animation

June 5, 2011
By

In undergrad I had a computer science professor that piqued my interest in applied mathematics, beginning with Conway’s Game of Life. At first, the Game of Life (not the board game) appears to be quite simple — perhaps, too simple — but it has been widely explored and is useful for modeling systems over time. It has been...

Read more »

An application of aggregate() and merge()

June 5, 2011
By
An application of aggregate() and merge()

Today, I encountered an interesting problem while processing a data set of mine. My data have observations on businesses that are repeated over time. My data set also contains information on longitude and latitude of the business location, but unfort...

Read more »

Conway’s Game of Life in R with ggplot2 and animation

June 5, 2011
By
Conway’s Game of Life in R with ggplot2 and animation

In undergrad I had a computer science professor that piqued my interest in applied mathematics, beginning with Conway’s Game of Life. At first, the Game of Life (not the board game) appears to be quite simple — perhaps, too simple — but it has been widely explored and is useful for modeling systems over time.

Read more »

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)