Build a search engine in 20 minutes or less

March 27, 2013

…or your money back. author = "Ben Ogorek"; Twitter = "@baogorek"; email = paste0(sub("@", "", Twitter), "@gmail.com"). Setup. Pretend this is Big Data: doc1 <- "Stray cats are running all over the place. I see 10 a day!"; doc2 <- "Cats are killers. They...

TeXing R tables: Save yourself a lot of typing…

March 27, 2013

I want to share a function I wrote for my dissertation. The function is useful for putting up to two R tables into one TeX table. You have to load the package 'languageR' to have the dataset 'dative' available. Let's suppose you have two tables, one with...

Getting the data for the sjPlotting-functions into shape #rstats

March 27, 2013

I sometimes get questions on how to reproduce the samples that are posted in this blog. Currently, I’m referring to these posts: Plotting lm and glm models with ggplot; Easily plotting grouped bars with ggplot; Simplify frequency plots with ggplot …

A few of my favorite moments from LPSC 2013

March 26, 2013

(Note: this was initially posted on my other blog at Glacial Till, but there were some good bits of information that I wanted to share with the Paleoposse.) Last week I attended my first science conference: The Lunar and Planetary Science Conference in Houston, TX. If you followed me on Twitter, then (for better or for worse)

Got Data from People? Take Dan Ariely’s Coursera course.

March 26, 2013

A Beginner's Guide to Irrational Behavior started yesterday.  One might not immediately think that such a course would be relevant for statistical modeling.  Well, it is if your statistical modeling uses people as informants.  If the dat...

Big Analytics for R Users without Big Hassles: A Webinar About SciDB-R

March 26, 2013

You want your analytics to just work… with extremely large data sets as nimbly as small ones. You don’t want to have to think about parallelism, data formatting, and memory management. Paradigm4 presents a webinar about SciDB-R, an R package that lets you remain an R programmer, but expands R’s power with SciDB, the massively

SciDB-R, a package for R, is now available

March 26, 2013

SciDB-R, a package for R that lets R programmers perform massive-scale data-management and analytical tasks from inside R programs, is now available. You can download the package from GitHub here. It is also available on The Comprehensive R Archive Net...

What’s New in 6.2: Stepwise Regression for Big Data

March 26, 2013

By Thomas Dinsmore. This is the third in a series of posts highlighting new features in Revolution R Enterprise Release 6.2, which is scheduled for General Availability April 22. This week's post features our new Stepwise Regression capability. The Stepwise process starts with a specified model and then sequentially adds into or removes from the model the variable that...

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; …

ChainLadder 0.1.5-6 released on CRAN

March 26, 2013

Last week we released version 0.1.5-6 of the ChainLadder package on CRAN. The ChainLadder package provides statistical models that are typically used for the estimation of outstanding claims reserves in general insurance. The package vignette gives an overview of the package functionality. (Figure: output of plot(MackChainLadder(GenIns)).) Since the last CRAN release...

i Before e Except After c

March 26, 2013

When I went to school we were always taught the “i before e, except after c” rule for spelling. But how accurate is this rule? Kevin Marks tweeted today the following: »@uberfacts: There are 923 words in the English language that break the “I before E” rule. Only 44 words actually follow that rule.« Science (Kevin Marks, @kevinmarks)

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and "fights" about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; and, for data geeks, the SAS/SPSS/R/Matlab fight. The truth is, very few of us data geeks (data scientists, data analysts, statisticians, or...

A Contest of the Flyer Variety

March 25, 2013

Guess what?! Charlie decided to institute another round of the epic Flyer Contest! Here’s how it works in 5 easy steps… Print out one of the two flyers below (or this link: here). Post it somewhere public. Examples include: a message board in your department, at your local comic shop, on your mom’s fridge, or

Significant P-Values and Overlapping Confidence Intervals

March 25, 2013

There are all sorts of problems with p-values and confidence intervals, and I have neither the intention nor the time to cover all those problems right now. However, a big problem is that most people have no idea what p-values really mean. Here is one example of a common problem with p-values and how it relates

R – Defining Your Own Color schemes for HeatMaps

March 25, 2013

This post is intended for those who are beginners at R, and is inspired by a small post in Martin's bioblog. First, we plot a "correlation heatmap" using the same logic that Martin uses. In our example, let's use the Movies dataset that comes with ggplot...

Computing Maritime Routes in R

March 25, 2013

Thanks to the attention my paper on the cost of Somali piracy has received, a lot of people have approached me to ask how I computed the maritime routes. It is not a very difficult task using R. The key ingredient is a map of the world that can be rasterized into a grid; all

Massive online data stream mining with R

A few weeks ago, the stream package was released on CRAN. It allows you to do real-time analytics on data streams. This can be very useful if you are working with large datasets which are already hard to fit in RAM completely, let alone to build some statistical model on them without running into RAM problems. Most of...

Ordinal data with JAGS

March 25, 2013

Last week I had a look at the standard R routines for estimating models for ordinal data. This week, I want to have a look at JAGS for examining the same data. To be honest, most of it is taking an example (inhaler) and removing code. To my surpr...

Podcast #6: Data Analysis MOOC Post-mortem

March 25, 2013

Jeff and I talk about Jeff's recently completed MOOC on Data Analysis.

Revolution Newsletter: March 2013

March 25, 2013

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full March edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Get Results Fast with our Quick Start Programs: Need help getting value from predictive...

Model assessment (and predictions for RuPaul’s Drag Race Season 5, Episode 9)

March 25, 2013

Last week, Alaska took it home with her dangerous performance, while Ivy Winters was sent home after going up against Alyssa Edwards. This is sad on many fronts. First, I love me some Ivy Winters. Second, Jinkx had revealed that she had a crush on Ivy, and the relationship that may have flourished between the…

Simpler R help tooltips

March 25, 2013

I posted yesterday about R Help tooltips. I have started to use them, e.g. on the graph gallery. However, I’m quickly frustrated with having to write the full URL, i.e. if I want to add a link to the help …

April 18, 2013, Third Milano R net meeting: agenda

March 25, 2013

April 18, 2013, 18:00 - 21:00
Fiori Oscuri Bistrot & Bar (www.fiorioscuri.it), Via Fiori Oscuri, 3 - Milano (Zona Brera)

18.00 - 18.15 Registration
18.15 - 18.30 Welcome presentation, Andrea Spanò, Partner at Quantide
18.30 - 19.00 Digit recognition Machine …

Submit a talk for the first R in Insurance conference

March 25, 2013

The registration for the first R in Insurance is open and there is still time to submit a talk / lightning talk. The conference will take place at Cass Business School in London on Monday, 15 July 2013. This is the Monday following the useR! 2013 confer...

Does It Make Sense to Segment Using Individual Estimates from a Hierarchical Bayes Choice Model?

March 24, 2013

I raise this question because we see calls for running segmentation with individual estimates from hierarchical Bayes choice models without any mention of the possible complications that might accompany such an approach.  Actually, all the calls seem to be from those using MaxDiff to analyze the data from incomplete block designs.  For example, if one were to...

Writing a MS-Word document using R (with as little overhead as possible)

March 24, 2013

The problem: producing a Word (.docx) file of a statistical report created in R, with as little overhead as possible. The solution: combining R+knitr+rmarkdown+pander+pandoc (it is easier than it is spelled). If you get what this post is about, just …

Using R: reading tables that need a little cleaning

March 24, 2013

Sometimes one needs to read tables that are a bit messy, so that read.table doesn’t immediately recognize the content as numerical. Maybe some weird characters are sprinkled in the table (ever been given a table with significance stars in otherwise numerical columns?). Some search and replace is needed. You can do this by hand, and

R Help tooltips

March 24, 2013

I created a simple jQuery plugin to display some information when hovering over links to R documentation files hosted at help.r-enthusiasts.com. Below is a snapshot from highlight.r-enthusiasts.com that uses the tooltips. See also a live example here: data.frame. Using this feature …

Tupper’s self-referential formula

March 24, 2013

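I can't remember where I first came across this equation, but Tupper's self-referential formula is a very interesting formula that, when graphed in the two-dimensional plane, reproduces the formula itself. \[ \frac{1}{2} < \left\lfloor \operatorname{mod}\left( \left\lfloor \frac{y}{17} \right\rfloor 2^{-17 \lfloor x \rfloor - \operatorname{mod}(\lfloor y \rfloor, 17)}, 2 \right) \right\rfloor \] I first thought this would be...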
