Moving up in the ranks: from an R-Rookie to an R-Pro

March 28, 2013

I have been playing with R for a little over a year now. Not very intensively, but once in a while I start up RStudio and do some coding and analysis. But I am still far, far away from becoming an R-Pro. If you talk to or read some of the posts of the more seaso...

Swimming in a sea of code

March 28, 2013

If you are looking for code here, move on. In the beginning, there was only the relentless blinking of the cursor. With the maddening regularity of waves splashing on the shore: blink, blink, blink, blink… Beyond the cursor, the white wasteland …

Playing with earthquake data

March 28, 2013

(This article was first published on Digithead's Lab Notebook, and kindly contributed to R-bloggers)

Creating a Business Dashboard in R

March 28, 2013

Business dashboards come in many shapes and sizes. They give an overview of the key performance indicators (KPIs) that matter for the business strategy and/or operations. There are many flavours of dashboard frameworks and apps available, ranging from products costing thousands of dollars to open-source implementations. Apparently…

Data visualization with R and ggplot2

March 28, 2013

I’m working on a one-hour ggplot2 lecture for the San Diego R users group, which I will post here when I’m done. I think there are many great intro to R data visualization resources out there so I’ll only share working examples on my blog. A retail chain client employs a few hundred field agents who perform

Generalized Pairs Plot: It’s about time!

March 28, 2013

JW Emerson, WA Green, B Schloerke, J Crowley, D Cook, H Hofmann, H Wickham (2013) The Generalized Pairs Plot. Journal of Computational and Graphical Statistics 22(1). Here's a free preprint version. Until this new paper and implementation by Emerson et al., there were no widely available pairs plots that accommodated both numerical and categorical fields.

Benford law and lognormal distributions

March 28, 2013

Benford’s law is nowadays extremely popular (see e.g. http://en.wikipedia.org/…). It is usually claimed that, for a given data set, changing units does not affect the distribution of the first digit. Thus, it should be related to scale-invariant distributions. Heuristically, scale (or unit) invariance means that the density of the measure (or probability function) should be proportional to...
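The claim about changing units can be checked numerically. The sketch below is not taken from the original post: the lognormal parameters (`meanlog = 0`, `sdlog = 3`), the rescaling factor 2.54 and the helper names `first_digit` and `digit_freq` are assumptions chosen only for illustration. It tabulates leading digits before and after a change of units and sets both next to Benford's probabilities log10(1 + 1/d).

```r
# Minimal sketch (illustrative assumptions, not the original post's code).
set.seed(1)

# Leading significant digit of positive numbers.
first_digit <- function(x) {
  floor(x / 10^floor(log10(x)))
}

# Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9.
benford <- log10(1 + 1 / (1:9))

# Hypothetical sample: a lognormal with a wide spread.
x <- rlnorm(1e5, meanlog = 0, sdlog = 3)

# Relative frequency of each leading digit 1..9.
digit_freq <- function(v) {
  as.numeric(table(factor(first_digit(v), levels = 1:9))) / length(v)
}

freq_original <- digit_freq(x)
freq_rescaled <- digit_freq(x * 2.54)  # e.g. a change of units

round(rbind(benford, freq_original, freq_rescaled), 3)
```

With a narrower lognormal (a smaller `sdlog`) the agreement with Benford's probabilities would be expected to weaken, which is presumably why the spread of the distribution matters here.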

Lots of data != "Big Data"

March 28, 2013

by Joseph Rickert

When talking with data scientists and analysts who are working with large-scale data analytics platforms such as Hadoop about the best way to do some sophisticated modeling task, it is not uncommon for someone to say, "We have all of the data. Why not just use it all?" This sort of comment often...

Rencontres R, Lyon 27-28 June

March 28, 2013

Last year, the first French-speaking R conference, “Rencontres R”, was held in Bordeaux. The meeting was a great success, and a second one will be held in Lyon on 27 and 28 June 2013. The abstract submission deadline of 7 …

RForcecom – An R package provides the connection between R and Salesforce.com

March 28, 2013

In this post, I’ll introduce the R package RForcecom and its usage. As you may know, the R statistical computing environment is the most popular statistical computing software, and Salesforce.com is the world’s most innovative cloud-computing based SaaS (Software-as-a-Service) CRM package. …

Mixed model R2 (UPDATED)

March 28, 2013

R2 is a useful tool for determining how strong the relationship between two variables is. Unfortunately, the definition of R2 for mixed effects models is difficult – do you include the random variable or just the fixed effects? Including just the fixed effects is essentially a standard linear model, while including the random effects could
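The difference between the two choices can be made concrete with a small sketch. It is not the original post's method: it assumes the `nlme` package (a recommended package that ships with standard R installations) and its `Orthodont` data, and it uses the squared correlation between observed and fitted values as a simple stand-in for R2, which is only one of several possible definitions.

```r
# Minimal sketch; the R2 definition used here (squared correlation between
# observed and fitted values) is an assumption, not the original post's method.
library(nlme)  # recommended package shipped with standard R installations

# Random-intercept model on the Orthodont data that comes with nlme.
fit <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# level = 0: fitted values from the fixed effects only.
# level = 1: fitted values that also include the per-subject random intercepts.
fitted_fixed <- fitted(fit, level = 0)
fitted_full  <- fitted(fit, level = 1)

r2_fixed <- cor(Orthodont$distance, fitted_fixed)^2
r2_full  <- cor(Orthodont$distance, fitted_full)^2

c(fixed_only = r2_fixed, with_random = r2_full)
```

The two numbers differ because the second set of fitted values is allowed to follow each subject's own level, which is exactly the ambiguity the question above points at.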

“Building ractives is so addictive it should be illegal!”

March 27, 2013

clickme is an amazing R package. I was not sure what to expect when I first saw Nacho Caballero's announcement. I actually was both skeptical and intimidated, but neither reaction was justified. The examples prove its power, and his wiki tutorials ease...

Moving to R 3.0.0 on Ubuntu

March 27, 2013

As you may (or may not) be aware, R 3.0.0 is scheduled to be released on April 3rd. Since this is a major release and there may be some growing pains (but I hope not) in the move to 3.0.0, here is some information about how I will handle R 3.0.0 on CR...

Rationality, and MS Excel (and other calculators)

March 27, 2013

This morning, Mathieu had a nice experience in his course on computational method in actuarial science. But let us start with some mathematical formal definitions. First, recall that is – somehow – a standard expression. No one should be surprised to see such an expression. Generally (as explained in http://en.wikipedia.org/… ), this function is defined only when . The...

What does a data scientist do?

March 27, 2013

The presentation below by Carlos Somohano (founder of Data Science London) provides the best description of a Data Scientist that I've seen in some time. Highlights include:

On Slide 14, a history of Data Science
On Slide 22, the essential skills of data scientists (and a platypus)
On Slide 26, 10 things data scientists do
On Slide 27,...

Build a search engine in 20 minutes or less

March 27, 2013

…or your money back.

author = "Ben Ogorek"
Twitter = "@baogorek"
email = paste0(sub("@", "", Twitter), "@gmail.com")

Setup

Pretend this is Big Data:

doc1 <- "Stray cats are running all over the place. I see 10 a day!"
doc2 <- "Cats are killers. They...

TeXing R tables: Save yourself a lot of typing…

March 27, 2013

I want to share a function I wrote for my dissertation. The function is useful for putting up to two R tables into one TeX table. You have to load the package 'languageR' to have the dataset 'dative' available. Let's suppose you have two tables, one with...

Getting the data for the sjPlotting-functions into shape #rstats

March 27, 2013

I sometimes get questions on how to reproduce the samples that are posted in this blog. Currently, I’m referring to these posts:

Plotting lm and glm models with ggplot
Easily plotting grouped bars with ggplot
Simplify frequency plots with ggplot
…

A few of my favorite moments from LPSC 2013

March 26, 2013

(Note: this was initially posted on my other blog at Glacial Till, but there were some good bits of information that I wanted to share with the Paleoposse.) Last week I attended my first science conference: The Lunar and Planetary Science Conference in Houston, TX. If you followed me on Twitter, then (for better or for worse)

Got Data from People? Take Dan Ariely’s Coursera course.

March 26, 2013

A Beginner's Guide to Irrational Behavior started yesterday.  One might not immediately think that such a course would be relevant for statistical modeling.  Well, it is if your statistical modeling uses people as informants.  If the dat...

Big Analytics for R Users without Big Hassles: A Webinar About SciDB-R

March 26, 2013

You want your analytics to just work… with extremely large data sets as nimbly as small ones. You don’t want to have to think about parallelism, data formatting, and memory management. Paradigm4 presents a webinar about SciDB-R, an R package that lets you remain an R programmer, but expands R’s power with SciDB, the massively

SciDB-R, a package for R, is now available

March 26, 2013

SciDB-R, a package for R that lets R programmers perform massive-scale data-management and analytical tasks from inside R programs, is now available. You can download the package from GitHub here. It is also available on The Comprehensive R Archive Net...

What’s New in 6.2: Stepwise Regression for Big Data

March 26, 2013

by Thomas Dinsmore

This is the third in a series of posts highlighting new features in Revolution R Enterprise Release 6.2, which is scheduled for General Availability April 22. This week's post features our new Stepwise Regression capability. The Stepwise process starts with a specified model and then sequentially adds into or removes from the model the variable that...
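The add-or-remove procedure described above can be sketched on a small scale with base R's `step()`. This is a stand-in and not the Revolution R Enterprise capability, whose interface the excerpt does not show; the `mtcars` data and the candidate variables are illustrative assumptions, and `step()` chooses each move by AIC.

```r
# Minimal sketch using base R's step(), not Revolution R Enterprise's own
# stepwise function; the data set and candidate variables are illustrative.
start_model <- lm(mpg ~ 1, data = mtcars)   # the specified starting model

selected <- step(
  start_model,
  scope = list(lower = ~ 1, upper = ~ wt + hp + qsec + drat),
  direction = "both",   # a variable may be added or removed at each step
  trace = 0             # suppress the step-by-step log
)

formula(selected)
```

Setting `trace = 1` instead would print each candidate move, which is a handy way to watch the sequential adding and removing happen.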

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and “fights” about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; …

ChainLadder 0.1.5-6 released on CRAN

March 26, 2013

Last week we released version 0.1.5-6 of the ChainLadder package on CRAN. The ChainLadder package provides statistical models, which are typically used for the estimation of outstanding claims reserves in general insurance. The package vignette gives an overview of the package functionality.

[Figure: output of plot(MackChainLadder(GenIns))]

Since the last CRAN release...

i Before e Except After c

March 26, 2013

When I went to school we were always taught the “i before e, except after c” rule for spelling. But how accurate is this rule? Kevin Marks (@kevinmarks) tweeted the following today, adding the comment “Science”: »@uberfacts: There are 923 words in the English language that break the “I before E” rule. Only 44 words actually follow that rule.«
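One way to put the rule to the test is to classify words with regular expressions. The sketch below is an assumption about how such a check could look, not the original post's code, and `words` is a tiny hand-picked sample rather than a dictionary, so its counts say nothing about the figures in the tweet.

```r
# Minimal sketch; `words` is a tiny hand-picked sample, not a real dictionary.
words <- c("believe", "field", "receive", "ceiling", "science",
           "weird", "their", "species", "friend", "seize")

# Follows the rule: "ie" not preceded by "c", or "ei" directly after "c".
follows <- grepl("(^|[^c])ie", words) | grepl("cei", words)

# Breaks the rule: "ie" directly after "c", or "ei" not preceded by "c".
breaks <- grepl("cie", words) | grepl("(^|[^c])ei", words)

data.frame(word = words, follows = follows, breaks = breaks)
```

Swapping `words` for a full word list would turn this into an actual count; a word containing both patterns would then show up in both columns.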

Python vs R vs SPSS … Can’t All Programmers Just Get Along?

March 26, 2013

Programmers have long been very proud of and loyal to their tools, and often very vocal. This has led to well-contested rivalries and "fights" about which tool is better: emacs or vi; Java or C++; Perl or Python; Django or Rails; and, for data geeks, the SAS/SPSS/R/Matlab fight. The truth is, very few of us data geeks (data scientists, data analysts, statisticians, or...

A Contest of the Flyer Variety

March 25, 2013

Guess what?! Charlie decided to institute another round of the epic Flyer Contest! Here’s how it works in 5 easy steps…

1. Print out one of the two flyers below (or this link: here)
2. Post it somewhere public. Examples include: a message board in your department, at your local comic shop, on your mom’s fridge, or

Significant P-Values and Overlapping Confidence Intervals

March 25, 2013

There are all sorts of problems with p-values and confidence intervals and I have no intention (or the time) to cover all those problems right now.  However, a big problem is that most people have no idea what p-values really mean. Here is one example of a common problem with p-values and how it relates
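Going by the post's title, one such problem is that two confidence intervals can overlap even though the difference between the groups is statistically significant. The sketch below illustrates this with made-up summary statistics and a normal approximation; the means, standard errors and variable names are assumptions, not values from the original post.

```r
# Minimal sketch with made-up summary statistics and a normal approximation.
mean_a <- 10.0; se_a <- 0.6   # hypothetical group A: mean and standard error
mean_b <- 12.0; se_b <- 0.6   # hypothetical group B

z <- qnorm(0.975)             # about 1.96 for 95% intervals

ci_a <- mean_a + c(-1, 1) * z * se_a
ci_b <- mean_b + c(-1, 1) * z * se_b

# The intervals overlap if A's upper limit exceeds B's lower limit.
intervals_overlap <- ci_a[2] > ci_b[1]

# Two-sided z-test for the difference of the two means.
se_diff <- sqrt(se_a^2 + se_b^2)
p_value <- 2 * pnorm(-abs(mean_b - mean_a) / se_diff)

list(ci_a = ci_a, ci_b = ci_b, overlap = intervals_overlap, p_value = p_value)
```

The reason is that the overlap check effectively compares the difference with `z * (se_a + se_b)`, while the test compares it with `z * sqrt(se_a^2 + se_b^2)`, which is always the smaller of the two.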
