Advanced Graphics I

June 22, 2013

polygon() is such a handy function in R for drawing charts in which we shade selected regions (polygons) of the plot. It’s quite useful for indicating confidence regions of parameters, predictions for time series, or areas under distributions.
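As a rough illustration of the "areas under distributions" use case, here is a minimal sketch in base R. The choice of a standard normal density, the bounds -1 and 1, and all variable names are assumptions made for this example, not taken from the original post.

```r
# Draw a standard normal density curve.
x <- seq(-4, 4, length.out = 400)
plot(x, dnorm(x), type = "l", xlab = "x", ylab = "Density")

# Build the polygon: follow the curve between the bounds,
# then close the shape along the x-axis (y = 0).
lower <- -1
upper <- 1
xs <- seq(lower, upper, length.out = 200)
poly_x <- c(lower, xs, upper)
poly_y <- c(0, dnorm(xs), 0)

# Shade the region; border = NA suppresses the outline.
polygon(poly_x, poly_y, col = "grey80", border = NA)

# Probability mass represented by the shaded region.
shaded_prob <- pnorm(upper) - pnorm(lower)
```

The key point is that polygon() only needs the x and y coordinates of the vertices in order, so any region you can trace can be shaded.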

Calling C++ from R using Rcpp

June 22, 2013

Why call C/C++ from R? I really like programming in R. The fact that it is open source immediately wins my favour over Matlab. It can, however, be quite slow, especially if you “speak” R with a strong C/C++ accent. This sluggishness, especially when writing unavoidable for loops, has led me to consider other programming...

What is “Practical Data Science with R”?

June 22, 2013

A bit about our upcoming book “Practical Data Science with R”. Nina and I share our current draft of the book's front matter, a description that will help you decide if this is the book for you (we hope that it is). Or this could be the book that helps explain...

Got Bootstrap?

June 22, 2013

This week I read the book by Michael Chernick and Robert LaBudde, An Introduction to Bootstrap Methods with Applications to R. It’s an interesting oeuvre for useRs of all stripes. I strongly recommend checking it out. The book brings lots of examples of bootstrapping applications, such as standard errors, confidence intervals, hypothesis testing, and even bootstrap...
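To give a flavour of two of the applications mentioned (standard errors and confidence intervals), here is a minimal base-R sketch of a nonparametric bootstrap for the sample median. The simulated data, the number of resamples, and the variable names are assumptions for illustration; this is not an example from the book.

```r
set.seed(42)

# Simulated sample standing in for real data (an assumption).
x <- rexp(50, rate = 1)

# Resample the data with replacement and recompute the median each time.
n_boot <- 2000
boot_medians <- replicate(n_boot, median(sample(x, replace = TRUE)))

# Bootstrap standard error: spread of the resampled medians.
boot_se <- sd(boot_medians)

# Simple percentile interval (95%).
boot_ci <- quantile(boot_medians, c(0.025, 0.975))
```

The percentile interval is the simplest variant; other bootstrap intervals exist and may behave better in small samples.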

Optimization

June 22, 2013

Many problems in statistics or machine learning are of the form "find the values of the parameters that minimize some measure of error". But in some cases, constraints are also imposed on the parameters: for instance, that they should sum up to 1, or that at most 10 of them should be non-zero -- this adds a combinatorial layer to the...

Five years of Weight Tracking

June 22, 2013

After I moved back from New Jersey in June 2008 I started to track my body weight more seriously. My routine usually consists of getting up and, after finishing in the bathroom, stepping on my scale. That way I try to ensure that the conditions for each weighing are as similar as possible...

Everything in Its Right Place: Visualization and Content Analysis of Radiohead Lyrics

June 22, 2013

Introduction: I am not a huge Radiohead fan. To be honest, the Radiohead I know and love and remember is that which was a rock band without a lot of 'experimental' tracks - a band you discovered on Big Shiny Tunes 2, or because your friends told you about...

Announcing pqR: A faster version of R

June 22, 2013

pqR — a “pretty quick” version of R — is now available to be downloaded, built, and installed on Linux/Unix systems. This version of R is based on R-2.15.0, but with many performance improvements, as well as some bug fixes and new features. Notable improvements in pqR include: Multiple processor cores can automatically be used to perform some numerical

Are Green Number Runners More Likely to Bail?

June 22, 2013

Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify

Not only CRAN downloads and Shiny … but also .. rCharts

June 21, 2013

I have been meaning for some time to get stuck into the rCharts package, which provides an interface to many JavaScript graphics libraries. These offer rich charting capabilities with interactivity and a great deal of customization. As regular readers will know, I am also interested in improved publicity for CRAN packages, although the Shiny app

R language skills: standard and necessary in today’s world

June 21, 2013

A recent Business Times article on Singapore's push to become a tech leader mentions Revolution Analytics' new Center of Excellence, set up with the support of the Singapore government to train and grow a pool of data scientists and developers in data science. It includes this quote from SAP: "This will ensure we are equipping our workforce with the...

Statistical models are stories about how the data came to be

June 21, 2013

And in much of Statistics, the way of telling such stories is through maximum likelihood: given a multitude of possible stories (models), which story is most consistent with the data we actually saw? Dave Harris originates the lovely aphorism above in...
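One way to make the "which story is most consistent with the data" idea concrete is the sketch below. It assumes, purely for illustration, that the data are simulated Poisson counts and that each candidate story is a Poisson model with a different rate; none of this comes from the original post.

```r
set.seed(1)

# Simulated counts standing in for observed data (an assumption).
y <- rpois(200, lambda = 3)

# Log-likelihood of a "story": a Poisson model with rate lambda.
loglik <- function(lambda) sum(dpois(y, lambda, log = TRUE))

# Compare two candidate stories directly.
ll_story_a <- loglik(3)
ll_story_b <- loglik(5)

# Search over all rates for the most consistent story (the MLE).
fit <- optimize(function(l) -loglik(l), interval = c(0.01, 20), tol = 1e-8)
lambda_hat <- fit$minimum
```

For the Poisson model the maximum likelihood estimate has a closed form, the sample mean, so `lambda_hat` should agree with `mean(y)` up to numerical tolerance.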

Job openings at conservative political analytics firm!

June 21, 2013

After posting that announcement about Civis Analytics, I wrote, “If a reconstituted Romney Analytics team is hiring, let me know and I’ll post that ad too.” Adam Schaeffer obliged: Not sure about Romney’s team, but Evolving Strategies is looking for sharp folks who lean right: Evolving Strategies is a political communications research firm specializing in...

Disposable Visual Data Explorers with Shiny – Guardian University Tables 2014

June 21, 2013

Have data – now what? Building your own interactive data explorer need not be a chore with the R shiny library… Here’s a quick walkthrough… In Datagrabbing Commonly Formatted Sheets from a Google Spreadsheet – Guardian 2014 University Guide Data, I showed how to grab some data from several dozen commonly formatted sheets in a

ggplot Tutorial

June 21, 2013

I liked the following ggplot2 tutorial, which is featured on Gabriela de Queiroz’s blog, unbiasedestimator. The tutorial is very neatly presented, and I’m sure that it will be very helpful to anyone just getting started with ggplot2 before they jump into ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham or R Graphics Cookbook by...

Put some cushions on the sofa

June 21, 2013

I posted earlier this week about sofa (here), introducing a package I started recently that interacts with CouchDB from R. There's been a fair amount of response at least in terms of page views, so I'll take that as a sign to keep going. One thing that would be nice while you are CouchDB-ing is to interact with local...

The PISA2009lite package is released

June 20, 2013

This post introduces a new R package named PISA2009lite. I will show how to install this package, what is inside, and how to use it. Introduction: PISA (Programme for International Student Assessment) is a worldwide study focused on measuring the performance of 15-year-old school pupils. More precisely, scholastic performance in mathematics, science and reading is measured

Measuring Associations

June 20, 2013

In Chapter 18, we discuss a relatively new method for measuring predictor importance called the maximal information coefficient (MIC). The original paper is by Reshef et al. (2011). Initial reactions to the MIC are summarized by Speed and Tibshirani (others can be found here). My (minor) beef with it is the lack...

Upcoming Rcpp talk in Sydney

June 20, 2013

The Sydney Users of R Forum (SURF) will be hosting me for a talk on July 10. The focus will be Rcpp for R and C++ integration, and the intent is to have this be really applied with lots of motivating examples. Organizers Louise and Eugene were able...

Quickly read Excel (xlsx) worksheets into R on any platform

June 20, 2013

I wrote a couple of days ago about importing Excel files into R. There are lots of ways to do this, but all the ways that use only R have drawbacks (as I outlined in my last post), and all the other ways require installation of programs other than R. I’m not opposed to using programs

Huge interest in next LondonR user group meeting

June 20, 2013

The next LondonR meeting takes place on 16 July and registrations have already exceeded 200. Presentations at the meeting will be made by Rich Pugh of Mango Solutions, Andrie de Vries of Revolution Analytics and Hadley Wickham of RStudio. All places for a pre-meeting workshop with Hadley Wickham were snapped up within 2 days of announcing the details. More information...

How American Century revolutionized their investment platform with R

June 20, 2013

American Century Investments is a top-20 mutual fund company with more than 125 billion dollars of assets under management. The quantitative investment group manages 22 funds, and takes an objective, systematic and disciplined approach to determine which stocks to buy and sell. Real-time data and carefully calibrated statistical models are the foundation of this quantitative approach. This group formerly...

Datagrabbing Commonly Formatted Sheets from a Google Spreadsheet – Guardian 2014 University Guide Data

June 20, 2013

So it seems like it’s that time of year when the Guardian publish their university rankings data (Datablog: University guide 2014), which means another opportunity to have a tinker and see what I’ve learned since last year… (Last year’s hack was Filtering Guardian University Data Every Which Way You Can…, where I had a

Bayesian Modeling of Anscombe’s Quartet

June 20, 2013

Anscombe’s quartet is a collection of four datasets that look radically different yet result in the same regression line when using ordinary least squares regression. The graph below shows Anscombe’s quartet with superimposed regression lines (taken from the Wikipedia article). While least squares regression is a good choice for dataset 1 (upper left plot) it...
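The "same regression line" claim can be checked with the `anscombe` data frame that ships with R's datasets package. The sketch below only covers the ordinary least squares fits, not the Bayesian models the post goes on to discuss; the helper variable names are assumptions for this example.

```r
# Fit y_i ~ x_i by ordinary least squares for each of the four datasets.
fits <- lapply(1:4, function(i) {
  f <- reformulate(paste0("x", i), response = paste0("y", i))
  lm(f, data = anscombe)
})

# Collect intercepts and slopes into one matrix, one row per dataset.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
rownames(coefs) <- paste("dataset", 1:4)

round(coefs, 2)
```

All four rows come out at roughly an intercept of 3 and a slope of 0.5, which is why plotting the data matters before trusting a fitted line.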

Data Science Labs: Predictive Models to Improve Vaccine Quality and Production

June 20, 2013

The age of "blockbuster drugs" is coming to an end, as personalized medicine becomes a reality. Data science will be a major driver of innovation in these and other areas of the pharmaceutical industry. This was demonstrated during a project the Data Science Labs team executed with a major pharmaceutical company.

Installing the RGoogleAnalytics package

June 20, 2013

In this blog post, I will walk you through the steps from downloading to installing the RGoogleAnalytics package on your machine. The RGoogleAnalytics package currently resides at https://code.google.com/p/r-google-analytics/ and this page lists the latest developments around the package. The zip and tarball archives for the package can be obtained from the Downloads Section. Once you download the

Update to curves2d()

June 20, 2013

(This article was first published on geomorph, and kindly contributed to R-bloggers) Dear morphometricians, Below you will find an update to our function for digitizing curves in 2d: curves2d(). This solves a problem with the function plotting landmarks and semilandmarks out of sequence. To use it, you can "source()" the code from a directory, or copy and paste it...

Using the Windows Clipboard, or Passing Data Quickly From Excel to R and Back Again

June 19, 2013

Two of my favorite functions are copy.table() and paste.table(). I’m going to turn this story on its head and give you the ending first. The first allows you to copy a data frame to the clipboard in a format that...
