Comparing the speed of pqR with R-2.15.0 and R-3.0.1

June 24, 2013

As part of developing pqR, I wrote a suite of speed tests for R. Some of these tests were used to show how pqR speeds up simple real programs in my post announcing pqR, and to show the speed-up obtained with helper threads in pqR on systems with multiple processor cores. However, most tests in

Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

Introduction: Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R. (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.) To give you ...
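For orientation, the empirical CDF of a sample x_1, ..., x_n is usually defined as the fraction of observations at or below a given value. The formula below is the standard textbook definition, not a quotation from the post, and the symbols are chosen here for illustration:

```latex
\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i \le x\}
```

Here \mathbf{1}\{\cdot\} is the indicator function, so \hat{F}_n is a step function that rises by 1/n at each observed value (or by k/n where k observations are tied).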

Merging Data — SAS, R, and Python

June 24, 2013

On analyticbridge, someone asked about moving an inner join out of Excel (where it was taking many minutes via VLOOKUP()) to some other package. The question asked what kind of performance can be expected in other systems. Of the list ...
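As a rough illustration of the kind of inner join being discussed, here is a minimal Python sketch using only the standard library. It is not code from the post: the function name `inner_join`, the sample tables, and the column names are all hypothetical. It indexes one table by key once instead of scanning it for every lookup, which is a common reason a join written this way can be quicker than repeated lookups.

```python
from collections import defaultdict


def inner_join(left, right, key):
    """Hash-based inner join of two lists of dict rows on a shared key column.

    Hypothetical helper for illustration only.
    """
    # Index the right-hand table once so each lookup is O(1) on average.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for row in left:
        # Rows whose key has no match on the right are dropped (inner join).
        for match in index.get(row[key], []):
            # On a column-name clash, the left-hand value wins.
            joined.append({**match, **row})
    return joined


# Made-up sample data, purely for demonstration.
orders = [
    {"order_id": 1, "customer_id": "a"},
    {"order_id": 2, "customer_id": "b"},
    {"order_id": 3, "customer_id": "z"},  # no matching customer
]
customers = [
    {"customer_id": "a", "name": "Ann"},
    {"customer_id": "b", "name": "Bob"},
    {"customer_id": "c", "name": "Cy"},  # no matching order
]

result = inner_join(orders, customers, "customer_id")
for row in result:
    print(row)
```

Only the two orders with a matching customer survive the join. Libraries such as pandas provide the same operation as a built-in merge; the hand-rolled version is shown only to make the mechanics visible.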

Rcpp 0.10.4

June 24, 2013

A new version of Rcpp is now on the CRAN network for GNU R; binaries for Debian have been uploaded as well. This release brings a fairly large number of fixes and improvements across a number of Rcpp features; see below for the detailed list. We a...

A beer recommendation system made with R

June 24, 2013

If you know a beer you like and want some recommendations for a style of beer to try, check out the yhat Beer Recommender: This neat little app is the product of a recommendation system built using the R language by the folks behind the yhat blog. It's based on about 1.5 million beer reviews from the Beer Advocate....

My Stat Bytes talk, with slides and code

June 24, 2013

On Thursday of last week I gave a short informal talk to Stat Bytes, the CMU Statistics department's twice-a-month computing seminar. Title: "Quick tricks for faster R code: Profiling to Parallelism". Abstract: I will present a grab bag of tricks to speed up your R code. Topics will include: installing an optimized BLAS, how ...

Opel Corsa Diesel Usage

June 24, 2013

I wanted to extend my car weight distribution calculation of June 16 from the year 2000 alone to the years 2000 to 2013. Unfortunately, come Sunday afternoon the code seemed too slow and I did not have even the beginning of a post. So, I went on to another calculation I w...

Streamline Your Mechanical Turk Workflow with MTurkR

June 24, 2013

I’ve been using Thomas Leeper’s MTurkR package to administer my most recent Mechanical Turk study, an extension of work on representative-constituent communication claiming credit for pork benefits, with Justin Grimmer and Sean Westwood. MTurkR is excellent, making it quick and easy to: test ...

Creating a BI Dashboard: Part 1

June 23, 2013

Introduction: The first few posts of this blog will demonstrate how to build each report hosted by the business intelligence (BI) application dashboard shown below (see Fig. 1). This application uses the following tools and technologies: R – a free software environment for statistical computing and graphics, ASP.NET MVC4 – a free framework for building ...

Revisualizing the best cities in the US in 2012- Shiny + googleVis = Incredibly powerful

June 23, 2013

This is the last time I will talk about visualizing the best cities of 2012 based on Bloomberg Businessweek's rankings. In an earlier post on this topic, interactive applications to plot bar graphs and histograms for different characteristics...

Principal Components Analysis Shiny App

June 23, 2013

I’ve recently started experimenting with making Shiny apps, and today I wanted to make a basic app for calculating and visualizing principal components analysis (PCA). Here is the basic interface I came up with. Test drive the app for yourself using the code below, or check out the R code here. Above is an example of the ...

Parallel computation with helper threads in pqR

June 23, 2013

One innovative feature of pqR (my new, faster version of R) is that it can perform some numeric computations in “helper” threads, in parallel with other such numeric computations, and with interpretive operations performed in the “master” thread. This can potentially speed up your computations by a factor as large as the number of processor cores ...

Time Is on My Side – A Small Example for Text Analytics on a Stream

June 23, 2013

Introduction and Background: While my last posting was about recommendation in the context of Location Based Social Networks, there are also other interesting topics regarding the analysis of unstructured data. The most established one is probably Text Analytics/Mining, which focuses on all sorts of text data. For me, coming from spatial analysis, this topic is relatively new, but I couldn’t help noticing...

Got Bootstrap?

June 23, 2013

This week I read the book by Michael Chernick and Robert LaBudde, An Introduction to Bootstrap Methods with Applications to R. It’s an interesting oeuvre for useRs of all stripes. I strongly recommend checking it out. The book presents lots of examples of bootstrapping applications, such as standard errors, confidence intervals, hypothesis testing, and even ...

GRNN and PNN

June 23, 2013

From a technical perspective, people usually choose GRNN (general regression neural network) for function approximation with a continuous response variable and PNN (probabilistic neural network) for pattern recognition / classification problems with categorical outcomes. However, from a practical standpoint, it is often not necessary to draw a fine line between GRNN ...

Generating Alerts From Guardian University Tables Data

June 23, 2013

One of the things I’ve been pondering with respect to the whole data journalism process is how journalists without a lot of statistical training can quickly get a feel for whether there may be interesting story leads in a dataset, or how they might be able to fashion “alerts” that bring attention to data elements

FuzzyNumbers-0.3-1 released

June 23, 2013

A new version of the FuzzyNumbers package for R has just been submitted to the CRAN archive. Check out our step-by-step tutorial. FuzzyNumbers package changelog, 0.3-1 (2013-06-23): piecewiseLinearApproximation() - general case (any knot.n) for method="NearestEuclidean" now...

Prototyping A General Regression Neural Network with SAS

June 22, 2013

The last time I read the paper “A General Regression Neural Network” by Donald Specht was exactly 10 years ago, when I was in graduate school. After reading it again this week, I decided to code it up with SAS macros and make this excellent idea available to the SAS community. The prototype of ...

Advanced Graphics I

June 22, 2013

polygon() is such a handy function in R for drawing beautiful charts in which we highlight selected regions (polygons) of the plot. It’s quite useful for indicating confidence regions of parameters, predictions for time series, or areas under distributions.

Calling C++ from R using Rcpp

June 22, 2013

Why call C/C++ from R? I really like programming in R. The fact that it is open source immediately wins my favour over Matlab. It can, however, be quite slow, especially if you “speak” R with a strong C/C++ accent. This sluggishness, especially when writing unavoidable for loops, has led me to consider other programming ...

What is “Practical Data Science with R”?

June 22, 2013

A bit about our upcoming book “Practical Data Science with R”. Nina and I share our current draft of the front matter from the book, a description that will help you decide if this is the book for you (we hope that it is). Or this could be the book that helps explain ...

Got Bootstrap?

June 22, 2013

This week I read the book by Michael Chernick and Robert LaBudde, An Introduction to Bootstrap Methods with Applications to R. It’s an interesting oeuvre for useRs of all stripes. I strongly recommend checking it out. The book presents lots of examples of bootstrapping applications, such as standard errors, confidence intervals, hypothesis testing, and even bootstrap...

Optimization

June 22, 2013

Many problems in statistics or machine learning are of the form "find the values of the parameters that minimize some measure of error". But in some cases, constraints are also imposed on the parameters: for instance, that they should sum up to 1, or that at most 10 of them should be non-zero -- this adds a combinatorial layer to the...

Five years of Weight Tracking

June 22, 2013

After I moved back from New Jersey in June 2008 I started to track my body weight more seriously. My routine usually consists of getting up and, after finishing in the bathroom, stepping on my scale. That way I try to ensure that the conditions for each weighing are as similar as possible. ...

Everything in Its Right Place: Visualization and Content Analysis of Radiohead Lyrics

June 22, 2013

Introduction: I am not a huge Radiohead fan. To be honest, the Radiohead I know and love and remember is that which was a rock band without a lot of 'experimental' tracks - a band you discovered on Big Shiny Tunes 2, or because your friends told you about...

Announcing pqR: A faster version of R

June 22, 2013

pqR — a “pretty quick” version of R — is now available to be downloaded, built, and installed on Linux/Unix systems. This version of R is based on R-2.15.0, but with many performance improvements, as well as some bug fixes and new features. Notable improvements in pqR include: Multiple processor cores can automatically be used to perform some numerical

Are Green Number Runners More Likely to Bail?

June 22, 2013

Comrades Marathon runners are awarded a permanent green race number once they have completed 10 journeys between Durban and Pietermaritzburg. For many runners, once they have completed the race a few times, achieving a green number becomes a possibility. And once the idea takes hold, it can become something of a compulsion. I can testify

Not only CRAN downloads and Shiny … but also .. rCharts

June 21, 2013

I have been meaning for some time to get stuck into the rCharts package, which provides an interface to many JavaScript graphics libraries. These offer rich charting capabilities with interactivity and a great deal of customization. As regular readers will know, I am also interested in improved publicity for CRAN packages, although the Shiny app ...
