Using R to graph a subject trend in PubMed

May 15, 2012

The traditional way to show an audience that your topic is worth studying is to present the state of the field based on a literature review. This is especially true if your subject is obscure except to a handful of scientists in the world. I was confronted with this problem more than once and the last time...

Read more »

How long before R overtakes SAS and SPSS?

May 15, 2012

Based on an analysis of Google Scholar data on usage of statistical software, Bob Muenchen makes a forecast: R will overtake SAS and SPSS in 2015. Forecasting is extrapolation — always a tricky business — so Bob also provides these qualitative reasons why R will continue to grow at the expense of SAS and SPSS: The continued rapid growth...

Read more »

Interactive reports in R with knitr and RStudio

May 15, 2012

Last Saturday I met the guys from RStudio at the R in Finance conference in Chicago. I was curious to find out what RStudio could offer. In the past I have used mostly Emacs + ESS for editing R files. Well, and what a surprise it was. JJ, Joe and Josh ...

Read more »

Will 2015 be the Beginning of the End for SAS and SPSS?

May 15, 2012

Learning to use a data analysis tool well takes significant effort, so people tend to continue using the tool they learned in college for much of their careers. As a result, the software used by professors and their students is …

Read more »

Forthcoming R User Meetings

May 15, 2012

Two R User Group meetings are happening soon thanks to the support of Mango Solutions (one of R-bloggers’ long-term sponsors). Details below: 1) ZurichR – Wednesday 23rd May 2012 (www.zurichr.org). ZurichR is a free networking event for R users sponsored by Mango Solutions and ETH Zurich. All are welcome to attend. Please confirm attendance in advance to [email protected] Time: 6.30pm – 9.30pm...

Read more »

Skew of Bonds

May 15, 2012

As the researchpuzzler highlights in “a bad bet”, US bonds were a popular subject at the CFA Institute Annual Conference.  While US Bonds have been in an amazing 30 year run (see previous posts Lattice Explore Bonds, Bond Market as a Casino Ga...

Read more »

Improving script_002: “Monitor”

May 15, 2012

I read in an article that Ian Cowe said that what chemometricians normally do is look at the graphics and, of course, interpret them. So I am still trying to develop a function that can help me understand the graphics and all the statistics th...

Read more »

RcppSMC 0.1.1

CRAN now tests packages against g++-4.7 (as this version has become the default on Debian's testing variant). This compiler switch once again triggered a set of build failures, mostly from include files now deemed missing. For RcppSMC, it came down ...

Read more »

Functions ddply and melt make plotting summary stats in R more tolerable

May 15, 2012

The main reason I have usually chosen Excel to make my plots at work is that I had difficulty feeding summary stats from R into a plotting function. One thing I learned this week is how …

Read more »

R solvements to Project Euler — problem 1

May 15, 2012

Things have been going wild since I opened this blog. Tasks piled up while I was tight on time. At present, I’m facing a major challenge in my life. However, I have decided to spare some time for self-improvement. R …

Read more »

GitHub data analysis

May 15, 2012

A few weeks ago GitHub announced that its timeline data is available on BigQuery for analysis. Moreover, it offers prizes for the best visualization of the data. Despite my art skills and minimal chances of winning a beauty contest, I decided to crunch the GitHub data and run a data analysis. After an initial trial of the BigQuery service, I found hard

Read more »

Blog aggregators

May 15, 2012

A very useful way of keeping up with blogs in a particular area is to subscribe to a blog aggregator. These will syndicate posts from a large number of blogs and provide links back to the original sources. So you only need to subscribe once to get all the good stuff in that area. There are now several blog...

Read more »

Setting up StatET & Eclipse in Windows

May 15, 2012

A view of the StatET plugin in the Juno Eclipse. The environment is perfect for developing R packages and creating more complex functions. I wanted to write about creating R-packages in Windows but after trying to get StatET to work seamlessly...

Read more »

Plotting data and distribution simultaneously (with ggplot2)

May 14, 2012

Ever wanted to see at a glance the distribution of your data across different axes? It happens to me often, and R lets you build a nice plot composition. This is my latest concoction. I used ggplot2 here, but equivalent graphics can be made...

Read more »

Multiple Sclerosis Tweet-Chat: Review

May 14, 2012

We had a great Twitter conversation last Thursday on the use of big-data analytics, Revolution R Enterprise, and IBM Netezza in the search for a cure for MS. Many thanks to the other panelists: Murali Ramanathan (SUNY Buffalo), Tim Coetzee (National MS Society) and moderator Shawn Dolley (IBM) for fielding and answering questions from interested parties following #IBMDataChat. As...

Read more »

New courses from R gurus

May 14, 2012

Looking to learn R, or to expand your R skills for data visualization or package development? Here are some R courses presented by the experts you may be interested in: June 19-20: Visualization in R with ggplot2. This course presented by Garrett Grolemund & Dr. Winston Chang of Rice University is also a web-based course with live presentation. This...

Read more »

generalised ratio of uniforms

May 14, 2012

A recent arXiv posting of the paper “On the Generalized Ratio of Uniforms as a Combination of Transformed Rejection and Extended Inverse of Density Sampling” by Martino, Luengo, and Míguez from Madrid rekindled my interest in this rather peculiar simulation method. The ratio of uniforms samples uniformly on the subgraph to produce simulations from p

Read more »

Spatial Randomness Evaluation in R: Monte Carlo Test

May 14, 2012

This post is a kind of reply to this one. So our goal is to determine whether our point process is random or not. We will use R, and the spatstat package in particular. Spatstat provides a very handy function for this that uses the K-function combined with...

Read more »

New Version of RStudio (v0.96)

May 14, 2012

Today a new version of RStudio (v0.96) is available for download from our website. The main focus of this release is improved tools for authoring, reproducible research, and web publishing. This means lots of new Sweave features as well as tight integration with the knitr package (including support for creating dynamic web reports with the

Read more »

Criticism 3 of NHST: Essential Information is Lost When Transforming 2D Data into a 1D Measure

May 14, 2012

Introduction Continuing on with my series on the weaknesses of NHST, I’d like to focus on an issue that’s not specific to NHST, but rather one that’s relevant to all quantitative analysis: the destruction caused by an inappropriate reduction of dimensionality. In our case, we’ll be concerned with the loss of essential information caused by

Read more »

Example 9.31: Exploring multiple testing procedures

May 14, 2012

In example 9.30 we explored the effects of adjusting for multiple testing using the Bonferroni and Benjamini-Hochberg (or false discovery rate, FDR) procedures. At the time we claimed that it would probably be inappropriate to extract the adjusted p-values from the FDR method from their context. In this entry we attempt to explain our misgivings about...

Read more »

Source R-Script from Dropbox

May 14, 2012

A quick tip on how to source R scripts from a Dropbox account: (1) Upload the script. (2) Get the link with the "get link" option. The link should look like "https://www.dropbox.com/s/XXXXXX/yourscript.R". (3) Grab this part "XXXXXX/yourscript.R" and paste...

Read more »

Bias in Federal Reserve Inflation Forecasts

May 13, 2012

Bias in Federal Reserve Inflation Forecasts: Christopher Gandrud uses ggplot2 to visualize potential partisan bias in US Federal Reserve inflation forecasts as a PhD student at the London School of Economics.

Read more »

Text Mining to Word Cloud App with R

May 13, 2012

Here is a simple application to transform text into a beautiful word cloud, Text Mining to WordCloud. The purpose is to find the highest-frequency word in a given text. It is an app built with the R language; the source code is attached at the end of...

Read more »

BCEA on CRAN!

May 13, 2012

Finally, I got round to finding some time to work out all the problems in compiling the BCEA (Bayesian Cost-Effectiveness Analysis) package. I developed it as part of the work for the book. In a nutshell, what it does is the following: first, you need to s...

Read more »

The whinny of the exponential horse

May 13, 2012

A Poisson process provides a good model for events that happen rarely. That's what von Bortkiewicz realized in 1898 when he modeled deaths by horse kick in Prussian cavalry; since it would be ungentlemanly to actually kill my readers, I instead represent the events in a Poisson process using a horse's whinny.

Read more »

Spurious correlations and the Lasso

May 13, 2012

Autocorrelation of a time series can be useful for prediction because the most recent observation of the prediction target contains information about future values. At the same time, autocorrelation can play tricks on you because many standard statistical methods implicitly assume independence of measurements at different times. The correlation coefficient between two variable and has

Read more »

ggplot2 presentation at Victoria University of Wellington

May 13, 2012

Next week I’ll present a glimpse of R and ggplot2 graphics at VUW. This is a MESA seminar on ‘Data analysis and plotting with free and open source tools’ where we’ll present spreadsheet alternatives based on gnuplot, Python, an...

Read more »

gRaphics! 2012-05-12 14:07:00

May 12, 2012

My own version of bubble plot (part 1). During one of my projects, I found myself in need of visualizing more than 3 dimensions at once. Three-dimensional graphs are usually not a good solution: they need to be properly oriented, for a start, and that's tricky. So, I started looking at bubble plots. The size of the bubble can...

Read more »