Handling Large Datasets in R

Handling large datasets in R, especially CSV data, was briefly discussed before in Excellent free CSV splitter and Handling Large CSV Files in R. My file at that time was around 2GB, with 30 million rows and 8 columns. Recently I started to collect and analyze US corporate bonds tick data from year...

Read more »

DEADLINE EXTENDED — CFP: Special Issue in JSS for "Magnetic Resonance Imaging in R"

October 26, 2010
By

Magnetic Resonance Imaging in R: The deadline for submission to the Special Issue of the Journal of Statistical Software (JSS - http://www.jstatsoft.org) has been extended to November 14, 2010. All MRI modalities are welcome, for example, structural ...

Read more »

R nominated for best open-source project in New Zealand

October 26, 2010
By

The R project, born in New Zealand in 1993, has been nominated as the best open-source project in the New Zealand Open-Source Awards 2010. R's co-creator Ross Ihaka talks about the project in this article by the New Zealand Herald: Ross Ihaka from the University of Auckland started developing R 20 years ago, but it took off about a...

Read more »

Example 8.11: violin plots

October 26, 2010
By

We've continued to get useful feedback and ideas from our posts on the combination dotplot/boxplot and other ways to craft similar displays. Another notion is the violin plot, which combines a boxplot and a (doubled) kernel density plot. While the ba...

Read more »

Upcoming R courses from Statistics.com

October 26, 2010
By

The online training provider Statistics.com has three great courses based on R coming up in the next few months: Nov. 5 - Dec. 3: "Graphics in R," with Paul Murrell; Nov. 20 - Dec. 18: "Support Vector Machines in R," with Dr. Lutz Hamel; Dec. 17 - Jan. 22: "Geostatistics in R," with Prof. David Unwin. The courses take...

Read more »

Different results from different software

October 26, 2010
By

I’ve had a few questions on this topic lately. Here is an email received today: I use Eviews to estimate time series, but I have been checking out R recently, and your Forecast package. I cannot understand why 2 similar equations in Eviews and R are giving different estimated output. Your insights will be invaluable

Read more »

R User Groups 2010-10-25 21:14:50

October 25, 2010
By

Videos from the October meeting “Text Mining with R” of the Los Angeles R users group: Rob Zinkov, “Text Mining with R”; Ryan Rosario, “Accessing R from Python using RPy2”.

Read more »

Algorithmic Trading with IBrokers

October 25, 2010
By

Kyle Matoba is a Finance PhD student at the UCLA Anderson School of Management.  He gave a presentation on Algorithmic Trading with R and IBrokers at a recent meeting of the Los Angeles R User Group.  The discussion of IBrokers begins near th...

Read more »

The language of Statistics

October 25, 2010
By

R is the lingua franca of Statistics: R code and R packages are the means by which statisticians communicate ideas and methods for statistical analysis. The reasons why are discussed in this article, but it also raises the question: what's wrong with the spoken or written word? How Statistics and Probability relate to the English language is the subject...

Read more »

R API to Interactive Brokers Trader Workstation

Interactive Brokers via Matlab was mentioned in the old post Matlab trading code. IBrokers: R API to Interactive Brokers Trader Workstation is the R package I came across for an algo trading API. If you are also interested, you can watch the following sh...

Read more »

Parametric Bootstrap Power Analysis of GISS Temp Data

October 24, 2010
By

Previously, I calculated a bunch of ad-hoc power curves from GISTEMP data. Power is essentially a reframing of the p-value, used to see the significance of the trend lines in the global temps. However, power calculations are inherently very noisy, hence my ad-hoc way of aggregating the data. Another method is to bootstrap through the responses

Read more »

Accessing R from Python using RPy2

October 24, 2010
By

This past Tuesday I had the opportunity to present a short talk (a bit long) related to text mining at the Los Angeles R Users’ Group. Since I do most of my text mining in Python, I took this opportunity to discuss RPy2, an interface to R from Python. My slides are below: Accessing R from Python...

Read more »

Programming with R – Checking Function Arguments

October 24, 2010
By

In a previous post we considered writing a simple function to calculate the volume of a cylinder by specifying the height and radius of the cylinder. The function did not check the validity of its arguments, which we will consider in this post. R has various functions that we can use to

Read more »
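The argument checking described in the teaser above can be sketched in base R. This is a minimal illustration only: the function name cylinder_volume and the particular checks are assumptions made here, not the code from the linked post.

```r
# Hypothetical helper: the name and the specific checks are assumptions,
# not taken from the linked post.
cylinder_volume <- function(height, radius) {
  # Reject non-numeric input before doing any arithmetic.
  stopifnot(is.numeric(height), is.numeric(radius))
  # is.finite() is FALSE for NA, NaN, Inf and -Inf, so this catches all four.
  if (any(!is.finite(height)) || any(!is.finite(radius))) {
    stop("height and radius must be finite")
  }
  # A physical dimension cannot be negative.
  if (any(height < 0) || any(radius < 0)) {
    stop("height and radius must be non-negative")
  }
  # Volume of a cylinder: pi * r^2 * h.
  pi * radius^2 * height
}

# Valid input returns the volume; invalid input stops with an error.
cylinder_volume(height = 2, radius = 1)
```

stopifnot() gives a terse error for type problems, while the explicit stop() calls allow a more readable message for range problems; which balance to strike is a matter of taste.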

Generate your own Risk Characterization Theatre

October 24, 2010
By

In the recent posts Visualizing Smoking Risk and Shades of grey I wrote about the use of “Risk Characterization Theatres” (RCTs) to communicate probabilities. I found the idea in the book The Illusion of Certainty, by Eric Rifkin and Edward Bouwer. Here is how they explain the RCTs: Most of us are familiar with the crowd in a

Read more »

Grabbing Tables in Webpages Using the XML Package

October 24, 2010
By

Tables are pretty common in web pages as data sources, and the most direct way to get these data is probably to copy and paste. This is OK if there are only two or three tables, but when we need to grab 5000 tables in 1000 web pages, we may not really wish to fulfill

Read more »

how to speak ggplot2 like a native, and Predictive Analytics World

October 24, 2010
By

I was recently given the opportunity to re-present my ggplot2 talk, which I originally gave to the NYC R Meetup, to the DC R Meetup group. The Meetup was co-located with the Predictive Analytics World conference in Alexandria, VA. (More on my thoughts on PAW below…) Contentwise, I made only small changes, changing a

Read more »

Le Monde puzzle [42]

October 24, 2010
By

An interesting sudoku-like puzzle for this week's puzzle in Le Monde: A 10×10 grid is filled by a random permutation of {0,…,99}. The 4 largest figures in each row are coloured in yellow and the 4 largest values in each column are coloured in red. What is the range of the number of yellow-and-red

Read more »

Reader suggestions on alternative ways to create combination dotplot/boxplot

October 24, 2010
By

Kudos to several of our readers, who suggested simpler ways to craft the graphical display (combination dotplot/boxplot) from our most recent example. Yihui Xie combines a boxplot with a coarsened version of the PCS scores (using the round() function) u...

Read more »

R GUI now offers interactive graphics – Deducer 0.4-2 connects with iplots

October 24, 2010
By

Earlier today, Ian Fellows announced the release of Deducer 0.4-2 and DeducerExtras 1.2 to CRAN (I copy his announcement here): Deducer 0.4-2 contains a few bug fixes, and an interface to the iplots package. With the new iplots interface it is now possible to do interactive plots with Deducer. An introductory example screen cast

Read more »

Aquamacs customizations (auctex, ESS)

October 23, 2010
By

I gave an informal talk on my Mac based “workflow” at Stanford on Friday.  I talked a lot about Aquamacs as a tool for editing LaTeX (I use MacTeX) and for working with R (thanks auctex and ess, respectively).  Skim also got a mention; I emphasized TeX-PDF synchronization. Some of the students were asking for

Read more »

R & Rapidminer tutorial

October 23, 2010
By

You can see in the following video a simple tutorial on the Rapidminer R plugin: Rapidminer R extension tutorial (via neuralmarkettrends).

Read more »

Google slides

October 22, 2010
By

Last stop on my world tour was Google headquarters in Mountain View, California, where Dirk and I presented Rcpp, RInside, RProtoBuf, etc. for 90 minutes today. The talk was recorded, and will be broadcast on YouTube at some point. In the mean...

Read more »

Bayesian Diabetes Projections by CDC

October 22, 2010
By

Bayesian methods are supporting decisions and news at the national level! The Centers for Disease Control and Prevention summarizes a report published in the journal Population Health Metrics. The news also made it to the national media. The report (JP Boyle, TJ Thompson, EW Gregg, LE Barker, and DF Williamson (2010) “Projection of the year

Read more »

Help! My model fits too well!

October 22, 2010
By

This is sort-of related to my sidelined study of graph algebra. I was thinking about data I could apply a first-order linear difference model to, and the stock market came to mind. After all, despite some black swan sized shocks, what better predicts a day’s closing than the previous day’s closing? So,

Read more »

Because it’s Friday: Arthur C Clarke predicts the present

October 22, 2010
By

On the BBC Horizon programme in 1964, Arthur C Clarke made some predictions about the future. He prefaced his predictions with the following caveat: If, by some miracle, a prophet could describe the future exactly as it was going to take place, his predictions would sound so absurd, so farfetched, that everybody would laugh him to scorn. So what...

Read more »

Incremental improvements to Nightlights mapping thanks to R-Bloggers

October 22, 2010
By

The R community is very generous and collaborative. This post walks through the suggestions left by commenters on my previous post about Steve Mosher's Nightlights work, and shows the resulting much-improved output.

Read more »

A workflow for R

October 22, 2010
By

Writing an R script is one thing. Organizing your process: where to put the data, how to refer to files in scripts, how to run the scripts, and how to produce and collect and report the results; that's quite another. Every R user has their own workflow for doing data analysis with R, but the best workflows achieve the...

Read more »