Linear regression models with robust parameter estimation

May 15, 2010

There are situations in regression modelling where robust methods could be considered to handle unusual observations that do not follow the general trend of the data set. Various R packages provide robust statistical methods, and these are summarised on the CRAN Robust Task View. As an example of using robust statistical estimation in


A small customization of ESS

May 14, 2010

JD Long (at Cerebral Mastication) posted a question on Twitter about an artifact in ESS, where typing “_” gets you “<-”. This is because in the early days of S+, “_” was an allowed assignment operator, and ESS was developed in that era. Later, it was disallowed in favor of “<-” and “=”, so ESS


Because it’s Friday: Optical Illusion

May 14, 2010

See more of the best illusions of 2010 at the link below. Best Illusion of the Year Contest: Top finalists in the 2010 contest


New R User Group in Boston

May 14, 2010

There's another new R User Group, this time in Boston: the New England R User Group. Their first meeting will be on Tuesday, May 25. Get all the info by joining the Google Group at the link below. Google Groups: New England R User Group


Introducing IBrokers (and Jeff Ryan)

May 13, 2010

Josh had kindly invited me to post on FOSS Trading around the time when he first came up with the idea for the blog. Fast forward a year and I am finally taking him up on his offer. I'll start by highlighting that while all the software in this post is indeed free (true to FOSS), an account with...


In case you missed it: April Roundup

May 13, 2010

In case you missed them, here are some articles from last month of particular interest to R users. We announced the availability of Revolution R Community 3.2 (based on R 2.10.1), now 100% open source, and including a new doSMP package for parallel computing on Windows. We announced that Revolution R Enterprise is now available free of charge to...


Introduction to using R in research

May 13, 2010

I was recently asked to give a talk to our graduate school annual conference. I offered several titles and the one they picked was Using R in research. I'm not sure if this was a good idea or not. The graduate school covers PhD students across three ar...


Using R, LaTeX, and Sweave for Reproducible Research: Handouts, Templates, & Other Resources

May 13, 2010

Several readers emailed me or left a comment on my previous announcement of Frank Harrell's workshop on using Sweave for reproducible research asking if we could record the seminar. Unfortunately we couldn't record audio or video, but take a look a...


Is it possible to get a causal smoothed filter?

May 12, 2010

Although I haven't been all that much of a fan of moving average based methods, I've observed some discussions and made some attempts to determine if it's possible to get an actual smoothed filter with a causal model. Anyone who's worked on financial ...


pimax(mcsm)

May 12, 2010

The function pimax from our package mcsm is used to reproduce Figure 5.11 of our book Introducing Monte Carlo Methods with R. (The name comes from using the Pima Indian R benchmark as the reference dataset.) I got this email from Josué: I ran the ‘pimax’ example from the mcsm manual, and it gave


Manual variable selection using the dropterm function

May 12, 2010

When fitting a multiple linear regression model to data, a natural question is whether the model can be simplified by excluding variables. There are automatic procedures for undertaking these tests, but some people prefer to follow a more manual approach to variable selection rather than pressing a button and taking what comes
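
As a rough illustration of the manual approach (not taken from the original post), the sketch below applies dropterm from the MASS package to the built-in mtcars data. The data set, the starting model, and the decision to drop qsec are assumptions made only for the example.

```r
library(MASS)  # provides dropterm()

# Hypothetical starting model: fuel economy against three predictors.
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# F-test for removing each term in turn from the current model.
tab <- dropterm(fit, test = "F")
print(tab)

# Suppose the analyst decides qsec can go: refit without it and look again.
fit2 <- update(fit, . ~ . - qsec)
print(dropterm(fit2, test = "F"))
```

Each call only reports what would happen if a single term were removed; the choice of which term to drop, if any, stays with the analyst.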


Revolution Analytics and R in the news

May 12, 2010

It was quite the media frenzy for Revolution and R last week. In conjunction with our relaunch as Revolution Analytics, we spoke to more than a dozen journalists and analysts to explain why we think R is at the center of a perfect storm for predictive analytics: with routine collection of large data sets, data analysis is now a...


Reflections on consulting part 5 – what languages and tools to learn?

May 12, 2010

What languages and tools should you learn as a math/stat consultant? To jump to the answer: Excel/VBA, SQL, R, Java, and Python. Spreadsheets have many problems with verifiability and scalability, so why Excel? Excel is: useful for prototyping ideas quickly, either for your own use or to show to other team members; well-known and understood


What Social Network Analysis software do you use?

May 12, 2010

See the poll here by Gabriel Rossman at Code and Culture. I voted for R and ‘igraph’. If you use R you are getting access to all the other wonderful things that come with R. Using a specialized package such as Pajek or UCINET requires constant going back and forth between network software and some other


Rcpp 0.8.0

May 12, 2010

Summary: Version 0.8.0 of the Rcpp package was released to CRAN today. This release marks another milestone in the ongoing redesign of the package and its underlying C++ library. Overview: Rcpp is an R package and C++ library that facilitates integr...


Collect and Parse GPS (NMEA0183) Data in R

May 11, 2010

I recently wrote a serial connection for R-2.11.0 so that I can communicate with serial devices, for example an old Garmin eTrex Legend. This GPS device is able to output NMEA0183 sentences to a standard serial port (4800,8,1,N). I hooked up the device and used the serial connection to collect some data using some R


Sweave for Reproducible Research and Beautiful Statistical Reports

May 11, 2010

Frank Harrell, chair of the Biostatistics department here at Vanderbilt, is giving a seminar entitled "Sweave for Reproducible Research and Beautiful Statistical Reports" tomorrow, Wednesday, May 12, 1:30-2:30pm, in the MRBIII Conference Room 1220. This tutorial covers the basics of Sweave and shows how to enhance the default output in various ways by using: latex methods for converting R...


Number Formatting

May 11, 2010

I was discussing some subject with my kids - can't recall if it was in the realm of astronomy, computing, or modern economics. In any case, it involved large numbers. I fired up R to do a quick calculation:
> 1000000000 / 1000
The resulting answer was ...
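
The excerpt stops before showing the result, so the following is only a sketch of the usual behaviour, assuming R's default options (in particular the default scipen setting): the quotient is printed in scientific notation, and format or formatC can be asked for a fixed, comma-separated form instead. The variable names are illustrative.

```r
x <- 1000000000 / 1000

# With default options R chooses scientific notation for this value.
print(x)

# Ask for fixed notation with thousands separators instead.
fixed <- format(x, big.mark = ",", scientific = FALSE)
print(fixed)

# formatC offers similar control through a C-style format code.
also_fixed <- formatC(x, format = "d", big.mark = ",")
print(also_fixed)
```

Both calls return character strings, so they are suited to display rather than further arithmetic.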


R Package ‘rms’ for Regression Modeling

May 11, 2010

If you attended Frank Harrell's Regression Modeling Strategies course a few weeks ago, you got a chance to see the rms package for R in action. Frank's rms package does regression modeling, testing, estimation, validation, graphics, prediction, and ty...


Webinar May 20: Introduction to Revolution R

May 11, 2010

I'll be giving a live webinar on Thursday next week (May 20) titled Introduction to Revolution R. If you're new to the R world and wondering what you can do with R, this webinar is for you. I'll also be introducing some of the functionality unique to Revolution R included in our Revolution R Community (free to everyone) and...


R function names, explained

May 11, 2010

Why is the function to print out text in R named "cat"? Why is the function to delete objects called "rm"? Unless you have a background in Unix (or Linux) programming, some of R's command names can seem, well, a bit arcane. Jeromy Anglim explains the provenance of many of R's command names in this post: the details are...


Beware of rogue header files (Bioconductor installation)

May 11, 2010

Just a short note concerning a “gotcha”. As I have many times before, I opened an R console on my newly-upgraded (to lucid 10.04) Ubuntu machine, typed source("http://bioconductor.org/biocLite.R") and began a Bioconductor install with biocLite(). Only this time, I saw this:
Error in dyn.load(file, DLLpath = DLLpath, ...) : unable to load shared library '/home/sau103/R/i486-pc-linux-gnu-library/2.11/affyio/libs/affyio.so':


Putting Text in a Margin

May 10, 2010

I found text and title early on but was initially confounded when trying to add text outside of an actual chart. As it turns out, I needed to understand a bit about R's concept of margins in a chart:
> par(oma=c(2,2,2,2))
> plot(rnorm)
> mtext('The label'...
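
The mtext call above is cut off in the excerpt, so its remaining arguments are unknown. As a hedged sketch of the general idea, the code below reserves an outer margin with par(oma = ...) and writes into it with mtext(..., outer = TRUE). The sample data, the sides chosen, and the label text are assumptions for illustration only.

```r
# Reserve two lines of outer margin on each side: bottom, left, top, right.
op <- par(oma = c(2, 2, 2, 2))

plot(rnorm(50), main = "Inside the figure region")

# outer = TRUE places the text in the outer margin, not the figure margin.
mtext("The label", side = 1, outer = TRUE)         # bottom outer margin
mtext("Top outer margin", side = 3, outer = TRUE)  # top outer margin

oma_used <- par("oma")  # kept only so the setting can be inspected
par(op)                 # restore the previous graphical parameters
```

Without outer = TRUE, mtext writes into the ordinary figure margin set by mar, which is usually the source of the initial confusion.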


String Concatenation in R

May 10, 2010

String concatenation is a rather basic function - but my particular programming reflexes did not help me figure out how to do this in R. I tried the + and & operators, and even the || operator, to no avail. I also tried a concat() function... no dice. ...
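
The excerpt ends before giving the answer. The usual base R tool for this is paste, sketched below with made-up strings; sprintf is shown as an alternative, and whether the original post mentions it is not known.

```r
first <- "Hello"
second <- "world"

# paste() joins its arguments, separated by a space by default.
spaced <- paste(first, second)

# sep = "" gives plain concatenation with nothing in between.
joined <- paste(first, second, sep = "")

# collapse joins the elements of a single vector into one string.
listed <- paste(c("a", "b", "c"), collapse = ", ")

# sprintf() is an alternative when a template reads more clearly.
templated <- sprintf("%s, %s!", first, second)

print(c(spaced, joined, listed, templated))
```

Because paste is vectorised, passing vectors of equal length produces one joined string per element unless collapse is supplied.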


A ridiculous email

May 10, 2010

Wolfram Research presumably has a robot that sends automated email following postings on arXiv: Your article, “Evidence and Evolution: A review”, caught the attention of one of my colleagues, who thought that it could be developed into an interesting Demonstration to add to the Wolfram Demonstrations Project. The Demonstrations Project, launched alongside Mathematica 6 in


Example 7.36: Propensity score stratification

May 10, 2010

In examples 7.34 and 7.35 we described methods using propensity scores to account for possible confounding factors in an observational study. In addition to adjusting for the propensity score in a multiple regression and matching on the propensity score...


An economist explains: Why I use R

May 10, 2010

Economist and R blogger JD Long gave a talk last week (as part of the vconf.org project) about why he uses R to do statistical forecasts of agricultural yield for the reinsurance company he works for. I couldn't make the live session, but a replay is now available. The audio's a bit choppy, but if you've ever struggled with...


ggplot2: Waterfall Charts

May 10, 2010

Waterfall charts are often used for analytical purposes in the business setting to show the effect of sequentially introduced negative and/or positive values. Sometimes waterfall charts are also referred to as cascade charts. In the next few paragraphs I will show how to plot a waterfall chart using ggplot2. Data: A very small fictional dataset
