Blog Archives

Splitting a Dataset Revisited: Keeping Covariates Balanced Between Splits

March 8, 2011

In my previous post I showed you how to randomly split up a dataset into training and testing datasets. (Thanks to all those who emailed me or left comments letting me know that this could be done using other means. As things go with R, it's sometimes ...

RStudio: New free IDE for R

February 28, 2011

Just saw the announcement of the availability of RStudio, a new (free & open source) integrated development environment for R that works on Windows, Mac, and Linux. Judging from the screenshots, it looks like RStudio supports syntax highlighting for Sweave & easy PDF creation from Sweave code, which is something I haven't seen anywhere else (on Windows at least)....

Split a Data Frame into Testing and Training Sets in R

February 24, 2011

I recently analyzed some data trying to find a model that would explain body fat distribution as predicted by several blood biomarkers. I had more predictors than samples (p>n), and I didn't have a clue which variables, interactions, or quadratic terms made biological sense to put into a model. I then turned to a few data mining procedures that I...
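
The post's own code isn't shown in this excerpt, so here is a minimal sketch of one common way to do a random train/test split in base R. The helper name `split_df` and its `train_frac` and `seed` arguments are made up for illustration, and `iris` is just a convenient built-in dataset.

```r
# Hypothetical helper (not the post's function): randomly split a data frame
# into training and testing sets.
split_df <- function(df, train_frac = 0.5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)           # optional, for reproducibility
  n <- nrow(df)
  n_train <- round(train_frac * n)             # number of training rows
  train_idx <- sample.int(n, size = n_train)   # sample row numbers without replacement
  test_idx <- setdiff(seq_len(n), train_idx)   # every remaining row goes to testing
  list(train = df[train_idx, , drop = FALSE],
       test  = df[test_idx, , drop = FALSE])
}

parts <- split_df(iris, train_frac = 0.8, seed = 1)
nrow(parts$train)  # 120
nrow(parts$test)   # 30
```

Using `setdiff()` rather than negative indexing keeps the test set correct even when the training set ends up empty.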

Get all your Questions Answered

February 22, 2011

When I have a question I usually ask the internet before bugging my neighbor. Yet it seems like Google's search results have become increasingly irrelevant over the last few years, and this is especially true for searching anything related to R (and pr...

R: Given column name in a Data Frame, Get the Index

February 17, 2011

Had a mental block today trying to figure out how to get the indices of columns in a data frame given their names. Simple task but difficult to search Google for an answer. Thanks to jashapiro, Matt, and Vince for giving me a heads up on the which() fu...
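
As a quick sketch of the which() approach mentioned above (the data frame and the `wanted` vector are invented for illustration), something like this should do it. The `match()` line is an alternative I'm adding here, not something from the post.

```r
# Toy data frame, made up for this example
df <- data.frame(id = 1:3, age = c(34, 51, 28), bmi = c(22.1, 27.4, 24.9))
wanted <- c("bmi", "id")

# which() returns the indices in the order the columns appear in the data frame
idx_which <- which(names(df) %in% wanted)   # 1 3

# match() returns the indices in the order of `wanted` (NA for names not found)
idx_match <- match(wanted, names(df))       # 3 1
```

The two differ only in ordering, so pick whichever matches how you plan to use the indices.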

Summarize Missing Data for all Variables in a Data Frame in R

February 16, 2011

Something like this probably already exists in an R package somewhere out there, but I needed a function to summarize how much missing data I have in each variable of a data frame in R. Pass a data frame to this function and for each variable it'll give you the number of missing values, the total N, and the...
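
The function itself is cut off in this excerpt, so what follows is only a rough sketch of the idea, covering the two columns the excerpt names in full (the number of missing values and the total N). The name `summarize_missing` and the example data are assumptions for illustration.

```r
# Hypothetical helper (not the post's function): per-variable missing-data counts.
summarize_missing <- function(df) {
  data.frame(
    variable  = names(df),
    n_missing = unname(colSums(is.na(df))),   # count of NA values in each column
    n_total   = rep(nrow(df), ncol(df)),      # total number of rows
    stringsAsFactors = FALSE
  )
}

# Made-up example data with one NA in x and one in y
example <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA), z = 1:3)
res <- summarize_missing(example)
res
```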

R function for extracting F-test P-value from linear model object

January 10, 2011

I thought it would be trivial to extract the p-value on the F-test of a linear regression model (testing the null hypothesis R²=0). If I fit the linear model: fit<-lm(y~x1+x2), I can't seem to find it in names(fit) or summary(fit). But summary(fit)$fstatistic does give you the F statistic, and both degrees of freedom, so I wrote this function to...
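
The post's function is truncated here, but the idea can be sketched like this: take the F statistic and its two degrees of freedom from summary(fit)$fstatistic and pass them to pf() for the upper-tail probability. The name `lm_f_pvalue` and the `mtcars` example are assumptions, not the original code.

```r
# Hypothetical helper (not the post's function): overall F-test p-value of an lm fit.
lm_f_pvalue <- function(fit) {
  stopifnot(inherits(fit, "lm"))
  f <- summary(fit)$fstatistic            # named vector: value, numdf, dendf
  if (is.null(f)) return(NA_real_)        # e.g. an intercept-only model has no F test
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

fit <- lm(mpg ~ wt + hp, data = mtcars)
p <- lm_f_pvalue(fit)
p
```

This should agree with the p-value printed on the last line of `summary(fit)`.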

Webinar on Revolution R Enterprise

December 7, 2010

R evangelist David Smith, marketing VP at Revolution R, will be giving a webinar showing off some of the finer features of Revolution R Enterprise - an integrated development environment (IDE) for R that has an enhanced script editor with syntax highli...

Using the "Divide by 4 Rule" to Interpret Logistic Regression Coefficients

December 6, 2010

I was recently reading a bit about logistic regression in a book on hierarchical/multilevel modeling when I first learned about the "divide by 4 rule" for quickly interpreting coefficients in a logistic regression model in terms of the predicted probabilities of the outcome. The idea is pretty simple. The logistic curve (predicted probabilities) is steepest at the center where...
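
To make the rule concrete, here is a small numeric check. It is a sketch, with an arbitrary, made-up coefficient `beta`. The slope of the logistic curve with respect to the predictor is beta * p * (1 - p), which peaks at p = 0.5, giving beta / 4.

```r
beta <- 0.6   # hypothetical logistic regression coefficient

# Slope of the predicted probability at linear predictor value eta:
# d/dx plogis(eta) = beta * p * (1 - p), where p = plogis(eta)
slope <- function(eta) beta * plogis(eta) * (1 - plogis(eta))

center_slope <- slope(0)   # at the center of the curve, p = 0.5
rule_of_4 <- beta / 4      # the "divide by 4" upper bound

# Across a grid of linear predictor values, the slope never exceeds beta / 4
eta_grid <- seq(-6, 6, by = 0.01)
max_slope <- max(slope(eta_grid))
```

So `beta / 4` is an upper bound on how much the predicted probability changes per unit change in the predictor, reached only near the middle of the curve.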

Syntax Highlighting R Code, Revisited

November 17, 2010

A few months ago I showed you how to syntax-highlight R code using GitHub Gists for displaying R code on your blog or other online medium. The idea's really simple if you use Blogger: head over to gist.github.com, paste in your R code, create a public "gist", hit "embed", then copy the JavaScript onto your blog. However, if...
