O’Reilly’s R is a Harpy Eagle

January 4, 2010
By

Today marks the hardcopy availability of the first book dedicated to R from O'Reilly, R in a Nutshell. In the familiar O'Reilly style, the cover is adorned with an illustration of an animal, in this case a harpy eagle. The book is written by Joe Adler, a data analyst and the author of Baseball Hacks. In contrast to the...

Read more »

Welcome!

January 4, 2010
By

Welcome to my new blog, Byte Mining! Data is all around us, all the time. It flows in from places you would least expect it, and more often than not, it remains in its original form, untouched by human and machine. When data simply flows in and out of our lives, we miss out on the story that it...

Read more »

Social Network Analysis in R

January 3, 2010
By

Presentation by Drew Conway on August 6, 2009 at the NYC R Statistical Programming Meetup on how to perform basic social network analysis in R using the igraph package.

Read more »

directlabels: Adding direct labels to ggplot2 and lattice plots

January 3, 2010
By

Sometimes it is preferable to label data series instead of using a legend. This post demonstrates one way of using labels instead of a legend in a ggplot2 plot.
> library(ggplot2)
> p <- ggplot(dfm, aes(month, value, group = City, colour = City)) + geom_line(size = 1) + opts(legend.position = "none")
> p + geom_text(data =

Read more »

Rcpp 0.7.1

January 2, 2010
By

Two weeks after the Rcpp 0.7.0 release, Romain and I are happy to announce release 0.7.1 of Rcpp. It is currently in the incoming section of CRAN and has been accepted into Debian. Mirrors will catch up over the next few days; in the meantime, the local page is available for download too. A lot has...

Read more »

LSPM Examples

January 2, 2010
By

I have received several requests for additional LSPM documentation over the past couple of days, and a couple of months ago I promised an introduction to LSPM. In this long-overdue post, I will show how to optimize a Leverage Space Portfolio with the LSPM pa...

Read more »

Arctic Sea Ice Extent Trends With R

January 1, 2010
By

This post shows how to retrieve online Arctic Sea Ice Extent data from the National Snow and Ice Data Center (NSIDC), consolidate the data files, generate a CSV file, summarize and plot the data, and post it as a Google Doc so that interested readers can download and analyze this data series themselves. Links to

Read more »

Happy New Year with R

December 31, 2009
By

I have to admit that the previous post on Christmas is actually not much fun. Today I received another pResent from Yixuan which is more interesting. Basically, the code deals with letter polygons (i.e. glyphs) and plots them with proper projections from 3D to 2D space: ## original code by Yixuan, with my slight modification h.x =

Read more »

Updates about R-bloggers, a Happy new year, and one request

December 31, 2009
By

Hello dear reader of R-bloggers, I am excited to inform you that R-bloggers has grown amazingly in the last few weeks. There are now 29 blogs participating and sharing their R-related articles and tutorials with the rest of us. I didn’t even know that there were so many bloggers writing about R until now. Until now (according to Google Analytics,...

Read more »

Because it’s Friday: The inner life of a cell

December 31, 2009
By

Happy New Year, everyone. In celebration of the New Year, enjoy this celebration of the workings of life from Harvard University. More at the link below: 3 Quarks Daily: The inner life of the cell

Read more »

R/Finance 2010, April 16-17 in Chicago

December 31, 2009
By

Today is the last day to submit abstracts for the R/Finance 2010 conference to be held in Chicago on April 16-17. If you're not planning on speaking, but are interested in applications of R in Finance, be sure to add this to your calendar -- last year's conference was an outstanding event. Here's some more information about the conference...

Read more »

R in the NYT

December 30, 2009
By

The statistical package R received a positive overview in the New York Times recently.

Read more »

Brief Analysis of Abdulmutallab (Christmas Day bomber) Web Posting Data

December 30, 2009
By

Thanks to Evan Kohlmann at the NEFA Foundation for compiling, and Danger Room for publicizing, the data set of all of Farouk Abdulmutallab’s posts to the Islamic Forum on Gawaher.com. Since Evan took the initiative to download and save the raw HTML data, I thought it would be useful to go one step

Read more »

Use plyr instead of _apply() in R

December 30, 2009
By

I've covered plyr once before, showing you how to get means and variances for two quantitative traits across multilocus genotypes. JD Long over at Cerebral Mastication recently posted a nice screencast illustrating how plyr "just works" as an alternative to R's family of apply commands. There's a set of R functions (apply, sapply, lapply, tapply, eapply, and rapply) that...

Read more »

What’s up with Darwin’s weather?

December 30, 2009
By

Darwin is the capital city of Australia's Northern Territory. Lying on the coast of far Northern Australia, it's situated well in the tropics and as a result has hot, steamy, monsoonal weather. Darwin's weather has already had an impact on urban culture, and now it seems it's had a political impact too: it's been in the middle of...

Read more »

Top Ten Must-Have R Packages for Social Scientists

December 29, 2009
By

The political scientist Drew Conway has come up with a useful list of his ten "must-have" R packages for social scientists. I agree with him for the most part, and his list highlights the usefulness of R (vis-a-vis Stata) for social network analysis (s...

Read more »

Start your engines; it’s a Linux era!

December 29, 2009
By

Well, I’m writing this from my new system. After years on hiatus, I migrated to Linux once again. Setting up a full system on Linux for a Greek user had been one of the greatest challenges. First of all, setting up writing, reading & printing in Greek was the biggest obstacle; I still recall memories of 2000/2001

Read more »

C++ exceptions at the R level

December 29, 2009
By

I've recently offered an extra set of hands to Dirk to work on the Rcpp package; this serves as a good excuse to learn more about C++. Exception management was quite high on my list. C++ has nice exception handling (well not ...

Read more »

RGedit

December 29, 2009
By

RGedit is a plugin for Gedit to run R scripts.

Read more »

Speed-reading files, revisited

December 29, 2009
By

In a post earlier this month, it seemed as though compressing a data file before reading it into R could save you some time. With some feedback from readers and further experimentation, we might need to revisit that conclusion. To recap, in our previous experiment it took 170 seconds to read a 182 MB text file into R. But if...

Read more »

tooltips in R graphics; nytR package

December 28, 2009
By

At Doug Rivers’ suggestion, I started investigating tooltips as a way to label points in R graphs. An example appears at the top of my blog, where I plot the ideal points (revealed preferences) of the (current) 111th U.S. House of Representatives against Obama vote share in their district in 2008 (SVG). I’m using the

Read more »

inline 0.3.4 released

December 28, 2009
By

Oleg has updated the inline package to version 0.3.4, which is now on CRAN. It includes my patch for both Rcpp support as well as extended header / library options for PKG_CPPFLAGS, PKG_CXXFLAGS, and PKG_LIBS which I had mentioned recently here an...

Read more »

Introduction to the Grammar of Graphics with ggplot2 in R

December 28, 2009
By

A detailed introduction to the Grammar of Graphics as implemented in R with the data visualization library ggplot2. This talk was given by Harlan Harris to the NYC R Statistical Meetup on December 3, 2009.

Read more »

Example 7.19: find the closest pair of observations

December 28, 2009
By

Suppose we need to find the closest pair of observations on some variable x. For example, we might be concerned that some data had been accidentally duplicated. We return the IDs of the two closest observations, and their distance from each other. In both languages, we'll first create the data, then sort it, recognizing that the...

Read more »

Estimated Net Worth of SoilWeb- Our Online Soil Survey

December 28, 2009
By

According to the excellent source code evaluation tool, SLOCCount, our online soil survey (SoilWeb) code is worth about $268,543 and would require about 2 years of development time to re-create from scratch with a single developer working full-time. This is a fairly close estimate, as I have been working (part-time) on this code-base for 3 years now...

Read more »