Blog Archives

Delimited file where delimiter clashes with data values

August 1, 2013

A comma-separated values (CSV) file is a typical way to store tabular/rectangular data. If a data cell contains a comma, then that cell is typically wrapped in quotes. However, what if a data cell contains a comma … Continued
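The quoting convention mentioned in this excerpt can be sketched with Python's standard csv module. The sample rows here are made up for illustration, and this is not necessarily the approach the full post takes; it only shows that a cell containing a comma is wrapped in quotes on write and recovered intact on read.

```python
import csv
import io

# Hypothetical sample data: two cells contain commas, one also contains quotes.
rows = [
    ["id", "name", "note"],
    [1, "Smith, John", "plain value"],
    [2, "Doe, Jane", 'said "hi", then left'],
]

# Write the rows to an in-memory CSV. QUOTE_MINIMAL (the default) quotes
# only the cells that contain the delimiter, a quote, or a line break.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
writer.writerows(rows)
text = buffer.getvalue()

# Read the text back. The reader strips the wrapping quotes and
# un-doubles embedded quotes; every value comes back as a string.
parsed = list(csv.reader(io.StringIO(text)))

print(text)
print(parsed)
```

Embedded quote characters are doubled inside a quoted cell, so a cell can hold both the delimiter and the quote character.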

Read more »

Guide to accessing MS SQL Server and MySQL server on Mac OS X

April 6, 2013

Native GUI client access to MS-SQL and MySQL: we can use Oracle SQL Developer with the jTDS driver to access Microsoft SQL Server. Note: jTDS version 1.3.0 did not work for me; I had to use version 1.2.6. Detailed instructions can be found here. We can use MySQL Workbench to access MySQL server. Setup is... Read more »

Better decision tree graphics for rpart via party and partykit

May 29, 2012

I’ve been using Graphviz to create better decision tree graphics “by hand” for rpart objects created in R (final tree). I stumbled on this post that shows how one could convert an rpart object to a party object via the as.party function in partykit, in order to use the plot functions in party. It looks quite nice.... Read more »

Build 32 bit R on 64 bit Ubuntu by utilizing chroot

March 30, 2012

In the past, I’ve described how one could build multiarch (64 bit and 32 bit) versions of R on a 64 bit Ubuntu machine. The method, based on this thread, no longer works as of R 2.13 or 2.14, I believe. I received advice from someone on #R over on freenode (forgot who) a few... Read more »

Build multiarch R (32 bit and 64 bit) on Debian/Ubuntu

August 11, 2011

I have the 64 bit version of R compiled from source on my Ubuntu laptop. I recently needed a 32 bit build of R, since a package I had to compile and use only works in 32 bit. I thought it was readily available on Ubuntu since both 32 bit and 64 bit... Read more »

R from source

July 11, 2011

The following are notes for myself. I like to use the bleeding edge version of R:

    svn checkout https://svn.r-project.org/R/trunk/ r-devel
    cd r-devel
    ./tools/rsync-recommended
    ## use the following to update sources:
    svn update
    ## pre-reqs
    sudo apt-get build-dep r-base
    #sudo apt-get install gcc g++ gfortran libreadline-dev libx11-dev xorg-dev
    #sudo apt-get install texlive texinfo
    ./configure
    make
    sudo...

Read more »

My own programming style convention for most languages

July 1, 2011

I write code mainly in R and, from time to time, in C, C++, SAS, bash, python, and perl. There are style guides out there that help make your code more consistent and readable to yourself and others. Here is a style guide for C++, and here is Google’s style guide for R and here... Read more »

Serialize or turn a large parallel R job into smaller chunks for use with SGE

June 16, 2011

I use the snow package in R with OpenMPI and SGE quite often for my simulation studies; I’ve outlined how this can be done in the past. The ease of these methods makes it simple for me to just specify the maximum number of cores available all the time. However, unless you own your... Read more »

Creating even NICER, publishable, embeddable plots using tikzDevice in R for use with LaTeX

October 22, 2010

It’s true. I like to do my work in R and write using LaTeX (well, I prefer to use org-mode for less formal writing and/or if I don’t have to typeset a lot of math). I haven’t done a lot of LaTeX’ing or Sweaving in the last year since 1) I’ve been collaborating with scientists... Read more »

S4 classes in R: printing function definition and getting help

October 4, 2010

I’m not very familiar with S4 classes and methods, but I assume it’s the recommended way to write new packages since it is newer than S3; this, of course, is open to debate. I’ll outline my experience of programming with S4 classes and methods in a later post, but in the meantime, I want... Read more »