July 12, 2011

I wish I knew everything about R. I wish I could vectorise in my sleep. I wish there were perfect R packages out there to solve all my data transformation problems. I wish there were perfect data.

If I were Paul Graham, would I ever write code like the above? Would I hire someone who wrote that, if I were Joel Spolsky?

My code smells, but I’ve spoken with a few experts in our department whom I trust, and they agree that the approach I’m taking is sound. I’m transforming data to be fed into a Cox model. Each data row contains a start and end date, event boolean, outcome boolean, number of prior events, and number of prior outcomes. There’s also an array of rules by which to construct the data, including those that involve season start and end dates, event start and end dates, events spanning multiple data rows, etc. Oh, and I’m using a big loop rather than vectorisation.
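For what it's worth, at least one piece of that row layout, the running counts of prior events and prior outcomes, can probably be built without a loop. Below is a minimal sketch in base R, not the real transformation. It assumes a toy data frame with made-up column names (`id`, `start`, `end`, `event`, `outcome`), rows already sorted by subject and start date, and none of the season or multi-row rules. The helper `count_prior` is hypothetical.

```r
# Hypothetical toy data: one row per follow-up interval, sorted by subject
# and start date. Column names and values are illustrative assumptions.
rows <- data.frame(
  id      = c(1, 1, 1, 2, 2),
  start   = as.Date(c("2010-01-01", "2010-03-01", "2010-06-01",
                      "2010-01-01", "2010-04-01")),
  end     = as.Date(c("2010-03-01", "2010-06-01", "2010-09-01",
                      "2010-04-01", "2010-08-01")),
  event   = c(TRUE, FALSE, TRUE, FALSE, TRUE),
  outcome = c(FALSE, TRUE, FALSE, FALSE, TRUE)
)

# Number of TRUE flags in earlier rows of the same subject:
# the per-subject running total minus the current row's own flag.
count_prior <- function(flag, id) {
  ave(as.numeric(flag), id, FUN = cumsum) - as.numeric(flag)
}

rows$prior_events   <- count_prior(rows$event, rows$id)
rows$prior_outcomes <- count_prior(rows$outcome, rows$id)
```

Here `ave()` applies `cumsum` within each subject, so the counts restart for every `id`. The messier rules (season boundaries, events spanning several rows) may still be easier to express in a loop, and this sketch does not attempt them.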

This project has made me question my ability to solve problems in software, which is humbling, but I soldier on.

