Mapping locations in R with the Data Science Toolkit

May 18, 2011

Pete Warden's Data Science Toolkit (which we mentioned briefly last week) is an open-source information server that provides an API you can query for information useful for building data science applications, like identifying proper names in unstructured text, or converting IP addresses to lat/long coordinates. You can make queries via the Web interface or by direct interface to the...

Stata-like Marginal Effects for Logit and Probit Models in R [2]

May 18, 2011

My thanks to those who emailed comments and suggestions for my ‘mfx’ function; I’m happy that I could fill a void for some people. I also received a request/suggestion from Tony Cookson, along with a helpful fix for a bug in the code, to include an option that would allow the user to specify values

Stata-like Marginal Effects for Logit and Probit Models in R

May 17, 2011

Although this blog’s primary focus is time series, one feature I missed from Stata was the simple marginal effects command, ‘mfx compute’, for cross-sectional work, and I could not find an adequate replacement in R. To bridge this gap, I’ve written a (rather messy) R function to produce a marginal effects readout for logit and probit
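
As a rough illustration of the idea (not the author's ‘mfx’ function, which is not shown in this excerpt), marginal effects evaluated at the sample means of the regressors can be sketched in base R. The simulated data, the variable names, and the choice to evaluate at the means are all assumptions made for this example; for continuous regressors the effect is the coefficient scaled by the link density at the mean linear predictor.

```r
# Minimal sketch: marginal effects at the means for a logit model.
# Data and variable names are hypothetical, simulated for illustration only.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.5 + 1.0 * x1 - 0.7 * x2))

fit <- glm(y ~ x1 + x2, family = binomial(link = "logit"))

b    <- coef(fit)                     # estimated coefficients, intercept first
xbar <- colMeans(model.matrix(fit))   # regressor means (intercept column is 1)

# Density of the logistic distribution at the mean linear predictor.
# For a probit model, fit with link = "probit" and use dnorm() here instead.
scale_factor <- dlogis(sum(xbar * b))

# Marginal effect of each continuous regressor, evaluated at the means
mfx_at_means <- scale_factor * b[-1]
print(mfx_at_means)
```

This sketch treats every regressor as continuous and reports no standard errors; dummy variables would instead need a discrete difference in predicted probabilities.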

Simulating Win/Loss streaks with R rle function

May 17, 2011

The following script simulates sample runs of win, loss, and breakeven streaks drawn from a random distribution, using R's run-length encoding function, rle. The associated probabilities are passed as a vector argument to the sample function. Y...
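
The original script is not reproduced in this excerpt, so the following is only a minimal sketch of the approach it describes: draw outcomes with sample() and summarise the streaks with rle(). The outcome labels, the probabilities, and the number of draws are assumptions chosen for illustration.

```r
# Minimal sketch: simulate win/loss/breakeven streaks with sample() and rle().
# Labels, probabilities, and sample size are hypothetical.
set.seed(42)
outcomes <- c("W", "L", "B")     # win, loss, breakeven
probs    <- c(0.5, 0.4, 0.1)     # assumed probability of each outcome

# Draw 1000 independent outcomes with the given probabilities
trades <- sample(outcomes, size = 1000, replace = TRUE, prob = probs)

# Run-length encoding: $values holds the outcome of each streak,
# $lengths holds how many consecutive times it occurred
runs <- rle(trades)

# Longest streak observed for each outcome
longest <- tapply(runs$lengths, runs$values, max)
print(longest)

# Frequency table of winning-streak lengths
print(table(runs$lengths[runs$values == "W"]))
```

Changing `probs` shows how the typical streak length shifts with the win rate; `inverse.rle(runs)` recovers the original sequence if it is needed later.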

A survey of the [60′s] Monte Carlo methods [2]

May 17, 2011

The 24 questions asked by John Halton in the conclusion of his 1970 survey are: Can we obtain a theory of convergence for random variables taking values in Fréchet spaces? Can the study of Monte Carlo estimates in separable Fréchet spaces give a theory of global approximation? When sampling functions, what constitutes a representative sample

A simple function for plotting phylogenies in ggplot2

May 17, 2011

I wrote a simple function for plotting a phylogeny in ggplot2. However, it only handles a 3-species tree right now, as I haven't figured out how to generalize the approach to N species. Any ideas on how to improve this?

TreeBASE in R: a first tutorial

May 16, 2011

My TreeBASE R package is essentially functional now. Here’s a quick tutorial on the kinds of things it can do. Grab the treebase package here, install and load the library into R. TreeBASE provides two APIs to query the database, one which searches by the metadata associated with different publications (called OAI-PMH), and another which

Describing Data: Frequently Used Commands

May 13, 2011

Obtaining a coherent numerical summary of data is a common task, and it is common to want to port these summary statistics into a table of results. When I am in interactive mode with my data, I use the summary() command applied to my data frame. For ...

Because it’s Friday: French Press Heat Retention

May 13, 2011

While responding to this thread on Reddit, I made a rough guess as to the heat retention of my French press when completely full of coffee. When I went to bed I realized there was no good reason why I …

