2730 search results for "GIS"

Making Friends with Multicollinearity

December 18, 2012

Not every system of independent variables can be decomposed into separate components, each with its own unique contribution.  Sometimes our individual variables behave “as a unit” and thus become so entangled that we cannot say where the effect of one variable begins and the effect of another variable ends.  In such cases, it might be best to ignore the ...

Making prettier network graphs with sna and igraph

December 18, 2012

We’ve had some requests for ideas about how to make prettier network graphs, so here is one example, using the sna package for plotting, and the igraph package to calculate PageRank. The help file for gplot is pretty self-explanatory, but Melissa Clarkson has produced the most thorough and impressive guide for any R...

Differential Isoform Expression With RNA-Seq: Are We Really There Yet?

December 17, 2012

In case you missed it, a new paper was published in Nature Biotechnology on a method for detecting isoform-level differential expression with RNA-seq data: Trapnell, Cole, et al. "Differential analysis of gene regulation at transcript resolution with RN...

Who are the pollinators? (with R plot)

December 17, 2012

I’ve been dreaming of writing a manuscript about who the pollinators are for a while, but it looks like I’m not going to have the time soon, so here is an early draft of what the main figure should look like: It’s surprisingly …

The Inverse Herfindahl–Hirschman Index as an “Effective Number of” Parties

December 17, 2012

I learned of the passing of Albert Hirschman on December 11, and while better and more instructive tributes to his life can be read elsewhere, I wanted to focus on a little piece of Hirschman’s work that I use all the time: the (inverse) Herfindahl–Hirschman Index. The HHI is basically a measure of market concentration, but...
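As a rough illustration of the idea named in the title, the HHI is commonly defined as the sum of squared shares, and its inverse is then read as an "effective number" of parties (or firms). The Python sketch below assumes that standard definition; the function names and the choice to normalize raw counts into shares are illustrative assumptions, not taken from the original post.

```python
def herfindahl_hirschman(shares):
    """Sum of squared shares (assumed standard HHI definition).

    `shares` may be raw counts (votes, seats, sales); they are
    normalized here so that they sum to 1 before squaring.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return sum((s / total) ** 2 for s in shares)


def effective_number(shares):
    """Inverse HHI: the 'effective number of' parties."""
    return 1.0 / herfindahl_hirschman(shares)


if __name__ == "__main__":
    # Hypothetical seat counts for four parties.
    print(effective_number([40, 30, 20, 10]))
```

With equal shares the inverse index equals the actual number of parties, and it shrinks toward 1 as one party dominates.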

Dark matter top 10, but an hour too late

December 16, 2012

Well, that’s embarrassing. A little tweak to my dark matter model resulted in a leaderboard score in the top 10. The only problem is that the contest closed about an hour ago. I ran this final prediction earlier today but then simply forgot to go back to it and submit!! On the bright side, I

Possibly slightly better text analysis with lme4

December 16, 2012

lme4 and its cousin arm are extremely useful for a huge variety of modeling applications (see Gelman and Hill’s book), but today we’re going to do something a little frivolous with them. Namely, we’re going to extend our Denver Debat...

Making Data Visually Appealing

December 16, 2012

I’ve recently been considering the graphical presentation of data. I get the feeling that we, ecologists/scientists, could be better at data presentation. Graphs must be informative, but they don’t have to be ugly. I think that making visually appealing charts …

Text analysis made too easy with the tm package

December 15, 2012

Today’s Gist takes the CNN transcript of the Denver Presidential Debate, converts paragraphs into a document-term matrix, and does the absolute most basic form of text analysis: a raw word count. There are actually quite a few steps in this proc...
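The general shape of that pipeline (paragraphs in, document-term matrix out, then a raw word count) can be sketched in a few lines. The original post uses R's tm package; the version below is a plain-Python stand-in using only the standard library, and the tokenizer, function names, and sample paragraphs are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes as terms."""
    return re.findall(r"[a-z']+", text.lower())


def document_term_matrix(paragraphs):
    """Return (vocabulary, matrix): one row per paragraph, one column per term."""
    counts = [Counter(tokenize(p)) for p in paragraphs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def raw_word_count(paragraphs):
    """Column sums of the document-term matrix: total count per term."""
    vocab, matrix = document_term_matrix(paragraphs)
    totals = [sum(column) for column in zip(*matrix)]
    return dict(zip(vocab, totals))


if __name__ == "__main__":
    # Made-up stand-ins for transcript paragraphs.
    sample = ["The economy, the jobs.", "Jobs and taxes."]
    print(raw_word_count(sample))
```

A real analysis would usually also drop stop words and punctuation-only tokens before counting; that step is omitted here to keep the sketch short.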

What is Correctness for Statistical Software?

December 14, 2012

Introduction: A few months ago, Drew Conway and I gave a webcast that tried to teach people about the basic principles behind linear and logistic regression. To illustrate logistic regression, we worked through a series of progressively more complex spam detection problems. The simplest data set we used was the following: This data set has
