In my last post, I looked at the success rates for EPSRC Fellowship applications using funnel plots. As luck would have it, Alex Hulkes and Derek Gillespie from EPSRC then got in touch to say that they had done a similar internal analysis and would I be interested in the data? Yes please!
The new data set considers EPSRC research grants as a whole, and gives the number of applications and success rates for 137 UK universities for four years, 2009-10 to 2012-13. I wanted to add a bit of colour to this data, literally and figuratively, and so I added a column to indicate whether the university was a member of the research-intensive Russell Group or the (now defunct) 1994 Group. The basic funnel plot is shown below; note that I’ve normalised the results for each year to allow direct comparison.
Unfortunately, with so many institutions, it’s a bit tricky to make out individual universities. Here I’ve highlighted an example university from each of the three groups. It would be fun to make an interactive version of this using D3, but since it took me ages to make these plots, I think I might pass on this for the moment.
Another question is how the performance of each university has changed over time; I’ve focused on the Russell Group universities to keep things legible. The first step is to decide how to measure “performance”. For each university, I’ve calculated the probability of observing at most the actual number of successful applications, given the total number of proposals it submitted and the overall average success rate for that year. In R, this corresponds to the pbinom function, and the result is a score between 0 and 1. I’ve scaled this from 0 (bad) to 100 (good).
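The scoring step can be sketched in a few lines. This is a Python stand-in for R's pbinom, not the code used to make the plots; the function names and the example numbers are made up for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), the quantity R's pbinom(k, n, p) returns.

    Direct summation; fine for a few hundred applications, but very large n
    would need a more careful numerical method.
    """
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def performance_score(successes, applications, overall_rate):
    """Scale the cumulative probability to run from 0 (bad) to 100 (good)."""
    return 100.0 * binom_cdf(successes, applications, overall_rate)


# Hypothetical university: 5 funded out of 10 submitted, overall rate 50%.
print(round(performance_score(5, 10, 0.5), 1))  # 62.3
```

One consequence of using the cumulative probability is that an institution whose every application was funded always gets the maximum score, however few applications it submitted.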
To make the plots, I’m using a new and improved version of some code that I wrote earlier for generating Tufte-style slopegraphs. The revised code allows the user to choose different layout algorithms. So in the first plot, the order of the universities is determined only by their rank in 2009-2010; each group line is then constrained not to overlap. This makes it easy to see how a particular university has changed over time, but not its overall rank in subsequent years.
In the second version, the position is based on rank in each year.
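To make the rank-based layout concrete, here is a minimal Python sketch of ranking within each year. It is only an illustration of the idea, not the slopegraph code itself; the function name and the three example universities are hypothetical.

```python
def ranks_by_year(scores):
    """Map {university: [score per year]} to {university: [rank per year]}.

    Rank 1 is the highest score in that year. Ties are broken by the order
    in which the universities appear in the input (an arbitrary choice).
    """
    names = list(scores)
    n_years = len(next(iter(scores.values())))
    ranks = {name: [] for name in names}
    for year in range(n_years):
        ordered = sorted(names, key=lambda name: scores[name][year], reverse=True)
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return ranks


# Made-up scores for two years, not taken from the EPSRC data.
example = {"A": [90, 40], "B": [70, 80], "C": [50, 60]}
print(ranks_by_year(example))  # {'A': [1, 3], 'B': [2, 1], 'C': [3, 2]}
```

In the first layout only the first entry of each list would drive the ordering; in the second, every year's rank is used.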
We might ask which university is “best”. I’ve calculated this as the average corrected probability of success in each year (see above), weighted by the number of submitted proposals. The table below gives the top 10 universities within each group.
|Group||University||Adjusted success rate||Applications submitted|
|Russell Group||University of Cambridge||0.934||442|
|Russell Group||University of Bristol||0.924||310|
|Russell Group||University of Oxford||0.907||437|
|Russell Group||University College London||0.797||523|
|Russell Group||University of Sheffield||0.775||352|
|Russell Group||University of Leeds||0.759||324|
|Russell Group||Imperial College London||0.735||664|
|1994 Group||Institute of Education, University of London||0.882||3|
|1994 Group||University of Lancaster||0.770||127|
|1994 Group||Goldsmiths, University of London||0.655||10|
|1994 Group||Royal Holloway, University of London||0.576||52|
|1994 Group||University of Leicester||0.574||82|
|1994 Group||Birkbeck, University of London||0.551||14|
|1994 Group||University of East Anglia||0.515||70|
|1994 Group||University of Sussex||0.433||76|
|1994 Group||University of Essex||0.317||42|
|Other||Institute of Development Studies||1.000||1|
|Other||National Institute of Agricultural Botany||1.000||3|
|Other||Queen Margaret University Edinburgh||1.000||1|
|Other||Scottish Agricultural College||1.000||1|
|Other||University of the Highlands and Islands||1.000||1|
|Other||NERC British Geological Survey||0.938||6|
|Other||Transport Research Laboratory Ltd||0.862||3|
|Other||John Innes Centre||0.848||2|
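The adjusted success rate in this table is a weighted average of the yearly scores, with each year weighted by the number of proposals submitted. A minimal Python sketch of that calculation follows; the function name and the numbers are assumptions for illustration, not values from the data set.

```python
def weighted_score(yearly_scores, yearly_applications):
    """Average the yearly scores, weighting each year by its application count."""
    total = sum(yearly_applications)
    if total == 0:
        raise ValueError("no applications submitted in any year")
    weighted_sum = sum(s * n for s, n in zip(yearly_scores, yearly_applications))
    return weighted_sum / total


# Hypothetical university: a strong year with 30 proposals, a weak one with 10.
print(round(weighted_score([0.9, 0.5], [30, 10]), 3))  # 0.8
```

Weighting this way means a year with many proposals counts for more than a year with only a handful.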
I also calculated the performance of the groups as a whole and the Russell Group comes out on top.
|Group||Adjusted success rate||Average applications submitted per university|
This last table made me wonder if there was any connection between the number of submitted applications and the adjusted success rate (i.e. taking into account the binomial model). In other words, by submitting more applications, are you changing the probability of success? This could be explored with a fancy multi-level model, but I’ve just gone for the basic scatter plot below. The short answer appears to be ‘not really’. The more formal answer is that a linear model gives a statistically insignificant fit, with an adjusted R² of 0.011.
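For anyone who wants to repeat this check, the adjusted R² of a one-predictor linear fit can be computed by hand. The Python sketch below assumes ordinary least squares with an intercept, which is what R's lm does by default; the function name and the toy data are made up and unrelated to the EPSRC figures.

```python
def adjusted_r_squared(x, y):
    """Adjusted R^2 for the least-squares line y = a + b*x (one predictor)."""
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    # Penalise for the single fitted slope: (n - 1) / (n - p - 1) with p = 1.
    return 1 - (1 - r_squared) * (n - 1) / (n - 2)


# Toy data only: applications submitted vs. adjusted success rate.
print(round(adjusted_r_squared([1, 2, 3, 4], [1, 3, 2, 4]), 2))  # 0.46
```

A value close to zero, like the 0.011 reported here, means the number of applications explains almost none of the variation in the adjusted success rate.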
There are many more questions that a person could explore with this data, so kudos to EPSRC for making it available.