An Analysis of Texas High School Academic Competition Results, Part 1 – Introduction

[This article was first published on r on Tony ElHabr, and kindly contributed to R-bloggers].

NOTE: This is part of a series of write-ups discussing my findings on Texas high school academic University Interscholastic League (UIL) competitions.

To keep this and the other write-ups concise and to focus reader attention on the content, I have decided not to show the underlying code (especially the code used to create the visuals). Nonetheless, the full code can be viewed on my GitHub account. In the future, I may write an addendum to demonstrate the parts of the implementation that I think are notable.


After I finished high school in 2012, I thought it would be interesting to look back and evaluate my performance in the academic University Interscholastic League (UIL) competitions that I competed in against historical results. (To provide some background, most public high schools in Texas are registered in the UIL, which “exists to provide educational extracurricular academic, athletic, and music contests”. For those familiar with the National Collegiate Athletic Association (NCAA), the UIL serves an analogous role for Texas high school extracurricular activities.)

Aside from my own interest in the historical results of these competitions, I think that this analysis can provide some insight into which schools (and individual students) are really the most “elite”. School-wide and individual average scores on state and national standardized tests (e.g. the SAT) are certainly the most common measure of academic strength, but I think rankings by academic competitions may be more indicative.

About the Data

To make sense of my analysis, the reader should be aware of the following details about the data.

  • The competition data was scraped for all years from 2008 through 2017. [1] The data is not listed in an especially user-friendly format (in my opinion). Consequently, the “cleaned” data is imperfect in some ways.

  • The UIL assigns each school to one of six “Conferences”. The conference labels run from 1A through 6A, where a larger leading digit (i.e. 1, 2, etc.) generally corresponds to a larger school.

  • Schools only compete against other schools in their conference.

  • The UIL defines 3 levels of competition (in order of “difficulty”): District, Region, and State. That is, winning a District competition results in a Region competition appearance, and, subsequently, winning a Region competition results in a State competition appearance. (Keep in mind that schools still only compete against other schools in their same conference, even as they advance.)

  • The UIL defines 32 total Districts in Texas, which are aggregated into 4 Regions. (The source of the geo-spatial data is

  • For schools, advancement is a “winner-take-all” matter: only the school with the most combined points among its top individual competitors (3 for most competitions) advances. An individual, on the other hand, may advance even if their school does not win, provided they place among the top “n”. The value of “n” depends on the competition type. [2]

  • There are 5 academic competition “types”: Calculator Applications, Computer Science, Mathematics, Number Sense, and Science. [3]
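
The advancement rule in the list above (a school's score is the sum of its top few individual scores, while individuals advance on their own placing) can be illustrated with a short sketch. This is not the code used for the analysis. It is a minimal Python illustration, and the function names, the sample schools and scores, and the choice of n = 3 are all assumptions made for the example. Tie-breaking is ignored.

```python
from collections import defaultdict


def team_scores(results, top_k=3):
    """Sum each school's top_k individual scores (top_k=3 for most competitions)."""
    by_school = defaultdict(list)
    for school, _name, score in results:
        by_school[school].append(score)
    return {
        school: sum(sorted(scores, reverse=True)[:top_k])
        for school, scores in by_school.items()
    }


def advancers(results, top_k=3, n=3):
    """Return the single advancing school and the names of the top-n individuals.

    Ties are not handled; n is a hypothetical value, since the real cutoff
    depends on the competition type.
    """
    totals = team_scores(results, top_k)
    winning_school = max(totals, key=totals.get)
    ranked = sorted(results, key=lambda r: r[2], reverse=True)
    top_individuals = [name for _school, name, _score in ranked[:n]]
    return winning_school, top_individuals


# Hypothetical results: (school, competitor, score)
results = [
    ("School A", "Ann", 300),
    ("School A", "Abe", 100),
    ("School A", "Ali", 90),
    ("School A", "Amy", 80),
    ("School B", "Bob", 250),
    ("School B", "Bea", 240),
    ("School B", "Ben", 230),
]

print(team_scores(results))  # {'School A': 490, 'School B': 720}
print(advancers(results))    # ('School B', ['Ann', 'Bob', 'Bea'])
```

In this made-up example, School B advances as a team, yet Ann still advances individually because she places first overall, even though her school does not win.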

What’s Next

In this series, I investigate the following topics:

  1. I checked the site’s “robots.txt” file for rate limits prior to scraping.
  2. See the UIL rules for more details.
  3. There are many more UIL competition types than those analyzed here (including competitions for theater, band, etc.), but these are the ones for academics.
