analyze health professional shortage areas (hpsa) with r

February 11, 2013

(This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers)

a health professional shortage area (hpsa) is a geographic area, population group, or health care facility that has been designated by the united states government as having an insufficient supply of medical providers, based on certain provider-to-population ratios.  primary care physician shortage areas are usually defined as any geography with fewer than one general practitioner for every 3,500 residents.  dental and mental health shortage areas have fewer than one dentist or mental health professional per 5,000 and 30,000 residents, respectively.  states and locales have some financial incentives to receive this designation: medicare bonuses, loan repayment, etc.  the united states health resources and services administration (hrsa) holds the key to this data castle.
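to make those ratios concrete, here is a minimal sketch that checks a few made-up areas against the three thresholds quoted above.  the data.frame, its column names, and every number in it are invented for illustration.  it only restates the arithmetic in the paragraph and is not hrsa's actual designation procedure.

```r
# residents-per-provider thresholds quoted in the paragraph above
thresholds <- c( primary = 3500 , dental = 5000 , mental = 30000 )

# hypothetical areas: every value below is invented for illustration
areas <-
	data.frame(
		area = c( "a" , "b" , "c" ) ,
		discipline = c( "primary" , "dental" , "mental" ) ,
		population = c( 42000 , 18000 , 95000 ) ,
		providers = c( 10 , 4 , 3 ) ,
		stringsAsFactors = FALSE
	)

# residents per provider
areas$pop_per_provider <- areas$population / areas$providers

# fewer than one provider per `threshold` residents is the same as
# more than `threshold` residents per provider
areas$below_ratio <-
	unname( areas$pop_per_provider > thresholds[ areas$discipline ] )

print( areas )
```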

to quickly understand exactly what’s possible with this microdata, look at hrsa’s nationwide statistics table.  alright, you got me, this isn’t a complex sample survey.  but hey, merge it onto other survey data sets.  what’s that?  other data sets don’t have census tract- or minor civil division-level identifiers?  perhaps not in their public use files, but visit the data centers and you might find the geocodes you seek.  this new github repository contains three scripts:

download current hpsa table.R

identify point-in-time geographic hpsas.R

  • limit the primary care physician hpsa data to only records actively designated on the date specified
  • further limit the data to only geographic areas (and possibly certain population groups)
  • if included: create flags for each of the major population groups
  • extract and save county-, minor civil division-, and census tract-level records from this original file
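the steps in that list can be sketched on a toy table.  this is not the code from the repository: the column names (`type`, `geo_level`, `designated`, `withdrawn`), the dates, and the single `pop_group` flag are all assumptions made up for the example, and the real hrsa file has its own layout.

```r
# hypothetical hpsa records: column names and values are invented
hpsa <-
	data.frame(
		hpsa_id = 1:5 ,
		type = c( "geographic" , "population group" , "facility" , "geographic" , "geographic" ) ,
		geo_level = c( "county" , "census tract" , NA , "minor civil division" , "county" ) ,
		designated = as.Date( c( "2005-03-01" , "2009-07-15" , "2001-01-10" , "2012-06-30" , "2003-05-20" ) ) ,
		withdrawn = as.Date( c( NA , NA , NA , NA , "2010-02-01" ) ) ,
		stringsAsFactors = FALSE
	)

# the point in time of interest
as_of <- as.Date( "2011-12-31" )

# active on that date: designated on or before it and not yet withdrawn
active <-
	hpsa$designated <= as_of &
	( is.na( hpsa$withdrawn ) | hpsa$withdrawn > as_of )

# keep geographic areas and, optionally, population groups
keep_types <- c( "geographic" , "population group" )

x <- hpsa[ active & hpsa$type %in% keep_types , ]

# flag the population group records
x$pop_group <- as.numeric( x$type == "population group" )

# one data.frame per geographic level, ready to be saved separately
by_level <- split( x , x$geo_level )

print( by_level )
```

dropping `"population group"` from `keep_types` gives the geographic-only version of the same filter.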

replicate hrsa nationwide statistics.R

click here to view these three scripts

for more detail about health professional shortage areas, visit:


depending on your goals and motivations, the cartography tools at hrsa’s data n statistics pages might have all you really want.  loading the microdata into r is for the tinkerers among us.

for a mapping between minor civil divisions and other geographies, check out the missouri census data center’s geocorr12.  depending on what geographies you have available in the data set you’re merging hpsa data to, you’ll have to make some decisions about what to do when one minor civil division overlaps two or more, say, zip code tabulation areas.  geocorr12 provides an `afact` column (the allocation factor), which is just the share of the population of the geography you’re mapping from that lives in the geography you’re mapping to, expressed as a proportion between zero and one.  woo hoo.
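here is a minimal sketch of two ways to handle those overlaps with `afact`.  only the `afact` column name comes from geocorr12.  the crosswalk rows, the other column names, and the hpsa flags are invented, and which option is right depends on the data set you are merging onto.

```r
# hypothetical crosswalk in the spirit of a geocorr12 extract:
# one row per minor civil division (mcd) by zcta overlap
xwalk <-
	data.frame(
		mcd = c( "m1" , "m1" , "m2" ) ,
		zcta = c( "z1" , "z2" , "z2" ) ,
		afact = c( 0.7 , 0.3 , 1 ) ,
		stringsAsFactors = FALSE
	)

# hypothetical mcd-level hpsa flags
hpsa_mcd <-
	data.frame(
		mcd = c( "m1" , "m2" ) ,
		hpsa = c( 1 , 0 ) ,
		stringsAsFactors = FALSE
	)

# option one: assign each mcd to the single zcta
# that holds the largest share of its population
biggest <- xwalk[ order( xwalk$mcd , -xwalk$afact ) , ]
biggest <- biggest[ !duplicated( biggest$mcd ) , ]
one_to_one <- merge( hpsa_mcd , biggest[ , c( "mcd" , "zcta" ) ] )

# option two: keep every overlap and carry afact along as a weight
weighted <- merge( hpsa_mcd , xwalk )

print( one_to_one )
print( weighted )
```

option one throws away the smaller overlaps, option two keeps them but leaves you with more than one row per minor civil division to deal with downstream.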

confidential to sas, spss, stata, and sudaan users: why run a three-legged race when you can sprint like an olympian?  time to transition to r.  😀
