analyze health professional shortage areas (hpsa) with r

[This article was first published on asdfree by anthony damico, and kindly contributed to R-bloggers.]

a health professional shortage area (hpsa) is a geographic area, population group, or health care facility that has been designated by the united states government as having an insufficient supply of medical providers, based on certain provider-to-population ratios.  primary care physician shortage areas are usually defined as any geography with fewer than one general practitioner for every 3,500 residents.  dental and mental health shortage areas have fewer than one dentist or mental health professional per 5,000 and 30,000 residents, respectively.  states and locales have some financial incentives to receive this designation: medicare bonuses, loan repayment, etc.  the united states health resources and services administration (hrsa) holds the key to this data castle.
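as a rough illustration of the primary care rule of thumb above, here's a minimal sketch that flags made-up counties by their residents-per-provider ratio.  the data frame, its column names, and every number in it are hypothetical, and the 3,500 cutoff is only the "usually" threshold quoted above, not the full designation criteria.

```r
# hypothetical areas: resident counts and primary care provider counts
areas <-
  data.frame(
    area = c( "county a" , "county b" , "county c" ) ,
    population = c( 42000 , 9000 , 120000 ) ,
    primary_care_providers = c( 10 , 3 , 40 ) ,
    stringsAsFactors = FALSE
  )

# residents per provider
# (an area with zero providers would give Inf, which also exceeds the cutoff)
areas$residents_per_provider <-
  areas$population / areas$primary_care_providers

# fewer than one provider per 3,500 residents
# is the same as more than 3,500 residents per provider
areas$meets_shortage_ratio <- areas$residents_per_provider > 3500

# only "county a" (4,200 residents per provider) gets flagged
print( areas )
```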

to quickly understand exactly what’s possible with this microdata, look at hrsa’s nationwide statistics table.  alright, you got me, this isn’t a complex sample survey.  but hey, merge it onto other survey data sets.  what’s that?  other data sets don’t have census tract– or minor civil division-level identifiers?  perhaps not in their public use files, but visit the data centers and you might find the geocodes you seek.  this new github repository contains three scripts:

download current hpsa table.R
identify point-in-time geographic hpsas.R
  • limit the primary care physician hpsa data to only records actively designated on the date specified
  • further limit the data to only geographic areas (and possibly certain population groups)
  • if included: create flags for each of the major population groups
  • extract and save county-, minor civil division-, and census tract-level records from this original file

replicate hrsa nationwide statistics.R

click here to view these three scripts
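the point-in-time steps listed above might look something like this sketch.  it is not the code from the repository: the toy `hpsa` data frame and all of its column names (`type`, `designation_date`, `withdrawn_date`, `geography`) are assumptions made up for illustration, so expect the real hrsa file to be laid out differently.

```r
# toy stand-in for the downloaded hpsa table (hypothetical columns and values)
hpsa <-
  data.frame(
    hpsa_id = c( "1" , "2" , "3" , "4" ) ,
    type = c( "geographic" , "geographic" , "population group" , "facility" ) ,
    designation_date =
      as.Date( c( "2005-03-01" , "2011-07-15" , "2008-01-10" , "2009-06-30" ) ) ,
    withdrawn_date = as.Date( c( NA , NA , "2010-12-31" , NA ) ) ,
    geography = c( "county" , "census tract" , "county" , NA ) ,
    stringsAsFactors = FALSE
  )

# the date you care about
point_in_time <- as.Date( "2010-01-01" )

# active = designated on or before the date, and not yet withdrawn on that date
active <-
  hpsa$designation_date <= point_in_time &
  ( is.na( hpsa$withdrawn_date ) | hpsa$withdrawn_date > point_in_time )

# keep only active geographic designations
geo <- hpsa[ active & hpsa$type == "geographic" , ]

# one data frame per geographic level, ready to be saved separately
by_level <- split( geo , geo$geography )
```

with this toy input, only record "1" survives: record "2" was designated after the chosen date, and records "3" and "4" are not geographic.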

for more detail about health professional shortage areas, visit:


depending on your goals and motivations, the cartography tools at hrsa’s data n statistics pages might have all you really want.  loading the microdata into r is for the tinkerers among us.

for a mapping between minor civil divisions and other geographies, check out the missouri census data center’s geocorr12.  depending on what geographies you have available in the data set you’re merging hpsa data onto, you’ll have to make some decisions about what to do when one minor civil division overlaps two or more, say, zip code tabulation areas.  geocorr12 provides an `afact` (allocation factor) column, which is just the share of the population of the geography you’re mapping from that lives in the geography you’re mapping to.  woo hoo.
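one possible way to settle those overlaps is to hand each minor civil division to the single zip code tabulation area holding most of its population.  here's a minimal sketch of that rule, assuming `afact` is stored as a fraction between zero and one.  only the `afact` name comes from geocorr12; the other column names and all of the values are hypothetical.

```r
# hypothetical crosswalk: one row per ( minor civil division , zcta ) overlap
crosswalk <-
  data.frame(
    mcd = c( "m1" , "m1" , "m2" ) ,
    zcta = c( "z1" , "z2" , "z2" ) ,
    afact = c( 0.7 , 0.3 , 1 ) ,
    stringsAsFactors = FALSE
  )

# hypothetical hpsa flag at the minor civil division level
mcd_hpsa <-
  data.frame(
    mcd = c( "m1" , "m2" ) ,
    hpsa = c( 1 , 0 ) ,
    stringsAsFactors = FALSE
  )

# attach the hpsa flag to every overlap row
x <- merge( crosswalk , mcd_hpsa , by = "mcd" )

# within each mcd, put the largest allocation factor first..
x <- x[ order( x$mcd , -x$afact ) , ]

# ..then keep just that first row per mcd
dominant <- x[ !duplicated( x$mcd ) , c( "mcd" , "zcta" , "hpsa" ) ]

# m1 lands in z1 (70% of its population), m2 lands in z2
print( dominant )
```

a stricter alternative would keep only overlaps above some `afact` cutoff and drop the rest.  which rule makes sense depends on how much misclassification your analysis can tolerate.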

confidential to sas, spss, stata, and sudaan users: why run a three-legged race when you can sprint like an olympian?  time to transition to r.  😀
