In the previous post we discovered the Dallas Animal Services data sources (available on Dallas Open Data) and analyzed how animals get admitted to and discharged from the city shelters. We loaded actual shelter records and looked at the types of admittance, the different outcomes, and their relationships. In this post we continue the analysis by focusing on the time animals spend in the shelters and on the factors that favor or hinder the survival of dogs there. For consistency and representation, only the admission types Confiscated, Owner Surrender, and Stray and the outcomes Adoption, Died, Euthanized, Returned to Owner, and Transfer were included. The outcome Dead on Arrival was excluded from survival analysis because it preempts any outcome before the stay in the shelter begins.
Time Spent in Shelters
The distributions are bimodal with relatively fat tails, but they differ in how the major modes compare to the minor ones. As Wikipedia rightly notes, “a bimodal distribution most commonly arises as a mixture of two different unimodal distributions”, and dissecting the data by admission and outcome types opens the door to further discovery:
Whereas the former histogram used facets to plot cats and dogs separately, the latter plot switched to dodged bars to pack more information into less space. Some interesting observations:
- Confiscated admissions have a distinctly different profile, with peaks presumably attributable to legal obligations to owners;
- Confiscated has distinct bimodal distributions when outcomes are either Returned to Owner or Transfer;
- Adoption times are similar for both cats and dogs;
- Most distributions have clear unimodal profiles specific to the types of admission and outcome that vary between dogs and cats in density;
- Adoption and, to a lesser degree, Owner Surrender distributions are almost indistinguishable between cats and dogs.
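The time-in-shelter distributions discussed above come down to a simple computation: subtract the intake date from the outcome date and group the results by animal type, admission type, and outcome. The sketch below is only an illustration on made-up records; the tuple layout and the values are assumptions, not the actual Dallas Open Data schema.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical toy records (not real shelter data):
# (animal_type, intake_type, outcome_type, intake_date, outcome_date)
records = [
    ("DOG", "STRAY", "ADOPTION", date(2017, 1, 2), date(2017, 1, 12)),
    ("DOG", "STRAY", "ADOPTION", date(2017, 1, 5), date(2017, 1, 9)),
    ("DOG", "CONFISCATED", "RETURNED TO OWNER", date(2017, 2, 1), date(2017, 2, 4)),
    ("CAT", "STRAY", "ADOPTION", date(2017, 1, 3), date(2017, 1, 10)),
    ("CAT", "OWNER SURRENDER", "EUTHANIZED", date(2017, 3, 1), date(2017, 3, 1)),
]


def days_in_shelter(intake, outcome):
    """Whole days between admission and discharge (0 = same day)."""
    return (outcome - intake).days


def stays_by_group(rows):
    """Collect lengths of stay per (animal, intake type, outcome type)."""
    groups = defaultdict(list)
    for animal, intake_type, outcome_type, d_in, d_out in rows:
        groups[(animal, intake_type, outcome_type)].append(
            days_in_shelter(d_in, d_out)
        )
    return dict(groups)


stays = stays_by_group(records)
medians = {group: median(days) for group, days in stays.items()}
```

Each list in `stays` is the raw material for one histogram panel or one set of dodged bars; the medians give a quick per-group summary.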
Sankeys With Average Times
And then for dogs:
Expected Chance of Not Surviving in Shelter
For the purpose of this analysis, any outcome other than Died or Euthanized means the animal survived to leave the shelter alive (most with outcomes Adoption, Foster, Returned to Owner, or Transfer). Remember that we also excluded dogs with intake type Dead on Arrival (see introduction).
We begin with rather simple calculations: estimates of the chance of dying in the shelter given that the animal satisfies a certain condition. The plot below contains conditional probabilities of dogs (unless cats are specified) not surviving in the shelter given a certain factor at the time of admission (intake categories):
Two health conditions stand out with the highest rates, untreatable and unmanageable, while another health condition, contagious, is present in 3 of the top 4 factors.
There is one more factor, breed, which has over 200 values just for dogs. Below we display the chances of dying for the dog breeds with at least 100 recorded admissions:
Note that the probability scale differs between the two plots. Surprisingly, Chow Chow took the top spot, with the Pit Bull Terrier breeds Staffordshire, Pit Bull, Am Pit Bull Terrier, and American Staffordshire close behind.
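The conditional probabilities described above are plain ratios: among animals that share a factor at intake, the fraction whose outcome was Died or Euthanized. The following is a minimal sketch on invented rows; the factor labels and the data are hypothetical and only illustrate the calculation.

```python
from collections import Counter

# Outcomes that count as "did not survive", as defined in the text.
NOT_SURVIVED = {"DIED", "EUTHANIZED"}

# Hypothetical toy rows (not real shelter data): (intake_factor, outcome)
rows = [
    ("UNTREATABLE", "EUTHANIZED"),
    ("UNTREATABLE", "EUTHANIZED"),
    ("UNTREATABLE", "TRANSFER"),
    ("HEALTHY", "ADOPTION"),
    ("HEALTHY", "ADOPTION"),
    ("HEALTHY", "DIED"),
    ("HEALTHY", "RETURNED TO OWNER"),
]


def death_rate_by_factor(rows):
    """Estimate P(not survived | factor) as deaths / admissions per factor."""
    totals = Counter()
    deaths = Counter()
    for factor, outcome in rows:
        totals[factor] += 1
        if outcome in NOT_SURVIVED:
            deaths[factor] += 1
    return {factor: deaths[factor] / totals[factor] for factor in totals}


rates = death_rate_by_factor(rows)
```

In this toy data `rates["UNTREATABLE"]` is 2/3 and `rates["HEALTHY"]` is 1/4; sorting the dictionary by value gives the ordering used in a ranked bar plot.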
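The minimum of 100 admissions matters because rates computed from a handful of animals are unstable. A small sketch of that filter-then-rank step is shown here; the breed names, counts, and the threshold of 4 are placeholders chosen to keep the toy example short.

```python
from collections import Counter

NOT_SURVIVED = {"DIED", "EUTHANIZED"}


def breed_death_rates(rows, min_admissions):
    """Rank breeds by share of non-survivors, skipping rarely seen breeds."""
    totals = Counter(breed for breed, _ in rows)
    deaths = Counter(breed for breed, outcome in rows if outcome in NOT_SURVIVED)
    rates = [
        (breed, deaths[breed] / n)
        for breed, n in totals.items()
        if n >= min_admissions  # drop breeds with too few records
    ]
    return sorted(rates, key=lambda item: item[1], reverse=True)


# Hypothetical toy rows (not real shelter data): (breed, outcome)
rows = (
    [("BREED A", "EUTHANIZED")] * 3
    + [("BREED A", "ADOPTION")]
    + [("BREED B", "ADOPTION")] * 3
    + [("BREED B", "DIED")]
    + [("BREED C", "EUTHANIZED")]  # only one record, filtered out below
)

ranked = breed_death_rates(rows, min_admissions=4)
```

Without the threshold, "BREED C" would top the ranking with a rate of 100% based on a single animal.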
In this analysis a pet counts as having survived when discharged with any outcome other than Died or Euthanized. The time t is always measured in days since the day of admission, and all records included are for animals that were already discharged, so no animal is still in the shelter with an unknown outcome. Survival analysis accounts for censored data: subjects whose last known status is alive and for whom no later information is available. In our case every record contains an outcome, so all animals discharged alive are treated as censored at their discharge date.
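To make the censoring convention concrete, here is a minimal pure-Python sketch of the Kaplan-Meier product-limit estimator. It is not the code behind the curves in this post; the durations and event flags are invented, with an event meaning Died or Euthanized and a non-event meaning discharged alive (censored).

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] for each time t with at least one death.

    durations: days from admission to discharge
    events:    True if the animal died or was euthanized,
               False if it was discharged alive (censored)
    """
    deaths = Counter(t for t, died in zip(durations, events) if died)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the share surviving day t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored animals both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data (not real shelter records).
durations = [0, 0, 2, 3, 3, 5, 7, 7]
events = [True, False, True, False, True, False, True, False]
curve = kaplan_meier(durations, events)
```

Animals discharged alive never lower the curve themselves; they only shrink the number at risk, which makes each later death weigh more.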
Cats vs. Dogs KM Curves
The day of admission is the worst for both, but cats fare twice as badly, with 25% lost right away. Days 4 and 5 are critical for dogs, as their survival plummets on these days. After that, survival rates stabilize and follow a similar pattern.
KM Curves by Dog Intake Types
To keep further analysis focused, we include only dog records from this point on. We also exclude pets admitted as Dead on Arrival or Euthanasia Requested, since their outcomes are obvious and immediate.
KM Curves by Dog Origins
Health Conditions at Admission
It is no surprise that the survival of unhealthy animals is significantly below that of healthy ones. Also, the vast majority of dogs admitted are in unhealthy condition, which is both unsurprising and unfortunate.
There is more information about unhealthy dogs available from shelter records: treatable vs. untreatable and contagious vs. non-contagious. Unfortunately, these values reside inside a single field, so the survival curves include combinations of the health factors:
It clearly shows how each health factor reduces survival chances: from Healthy to Treatable Rehabilitable to Treatable Manageable to Unhealthy Untreatable to finally Unhealthy Untreatable Contagious.
If we extract and analyze each health factor (ignoring the rest) then these relationships become more apparent:
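Since the health factors share a single field, extracting them means splitting that field into separate flags. The sketch below assumes the field is a space-separated string of upper-case labels such as "UNHEALTHY UNTREATABLE CONTAGIOUS"; the exact spelling of the values in the shelter data is an assumption here. Whole tokens are matched rather than substrings, because "UNTREATABLE" contains "TREATABLE" and "NON-CONTAGIOUS" contains "CONTAGIOUS".

```python
def parse_health(condition):
    """Split a combined intake-condition string into two separate factors.

    The label vocabulary is hypothetical and may differ from the real field.
    """
    tokens = set(condition.upper().split())

    if "UNTREATABLE" in tokens:
        treatability = "untreatable"
    elif "TREATABLE" in tokens:
        treatability = "treatable"
    else:
        treatability = None  # e.g. healthy animals carry no such label

    if "CONTAGIOUS" in tokens:
        contagion = "contagious"
    elif "NON-CONTAGIOUS" in tokens:
        contagion = "non-contagious"
    else:
        contagion = None

    return {"treatability": treatability, "contagion": contagion}


example = parse_health("UNHEALTHY UNTREATABLE CONTAGIOUS")
```

Each extracted factor can then serve as its own grouping variable for a separate set of survival curves.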
Survival of Dogs with Chips
As of June 17, 2017, all dogs and cats four months and older in the city of Dallas must be microchipped. This relatively new regulation will likely change both the share of chipped dogs in Dallas and survival curves as observed below from 2015 through October 2017:
Still, having a dog microchipped will almost certainly keep its survival chances higher.
Dallas shelters admitted dogs of over 200 different breeds from 2015 through 2017. Among them 56 breeds appeared 100 times or more (over 95% of all admissions):
Top 4 breeds – Pit Bull, Labrador Retriever, Chihuahua, and German Shepherd – account for almost 60% of all admissions with next breed – Cairn Terrier – dropping to just under 3%. The survival curves for these 5 breeds contain almost 2/3 of all dogs admitted to Dallas shelters:
Pit Bulls suffer the worst survival rate of the 5 most admitted breeds: it drops below 50% after just over a week at the shelter. Labrador and German Shepherd reach 50% some time into the 3-week period. Smaller breeds last much better, as is evident from the Chihuahua and Cairn Terrier curves.
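Statements such as "drops below 50% after just over a week" correspond to reading the median survival time off a Kaplan-Meier curve: the first day on which the estimated survival reaches 50% or less. A small sketch follows, with the curve given as a list of (day, survival) steps; the numbers are made up for illustration.

```python
def median_survival_time(curve, level=0.5):
    """Return the first time at which survival is at or below `level`.

    curve: [(t, S(t))] sorted by t, with S non-increasing.
    Returns None if the curve never falls to `level`.
    """
    for t, survival in curve:
        if survival <= level:
            return t
    return None


# Hypothetical step curves (not estimated from real data).
steep = [(0, 0.95), (4, 0.70), (5, 0.55), (9, 0.48), (20, 0.40)]
flat = [(0, 0.90), (3, 0.80)]

steep_median = median_survival_time(steep)
flat_median = median_survival_time(flat)
```

A curve that never crosses the chosen level, like `flat`, has no defined median within the observed period, which is the case for breeds whose curves stay above 50%.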
It turns out there are more breeds closely related to Pit Bull: American Staff, Am Pit Bull Ter, and Staffordshire:
The similar pattern shared by three of the four breeds in the group differs sharply from the fourth, American Staffordshire, for reasons beyond this analysis.
In the next and final post on Dallas animal shelters we will apply the Cox proportional hazards model, a semi-parametric statistical method, to assess simultaneously the effect of several factors on survival time and outcome.