An Analysis of Global Shark Attacks between 1543-2016

[This article was first published on Environmental Science and Data Analytics, and kindly contributed to R-bloggers.]

Travelling to a warm country and planning to hit the beach? “Beware of the sharks!!” While the expression has almost become a cliché, shark attacks do occur, and the warming of the global ocean may make attacks more likely in waters previously deemed safe (I would rather call them interactions, but we will stick with “attacks” for this post). I feel sharks have been given a bad reputation by the media and by Hollywood movies such as Jaws and Deep Blue Sea, but just how prevalent are fatal shark attacks? One must turn to the data to find out, and Kaggle provides a fantastic dataset courtesy of the Global Shark Attack File. This dataset contains shark attack incidents for the period 1543-2016 and can be found here.

Sharks are amazing creatures that have existed in Earth’s oceans for almost half a billion years. Shark numbers are in decline due to trophy hunting, ocean pollution, fishing nets and pseudo-scientific medical applications. The fact that they are so widely feared is certainly less depressing but represents a gross misunderstanding of their behaviour. Shark “attacks” are typically a result of curiosity or a defence response. After all, Homo sapiens is a land-based species which technically has no place in the aquatic environment of the ocean. But who doesn’t like a little splash about in the sea (especially if it is the salt water of the Mediterranean)? Humans will continue to take a dip in the sea and shark attacks will no doubt continue as a result. The aim of this post is to investigate the frequency of these attacks.

The Analysis

I began my analysis with some summary statistics which revealed some interesting information. For the period 1543-2016:

  • The USA has experienced the highest number of shark attacks (2116), followed by Australia (1279).
  • The mean number of attacks per year for the USA is 4.5 and for Australia is 2.7.
  • In the USA, the state of Florida has experienced the highest number of shark attacks (990). Hawaii (282) and California (276) are also prone to shark-human interactions.
  • In Australia, New South Wales (468) and Queensland (300) lead the way.
  • New Smyrna Beach, Volusia County is the beach that has experienced the highest number of attacks (157).
  • The activity most greatly associated with shark attacks is surfing (904) followed closely by swimming (819). No surprise there!
  • The vast majority of attack subjects have been male (4829), compared with 585 female.
  • Between 1543 and 2016, there have been 1481 fatalities. This gives an average of 3.1 deaths per year globally.
  • The species of shark involved in the most attacks is the White Shark (157). There is, however, a lot of missing data for this variable (2961 missing values in the dataset).
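The per-year means in the list above can be checked directly from the reported totals. The sketch below is not the original analysis code; it assumes the period is counted inclusively (1543 to 2016, so 474 years) and simply divides each total by that span.

```python
# Totals as reported in the summary statistics above.
FIRST_YEAR, LAST_YEAR = 1543, 2016

# Assumption: both end years count, giving an inclusive span of 474 years.
n_years = LAST_YEAR - FIRST_YEAR + 1

totals = {
    "USA attacks": 2116,
    "Australia attacks": 1279,
    "Global fatalities": 1481,
}

# Mean count per year, rounded to one decimal place as in the post.
per_year = {name: round(count / n_years, 1) for name, count in totals.items()}

for name, value in per_year.items():
    print(f"{name}: {value} per year")
```

Note that these are averages over almost five centuries, most of which contain very few records, so they say little about any recent year.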

A number of questions can be asked to investigate the data further. What is the typical age of someone who has been attacked by a shark? The histogram below shows the age distribution. We see that the majority of incidents involved people aged 15-25 and 35-45. The median age is 39 years.
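Computing a median age usually means dealing with entries that are blank or not plain numbers. The sketch below shows one way to do that; the `parse_age` helper and the sample values are hypothetical and synthetic, chosen for illustration rather than taken from the dataset.

```python
import statistics


def parse_age(raw):
    """Return the age as an int, or None if the entry is not a plain number."""
    text = str(raw).strip()
    return int(text) if text.isdecimal() else None


# Synthetic entries for illustration only, not taken from the dataset.
raw_ages = ["16", "42", "", "teen", "23", "38 ", "19"]

# Drop anything that could not be parsed, then summarise what is left.
ages = [age for age in map(parse_age, raw_ages) if age is not None]
median_age = statistics.median(ages)

print(f"{len(ages)} usable ages out of {len(raw_ages)}, median {median_age}")
```

Dropping unparseable entries is a design choice: it keeps the summary simple, but the result only describes the incidents for which an age was recorded.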


Which year saw the most shark attacks? The answer is 2015. The table below shows the top years for shark attacks. The 21st century dominates. I strongly suspect that this is due to better data recording. Note, however, the presence of 1959 and 1960 in this list.


In fact, this next plot further supports my suspicion that better data recording is responsible for the observed increase in shark attacks between 1543 and 2016. Here we see what looks like an exponential increase in the number of attacks over time. This sort of data visualisation could be used incorrectly to instil fear in beach-goers. Again, I suspect this pattern is due to better data recording in the 21st century. However, the human population has exploded since the 18th century, so more humans may simply equate to more shark attack incidents, although the decline in shark numbers would likely counteract this effect.


If data accuracy and recording efficiency are indeed responsible for the shape of these data, we can extract a section of the graph covering a period when data recording was likely better. Let’s say 1950-2016. This next graph tells a different story.


The overall trend is still positive but looks much less alarming. We see that 1960 saw almost as many attacks as 2016. This comparison of plots highlights an important aspect of data interpretation and presentation. Data can be used to fool people and it is always good to analyse it thoroughly for yourself before drawing conclusions.

What is the mean number of shark attack fatalities per year for this period? This is important because the 3.1 figure mentioned earlier represents the entire 1543-2016 period and, if data quality is an issue, may be misleading. The mean number of global fatalities per year between 1950 and 2016 is 11.2. This is still minute relative to other causes of death worldwide.
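The effect of the time window on a per-year mean can be sketched as follows. This is an illustration, not the code behind the figures above: the function name `mean_fatalities_per_year` is hypothetical and the records are synthetic.

```python
def mean_fatalities_per_year(records, start, end):
    """Mean number of fatal incidents per calendar year over [start, end].

    `records` is an iterable of (year, is_fatal) pairs. Both end years are
    included, and years without any fatality still count in the denominator.
    """
    fatal = sum(1 for year, is_fatal in records if is_fatal and start <= year <= end)
    return fatal / (end - start + 1)


# Synthetic records for illustration only, not taken from the dataset.
records = [
    (1700, True),
    (1955, True),
    (1955, False),
    (1960, True),
    (2015, True),
    (2016, False),
]

full = mean_fatalities_per_year(records, 1543, 2016)    # 4 fatalities over 474 years
recent = mean_fatalities_per_year(records, 1950, 2016)  # 3 fatalities over 67 years

print(f"1543-2016: {full:.3f} per year, 1950-2016: {recent:.3f} per year")
```

Because the early centuries add many years but few records, the mean over the full period comes out lower than the mean over the recent window, which is the same pattern as the 3.1 versus 11.2 figures.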

We have looked at the number of attacks and the mean number of fatalities globally per year. If we look at the top “shark attack” beaches we see something odd. The top 6 beaches for shark attacks are not the top beaches for shark attack fatalities. Compare the two tables below. The top locations for fatalities are on top while the top locations for attacks are below.
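A comparison of this kind can be sketched by counting incidents and fatalities per location and checking whether the two top lists overlap. The incidents and place names below are synthetic and hypothetical; only the counting approach is the point.

```python
from collections import Counter

# Synthetic (location, is_fatal) incidents; the place names are hypothetical.
incidents = [
    ("Beach A", False), ("Beach A", False), ("Beach A", False), ("Beach A", False),
    ("Beach B", False), ("Beach B", False), ("Beach B", False),
    ("Beach C", True), ("Beach C", True),
    ("Beach D", True),
]

# Count all incidents per location, then only the fatal ones.
attacks = Counter(location for location, _ in incidents)
fatalities = Counter(location for location, is_fatal in incidents if is_fatal)

TOP_N = 2
top_attacks = [location for location, _ in attacks.most_common(TOP_N)]
top_fatal = [location for location, _ in fatalities.most_common(TOP_N)]

# Locations that appear in both rankings.
overlap = set(top_attacks) & set(top_fatal)

print("Top by attacks:   ", top_attacks)
print("Top by fatalities:", top_fatal)
print("In both lists:    ", sorted(overlap))
```

An empty overlap, as in this toy example, means the busiest locations for attacks are not the deadliest ones.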


Interestingly, there have only been two recorded fatalities at the top attack locations between 1543 and 2016: Daytona Beach in 1981 and Ponce Inlet (year unknown).

The final data visualisation I want to include is the world map below. The red circles represent the number of shark attacks recorded in each part of the world between 1543 and 2016. The USA, Australia and South Africa are clearly the top 3. Make of that what you will. The Mediterranean looks pretty safe though!



Based on this historical data, the contribution of sharks to global fatalities is practically negligible. For example, the average number of deaths per year worldwide as a result of shark attacks is 3.1 for 1543-2016 and 11.2 for 1950-2016. Compare that to the 3.3 million deaths worldwide in 2012 that were attributable to alcohol consumption and the more than 480,000 deaths estimated to occur annually in the USA as a result of cigarette smoking.

If you are worried about a chance meeting with a shark, it might be wise to avoid the beaches of Volusia County. Or you could take my approach and just stay on land where we belong! Otherwise, the probability of coming across a shark as you take a dip is very low but, remember, not zero. The fear of sharks is unnecessary and we should encourage the conservation and protection of this wonderful animal.
