Survey Results: What Degree is Best for Data Science?

[This article was first published on novyden, and kindly contributed to R-bloggers].

The Survey

The survey "What Degree is Best for Data Science?" ran from February 9 through March 12, 2020, asking participants 4 questions:

    • Answers about self:
      • Q1: What is the highest level of school degree you have completed?
      • Q2: Which of the following best describes the field in which you received your highest degree?
    •  Answers about best education:
      • Q3: What level of school degree you consider optimal for successful career in data science?
      • Q4: Which field of study you consider optimal for successful career in data science?

      During that period 289 respondents participated and 285 completed all 4 questions; the 4 participants with partial answers were removed from the analysis below.
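The filtering step above can be sketched as follows. This is only an illustration, not the code used for the analysis: the record layout (one dict per respondent with keys `Q1` to `Q4`), the helper name `drop_partial`, and the sample answers are all assumptions made for the example.

```python
# Hypothetical layout: one dict per respondent, None for an unanswered question.
QUESTIONS = ("Q1", "Q2", "Q3", "Q4")

def drop_partial(responses, questions=QUESTIONS):
    """Keep only respondents who answered every question.

    Returns the complete responses and the number of removed respondents.
    """
    complete = [
        r for r in responses
        if all(r.get(q) is not None for q in questions)
    ]
    return complete, len(responses) - len(complete)

# Made-up answers, for illustration only (not survey data).
sample = [
    {"Q1": "Master's", "Q2": "Statistics", "Q3": "Master's", "Q4": "Statistics"},
    {"Q1": "Bachelor's", "Q2": "Math", "Q3": None, "Q4": None},
]
complete, removed = drop_partial(sample)
print(len(complete), removed)
```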

      Though simple and short (the average completion time was 55 seconds, after removing 6 outliers who took over 500 seconds), the survey has an internal structure in which the questions group in two ways: by time and by subject. By time, the questions form 2 pairs: one about the education a participant has already acquired and the other about the education the participant recommends as best. By subject, they form 2 alternative pairs based on the answer options the questions share: the 1st and 3rd about degree, and the 2nd and 4th about field of study.
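The average completion time quoted above is a mean taken after discarding very slow respondents. A minimal sketch of that calculation, assuming completion times are available as a list of seconds; the function name, the way the cutoff is applied, and the sample timings are assumptions for illustration and not the author's code:

```python
from statistics import mean

def mean_completion_time(durations_sec, cutoff_sec=500):
    """Mean completion time after dropping respondents slower than the cutoff.

    Returns the mean of the kept durations and the number of outliers removed.
    """
    kept = [d for d in durations_sec if d <= cutoff_sec]
    if not kept:
        raise ValueError("no durations at or below the cutoff")
    return mean(kept), len(durations_sec) - len(kept)

# Made-up timings in seconds, for illustration only (not survey data).
avg, n_outliers = mean_completion_time([40, 60, 50, 900])
print(avg, n_outliers)
```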

      Answers to Each Question


      Bird’s-Eye View


      Sankey Diagrams: How Data Flows

      Sankey diagrams help visualize how answers flow through the questions. We start with pairs of related questions and finish with all 4 questions together. 
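Behind a Sankey diagram is a table of links: for each pair of adjacent questions, how many respondents gave a particular combination of answers. The sketch below shows one way to count those links in plain Python. The record layout (dicts keyed `Q1` to `Q4`), the helper names, the question order in the chain, and the sample answers are assumptions for illustration; the original diagrams may have been built differently.

```python
from collections import Counter

def sankey_links(responses, source_q, target_q):
    """Count respondents flowing from each source answer to each target answer.

    Node labels are prefixed with the question so that the same answer text
    under two different questions stays two separate nodes.
    """
    return Counter(
        (f"{source_q}: {r[source_q]}", f"{target_q}: {r[target_q]}")
        for r in responses
    )

def sankey_chain(responses, questions):
    """Links between each consecutive pair in an ordered list of questions."""
    links = Counter()
    for src, dst in zip(questions, questions[1:]):
        links.update(sankey_links(responses, src, dst))
    return links

# Made-up answers, for illustration only (not survey data).
sample = [
    {"Q1": "Master's", "Q2": "Statistics", "Q3": "Master's", "Q4": "Statistics"},
    {"Q1": "Master's", "Q2": "Math", "Q3": "Doctorate", "Q4": "Math"},
    {"Q1": "Bachelor's", "Q2": "Math", "Q3": "Master's", "Q4": "Statistics"},
]
for (src, dst), n in sankey_links(sample, "Q1", "Q3").items():
    print(src, "->", dst, n)
```

Each `(source, target, count)` triple maps directly onto the link inputs that most Sankey plotting tools expect; calling `sankey_chain(sample, ["Q1", "Q2", "Q3", "Q4"])` gives the links for a diagram spanning all 4 questions.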

      Completed Degree and Field of Study (Q1, Q2)

      Best Degree and Field of Study (Q3, Q4)

      Completed Degree vs. Best Degree (Q1, Q3)

      Completed Field vs. Best Field (Q2, Q4)

      Complete Flow of Answers For All 4 Questions

      Concluding comments

      The survey is still open, so anyone who hasn’t participated can still do so, and please let others know about it. As you may have noticed, the answers show a certain bias towards statistics. This might be because a significant share of respondents reached the survey via R-bloggers, which is popular among R users, who often have a background in statistics. Finally, people with a degree in Math are likely to suggest Math as the best field, and likewise for other fields and degrees; this sort of bias is easy to see in the Sankey diagrams above. Removing such bias from the results could be useful. I attempted this exercise, but my DIY approach proved too naive, and the methods described in the resources I found were too extensive to work through in a short period of time. If you have pointers or, even better, a method for removing such bias from answers, I’d love to hear from you.
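The tendency to recommend one's own field can at least be quantified, even if not removed. A rough sketch, assuming responses are dicts keyed `Q1` to `Q4` and that paired questions use identical answer labels; the helper name and sample answers are hypothetical, and this only measures agreement between own and recommended answers, it does not correct for the bias:

```python
def self_match_rate(responses, own_q, best_q):
    """Share of respondents whose recommended answer equals their own."""
    if not responses:
        raise ValueError("no responses")
    matches = sum(1 for r in responses if r[own_q] == r[best_q])
    return matches / len(responses)

# Made-up answers, for illustration only (not survey data).
sample = [
    {"Q2": "Math", "Q4": "Math"},
    {"Q2": "Statistics", "Q4": "Statistics"},
    {"Q2": "Math", "Q4": "Statistics"},
    {"Q2": "Physics", "Q4": "Computer Science"},
]
print(self_match_rate(sample, "Q2", "Q4"))
```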
