Julia Silge is joining us as one of our keynote speakers at EARL London 2019. We can’t wait to hear Julia’s full keynote, but until then she kindly answered a few questions. Julia shared with us what we can expect from her address – which will focus on how Stack Overflow uses R and their recent developer survey.
Hi Julia! Tell us about the Stack Overflow Developer Survey and your role at Stack Overflow.
The Stack Overflow Developer Survey is the largest and most comprehensive survey of people who code around the world each year. This year, we had almost 90,000 respondents who shared their opinions on topics including their favourite technologies, their priorities in looking for a job, and what music they listen to while coding. I am the data scientist who works on this survey, and I am involved throughout the process from initial design to writing copy about results. We have an amazing team who works together on this project, including a project manager, designers, community managers, marketers, and developers.
My role focuses on data analysis. Before the survey was fielded, I worked with one of our UX researchers on question writing, so that our expectations for data analysis were aligned, as well as using data from previous years’ surveys and our site to choose which technologies to include this year. After the survey was fielded, I cleaned and analyzed the data, created data visualizations, and wrote the text for both our developer-facing and business-facing reports.
Why did you use R to analyse the survey?
All of our data science tooling at Stack Overflow is R-centric. With our annual survey specifically, we are working with a complex dataset on a tight schedule, and the R ecosystem provides the fluent data analysis tools we need to deliver compelling results on time. From munging complicated raw data to creating beautiful visualizations to serving results via an API, R is the right tool for the job for us.
Were there results from the survey this year that came as a surprise?
This is such a rich dataset to get to work with, full of interesting things to notice! One result this year that I didn’t expect ahead of time was with our question about whether a respondent eventually wanted to move from technical work into people management. We found that younger, less experienced respondents were more likely to say that they wanted to make the switch! Once I thought about it more carefully, I came to think that those more experienced folks with an interest in managing probably had already shifted careers and were not there to answer that question anymore. Another result that was a surprise to me was just how many different kinds of metal people listen to, more than I even knew existed!
Do you see the gender imbalance improving?
Although our annual survey can inform useful and actionable conclusions, including about gender, our results don’t represent everyone in the developer community evenly. We know that people from marginalized and underrepresented groups in tech participate on Stack Overflow at lower rates than they participate in the software workforce. This means that we undersample such groups in our survey (because of how we invite respondents to the survey, mostly on our site itself). Over the past few years, we have seen incremental improvement in the proportion of responses that come from marginalized or underrepresented groups such as minority genders or minority racial/ethnic groups; we are so happy to see this because we want to hear from everyone who codes, everywhere. We believe the biggest driver of this kind of positive change is and will continue to be improving the balance of who participates on Stack Overflow itself, and we are committed to making Stack Overflow a more welcoming and inclusive platform. This kind of work can be difficult and slow, but we are in it for the long haul.
What future trends might you be able to predict from the survey?
One trend we’ve seen over the past several years that I expect to continue is the normalization of salaries for data work. Several years ago, people who worked as data scientists were extreme outliers in salary. Salaries for data scientists have started to move toward the norm for software engineering work, especially if you control for education (for example, comparing a data scientist with a master’s degree to a software engineer with a master’s degree). I don’t see this as entirely bad news, because it is associated with some standardization of data science as a role and increased industry agreement about what a data scientist is, what a data engineer is, how to hire for these roles, and what career paths might look like.
Given Python’s rise again this year, do you see this continuing? How will this affect the use of R?
Python has exhibited a meteoric rise over the past several years and is the fastest-growing major programming language in the world. Python has been climbing in the ranks of our survey over the past several years, edging past first PHP, then C#, then Java this year. It currently sits just below SQL in the ranking. I have a hard time imagining that next year more developers will say they use Python than say they use SQL! You can dig this interview up next year and point out my prediction failure if I am wrong.
In terms of R and R’s future, it’s important to note that R’s use has also been growing dramatically on Stack Overflow, both absolutely and relatively. R is now a top 10 to top 15 programming language (both in questions asked and traffic). Data technologies are in general growing a lot, and there are many factors that go into an individual or an organization deciding to embrace R, or Python, or both.
You can catch Julia and a whole host of other brilliant speakers at EARL London on 10-12 September at The Tower Hotel London.