Kaggle, the competitive predictive modeling site, conducted a survey of participants in its prediction competitions, and the 16,000 responses provide some insights about that user community. (Whether those trends generalize to the wider community of all data scientists is unclear, however.) One question of interest asked what tools Kagglers use at work. Python is the most commonly used tool within this community, and R is second. (Respondents could select more than one tool.)
Interestingly, the rankings varied according to the job title of the respondent. Either R or Python took the top spot in every job-title subgroup except one (database administrators, who preferred SQL), divided as follows:
- R: Business Analyst, Data Analyst, Data Miner, Operations Researcher, Predictive Modeler, Statistician
- Python: Computer Scientist, Data Scientist, Engineer, Machine Learning Engineer, Other, Programmer, Researcher, Scientist, Software Developer
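For readers who want to try a similar breakdown on the anonymized dataset, here is a minimal Python sketch of how a "top tool per job title" table could be computed from multi-select answers. It is not the R code from the Kaggle Kernel; the `responses` rows are made-up placeholders rather than survey data, and the function name `top_tool_by_title` is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical rows of (job title, tools selected). These are invented
# placeholders for illustration, NOT values from the Kaggle survey.
responses = [
    ("Statistician", ["R", "SQL"]),
    ("Statistician", ["R", "Python"]),
    ("Programmer", ["Python"]),
    ("Programmer", ["Python", "SQL"]),
    ("Programmer", ["R", "Python"]),
]

def top_tool_by_title(rows):
    """Return the most frequently selected tool for each job title."""
    counts = defaultdict(Counter)
    for title, tools in rows:
        # A respondent may select several tools; count each one once.
        counts[title].update(set(tools))
    return {title: c.most_common(1)[0][0] for title, c in counts.items()}

top = top_tool_by_title(responses)
print(top)
```

Because each respondent can pick more than one tool, the counts within a job title can add up to more than the number of respondents; the ranking only compares how often each tool was selected.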
You can find summaries of the other questions in the survey at the link below. An anonymized dataset of survey responses is also available, as is the “Kaggle Kernel” (a kind of notebook) of the R code behind the survey analysis.