Estimating Analytics Software Market Share by Counting Books

[This article was first published on r4stats.com » R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Below is the latest update to The Popularity of Data Analysis Software.

Books

The number of books published on each software package or language reflects its relative popularity. Amazon.com offers an advanced search method which works well for all the software except R and the general-purpose languages such as Java, C, and MATLAB. I did not find a way to easily search for books on analytics that used such general purpose languages, so I’ve excluded them in this section. While “R” is a vague term to search for, I was able to obtain a count of its books from https://www.r-project.org/doc/bib/R-books.html.

The Amazon.com advanced search configuration that I used was (using SAS as an example):

Title: SAS -excerpt -chapter -changes -articles 
Subject: Computers & Technology
Condition: New
Format: All formats
Publication Date: After January, 2000

The “title” parameter allowed me to focus the search on books that included the software names in their titles. Other books may use a particular software in their examples, but they’re impossible to search for easily.  SAS has many manuals for sale as individual chapters or excerpts. They contain “chapter” or “excerpt” in their title so I excluded them using the minus sign, e.g. “-excerpt”. SAS also has short “changes and enhancements” booklets that the developers of other packages release only in the form of flyers and/or web pages, so I excluded “changes” as well. Some software listed brief “articles” which I also excluded. I did the search on June 1, 2015, and I excluded excerpts, chapters, changes, and articles from all searches.

The results are shown in Table 1, where it’s clear that a very small number of analytics software packages dominate the world of book publishing. SAS has a huge lead with 576 titles, followed by SPSS with 339. SAS and SPSS both have many versions of the same book or manual still for sale, so their numbers are both inflated as a result. R was the only other package with more than 100 titles, though JMP came close. Although I obtained counts on all 27 of the domain-specific (i.e. not general-purpose) analytics software packages or languages shown in Figure 2a, I cut the table off at software that had 8 or fewer books to save space.

Software        Number of Books 
SAS                  576
SPSS Statistics      339
R                    172
JMP                   97
Hadoop                89
Stata                 62
Minitab               33
Enterprise Miner      32

Table 1. The number of books whose titles contain the name of each software package.


To leave a comment for the author, please follow the link and comment on their blog: r4stats.com » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)