[This article was first published on Shifting sands, and kindly contributed to R-bloggers.]
I have been using R for some time now and can still find it frustrating to work with. Over the years I have come to the conclusion that this is primarily because the documentation is bad. I offer no actual solutions here, but thought I would try to write down exactly what I dislike about it.
The docs are more or less premised on knowing which function you wish to use. Want to find out what some argument of some specific function does? Sorted.
But, if you are new to the language or just want to check out a few different ways of doing things, the built-in documentation is not going to help.
The “phone book” type reference that just lists all functions in a package alphabetically is useless for exploring the language. This in turn makes it very hard for people to adopt R. I know this from my own experience and helping others who are new to R.
As an example, let’s look at creating an identity matrix. Pretend we are new to R and staring at the prompt.
First I look at help(matrix) and see nothing about it there. Following the links in the “See Also” section, I visit the pages for data.matrix and array, neither of which gives any hints.
Let’s try a search, typing ??identity at the prompt. A big list of functions pops up, but only two from base and one from stats. The first is called ‘dontCheck’, which probably isn’t going to make an identity matrix, so I skip that, but second up is identity.
A-ha! Let’s take a look:
“A trivial identity function returning its argument.”
All right, let’s bring up the phone book listing for everything in base. I type help(base), navigate to the index and type identity in the search box. Nope, just the two functions I saw before.
Now, I personally know the right function is diag, but how are people meant to find that out? If they have to search Google for basic things, it’s kinda hard to conclude that the documentation is good.
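For anyone who arrived here still staring at the prompt, a minimal sketch of the diag call; the size 3 and the names `I3` and `m` are just illustrative choices.

```r
# diag() given a single positive integer n returns the n x n identity matrix.
I3 <- diag(3)
print(I3)
#      [,1] [,2] [,3]
# [1,]    1    0    0
# [2,]    0    1    0
# [3,]    0    0    1

# Sanity check: multiplying by the identity leaves a matrix unchanged.
m <- matrix(1:9, nrow = 3)
print(all(m %*% I3 == m))
# [1] TRUE
```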
The second example is left as an exercise to the reader: find in the R help how to get the last element of a vector. And no, despite the helpful and intuitive name, the documentation for “[” doesn’t say.
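If you would rather skip the safari, here is a minimal sketch of two common idioms for the last element; the vector `x` and the result names are made up for illustration.

```r
x <- c(10, 20, 30, 40)

# Option 1: index with the vector's own length.
last_by_index <- x[length(x)]

# Option 2: tail() from the utils package, asking for one element.
last_by_tail <- tail(x, 1)

print(last_by_index)
# [1] 40
print(last_by_tail)
# [1] 40
```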
I dislike the actual help browser as well. I have not yet figured out how to have more than one page open at a time, and I really don’t like the idea of having lots of windows popping up everywhere. Something like tabs would be a lot more useful.
Most people are reading on the screen, which makes big blocks of text hard to read, and big blocks of text feature quite often in the help pages.
This becomes especially important when there are so many intricate details being explained. It’s easy to miss things, and many pages require careful and close readings.
Usually, once you wade through all the text, there are examples. Many of these are not clear, seemingly having been written for brevity over elucidation. It’s hard to work out what exactly is going on, again especially for people new to R.
None of the examples include the output inline, which also makes it very hard to get a feel for what they do.
This is made worse by the fact that often I find myself on a “function safari.” I know there must be a function that does what I need but I have no idea what it is called or how to find it.
It’s a huge time sink if I have to go through a bunch of help pages and manually run all the examples to figure out whether they do what I need.
The font is ugly as well. Just throwing it out there.
Having specific documentation for each function is necessary, but there should be higher level grouping of related functions.
For example, when I look at help(matrix), why does it not include links to diag, rbind, cbind, t etc., or show examples of them all that include the output when run?
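As a rough sketch of what such a grouped example with inline output could look like (the small matrix `m` and the values used are just illustrative, not taken from any help page):

```r
m <- matrix(1:6, nrow = 2)   # filled column by column
print(m)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

print(t(m))                  # transpose: rows become columns
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4
# [3,]    5    6

print(rbind(m, 7:9))         # append a row
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
# [3,]    7    8    9

print(cbind(m, 7:8))         # append a column
#      [,1] [,2] [,3] [,4]
# [1,]    1    3    5    7
# [2,]    2    4    6    8

print(diag(2))               # 2 x 2 identity matrix
#      [,1] [,2]
# [1,]    1    0
# [2,]    0    1
```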
is a good start, but it is all on one big page and not actually referenced in the help page for matrix. (Note also it does not mention diag)
One of the problems is the use of generics and the general mish-mash that is OO in R. It makes it hard to find functions that are related to working with a particular data structure or the task at hand.
For example, compare
I really feel all these things compound to be a huge weakness of R.
For me the most frustrating thing is that the documentation I am after is usually there, it’s just too hard to find and often poorly presented. There is much room for improvement of structure and presentation. As it is now it is hard to search, hard to navigate and hard to read.
I would love to see a nice HTML based reference with navigation menus and logical groupings. It is easy to provide logically structured information as well as a reference index.
Switching to something browser based would make it a lot easier to find relevant information, and would let users pick their own fonts (and font sizes) and have multiple pages open at the same time.
I am sure there are people out there thinking “but I like the way R does its docs!!” and I can’t argue with that. But I have seen enough videos on the Internet to know some people seem to like some very strange things indeed …
I could go on and on with lots more examples/things that are just a bit silly. As I said at the start, I have no real solution to offer, but I hope I have provided some useful specifics, or at least helped some people realize “it’s not you, it’s R.” I have gotten myself up to speed in a lot of languages over the years and, by far, R took the longest to feel competent with.