StackExchange and CrossValidated: An Epidemiologist’s Review

[This article was first published on Confounded by Confounding » R, and kindly contributed to R-bloggers.]

This seems like as good a day as any to review CrossValidated, and the whole StackExchange constellation of websites. It’s been a month since I joined, exactly, and today I also crossed the 1,000 reputation threshold on the site. So why not give my impressions of it?

First, how I got there in the first place. I’ve been essentially learning Python and C as I go, working on my dissertation research. This, predictably, has resulted in some problems – and I happened across StackOverflow as a decidedly decent place to get answers to what were admittedly pretty rookie questions in a hurry – and at all random hours of the night. From there, I drifted over to their statistics site, CrossValidated, where I hoped I might do a little more good than just the asking of random programming questions.

Some thoughts, both good and bad, about CrossValidated and the community-driven question-and-answer concept:

The Good:

First, like it says on the tin – have a question, ask it, get an answer. And the community, hopefully a knowledgeable one, can help screen the good answers from the bad. For most statistics questions, it does a decent job of it. Some more obscure ones don’t get answered, and the farther afield you get from straight-up statistics and into applied work, the more likely a question is to go unanswered. Part of the reason I may have accumulated the reputation I have on the site in a month or so is that there were a lot of…low-hanging-fruit Epidemiology questions that had gone unanswered. But answers, when they come, are clear, well-sourced, and in my experience pretty damned good.

The sites in general are great for the kind of nagging questions about computing that come up and don’t get covered in classes and the like. Lingering questions about programming, hardware, interacting with big, scary cluster computers using command line interfaces…StackExchange is a decent place to get those kinds of questions sorted.


Some of the answers are a little off-kilter for the specific state-of-the-art in Epidemiology, but this is a pretty common phenomenon. Each field has its own conventions, its own rules and its own way of doing things. Cross-field advice is often a little off. This, in my experience, is especially true when talking to dedicated statisticians – their answers are very, very good, but they require a little bit of processing to get to a usable form.

Software questions are common on the site, and tend to get answered in R. Which is great…if you’re intending to use R. If you’re not – say, again, you’re in a more applied field where there’s a different standard, like SPSS or SAS – you’ll probably get decent theoretical answers, but software-specific questions about platforms other than R may struggle to find answers.

CrossValidated is also undeniably a somewhat more obscure site than the bigger components of the StackExchange system – which means the answers are more easily dominated by a single user or small group of them. That does suggest some vulnerability to pushing a particular agenda or two – what if all the contributing users are dirty, filthy frequentists or some such!? I haven’t seen evidence of this problem particularly, but there is a decidedly small population on the site relative to some others.

The Bad:

Some of the questions – to be blunt – will put a profound fear of the state of research in you. But contributing to the site hopefully helps with that in its own small way.

Stop by, check it out, consider posting on CrossValidated – or on any number of the other StackExchange sites. Programming, server administration, photography, bicycles…there are options for a whole slew of possible interests – and a site to suggest and build support for others, though that process is admittedly a long and hard one.

Filed under: Epidemiology, General, R, SAS, Simulation
