Mara Averick is a non-profit data nerd, NBA stats junkie, and most recently, tidyverse developer advocate at RStudio. She is the voice behind two very popular Twitter accounts, @dataandme and @batpigandme. Mara and I discussed sports analytics, how attending a cool conference can change the approach to your career, and how she uses Twitter as a mechanism for self-imposed forced learning.
KO: What is your name, job title, and how long have you been using R? [Note: This interview took place in May 2017. Mara joined RStudio as their tidyverse developer advocate in November 2017.]
MA: My name is Mara Averick, I do consulting, data science, I just say “data nerd at large” because I’ve seen those Venn diagrams and I’m definitely not a data scientist. I used R in high school for fantasy basketball. I graduated from high school in 2003, and then in college used SPSS, and I didn’t use R for a long time. And then I was working with a company that does grant proposals for non-profits, doing all of the demand- and outcome-analysis and it all was in Excel and I thought, we could do better – R might also be helpful for this. It turns out there’s a package for American Community Survey data in R (acs), so that was how I got back into R.
KO: How did you find out about R when you first started using it in high school?
MA: I honestly don’t remember. I didn’t even use RStudio until two years ago. I think it was probably from other fantasy nerds?
KO: Is there an underground R fantasy basketball culture?
MA: Well R for fantasy football is legit. Fantasy Football Analytics is all R modeling.
KO: That’s awesome – so now, do you work with sports analytics? Or is that your personal project/passion?
MA: A little bit of both, I worked for this startup called Stattleship (@stattleship). Because I’ll get involved with anything if there’s a good pun involved… and so we were doing sports analytics work that kind of ended up shifting more in a marketing direction. I still do consulting with the head data scientist [Tanya Cashorali] for that [at TCB Analytics]. Some of the analysis/consulting will be with companies who are doing either consumer products for sports or data journalism stuff around sports analytics.
KO: How often do you use R now?
MA: Oh, I use R like every day. I use it… I don’t use Word any more. [Laughter] Yeah so one of the things about basketball is that there are times of the year where there are games every day. So that’s been my morning workflow for a while – scraping basketball data.
KO: So you get up every morning and scrape what’s new in basketball?
MA: Yeah! So I end up in RStudio bright and early (often late, as well).
KO: So is that literally what the first half hour of your day looks like?
MA: No, so incidentally that’s kind of how this Twitter thing got started. My dog has long preceded me on Twitter and the internet at large, he’s kind of an internet-famous dog, @batpigandme. There’s an application called Buffer which allows you to schedule tweets and Facebook page posts, which was most of Batpig’s traffic – Facebook page visits from Japan. And so I had this morning routine (started in the winter when I had one of those light things you sit in front of for a certain number of minutes) where I would wake up and schedule Batpig posts while I’m sitting there and read emails. And that ended up being a nice morning workflow thing.
I went to a Do Good Data conference, which is a Data Analysts for Social Good (@DA4SG) event, just over two years ago, and everyone there was giving out their Twitter handles, and I was like, oh – maybe people who aren’t dogs also use Twitter? [Laughter] So that was how I ended up creating my own account, @dataandme, independent from Batpig.
KO: What happened after you went to this conference? Was it awesome, did it inspire you?
MA: Yeah so, I was the stats person at the company I was working at. And I didn’t realize there was all this really awesome work being done with really rigorous evaluation that wasn’t necessarily federal grant proposal stuff. So I was really inspired by that and started learning more about what other people were doing, some of it in R, some of it not. I kept in touch with some of the people from that conference. And then NBA Twitter is also a thing it turns out, and NBA, R/Statistics is also a really big thing so that was kind of what pulled me in. And it was really fun. A lot of interesting projects and people that I work with were all through that [Twitter] which still surprises me – that I can read a book and tell the author something and they care? It’s weird.
KO: Everyone loves your Twitter account. How do you find and curate the things you end up posting about?
MA: I like to make arbitrary rules for myself, one of the things is I don’t tweet stuff that I haven’t read. I like to learn new things and/or I have to learn new things every day so I basically started scheduling [tweets] as a way to make myself read the things that I want to read and get back to.
KO: Wait, so you schedule a tweet and then you’re like, okay well this is my deadline to read this thing – or I’ll be a liar.
KO: Whoa that’s awesome.
MA: I’ve also never not finished a book in my life. It’s one of my rules, I’m really strict about it.
KO: That’s a lot of pressure!
MA: So that was kind of how it started out – especially because I didn’t even know all the stuff I didn’t know. Then, as I’ve used R more and more, there’s stuff that I’ve just happened to read because I don’t know what I’m doing.
KO: The more you learn the more you can learn.
MA: Yeah so now a lot of the stuff [tweets] is stuff I end up reading over the course of the day and then add it [to the queue]. Or it’s just stuff I’ve already read when I feel like being lazy.
KO: Do you have side projects other than the basketball/sports stuff?
MA: I actually majored in science and technology studies, which means I was randomly trained in ethical/legal/social implications of science. So I’m working on some data ethics projects which unfortunately I can’t talk about. And then my big side project for total amusement was this D3.js in Action analysis of Archer which is a cartoon that I watch. But that’s also how I learned really how to use tidytext. So then I ended up doing a technical review for David [Robinson] and Julia’s [Silge] book Text Mining with R: A Tidy Approach. It was super fun. So yeah, I always have a bunch of random side projects going on.
KO: How is your work-life balance?
MA: It’s funny because I like what I do. So I don’t always know where that starts and ends. And I’m really bad at capitalism. It never occurs to me that I should be paid for doing some things. Especially if it involves open data and open source – surely you can’t charge for that? But I read a lot of stuff that’s not R too. I think I’m getting sort of a balance, but I’m not sure.
KO: Switching back to your job-job now. Are you on a team, are you remote, are you in an office, what are the logistics like?
MA: Kind of all of the above. In my old job I was on a team but I was the only person doing anything data related. And I developed some really lazy habits from that – really ugly code and committing terrible stuff to git. But with this NBA project I end up working with a lot of different people (who are also basketball-stat nerds).
KO: Do you work with people who are employed by the actual NBA teams, or just people who are really interested in the subject?
MA: No, so there is an unfortunate attrition of people whom I work with when they get hired by teams – which is not unfortunate, it’s awesome, but then they can no longer do anything with us. So that’s collaborative work but I don’t work on a team anymore.
KO: So you don’t have daily stand-ups or anything.
MA: No, no. I could probably benefit from that, but my goal is never to be 100% remote. After I went to that first data conference, I felt like being around all these people who are so much smarter than I am, and know so much more than I do is intimidating, but I also learned so much. And I learned so many things I was doing, not wrong, but inefficiently. I still learn about 80 things I’m doing inefficiently every day.
KO: Do you have set beginnings and endings to projects? How many projects are you juggling at a given time?
MA: After doing federal grant proposals, it doesn’t feel like anything is a deadline compared to that. They don’t care if your house burned down if it’s not in at the right time. So nothing feels as hard and fast as that. There are certain things like the NBA that —
KO: There are timely things.
MA: Yeah, and then sometimes we’ll just set arbitrary deadlines, just to kind of get out of a cycle of trying to perfect it, which I fall deeply into. Yeah so that’s kind of a little bit of my goal right now – stop holding on to all of my projects that are not as done as I want them to be, and will never be done. With the first iteration of this Archer thing I literally spent three days trying to get this faceted bar chart thing to sort in multiple ways and was super frustrated and then I tweeted something about it and immediately David Robinson responded with precisely what I needed and would have never figured out. So I’m working on doing that more. And also because it’s so helpful to me when other people do that.
KO: How did you get hooked up with Julia and David, just through Twitter?
MA: Yeah! So Julia I’d met at Open Vis Conf, David I’d read his blog about a million lines of bad code – it was open on my iPad for like years because I loved it so much, and still do. And yeah so again as this super random Twitter-human that I feel like I am, I do end up meeting and doing things with cool people who are super smart and do really cool things.
KO: It’s impressive how much you post and not just that, but it’s really evident that you care. People can tell that this isn’t just someone who reposts a million things a day.
MA: I mean it’s totally selfish, don’t get me wrong. But I’m super glad that it’s helpful to other people too. It gives me so much anxiety to think that people might think I know how to do all the things that I post, which I don’t, that’s why I had to read them – but even when I read them, sometimes I don’t know. The R community is pretty awesome, at least the parts of it that I know, which is not universally true of any community of any group of scientists. R Twitter is super-super helpful. And that was evident really quickly, at least to me.
KO: What are some of your favorite things on the internet? Blogs, Twitter Accounts, Podcasts…
MA: I have never skipped one of Julia Silge’s blog posts. Her posts are always something that I know I should learn how to do. Both she and D-Rob [David Robinson] know their stuff and they write really well. So those are two blogs and follows that I love. Bob Rudis – almost daily, I can’t believe how quickly he churns stuff out. R-Bloggers is a great way to discover new stuff. Dr. Simon J [Simon Jackson] – I literally think of people by their Twitter handles [@drsimonj], and there are so many others.
Every day I’m amazed by all the stuff I didn’t know existed. And also there’s stuff that people wrote three or four years ago. A lot of the data vis stuff I end up finding from weird angles. So those are some of my favorites – I’m sure there are more. Oh! Thomas Lin Pedersen, Data Imaginist is his blog. There are so many good blogs. My plea to everyone who has a blog is to put their Twitter handle somewhere on it. I actually try really hard to find attribution stuff. Every now and then I get it really wrong and it’ll be someone who has nothing to do with it but who has the same name. There’s a bikini model who has the same name as someone who I said wrote a thing – which I vetted it too! I was like, well she’s multi-faceted, good for her! And then somebody was like, I don’t think that’s the right one. Oops! I have to say that that’s the one thing that Medium nailed – when you click share it gives you their Twitter handle. If you have a blog, put your Twitter handle there so I don’t end up attributing it to a bikini model.