Machine Learning explained for statistics folk

[This article was first published on R – Dataviz – Stats – Bayes, and kindly contributed to R-bloggers].

I’m running a one-day workshop called “From Statistics To Machine Learning” in central London on 28 October, for anyone who learnt some statistics and wants to find out about machine learning methods.

I guess you might feel frustrated. There’s a lot of interest and investment in machine learning, but it’s hard to know where to start learning if you have a stats background. Most writing on the subject is either aimed squarely at experts in computer science, with a lot of weird jargon that looks almost familiar but isn’t quite (features? loss functions? out-of-bag?), or written for gullible idiots with more money than sense.

This workshop is a mixture of about 25% talks, 25% demonstration with R, and 50% small-group activities where you will consider scenarios detailing realistic data analysis needs in the public sector, the private sector and NGOs. Your task will be to discuss possible machine learning approaches and to propose an analysis plan, detailing pros and cons as well as the risks to the organisation from data problems and unethical consequences. You’ll also have time to practise communicating the analysis plan in plain English to decision-makers. With only 18 seats, it’ll be a small enough session for everyone to learn from each other and really get involved. This is not one where you sit at the back and catch up on your emails!

I’ve built this out of training I’ve done for clients who wanted machine learning and AI demystified. The difference here is that you’re expected to have some statistical know-how, so we can start talking about the strengths and weaknesses of machine learning in a rigorous way that is actually useful to you in your work straight away.

Here are some topics we’ll cover:

  • Terminology and jargon
  • Supervised and unsupervised learning
  • Ensembles, bagging and boosting
  • Neural networks, image data and adversarial thinking
  • AI and ethical concerns
  • Reinforcement and imitation learning
  • Big data’s challenges, opportunities and hype
  • Speed and memory efficiency
  • Concepts of model building such as cross-validation and feature engineering
  • Options for software, outsourcing and software-as-a-service
  • Data science workplaces combining statistical expertise with machine learning: what makes them happy and healthy

You can bring a laptop to try out some of the examples in R, but this is not essential.

It’s £110 all inclusive, which is about as cheap as I can make it with a central London venue. I’m mindful that a lot of people interested in this might be students or academics or public sector analysts, and I know you don’t exactly have big training budgets to dip into. Having said that, you can haggle with me if you’re a self-funding student or out of work or whatever.

Venue is TBC but I’ll confirm soon. If I get my way it’ll be one where you get breakfast (yes!), bottomless coffee and tea (yes!!) and a genuinely nice lunch (yes!!!), all included in the cost. Then, we can repair to the craft beer company afterwards and compare laptop stickers.

Book ’em quick here! I’ve had a lot of interest in this and it might go fast.
