On the heels of a report showing the inefficacy of government-run cyber security, it’s imperative to understand the limitations of your system and model. As that article shows, in addition to bureaucratic risk the government also needs to worry about gaming-the-bureaucracy risk! Government snafus aside, data science has enjoyed considerable success in the past few years. Despite this success, models can fail in surprising ways. Last year we saw how deep neural nets for image recognition fail on noisy data.
As these examples show, a lot can be learned by breaking models. Model builders of all stripes must consider the limitations of their models; doing so should be a requisite step in the validation stage. As a fun exercise, below I present some ways to confuse models at popular web destinations. Can you figure out how a model will fail based on this behavior?
Netflix
Netflix is known for using collaborative filtering as well as matrix factorization techniques like SVD.
- Choose a genre (e.g. Movies With A Strong Female Lead)
- For each movie, alternate ranking between 1 and 5 stars
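To see why alternating ratings confuse the model, here is a minimal sketch using an invented rating matrix and plain user-user Pearson correlation, one standard collaborative-filtering similarity (Netflix's actual pipeline is far more elaborate):

```python
import numpy as np

# Toy 4-user x 8-movie rating matrix within one genre. Users 0-1 like the
# first four movies, user 2 likes the last four, and user 3 is the
# adversarial rater who alternates 1 and 5 stars down the genre list.
ratings = np.array([
    [5, 4, 5, 4, 1, 2, 1, 2],
    [4, 5, 4, 5, 2, 1, 2, 1],
    [1, 2, 1, 2, 5, 4, 5, 4],
    [1, 5, 1, 5, 1, 5, 1, 5],  # alternating 1/5 stars
], dtype=float)

# User-user Pearson correlation: the core similarity signal in
# user-based collaborative filtering.
corr = np.corrcoef(ratings)

# Similarity of the adversarial rater to the three genuine users.
print(np.round(corr[3, :3], 3))  # → [0. 0. 0.]
```

The alternating rater is uncorrelated with everyone, so the system has no neighborhood to place them in, and their ratings contribute pure noise to any latent factor fit over this matrix.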
Amazon
Amazon is known for pioneering item-to-item collaborative filtering. Make a separate purchase for each item on your shopping list. For each item, do the following:
- Choose a dimension or combination of dimensions (e.g. gender, age, department)
- Browse related (i.e. similar) items along the given dimension
- Now browse related items at the opposite end of that dimension (or something entirely unrelated)
- Add the item you actually intend to purchase to your cart
Example: Choose a baby car seat. View car seats plus related items (e.g. strollers). Now view a bunch of mobility scooters for seniors, such as the Pride 3 Wheel Celebrity X Scooter. Then add your purchase item and check out.
Alternative: If you have disposable income, actually buy the car seat and scooter and donate them to a charity afterward.
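The simplest version of "customers who viewed X also viewed Y" is a co-view count over sessions, and the recipe above pollutes it directly. A sketch with invented session data (Amazon's real signals are weighted and filtered far more carefully):

```python
from collections import Counter
from itertools import combinations

# Hypothetical browsing sessions: each is the set of items viewed together.
# The last session follows the recipe above: car seats, related strollers,
# then a deliberately unrelated mobility scooter before checkout.
sessions = [
    {"car_seat", "stroller"},
    {"car_seat", "stroller", "crib"},
    {"mobility_scooter", "walker"},
    {"car_seat", "stroller", "mobility_scooter"},  # adversarial session
]

# Co-view counts: the simplest "customers also viewed" signal.
co_views = Counter()
for s in sessions:
    for a, b in combinations(sorted(s), 2):
        co_views[(a, b)] += 1

# A single noisy session is enough to create a spurious edge
# between baby gear and mobility scooters.
print(co_views[("car_seat", "mobility_scooter")])  # → 1
print(co_views[("car_seat", "stroller")])          # → 3
```

With raw counts, one adversarial session creates a recommendation edge; real systems damp this with support thresholds and normalization, which is exactly what this exercise probes.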
The Facebook News Feed
The Facebook News Feed is notorious for changing regularly and for being somewhat opaque to outsiders; here is a narrative description of how it “works”. The short version is that various scoring models are combined with various rules to deal with outliers.
- Choose a set of dimensions (e.g. day of week, time of day, media type)
- Choose a behavior (e.g. like, hide, scroll past, stay for long time, comment)
- For given set of dimensions, perform same behavior over a fixed period of time (e.g. 15 minutes)
Example: Choose Monday + 9 AM as dimensions. Choose “stay for long time + hide” as the behavior. Do this for each item in your news feed for 30 minutes. Repeat the following week.
Bonus: Recruit your friends to follow the same algorithm, ideally in the same geographic region.
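The point of the “stay for long time + hide” combination is that it feeds contradictory signals into the scoring model. A toy linear score with hypothetical weights (not Facebook's actual model or features) shows the conflict:

```python
# Hypothetical feed-ranking weights: dwell time is a weak positive signal,
# a hide is a strong negative one. Invented for illustration only.
WEIGHTS = {"like": 3.0, "comment": 4.0, "dwell_seconds": 0.05, "hide": -10.0}

def engagement_score(events):
    """Score a story as a weighted sum of observed interaction events."""
    return sum(WEIGHTS[k] * v for k, v in events.items())

# Normal behavior: long dwell plus a like -> clearly positive.
print(engagement_score({"dwell_seconds": 60, "like": 1}))   # → 6.0

# The recipe above: stay two minutes, then hide -> the strong attention
# signal is dragged negative, leaving the model with conflicting evidence.
print(engagement_score({"dwell_seconds": 120, "hide": 1}))  # → -4.0
```

Repeated on a fixed schedule, these contradictory examples become systematic label noise for whatever model is trained on the interaction logs.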
LinkedIn
One curious feature of LinkedIn is automated skill endorsement recommendations. I often get endorsed for random skills unrelated to what I do. Presumably this works on some sort of frequent-itemset mining weighted by graph distance.
- Choose a network of related people
- Choose an unrelated skill
- Endorse everyone in the network with the same “skill”
Example: For me, I might choose all my financial quant friends and endorse them with the skill “arm wrestling”.
Alternative: Use a brand slogan as the skill, e.g. “Think Different”. This can be awkward, so try changing the initial verb to a present participle, e.g. “Thinking Different”.
Bonus: Use a brand slogan with a double entendre e.g. “Doubling Your Pleasure”.
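Assuming a naive “most frequent skill among your connections” recommender (LinkedIn's real system is undocumented; the data below is invented), the prank propagates straightforwardly:

```python
from collections import Counter

# Hypothetical endorsement data: person -> set of endorsed skills.
# After the prank, every quant in the network has "Arm Wrestling".
endorsements = {
    "alice": {"Python", "Derivatives", "Arm Wrestling"},
    "bob":   {"Python", "Risk Management", "Arm Wrestling"},
    "carol": {"Quantitative Finance", "Python", "Arm Wrestling"},
}

def suggest_skills(connections, my_skills):
    """Suggest the single most frequent skill among a member's
    connections that the member does not already have."""
    counts = Counter()
    for person in connections:
        counts.update(endorsements[person] - my_skills)
    return [skill for skill, _ in counts.most_common(1)]

# A new quant connected to all three gets the joke skill recommended.
print(suggest_skills({"alice", "bob", "carol"}, {"Python"}))  # → ['Arm Wrestling']
```

Because the fake skill now has higher support in that subgraph than any genuine skill, it wins the recommendation, which is the failure mode the exercise exposes.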
Marketing and Advertising
While there aren’t any models embedded within Google Analytics (GA) itself, many models are used to analyze web behavior based on the tracking codes attached to a URL.
- Choose a URL to link to
- Choose a unique identifier
- Replace tracking code with custom identifier
- Get people to click link
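GA campaign tracking codes are just UTM query parameters, so “replacing the tracking code” is a one-line URL rewrite. A sketch using Python's standard library (the identifier values here are invented):

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

def tag_url(url, **utm):
    """Attach UTM campaign parameters (GA's standard tracking codes)
    to a URL, replacing any existing query string."""
    parts = urlsplit(url)
    query = urlencode({f"utm_{k}": v for k, v in utm.items()})
    return urlunsplit(parts._replace(query=query))

# The recipe above: swap in a custom identifier so every click shows up
# in the site's analytics under labels you chose.
link = tag_url("https://example.com/post",
               source="carrier_pigeon",
               medium="smoke_signals",
               campaign="model_breaking")
print(link)
```

Any model downstream of the analytics data now has to cope with traffic sources and campaigns that were never defined by the site owner.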
This is a small sampling of how to identify flaws in models. Add your own ideas on how to break models in the comments!
Brian Lee Yung Rowe is Founder and Chief Pez Head of Pez.AI // Zato Novo, a conversational AI platform for guided data analysis and automated customer service. Learn more at Pez.AI.