A few notes about A/B testing from Jared Spool

A few notes from Jared Spool’s Twitter thread on A/B testing.

A/B testing is an effective way to use science to design and deliver deeply frustrating user experiences.

A/B testing without upfront research is just random monkeys testing random designs to see which of them does “best” against random criteria.

If drug testing were actually implemented like most A/B tests, you’d give 2 drugs to 2 groups of people and pick the “winner” by whichever group had fewer deaths.
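An aside not from the thread: even the raw “winner” comparison is statistically fragile. A minimal sketch, assuming made-up conversion counts (100/1000 vs. 115/1000) and a standard two-proportion z-test, shows that an apparent winner can be indistinguishable from noise:

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """z-statistic for the difference between two observed rates."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that A and B are identical
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Variant B "wins" on raw conversions (11.5% vs 10.0%)...
z = two_proportion_z(100, 1000, 115, 1000)
print(round(z, 2))    # → 1.08
print(abs(z) > 1.96)  # significant at the 5% level? → False
```

So a team “picking the winner” here would be acting on a difference the data cannot actually support.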

A big issue is that A/B-testing zealots confuse conversion rate optimization (CRO) with delivering great user experiences.

There are excellent reasons why users don’t want to “convert.” CRO treats those experiences as unimportant to the business.

A/B testing is not the same as understanding how users behave.

You have no idea whether the behavior you measured is what users actually wanted.

There’s a difference between someone who takes an action because that’s what they wanted and someone who takes an action they didn’t want to take.

A/B testing doesn’t tell you which it is. So, what do you really understand?

# How do you decide between 2 options that have tested well qualitatively?

Flip a coin.

The implementation costs of a coin flip are far lower than those of an A/B test.

Not comfortable flipping a coin? Then you haven’t done enough research.

# But doesn’t Amazon do a lot of A/B testing?

IMO: Amazon has done a great job of teaching us to accept mediocre user experiences in exchange for convenience.

The big case in point: can you tell me about a big, innovative UX capability Amazon e-commerce has introduced in the last 5 years?

Experimentation culture works against experience innovation, because it is fundamentally about refining existing ideas, not creating new ones.

The key thing that I think hurts orgs like Amazon is that you can’t learn why certain tests came out the way they did.

So the design efforts never get smarter about their users.

They just become blind “followers” of the test results, not knowing the reasons behind anything.

It creates a cargo cult/superstition approach to design knowledge.

“We don’t know why we do this, but we know it works.” (“Works,” that is, as far as we can tell from test results that measure only easy-to-measure business indicators.)

So, yes, for short term gains, you can justify building the experimentation culture.

But teams have found that it creates excessive design debt, which makes it hard to stay competitive in the long term.

(Amazon stays competitive through means other than the UX of their site.)

# Can it be effective?

I think it can be used effectively.

But organizations that invest heavily in it tend to produce poorly designed products and services.

It’s seen as a cheap substitute for doing the hard work. I believe it’s not the panacea everyone thinks it is.

A/B testing can be useful if you control the sample of participants so they all share the same goal, control variations in the stimuli, and can be sure you understand why one condition outperformed the other.
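One small piece of that control, added here as an illustration (not from the thread): making sure each participant is consistently assigned to a single variant, so the same person never sees both conditions. A common technique is deterministic hash-based bucketing; the function and experiment names below are hypothetical:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user: the same user always gets the
    same variant for a given experiment, with no assignment state to store."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same user gets the same variant on every call.
v1 = assign_variant("user-42", "checkout-flow")
v2 = assign_variant("user-42", "checkout-flow")
print(v1 == v2)  # → True
```

Keying the hash on both the experiment name and the user ID also keeps bucketing independent across experiments, which is part of isolating what each test actually measures.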

It’s rare to have that kind of situation.

Most teams think they are doing that, but on closer investigation you find they aren’t testing in as controlled an environment as they believe.

This leads to a false faith in the results, which often comes back to bite you later.