There are quite a few misconceptions about determining statistical significance in a simple A/B split test. More often than not, naive online marketers say they’ll use a “rule of thumb” to decide whether to keep or optimize a landing page. That rule of thumb is typically as simple as sending 300 clicks to landing page 1 and another 300 clicks to landing page 2, then tossing whichever converts less. That’s a mistake.
The problem with using a rule-of-thumb is that you’ll either:
- Spend more money than necessary on a certain advertisement, which means testing fewer advertisements and taking much longer to find that optimum ROI.
- Or, on the other hand, pull an ad too early and never learn that the advertisement might have been profitable given enough time.
So, what we want to do is find exactly how long to run advertisements A and B to maximize profits.
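To put a rough number on “enough time,” here’s a minimal Python sketch using the standard two-proportion sample-size formula. The 3% baseline rate, 4% target rate, 5% significance level, and 80% power are all made-up assumptions; plug in your own:

```python
# Per-variant sample size for a two-proportion test.
# Baseline 3% conversion, hoping to detect a lift to 4% -- assumptions.
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Clicks needed per ad to reliably tell p1 and p2 apart."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance
    z_beta = z.inv_cdf(power)            # statistical power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

print(sample_size_per_variant(0.03, 0.04))  # ~5,301 clicks per page
```

That’s a long way from 300 clicks per page.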
How you should start a campaign
Obviously, everyone will tell you to start out with as many landing pages and advertisements as possible. “As possible” might mean ten or more, but who really does that? I’ve told myself I would, but it hasn’t happened yet. It’s not that I’m lazy, because I’m not; it’s that I’m impatient. Here’s what I do:
I create two or three of everything, not ten. The first combination (or two) is what I think will convert best. The other is completely different. I’m not talking about telling your designer, “surprise me,” because we all know that won’t work. “Completely different” usually means a spontaneous idea I believe might work but probably won’t, or an offline example of persuasion or marketing that I’ll try to apply online in some way.
After I calculate statistical significance, I’ll split test another set of ads against the winner. Typically, the results of the initial split test give me more ideas to test.
Two ways to calculate statistical significance
Confidence intervals
This is done using confidence intervals from statistics. If you never heard of confidence intervals in college, I’d imagine college isn’t helping your online advertising much anyway. Anyway, back to the subject: several analytics programs pick winning advertisements using this method. With this free tool by Split Test Accelerator (type random data into it to try it out), you can determine which landing page or ad copy would perform best, and how confident you can be in that result.
Essentially, if an advertisement wins with 95% confidence, the observed difference is real rather than random chance 19 times out of 20. If the campaign is important, I aim for at least 99% confidence.
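The tool does this math for you, but for the curious, here’s roughly what a confidence calculation looks like: a minimal Python sketch using a two-proportion z-test (one common way to compute this; I’m not claiming it’s exactly what the tool runs). The click and conversion counts below are made up:

```python
# Minimal confidence check for two ads, using made-up click/conversion
# counts. Reports how confident you can be that the rates really differ.
from math import sqrt
from statistics import NormalDist

def confidence(conv_a, clicks_a, conv_b, clicks_b):
    """Two-proportion z-test; returns confidence level as a percentage."""
    p_a, p_b = conv_a / clicks_a, conv_b / clicks_b
    p_pool = (conv_a + conv_b) / (clicks_a + clicks_b)      # pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / clicks_a + 1 / clicks_b))
    z = abs(p_a - p_b) / se
    return (2 * NormalDist().cdf(z) - 1) * 100              # two-sided

# e.g. page A: 30 conversions / 1,000 clicks; page B: 45 / 1,000
print(f"{confidence(30, 1000, 45, 1000):.1f}% confident")   # ~92.3%
```

Notice that even 1,000 clicks per page and a 50% relative lift only get you to about 93% confidence, which is exactly why the 300-click rule of thumb fails.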
A/A (or null) split-test
This is the second way to determine statistical significance. Personally, I use confidence intervals, but for completeness I’ll briefly explain what’s referred to as an A/A split test.
Typically, in an A/A test, you split test the same advertisement against itself. You make a duplicate of the advertisement and track each copy as if it were a different ad:
> Finally, you need to decide your sample size and set up the criteria for success. To decide your ultimate sample size, run a “null” test with your A/B test. The null test is really just an A/A test, where you are running the control against itself to determine where the convergence of results matches up (typically within 0.05 percent of each other, but that’s up to you)… – Mike Sack, executive VP of Inceptor
If you run an A/A test and then an A/B test at a later date, conditions will differ because the timing does. So the way to do this is to run the A/A test during the A/B test.
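Here’s a minimal sketch of that convergence check from the quote above. The counts and the 0.05% tolerance are illustrative, not prescriptive:

```python
# A/A "null" check: the two copies of the same ad should report
# near-identical conversion rates before you trust the A/B result.
def aa_converged(conv_a1, clicks_a1, conv_a2, clicks_a2, tol=0.0005):
    """True once the two identical ads' rates are within tol of each other."""
    rate_1 = conv_a1 / clicks_a1
    rate_2 = conv_a2 / clicks_a2
    return abs(rate_1 - rate_2) <= tol

# Two copies of the same ad, tracked separately:
print(aa_converged(31, 1000, 36, 1000))      # False -- keep running
print(aa_converged(300, 10000, 297, 10000))  # True -- A/B result usable
```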
In summary, I know most of you aren’t going to use the A/A test, so just use confidence intervals through a tool or analytics program. Remember, you’re wasting valuable time if you’re not using statistical significance in your testing.
PS: If you know who came up with the Hathaway shirt ad, you’re awesome.