Reverse Engineering A/B Tests


Can you spot the difference?
Can you spot the differences?

One of the fun things I like to do when I have some free time is to try to learn from A/B tests other companies are running. My thinking is, "Why run a test when you don't have to?" There might be something you can learn and adapt to your business, or it might spark some new thinking.

(I've written down a few thoughts on the subject here, but I sincerely invite all the readers of this post to comment and expand on this topic. This article is only scratching the surface.)

If you call up your buddy at the company, they might tell you what they're testing. Or they might not. But there's another way, too. Just go to their website. Refresh. Clear cookies. Screenshot each view. Return later and see which one they stuck with.

For simplicity, let's stick with home page testing. There are three scenarios you'll run into (a rough sketch for automating this check follows the list).

  1. The content changes every time you refresh the page.
  2. The content changes only when you clear your cookies and revisit.
  3. The content doesn't change.
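If you'd rather not do all that refreshing and cookie-clearing by hand, the same decision procedure can be sketched in a few lines of Python. This is only a rough sketch under loose assumptions: the URL is a placeholder, and hashing raw HTML will treat per-request noise (timestamps, session tokens) as a "change", so in practice you'd want to normalize the markup first, or fall back to comparing screenshots as described above.

    import hashlib
    import requests

    # Placeholder target; substitute the home page you're probing.
    URL = "https://www.example.com/"

    def page_fingerprint(session):
        """Fetch the page and hash its body so versions can be compared."""
        resp = session.get(URL, timeout=10)
        resp.raise_for_status()
        return hashlib.sha256(resp.content).hexdigest()

    def classify_home_page(refreshes=5, fresh_visitors=5):
        # Scenario 1: content changes on refresh within one cookied session.
        session = requests.Session()
        if len({page_fingerprint(session) for _ in range(refreshes)}) > 1:
            return "changes on every refresh (e.g., a rotating marquee)"

        # Scenario 2: sticky per visitor, but fresh (cookie-less) visitors
        # get different versions: the signature of an A/B test.
        if len({page_fingerprint(requests.Session()) for _ in range(fresh_visitors)}) > 1:
            return "sticky per visitor, varies across visitors (likely an A/B test)"

        # Scenario 3: nothing changed either way.
        return "static (no test visible to this visitor)"

    print(classify_home_page())

A fresh requests.Session() starts with an empty cookie jar, which is the programmatic equivalent of clearing your cookies and revisiting.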

Not Testing

Obviously, that last scenario means they're not running any tests on that page at the moment. Or at least not any tests that you personally would be eligible for. Whenever we set up tests, we determine whether we want everyone eligible, or just a subset of the visitors. So we might, for example, only be testing on traffic from certain search keywords.

Here are a few websites that were apparently not running any home page tests when I decided to visit.

Williams-Sonoma:

Williams-Sonoma Home Page (Static)


VistaPrint:

VistaPrint Home Page (Static)

These were their homepages no matter how much I refreshed or cleared my cookies.

A/B Tests Live

If I refresh and get the same content, but when I clear my cookies I get different content, they're running a website optimization test of some sort.

Here's Netflix, which was running an A/B test when I visited.


Netflix Home Page Version A

Netflix Home Page Version B

Netflix is trying two versions of the bottom panel. The company offers two ways to watch movies: get a DVD by mail or stream them over the Internet. In this test they're trying to determine which layout and iconography compels more visitors to sign up.

I returned a few weeks later and was still getting these same two versions, so I suppose they hadn't determined a winner yet. I also got this next version just once.


Netflix Home Page Version C

Maybe Testing, Maybe Not

Interestingly, quite a number of big, well-respected companies have something else going on altogether. The home page content changes every few seconds automatically, or it changes when you refresh the browser window. Clearing your cookies does nothing; you still get this "rotating" marquee. Here's Adobe.


Adobe Home Page, Marquee Rotation 1

Adobe Home Page, Marquee Rotation 2

Adobe Home Page, Marquee Rotation 3

Adobe is using the marquee position to rotate in three different product lines. Each is its own Flash movie about the product.

Here's Apple's home page, also rotating three products in the marquee area:


Apple Home Page, Marquee Rotation 1

Apple Home Page, Marquee Rotation 2

Apple Home Page, Marquee Rotation 3

And here's Microsoft's home page.


Microsoft Home Page, Marquee Rotation 1

Microsoft Home Page, Marquee Rotation 2

Microsoft Home Page, Marquee Rotation 3

Are these companies running tests to see which marquee performs better for the company as a whole? It's possible, but doubtful. Here's why:

  • Because the content rotates, changes upon refresh, or can be fast-forwarded or reversed by the user, many visitors are going to see all the treatments. So you'd be left concluding that the order in which they saw them mattered for whatever outcomes you observe. Does viewing order matter so much that you could regularly pick it up in a test? I doubt that in most cases it would have that significant an impact, but I'm happy to be proven wrong.
  • If the user can self-select their way into or out of content, then the causality you're running your tests for gives way to mere correlation. Perhaps prospects predisposed to purchasing something (anything) are also predisposed to clicking a lot and viewing a lot of site content. Then the fact that they saw all the rotating marquees, and even clicked on them, doesn't really prove the marquee helped to convert. It only shows there's a correlation between purchasing and clicking.
  • Most testing tools don't focus much on pathing or time order. That's the domain of clickstream tools, not testing tools. You could rig any tracking system to do what you like, but I doubt all these companies customized their testing apps to do this.

More likely than A/B testing, I'd guess this is old-fashioned negotiation. These companies are rotating content because they're huge companies, with lots of product lines, all fighting for home page placement. Instead of saying "no" to the 2nd and 3rd runners-up for placement, or forcing all the players into accepting only tiny, unflattering placements in a cramped space, this solution arose. Each business unit gets a nice big piece of real estate and can tell a good little story inside its Flash movie.

It certainly seems to be the new norm. Here are the questions I have. If you know the answers, please comment below.

  • Was this general solution space initially tested? Did these initial tests show that overall dollars spent are greater when the marquee rotates product lines? Is this a great economic solution as well as a political one?
  • Is three the magic number? Most companies seem to rotate three offerings. If there's data to back up this number, does it show that having more than three is suboptimal? Or perhaps it's the political dynamics that force three, with the top dogs not wanting their air time diluted to much less than one-third?
  • Rotating marquee aside, why doesn't the rest of the home page appear to be the subject of A/B testing?

Doing a bit of reverse engineering on these sites didn't give me all the answers. It did give me some good questions - a head start for where to look and what to look for.

It is time-consuming to try to reverse engineer what other companies are testing. While I do sometimes get some fresh insights, perhaps the most gratifying part for me is seeing how few companies are actually actively testing. Gratifying only because those that do (which includes any company I work for) are at that much more of a competitive advantage.
