Here’s a question we get a lot:
“Is it okay to run multiple tests at the same time, on a page or across pages, and if so, how should it be done to ensure valid conclusions?”
It’s a great question, but the response will be different depending on who you ask. Mathematically speaking, there are significant risks to your results (and ultimately your performance metrics) if you don’t carefully consider your options and their respective outcomes.
To begin, let’s discuss what inspires a brand to want to run tests at the same time. Marketers want answers as quickly as possible, and if there are a number of tests in the queue then the inevitable question is: “Why not run them together?” Alternatively, you may have scaled your program across multiple teams, each with its own set of tests to run.
In any case, you need to know your options and the potential problems associated with each. The following examples will illustrate your options for running simultaneous tests and then analyze the outcomes.
Suppose you want to run two tests, one on your Product Page and one on your Cart Page, where in each case you will measure the impact on purchases and revenue.
We’ll assume that each test will take 3 weeks on its own. You essentially have 4 options to choose from, each with its own pros and cons:
We’ll evaluate each option and rate it as Low, Medium, or High on two criteria: the speed with which it gets you answers, and the accuracy of its results.
Then you can make an informed decision as to which option is right for you.
Option #1: Run the Tests Sequentially
The strategy with this option is to make incremental changes to your site that are aimed at increasing some key metrics. With this option you will also want to implement the winner of Test 1 on your site before running Test 2. This ensures that Test 2 is measured against the site as it will actually exist once Test 1’s change is in place, so its conclusion still holds after you implement it.
Note that the order of the tests can make a difference. That is, if you change the order and run Test 2 first, implement the result, then run Test 1, the conclusions may actually be different for both tests. Why? Because the result for the second test may be influenced by the winner implemented from the first test.
A clear disadvantage of this approach is not only how long the tests will take (six weeks in our example), but also that the results can depend on the order in which the tests are run.
Here’s how we’ll rate this option:
Option #2: Run Both Tests at the Same Time
We’ve heard that some testing vendors play down the risk of running multiple tests at the same time. They advise clients to go ahead, claiming that the conclusions of each test will not be tangibly impacted by the other. That is a dangerous assumption: it presumes that whatever effect Test 1 has on Test 2, it has the same effect on every variant of Test 2.
So what’s wrong with this assumption?
It’s possible that the interactions between variants in the two tests are not uniform. In other words, one combination of experiences may perform quite differently from what either test shows on its own. Let’s look at our example...
Looking at each test individually, the conversion rates are equal between the default and the variant. You would then conclude that the default wins in both cases, since the variant or challenger did not achieve a higher conversion rate.
Test 1: Product Page Test
Test 2: Cart Page Test
However, if you look at the visitors for Test 2 who saw the Variant from Test 1, compared with the visitors for Test 2 who saw the default for Test 1, you see that the conversion rates are different. In fact, you could actually get a higher conversion rate overall if you implement the variants from Test 1 and Test 2.
This is a case where interaction effects between tests matter, and this often goes unnoticed. Nevertheless, it can have a major impact on your test conclusions.
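To make the interaction concrete, here is a minimal sketch in Python. The visitor counts are invented for illustration (they are not the article’s actual data): each test looks completely flat when analyzed on its own, yet segmenting Test 2 by which Test 1 experience the visitor saw reveals very different rates.

```python
# Hypothetical visitor counts per combination of experiences:
# (Test 1 arm, Test 2 arm) -> (visitors, conversions). Numbers invented.
cells = {
    ("default", "default"): (1000, 120),
    ("default", "variant"): (1000, 80),
    ("variant", "default"): (1000, 80),
    ("variant", "variant"): (1000, 120),
}

def marginal_rate(test_index, arm):
    """Conversion rate for one arm of one test, ignoring the other test."""
    visitors = sum(v for combo, (v, c) in cells.items() if combo[test_index] == arm)
    conversions = sum(c for combo, (v, c) in cells.items() if combo[test_index] == arm)
    return conversions / visitors

# Each test analyzed on its own looks flat: 10% vs 10% in both cases.
print(marginal_rate(0, "default"), marginal_rate(0, "variant"))  # 0.1 0.1
print(marginal_rate(1, "default"), marginal_rate(1, "variant"))  # 0.1 0.1

# But segmenting Test 2's variant by what the visitor saw in Test 1
# tells another story: 8% in one segment, 12% in the other.
for t1_arm in ("default", "variant"):
    v, c = cells[(t1_arm, "variant")]
    print(f"Test 2 variant, given Test 1 {t1_arm}: {c / v:.0%}")
```

The marginal view would conclude “no difference” for both tests, while the combination-level view shows an interaction worth acting on.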
Here’s how we would rate this option:
Option #3: Run the Tests at the Same Time by Splitting Traffic Between Them
In this case you will split traffic between the tests. So all visitors will see either a variant of Test 1 and the default for Test 2, or the default for Test 1 and a variant for Test 2.
Though you can avoid misleading interaction effects between variants by separating the traffic, this approach has a blind spot. If the winner of both Test 1 and Test 2 turns out to be the variant (not the default), we have no idea how visitors will behave when they see the variants from both tests together, since no visitor traffic saw that combination during testing.
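One common way to implement this kind of mutually exclusive split is deterministic hashing of a visitor ID, so each visitor always lands in the same single test. This is a sketch of the general technique, not any particular platform’s API; the salt string is an arbitrary placeholder.

```python
import hashlib

def bucket(visitor_id: str, salt: str = "exclusive-split") -> str:
    """Deterministically place each visitor into exactly one of the two tests."""
    digest = hashlib.sha256(f"{salt}:{visitor_id}".encode()).hexdigest()
    return "test_1" if int(digest, 16) % 2 == 0 else "test_2"

# A visitor bucketed into test_1 sees Test 1's arms with Test 2 held at its
# default, and vice versa -- which is exactly why no visitor ever sees the
# variants from both tests at once.
assignment = bucket("visitor-12345")
```

Because the hash is deterministic, a returning visitor gets the same assignment on every page view, which keeps the two test populations cleanly separated for the life of the tests.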
With this in mind we rate this option in the following way:
Option #4: Combine the Tests and Run them as a Multivariate Test
The final option is to combine the tests and run them as a single multivariate test. Multivariate tests can be set up for tests running on a single page or across multiple pages, depending on the capability of your optimization platform, of course. With a multivariate test you can then test all possible experiences together, which in our example would be four:
- Default and Default
- Default and Variant
- Variant and Default
- Variant and Variant
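The four experiences above are just the full factorial of the two tests, and analyzing them is a matter of comparing combination-level rates. A small sketch, using conversion rates invented purely for illustration:

```python
from itertools import product

# Hypothetical per-experience conversion rates, invented for illustration.
rates = {
    ("default", "default"): 0.100,
    ("default", "variant"): 0.095,
    ("variant", "default"): 0.102,
    ("variant", "variant"): 0.118,
}

# A 2x2 multivariate test enumerates every combination of the two changes;
# a third two-arm test would grow this to 8 experiences, and so on.
experiences = list(product(("default", "variant"), repeat=2))

# The winner is simply the best-performing combination.
best = max(experiences, key=rates.get)
```

With these made-up numbers the winning experience pairs both variants, which is precisely the combination the other options either mismeasure (Option #2) or never show to anyone (Option #3).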
The advantages of a multivariate test are that you can find out which combination of experiences performs best and how much each individual change contributes to your key metrics.
These insights enable you to identify the best options to implement for your site. For example, you may find that one small change contributes the most to your key metrics. You can then implement one small change for a greater ROI.
In addition, multivariate tests can be optimized: underperforming experiences are automatically removed over time, narrowing traffic toward a winning combination. This can reduce the number of experiences tested as the test runs.
[Note: See our blog article where we discuss more about multivariate testing]
This option is the most accurate way to run the tests, as visitors are exposed to every combination of variants systematically, but it could take 6 weeks or more to reach a conclusion.
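The longer timeline follows from simple arithmetic: with fixed traffic, each additional experience needs its own share of visitors. A back-of-the-envelope sketch, where the weekly traffic and per-experience sample-size figures are assumptions for illustration, not numbers from the article:

```python
def weeks_to_conclude(weekly_visitors: int, experiences: int,
                      samples_per_experience: int) -> float:
    """Rough duration estimate: every experience needs its own slice of traffic."""
    return experiences * samples_per_experience / weekly_visitors

# Illustrative figures only: 10,000 visitors/week, 15,000 samples per experience.
simple_ab = weeks_to_conclude(10_000, 2, 15_000)  # 3.0 weeks for one A/B test
mvt_2x2 = weeks_to_conclude(10_000, 4, 15_000)    # 6.0 weeks for the 2x2 MVT
```

Doubling the number of experiences from 2 to 4 doubles the runtime under these assumptions, which matches the jump from 3 weeks per test to roughly 6 weeks for the combined multivariate test.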
When it comes to running tests concurrently, you’ll want to weigh the options; some tests may influence each other more than others. We have summarized the different options and the points that you should consider, and we’ve rated them based upon their speed to get answers and the accuracy of their results.
Knowing the options and the risks for each is vitally important.
Stay tuned for an in-depth discussion on the CXO blog about going even further by running AA/BB tests.
[This piece was written in collaboration with Mark Buckallew]