While all businesses are under pressure to improve their marketing effectiveness, few have built the culture and customer data testing methodology needed to meet this objective. To help you approach testing from an objective and technical perspective, we have put together a framework and a sample size calculator. Together, these should help you design more credible tests that get at the heart of what you are trying to accomplish and produce better insights.
By Marc Shull
Why Does Testing Matter?
If you’re not testing, you’re flying blind. The strategic application of testing allows brands to understand the broader impact of possible changes to their communication portfolio, branding, and customer experience on a scale that enables them to learn while minimizing risk to their bottom line. The design of each test is key to generating valid results and insights that can be used to improve subsequent tests and align with overarching business objectives.
Testing Framework Basics
When designing each test, you should address each of the seven components described below before the test is executed. Consider creating a test template customized to your business that will ensure each aspect is addressed every time and can be used to communicate those aspects to all involved.
- Define the test objective(s) in quantifiable terms.
- Map out all test elements by version, including: number of touches, outbound and inbound channels to be used, timing of touches, CTAs, creative elements, audience definition, sampling methodology, and desired confidence level.
- Calculate the estimated minimum cell size for your desired confidence level, using past test results if available (click here to download our Excel-based Sample Size Calculator).
- Decide which of your standard test metrics will be used to judge success.
- Identify any custom metrics needed to measure success and verify they can be calculated. Do this before the test to establish a baseline.
- Determine desired test start and end dates.
- Define response window if different from the test dates.
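The minimum cell size step above can be sketched in code. This is a minimal example using the standard two-proportion sample size formula; the function name, the baseline/lift parameters, and the fixed z-score table (which covers only a few common confidence and power levels) are illustrative assumptions, not part of the downloadable calculator:

```python
import math

def min_cell_size(baseline_rate, min_detectable_lift, confidence=0.95, power=0.80):
    """Estimate the minimum recipients per test cell for a two-proportion test.

    baseline_rate: expected response rate of the control (e.g. 0.02 for 2%).
    min_detectable_lift: smallest relative lift worth detecting (e.g. 0.10 for +10%).
    """
    # z-scores for a two-sided test at common confidence and power levels
    z_alpha = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}[confidence]
    z_beta = {0.80: 0.842, 0.90: 1.282}[power]

    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)
    p_bar = (p1 + p2) / 2

    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# e.g. a 2% baseline response rate, detecting a 10% relative lift at 95% confidence
print(min_cell_size(0.02, 0.10))
```

Note how the required cell size grows as the baseline rate falls or the detectable lift shrinks; this is why new programs without past results need larger, more conservative samples.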
Starter List of Standard Effective Testing Metrics
While every business is different, most digital marketing can be measured with a common set of metrics that permit comparison across channels, audiences, offers, and other test aspects. To start, we recommend including funnel metrics such as delivery rate, open rate, click-through rate, conversion rate, and revenue per recipient in your standard metrics set, as appropriate.
Creating Credible Results
It is very easy to introduce bias or corrupt test results, so you will need to think objectively, and with a touch of paranoia, about how your tests are designed. Well-designed tests will give you results that are statistically valid and identify opportunities for continuous improvement. For example, how you measure new programs or audiences is often different from how you would approach existing programs or known audiences. Why? Because past results for existing programs and known audiences should provide some insight into expected response rates that you can use to calculate test group sizes. Without the benefit of past results, you will want to err on the side of caution and use larger samples.
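Checking whether a test cell genuinely outperformed the control is one concrete way to keep results statistically valid. The sketch below uses a standard two-proportion z-test; the function name and the example counts are illustrative, and it assumes cells large enough for the normal approximation to hold:

```python
import math

def lift_significant(ctrl_resp, ctrl_n, test_resp, test_n, confidence=0.95):
    """Two-proportion z-test: did the test cell's response rate differ
    from the control's by more than chance would explain?"""
    p1, p2 = ctrl_resp / ctrl_n, test_resp / test_n
    # pooled response rate under the null hypothesis of no difference
    p_pool = (ctrl_resp + test_resp) / (ctrl_n + test_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / ctrl_n + 1 / test_n))
    z = (p2 - p1) / se
    z_crit = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}[confidence]
    return abs(z) > z_crit

# 2.0% control response vs 2.6% test response, 10,000 recipients per cell
print(lift_significant(200, 10_000, 260, 10_000))
```

A small observed lift on a small cell will fail this check even if the lift is real, which is exactly why cell sizes should be planned before the test launches.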
For the best results, all test groups should be compared to a statistically valid, no-mail control group. While this may not always be possible, it is a best practice that will provide you with the most reliable results. Test and control groups should be randomly selected, meaning each individual in the overall population has the same probability of being included as any other. Selecting from the top of the file, reusing previous test/control group assignments, and even-odd splits based on fields like customer ID are all examples of bad practices that will undermine the credibility of the results.
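Random assignment is simple to get right in code. A minimal sketch, with an assumed function name and a fixed seed for reproducibility:

```python
import random

def assign_groups(population_ids, test_fraction=0.5, seed=42):
    """Randomly split a population into test and control groups.

    Shuffling the full list gives every individual the same probability
    of landing in either group -- unlike taking the top of the file or
    splitting on even/odd customer IDs.
    """
    ids = list(population_ids)
    random.Random(seed).shuffle(ids)
    cutoff = int(len(ids) * test_fraction)
    return ids[:cutoff], ids[cutoff:]   # (test, control)

test, control = assign_groups(range(10_000), test_fraction=0.5)
```

Fixing the seed makes the assignment auditable after the fact while still being random with respect to any customer attribute.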
Control and test groups should always be randomly selected, with the control group size based in part on the performance of similar tests with a similar audience, or on relevant benchmarks when no past results are available. When testing new program types or unknown audiences, an even split between test versions and the control (one test version plus control: 50%/50%; two test versions plus control: 33%/33%/33%; and so on) is usually a good starting point unless test requirements, audience size, or unacceptable risk do not allow for it. When testing changes to a successful existing program, we recommend a champion-challenger approach. With this approach, the established program retains a much larger share of the recipient audience and the challenger(s) a much smaller, but still statistically significant, share. This minimizes possible negative effects, such as a loss of expected revenue if a challenger version is less successful than the champion.
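A champion-challenger allocation can be sketched as follows. The function name, the 10% challenger share, and the audience sizes are illustrative assumptions; each challenger cell should still meet your minimum sample size:

```python
import random

def champion_challenger_split(audience_ids, challenger_share=0.10,
                              n_challengers=1, seed=7):
    """Allocate an audience between a champion and one or more challengers.

    challenger_share is the fraction of the audience given to EACH
    challenger; the champion keeps the remainder.
    """
    ids = list(audience_ids)
    random.Random(seed).shuffle(ids)
    per_challenger = int(len(ids) * challenger_share)
    groups = {}
    for i in range(n_challengers):
        groups[f"challenger_{i + 1}"] = ids[i * per_challenger:(i + 1) * per_challenger]
    groups["champion"] = ids[n_challengers * per_challenger:]
    return groups

# e.g. 100,000 recipients: champion keeps 80%, two challengers get 10% each
groups = champion_challenger_split(range(100_000), challenger_share=0.10,
                                   n_challengers=2)
```

Capping the challenger share bounds the revenue at risk if a challenger underperforms, which is the point of the approach.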
Key Things to Remember
- Do not make assumptions; let the data drive conclusions.
- Test only one element per test group. Testing more than one element will give you unreliable, muddy results.
- Even “unsuccessful” tests provide insight.
- Look at metrics throughout the conversion funnel with every test. Focusing just on the top of the funnel provides a very limited perspective that can lead to “false-negative” and “false-positive” insights.
- Bias is everywhere. Try to minimize it within individual tests and recognize it when trying to make broader comparisons.
- Build short, medium, and long-term test objectives and associated plans to keep efforts focused.
- Incorporate insights into automated programs to maximize their positive impact while minimizing ongoing effort.
If you need help developing your own test plans, contact MarketingIQ here.
Marc Shull, CEO of Marketing IQ, is a big-data expert with specific competencies in data insights and consumer privacy legislation.