In an increasingly competitive app market, taking a data-driven approach to optimizing the different aspects of your app is key. That applies equally to your monetization flows, paywalls, pricing models, and more. With so many variables and moving parts in your app business, understanding which change had the greatest effect can be difficult. That’s where the concept of A/B testing comes in.
At Kovalee, we’re firm believers in the power of A/B testing to determine the “winning version” of any change made to one of our partner apps. It’s a core part of the way we operate on our mission to empower app creators to reach #1. So today, let’s look at how we used mobile A/B testing to improve product processes for our partner app PepTalk - but first, we’ll dive into what A/B testing even is and how to get it right.
Put simply, A/B testing is a method used to compare two or more versions of a feature, flow, or screen in order to determine which one works better. It’s a fairly scientific approach, also known as “split testing”, whereby you divide app users into groups and set a percentage repartition for each version - 50-50 being the standard. Your new users are then randomly assigned either to the control group or to a variant.
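To make the mechanics concrete, here’s a minimal sketch in Kotlin of how a percentage repartition could be applied deterministically. The group names, the weights, and the idea of hashing the user ID are illustrative assumptions rather than a prescribed implementation - dedicated A/B testing tools handle this assignment for you.

```kotlin
// Illustrative repartition: a standard 50-50 split between control and one variant.
val repartition = listOf(
    "control" to 50,
    "variant_a" to 50
)

// Deterministically map a user ID to a group while respecting the percentages.
// Hashing the ID (instead of rolling a die on every launch) keeps each user
// in the same group for the whole experiment.
fun assignGroup(userId: String): String {
    val bucket = (userId.hashCode() and 0x7fffffff) % 100 // 0..99
    var cumulative = 0
    for ((group, percent) in repartition) {
        cumulative += percent
        if (bucket < cumulative) return group
    }
    return repartition.last().first // fallback for rounding leftovers
}
```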
After running the experiment for a predetermined amount of time, you’d then consider the performance metrics of each of the different versions and compare. This will help you see which version your target audience responds to better. Easy, right? That’s really the bottom line of why A/B testing in app development is so important - it allows you to evaluate the impact of a decision or a change before you commit to rolling it out to all your users. It helps you avoid the pitfalls of taking one step forward only to take two steps back - with A/B testing, there are no regressions. When you make changes, you measure the impact they’ve had while gaining a key understanding of user behavior.
This, in turn, helps you make the right choices in any future iteration of your app, any new features, adjustments to functionalities, and more. You enter into a fast loop of learning and iterating that will get you to the final, winning version of your app much sooner. That’s actually why we run A/B tests every two weeks for each of our apps here at Kovalee. The faster we can iterate, the more we can learn! We call these tests our “learning loops”, and they’re a fundamental pillar of the way we work.
If you’re at a point now where you’re thinking “let’s start testing”, the first thing I’ll say is hold your horses and lay your groundwork. Here’s a quick checklist for what you need to do in order for your A/B testing strategy to be worthwhile and generate useful results:
While you might think the best way to start is with setting objectives, at Kovalee we actually prefer to kick off by identifying a pain point or an opportunity. We analyze our metrics first, and only then do we define a hypothesis and set the KPIs we want to improve. We outline what it is we want to do - and that could be anything from focusing on conversion rate optimization to improving retention rates, boosting onboarding completion rate, or raising an app’s lifetime value (LTV). Doing it this way will help you track and validate the winner revealed by your A/B test results - but remember, you should only choose ONE metric at a time to validate your test. Any more than that will cause confusion. I’d recommend starting with a revenue impact metric like LTV.
You might have heard the term “segmentation” in this context. That’s basically a fancy way of saying that you need to figure out how to divide your user base based on certain traits or characteristics. We always recommend targeting new users in your A/B testing, starting with a low-volume sample size, and then rolling out progressively. You should use dedicated A/B testing tools like Firebase to randomly assign users to specific groups.
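As a rough illustration of how that assignment is consumed on the client side, here’s a sketch using Firebase Remote Config in Kotlin. The parameter name “paywall_variant” and its default value are assumptions - they’d correspond to whatever you configure when setting up the experiment in the Firebase console.

```kotlin
import com.google.firebase.remoteconfig.FirebaseRemoteConfig

// Resolve the variant Firebase has assigned to this user.
// "paywall_variant" is a hypothetical parameter key; use the key you defined
// when creating the experiment in the Firebase console.
fun resolveVariant(onVariant: (String) -> Unit) {
    val remoteConfig = FirebaseRemoteConfig.getInstance()
    remoteConfig.setDefaultsAsync(mapOf("paywall_variant" to "control"))
    remoteConfig.fetchAndActivate().addOnCompleteListener { task ->
        // Fall back to the control experience if the fetch fails.
        val variant = if (task.isSuccessful) {
            remoteConfig.getString("paywall_variant")
        } else {
            "control"
        }
        onVariant(variant)
    }
}
```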
The easiest way to keep track of your experiment is to start by only defining one parameter you want to test at a time. This will make your entire testing process cleaner and simpler to observe. For example, if you’re testing a paywall, then don’t modify your pricing at the same time. If you do that, you won’t know whether it was your new pricing or your new paywall that impacted performance. What’s more, having a detailed tracking plan for these variants will also ensure that your user behavior is properly monitored. This will help you with a number of optimizations within the user experience.
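One way to put such a tracking plan into practice - assuming you’re already using Firebase Analytics - is to attach the variant to a user property and to the events your hypothesis is about, so every result can later be broken down per variant. The property and event names below are purely illustrative.

```kotlin
import android.content.Context
import android.os.Bundle
import com.google.firebase.analytics.FirebaseAnalytics

// Attach the experiment variant to the user and to each key event so that
// results can be segmented per variant during analysis.
fun trackPaywallShown(context: Context, variant: String) {
    val analytics = FirebaseAnalytics.getInstance(context)

    // A user property lets every subsequent event be filtered by variant.
    analytics.setUserProperty("paywall_variant", variant)

    // Log the event the hypothesis is about (here, a paywall impression).
    val params = Bundle().apply { putString("variant", variant) }
    analytics.logEvent("paywall_shown", params)
}
```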
If, on the other hand, you have several hypotheses you want to test, you can run several variants of the same parameter at once. But - a word of warning - you should only do this if you have sufficient volume in your sample size. Going further and testing multiple parameters in combination is called “multivariate testing” and requires a really high degree of organization to get right. So, if you’re still a relatively small app business, I’d suggest taking it one step at a time.
It’s very exciting to get your first batch of data in. We’ve all been there and taken the first results at face value because we’ve grown impatient. But doing that could be a mistake. Coming to conclusions too soon can seriously mislead you, particularly because a trend can only be confirmed with enough volume. Before calling anything, I’d recommend relying on at least 1,000 users per group and/or 100 or more conversions. Anything below that is unlikely to have much statistical significance.
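If you want a rough way to sanity-check a result yourself, here’s a sketch of a two-proportion z-test in Kotlin that applies the volume guardrails above before comparing conversion rates. It’s a simplification at roughly 95% confidence, not a substitute for a proper A/B testing tool or statistics library.

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

// Rough significance check for a difference in conversion rate between
// control and a variant: apply the volume guardrails first, then run a
// two-proportion z-test at ~95% confidence (two-sided).
fun isLikelySignificant(
    controlUsers: Int, controlConversions: Int,
    variantUsers: Int, variantConversions: Int
): Boolean {
    // Volume guardrails: at least 1,000 users per group and 100 conversions.
    if (controlUsers < 1000 || variantUsers < 1000) return false
    if (controlConversions + variantConversions < 100) return false

    val p1 = controlConversions.toDouble() / controlUsers
    val p2 = variantConversions.toDouble() / variantUsers
    val pooled = (controlConversions + variantConversions).toDouble() /
            (controlUsers + variantUsers)
    val standardError = sqrt(pooled * (1 - pooled) *
            (1.0 / controlUsers + 1.0 / variantUsers))
    val z = (p2 - p1) / standardError

    return abs(z) >= 1.96 // critical value for ~95% confidence
}
```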
Once you start testing, it’s also important to check in on what’s going on on a daily basis. This way, you’ll start seeing some preliminary results in real-time, but really it’s a fail-safe to make sure that everything is correctly set up and running efficiently.
When you’ve reached a certain volume threshold, you can start comparing the key metrics you defined before. This will help you identify the impact each variant you tested had on user behavior, or even user engagement. Once you’ve analyzed all your results, you can define a “winner” - i.e. the variant that most positively impacted the key metric you defined in your hypothesis. Be careful when taking other metrics into account, though. For example, if you have a better conversion rate but a lower retention rate - what do you think that tells you? Well, it shows that your next step should be to iterate on the “winning” variant to understand why it improved your conversion rate so much but didn’t have as strong an impact on your retention rate.
Where possible, enrich your quantitative analysis with a qualitative one. Talk to the users in your test groups to see what worked for them and what didn’t. This will almost act like a bit of a “case study” for you and will provide deeper, more human insights to complement your data.
At Kovalee, each one of our partner apps has access to a full, personalized team of experts that help with everything - from monetization strategies to product development to UI/UX design to user acquisition. These teams dedicate themselves to our partner apps to help them reach their maximum potential in all of these fields.
So, when our partner app PepTalk was experiencing low conversion rates on the paywall shown when users wanted to play premium videos, we saw this not as an issue but as an opportunity, and we kicked off a project to optimize this flow. Here’s how it went.
PepTalk is an app dedicated to daily motivation, providing users with extraordinary video content like motivational videos, daily affirmations, inspiring quotes, and more. It’s available for both Android and iOS, and consistently boasts an App Store rating of 4.9.
It is a subscription-based app with a free version that enables users to enjoy some content before upgrading to a premium account. It also offers in-app purchases as part of its monetization model.
While the app was experiencing strong overall conversion rates, there was a steep drop-off when users hit a specific paywall. This paywall was displayed when users tried to play a premium video while still on the free membership.
In order to help PepTalk overcome its paywall issue, we decided to launch a test focusing on clicks on a premium video. Our variants were structured in the following way:
Control: This group continued using the current flow, whereby users would click on a video and see a paywall. The paywall would tell the users that they needed to subscribe to a premium account to watch the video. It looked like this:
Variant A: We introduced an intermediate screen that displayed more details about the video the user wanted to watch. This resembled what Netflix does on the homepage when your remote control hovers over a show or movie thumbnail and a snippet starts playing automatically. Then, a CTA with the word “unlock” would pop up, followed by the paywall once the user clicked the CTA.
Variant B: We had the video play for 10 seconds, at which point the paywall would pop up with the message “get premium to continue watching”.
We set up our user groups to test 15% of new users for each variant, and 70% for the control group. This way, we avoided any potential setbacks and regressions. We ran the test for 2 weeks.
After the first week of testing, we noticed a positive trend in both variants. So, we decided to roll out each variant to 33% of users to expand our testing pool and gather more indicative data. After the second week of testing, our results started to take shape.
With this increase in the size of the user base we were testing, we were able to clearly identify a “winning” variant after 2 weeks of testing. Conversion rates with Variant B overtook Variant A quite significantly, indicating that users responded better to the paywall after seeing the initial 10 seconds of the video. Those 10 seconds likely did a better job of showcasing the quality of the content, leaving users wanting to continue watching.
We then decided to roll this variant out to all users, and today, PepTalk’s conversion rate at the paywall stage remains strong and continues to grow. The paywall for premium videos now matches the high conversion rates other PepTalk paywalls are experiencing, contributing to the app’s profitability and long-term growth.
There are many A/B testing platforms and analytics tools out there. However, the engineers at Kovalee created our own, proprietary technology to help our experts determine the winning versions of A/B tests faster and more precisely. Our tool, Karbon, compares the different versions of a test element within an app against a forecast conducted by our data team. This enables our A/B tests to generate results with 95% accuracy, much faster than any other tool on the market.
While A/B testing tools normally need to run for several weeks or even months, our teams are able to get actionable results from Karbon in a fortnight or less. And because the tool can forecast the outcome of a test, our partner apps are better able to understand the long-term impact of the decisions they make.
If you’re stuck on a problem with any aspect of your app, our experts and our technology can help you. Just reach out at the link below!