This is part eight of the Mobile Marketing Creatives Series. In ten episodes, we aim to provide insight and inspiration on creating thumb-stopping visuals to promote your app.

Download the Mobile Ad Creatives eBook today to read the other nine episodes in the series. The comprehensive guide includes ten core topics condensed into a practical blueprint with examples from AppAgent’s Creative Studio.
What you will learn in this episode:
– How marketability testing increases the chances of new product success
– Pros and cons of the 4 testing options mobile marketers have today
– The 10 A/B testing commandments we recommend you follow
How to run an A/B test in app stores
Store listing optimization is a mid-funnel strategy that can significantly improve the performance of your paid and organic marketing efforts. Most mobile app or game installs enter through the store page, so the conversion rate influences both organic and paid traffic.
Today, there are effectively three A/B testing options:
- Look-a-like app store pages – 3rd party, web environment
- Google Play Experiments – Android only, native
- App Store Product Page Optimization – iOS only, native
Alternatively, as a fourth option, you can use sequential testing on the App Store. Technically, it’s not an A/B test, but until recently it was the only way to improve the conversion rate on iOS.
1) Look-a-like pages

When testing with a look-a-like app store page, you direct traffic to a web page that looks identical to the real store listing. The traffic can be owned (for example, from your website) or paid (for example, from Facebook Ads).
If the user clicks the ‘Get’ button, they are forwarded to one of the following:
- the real store (that’s when the real App Store native app opens);
- a landing page; or
- a survey revealing it was only an experiment.
Why choose the look-a-like pages option for A/B testing? Because it enables marketability testing: a series of experiments for a pre-release app or game that verify market interest and refine aspects such as the product proposition, game theme, and design look and feel before the product is actually launched.
Jesse Lempiäinen from Geeklab explains: “Getting an initial sense of where your product concept and audience fit can be done as a stand-alone test between several ideas or by comparing your idea to a rival or a past effort.”
To provide you with a better sense of what can be tested, here are some suggestions from Jesse:
- Theme testing (zoo, ocean, wizards, candies…) to understand which themes your audience finds most relevant and exciting.
- Art testing is ideal for determining which art styles perform best with your target demographic.
- Feature testing offers insights into which core game mechanic performs best in terms of top-of-the-funnel metrics (e.g. merge vs match 3 vs 1010).
- Motivators testing will help you figure out the main drivers for your audience (for example, social vs mastery vs creativity).
- Unique selling proposition (USP) testing can help you to define core communications points.
Marketability tests can save you a lot of time by preventing further investment in a product that doesn’t work for the market. With solid market and design research, you can refine your product vision for an app or game within a few weeks, all before you write a single line of code.
Another bonus of using look-a-like store pages is the insights you can obtain into user behavior. You’ll get great data on heatmaps, interaction with screenshots, video view rate, and more. It’s not just for start-ups either; even established apps and games can generate valuable insights by using this approach.
Thanks to the power and availability of Google Analytics (or similar tools), you can use valid statistical methods and analyze traffic by source, learning about differences in behavior from users attracted from paid campaigns on Facebook, email newsletters or your website.
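To make this concrete, here is a minimal sketch of the kind of per-source comparison you can run on data exported from such a tool. All numbers and source names are hypothetical; the point is simply to put a confidence interval around each source’s click-through rate on the ‘Get’ button before drawing conclusions.

```python
from math import sqrt

def wilson_ci(conversions, visitors, z=1.96):
    """95% Wilson confidence interval for a conversion rate."""
    p = conversions / visitors
    denom = 1 + z ** 2 / visitors
    centre = (p + z ** 2 / (2 * visitors)) / denom
    half = z * sqrt(p * (1 - p) / visitors + z ** 2 / (4 * visitors ** 2)) / denom
    return centre - half, centre + half

# Hypothetical per-source numbers exported from a look-a-like page tool:
# (clicks on "Get", page visitors)
sources = {
    "facebook_ads": (420, 5000),
    "newsletter":   (130, 1200),
    "website":      (95, 900),
}

for name, (conversions, visitors) in sources.items():
    low, high = wilson_ci(conversions, visitors)
    print(f"{name:>12}: CR {conversions / visitors:6.1%}  (95% CI {low:.1%}-{high:.1%})")
```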
The downside of look-a-like page testing is that you have to buy users. This paid traffic will differ from the traffic acquired through store-based user acquisition (the Facebook auction pool for web links is different, just as Google UAC differs from AdWords).
Naturally, you also have to accept that you will lose some visitors who clicked the “Get” button once they reach the second step, the real app store.
While offering great insights, tools such as Geeklab aren’t free, and often come with a minimum commitment or contract length.
2) Google Play Store A/B testing
Google Play Store A/B testing is the most popular method of A/B testing store assets. It’s not only popular because it’s free, easy to set up, and great at getting you first-party store data, but mostly because it provides a real store experience that includes all channels.
Whoever lands on the store page, whether an organic user or a paid one, generates a valuable new data point for your experiment.
Google Play Store A/B tests can be localized, and your text variants are indexed. A feature we at AppAgent like to use is custom store listings that target specific groups of countries. You can easily merge, for example, all English-speaking markets, or the DACH region, and then run tests for each segment. The outcome is a larger data set that’s generated faster than running experiments in individual countries.
The limitation of this option is that it’s only available for Android. There’s also limited reporting. At a technical level, because there are no deep links, you can’t properly measure post-install events.
You also have no choice of traffic source and no reporting based on Search, Explore or Referrers.
In conclusion, Google Play Store experiments are great, but there’s one thing to keep in mind. Martin Jelinek, AppAgent’s Head of Marketing, explains the catch: “The biggest issue we have at AppAgent with Google Play experiments is the very loose statistical significance. Confidence intervals are extremely weak and often result in questionable outcomes, and the problem has been the same since 2015!”
To compensate for this, Simon Thillay, Head of ASO at AppTweak, recommends running A/B/B tests. “Basically, you create an experiment with two of the same B variants (A/B/B rather than A/B). If in this case both B samples provide similar results, you can be more confident of the impact of the uplift when implementing the variant.”
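As an illustration of the A/B/B idea, here is a minimal sketch of how you might sanity-check the two identical B samples before trusting the uplift. The numbers are hypothetical, and the two-proportion z-test is a standard statistical check, not Google’s internal method.

```python
from math import erf, sqrt

def conversion_diff_p_value(conv_1, n_1, conv_2, n_2):
    """Two-sided p-value for the difference between two observed conversion rates."""
    p1, p2 = conv_1 / n_1, conv_2 / n_2
    pooled = (conv_1 + conv_2) / (n_1 + n_2)
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical experiment results: (installs, store visitors) per variant
a  = (2300, 10000)   # control
b1 = (2550, 10000)   # first copy of variant B
b2 = (2510, 10000)   # second copy of variant B

# The two B copies should NOT differ significantly from each other (large p-value)...
print(f"B1 vs B2: p = {conversion_diff_p_value(*b1, *b2):.2f}")
# ...while each should clearly beat the control before you trust the uplift.
print(f"B1 vs A:  p = {conversion_diff_p_value(*b1, *a):.4f}")
print(f"B2 vs A:  p = {conversion_diff_p_value(*b2, *a):.4f}")
```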
Read how we evaluate results of experiments to reduce false positives in the 10 commandments section at the end of the article.
3) App store listing A/B testing
The iOS 15 update, released in late 2021, enables you to test an app’s product page with different icons, screenshots, and app previews to find out which one achieves the best results. Similar to Google Play experiments, Product Page Optimization (PPO) lets you test up to three new variants against the control variant in the native App Store environment.
Each tested version is shown to a percentage of randomly selected users in the App Store and results appear in App Analytics. You can run the test for up to 90 days and within selected countries.
Unlike Google Play experiments, you can test only visual elements such as the icon, screenshots, and app preview video; you can’t test the title, subtitle, or description. To test icons, they must be included in the app binary (see the guideline by ASO Giraffe).
In each product page optimization test, all creative assets need to go through the standard App Store review process. However, this doesn’t require a new app version release, except for the app icon, which has to be part of the app binary in order to provide a consistent user experience.

Apple uses a 90% confidence level for experiment evaluation, which is just as low as Google’s. To provide guidance on the experiment setup, you can optionally define your desired improvement in conversion rate. Your app’s existing performance data, such as daily impressions and new downloads, are then used to estimate how long the experiment will take. You can then play with the number of tested variants (with a maximum of three) and the traffic distribution to optimize for the expected outcome.
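If you want a rough sanity check of that duration estimate, the standard sample-size formula for comparing two proportions gets you in the same ballpark. The sketch below assumes 90% confidence and 80% power, and all inputs (impressions, baseline conversion rate, expected uplift) are hypothetical.

```python
from math import ceil, sqrt

def days_to_run(daily_impressions, variants, baseline_cr, relative_uplift,
                z_alpha=1.645, z_power=0.84):
    """Rough test duration from the two-proportion sample-size formula
    (z_alpha ~ 90% confidence, z_power ~ 80% power)."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_uplift)
    needed_per_variant = ((z_alpha + z_power) ** 2
                          * (p1 * (1 - p1) + p2 * (1 - p2))
                          / (p2 - p1) ** 2)
    impressions_per_variant_per_day = daily_impressions / variants
    return ceil(needed_per_variant / impressions_per_variant_per_day)

# Hypothetical app: 2,000 daily impressions, 25% baseline conversion rate,
# one treatment vs. the control, hoping to detect a 5% relative uplift.
print(days_to_run(daily_impressions=2000, variants=2,
                  baseline_cr=0.25, relative_uplift=0.05), "days")
```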
One positive thing about PPO is that it allows you to export tests in CSV, which is useful for organizing backlogs. Those are important for knowing what you’ve done in the past, and the results you have achieved. Using historic data, you can better estimate the potential impact of similar experiments, and prioritize future tests more effectively. As of now, export includes only the test name and start and stop dates. In the future, we hope it will include more data that will allow better analysis and confidence calculation.
David Pertl, Mobile Marketing Manager at AppAgent, adds an important insight: “Unlike on Google Play, a new app version release cancels an ongoing A/B test. If you are updating frequently, you’ll have to carefully plan experiments and app updates.”
This flaw is one of several issues that arose when PPO was officially released in January 2022.
Another downside is that you can’t set up and run more than one test at a time. In addition, when running a test in multiple countries (France and Germany, for example), you can’t apply a variant in one location only (just in Germany) and let the test continue running in the other (France, in this example).
Speaking of markets, you can’t segment analytical data by them if your test runs in multiple countries. The workaround involves setting up separate tests for each market in order to have clear analytics. However, testing will then take ages as traffic will be fragmented.
The early days of this new PPO feature also bring some unexpected results: “We’ve seen that two variants with similar impressions (over 100K each) and a similar conversion rate had confidence levels of 22% and 90%, which is a huge difference. On a positive note, changing the order of screenshots doesn’t require a review anymore, and instead goes live immediately (as opposed to new screenshots). The review time for new store assets without a new build is also much faster nowadays,” says Jiri Chochlik, Organic Growth Manager at Tilting Point.
To sum up, only time will tell how serious Apple is about providing developers and marketers with a solid tool to increase conversion rates and attract more users.
Until then, you can still use old-school sequential testing, where you swap one version of the store listing for another over time and measure the conversion uplift. The challenge with this approach is keeping such experiments isolated from external influences and calculating the result with the very limited data available in App Store Connect.
How to A/B test app store screenshots and icons
We’ve been running successful store experiments for over seven years and have defined a list of 10 internal commandments we follow at AppAgent. We hope that sharing this guide will help you to properly run and evaluate your A/B tests for any mobile app or game:
- Run every test for a minimum of seven days. If longer, do it for a multiple of weeks (two weeks, not 17 days, for example) to collect data from workdays as well as weekends. (No matter how quickly Google and Apple evaluate the experiment!)
- Bigger changes = clearer impact; smaller tweaks need more tests to find a winner. This is especially true for developers with low traffic and when you’re starting with experiments.
- Focus on first-impression tests, which have the biggest impact.
- Speed up your testing cadence by not over-polishing the first version of new assets.
- Consider an experiment successful only if the confidence interval is fully in green and shows at least a 5% improvement; anything less is often statistical noise (see the sketch after this list).
- If you are testing a critically important asset such as the icon or the first screenshot, consider a verification test (running the experiment again, or reversing it as a B/A test).
- If you’re low on traffic, run only two variants to get results faster and iterate on them.
- If you’ve got a higher media spend, look at the impact on paid data before/during/after the test.
- Keep an eye on day-one retention after major changes to your store presentation to verify that users’ expectations were met after the download.
- Don’t waste a single day when you can test!
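To show how we apply the “fully green and at least +5%” rule from the list above, here is a minimal evaluation sketch. It uses a standard normal approximation at 90% confidence on hypothetical numbers; it is an illustration, not the exact calculation Google or Apple performs.

```python
from math import sqrt

def evaluate_experiment(control, variant, z=1.645, min_uplift=0.05):
    """Accept the variant only if the 90% CI of the uplift is entirely positive
    AND the measured relative uplift is at least +5%.
    Each argument is a (conversions, visitors) tuple."""
    c_conv, c_n = control
    v_conv, v_n = variant
    p_c, p_v = c_conv / c_n, v_conv / v_n
    se = sqrt(p_c * (1 - p_c) / c_n + p_v * (1 - p_v) / v_n)
    diff_low, diff_high = (p_v - p_c) - z * se, (p_v - p_c) + z * se
    uplift = p_v / p_c - 1
    accepted = diff_low > 0 and uplift >= min_uplift
    # Dividing the CI on the absolute difference by the control rate gives an
    # approximate CI on the relative uplift.
    return uplift, (diff_low / p_c, diff_high / p_c), accepted

# Hypothetical experiment export: (installs, store visitors)
uplift, (low, high), ok = evaluate_experiment(control=(2300, 10000), variant=(2480, 10000))
print(f"uplift {uplift:+.1%}, 90% CI {low:+.1%} to {high:+.1%}, accepted: {ok}")
```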
Closing remarks
Over the past 2-3 years, ASO has become much more about conversion optimization than search optimization. The majority of search traffic involves brand terms and is impossible to scale. CRO, on the other hand, offers a massive lever that can impact all traffic sources.
Improving conversion from 20% to 25% can bring an app with an average of 100,000 monthly downloads an additional 300,000 users every year. Alongside the growth in free traffic, you can also significantly reduce paid user acquisition costs. Our advice: don’t leave money on the table!
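To make the arithmetic explicit: 100,000 downloads at a 20% conversion rate implies roughly 500,000 store page visitors per month; at 25%, the same visitors produce 125,000 downloads, i.e. 25,000 extra installs per month, or 300,000 per year (assuming traffic volume stays constant).

```python
# Worked version of the closing example (assumes constant store traffic)
monthly_downloads = 100_000
old_cr, new_cr = 0.20, 0.25

store_visitors = monthly_downloads / old_cr                    # ~500,000 visitors/month
extra_per_month = store_visitors * new_cr - monthly_downloads  # +25,000 installs/month
print(f"{extra_per_month * 12:,.0f} extra installs per year")  # 300,000
```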
Want to keep reading? See our guidelines on how to prevent Apple from rejecting your app store preview videos.
Mobile Ad Creatives eBook: How to Design Ads and App Store Creatives. A comprehensive guide to designing thumb-stopping visuals that will grow your user base and revenue.
📕 Learn more about industry insights and best practices by signing up for our newsletter here.
🤝 Get help with growth strategy, app marketing, user acquisition and video ads production by reaching us at hi@appagent.com.
👉 Follow us on LinkedIn, Twitter, YouTube, Slideshare, Facebook or Instagram.
💡 Join our team as Marketing Lead, Senior Art Director, or Marketing Manager.