
Playerless playtesting: AI and user experience evaluation

Over the past few decades, the digital games industry has taken the entertainment market by storm, transforming a niche hobby into a multi-billion-dollar business and captivating the hearts of millions along the way. Today, the once-sparse landscape is flooded with new games clamouring for recognition. Competition is especially fierce in emerging spaces, such as the mobile and free-to-play markets, where designers must provide an engaging experience or face having their games forgotten. This has placed increasing pressure on developers to pursue the utmost creative and technical precision in their releases.

Given this incredibly competitive climate, it’s little wonder that Games User Research (GUR) has emerged as a field of key commercial and academic importance. Game development, after all, demands a deep understanding of human factors, user experience (UX), and player behaviour. Gone are the days of closed-environment, single-cycle development. Modern game development is an inherently iterative process, and one that is necessarily player-centric if it intends to be successful. For developers, UX evaluation is crucial: it aims to bolster the quality of the finished product through insights drawn from analysing the actions and reactions of players within a game’s target audience.

Many different user research methodologies can be applied to the process of UX evaluation, from straightforward gameplay observation to complex experimental set-ups featuring physiological sensors, eye-tracking, and configurable recording equipment. Aggregated from several players over a given testing period, data obtained from multiple methods can be analysed to derive a range of insights, from identifying basic usability issues to providing a near-complete profile of the “average” player’s experience.
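To make the idea of aggregation concrete, here is a minimal Python sketch, not drawn from any particular GUR toolkit, of how per-player session records might be collapsed into a few headline indicators. The Session fields and the aggregate helper are illustrative assumptions standing in for whatever a real study actually logs.

```python
# A minimal sketch of aggregating playtest telemetry across players.
# The Session fields are hypothetical placeholders, not a standard schema.
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    player_id: str
    completion_time_s: float   # time to finish the level
    deaths: int                # failure count
    objectives_missed: int     # objectives never triggered
    rated_enjoyment: int       # post-session survey, 1-7 scale


def aggregate(sessions: list[Session]) -> dict:
    """Collapse per-player sessions into headline UX indicators."""
    return {
        "players": len(sessions),
        "mean_completion_time_s": mean(s.completion_time_s for s in sessions),
        "mean_deaths": mean(s.deaths for s in sessions),
        "pct_missed_an_objective": 100 * sum(s.objectives_missed > 0 for s in sessions) / len(sessions),
        "mean_enjoyment": mean(s.rated_enjoyment for s in sessions),
    }


if __name__ == "__main__":
    demo = [
        Session("p01", 412.0, 3, 1, 5),
        Session("p02", 298.5, 1, 0, 6),
        Session("p03", 530.2, 5, 2, 4),
    ]
    print(aggregate(demo))
```

In practice such summaries would sit alongside, not replace, the richer qualitative signals mentioned above.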

Practically speaking, the user evaluation process poses several open challenges for developers and researchers. Real-world user populations boast significant diversity in motivation, behaviour, and past experiences, which is astonishingly difficult to replicate when recruiting participants. Furthermore, the practice of testing early and often—while favourable to the quality of the finished product—can be prohibitively expensive and time-consuming, particularly for small- and medium-sized studios. But what if we could lessen this burden while improving our ability to conduct more representative and comprehensive testing sessions? And what if we managed to accomplish this by cutting the player, at least temporarily, out of the equation? What if we developed a UX evaluation system driven by artificial intelligence?

“Modern game development is an inherently iterative process, and one that is necessarily player-centric if it intends to be successful.”

We’re still a few years out from developing computer-controlled agents that can pick up and play any game as a human might, but AI has already overtaken human skill in many complex games. For the purposes of UX evaluation, we’re less interested in developing an ideal AI agent, and more interested in maximising its tendency to behave like a human player: complete with an imperfect memory, at-times flawed spatial reasoning, and goals that can diverge from a game’s intent based on its own simulated motivations. We then clone that agent into thousands of variants, each representing a different “human” player drawn from a population with richly diverse demographics, experience levels, and playing styles. The resulting system could test a few thousand “players” overnight, rather than perhaps a few dozen participants over the course of several weeks. And while such a system wouldn’t aim to replace current user evaluation methodologies, it might serve as a supplementary technique in the early stages of development. But how might such a framework be used in practical terms?
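As an illustration of what such a cloned population might look like in code, here is a hedged Python sketch. The Persona parameters (memory decay, navigation error, completionism, skill) and their distributions are my own assumptions, standing in for whatever behavioural model a real system would use.

```python
# A sketch of the "population of imperfect agents" idea: each persona bundles
# hypothetical parameters sampled to produce thousands of distinct simulated
# "players". Parameter names and ranges are illustrative, not a published model.
import random
from dataclasses import dataclass


@dataclass
class Persona:
    memory_decay: float      # 0 = perfect recall, 1 = forgets the map instantly
    navigation_error: float  # probability of a wrong turn at a junction
    completionism: float     # drive to chase optional objectives over the main goal
    skill: float             # raw execution ability, 0-1


def sample_population(n: int, seed: int = 42) -> list[Persona]:
    """Generate n varied personas approximating a diverse player base."""
    rng = random.Random(seed)
    return [
        Persona(
            memory_decay=rng.betavariate(2, 5),       # most players remember reasonably well
            navigation_error=rng.uniform(0.05, 0.4),  # but everyone gets lost sometimes
            completionism=rng.random(),
            skill=rng.betavariate(3, 3),              # skill clusters around the middle
        )
        for _ in range(n)
    ]


population = sample_population(2000)  # "a few thousand players overnight"
```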

Imagine you’re a developer looking to identify any glaring flaws or opportunities for optimisation in level prototypes: corridors where players might get lost, easily missed objectives, and so on. Ideally, you’d want a dozen participants to play through each level, tracking their navigation and noting any unexpected behaviour. But this is expensive, time-consuming, and hardly repeatable for every single design change. The solution? An AI-driven framework capable of standing in for human participants. You set up a population of AI “players” representing your target demographic and instruct the system to simulate a number of trials. Within a few minutes, the results of the test have been logged, and after a few more clicks, you’ve brought up an overlay showing the aggregate navigation data of a hundred different AI agents. You note some interesting insights: an area where agents have trouble finding their way out; an objective trigger that might be a bit too difficult to reach. After a few tweaks, you’re ready to test again, and when the time comes for trials with human players, your more sophisticated UX evaluation won’t be bogged down with basic frustrations like players getting lost or missing key areas.
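The sketch below is a deliberately toy version of that overnight loop, assuming a coarse grid over the level and a random walk as a stand-in for the real agent simulation: many simulated runs are aggregated into per-cell visit counts, and unusually “hot” cells are flagged as places where players might be getting stuck. Every name here (run_agent, the grid size, the 3x threshold) is a placeholder assumption.

```python
# A toy version of the simulated-trials loop: run many agents through a level,
# log where each one spends time, and flag cells with far more traffic than
# average (a crude "players get lost here" signal).
from collections import Counter
import random

GRID_W, GRID_H = 20, 12  # coarse cells laid over the level prototype


def run_agent(rng: random.Random, steps: int = 300) -> list[tuple[int, int]]:
    """Stand-in for a real agent simulation: returns the cells the agent visited."""
    x, y = 0, 0
    path = []
    for _ in range(steps):
        x = max(0, min(GRID_W - 1, x + rng.choice((-1, 0, 1))))
        y = max(0, min(GRID_H - 1, y + rng.choice((-1, 0, 1))))
        path.append((x, y))
    return path


def heatmap(num_agents: int = 100, seed: int = 7) -> Counter:
    """Aggregate visit counts across the whole simulated population."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_agents):
        counts.update(run_agent(rng))
    return counts


if __name__ == "__main__":
    counts = heatmap()
    mean_visits = sum(counts.values()) / (GRID_W * GRID_H)
    hotspots = [cell for cell, n in counts.items() if n > 3 * mean_visits]
    print(f"{len(hotspots)} cells see 3x the average traffic - candidate 'lost' areas")
```

A real framework would replace the random walk with the persona-driven agents described earlier and render the counts as the in-editor overlay the scenario imagines.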

At a high level, current and future AI-driven approaches to user evaluation rely heavily on adapting knowledge from existing research on subjects including machine learning, computer vision, and the modelling of human behaviour. While we might still be several years away from an AI capable of predicting the finer complexities of user experience, the field as a whole is moving towards integrating computational modelling and analytical techniques alongside established qualitative approaches. If this trend continues, playerless playtesting might just be the next great frontier in games user research.

Featured image credit: “K I N G S” by Jeswin Thomas. Public Domain via Unsplash.
