
This article was written on Pragma Engine version 0.0.94.

The Pragma Standard: Load Testing a Backend for Launch

In this article series, learn why load testing is important and how our infrastructure team ran test scenarios with 1,008,000 existing players logging in, partying up, managing inventory, matchmaking, and logging out.

In this first piece, we’ll get into the how and why of load testing by covering:

  • Our load testing philosophy
  • How we emulate player behavior via a load testing client
  • How we deploy a Pragma Engine platform shard

Future articles will go over the results of our load testing efforts on single and multi-node environments, how we troubleshoot our topology, and the final 1 million CCU load test data.

You can see an example snapshot of our 1 million CCU test below, where 81% of players are in-game and the rest are running other platform operations.

Three different charts in green and yellow displaying a stable load of about 1 million simulated players on the platform.

Data on our 1 Million CCU load test.

Our Load Testing Philosophy

A book with wrenches on the cover glowing blue and struck by magical blue lightning.

The book of load testing philosophy.

It’s standard to run a few RPC requests in a local environment when testing a new service or feature, but it’s impossible to build confidence in an online service’s stability when it has only been built and tested in local or test environments. It’s therefore crucial to run load tests in an environment that is as similar as possible to live production. Otherwise, there’s no way of knowing whether the platform can truly run live and scale until launch day.

Launch day is one of the most important days for an online game, and load testing can help ensure that the large influx of day-one players won’t overload and crash the game servers.

Load testing provides critical data about how the platform performs and aids in fixing the bottlenecks, platform and service errors, and overloaded queues that the tests reveal. Every load testing environment is different and serves specific purposes for testing. For example, a small indie studio won’t test under the same load as a AAA studio, which might expect hundreds of thousands (or even millions) of concurrent players on launch day.

At Pragma, load testing means making sure Pragma Engine’s platform, with all our provided services and databases, can support a load of 1 million simulated players logging in and playing in a production-ready backend environment. This includes logging into the platform, creating and joining parties, entering and leaving matchmaking, and running RPC requests to buy, sell, craft, and manage items in our inventory and store services.

Load testing a project using Pragma Engine

A load test scenario involves simulating realistic player interactions against a platform configured as a production environment. The Pragma load test scenarios are executed against a demo project in Pragma Engine. This internal project contains examples of custom content, RPCs, and plugins, all running within the context of the core services available in Pragma Engine, such as matchmaking, parties, accounts, inventory, and many more.

An open “load-test” folder among other folders underneath a larger 4-demo project folder.

The demo project used to load test Pragma Engine.

Emulating player behavior

In a load test scenario, emulating player behavior involves sending RPC requests to a load testing platform, similar to how a game’s SDK client sends RPC requests to a live platform. Although you can configure these RPC requests to fire in quick succession, no human player could issue them that fast from their client. In other words, computers are much faster than players deciding what type of match to queue into or which loadout to equip from their inventory.

A robot, arms raised, with a backpack, minerals, gold coins, arrows, and a sword, iron helmet, and iron plate boots.

A simulated player with an interactable inventory.

To emulate human player behavior as realistically as possible, you need to account for multiple RPC request scenarios that a human player would go through when playing your game. This involves configuring the timing of certain requests to model human input delay, having some accounts stay logged in as if they were AFK, managing how long certain players stay in a party or matchmaking, and plenty of other scenarios.

By accounting for all different kinds of human behavior, a successful load testing environment gets as close to a live game platform as possible. You can find an example of one of our local load testing configs below, with parameter values specifying many facets of the scenario.

The ScenarioConfig class is used throughout the load testing client, which we’ll explain later.

data class ScenarioConfig(
    val protocol: String = "http",
    val host: String = "localhost",
    val operatorHost: String = host,
    val playerGatewayPort: Int = 10000,
    val partnerGatewayPort: Int = 10100,
    val operatorGatewayPort: Int = 10200,
    val gameFlowGroupSize: Int = 2,
    val configurationFilePath: String = "",
    val numberOfAccounts: Int = 10,
    val runsPerScenario: Int = 5,
    val accountPrefix: String = "loadTest",
    val loginsPerSecond: Long = 100L,
    val connectionTimeout: Long = 5000L,
    var matchNotificationsTimeout: Long = 2 * connectionTimeout,
    val maxDelayMillis: Long = 100L,
    val disableSocketConnections: Boolean = false,
    var keepRunningForMinutes: Long = 0,
    var loginTimeoutMillis: Long = connectionTimeout,
)
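
As a rough sketch of how values like maxDelayMillis and loginsPerSecond could shape a simulated player’s timing, the snippet below picks a random pause before each request and spaces out logins so they don’t all arrive at once. The function names and the interpretation of both parameters are assumptions made for illustration; the actual load testing client may use these values differently.

```kotlin
import kotlin.random.Random

// Hypothetical helper: picks a pause in [0, maxDelayMillis] to stand in for
// the time a human takes between actions. A maximum of 0 means no pause.
fun thinkTimeMillis(maxDelayMillis: Long, random: Random = Random.Default): Long {
    require(maxDelayMillis >= 0) { "maxDelayMillis must not be negative" }
    return if (maxDelayMillis == 0L) 0L else random.nextLong(maxDelayMillis + 1)
}

// Hypothetical helper: offset in milliseconds from the start of the run at
// which the account with the given zero-based index should log in, so that
// roughly loginsPerSecond accounts log in during each second.
fun loginOffsetMillis(accountIndex: Int, loginsPerSecond: Long): Long {
    require(accountIndex >= 0) { "accountIndex must not be negative" }
    require(loginsPerSecond > 0) { "loginsPerSecond must be positive" }
    return accountIndex * 1000L / loginsPerSecond
}
```

With the defaults from the config above (maxDelayMillis = 100, loginsPerSecond = 100), each pause would fall between 0 and 100 ms, and the account at index 250 would log in 2,500 ms into the run. Passing 0 as the maximum delay removes the pauses entirely.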

An additional type of load testing is called stress testing. Though we try to simulate player behavior as best as possible, stress testing the platform with instantaneous RPC requests can be helpful for acquiring data on how much immediate load the platform can handle. These instantaneous tests are not as realistic as inputting delays and player behavior patterns, but they offer important insight on what services, gateways, and AWS platform shards might need the most tweaking and additional attention.

Stress testing should be considered an additional kind of testing within the load testing umbrella. For the purposes of this article, we’ll continue to focus on player behavior simulation, which is the type of load testing used for these examples.

The load testing client

The Pragma load test client contains all the load testing configuration and logic, and is packaged as its own executable when Pragma Engine is built. The client uses libraries in the Pragma Engine codebase to open WebSocket connections and make RPC requests over them. When making RPC requests locally with a tool like Postman, the requests are sent over HTTP. However, to replicate a production environment as realistically as possible, it’s necessary to use secure WebSocket connections instead of HTTP connections.

If you’d like to build your own load testing client, create it alongside all the client’s logic and configuration in your 5-ext directory.

Defining the load testing client involves writing logic for particular RPC requests in each Pragma Engine service and then stringing those RPC requests together into a load test scenario. The example below, which has been simplified for clarity, demonstrates a load test scenario utilizing a startMatchmaking RPC request with verification logic (the verification checks whether the correct MatchNotifications have been received or not).

fun startMatchmaking(
    leader: GameClient,
    group: List<GameClient>,
    runNumber: Int,
) {
    val playerClients = group.map { it.playerClient }
    // Subscribe every group member to match notifications before the request is sent.
    subscribeMatchNotifications(group)

    try {
        // Verification runs in separate coroutines alongside the request.
        launch {
            stats.verifyMatchNotifications(playerClients, runNumber)
            stats.verifyMatchProcessed(playerClients, runNumber)
        }

        launch {
            stats.verifyMatchReady(playerClients, runNumber)
        }

        // Only the party leader sends the enter-matchmaking request.
        val request = EnterMatchmakingV1Request()
        leader.playerClient.gameRequest(request)
    } finally {
        unsubscribeMatchNotifications(group)
    }
}

It’s entirely possible to parse the responses of these requests to decide what other requests should be made. But for the most part, the responses to these calls don’t require additional logic or configuration once performed, except for calls that make state changes on the platform.

Once every RPC has its logic defined, a suite of such requests is strung together via listOf() for the gameFlow part of the load test scenario, as seen in the code block below. Each element in this list contains configured RPC requests from Pragma Engine services for each simulated player. In other words, this scenario is the path a simulated player follows when run through our load test.

private fun gameFlow() = listOf(
    GetLoginData,
    PurchaseItems,
    CheckInventory,
    TimeBasedCrafting,
    CreateAndJoinParty,
    SetPlayersReady,
    StartMatchmaking,
    LeaveParty
)
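
To make the idea of stringing steps together more concrete, here is a minimal sketch of how a list like this might be executed for one simulated player. ScenarioStep and runFlow are hypothetical names invented for this example, and stopping at the first failed step is an assumption; the real client’s step type and runner are not shown in this article.

```kotlin
// Hypothetical stand-in for one step of a scenario. The action returns true
// when the step succeeded for the given player.
class ScenarioStep(val name: String, val action: (playerId: String) -> Boolean)

// Runs the steps in order for one simulated player and returns the names of
// the steps that succeeded. Stops at the first failing step, on the
// assumption that later steps depend on earlier ones.
fun runFlow(playerId: String, steps: List<ScenarioStep>): List<String> {
    val completed = mutableListOf<String>()
    for (step in steps) {
        if (!step.action(playerId)) break
        completed += step.name
    }
    return completed
}
```

A runner like this keeps each step’s RPC logic separate from the order in which steps run, so a new scenario is just a different list.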

We use multiple load test clients running at once to simulate load for AWS platform shards. Each client is able to generate 24,000 distinct players for simulating load (the machines running the AWS platform shards have 8GB of memory and 2 CPUs). So, when reaching 1 million concurrent users for testing, we needed 42 load testing client instances.
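
The figure of 42 follows from a ceiling division: 1,000,000 / 24,000 is roughly 41.7, which rounds up to 42 clients, and 42 × 24,000 gives the 1,008,000 simulated players mentioned at the start of this article. A small sketch of that calculation, with a function name made up for this example:

```kotlin
// Smallest number of client instances needed to simulate at least
// targetPlayers, given how many players one client can generate.
fun clientInstancesNeeded(targetPlayers: Long, playersPerClient: Long): Long {
    require(targetPlayers >= 0) { "targetPlayers must not be negative" }
    require(playersPerClient > 0) { "playersPerClient must be positive" }
    // Ceiling division using integer arithmetic only.
    return (targetPlayers + playersPerClient - 1) / playersPerClient
}
```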

Every studio builds their platform’s load test shards for different use cases; for our platform, the ability to redeploy the platform shards and gateways in our AWS infrastructure is crucial for getting load testing results quickly and efficiently.

Deploying the platform shards

Our AWS topology was built and deployed iteratively using infrastructure-as-code tools alongside deployment code defined by our infrastructure team. This deployment infrastructure allowed us to easily implement, create, and tear down virtual hosts that deploy and run our load testing multi-node deployments. This is incredibly helpful when errors, bottlenecks, service failures, and other platform issues occur and need to be fixed by recreating the AWS topology for retesting load.

For example, if we notice that a particular RPC request gets bogged down at a particular AWS gateway, we can better identify and fix the issue in the next load test by either defining new load testing logic or redesigning the platform shards and server instances with the help of our deployment scripts.

Managing a backend platform’s topology and achieving horizontal scaling requires organizing the distribution of RPC requests to gateways spread across multiple hosts. Normally, a load balancer distributes traffic across instances randomly, which is sometimes effective for spreading traffic. However, this adds overhead by requiring additional network hits in order to route requests.

For a Pragma Engine-powered platform, we direct player-related requests to specific gateway nodes via specific, non-randomized routing rules on the load balancer. This significantly reduces internal network overhead, since we’re using a deterministic routing approach to direct traffic appropriately. By distributing requests via their respective gateway nodes, the whole load population can be evenly distributed with less likelihood of overwhelming the platform.

Region locking is a very common example of a load balancer rule that routes and targets traffic to particular gateways.

Additionally, one way we balance the load is to divide RPC requests between distinct social and game gateways, essentially splitting the load in two. We also load balance directly in Pragma Engine service configs, such as by regulating how much load is passed through a service at a time.

White, grey, and blue boxes connected to one another.

Load balancing the platform by separating the Game and Social gateways in two using routing rules.
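
The two routing ideas described above, deterministic routing and separate Game and Social gateways, can be sketched together as follows. Hashing the player ID to choose a node is an assumption made for this example, as are the function name and the node map; the article does not describe the actual load balancer rules.

```kotlin
// The two gateway types named in the article.
enum class GatewayKind { GAME, SOCIAL }

// Hypothetical rule: choose a node deterministically, so the same player and
// gateway kind always map to the same node without a random pick.
fun gatewayNodeFor(
    playerId: String,
    kind: GatewayKind,
    nodes: Map<GatewayKind, List<String>>,
): String {
    val pool = nodes[kind].orEmpty()
    require(pool.isNotEmpty()) { "no nodes configured for $kind" }
    // floorMod keeps the index non-negative even when hashCode() is negative.
    return pool[Math.floorMod(playerId.hashCode(), pool.size)]
}
```

Because the result depends only on the player ID and the gateway kind, repeated requests from one player land on the same node, and Game and Social traffic never share a pool.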

It’s important to note that these shards for our load tests are deployed in a multi-node environment. For internal testing, low player counts, and cost management purposes, a studio might deploy their platform shards on single nodes instead. We’ll get into how we load test single-node deployments in a future article.

In short, our load testing process (building the client in our demo project, writing deployment scripts, and creating the AWS infrastructure environment using said scripts) allows us to easily iterate and solve issues that pop up when running each load testing scenario.

And that’s how we test the platform under load and take a trial-and-error approach to different scenarios! Future articles in this series will go over the results from our load testing efforts using data gathered by our metrics systems.


For more information, check out the rest of the articles in this series:

  1. The Pragma Standard: Load Testing a Backend for Launch (this article)
  2. Pragma Backend Load Testing Results: Achieving 1 Million CCU


Posted by Patrick Olszewski on June 16, 2023