My Thoughts on The Essential Shape of Games Part 2: Quantifying the Problem For MMOs

NOTE: this is old, and I no longer stand by the design in these posts for a variety of reasons.

In part 1 of this series, I gave a proposal for what a framework for implementing most types of games might look like. In that post, I promised a part 2 in which I attempt to quantify the size of MMOs in terms of the framework. Here it is.

But First, a Disclaimer

Before I go into this too deeply, I must confess that finding any actual data on this is difficult. Most of the information on object counts comes from my time on MUDs, which are similar enough in scope to indicate what one might need to scale to for an MMO. The message passing quantification is specific to what I'm planning, so finding actual data for something that has possibly not been done (or, if it has, is not widely talked about) is somewhat challenging. One thing I did find is Alter Aeon's worldstat command; Alter Aeon is similar in scope and size to what I wish to build. The difference is that mine is a directed graph of terrain maps instead of a directed graph of text descriptions. If anyone has some actual data relating to this subject, I am most definitely interested; what is here is my line of thought before any code has been written, and it is possibly completely wrong.

This is why I waited so long to put this post out. It was written almost two weeks ago. I waited so that I could forget enough of the content and underlying assumptions to see if this still seems correct. This is also less formal than usual. I find that when writing about things before I've done them, I have a much lower level of linguistic precision.

What Needs Quantification?

The first question which must be answered is this: which parts are actually problematic enough that they need quantification?

For a single-player game, the answer is nothing. A home computer can easily handle 10000 messages a second. This holds true regardless of programming language; most performance issues can be fixed by increasing the cleverness of the algorithm causing them. Huge chunks of the world can be put to sleep, leaving only a small part active--the current level, for example. The only issue here is long-running logical components and systems, but this is an issue with any framework or engine. The area that needs to be actively simulated is something like 5% of the world, possibly less; single-player game developers have the luxury of pretending that things the player doesn't see don't exist at all. Even those single-player games that offer an "open world" do so. Whether or not open-world 2D games can muster the power to simulate everything is definitely debatable; 3D games definitely can't. Regardless, this is unimportant for this post.

Given the above and my personal goals, I'm going to turn to MMOs.

In MMOs, we have network limits, memory limits, and CPU limits. There are several issues that immediately come up. An MMO can't really put parts of its world to sleep, but it can divide the world logically into maps, by which I mean "open spaces". This prevents the physics engine from having to consider the whole game world in a single step and, in fact, gives some nice parallelism for free. Furthermore, the server is not going to need to tick physics 60 times a second--10 or 20 is good enough for all but the most intense action games (this does not hold true for games which rely on collision and allow dodging, but those come with all sorts of other issues like needing UDP).

We need not be concerned with the network limit, at least not at this stage. Network traffic within the server only comes into play when talking about a cluster, and the problem can be mitigated by keeping each area limited to one server. Physics messages should never need to cross the server boundary, nor should combat and presentation (as in "play this sound") messages. Needing to scale to more than one computer may come up in the future, but I ironically feel that my idea is actually up to it. As for the players, they will see something along the lines of 0.01% of game messages, and this estimate is absurdly pessimistic.

So that leaves memory and CPU. In terms of memory the issue is objects. In terms of CPU the issue is message passing and processing. We need to quantify objects first, as this has a great effect on message processing; objects alone are not an issue.

Attempting to Quantify Objects and Components

There are three general categories of object which may be quantified: the per-container/inventory count, the number of objects in a map on average, and the number of maps.

Let's start with containers. A player is a container of containers, and probably the biggest offender in terms of objects per map. Players don't like to let things go. I've seen, in my various online adventures, anywhere from 8 to 8000 items in an inventory. The latter was a game which went overboard with collecting materials, each material being represented by stacks of one-unit items that did not combine themselves. I think that 50-100 objects per player is more reasonable. This isn't so much a memory problem as a CPU problem, but provides a good starting point. 75 objects times the number of players is a reasonable count of per-player objects in my opinion.

As for maps, maps like to have NPCs. NPCs like to have armor. For sanity, let's say that NPCs only spawn treasure on death. There is likely to be more than one NPC per area, though the exact count depends on the game. My definition of map as an open space leaves a lot to be desired in terms of figuring this out--is it a solar system? A city? What? So a better question is how many NPCs there are per game.

I'd say that an area will never be empty. I'd go further and say that an area will have about 50 NPCs on average, that most areas will equip them with at least 2 pieces of equipment, and that a good-sized game will end up having at least 100 areas. Any area consisting of humanoids is going to give them a full set of armor: typically 2 arms, 2 legs, body, neck, head, feet, hands, wrists, ankles, and possibly rings. This is 13 objects if the wrists, hands, feet, and ankles are each modeled as a single slot and one ring is allowed per hand. The average is probably somewhere halfway between 2 and 13: wildlife, dragons, and other nonhuman NPCs probably don't get equipment unless the game is allowing for some sort of more generic slot system. Let's say that it's 7 objects on average. Saying how many maps per area is difficult: if the server can handle really big maps, it's not unreasonable for areas to have one map only.

So 7 objects for 50 NPCs for 100 areas is: 35000. Add the NPCs themselves: 40000. Players just became insignificant in terms of memory, even if these numbers are much lower.

So what about containers? I'd say that a well-designed area is going to have 4 or 5 hidden treasures, and that these are going to have at least one item each. Some builders will go overboard with this, but quantifying it exactly is really hard to do in advance. The best I can do here is a ballpark estimate which could be wildly off: 100 areas, 5 treasure containers each, and probably 1-10 items in each: only a couple thousand.

Let's round up and just say 50000 objects on average for a decent-sized game.
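The arithmetic behind that round number can be written down as a small sketch. Every figure is one of the guesses from the paragraphs above rather than a measurement, the constant and function names are made up for illustration, and the player count passed in at the end is an arbitrary example, since I have not committed to one.

```python
# Back-of-envelope object count. All figures are guesses from the text,
# not measurements; the names here are hypothetical.
AREAS = 100
NPCS_PER_AREA = 50
EQUIPMENT_PER_NPC = 7            # rough average between 2 and 13 slots
CONTAINERS_PER_AREA = 5
ITEMS_PER_CONTAINER = (1, 10)    # (low, high) guess
OBJECTS_PER_PLAYER = 75


def npc_objects():
    """NPC equipment plus the NPCs themselves."""
    npcs = AREAS * NPCS_PER_AREA
    return npcs * EQUIPMENT_PER_NPC + npcs


def treasure_items():
    """(low, high) count of items sitting in hidden treasure containers."""
    containers = AREAS * CONTAINERS_PER_AREA
    low, high = ITEMS_PER_CONTAINER
    return containers * low, containers * high


def total_objects(players):
    """(low, high) world total for a given number of players."""
    low, high = treasure_items()
    carried = players * OBJECTS_PER_PLAYER
    return npc_objects() + low + carried, npc_objects() + high + carried


if __name__ == "__main__":
    print(npc_objects())       # 40000
    print(treasure_items())    # (500, 5000)
    print(total_objects(50))   # 50 players is an arbitrary example
```

With these inputs the NPC term dominates, so the total is far more sensitive to the NPC and equipment guesses than to anything about treasure or players.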

So, how many components each? Well, all top-level objects will have at least position and velocity, all NPCs will have inventories and equipment slots, and all areas will have terrains of some sort. Scripted objects get a copy of their script, damageable objects have a health component, players have a logical network component, and lots and lots of objects have lists of stat modifiers. Almost everything will have a name. The best I can figure, it's at least 5 data components per object to make it meaningful to gameplay, and objects which aren't meaningful to gameplay should be deleted anyway. This means 250000 data components, where a data component is anywhere from 10 to 50 bytes.

So, assuming no programming language overhead, i.e. this is programmed in C++: something like 100 megabytes. I'm raising this because of terrains and script, which are larger. It's probably more like 500-600 MB (Python), 1 GB (Java because it includes the runtime), or somewhere thereabouts. This does not account for the allocation and management of messages.
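As a sanity check on the memory figure, here is the raw component arithmetic as a sketch. It uses only the guesses above (50000 objects, 5 data components each, 10 to 50 bytes per component) and assumes zero per-object overhead; the names are hypothetical.

```python
# Raw data-component footprint, ignoring terrains, scripts, and any
# language or allocator overhead. Figures are the guesses from the text.
OBJECTS = 50_000
COMPONENTS_PER_OBJECT = 5
BYTES_PER_COMPONENT = (10, 50)   # (low, high)


def component_count():
    return OBJECTS * COMPONENTS_PER_OBJECT


def raw_component_bytes():
    """(low, high) bytes needed for the data components alone."""
    low, high = BYTES_PER_COMPONENT
    return component_count() * low, component_count() * high


def to_megabytes(byte_count):
    return byte_count / 1_000_000


if __name__ == "__main__":
    low, high = raw_component_bytes()
    print(component_count())                      # 250000
    print(to_megabytes(low), to_megabytes(high))  # 2.5 12.5
```

The small components come to roughly 2.5 to 12.5 MB, so nearly all of the 100 MB figure is headroom for terrains and scripts.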

As I said above, this isn't so bad.

Quantifying CPU

Note: In the following and probably also in the actual implementation, I am considering the routing algorithm in terms of objects resending messages. This might change slightly if the routing algorithm sits outside the hierarchy, but I do not believe that would buy too much. Making the algorithm separate can make some message sends faster, but adds coding overhead with little gain.

I did not discuss logical components yet because they are a CPU issue. Every object must either ignore or respond to certain messages; the way it does so is logical components. If an object has no logical components, it can simply ignore most messages sent to it. The actual number of logical components will vary widely, but there will probably be at least one per object. The most notable effect of a logical component, however, is that it can block messages for children--this is a major savings for physics.
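To make the savings from blocking concrete, here is a minimal sketch. It is not the framework's API: `GameObject`, `blocks`, and `send` are hypothetical names, and a set of blocked message types stands in for a real logical component. It does follow the note above in having objects resend messages to their children themselves.

```python
class GameObject:
    """A node in the object hierarchy (hypothetical, for illustration only)."""

    def __init__(self, name, blocks=()):
        self.name = name
        self.children = []
        # Stand-in for a logical component: message types this object
        # handles itself and does not resend to its children.
        self.blocks = set(blocks)
        self.received = []

    def add(self, child):
        self.children.append(child)
        return child

    def send(self, message_type):
        """Deliver a message here, then resend to children unless blocked.

        Returns the number of objects that saw the message.
        """
        self.received.append(message_type)
        seen = 1
        if message_type in self.blocks:
            return seen
        for child in self.children:
            seen += child.send(message_type)
        return seen


def build_area(player_blocks):
    """One area holding one player who carries 75 items."""
    area = GameObject("area")
    player = area.add(GameObject("player", blocks=player_blocks))
    for i in range(75):
        player.add(GameObject(f"item{i}"))
    return area


if __name__ == "__main__":
    unblocked = build_area(player_blocks=())
    blocked = build_area(player_blocks=("position_update",))
    print(unblocked.send("position_update"))  # 77: area + player + 75 items
    print(blocked.send("position_update"))    # 2: area + player only
```

In this toy case, one blocking component on the player cuts 77 deliveries down to 2 for each position update.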

The most CPU-intensive of the logical components is physics. As I am considering it, physics will need to lock all objects below it for the amount of time it takes to run one physics step. Physics steps will happen roughly 10 to 20 times per second in most games on a per-area basis. The physics step will generate collision messages, which are sent through the reflection path directly down from the area to all objects it contains. The physics step also generates position updates; consequently, every object that is the immediate child of an area will see a message every time it runs. It is hard to say how many objects this is, but at least 10000: all players, NPCs, and objects on the ground get one. That means approximately 100000 to 200000 messages a second are being generated by physics, plus collision messages--probably only 4 or 5 thousand a second, probably cut off by the top-level objects, and thus something we can ignore.
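The physics numbers above are just a product, sketched here with hypothetical names. The inputs are the guesses from this paragraph: 10000 top-level objects, 10 to 20 ticks per second, and up to about 5000 collision messages a second.

```python
# Position-update rate generated by physics, using the guesses in the text.
TOP_LEVEL_OBJECTS = 10_000
TICKS_PER_SECOND = (10, 20)      # (low, high)


def position_updates_per_second(objects=TOP_LEVEL_OBJECTS, ticks=TICKS_PER_SECOND):
    """(low, high) messages per second: one per top-level object per tick."""
    low, high = ticks
    return objects * low, objects * high


def collision_share(collisions_per_second, updates_per_second):
    """Collision messages as a fraction of position updates."""
    return collisions_per_second / updates_per_second


if __name__ == "__main__":
    low, high = position_updates_per_second()
    print(low, high)                    # 100000 200000
    print(collision_share(5000, low))   # 0.05
```

Even against the low end of the position-update range, collisions come to about 5% of the traffic, which is why they are ignored here.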

That's not the scary part. This is the scary part. The way that NPCs, players, and other AI tasks query other objects is via a message. These things need to know when AI goals are met, etc. It is therefore quite possible that some or all physics messages need to be mirrored to all other objects in an environment. I probably will be able to come up with something smarter--say, something that will only multiply this number by 3 or 4 instead of 10 or 12 or 30--but the possibility still exists, and 100000 is not the end of the story with physics messages.

In addition, there are at least another 50000 messages going on. This number is probably closer to 100000, as reflection is an expensive thing in terms of messages--implementing custom components that block messages all over the place is not a good idea from a design point of view. These messages represent things like combat, "play this sound", AI goal specification, incoming player actions, dialog, etc. Everything in the game is done through messages and reflection is preferred, so every action gets at least 10 message sends--recall that messages, after reflection, need copying and resending for safety.

So how bad is this, really? We're talking anywhere from 200000 to 1 million messages a second. Each message needs to be allocated, sent, and deleted. It is hard to say just how much CPU this will take, but if the message sending is expensive, the game is not going to run. Fortunately, some systems provide zero-copy message sending. More fortunately, arenas and pools are old ideas that I can probably take advantage of. The problem that is most vexing here is that memory is slow and messages are not small things: cache friendliness may be an issue. So too may heap fragmentation. When I started reasoning out this part of the post, I was not expecting the number to grow so quickly: it's much higher than I expected. But I don't see many places to cut corners.
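One way to read those rates is as a time budget per message. The sketch below assumes the cores do nothing but message handling and that the work splits perfectly across them, neither of which I am claiming is true; the function name is hypothetical.

```python
# Average time available per message at a given message rate.
MESSAGES_PER_SECOND = (200_000, 1_000_000)   # (low, high) from the text


def budget_microseconds(messages_per_second, cores=1):
    """Microseconds per message if `cores` cores only process messages.

    Assumes perfect parallelism, which is optimistic.
    """
    return cores * 1_000_000 / messages_per_second


if __name__ == "__main__":
    low, high = MESSAGES_PER_SECOND
    print(budget_microseconds(low))            # 5.0
    print(budget_microseconds(high))           # 1.0
    print(budget_microseconds(high, cores=4))  # 4.0
```

A budget of 1 to 5 microseconds has to cover allocation, routing, handling, and deletion, which is why pooling and cache behavior matter so much here.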

Conclusion

Keep in mind that the above estimates are specific to the kind of game in which NPCs are important and the floor is a thing. A space game has more interesting dynamics in terms of entities per map but probably fewer maps overall. A strategy game probably makes the map an entire game in progress, and probably also has hundreds or thousands of entities in it--most of which may be automated by larger, overarching systems. I'm also considering the case of a 2D game in which the server does not deal directly with textures, so moving to 3D and needing to replace terrains with triangular meshes probably makes things take quite a bit more memory.

Also keep in mind that an MMO is not necessarily one core nor necessarily one machine. It is possible to split the game along the area boundary. With the right tools and programming style, this is even potentially easy (but not something I have any experience with yet). The area division makes it clear what objects need to be on the same physical machine; moving an area off the main machine also moves most of the messages related to it off the main machine. I am personally putting this off until it's needed. I think it might be at some point, but not at the beginning.

This is for a general framework. I like the idea of general frameworks: a domain-specific system can avoid much of the performance overhead, but has to implement all sorts of custom game-specific logic at a very low level. I think that the overhead here--which modern computers really are able to handle--is the same as the overhead with every high-level language: usability at the cost of performance in an age where performance is starting not to matter.

To be blunt, I'm planning to brute force the problem. 20 years ago, brute forcing any computer science problem was unthinkable. 5 years ago, brute forcing this one was unthinkable. And yet we have amazing amounts of resources and inexpensive cloud platforms now. I'm essentially betting that I can come up with the resources as I need them. Once the game is launched, it will take a year or two to be up to the size described here. This gives at least 3 more years of continued technological growth before it's a problem, and there are optimizations that can be applied to make it less of one. To put it another way, I'm aiming for flexibility in the tools given to builders and less complex code at the cost of needing a very powerful server.

In closing, I'm going to be renaming what I call reflection. When I wrote the first post, it did not occur to me that reflection is already in use by just about every programming language out there. I'm not sure what I'm calling it yet, but reflection is a bad choice in hindsight. That's too bad: reflection is a very accurate term for what I mean.