My Thoughts on The Essential Shape of Games Part 1: What is the Essential Shape of a Game?
NOTE: this is old, and I no longer stand by the design in these posts for a variety of reasons.
My quest for an MMO is leading me down all sorts of interesting avenues, most recently Libaudioverse and self-taught digital signal processing. While Libaudioverse is my top priority project at the moment, I am still devoting a non-negligible amount of thought to some core issues with game programming.
One of these is how one might go about making a game system that has three essential properties: network friendliness, developer friendliness, and flexibility. Spurred on by a design pattern called the Entity Component System, my brain began to generate ideas specifically aimed at MMOs, and for a while it seemed that an Entity Component System was the silver bullet. But I no longer think it is, at least not by itself.
And I wasn't even asking the right question. The question I should have been asking is this: leaving aside network issues for the moment, is there some structure that seems to encapsulate most or all games? The answer seems to be possibly, but I have yet to implement it.
Herein I share my thoughts on what the structure might be. Part 2 defines the problem in terms of an MMO and part 3 will talk about the options for actually getting this up and running. But first, we need to define the structure.
The Problem with a pure ECS
A pure ECS (entity component system, sometimes called a component entity system or even just an entity system) is a very simple model for a game. To summarize with a brief bulleted list:
- Entities are simply unique identifiers.
- Components store data of some sort. Examples of components include position, velocity, or name. Each component is associated with an entity. Entities can have more than one component.
- Systems are functions that process entities with a certain subset of components. For example, the collision system might look for objects with position and geometry, physics for velocity and position, network for attached client. They then do something for each entity in the subset: typically, update a component or components on it.
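To make the three bullets concrete, here is a minimal Python sketch of a pure ECS. Everything in it (`World`, `query`, `movement_system`, the `Position` and `Velocity` components) is a name I made up for illustration, not any particular library's API.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Position:
    x: float
    y: float


@dataclass
class Velocity:
    dx: float
    dy: float


class World:
    """Hypothetical container: entities are bare ids, components live in per-type tables."""

    def __init__(self):
        self._ids = count(1)
        self.components = {}  # component type -> {entity id -> component instance}

    def create_entity(self):
        # An entity is nothing more than a unique identifier.
        return next(self._ids)

    def add(self, entity, component):
        self.components.setdefault(type(component), {})[entity] = component

    def query(self, *types):
        """Yield (entity, components...) for every entity that has all requested types."""
        tables = [self.components.get(t, {}) for t in types]
        for entity in tables[0]:
            if all(entity in table for table in tables[1:]):
                yield (entity, *(table[entity] for table in tables))


def movement_system(world, dt):
    # A system: a function run over every entity with both Position and Velocity.
    for _, pos, vel in world.query(Position, Velocity):
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt
```

Note that the system lives entirely outside the entities: nothing here ever calls a method on a game object, which is exactly the property the rest of this section complains about.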
I am adding the "pure" qualifier to the name myself in order to distinguish it from implementations that mix approaches; my solution is a twisted ECS with actors. For that reason, I want to take some time to talk about pure ECSs and why I don't like them anymore.
An ECS works really well for certain types of games where objects don't talk much. Space Invaders? Sure. You have really neat emergent behavior: any entity will automatically start being processed by the appropriate systems for exploding if you add the explosion description component, or whatever. In addition, it's inherently serializable. Components and entities map to a database and a network socket very easily. Diffing is trivial. Implementing common logic is trivial. Builders can manipulate components and suddenly get their objects to do all sorts of things, without us having to put a label like "mob" on them.
But what happens when two objects want to interact? Let's take the most basic type of object interaction: collision. A pure ECS would say that you need a collision system, or that it belongs in the physics system, or perhaps that you should start marking colliding entities with an is_colliding component. Systems are external to game objects; we don't ever call a method for this functionality. But I've never heard of a game where collision is simple: bullets hurt, portals teleport, walls make sounds, and objects suddenly move into your inventory.
Worse, none of those four things happens always. This means that the logic is either in one giant ball of if statements or implemented for every possible pair of objects. But systems can't ask for pairs of objects, so take the first option and have a system that exists simply to process is_colliding components and remove them (and you have to do the if statements inside it!). Note that testing for the is_colliding component and running a system on it is functionally the same as a dedicated collision system, but the latter makes you do a very expensive check twice.
Now we want to communicate this to the client. So we have to toggle the is_colliding component over there. But this raises its own issues: what if it toggles on and off between updates? Suddenly the network code needs to be able to defer packets, depending on what they will do to the game model, because we need to be sure we see every component. Suddenly, we're now also talking about an event system and completely different client-side logic. It's not possible to send every graphical and sound update as an event, so the client now has a melange of different presentation systems. And we haven't even talked about control.
And here's the final problem. Entity explosion. In order to do an attack, the best way is probably some sort of attack entity that exists for a time and damages on collisions. Even if it's not a game that uses that model, e.g. Final Fantasy, you can't just directly send the message. You have to go grab the second object somehow and update its components, and then somehow tell everyone that this thing happened. So who gets the complexity? I vote for a new entity that represents an attack, with a target and countdown component. And then you've got another dedicated system which has to be implemented in the server for all areas and situations.
Here's something that makes it more clear. Consider trading. A trade is an action that happens on a collection of objects. Trades need to be entities with a magical trade_info component and another dedicated system; in this case, it's possibly more than one. You have to somehow communicate objects that desire to be added to the trade and get some system somewhere to magically pick them up, and there's probably a state machine of systems that switch out components: hello trade_nonfinalized, trade_halfconfirmed, trade_confirmed, and trade_finished. Obviously, you can make really complicated systems and have singleton entities. But you're not supposed to communicate between systems in a pure ECS. Woe is you if you suddenly need to worry about system order, which complicates issues like this further.
My problem with this isn't that superfluous entities get created, as they're kind of just instances of classes. It is nevertheless absurd that we have a trade entity for every trade. My problems are these: you end up with multiple systems that all work "over" the problem, every system needs to implement communication of important events to interested entities somehow, nothing can intercept actions for modification (e.g. spell effects) without it being coded at a very low level, and there are a lot of systems that need to execute on less than 1% of objects (yes, really. Remember that entity explosion? A lot of them represent really meta things, and we get to do database queries or something hand-rolled and inefficient to find them out of the entity spam). Every time something needs to change, the entire server goes down. Every time a builder needs a new feature just for one mob in one area, say hello to the next new and upcoming global system with an is_for_boss_6_in_northlands component.
So Let's Solve it: Say Hello to Messages and Logical Components
All that said, systems are really, really good at something. That something is logic that needs to run on 99% of objects. Examples include physics, position and velocity updates, etc.
But a good game is going to have lots of one percents, and even singleton behaviors: if every boss, quest, artifact, and city is the same there's no point. So let's allow logic in components, and see where that takes us.
And here's where messages come in.
A data component is a component as discussed above. A logical component is a component which responds to a message. A logical component has the following constraints on its behavior: it must respond to messages with atomicity, it must modify only the object of which it is a part, it must not block, and it must not communicate with other objects save by sending messages. Furthermore, logical components must keep any data they wish to store in a data component, unless that data is transient, because logical components are not serialized.
Clearly more clarification is needed. Firstly, entities receive messages. When an entity receives a message, all logical components get a chance to respond. Secondly, all entities respond to a message asking for a snapshot of their state at a point in time. Thirdly, an asynchronous reply mechanism must be implemented. Fourthly, there must be some useful routing logic.
I'm going to discuss the routing logic in a section of its own, as that's the key insight of this post. But let's talk about what this gives us as it stands if we assume that the routing logic is the worst I can imagine: source to destination only.
There is only one collision system. It is global. Two objects that collide both get a collided message with a reference to the other object involved. Suppose that they are both uninterested: nothing happens. If one is a bullet, it sends a damage message to the other; the other object may or may not respond, but the bullet needn't care. If there is a logical component representing a client connection, then it can forward all messages, which serve as events, to the client; the client will then see anything that happens to its avatar. Furthermore, a clever builder somewhere wants to write a script that makes the player immune to damage. We can model this as a logical component that executes scripts on message reception and a simple component containing the script.
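The collision scenario above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `Mailbox`, `Entity`, `Bullet`, `Health`, and `ClientConnection` are hypothetical names, the queue stands in for a real asynchronous delivery mechanism, and routing is the bare source-to-destination kind.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str
    payload: dict = field(default_factory=dict)


class Mailbox:
    """Hypothetical delivery queue: sending never blocks, delivery happens in drain()."""

    def __init__(self):
        self._queue = deque()

    def send(self, target, message):
        self._queue.append((target, message))

    def drain(self):
        while self._queue:
            target, message = self._queue.popleft()
            target.receive(message, self)


class Entity:
    def __init__(self):
        self.data = {}   # data components: plain, serializable state
        self.logic = []  # logical components: respond to messages

    def receive(self, message, mailbox):
        for component in self.logic:
            component.handle(self, message, mailbox)


class Bullet:
    def handle(self, owner, message, mailbox):
        if message.kind == "collided":
            # Talk to the other object only by sending it a message.
            hit = message.payload["other"]
            mailbox.send(hit, Message("damage", {"amount": owner.data["damage"]}))


class Health:
    def handle(self, owner, message, mailbox):
        if message.kind == "damage":
            owner.data["hp"] -= message.payload["amount"]


class ClientConnection:
    """Forwards every message as an event; this sketch just records the kinds."""

    def __init__(self):
        self.sent = []

    def handle(self, owner, message, mailbox):
        self.sent.append(message.kind)


def collision_system(mailbox, a, b):
    # The one global collision system: it only notifies both parties.
    mailbox.send(a, Message("collided", {"other": b}))
    mailbox.send(b, Message("collided", {"other": a}))
```

An entity with no logical components, such as a plain wall, simply ignores the damage message, and the bullet never needs to know.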
But we can't go further without routing logic, so let's turn to that now.
What is the Essential Shape of Games?
The essential shape of games is a tree. Not a graph, just a tree. Objects may or may not have inventories, and all objects except the root exist inside an inventory. Any sideways or upwards links in the tree are to be considered weak, and are subject to breakage at any time.
This statement is actually very profound, as it immediately places limits on coding and actually reverses an earlier belief of mine. Let's discuss the last part first. If we allow strong sideways links, things can't die without us having to worry about what else might be referencing them from half a world away. Nothing is wrong with making a link temporarily strong for the duration of an operation such as teleportation or trading and, as you will see in the routing logic below, doing so isn't so impossible. The truth is that these are actually pretty rare: most links are strict parent/child relationships anyway.
The tree part of the statement falls out of the fact that it is very rarely useful for an object to have more than one parent. If it needs to be referenced from elsewhere, a weak link can be made to it. But in 99% of cases (I'd say 100%, but I'm sure someone else can think of one) the tree constraint is fine. Since this lets us traverse without worrying about whether or not we've visited a node, we take it. Traversal is something we will be doing often.
The inventory of an object is all strong links from the object to other objects. This lets the inventory segregate between things like components and equipped items vs. inventories and contents of rooms.
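One way to picture strong versus weak links is with Python's standard `weakref` module. The `Node` class and its method names below are hypothetical; the only point is that the inventory keeps its contents alive while upward and sideways links do not.

```python
import weakref


class Node:
    """Hypothetical game object: strong links down, weak links up and sideways."""

    def __init__(self, name):
        self.name = name
        self.contents = []   # inventory: strong links that keep children alive
        self._parent = None  # upward link, held weakly
        self._links = {}     # sideways links by label, held weakly

    @property
    def parent(self):
        return self._parent() if self._parent is not None else None

    def add(self, child):
        # Enforce the tree constraint: a node has at most one parent.
        old = child.parent
        if old is not None:
            old.contents.remove(child)
        self.contents.append(child)
        child._parent = weakref.ref(self)

    def link(self, label, target):
        self._links[label] = weakref.ref(target)

    def follow(self, label):
        # Returns None if the link was never made or the target no longer exists.
        ref = self._links.get(label)
        return ref() if ref is not None else None

    def walk(self):
        # Because this is a tree, traversal needs no visited set.
        yield self
        for child in self.contents:
            yield from child.walk()
```

If a pet is removed from its room and nothing else holds it strongly, a player's `follow("pet")` starts returning `None` instead of keeping a dead object around.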
The last piece of this is the routing logic. I have identified two crucial things that messages can do. Let's break this into small pieces: what happens when a message is received by an entity and how messages get to entities in the first place.
When a message arrives at an entity, all logical components receive it in strict priority order. Any component may modify or cancel processing. By defining an order, we allow for spell effects like immortality (cancel all damage messages) or armor (lower number in the damage message). I'm going to leave a discussion of message order and standard messages for the next and much less formal post: they are implementation issues and not general overview material.
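As a rough sketch of priority-ordered handling, assuming that lower priority numbers run first and that cancelling is a flag on the message (both are my assumptions, as are all the class names):

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str
    payload: dict = field(default_factory=dict)
    cancelled: bool = False


class Entity:
    def __init__(self):
        self.data = {}
        self._logic = []  # (priority, component) pairs

    def attach(self, priority, component):
        self._logic.append((priority, component))
        # Assumption: lower numbers run first; ties keep attachment order.
        self._logic.sort(key=lambda pair: pair[0])

    def receive(self, message):
        for _, component in self._logic:
            component.handle(self, message)
            if message.cancelled:
                break  # later components never see a cancelled message
        return message


class Immortality:
    def handle(self, owner, message):
        if message.kind == "damage":
            message.cancelled = True


class Armor:
    def __init__(self, reduction):
        self.reduction = reduction

    def handle(self, owner, message):
        if message.kind == "damage":
            reduced = message.payload["amount"] - self.reduction
            message.payload["amount"] = max(0, reduced)


class Health:
    def handle(self, owner, message):
        if message.kind == "damage":
            owner.data["hp"] -= message.payload["amount"]
```

With `Armor(3)` at priority 10 and `Health` at priority 100, a 10-point damage message costs 7 hit points; attaching `Immortality` at priority 0 stops damage before either of them sees it.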
A message can get to entities in two ways. The first is simple destination broadcasting: go here and be processed.
The second is a very strange form of pubsub. I call it reflection, but it may have a proper name somewhere. A message sent on behalf of an entity using reflection has two stages to the routing algorithm and an additional condition (the reflection condition) that the message must be flagged with.
In the first stage, the message bubbles up from child to parent and is flagged with a bubbling flag. It is shown to components as usual, though very few should respond. Examples of components that might respond in this stage include bags which may wish to muffle their contents or rooms which wish to enforce no combat. Should a message be canceled in this stage, processing stops completely. After it is shown to an entity but before it bubbles up again, we check the reflection condition. Should it succeed we move on to the second stage of processing.
The purpose of the second stage is to let every object in an area of interest know about an event. The reflection condition defines just how wide of an area this is: a room, an area, the whole game, or something really nonstandard. The entity at which the condition is met is the first to receive the message, this time with a reflecting marker. The message is then passed to its immediate children, the immediate children of those children, and so on. Canceling also takes on a different meaning in this stage of processing: a canceled message only terminates processing for all children of the canceling entity, no more. It is worth noting that the message must be copied uniquely for all children; if not, player a's armor might protect player b, or some such nonsense.
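Here is a minimal sketch of both stages of reflection. It rests on several assumptions of mine: bubbling starts at the sending entity itself, the reflection condition is a plain predicate on a node, handlers are simple callables, and a copy handed to a child starts from whatever its parent left in the message. None of the names come from an existing implementation.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Message:
    kind: str
    payload: dict = field(default_factory=dict)
    phase: str = "bubbling"  # "bubbling" going up, "reflecting" going down
    cancelled: bool = False


class Node:
    def __init__(self, name, kind="thing"):
        self.name = name
        self.kind = kind
        self.parent = None
        self.children = []
        self.logic = []  # callables taking (node, message), already in priority order

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def show(self, message):
        for handler in self.logic:
            handler(self, message)
            if message.cancelled:
                break


def reflect(source, message, condition):
    """Bubble up from source, then reflect down from the first node meeting condition."""
    node = source
    while node is not None:
        message.phase = "bubbling"
        node.show(message)
        if message.cancelled:
            return False  # cancelled while bubbling: processing stops completely
        if condition(node):
            _reflect_down(node, message)
            return True
        node = node.parent
    return False  # passed the root without ever meeting the condition


def _reflect_down(node, message):
    # Every recipient gets its own copy, so one entity's changes never leak to siblings.
    local = replace(message, payload=dict(message.payload), phase="reflecting")
    node.show(local)
    if local.cancelled:
        return  # cancelled while reflecting: only this node's subtree is skipped
    for child in node.children:
        _reflect_down(child, local)
```

A damage message sent from a player with a "is this node a room?" condition is seen by the player and the room while bubbling, then by the room and everything inside it while reflecting. Entities in other rooms never see it.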
Here's the final controversial statement I want to make in this post. Most messages should use reflection, optionally tagging themselves with information on which object they specifically wish to affect. Any object that's not interested can just drop it. But if an enterprising content creator wants to do something atypical with AI, they can. Modifying the server to use reflection for an operation is not necessarily trivial, but going the other way is.
The first option, direct delivery, exists for messages that need to go "too far". Any message that needs to reflect at a level higher than the room will incur large performance costs, but we still might want to communicate with the teleportation daemon or the trading system. Also, some systems, like physics, exist outside the tree and will wish to broadcast to things inside it. God help any message that needs to reflect through the root node, because it's going to hit at least 50 thousand entities (but hey, we can do game-wide damage messages if we really want). So we send such messages directly to their destination and avoid the problem.
And locking? Implement it by making a component that captures and cancels messages, and then resend them later when the lock is released.
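A lock of that kind might look something like the sketch below. I am assuming the lock sits first in the entity's component order and that replaying held messages in arrival order is acceptable; `Lock`, `Purse`, and the rest are hypothetical names.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Message:
    kind: str
    payload: dict = field(default_factory=dict)
    cancelled: bool = False


class Entity:
    def __init__(self):
        self.data = {}
        self.logic = []  # assumed to already be in priority order

    def receive(self, message):
        for component in self.logic:
            component.handle(self, message)
            if message.cancelled:
                break


class Lock:
    """While locked, capture every message and cancel it; replay them on release."""

    def __init__(self):
        self.locked = False
        self._held = []

    def handle(self, owner, message):
        if self.locked:
            # Keep an uncancelled copy for later, then stop normal processing.
            self._held.append(replace(message, payload=dict(message.payload)))
            message.cancelled = True

    def release(self, owner):
        self.locked = False
        held, self._held = self._held, []
        for message in held:
            owner.receive(message)  # replayed in arrival order


class Purse:
    def handle(self, owner, message):
        if message.kind == "add_gold":
            owner.data["gold"] += message.payload["amount"]
```

During a trade, for example, the lock would hold back an `add_gold` message until the trade finishes, at which point the purse sees it as if it had just arrived.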
Bringing it all together
The title of this post comes from the fact that I think about this as a shape: a tree of objects with messages climbing up and down it. And it sure seems to buy a lot. Builders have as much freedom as in an ECS, scripting and all sorts of nontraditional stuff just fits, etc. This was the product of about 3 weeks of off and on intensive thought, coupled with a month or so of realization that my previous effort probably isn't going to cut it in production.
This might not either. As I said, actually programming this hasn't happened yet. Part 2 is going to outline some of the problems by examining some estimates of what a game might look like, as well as talk about what's needed. While this looks great in theory, there are some subtle and not so subtle issues with it. The above gives the impression that I'm completely sold on the idea, but the truth is that I'm not.
Figuring out whether something will work for an MMO is difficult because there are two categories of people. The first is the people who talk about how MMOs work. The second is the people who have actually built one and don't want to give away all their commercial secrets. Unfortunately, I fall into the first category, and it also seems to be the only one I can find information from. So before I go preaching how good the above proposal is, I intend to program it. As you shall soon see, this is far from easy and will not be a quick prototype. At all.