The Codepath Combinatoric Explosion
On maintenance of code with a large possibility space, stateful code, caching, the downfall of retained mode APIs, and freeing up programmer resources.
I defined a codepath as “a sequential list of instructions which terminates”. One interesting detail of that definition is that it entirely ignores control flow—there is no way to form loops or conditionals in a codepath (given that definition), because an executor must necessarily execute instructions in sequence.
What does that mean, though, for everyday code we write which includes loops and conditionals, and why does it matter? Why is that a useful definition?
Consider the following code:
FunctionA();
if(A)
{
FunctionB();
}
else
{
FunctionC();
}
FunctionD();
Assume, first, that calls to FunctionA, FunctionB, and FunctionC resolve to a single possible codepath. If this is the case, and a codepath is a sequential list of instructions, then the above C code does not signify one codepath, but two. In one case, it produces a codepath with the following pseudocode:
FunctionA();
// test `A`
FunctionB();
FunctionD();
And in the other case:
FunctionA();
// test `A`
FunctionC();
FunctionD();
But now, imagine that FunctionD did not resolve to a single codepath, but instead had the following definition:
FunctionD()
{
if(X)
{
DoA();
}
else
{
DoB();
}
}
You’ll notice that this too encodes two possible codepaths. Depending on how X evaluates, it will be one or the other.
This set of possibilities compounds with those introduced at FunctionD’s call sites. Substituting FunctionD’s call site in the original code example with the definition of FunctionD gives us:
FunctionA();
if(A)
{
FunctionB();
}
else
{
FunctionC();
}
if(X)
{
DoA();
}
else
{
DoB();
}
Using the same rules, we can see that the above C code encodes four codepaths, rather than just one:
FunctionA(), check A, FunctionB(), check X, DoA()
FunctionA(), check A, FunctionC(), check X, DoA()
FunctionA(), check A, FunctionB(), check X, DoB()
FunctionA(), check A, FunctionC(), check X, DoB()
Each codepath corresponds one-to-one with possible sequences of operations, which are literally executed by a machine executing code. When we bifurcate the structure of our code by introducing branches, we multiply this set of possible operation sequences by the number of possible branches. Take the following code as an example:
void UpdateEntity(Entity *entity)
{
switch(entity->kind)
{
default:{}break;
case EntityKind_Player:{UpdatePlayer(entity);}break;
case EntityKind_Goblin:{UpdateGoblin(entity);}break;
case EntityKind_Skelly:{UpdateSkelly(entity);}break;
case EntityKind_Knight:{UpdateKnight(entity);}break;
}
}
This C function encodes at least five codepaths, multiplied by the product of the number of all possible codepaths produced by each Update* function.
But, notably, it isn’t merely the branching in code which causes additional codepaths—it’s also the variability in input data. The five codepaths encoded by the above C function can run on any possible entity value which we might produce and pass in. So, in reality, there are possibly numerous codepaths here—to know for sure, we’d have to zoom out and look at what entity could be.
Each additional codepath, being one additional possibility for the execution of our code, is one additional case we ought to be prepared for. And what’s clear is that the set of possible codepaths produced by any nontrivial C code is quite large.
But, there’s a problem I’ve glossed over… the addition of one codepath is not, strictly speaking, free.
Immediately after writing code which produces new codepaths, we can say that those codepaths have not been experimentally verified as correct. We haven’t stepped through them in a debugger—we haven’t even made sure they’re working at all. Those codepaths might be chosen from depending on a wide variety of possible preexisting states—to exhaustively test these new codepaths, we’d have to visit each one of those possible preexisting states, and step through each one, and run each one normally, on a variety of possible hardware configurations.
Now, obviously, that is an enormous set of possibilities to test! Nobody does all of that for every new codepath they introduce. Nobody can do all of that. It’s simply not economic to do the full set of possible extra work tasks, which verify that our new code executes appropriately under all circumstances. In fact, for sufficiently complex software, there is something inherently infeasible about doing that—the whole point of some software is to allow for a vast space of possibilities. At some point, this becomes fundamentally untestable.
So, we have two choices. First, we can attempt the smallest possible subset of that extra work—for the sake of maximizing our efficiency—which offers the greatest possible predictive power over what we’d find in doing that full set of extra work. Second, we can shrink the number of codepaths we produce.
Actually, I was wrong—we have three choices. The third is the most popular option, preferred by many software vendors today. Let’s call this the “industry standard”. We can also entirely ignore this problem. We can spawn as many codepaths as we want, assume—or don’t care whether or not—they’ll do anything of value, collect our Monopoly money paychecks, and, I don’t know, pay our Tesla loans, or our $2.5M cardboard box downtown San Francisco mortgage payment. Instead of focusing on this problem at all, we can spend time critiquing the look and feel of code. What is being “conveyed by” the code? How “future proof” is the code? How closely related to the “business logic” is the code? How “beautiful” is the code? I mean, after all, code is art, and so coders are just artists—it doesn’t really matter what the machine does. We can sit all day in self-gratifying code review meetings, go home pretending we produced something, and maybe write a few social media posts about—I don’t know—how corrupt capitalists are, or something like that. Fight the power!
This post won’t focus on that third approach, though—in this instance, I will depart from the industry standard. I realize this is unpopular, but I think it’s important we have a grasp on the other options.
Continuing on, if we had a mathematical function describing the amount of work we did for each new codepath, while maximizing our confidence that each codepath runs correctly, we’d better hope that function does not scale much with the number of codepaths. If that function is O(N), where N is the number of codepaths, or O(N^2), or O(N lg N), or even O(lg N), we’d better hope there’s a massive constant divisor hidden in there, otherwise we’re on the hook for quite a lot of work, as our codebase matures and adapts to new constraints.
What would be ideal is if, first, that work function happened to be much more like O(1) (with a manageable constant scalar), whenever possible—and second, that we shrink N to the degree that it’s possible (or in other words, we don’t introduce codepaths unnecessarily).
Let’s explore what this means practically.
Effective Codepaths vs. Codepaths
In some cases where we introduce several codepaths, we’d like all of our codepaths to reach the same end result, but in order to do so, each must do different computational work, because each codepath will be run in different circumstances.
Take the following example:
// retrieves value associated with `key` - if it does not
// exist inside `table`, then inserts it with initial value
// of `default_val`
Val ValFromKey(Table *table, Key *key, Val default_val);
Key key = ...;
Val val = ValFromKey(table, &key, default_val);
The above is a common usage pattern of hash tables. If we assume that ValFromKey is a single codepath, then it sure looks like the above code is just a single codepath per possible table and default_val!
But digging into any implementation of ValFromKey will tell us that this is definitely not the case:
Val ValFromKey(Table *table, Key *key, Val default_val)
{
// map key -> hash and slot
U64 hash = HashFromKey(key);
U64 slot_index = hash % table->slot_count;
// map key and slot -> node already in the table
Node *existing_node = 0;
for(Node *n = table->slots[slot_index].first; n != 0; n = n->hash_next)
{
if(KeyMatch(&n->key, key))
{
existing_node = n;
break;
}
}
// no valid node? allocate & insert one!
if(existing_node == 0)
{
existing_node = AllocateNode(...);
// insert existing_node into table slot at table->slots[slot_index]...
existing_node->val = default_val;
}
// we can reasonably assume that `existing_node` exists now.
return existing_node->val;
}
That’s not a single codepath—in fact, that’s a lot of codepaths! But what’s interesting is that this code—while resolving to one of many codepaths—is written to produce some of the effects of a single codepath. Each codepath works toward the same end result.
But, no… hold on! There are necessarily low level differences between each codepath in this case, because there are literally different computational things happening. Many measurements of an allocation, insertion, and default initialization within this hash table will likely not match that of just a retrieval. So, we shouldn’t literally call them the same codepath. But, in other ways, these codepaths act the same way—they’ll go from point A to point B, at some level of analysis. Thus, for many purposes, we can treat all of these codepaths as a single “effective codepath”.
If we’re in the position of disregarding the differences between several real codepaths and tweaking these codepaths such that they form a single effective codepath, then the earlier multiplicative effect of adding new codepaths is simplified away.
In other words, for many purposes, we can gloss over the several real codepaths in the following:
Key key = ...;
Val val = ValFromKey(table, &key, default_val);
And consider it as a single effective codepath.
Now, some readers may be thinking—“Well, of course ValFromKey looks like a single codepath, because it’s a single function interface which abstracts over a number of possibilities! There is no way that it could be considered as anything other than a single effective codepath.”
But this perspective is mistaken. Whether or not we can treat a function call as a single effective codepath does not depend on the function’s type signature, because a single type signature does not guarantee a single set of computational effects. Instead, it depends on the function’s cumulative effects—in the set of effects we care about—in all cases. In other words, what—in the set of measurements we care about—can we reliably expect from the function, in all cases? What can the function guarantee us?
Therefore, whether or not we consider a function call as a “single effective codepath” is dependent on circumstances. If my measurement of “the cumulative effects” included, for instance, the number of bytes allocated, then we can’t treat ValFromKey as a single effective codepath. Any individual call to ValFromKey may allocate, or may not allocate. There is a different set of effects for each codepath, and so from any call site, I can’t guarantee anything.
But, for a great number of other measurements—“Do I get a valid value back?”, “What type of value do I get back?”, “Can I use the value without checking?”, “Can I run the function repeatedly and produce the same effect without allocating memory per call?”—the function call does produce the same cumulative effects. In those cases, it is a single effective codepath.
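To make this concrete, here is a minimal, runnable sketch of a ValFromKey-style table. The fixed slot count, integer keys, and linear probing are simplifications of mine, not the article’s chained-node implementation—but the guarantee is the same: every call returns a usable value, whether or not an insert happened under the hood.

```c
#include <stdint.h>

// toy table: integer keys and values, fixed slot count, linear probing.
// a real implementation would chain nodes per slot, as in the article.
#define SLOT_COUNT 64

typedef struct Table Table;
struct Table
{
    int occupied[SLOT_COUNT];
    uint64_t keys[SLOT_COUNT];
    int vals[SLOT_COUNT];
};

static uint64_t HashFromKey(uint64_t key)
{
    // toy mixing function - any reasonable hash works for the sketch
    key ^= key >> 33;
    key *= 0xff51afd7ed558ccdULL;
    key ^= key >> 33;
    return key;
}

// retrieves the value for `key`; if absent, inserts it with `default_val`.
// from every call site this is one "effective codepath": the caller always
// gets a usable value back, and never has to check which branch ran.
int ValFromKey(Table *table, uint64_t key, int default_val)
{
    uint64_t slot = HashFromKey(key) % SLOT_COUNT;
    for(;;)
    {
        if(table->occupied[slot] && table->keys[slot] == key)
        {
            // existing entry: plain retrieval
            return table->vals[slot];
        }
        if(!table->occupied[slot])
        {
            // no entry: allocate/insert - invisible to the caller
            table->occupied[slot] = 1;
            table->keys[slot] = key;
            table->vals[slot] = default_val;
            return default_val;
        }
        // collision: probe the next slot (toy table never fills up here)
        slot = (slot + 1) % SLOT_COUNT;
    }
}
```

From the call site, both the inserting and retrieving codepaths produce the same cumulative effect in the measurements we usually care about: a valid value comes back, usable without checking.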
Questions vs. Answers
Another way to tackle this problem of per-new-codepath-work is to consider whether code introduces questions, and whether code introduces answers. Is code introducing entropy? Or is it introducing order?
One way of looking at code is to consider each point in a scope as having a single type, which is comprised of all of the types which have been introduced into the scope.
{
int x = 0; // struct { int }
float y = 123.45f; // struct { int, float }
int z = 56; // struct { int, float, int }
Foo *foo = GetFoo(); // struct { int, float, int, Foo* }
// ...
}
Only considering the type system, we can ask the question at each point in the scope—“does the scope contain type X with label Y?”
A declaration provides an answer to that question. In the above example, after the first declaration, if I ask “does the scope contain an int x?”, the declaration “answers my question”, and replies “yes”. That is an answer I can rely on.
But a scope’s type at any point does not comprehensively represent the set of things we’d like to rely on in that scope. I can write statically typed code which guarantees many things about what types a scope receives, yet still guarantees almost nothing of value.
Consider the following:
enum ObjectKind
{
ObjectKind_Foo,
ObjectKind_Bar,
ObjectKind_Baz,
};
struct Object
{
ObjectKind kind;
union
{
Foo foo;
Bar bar;
Baz baz;
};
};
void Function(void)
{
Object object = ...;
}
What has been guaranteed at the declaration site of object? In short, only that I have an ObjectKind. That is the only answer I’ve actually been guaranteed, at the time of writing the code, within the type system. Digging further—writing code which relies upon having a Foo, Bar, or Baz—requires my code to “ask more questions”, and figure out the answer myself. This requires me to segment my own code into even more codepaths.
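To illustrate that segmentation, here is a self-contained restatement of the Object type with stub payloads, plus a hypothetical helper of mine (not from the article) that wants to do something with the payload—and therefore has to branch on kind itself:

```c
// stub payload types - stand-ins for whatever Foo/Bar/Baz really are
typedef struct { int foo_data; } Foo;
typedef struct { float bar_data; } Bar;
typedef struct { char baz_data; } Baz;

typedef enum ObjectKind
{
    ObjectKind_Foo,
    ObjectKind_Bar,
    ObjectKind_Baz,
} ObjectKind;

typedef struct Object
{
    ObjectKind kind;
    union
    {
        Foo foo;
        Bar bar;
        Baz baz;
    };
} Object;

// hypothetical helper: to rely on having a Foo, Bar, or Baz, this code
// must "ask the question" itself - one new codepath per kind
int PayloadSizeFromObject(Object *object)
{
    int result = 0;
    switch(object->kind)
    {
        case ObjectKind_Foo:{result = (int)sizeof(object->foo);}break;
        case ObjectKind_Bar:{result = (int)sizeof(object->bar);}break;
        case ObjectKind_Baz:{result = (int)sizeof(object->baz);}break;
    }
    return result;
}
```

Every consumer of Object that needs the payload repeats some version of this switch, multiplying codepaths at each usage site.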
But discriminated unions, like Object above, are merely one example of a more general concept: sum types. And the lack of certainty they provide is merely one example of a more general phenomenon: even if some code passes me a fully guaranteed type, it might make no guarantees about what I can do with that type—or, in other words, what values the instance of that type may contain.
Consider the following:
struct Node
{
Node *first;
Node *last;
Node *next;
int value;
// more stuff
};
Node *ChildFromValue(Node *parent, int value)
{
Node *result = 0;
for(Node *node = parent->first; node != 0; node = node->next)
{
if(value == node->value)
{
result = node;
break;
}
}
return result;
}
Node *SearchTreeForInterestingChain(Node *root)
{
Node *n1 = ChildFromValue(root, 1);
Node *n2 = ChildFromValue(n1, 2);
Node *n3 = ChildFromValue(n2, 3);
Node *n4 = ChildFromValue(n3, 4);
return n4;
}
Hidden in every return from ChildFromValue is an implicit sum type. Why? Because the result can either encode “there is no such child—0”, or “the pointer to the child with the target value”. Either result is packaged up into the same type—Node*. But the guarantees about that result are different in either case.
In the case of the child with a value matching the target value, I get a pointer back to some memory which—if the Node tree is well-formed—has been allocated by my process. I can dereference that Node* to get a Node, and I can look at its value, and presumably the other useful things inside of it.
In the case of an invalid child, I get a pointer to some address space—at address 0—which I have never allocated. I cannot dereference it (if I want my program to continue operating after the fact). I cannot look at the result’s value, and I cannot use it to get to a Node, and look at the other information within the Node.
And—if you were paying attention—you’ll notice that because of this fact, there’s a bug! If n1, n2, or n3 in SearchTreeForInterestingChain are not found, then I’ll immediately dereference a null pointer in the next subsequent call to ChildFromValue, when I attempt to access parent->first.
So, the “correct” implementation of SearchTreeForInterestingChain—which gracefully fails and continues operating even when a specific value is not found—looks like this:
Node *SearchTreeForInterestingChain(Node *root)
{
Node *result = 0;
if(root)
{
Node *n1 = ChildFromValue(root, 1);
if(n1)
{
Node *n2 = ChildFromValue(n1, 2);
if(n2)
{
Node *n3 = ChildFromValue(n2, 3);
if(n3)
{
result = ChildFromValue(n3, 4);
}
}
}
}
return result;
}
Wow, that looks pretty stupid!
And it is pretty stupid. And we should feel pretty stupid when we write it.
But we can prevent the proliferation of more codepaths by considering which guarantees codepaths meet, just as we did in the previous section with ValFromKey.
We can’t change the nature of the problem—some node may literally not have a child with some target value. But we can tweak the several codepaths implicit in ChildFromValue, such that they all provide more guarantees, in all cases. That way, callers of ChildFromValue can rely on more invariants.
Consider the following new definition of ChildFromValue:
static Node nil_node = {&nil_node, &nil_node, &nil_node, 0};
Node *ChildFromValue(Node *parent, int value)
{
Node *result = &nil_node;
for(Node *node = parent->first; node != &nil_node; node = node->next)
{
if(value == node->value)
{
result = node;
break;
}
}
return result;
}
In the above example, ChildFromValue always returns a pointer to some valid address space within our process—we simply reserve a “nil node” for invalid cases. As a result, callers of ChildFromValue can now depend on more in all cases. Or, in other words, for a larger set of scenarios, it has become a single effective codepath, rather than several.
Sure enough, SearchTreeForInterestingChain cleans right up, and we don’t have a subtle null pointer dereference to catch:
Node *SearchTreeForInterestingChain(Node *root)
{
Node *n1 = ChildFromValue(root, 1);
Node *n2 = ChildFromValue(n1, 2);
Node *n3 = ChildFromValue(n2, 3);
Node *n4 = ChildFromValue(n3, 4);
// we will necessarily get here - and we're also
// guaranteeing for all of *our* callers that they
// can dereference this result, even if it's 'invalid'
return n4;
}
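The pattern can be exercised directly. Below is a self-contained restatement of the nil-node version of ChildFromValue (the hand-rolled tree construction is purely for illustration): a missing child yields &nil_node, and chaining lookups off a nil result is safe.

```c
typedef struct Node Node;
struct Node
{
    Node *first;
    Node *last;
    Node *next;
    int value;
};

// the nil node points at itself, so traversals of it terminate immediately
static Node nil_node = {&nil_node, &nil_node, &nil_node, 0};

Node *ChildFromValue(Node *parent, int value)
{
    Node *result = &nil_node;
    for(Node *node = parent->first; node != &nil_node; node = node->next)
    {
        if(value == node->value)
        {
            result = node;
            break;
        }
    }
    return result;
}
```

Because &nil_node is always dereferenceable, a failed lookup in the middle of a chain degrades gracefully instead of crashing—every subsequent call just returns &nil_node again.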
Globally Stateful Code
A single “effective codepath” arising from several “real codepaths”—due to a larger number of reliable guarantees, or answers, in all cases—extends to many other scenarios.
As everyone is well aware, computers transform data. There is some input data, code executes, and it produces some output data. When the input data is stored in the same location as the output data, the code is considered “stateful”—it is mutating data, rather than producing a fresh new result.
In general—waves hands around vaguely—I’d say stateful code is more subtle, can be more confusing, and can be less flexible. (Don’t worry, I’m not going to just assert that and move on…) Code written in single static assignment form—meaning one declaration correlates with one named artifact of computation (mutation is disallowed)—provides more guarantees, more answers, at each point than code which is not.
But, nevertheless, stateful code is often quite a lot more useful, more powerful, more succinct, and in many cases, necessary. And when people disregard this fact—and try to board the “never mutate anything” train—they get themselves into some pretty ridiculous situations.
Stateful code, just like all code, gives rise to one—or many—codepaths. Logical bugs can arise if stateful codepaths are written expecting some input data configuration, and yet still run despite quite unexpected input data configurations.
The most serious manifestation of this problem is often not the case with locally stateful code—where one reasonably small, localized codepath mutates the same data—and is more often the case with globally stateful code—where many decentralized codepaths, written by various people (or the same person at various times), are responsible for coordinating in some way to mutate some data appropriately.
Consider the following concrete example:
struct Entity
{
B32 is_active; // determines whether or not an entity slot is free
Vec3 position;
F32 radius;
Entity *carrier_entity; // entity which is carrying this entity
};
Let’s imagine a video game where entities can pick things up. If an entity is being carried by another entity, it implies some extra behavioral characteristics—first of all, it shouldn’t be able to move on its own, it needs to be drawn relative to whatever entity is carrying it, and so on.
Let’s say, now, that a goblin picks up a jar. To encode this, we’ll write a pointer to the goblin entity into the jar entity:
Entity *goblin = ...;
Entity *jar = ...;
jar->carrier_entity = goblin;
But out of nowhere, the player 360-no-scopes the goblin with an arrow. The goblin dies. The game code marks the goblin’s entity memory as free, by setting its is_active member to 0.
goblin->is_active = 0;
Subsequently, elsewhere in the level, a chicken spawner’s timer elapses, and needs to spawn a chicken entity. So it locates the first non-active entity slot, and initializes that entity slot for the newly-spawned chicken entity. This entity slot happens to be the same slot which was used for the goblin who was 360-no-scope-headshotted by the player.
And we’ve got a problem! The jar entity is now pointing at the newly spawned chicken, despite the chicken never picking the jar up.
In this case, our code did perhaps reasonable work when considered in isolation. But through a long chain of events, stateful code which we wrote expecting some input data configuration—reading from a carrier_entity member blindly—made logically incorrect decisions.
The entire set of problems which arise like this is difficult to summarize, and impossible to put a lid on, because every such issue with stateful code is its own complex drama. One codepath writes somewhere, an entirely unrelated codepath writes somewhere else, a third codepath reads from the first location and uses it to write to the second, so on and so on… Just outlining the set of codepaths which collectively cause the problem requires a lot of information.
But every instance is a matter of one codepath making assumptions which don’t hold up to reality. Somewhere, some codepath violated some promise! The answers some code was written to assume were not correct.
In this example, there was an implicit promise that code could always grab a carrier_entity and assume it was reasonable. This promise was broken by the code which marked entity slots as invalid without a story for how state pointing at those slots would adjust, or a story for how code which read from those slots would learn that their pointer was no longer valid.
In short, this mistake was caused by the assumption of a single effective codepath (for some set of measurements—in this case, logical gameplay correctness) where, in reality, there were many effective codepaths.
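The slot-reuse bug can be reproduced in a few lines. The fixed-size entity array and the AllocateEntity helper below are illustrative simplifications of mine, not the article’s engine code:

```c
typedef struct Entity Entity;
struct Entity
{
    int is_active;           // determines whether an entity slot is free
    Entity *carrier_entity;  // entity which is carrying this entity
};

#define MAX_ENTITIES 8
static Entity entities[MAX_ENTITIES];

// illustrative allocator: return the first free slot, marked active
Entity *AllocateEntity(void)
{
    for(int i = 0; i < MAX_ENTITIES; i += 1)
    {
        if(!entities[i].is_active)
        {
            entities[i].is_active = 1;
            entities[i].carrier_entity = 0;
            return &entities[i];
        }
    }
    return 0;
}
```

Allocate a goblin and a jar, point the jar at the goblin, free the goblin’s slot, and spawn a chicken: the chicken lands in the goblin’s old slot, and the jar’s raw pointer now silently refers to the chicken.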
A graceful solution to this problem in particular is “generational handles”:
struct Handle
{
Entity *entity;
U64 generation;
};
struct Entity
{
U64 generation;
// ...
Handle carrier_entity;
};
Handle HandleFromEntity(Entity *entity);
// returns a null - or nil - entity if the
// generation number has changed!
Entity *EntityFromHandle(Handle handle);
The rule with generational handles is: never (or only in very rare scenarios) store pointers to entities. In all other cases, store the handle produced by the entity. This handle is comprised of both an index, or pointer, and a generation number. Each entity slot also has a generation number, which is used to produce handles. Whenever handles to an entity slot should break—the entity is released, or reallocated for some other purpose—its generation number is incremented. When the entity pointer is to be retrieved, before returning a valid pointer, check the handle’s generation number against the entity slot’s generation number.
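A minimal sketch of those functions might look like the following. The ReleaseEntity helper and the simplified Entity layout are my assumptions for illustration; the article’s real entity struct carries much more:

```c
typedef unsigned long long U64;

typedef struct Entity Entity;
struct Entity
{
    U64 generation; // bumped whenever this slot's handles should break
    int value;      // stand-in for real entity data
};

typedef struct Handle
{
    Entity *entity;
    U64 generation; // snapshot of the slot's generation at handle creation
} Handle;

Handle HandleFromEntity(Entity *entity)
{
    Handle handle = {entity, entity->generation};
    return handle;
}

// returns 0 if the slot has been released/reused since the handle was made
Entity *EntityFromHandle(Handle handle)
{
    Entity *result = 0;
    if(handle.entity != 0 && handle.entity->generation == handle.generation)
    {
        result = handle.entity;
    }
    return result;
}

// releasing a slot: increment the generation, so outstanding handles go stale
void ReleaseEntity(Entity *entity)
{
    entity->generation += 1;
}
```

A caller resolving a handle gets either the live entity or a detectable “gone” result—never a silently-recycled stranger. (This could also return a nil entity rather than 0, combining with the pattern from the previous section.)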
Generational handles—much like the “nil node” example in the previous section—provide additional guarantees in all cases. As such, they allow for the assumption of a single effective codepath in a larger number of scenarios.
Writing stateful code carefully to avoid such issues is worth a hundred posts on its own, and it’s a subject well beyond the scope of this post.
But there’s an aspect of this problem which is quite relevant: codepaths are not produced by code alone, but rather by code and a particular input data configuration.
Recall the example I opened with:
FunctionA();
if(A)
{
FunctionB();
}
else
{
FunctionC();
}
FunctionD();
In this case, the codepath which executes at the end of the day is decided upon not by some detail in the code, but by what A ends up being.
Now, consider this slightly different example:
FunctionType *FunctionTable[] =
{
FunctionB,
FunctionC,
};
FunctionA();
FunctionTable[!A]();
FunctionD();
Hey, look, all the ifs are gone! We got rid of the two codepaths! Problem solved! Pack it up, everyone, it’s a wrap, time to go home!
…But no—we have not reduced the number of codepaths. Codepaths, by definition, exist for each possibility of code execution.
And now it becomes clearer why I stated globally stateful code is more subtle and, in some ways, more difficult. When stateful code mutates state which is used in other stateful code, it can make a decision about which codepath will actually execute in that other stateful code. That can become very difficult to follow, and when looking at any individual stateful code snippet, it’s difficult to know just exactly what input data configuration you’re working with.
Caching
Given the added caveats inherent in stateful code (and especially globally stateful code), and how it contributes to the unpredictable explosion of effective codepaths, I’d like to examine a common justification for globally stateful code which manipulates complex, shared data structures.
Consider, for a moment, assets in a game or application, like textures, sounds, or 3D geometry data.
You’ve probably seen code using such assets which looks something like this:
Texture *texture = LoadTexture(...);
Sound *sound = LoadSound(...);
for(B32 should_quit = 0; !should_quit;)
{
// ...
PlaySound(sound, ...);
// ...
DrawSprite(texture, ...);
// ...
}
So far, there is not any globally stateful code at this layer related to our assets—our initialization phase prepares our assets, and our main loop uses them.
But what if we wanted, for instance, asset hot reloading? Being able to modify textures or sounds and immediately see the changes reflected in the application is quite useful for the purposes of reducing iteration time.
With that added constraint, we introduce globally stateful code:
Texture *texture = LoadTexture(...);
Sound *sound = LoadSound(...);
for(B32 should_quit = 0; !should_quit;)
{
// ...
if(TextureNeedsReload(texture))
{
UnloadTexture(texture);
texture = LoadTexture(...);
}
if(SoundNeedsReload(sound))
{
UnloadSound(sound);
sound = LoadSound(...);
}
// ...
PlaySound(sound, ...);
// ...
DrawSprite(texture, ...);
// ...
}
But now I’ll stop and ask the question—why didn’t we do this?
for(B32 should_quit = 0; !should_quit;)
{
Texture *texture = LoadTexture(...);
Sound *sound = LoadSound(...);
// ...
PlaySound(sound, ...);
// ...
DrawSprite(texture, ...);
// ...
}
This seems quite a bit simpler—every frame we can’t assume we need the same texture as the last frame, so we just reload it!
Surely alarm bells are ringing in your head. Maybe that works for trivial asset loads—but it doesn’t take much before that becomes prohibitively expensive. We’re loading the entire asset into memory from disk, in some cases mirroring some CPU memory in GPU memory, and so on—the work is definitely not free! There was a reason for the initialization phase.
But these are not our only options.
Let’s rephrase LoadTexture and LoadSound—let’s call them TextureFromKey and SoundFromKey. Instead of micromanaging asset state in various sprawling globally stateful codepaths, the code is free to refer to what is relevant for it—the concept of referring to a specific asset via its unique key:
for(B32 should_quit = 0; !should_quit;)
{
Texture *texture = TextureFromKey(...);
Sound *sound = SoundFromKey(...);
// ...
PlaySound(sound, ...);
// ...
DrawSprite(texture, ...);
// ...
}
And, at this point, there’s not much reason to use pointers to “asset objects” at all. We can just use keys:
for(B32 should_quit = 0; !should_quit;)
{
// ...
PlaySound(some_sound_key, ...);
// ...
DrawSprite(some_texture_key, ...);
// ...
}
And now that I’ve rephrased to TextureFromKey and SoundFromKey, you may notice the similarity to ValFromKey from earlier. From the perspective of our usage code, we aren’t “loading” a texture or sound every frame—we’re just mapping to one from some key, with the help of some underlying layer. This underlying layer could just do the same exact thing we did before—reloading the textures and sounds every time the mapping occurs. But it can easily be transformed into a cache, just as ValFromKey looked up into a “cache” of Vals.
A number of other details become clear with this architecture. How would—for instance—assets be loaded in the background, while a main loop continues operating as usual, and draws something like a loading screen? The heaviest work in maintaining the cache would simply move to other codecycles—the only work that must happen in the codecycle actually using the artifacts from that heaviest work is looking up into the cache.
This can, again, be understood from the perspective of considering our effective codepaths.
It’s easy to see the simplicity inherent in the main loop which simply reloads assets every frame, but its practical limitations make it unusable. Programmers are often taught to introduce carefully-managed stateful data structures in order to cache important artifacts—like data from a loaded asset—and they do so by spending more effective codepaths.
But we can have our cake and eat it too. We didn’t need to manage a complex stateful data structure to do caching, and we didn’t need to generate more effective codepaths than necessary. In other words, our cache did not need to be keyed by pointers to values.
Importantly, the stateful data structure did not go away, just as the hash table did not go away to implement ValFromKey. But it was confined to a slim layer, behind an interface which provides reasonable guarantees in all cases. As such, the number of effective codepaths is decreased for our purposes, and our predictive power over our code’s behavior is much stronger.
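To sketch what that slim layer might look like behind TextureFromKey: the cache below is keyed by plain integers, with a stand-in loader and no eviction or hot-reload checks—all of which are elisions and assumptions of mine, not the article’s API.

```c
// sketch: stateful cache machinery confined behind a stateless-looking call.
// keys are integers here; real keys might be hashed asset paths.
typedef struct Texture { int width, height; } Texture;

#define CACHE_SLOTS 32

typedef struct TextureSlot
{
    int occupied;
    int key;
    Texture texture;
} TextureSlot;

static TextureSlot texture_cache[CACHE_SLOTS];
static int texture_load_count; // exposed only so the sketch is observable

// stand-in for the real disk/GPU load - assumed, not the article's API
static Texture LoadTextureForKey(int key)
{
    texture_load_count += 1;
    Texture t = {64, 64};
    return t;
}

Texture *TextureFromKey(int key)
{
    int slot = (key >= 0 ? key : -key) % CACHE_SLOTS;
    // cache miss (or collision eviction): do the heavy load, fill the slot
    if(!texture_cache[slot].occupied || texture_cache[slot].key != key)
    {
        texture_cache[slot].occupied = 1;
        texture_cache[slot].key = key;
        texture_cache[slot].texture = LoadTextureForKey(key);
    }
    return &texture_cache[slot].texture;
}
```

Usage code calls TextureFromKey every frame as if it were a pure mapping; the heavy load happens only on a miss, so repeated calls with the same key are cheap lookups into the confined stateful layer.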
Retained Mode and Immediate Mode
I’d like to conclude this post by relating the above ideas to “retained mode” and “immediate mode” APIs, which I’ve discussed before in the domain of UI programming.
To be clear, these are not rigorously defined terms—they’re usually used more as terms that you “know when you see them”. But broadly, I’d define an API as “retained mode”, with respect to some concept, when it requires direct user operations to begin, end, and mutate instantiations of that concept. I’d define an API as “immediate mode”, with respect to some concept, when the beginning, ending, and mutation of instantiations of that concept are functionally derived from another layer.
To be concrete, using the example from earlier, LoadTexture and UnloadTexture are retained mode with respect to Texture*s, because they are the explicit user operations which manage a texture’s beginning, ending, and mutation. TextureFromKey, on the other hand, is immediate mode with respect to Texture*s, because when a texture is loaded, when it’s unloaded (e.g. for the purposes of cache eviction to meet memory usage budgets), and when it’s mutated (when it’s detected to have changed on disk, and must reload) are decisions functionally derived from the user operating with an interface which is one layer removed from those decisions.
Given that definition, I’d suggest that usage of a retained mode API with respect to some concept necessarily requires at least the minimum number of effective codepaths to properly create, mutate, and destroy the required instantiations of that concept, whereas an immediate mode API with respect to some concept—using the layer of removal from, for instance, a cache being managed—has more freedom to reduce the number of effective codepaths for users of that API.
When an API becomes about managing stateful lifetimes—especially globally—it introduces a remarkable number of questions for usage code. Is the state being mutated in the expected configuration? How can we tell? Are we violating hidden promises upon which other code relies to operate correctly?
Those extra questions correspond to a larger number of effective codepaths. The cost of these additional effective codepaths is incurred for every usage site, for every user, and for every modification.
Thus, immediate mode APIs offer a remarkably useful tool for eliminating effective codepaths, and thus improving predictive power over code. I’ve reached the conclusion that, broadly, they are vastly preferable. Now, importantly, many immediate mode APIs only work given careful internal machinery—someone needs to manage “the cache”, after all. In some cases—high-level gameplay code, or high-level interface instantiation—the high-level application code itself must manage the cache, because it defines the rules regarding lifetimes and mutation.
Closing Thoughts
Embedded in this post are lessons that I’ve found incredibly useful over the past few years—they have entirely transformed my approach as a programmer, and made me much more comfortable with much simpler tools and with much more sophisticated problems. When I felt I had learned some of the lessons I described in this article, I felt freer—I could spend more of my time doing more useful things, rather than being weighed down by a thick technical-debt-molasses of my own making.
To be frank, I didn’t think many of these lessons were related at all before I sat down to write this post. But, to my surprise, there is a fascinating thread which ties them all together.
I hope this was an interesting and useful post, and provided you similar insight as it did me.
If you enjoyed this post, please consider subscribing. Thanks for reading.
-Ryan