In my UI series, I wrote about my preference for an “immediate mode” interface for building UI. This preference came from years of attempting to build UI from scratch in various ways. When I’d use a retained mode interface, it would always feel like I was pushing sand around.
When I discovered the immediate mode approach to UI—similar in spirit to the immediate mode approach to rendering—it felt like a breath of fresh air. My UIs became not only easier to build—which is in itself not enough justification for the preference, although still important in improving iteration time—but also more robust, more easily made dynamic, and more naturally reactive to state changes in my programs.
There is a fair amount of rhetoric that suggests immediate mode interfaces are less capable than retained mode interfaces. The most common arguments are: (a) there are caching opportunities that immediate mode interfaces cannot take advantage of; (b) immediate mode interfaces must update all state for all changes, thus causing worse performance and higher battery usage; (c) immediate mode interfaces prohibit the use of user interfaces as a data structure, thus prohibiting integration with operating system accessibility features, keyboard navigation, autolayout, and so on.
Statements (a), (b), and (c) are all false. The embedded links provide compelling counterarguments to each. It is a bit tiresome that these arguments are still continuously made—asserted—considering the lack of investigation and thought behind them.
But despite that, there is still a point at which immediate mode interfaces lose their utility, and where retained mode interfaces become preferable, if not required.
As I’ve written about before, one place where this rings true is in controlling higher level, user-controlled “interface instantiation entities”—think windows, tabs, panels, and so on. In other words, things that the user might explicitly create with a “+” button, and destroy with an “x” button.
But the same is true, for example, in the implementation of an immediate mode interface with persistent “key-based” caching behavior. The implementation of the cache cannot be immediate mode—the cache is implemented with “retained mode pieces”, they are just localized to the implementation of the immediate mode interface.
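As a rough illustration of that point, here is a minimal sketch in C of a key-based cache sitting behind an immediate mode call. Every name in it (`WidgetCache`, `widget_state_from_key`, `cache_end_frame`, `hot_t`) is hypothetical and not taken from any real library. Usage code only passes a string key each frame; the implementation retains per-key state in a small fixed-size table and prunes any key that was not touched during the frame.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 64

// Retained state for one widget, identified by a hash of its string key.
typedef struct WidgetState {
    uint64_t key;        // 0 marks an empty slot
    uint64_t last_frame; // last frame in which usage code touched this key
    float    hot_t;      // example of persistent data (e.g. a hover animation)
} WidgetState;

typedef struct WidgetCache {
    WidgetState slots[CACHE_SLOTS];
    uint64_t    frame;
} WidgetCache;

// FNV-1a hash; remapped so it never returns 0 (0 is reserved for empty slots).
static uint64_t hash_key(const char *s) {
    uint64_t h = 14695981039346656037ull;
    for (; *s; s++) { h ^= (unsigned char)*s; h *= 1099511628211ull; }
    return h ? h : 1;
}

// Immediate mode entry point: find, or lazily create, the state for a key.
// The whole table is scanned so that pruned slots never hide a live entry.
static WidgetState *widget_state_from_key(WidgetCache *c, const char *key_str) {
    uint64_t key = hash_key(key_str);
    WidgetState *free_slot = NULL;
    for (uint64_t i = 0; i < CACHE_SLOTS; i++) {
        WidgetState *s = &c->slots[(key + i) % CACHE_SLOTS];
        if (s->key == key) { s->last_frame = c->frame; return s; }
        if (s->key == 0 && free_slot == NULL) { free_slot = s; }
    }
    assert(free_slot != NULL); // sketch only: a full table is not handled
    memset(free_slot, 0, sizeof *free_slot);
    free_slot->key = key;
    free_slot->last_frame = c->frame;
    return free_slot;
}

// Called once per frame, after usage code has run: drop untouched keys.
static void cache_end_frame(WidgetCache *c) {
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (c->slots[i].key != 0 && c->slots[i].last_frame != c->frame) {
            c->slots[i].key = 0;
        }
    }
    c->frame++;
}
```

The create and destroy mutations still exist here, but they live inside the implementation. Usage code never calls an explicit "create" or "destroy"; state appears when a key is first mentioned and disappears when the key stops being mentioned.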
Another example would be state for entities in a game level. In a usual game scenario, entities are dynamically created and destroyed depending on various gameplay conditions. Usage code of an immediate mode interface, however, is static, as code is immutable:
// every frame:
Entity("player", player_sprite, player_pos, ...);
Entity("goblin", goblin_sprite, goblin_pos, ...);
Entity("chest", chest_sprite, chest_pos, ...);
In the above example, let’s say that the gameplay involves the player defeating the goblin, in order to obtain valuables from the chest.
There are countless possible combinations of state in this scenario, accounting for all the various entity positions, entity health levels, and so on. But one important possibility is: is the goblin alive, or is the goblin dead?
Imagine that you need to encode this possibility into usage of the “immediate mode entity interface”, used above as Entity. You’d need something like the following:
// every frame:
Entity("player", player_sprite, player_pos, ...);
if(is_goblin_alive)
{
  Entity("goblin", goblin_sprite, goblin_pos, ...);
}
Entity("chest", chest_sprite, chest_pos, ...);
And what is is_goblin_alive? It’s usage-code-side state. And usage-code-side state requires a usage-code-side state machine. This state machine requires retained-mode-like mutations.
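To show what such a usage-code-side state machine might look like, here is a small C sketch. It rests on assumptions: `GameState`, `goblin_health`, `damage_goblin`, and `build_frame` are hypothetical names, and `Entity` is reduced to a stand-in that takes only a name and counts calls, rather than the sprite and position arguments used above.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical usage-code-side state that persists across frames.
typedef struct GameState {
    int  goblin_health;
    bool is_goblin_alive;
} GameState;

// Retained-mode-like mutation: an event changes state that outlives the frame.
static void damage_goblin(GameState *g, int amount) {
    assert(amount >= 0);
    if (!g->is_goblin_alive) { return; }
    g->goblin_health -= amount;
    if (g->goblin_health <= 0) {
        g->goblin_health = 0;
        g->is_goblin_alive = false; // transition: alive -> dead
    }
}

// Simplified stand-in for the immediate mode Entity(...) call; it only counts.
static int entities_this_frame;
static void Entity(const char *name) { (void)name; entities_this_frame++; }

// Every frame: describe the entities as a function of the current state.
// Returns how many entities were described this frame.
static int build_frame(const GameState *g) {
    entities_this_frame = 0;
    Entity("player");
    if (g->is_goblin_alive) {
        Entity("goblin");
    }
    Entity("chest");
    return entities_this_frame;
}
```

The per-frame code in `build_frame` never changes. Only `damage_goblin`, the state machine, mutates anything, and the set of entities follows from that state on the next frame.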
In other words, it’s impossible to escape state machines, or “retained mode interfaces”. They are exactly the correct choice in many places.
But “many” is not “all”. Immediate mode interfaces become preferable in some places because introducing yet another state machine there is a burden on usage code, rather than a source of necessary functionality.
Thinking about this led me to a useful analogy—is the system I’m writing upstream? Or is it downstream?
Downstream of what? Downstream of state machines.
Upstream systems are shallow layers in a call stack. They need control over state, and they want to describe how a multitude of computational effects flow from that state.
Downstream systems are deeper layers in a call stack. They are called into to produce computational effects from user-provided state. Their purpose is to organize the particulars of how some system functionally derives from another.
When a downstream system introduces new state machines for usage code, and those state machines merely mirror an upstream state machine, the result is additional cruft, additional busywork for usage code, more opportunities for bugs, worse iteration time, and a substantially worse programming experience. This is what happens when a “retained mode system” is inappropriately introduced.
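The mirroring problem can be sketched in C as follows. This is an illustration under assumptions, not a real API: `Scene`, `scene_add`, `scene_remove`, `scene_count`, `Game`, and `set_goblin_alive` are all hypothetical. The downstream system is retained mode, so usage code has to carry a sprite handle whose only job is to shadow `is_goblin_alive`.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SPRITES 16

// Hypothetical retained mode downstream system: it owns the live sprites,
// and usage code must add and remove them by handle.
typedef struct Scene {
    bool live[MAX_SPRITES];
} Scene;

static int scene_add(Scene *s) {
    for (int i = 0; i < MAX_SPRITES; i++) {
        if (!s->live[i]) { s->live[i] = true; return i; }
    }
    return -1; // scene is full
}

static void scene_remove(Scene *s, int handle) {
    assert(handle >= 0 && handle < MAX_SPRITES);
    s->live[handle] = false;
}

static int scene_count(const Scene *s) {
    int n = 0;
    for (int i = 0; i < MAX_SPRITES; i++) { n += s->live[i] ? 1 : 0; }
    return n;
}

// Upstream state, plus bookkeeping that exists only to mirror it downstream.
typedef struct Game {
    bool is_goblin_alive;
    int  goblin_sprite_handle; // mirror state: -1 when no sprite exists
} Game;

// Every upstream transition must also remember to update the mirror.
static void set_goblin_alive(Game *g, Scene *s, bool alive) {
    if (alive && !g->is_goblin_alive) {
        g->goblin_sprite_handle = scene_add(s);
    } else if (!alive && g->is_goblin_alive) {
        scene_remove(s, g->goblin_sprite_handle);
        g->goblin_sprite_handle = -1;
    }
    g->is_goblin_alive = alive;
}
```

If any code path writes `is_goblin_alive` directly instead of going through `set_goblin_alive`, the scene silently falls out of sync with the game state. That is the kind of busywork and bug surface the paragraph above describes.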
One important reason is that there are far more (exponentially more) downstream system entry points than upstream ones, because every entry point can itself call into N other entry points. Thus, introducing usage-code-controlled state machines unnecessarily in downstream systems explodes the number of possibilities the usage code author must actually be concerned about.
This explains why, over time, it has become obvious that, for rendering, immediate mode interfaces are dramatically better than retained mode interfaces. This was not always the case, but this model makes it clear why—on-screen artifacts produced by rendering are exactly that: artifacts. They’re a function of some invisible state—the entire purpose of rendering is to take some invisible state in the machine, and turn it into something visible.
UI is only one step further—instead of merely visual artifacts as “output” of a system, it is also concerned with user inputs as “input” to a system. But it still, nevertheless, remains largely downstream of that system.
But using immediate mode interfaces to specify “downstream effects” is a technique far from localized to rendering and UI.
And when truly downstream systems are correctly organized as such—providing an immediate mode interface to usage code—it becomes trivial to dynamically produce a multitude of possible downstream effects, with few upstream state changes. Every upstream state change becomes much more powerful and meaningful—it costs little code, little computation, and little work to produce a new world of effects. There is no need to inform every downstream system of the state change.
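A small C sketch of this idea, under stated assumptions: the three downstream systems (`draw_sprite`, `play_loop`, `minimap_marker`) and the `FrameOutput` counters are hypothetical stand-ins that only count calls, and `run_frame` is an invented name. The point is that a single upstream flag changes what all three systems produce, without any of them being notified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdbool.h>

// The only upstream state in this sketch.
typedef struct GameState {
    bool is_goblin_alive;
} GameState;

// What three hypothetical downstream systems produced this frame.
typedef struct FrameOutput {
    int sprites_drawn;
    int sounds_playing;
    int minimap_markers;
} FrameOutput;

// Hypothetical immediate mode entry points: each call describes one effect
// for the current frame only. Here they just count calls.
static void draw_sprite(FrameOutput *o, const char *name)    { (void)name; o->sprites_drawn++; }
static void play_loop(FrameOutput *o, const char *name)      { (void)name; o->sounds_playing++; }
static void minimap_marker(FrameOutput *o, const char *name) { (void)name; o->minimap_markers++; }

// Every frame: all downstream effects are re-derived from upstream state.
static FrameOutput run_frame(const GameState *g) {
    assert(g != NULL);
    FrameOutput o = {0};
    draw_sprite(&o, "player");
    minimap_marker(&o, "player");
    if (g->is_goblin_alive) {
        draw_sprite(&o, "goblin");
        play_loop(&o, "goblin_growl");
        minimap_marker(&o, "goblin");
    }
    draw_sprite(&o, "chest");
    return o;
}
```

Setting `is_goblin_alive` to false is the only mutation needed. On the next call to `run_frame`, the sprite, the looping sound, and the minimap marker all stop being produced, with no remove calls and no handles to track.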
These reasons are why I’ve found the analogy useful, and why I wanted to share it. When writing a system, ask yourself—are you downstream, or are you upstream? The answer to that question can help inform you about the appropriate design for that system.
-Ryan