These posts are amazing, thanks!
When you talked about features as widget flags, it struck me as limiting: library users can only create a new widget with at most N features, but what if they need a new one? If I'm not missing something, they'd have to wait for the library developers to add it. While it's certainly better than the WidgetKind, maybe we could push the abstraction further. In the Molly Rocket forum (https://web.archive.org/web/20070825122349/http://www.mollyrocket.com/forums/viewtopic.php?t=134&postdays=0&postorder=asc&start=30&sid=9680eeedbe87034741d936cbfe319f57) Casey talked about how to customize a button, generally created through doButton():
> there'd be three calls, typically: DrawButton(), ButtonBehavior(), and DoButton() which calls them both. If you want to draw it differently, you just replace the call to DoButton() with a call to ButtonBehavior(), and draw the button yourself. That's all there is to it.
What do you think?
The pattern that Casey describes is also possible within this model, because the API is layered in the same way, with high-level APIs simply being fast-path compositions of lower-level APIs. The feature flags are just fast-paths for a supported set of feature combinations, which itself is a much larger set than the set of “widget kinds” (which would simply poke into that set).
Feature flags aside, builder code can consume input and render graphics arbitrarily. It can take full control over the building codepath in the same way. The feature flags are not the only means by which builder code can provide the effects it needs; they are just options for standardized fast-paths that would be duplicative to rewrite or maintain (e.g. clicking/dragging behavior).
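To make the layering concrete, here is a minimal C sketch of the flags-as-fast-paths idea. All names here (`UI_WidgetFlags`, `UI_ButtonFlags`, the individual flag names) are illustrative assumptions, not the series' actual API: a high-level "button" is nothing more than a pre-composed set of lower-level features that builder code could equally assemble by hand.

```c
#include <stdint.h>

/* hypothetical flag set; names are illustrative, not the series' API */
typedef uint32_t UI_WidgetFlags;
enum
{
  UI_WidgetFlag_Clickable    = (1u<<0),
  UI_WidgetFlag_DrawBorder   = (1u<<1),
  UI_WidgetFlag_DrawText     = (1u<<2),
  UI_WidgetFlag_HotAnimation = (1u<<3),
};

/* a high-level "button" is just a fast-path composition of low-level
   features; callers wanting custom behavior combine flags directly */
static UI_WidgetFlags UI_ButtonFlags(void)
{
  return UI_WidgetFlag_Clickable   |
         UI_WidgetFlag_DrawBorder  |
         UI_WidgetFlag_DrawText    |
         UI_WidgetFlag_HotAnimation;
}
```

Casey's DrawButton/ButtonBehavior split maps onto this directly: dropping `UI_WidgetFlag_DrawBorder` and `UI_WidgetFlag_DrawText` while keeping `UI_WidgetFlag_Clickable` is the "draw it yourself" path.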
Nevertheless, your mention of the “library user” position is why I do not think UI core code should be packaged within a library for general application development; or, at least, I don’t think it’s that simple. Sometimes you do need standardization across multiple applications (e.g. picture yourself as an OS vendor). If libraries were to be provided in this case, I’d say that the ideal decomposition is not obviously a “UI library”; instead, it might be that the OS provides simple building blocks that offer controls for its features (accessibility, for example), while allowing composition with user code. (On top of this, you might still have high-level APIs that are also just compositions of low-level APIs.)
This is why I wrote a post on ditching the idea that there is any one layer that defines “what a widget is”—instead, there is just a core code layer that has some building blocks for certain patterns of data transformations. In my experience, there’s rarely a single layer that can totally define a high-level idea; the desired high-level effect must be achieved through composition of multiple layers.
Right, thanks for the insight. Thinking about this problem, I started learning about data-oriented design and came up with this idea: instead of storing the flags inside the widget struct, we could store them out of band, in collections (arrays or hash maps) of widget references (pointers or indices), one per flag, consumed by functions that each operate on a single flag, opening up the possibility of adding new features in the future. What do you think about this approach? Maybe it could cause some issues in the layout phase.
I think what you’re getting at is just mirroring entity component systems (for the record, I don’t actually consider using SoA-style storage for an ECS to be data-oriented in itself—data-oriented thinking has much more to do with first principles than with any specific architectural choice).
This would work in theory but is unlikely to provide the wins you might expect it to, and it will probably result in a loss of flexibility—namely because you’re adding structure to “features” where structure may not exist: it isn’t obvious that a single feature breaks down into a single batch-processing codepath, especially because many operations in UI are order-dependent and thus serially-dependent.
It’s also unclear if storing the flags out-of-band will be of much use. It is true that it would allow you to look at flags without pulling other per-widget data into cache, but the general structure of UIs in my experience is two codepaths (building + rendering) that often touch all per-widget data.
But, in any case, it actually is unlikely to matter at all, due to the small number of widgets you need active at any one point in time (even with e.g. infinite lists), where all of your widget data is likely able to fit inside L2, and where there are just barely any widgets to begin with. It’s just not a data processing problem worth thinking much about, because there isn’t much data.
If you want the ability of builder code to extend per-widget data (e.g. with its own feature flags, or other stuff), then you can just equip each node with a slot for extra user data, and let the user (builder) code decide how to fill it out and later use it. There’s not much reason to shoehorn user-attached data into the first-class slot of features that the core supports and is aware of.
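A sketch of that "extra slot" idea in C. The struct layout and function name below are my own hypothetical illustration, not the real codebase: the core only understands its own flags, while builder code attaches whatever it wants through an opaque pointer that the core never interprets.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical widget node; field names are illustrative */
typedef struct UI_Widget UI_Widget;
struct UI_Widget
{
  UI_Widget *parent;
  UI_Widget *first_child;
  UI_Widget *next_sibling;
  uint32_t   flags;          /* features the core knows about */
  void      *user_data;      /* opaque slot; builder code owns its meaning */
  size_t     user_data_size;
};

/* builder code attaches its own per-widget data; the core never reads it */
static void UI_WidgetEquipUserData(UI_Widget *w, void *data, size_t size)
{
  w->user_data      = data;
  w->user_data_size = size;
}
```

This keeps user extensions entirely out of the core's flag namespace, so adding a builder-side "feature" never requires touching the core.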
Thanks again. You're right, I was mirroring an ECS. I was getting ahead of myself, wondering whether an ECS would fit in this context; first I actually need to start implementing your ideas.
Can't wait for what's coming next. What are you planning to do?
Yeah for sure. You can only organize data for bulk efficient data transforms once you know what the set of transforms you need is, and what the shape of the problem looks like. That may vary dramatically depending on the problem, so there’s no “bag of tricks” here, you just need to explore the shape of the problem and then do another pass of data organization. That’s why I choose to keep my data organization and types very simple when in an exploratory phase—you can use simple bucketing mechanisms to figure out the shape, and then use that to inform you of what the really tight version would look like.
I still have a lot to write about—I want to get to rendering, infinite lists, text input, panel trees, and so on. There’s a lot to sort through and I haven’t organized it all yet, so not too sure what’s coming next, but it’s probably stuff like that!
You are absolutely right. You give very simple and powerful insights, which are the hardest to find. Thank you so much.
Text is one of the most interesting topics to me, I'm looking forward to the next posts. You are doing a very good job, keep it up!
Thanks for this series of posts on UI. It will be my point of reference for my next UI-related project.
I have one question regarding the UI_Comm struct: what is the rationale behind including mouse pos and drag_delta in it? Since those two should remain fixed across the whole frame, I assume you can store them as global states, just like any other input state? What am I missing here?
You're not missing anything! You're absolutely right, there's no actual "data-flow reason" to put them in there - it's just extra copying work. I put them in just because it's often convenient to have them ready when you are working with other parts of UI_Comm's data. Probably a bad habit, though, and I should just grab them when I actually need them in a codepath. :)
Great post. There's one (probably quite simple) problem I can't see how to implement this way: how do you encode a parent that has a border? Say the parent node is 200x200px and it has a 5px border - there's no "inner" size or coords there, and I'm wondering how to implement such a thing. Would you encode a bordered parent as 2 widgets? One that draws its background and another nested inside that is offset somehow using some new variant of UI_SizeKind? With the example code the way it is, I can't see how this fits in. Another way would be to have a code path in the layout code that "shrinks" children inside bordered components, but that seems like it would invite further problems with the other layout code.
So you basically discovered the div
Great takeaway!
That's like saying 'so basically you discovered Camry' when someone shows how the car works under the hood :)
Each platform has its own primitive building blocks:

* SVG has `g` (not `div`)
* Flutter has `Container`
* React Native has `View`
* iOS has `UIView` and Android has `android.view` (based on React Native experience)

While the intentions are similar, the implementations differ quite a bit. From my point of view, the ability to implement a minimal UI system tailored specifically to the needs of your app is a skill our industry is missing.
Thanks Ryan for a great series.
Once again, thank you for the great series. Working through your suggestions, I should be able to resolve a number of ugly workarounds resulting from a more traditional IMGUI implementation.
This may be a dumb question, but shouldn't UICommFromUIWidget be taking some sort of mouse/keyboard state input (say, from the OS/platform) for hit testing against the rect? I'm assuming it's left out because it's pseudo-ish code?
Also, why is there a pointer to a UI_Widget in UI_Comm?
In my implementations, the UI building phase begins with a UI_BeginBuild function. This function takes per-build parameters, including a list of operating system events (which encode user input). That list is remembered in the selected thread-local state, so I don't have to pass it around - it would be the same one every time.
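A minimal sketch of that stash-it-once pattern, assuming hypothetical type and variable names (`UI_Event`, `UI_State`, `ui_thread_state` are my own illustration): the event list is handed to `UI_BeginBuild` once per build and remembered in thread-local state, so builder helpers never need an extra parameter threaded through every call.

```c
/* hypothetical per-thread UI state; names are illustrative */
typedef struct UI_Event UI_Event;
struct UI_Event
{
  int       kind;  /* placeholder for real OS event payload */
  UI_Event *next;
};

typedef struct UI_State UI_State;
struct UI_State
{
  UI_Event *events; /* OS events for this build, remembered once */
};

/* one state per thread; C11 _Thread_local */
static _Thread_local UI_State ui_thread_state;

/* per-build parameters are stashed once; every helper in this build
   reads the same list from the selected thread-local state */
static void UI_BeginBuild(UI_Event *events)
{
  ui_thread_state.events = events;
}
```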
A widget is returned to enable composition in some cases. For example, if I were to use a helper function like UI_Button (which returns this struct), but I wanted to extend that helper's widget subtree *from the caller code*, I would need a way to get the root node of the button's subtree.
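A rough sketch of that composition pattern. The struct layouts and the `UI_Button` signature below are simplified assumptions, not the real code; the point is only that the helper hands the subtree's root back through the comm struct so the caller can keep building under it.

```c
/* hypothetical, simplified versions of the types discussed above */
typedef int B32;

typedef struct UI_Widget UI_Widget;
struct UI_Widget
{
  UI_Widget  *parent;
  const char *string;
};

typedef struct UI_Comm UI_Comm;
struct UI_Comm
{
  UI_Widget *widget;   /* root of the helper's subtree, for composition */
  B32        clicked;
  B32        hovering;
};

/* stub: a real UI_Button would build a whole subtree and compute
   interaction; here we only show the root widget being handed back */
static UI_Comm UI_Button(UI_Widget *storage, const char *string)
{
  storage->string = string;
  UI_Comm comm = {0};
  comm.widget = storage;
  return comm;
}
```

Caller code can then parent extra widgets under `comm.widget` - say, an icon inside the button - without the helper needing to know anything about it.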
I see, so by keeping input on the thread local state the emphasis is on a very clean interface, without passing input etc pointers everywhere.
Regarding the widget pointer, I wasn't seeing it until I tried to build a hierarchy from the build code -- of course you need a reference back to the parent to do so. Thanks!
The Comm type is global state? In the past I’ve just used 2 bits of state per button (real or virtual) for frame-based input: one for the previous frame (on/off), and another for the current frame. Then all the combinations follow naturally: 0b00: idle, 0b01: press_started, 0b11: press_held, 0b10: press_stopped. You can do the same thing for hovering. If you need to time any input, you can swap the bits for walltimes and store the last two timestamps for both on_key_down and on_key_up events. Any frame-based boolean input can then be derived by just seeing if it happened within the frame interval, and double-presses and button-hold durations come for free without having to set an individual timer for any input: just subtract the relevant timestamps. As for names… maybe “signal”?
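The two-bit scheme described above can be sketched as follows (a hypothetical helper, keeping the commenter's bit assignments: low bit = this frame's sample, high bit = last frame's):

```c
#include <stdint.h>

/* low bit = this frame's sample, high bit = last frame's sample */
enum
{
  BTN_IDLE          = 0x0, /* 0b00 */
  BTN_PRESS_STARTED = 0x1, /* 0b01: down now, was up  */
  BTN_PRESS_STOPPED = 0x2, /* 0b10: up now,  was down */
  BTN_PRESS_HELD    = 0x3, /* 0b11 */
};

/* once per frame: shift the current bit into the previous-frame slot,
   then OR in this frame's raw down/up sample */
static uint8_t btn_step(uint8_t state, int down_now)
{
  return (uint8_t)(((state << 1) & 0x2) | (down_now ? 0x1 : 0x0));
}
```

Running the update across a press-and-release cycle walks through all four states: idle → press_started → press_held → press_stopped → idle.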
The Comm type is just a type for returning data about interaction at a particular node, it does not correspond directly to any global state. It is computed from scratch by the "Interact" API every frame.
"Signal" might be good!
I really like your posts, keep it up!
I have a question about the bool types - how do you decide when to use B8 and B32? What about the standard bool type?
I don't use the standard bool type because I prefer having explicit sizes for primitive types.
B8 and B32 end up having different low-level characteristics. When returning just a B32, it's already going to be returned in a register on modern ABIs. So space doesn't actually matter, and you may pessimize certain codepaths by requiring them to move a B8 into a larger slot (due to alignment or other prep work). So, in that case, it makes sense to just use a B32 (B64 would also be acceptable with 64-bit, but possibly a pessimization on 32-bit machines, whereas B32 will work on either 32-bit or 64-bit). It's possible that it's still preferable to use a B64 on 64-bit, in which case I've seen codebases have a "B32X", which is just "at least 32 bits, but might be more on some systems".
In the case of UI_Comm, it'll be allocated on the stack by the caller, and you could conceivably want to pass UI_Comm's around to other functions by value (which may require a copy) so it makes "more sense" to be more concerned about how much space they take.
But, ultimately, I think using either B8 or B32 in any of these cases will not make a huge difference.
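The explicit-size boolean convention discussed in this exchange might look like this in C. The `B32X` name comes from the reply above; the exact underlying types chosen here are my assumption (a pointer-sized integer is just one way to get "at least 32 bits, possibly more"):

```c
#include <stdint.h>

typedef uint8_t  B8;   /* compact: for structs stored/copied by value */
typedef uint32_t B32;  /* register-width-friendly: good for return values */

/* "at least 32 bits, but might be more on some systems": pointer-sized
   gives 32 bits on 32-bit targets and 64 bits on 64-bit targets */
typedef uintptr_t B32X;
```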