14 Comments

Great article packed with good information as always.

This is similar to the "null object pattern" in OOP, albeit lacking polymorphism. Since we can't designate the "null struct" as const, we must find an alternative method to prevent writes to it. The use of that `read_only` macro is pretty clever.

Thanks for the article!

But how do you do "per_thread" globals in C in a portable way? I only know of __thread but it doesn't seem portable or even part of the standard.

If you have a custom macro, I'd love a peek :)
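
For what it's worth, one common way to get this is a macro that picks whichever thread-local keyword the compiler offers. The sketch below is an assumption about how such a macro could look, not the article author's actual definition; the fallback order and the `scratch_depth` counter are made up for illustration.

```c
// Hypothetical `per_thread` macro: selects a thread-local storage keyword
// based on the compiler. The selection order is an assumption of this sketch.
#if defined(_MSC_VER)
# define per_thread __declspec(thread)   // MSVC and clang-cl
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
# define per_thread _Thread_local        // standard keyword since C11
#elif defined(__GNUC__) || defined(__clang__)
# define per_thread __thread             // GCC/Clang extension for older modes
#else
# error "no thread-local storage keyword known for this compiler"
#endif

// Illustrative only: each thread gets its own zero-initialized copy.
static per_thread int scratch_depth = 0;

static int scratch_enter(void) { return ++scratch_depth; }
static int scratch_leave(void) { return --scratch_depth; }
```

`_Thread_local` is the part that is actually in the standard (C11 onward); `__thread` and `__declspec(thread)` are compiler extensions, so the macro only hides the spelling difference.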

This was a great read. For high-productivity work I agree with everything, and it's a philosophy I've been following since I started using C/Odin a lot more. That said, I think FP and the pursuit of using type systems to make invalid states unrepresentable has its place when working with a lot of people who may or may not be competent.

I do like multiple return values, though, so there's no need to unify both types under a result struct.

This was a great write-up! Sometimes I forget how different the fact-based approach to programming is from popular "practices". Combinatoric explosion has been a major pain in my C development, but these approaches will help a lot.

I wonder how you handle errors in a function returning a `Node*` that should be writable. Also, how do you handle nil pointers during, say, a depth-first search? Don't you still need to check for sentinel values, be it a null pointer or a preallocated nil pointer?

I go into the "writable pointer case" in the article - the idea would be that you pre-allocate as much as possible to prevent e.g. an allocation failure. If the case is literally that you're doing a lookup or something like that, and it might fail, then yeah, you have distinctly different things to do if you get a result or if you don't, so that's just an `if`.

Nil pointers must often be checked like null pointers - as you say, inside of things like depth-first searches, or inside of for-loops. The difference is that they do *not* have to be checked in many cases where null pointers *do* have to be checked.
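
A minimal sketch of that difference, assuming a simple first-child/next-sibling tree. The `Node` layout, `nil_node`, and both functions are hypothetical names for illustration, not the article's types.

```c
typedef struct Node Node;
struct Node {
    Node *first;   // first child, or &nil_node
    Node *next;    // next sibling, or &nil_node
    int   value;
};

// Hypothetical sentinel: its links point back at itself, so following
// them any number of times stays inside valid memory.
static Node nil_node = { &nil_node, &nil_node, 0 };

// Depth-first sum. The loop condition still tests for the sentinel,
// just as it would test for NULL, but no separate check on `n` is needed:
// if `n` is the sentinel, the loop body never runs and the result is 0.
static int sum_tree(Node *n) {
    int total = n->value;
    for (Node *c = n->first; c != &nil_node; c = c->next) {
        total += sum_tree(c);
    }
    return total;
}

// Reading through a chain of links needs no check at all: a missing
// grandchild reads as the sentinel's value of 0.
static int first_grandchild_value(Node *n) {
    return n->first->first->value;
}
```

With null pointers, `first_grandchild_value` would need two checks before the final read; with the sentinel it needs none, while the loop in `sum_tree` keeps its one termination test either way.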

Functional programming expresses this idea through a `Maybe` / `Option` type (or C++ `std::optional`). Here's a nice writeup of this same idea of "Separate code that checks for presence from code that calculates values": https://thoughtbot.com/blog/problem-solving-with-maybe

This is not the concept I was getting at - the post was partly advocating for exactly the *opposite* of Maybe/Option/std::optional. These are sum types, which needlessly bifurcate codepaths which must work with them. It is much better to collapse codepaths whenever possible, so that error cases flow through identically to non-error cases, with both being identical at a lower level.

Right. The linked article advocates for the same thing: prune the multiple code paths at the edges, and do checks as early as possible. As in, "Maybe is viral and infects the whole codebase if you don't handle it right." My comment failed to capture that essence, but the linked article advocates for the same principle: collapsing codepaths as early as possible.

I think you might be misunderstanding the point of this article.

Let's say you have a Haskell `Maybe T`. `Nothing` is the sentinel value indicating an error. Also, `T` and `Maybe T` are separate types.

In particular, `Nothing` *is not* an instance of T. If you have two functions `func_1` and `func_2`, where `func_1` can return an error, you can't write something as simple as `func_2 (func_1 (val))`.

You need to write a check that the result of `func_1` is not `Nothing`.

Compound this across your entire program, and that is lots of boilerplate.

Fancy functional programming abstractions such as functors and monads abstract away some of the boilerplate.

They don't remove it, they just make it shorter to type.

But if you make the sentinel value for failure a valid instance of T in the first place, you solve the problem at the root.

You don't need a separate type to signal the presence of an error. You can compose functions without extra boilerplate, and the code is easier to follow.

(When I say "make the sentinel value a valid instance of T", I mean that all normal operations work on the sentinel value.

That means that null pointers are not "valid", since dereferencing them is undefined behaviour.)
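
To make the `func_2 (func_1 (val))` case concrete in C, here is a rough sketch under assumed names: `Entry`, `nil_entry`, and `find_child` are hypothetical and stand in for any lookup that can fail.

```c
#include <string.h>

typedef struct Entry Entry;
struct Entry {
    const char *key;
    Entry      *child;   // first nested entry, or &nil_entry
    Entry      *next;    // next sibling, or &nil_entry
    int         value;
};

// Hypothetical sentinel: a real Entry whose links lead back to itself.
static Entry nil_entry = { "", &nil_entry, &nil_entry, 0 };

// Hypothetical lookup: returns the child of `parent` whose key matches,
// or the sentinel when there is none. Passing the sentinel as `parent`
// is fine: its child list is empty, so the loop never runs.
static Entry *find_child(Entry *parent, const char *key) {
    for (Entry *e = parent->child; e != &nil_entry; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return &nil_entry;
}
```

A nested call such as `find_child(find_child(root, "window"), "width")->value` then composes directly: if either lookup fails, the result is the sentinel and the read yields 0, with no check between the two calls.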

I re-read the article.

While my original comment was expressing agreement with the part "fail early and code for the happy path," this time I noticed that the technique is in fact limited to the null pointer issue, as you underscore. The "sentinel" as described really only solves the unfortunate issue of dereferencing null pointers causing UB.

The example func_2(func_1(val)) is about chaining computations. A pointer "sentinel value" carries through all function calls without triggering UB, but its meaning is "valid type, unknown result." func_1 can still produce a result invalid for func_2's computation.

Say func_1 is `MD_TokenizeFromText` and func_2 calculates average chars per token. `MD_TokenizeFromText` will succeed even if the input text was zero length. Clearly `tokenize.tokens.count` will be zero. Zero is a valid instance of U64. Yet we'd be better off checking for that zero before calculating the average in func_2, no? And then, how should func_2 signal to its caller the absence of the calculation in the zero-token case?

In the general case you can't chain computations without being aware of presence/absence of value and the success/failure of a previous computation. Checks are necessary, no matter where you hide them. Whether you explicitly check with `if` or encode in structures like `nil_node` or embed in types like `Maybe T` and `Result T E`, is a question of preference and ergonomics.

That said, maybe I did miss the point of the article. Maybe it had nothing to do with chaining computations that can fail, and was all about null pointers.

In your example, func_2 should return zero when 'tokenize.tokens.count' is zero. Imagine having a 'for (int i = 0; i < tokenize.tokens.count; ++i)' in func_2. When the count is zero, the loop will never run.
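
A sketch of what that could look like, with hypothetical `Token` and `TokenArray` types standing in for the real tokenizer output. Note one assumption made here: the loop needs no special case for zero tokens, but the final division does, so the divisor is clamped to at least 1.

```c
#include <stddef.h>

// Hypothetical stand-ins for the tokenizer's result types.
typedef struct Token Token;
struct Token { size_t length; };

typedef struct TokenArray TokenArray;
struct TokenArray { Token *v; size_t count; };

// Average characters per token, rounded down. Returns 0 for an empty array.
static size_t average_chars_per_token(TokenArray tokens) {
    size_t total = 0;
    // With count == 0 this loop simply never runs.
    for (size_t i = 0; i < tokens.count; ++i) {
        total += tokens.v[i].length;
    }
    // Integer division by zero is undefined, so never divide by less than 1.
    size_t divisor = tokens.count ? tokens.count : 1;
    return total / divisor;
}
```

The caller sees an ordinary `size_t` in both cases; whether 0 is an acceptable answer for "no tokens" depends on what the caller does with it.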
