Demystifying Debuggers, Part 1: A Busy Intersection

An introduction to a new post series covering debugger basics.

Dec 16, 2024

Debuggers exist at the intersection of many parts of the computing ecosystem—they must contend with intricate details of kernels, compilers, linkers, programming languages, and instruction set architectures.

My familiarity with debuggers has improved my programming abilities, made debuggers more useful in my day-to-day programming, and broadened my general knowledge of computing. Back in January, the RAD Debugger (the project I work on full-time) was open sourced, marking the start of its open alpha phase. I’ve been working on the debugger, or the technology on which it depends, full-time for almost four years now. The project has taught me an enormous number of lessons through exposure to an enormous number of problems. There is still a lot of work to do, so I expect it will keep teaching me for many years to come.

But perhaps most importantly, debuggers are an intricate piece of the puzzle of the design of a development platform—a future I become more interested in every day, given the undeniable decay infecting modern computing devices and their software ecosystems.

To emphasize their importance, I’d like to reflect on the name “debugger”. It is not a name I would’ve chosen, because it can give the impression that a debugger is an auxiliary, only-relevant-when-things-break tool. Of course, a debugger is used to debug—which is why it was named as such—but it is also enormously useful to analyze working code’s behavior, and to verify code’s correctness, with respect to the expectations of the code.

A good debugger provides clear and insightful visualizations into what code is doing. As such, debuggers are also enormously useful educational tools, for beginners and experts alike, because they make visible what is normally opaque. They provide these features by dynamically interacting with running programs, which means they can also dynamically modify code. At the limit, this approximates (or employs) JIT-compilation and hot-reloading, giving developers much more runtime flexibility with traditional compiled toolchains.

For these reasons, “debugger” is much too special-purpose a name for the full set of capabilities that debuggers actually provide: they offer glimpses into the lower-level inner workings of a computer. If one designed a computing system from scratch, debuggers ideally might not be independent from the operating system itself. Instead, perhaps the same capabilities could simply be provided through first-class visualization and dynamic execution adjustment features that the operating system naturally exposes. But that is a topic for another day.

I hope this sheds light on the imbecility of Internet debates about the utility of debuggers, where one might find comments like, “I don’t need debuggers, because I can just use printf”, or “I don’t need debuggers if I can statically guarantee correctness”. It’s akin to suggesting that someone does not benefit from vision, because they can feel their way around with a mobility cane, or read text through Braille. Even though mobility canes and Braille are obviously good inventions for people who do not have vision, that doesn’t somehow imply that vision isn’t an obvious benefit, or that it isn’t obviously preferable. Similarly, logging and static verification are obviously good inventions for programs or circumstances which cannot be easily debugged at runtime, or when those things are simply preferable in context. But that doesn’t somehow imply that actively visualizing the runtime execution of programs through a debugger isn’t an obvious net benefit, or that it isn’t obviously preferable in many cases. To suggest otherwise in either case is absurd. The more useful debuggers become, the shorter the programmer’s iteration loop, the more efficient software production becomes, and the more easily programmers can obtain true from-first-principles reasoning about their code.

Given their importance for both the present and future, and their utility to myself (and thus perhaps readers), I’m writing a series explaining and documenting debugger architecture.

In this series of posts, I’ll cover the following topics:

The Anatomy Of A Running Program — On the concepts involved in a running program. What happens, exactly, when you double click an executable file, or launch it from the command line, and it begins to execute?
Debugger-Kernel Interaction — On how kernels collect and expose information about program execution to debuggers, like “debug events”, encoding changes like thread creation & destruction, dynamic module loading & unloading, low level exceptions being hit by threads, and more; or like the reading & writing of memory & thread registers, or like the suspension and resumption of threads.
CPU Debug Features — On the features that CPUs commonly expose for debuggers, like interruption instructions, debug registers, single-stepping mode, and more.
Debugger-Inserted Traps — On how debuggers set “traps”—a trivial but widely-used form of runtime code modification that allows the debugger to intercept and control code execution (like to implement the higher level “breakpoints” feature).
Debug Info & Toolchains — On the traditional compilation and linking pipeline, how “debug info” is produced, what it contains, and how it helps debuggers implement higher level features, which can correlate a program’s state with source code or language constructs.
Evaluation — On evaluating expressions using an expression language and “location info” and “type info”—two parts of “debug info”.
Breakpoints — On how “breakpoints” are implemented, from address breakpoints, symbol breakpoints, source code location breakpoints, to conditional breakpoints and processor (or data) breakpoints.
Stepping — On the various “stepping” features in debuggers, from the barebones single-instruction stepping, to disassembly stepping, to source line stepping, all with variants like “step into”, “step over”, and “step out”, all while correctly handling multithreaded programs.
Unwinding — On “unwinding”, which is how a debugger determines a thread’s current “call stack”, and is able to correctly evaluate values from all scopes in a call stack.
Graphical Debugger Multithreaded Architecture — On the structure of a graphical debugger, which employs the aforementioned features and concepts, and exposes them through a real-time interactive interface.
The Watch Window, & General-Purpose Data Visualization — On the traditional “watch window” graphical debugger interface, and how it may be extended to support general-purpose data visualization.
…and anything else I stumble across while writing that I think would be appropriate to cover!

In discussing these topics, I’ll try to abstract over platform and architectural details when possible, but I’ll base my writing on my experience working on the RAD Debugger, which began its journey as a Windows, user-mode, x64 debugger (although it won’t end its journey as merely that). I’ll also use the RAD Debugger to demonstrate certain concepts and features concretely.

When I am explicitly relying on that context, I’ll do my best to say so. I’ll also do my best to extrapolate to more general information when appropriate: many of the concepts have similar, if not identical, analogs on other platforms, so I feel the knowledge is quite generalizable.

I hope you’re excited to come along for the ride, and demystify debuggers for yourself!

If you enjoyed this post, please consider subscribing. Thanks for reading.

-Ryan

Jakov Spahija

Dec 16

I hope you will eventually be going over topics like DWARF.

Eli Bendersky has a good couple of articles on debuggers too.

I work in embedded software, so with remote debugging we don’t have the luxury of ptrace and rely on RTOS/architecture support. But this just makes me appreciate the fundamentals even more.


Yousef

Dec 19

Looking forward to this.
