11.6 Advanced [K]GDB Tips and Tricks
In the previous section, we used hbreak to latch onto do_init_module, thoroughly solving the "module vanishes on load" problem. But once you actually start running KGDB, you'll find it's like a bottomless toolbox—most of the time you only use the screwdriver, but when you really need that angled needle-nose plier, you'd better know which corner it's hiding in.
GDB is a massive program with an absurd number of features. This section skips the basic workflow and focuses on cataloging advanced tips that are easily overlooked but can save your life at critical moments. Some are kernel-bundled enhancement scripts, while others are underrated native GDB features.
Enabling Kernel GDB Scripts (CONFIG_GDB_SCRIPTS)
Something many people don't know: the Linux kernel source tree actually hides a bunch of ready-made Python scripts that can be used directly as commands inside GDB.
Starting with the 4.0 kernel, developers have kept these in the scripts/gdb directory. They aren't standalone scripts; GDB loads them and exposes them as extra commands and functions. To enable them, the first step is to turn on the switch in your kernel configuration:
CONFIG_GDB_SCRIPTS=y
After enabling this option, the kernel build will generate vmlinux-gdb.py. However, for security reasons, GDB won't casually load auto-running scripts by default—it's afraid that one day you might accidentally download a malicious kernel source tree and get your own machine hacked.
So the second step is to tell GDB: "Hey, the scripts in this directory are mine, go ahead and load them."
Add this line to your ~/.gdbinit file:
add-auto-load-safe-path /path/to/linux/kernel/scripts/gdb/vmlinux-gdb.py
If you don't want to write absolute paths (or if you frequently switch kernel source directories), you can use a more brute-force but universal approach:
add-auto-load-safe-path /
This essentially removes all security restrictions. Once this is done, the next time GDB loads vmlinux, it will automatically parse these Python scripts.
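If you switch kernel trees often but would rather not whitelist the whole filesystem, a small helper can add one safe-path line per tree. This is a minimal sketch, not part of the kernel or GDB: the function name ensure_safe_path and the example paths are hypothetical, and it assumes that pointing add-auto-load-safe-path at a build directory is enough for your setup.

```python
from pathlib import Path


def ensure_safe_path(gdbinit: Path, build_dir: str) -> bool:
    """Add an add-auto-load-safe-path line for build_dir unless it is already there.

    Returns True if the file was changed, False if the line already existed.
    """
    line = f"add-auto-load-safe-path {build_dir}"
    text = gdbinit.read_text() if gdbinit.exists() else ""
    if line in (entry.strip() for entry in text.splitlines()):
        return False
    # Start on a fresh line even if the existing file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    gdbinit.write_text(text + separator + line + "\n")
    return True
```

Calling ensure_safe_path(Path.home() / ".gdbinit", "/path/to/linux/kernel") twice leaves a single line in the file, so it is safe to run from a build script.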
How do you know it worked?
Type this in GDB:
(gdb) apropos lx-
If the screen floods with commands prefixed with lx-, you're good to go.
The lx- prefix marks Linux-specific commands, while names with an underscore (lx_) are helper functions that you call inside expressions, such as $lx_current().
Figure 11.18 shows an overview of these commands. You'll notice some familiar names: lx-dmesg (view kernel logs), lx-cmdline (view boot parameters), lx-lsmod (view loaded modules), lx-iomem (view memory layout)...
What's the point of this? It's incredibly useful.
Previously, to check kernel logs, you had to type dmesg on the target system, or even redirect logs to a file. Now, while in a GDB debugging session, without exiting, you can just type lx-dmesg and the logs will stream right before your eyes. This is extremely effective for debugging "what happened in the very last moment before the crash."
Want help with a specific command? GDB's standard help still works:
(gdb) help lx-lsmod
This feature requires GDB 7.2 or later, built with Python support. The kernel's official documentation on debugging with GDB covers these scripts in detail; I recommend reading it and then jumping straight into hands-on practice.
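To get a feel for how such a command is put together, here is a minimal sketch of a home-made one, loosely modeled on the in-tree lx-version. The command name my-release and the helper kernel_release are hypothetical; the sketch assumes the kernel symbol linux_banner is visible to GDB (that is, vmlinux is loaded with debug info).

```python
try:
    import gdb  # present only inside a GDB built with Python support
except ImportError:
    gdb = None  # lets the pure-Python helper be imported and tested outside GDB


def kernel_release(banner: str) -> str:
    """Extract the release field from a 'Linux version X.Y.Z ...' banner."""
    fields = banner.split()
    if len(fields) < 3 or fields[:2] != ["Linux", "version"]:
        raise ValueError(f"not a kernel banner: {banner!r}")
    return fields[2]


if gdb is not None:

    class MyRelease(gdb.Command):
        """Print only the release field of the kernel banner (hypothetical command)."""

        def __init__(self):
            super().__init__("my-release", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            # Read the kernel's version string from the debuggee.
            banner = gdb.parse_and_eval("(char *)linux_banner").string()
            gdb.write(kernel_release(banner) + "\n")

    MyRelease()  # instantiating the class registers the command with GDB
```

Save it under any name you like, load it with source inside GDB, and my-release becomes available next to the lx- commands.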
Why Doesn't target remote :1234 Work on Real Hardware?
We ran very smoothly on QEMU—target remote :1234 connected on the first try. But once you switch to two real machines connected by a serial cable, the same command might just hang on you.
This is the malice of the physical world—a connection layer problem.
If your GDB client reports a connection failure or throws weird warnings (like warning: unrecognized item "timeout" in "qSupported" response), don't suspect your kernel configuration; suspect the cable first.
Before wrestling with KGDB, do one thing first: confirm that bidirectional communication is working.
Assume your serial cable is connected to the Host's /dev/ttyUSB0 and the Target's /dev/ttyS0.
On the Host (requires root privileges):
echo "hello, target" > /dev/ttyUSB0
On the Target:
cat /dev/ttyS0
If you see hello, target on the Target's screen, it means Host -> Target is fine.
Now reverse it. On the Target:
echo "hello, host" > /dev/ttyS0
On the Host:
cat /dev/ttyUSB0
If this step also works, the physical link is good.
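If you run this check often, the echo/cat pair can be wrapped in a small script with a timeout, so a dead cable shows up as a clear failure instead of a cat that hangs forever. This is only a sketch: the function names are made up for illustration, it works on any already-open file descriptor, and it assumes the port's baud rate has been configured beforehand (for example with stty).

```python
import os
import select


def send_line(fd: int, text: str) -> None:
    """Write text plus a newline to an already-open descriptor."""
    os.write(fd, (text + "\n").encode())


def wait_for_line(fd: int, expected: str, timeout: float = 5.0) -> bool:
    """Return True once `expected` has arrived on fd.

    The timeout applies to each wait for new data, not to the call as a whole.
    """
    received = b""
    while expected.encode() not in received:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return False  # nothing arrived in time: suspect the cable
        chunk = os.read(fd, 256)
        if not chunk:
            return False  # end of file: the other side closed
        received += chunk
    return True
```

On the Host you would open the port with os.open("/dev/ttyUSB0", os.O_RDWR | os.O_NOCTTY) and call send_line; on the Target, open /dev/ttyS0 the same way and call wait_for_line. Then swap roles to test the other direction.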
If the link is good but KGDB still doesn't work, it might be an issue with your USB-to-serial adapter.
Here's a word of experience: using a USB-to-serial adapter on the Host side is usually fine, but if the Target side connects to KGDB through a USB-to-serial adapter, some kernel drivers don't play well together. The Target is best connected directly to the motherboard's native serial port (COM port), or over the network (Ethernet/Wireless).
Setting sysroot — What to Do When Libraries Can't Be Found?
When you use an x86 computer to debug an ARM board (or do remote cross-debugging), GDB runs into an awkward problem: it knows how to read ELF files, but it doesn't know where the /lib library files on the board are located.
If you try to step into a library function, GDB might report an error saying it can't find the .so file.
There are two solutions:
- Use the GDB from your cross toolchain. Don't use the system's default gdb; use the one dedicated to your toolchain, such as arm-none-linux-gnueabihf-gdb.
- Tell GDB where the target rootfs is. Set the sysroot variable in GDB:
(gdb) set sysroot /path/to/target/rootfs
Or set solib-search-path:
(gdb) set solib-search-path /path/to/target/rootfs/lib
This way, GDB knows where to find those dependent libraries. This is especially important when debugging user-space programs (via the rootfs mounted by the kernel).
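What sysroot does is easy to picture: GDB takes the absolute library path reported by the target and looks for it underneath the directory you named. The sketch below mimics that mapping so you can check that your rootfs copy has the file where GDB will look. The function host_path_for is a hypothetical illustration, not a GDB API.

```python
from pathlib import PurePosixPath


def host_path_for(target_path: str, sysroot: str) -> str:
    """Map an absolute path on the target to its copy under the host-side sysroot."""
    target = PurePosixPath(target_path)
    if not target.is_absolute():
        raise ValueError("target library paths are expected to be absolute")
    # Drop the leading "/" so the path can be appended to the sysroot directory.
    return str(PurePosixPath(sysroot) / target.relative_to("/"))
```

For example, host_path_for("/lib/libc.so.6", "/path/to/target/rootfs") yields /path/to/target/rootfs/lib/libc.so.6; if that file is missing on the Host, stepping into libc will fail no matter what you set.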
GDB's TUI Mode — Say Goodbye to Plain Black and White
Many people dislike GDB because they think it's antiquated: one line of command, one line of output, scrolling back and forth is exhausting.
Actually, GDB has a built-in full-screen mode called TUI (Text User Interface). It runs inside the terminal, with no plugins and no IDE required; just add a parameter:
gdb -tui -q vmlinux
The moment you press Enter, your terminal transforms.
The window splits in half: source code on top, GDB command line on the bottom. You can look at code and type commands at the same time, without frequently typing list or l.
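Want more windows? Press Ctrl-x then 2 (press Ctrl+X, release, then press 2); pressing it again cycles through the two-window layouts.
One of those layouts brings up the CPU register view as well (layout regs takes you there directly). Figure 11.19 demonstrates this effect: registers, source code, and command line, all in one.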
And this thing is dynamic—if a register changed after the previous instruction executed, it gets highlighted. This is incredibly handy when single-stepping through assembly.
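If you want to switch views (e.g., from source to assembly), use the layout command: layout asm shows assembly, layout split shows source and assembly together, and layout src switches back to source. Ctrl-x then 1 returns to a single window, and Ctrl-x then a toggles TUI mode on and off.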
Table 11.1 lists the common shortcuts—it's worth printing out and taping to your monitor bezel.
For example:
Ctrl-x 1: Single-window mode (view only source or assembly)
Ctrl-x 2: Two-window mode (repeat to cycle through the layouts)
Ctrl-x a: Toggle TUI mode on and off
Ctrl-p / Ctrl-n: Previous/next command (history)
Once you get the hang of this mode, you'll find it's really no different from GUI-equipped IDEs (like VS Code or Eclipse), and it's even more lightweight.
By the way, to single-step assembly instructions, use si (stepi) or ni (nexti). To view assembly code, use disas; the /m modifier interleaves source lines with the instructions (disas /m func_name). Combined with TUI mode, these are simply god-tier tools.
What to Do When You Encounter <value optimized out>?
This is one of the most panic-inducing moments for all kernel newcomers.
You want to look at a local variable i in GDB:
(gdb) p i
$1 = <value optimized out>
What the heck? GDB slaps you in the face: this variable was optimized out.
This is the compiler's doing. With -O2 or -O3, GCC may keep i in a register for its whole life, reuse that register for something else once i is no longer needed, or eliminate the variable entirely. When the debug info records no location for i at the current instruction, GDB has nothing to read and reports the value as optimized out.
When you run into this situation, don't panic. You have three options:
- Look at the registers. This requires understanding the CPU's ABI (Application Binary Interface). We mentioned this in Chapter 4 when discussing Kprobes. According to the calling convention, parameters are usually placed in the first few registers (like x86's rdi, rsi, rdx...). If the variable hasn't been spilled to the stack, it is most likely sitting in some register. You can run info registers directly to check, or keep an eye on the register window at the top in TUI mode.
- Enable DWARF 4 debug info. Turn on CONFIG_DEBUG_INFO_DWARF4 in your kernel configuration. This option makes debug information more detailed. Although it can't guarantee 100% success at resolving variables in optimized code, the success rate improves significantly.
- Switch to TUI mode to look at assembly. If you can't see the variable name, look at what it's doing. If the source code is hard to follow, look at the corresponding assembly instructions (disas /m). Assembly doesn't lie; you can always find where that register is being manipulated.
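For the first option, it helps to have the register order at your fingertips. The sketch below encodes the x86-64 System V order for integer and pointer arguments and builds the matching GDB print command. The helper names are hypothetical, and the mapping only holds at function entry, before the function body has had a chance to overwrite those registers.

```python
# Integer/pointer argument registers in the x86-64 System V calling convention.
X86_64_SYSV_INT_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")


def arg_register(position: int) -> str:
    """Return the register that holds the Nth (1-based) integer or pointer argument."""
    if position < 1:
        raise ValueError("argument positions start at 1")
    if position > len(X86_64_SYSV_INT_ARG_REGS):
        raise ValueError("arguments beyond the sixth are passed on the stack")
    return X86_64_SYSV_INT_ARG_REGS[position - 1]


def gdb_print_command(position: int, c_type: str = "unsigned long") -> str:
    """Build a GDB command that prints the Nth argument straight from its register."""
    return f"p/x ({c_type})${arg_register(position)}"
```

So when the second parameter of a function shows up as optimized out right after a breakpoint at its entry, gdb_print_command(2) gives you p/x (unsigned long)$rsi to type instead.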
GDB Convenience Functions and Custom Macros
If you find yourself typing the same command sequence every day, it's time to write macros.
GDB calls these user-defined commands, and the syntax is very simple: wrap the command sequence in define ... end. You can put these macros in your ~/.gdbinit file, and they will automatically load when you start GDB.
For example, connecting to the port and setting breakpoints every time you debug QEMU is annoying. Write a macro:
# ~/.gdbinit
define connect_qemu
target remote :1234
hbreak start_kernel
hb panic
hb do_init_module
b do_fsync
end
From now on, you only need to type connect_qemu after entering GDB, and everything is set up.
Here's a more practical one: looking at the stack.
define xs
printf "Examine stack:\n"
x/8x $sp
printf "---\n"
x/8x $sp-32
end
This macro xs will dump a chunk of memory around the stack pointer $sp, which is very useful for checking stack overflows or finding local variables.
GDB also has some built-in convenience functions, many of which require Python support:
$_memeq(buf1, buf2, length): Compare whether two blocks of memory are identical.
$_regex(str, regex): Regex matching.
$_strlen(str): Calculate string length.
These can all be found in the help documentation. For more macro examples, you can borrow from the kernel docs: Documentation/admin-guide/kdump/gdbmacros.txt.
Advanced Breakpoints: Conditional Breakpoints and Watchpoints
A plain break is too mechanical. Sometimes you need finer-grained control.
Conditional Breakpoints
Imagine you're debugging a loop that runs 100,000 times. You suspect the bug occurs on the last iteration.
Are you really going to press c (continue) 100,000 times?
No. Use a conditional breakpoint:
(gdb) break loop_function if i == 99999
Or if the breakpoint is already set:
(gdb) condition 1 i == 99999
This way, GDB will only stop when i equals 99999. This is an absolute lifesaver when tracking down off-by-one errors.
Temporary Breakpoints
Sometimes you just want to "pause once" without being constantly annoyed by this breakpoint afterward. Use tbreak:
(gdb) tbreak some_function
It automatically deletes itself after being hit once, saving you from manually typing disable or delete.
Hardware Watchpoints
This is one of GDB's most powerful features—arguably even more powerful than breakpoints.
A regular breakpoint means "stop on this line of code." A watchpoint means "stop when this variable is modified/read."
It uses the CPU's hardware debug registers (DR0-DR3 on x86), making it extremely fast, unlike software watchpoints, which have to single-step the program and re-check the value after every instruction.
How to use it?
watch <var>: Stop when the variable is written to.
rwatch <var>: Stop when the variable is read.
awatch <var>: Stop on both read and write.
For example, jiffies_64 is a kernel global clock variable that increments by 1 on every timer interrupt.
If we set a watchpoint on it:
(gdb) watch jiffies_64
Hardware watchpoint 1: jiffies_64
(gdb) c
Continuing.
Hardware watchpoint 1: jiffies_64
Old value = 4294888091
New value = 4294888092
0x (... some address in timer interrupt handler ...)
Look at Figure 11.20. GDB will pause the instant the variable changes, telling you the old value, the new value, and conveniently stopping right after the statement that modified it.
At this point, if you use bt (backtrace), you can see at a glance who changed this variable.
For debugging phantom bugs like "who changed my pointer to NULL" or "who is messing with my struct members," this is a nuclear-level weapon.
Note: Hardware breakpoints/watchpoints are limited resources (x86 typically only has 4). Once they're used up, they're gone. If GDB complains that it can't set a breakpoint or watchpoint, the hardware resources are likely exhausted.
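To see how quickly those four registers run out, recall that each x86 debug register covers a naturally aligned region of 1, 2, 4, or 8 bytes. The sketch below is a simplified model of that constraint, not GDB's actual allocation code: it splits a watched region into aligned slots and tells you whether the count fits. The function names and the default limit of 4 are assumptions for illustration.

```python
from typing import List, Tuple


def plan_watch_slots(address: int, length: int, max_slot: int = 8) -> List[Tuple[int, int]]:
    """Split [address, address + length) into naturally aligned 1/2/4/8-byte slots."""
    if length <= 0:
        raise ValueError("length must be positive")
    slots = []
    end = address + length
    while address < end:
        size = max_slot
        # Shrink the slot until it is naturally aligned and fits the remaining bytes.
        while size > 1 and (address % size != 0 or address + size > end):
            size //= 2
        slots.append((address, size))
        address += size
    return slots


def fits_in_debug_registers(address: int, length: int, available: int = 4) -> bool:
    """Return True if the region can be covered with the available debug registers."""
    return len(plan_watch_slots(address, length)) <= available
```

An aligned 8-byte pointer costs one slot, but a 40-byte struct already needs five, which is why watching a single suspicious member is usually the better move than watching the whole structure.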
Miscellaneous Tips
Finally, here are a few more scattered but handy tips:
- Tab Completion. When typing commands, variable names, or function names in GDB, press Tab to autocomplete. If there are too many matches, pressing Tab twice will list all possibilities.
- Locating Code from an Oops Address. Remember the Oops from Chapter 7? If an Oops gives you a function name + offset (like my_func+0x5c), you can use this in GDB:
(gdb) list *my_func+0x5c
to jump directly to that line of code.
- Running a Shell Inside GDB. Without exiting GDB, simply type:
(gdb) shell ls -l
This is convenient when debugging requires temporarily checking a file or changing something.
- Reverse Debugging. GDB even supports "rewinding" (record / reverse), though this feature is extremely memory-intensive and might not be practical in remote debugging scenarios like KGDB. If you're debugging user-space programs locally, you can try record btrace.
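For the Oops-address tip above, copying the symbol and offset by hand gets old fast. The sketch below pulls the first symbol+offset pair out of an Oops line and turns it into a list command. The function name and the sample line are hypothetical, and for code that lives in a module it assumes the module's symbols are already loaded in GDB.

```python
import re

# Matches e.g. "my_func+0x5c" or "my_func+0x5c/0x120"; the size part is optional.
_OOPS_LOCATION = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+(?P<offset>0x[0-9a-fA-F]+)(?:/0x[0-9a-fA-F]+)?"
)


def gdb_list_command(oops_line: str) -> str:
    """Turn the first symbol+offset found in an Oops line into a GDB list command."""
    match = _OOPS_LOCATION.search(oops_line)
    if match is None:
        raise ValueError(f"no symbol+offset found in: {oops_line!r}")
    # Parentheses keep the whole address expression together after the '*'.
    return f"list *({match.group('symbol')}+{match.group('offset')})"
```

Feeding it a line such as "RIP: 0010:my_func+0x5c/0x120 [my_module]" produces list *(my_func+0x5c), ready to paste at the (gdb) prompt.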
With this, we've reached the bottom of the GDB (and KGDB) toolbox.
Now, you not only know how to connect to the kernel and set breakpoints, but also how to use Python scripts to check logs, how to work in TUI mode like an IDE, and how to use hardware watchpoints to catch the "ghost" that's randomly modifying your variables.
In the final chapter, we'll shift our focus from "how to debug" back to "overall system design"—looking at what other system-level debugging techniques can help us solve the final puzzles.
I'll see you there.