1.7 Building Our Custom Debug Kernel
In the previous section, we built a "production kernel." It's like an agile field agent—lean, efficient, and always ready for deployment. But as developers, having just an agent isn't enough—sometimes, we need a chatterbox. Someone who shouts what they're doing before every single operation, even printing out their inner monologue.
That is exactly the purpose of a debug kernel.
In this kernel, we don't care about performance overhead or security trade-offs—what we care about is: when things go wrong, how many clues can it leave behind?
The process of building a debug kernel is very similar to that of a production kernel. To avoid repeating those mechanical steps, we'll focus on the differences between the two. It's like two models cast from the same mold, just filled with different ingredients.
Preparation: The Production Kernel is Our Starting Point
First, make sure you are currently running the production kernel we just compiled. This is important because the upcoming debug kernel configuration is derived from the running system: localmodconfig uses the list of currently loaded modules as its baseline.
$ uname -r
5.10.60-prod01
See that? The -prod01 suffix confirms we're sitting on the right foundation.
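If you script your builds, this check can be automated. The following is a minimal sketch, not part of the original workflow: the helper name has_suffix is hypothetical, and the -prod01 default simply mirrors the suffix shown above.

```shell
#!/bin/sh
# Return 0 if the given kernel release string ends with the expected suffix.
# The "-prod01" default mirrors the production kernel's suffix shown above.
has_suffix() {
    release=$1
    suffix=${2:--prod01}
    case "$release" in
        *"$suffix") return 0 ;;
        *)          return 1 ;;
    esac
}

if has_suffix "$(uname -r)"; then
    echo "OK: running the production kernel"
else
    echo "Warning: $(uname -r) is not the -prod01 kernel; reboot into it first" >&2
fi
```

Passing a second argument (for example "-dbg01", an assumed suffix) lets the same helper check for a different kernel later.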
Staking Out Territory: An Independent Workspace
Tempting as it may be, do not modify the configuration directly in the production kernel's source directory.
You must create a clean working directory. Yes, this will take up a few extra GB of disk space, but it's absolutely worth it—imagine the despair of your production and debug kernel configuration files getting mixed up and overwriting each other. Keeping them isolated is the first step to maintaining your sanity.
mkdir -p ~/lkd_kernels/debugk
Next, just like before, extract the kernel source tarball into this new directory. We'll reuse the linux-5.10.60.tar.xz we downloaded earlier:
cd ~/lkd_kernels
tar xf linux-5.10.60.tar.xz --directory=debugk/
Configuration Strategy: Inheritance and Mutation
Enter the new directory, and we'll use the localmodconfig strategy again to generate the initial configuration. This produces a streamlined config containing only the modules required by the currently running hardware.
But this time, the "currently running kernel" is our custom production kernel. This means the debug kernel will inherit the production kernel's hardware adaptation features, while layering debug capabilities on top.
cd ~/lkd_kernels/debugk/linux-5.10.60
lsmod > /tmp/lsmod.now
make LSMOD=/tmp/lsmod.now localmodconfig
Configuration Core: Enabling Kernel Hacking
The entry point to the configuration interface is the same as before:
make menuconfig
If you find the sheer number of menus overwhelming, don't worry—there's a shortcut: press the / key (just like in vi), then type the name of the config option you're looking for to jump directly to it.
Most of the kernel's debugging facilities are hidden in a single menu—right at the bottom of the main menu, with a name that sounds a bit like The Matrix: Kernel hacking.
In this menu, we can see a dazzling array of debug switches. There are so many it's slightly intimidating, and most of them will look obscure at first. Don't worry: you don't need to understand every detail yet; we'll break them down one by one as we encounter them in later chapters.
Our task right now is to flip on the most critical switches.
For convenience, I've put together a table (Table 1.1) listing the typical configuration differences between a production kernel and a debug kernel. This is by no means an exhaustive list, but it's sufficient as a starting point.
💡 Table 1.1: Kernel Configuration Variable Comparison
(Note: The original table comparison is preserved here, reflecting the differences between the production and debug kernels on key options like CONFIG_DEBUG_INFO, CONFIG_KASAN, and CONFIG_LOCKDEP. The table is quite long, covering configuration recommendations for multiple subsystems including General setup, Kernel hacking, and more.)
- <D> stands for "Depends" (left to the architect's discretion), depending on your product's High Availability (HA) and security requirements.
- [1] Note: If you enable CONFIG_DEBUG_INFO_BTF on Ubuntu 20.04, compilation might fail because the system's bundled pahole is too old (v1.16+ is required). I've included a v1.17 package in the code repository for emergencies:
sudo dpkg -i dwarves_1.17-1_amd64.deb
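Before enabling CONFIG_DEBUG_INFO_BTF, it can save a failed build to check the installed pahole version first. The sketch below makes two assumptions: that pahole --version prints a string such as "v1.17", and that your sort supports the -V (version sort) flag, as GNU sort does. The helper name version_ge is hypothetical.

```shell
#!/bin/sh
# Return 0 if version $1 is greater than or equal to version $2.
# Relies on sort -V (version sort): if the smaller of the two is $2, then $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required=1.16
if command -v pahole >/dev/null 2>&1; then
    # Assumption: the output looks like "v1.17"; strip the leading "v".
    have=$(pahole --version | sed 's/^v//')
    if version_ge "$have" "$required"; then
        echo "pahole $have is new enough for CONFIG_DEBUG_INFO_BTF"
    else
        echo "pahole $have is older than $required; upgrade dwarves first" >&2
    fi
else
    echo "pahole not found; install the dwarves package" >&2
fi
```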
Saving the Configuration: Preventing Future Headaches
After you've picked a whole bunch of debug options like at a buffet, remember to save this "menu." Do not skip this step—otherwise, when you forget what you selected or the config gets accidentally overwritten, you'll be in tears.
cp -af .config ~/lkd_kernels/kconfig_dbg01
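With the configuration saved, it becomes easy to see exactly what the debug kernel changed relative to the production one. The following is a rough sketch of such a comparison. The helper kconfig_diff and the file name kconfig_prod01 are hypothetical; substitute whatever name you used when saving the production configuration.

```shell
#!/bin/sh
# Print the CONFIG_ lines that differ between two kernel config files.
# "<" marks a line only in the first file, ">" a line only in the second.
# Options recorded as "# CONFIG_FOO is not set" are skipped, so an option that
# is off on one side appears only on the side where it is enabled.
kconfig_diff() {
    a=$(mktemp)
    b=$(mktemp)
    grep '^CONFIG_' "$1" | sort > "$a"
    grep '^CONFIG_' "$2" | sort > "$b"
    diff "$a" "$b" | grep '^[<>]' || true
    rm -f "$a" "$b"
}

# kconfig_prod01 is a hypothetical file name; use whatever you saved earlier.
prod="$HOME/lkd_kernels/kconfig_prod01"
dbg="$HOME/lkd_kernels/kconfig_dbg01"
if [ -f "$prod" ] && [ -f "$dbg" ]; then
    kconfig_diff "$prod" "$dbg"
fi
```

The kernel source tree also ships a scripts/diffconfig tool that serves the same purpose with more readable output; the version above only illustrates the idea with standard utilities.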
Compiling and Installing: The Birth of a Heavyweight
Now, start compiling. Because we've enabled a massive amount of debug information (especially symbol tables), the resulting build artifact will be a "heavyweight."
make -j8 all
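The -j8 above assumes a machine with roughly eight cores. As a small sketch, the job count can be derived from the machine instead; build_jobs is a hypothetical helper, and nproc comes from GNU coreutils.

```shell
#!/bin/sh
# Print the number of parallel build jobs to use: one per online CPU,
# falling back to 1 when nproc (GNU coreutils) is unavailable.
build_jobs() {
    nproc 2>/dev/null || echo 1
}

echo "suggested build command: make -j$(build_jobs) all"
```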
Once compilation is done, compare the sizes of the two core files—you'll be shocked:
$ ls -lh arch/x86/boot/bzImage vmlinux
-rw-r--r-- 1 letsdebug letsdebug 18M Aug 20 12:35 arch/x86/boot/bzImage
-rwxr-x--x 1 letsdebug letsdebug 1.1G Aug 20 12:35 vmlinux
Notice that? vmlinux, the uncompressed kernel binary, is a whopping 1.1 GB.
This is actually a counterintuitive moment. The kernel images (bzImage) we usually see are only a few dozen MB, so why is this one so huge?
The answer lies in the debug options we just enabled. CONFIG_DEBUG_INFO embeds the full debug information (DWARF data) in the binary; if you also enabled CONFIG_KASAN (Kernel Address Sanitizer), the compiler instruments memory accesses with checks against shadow memory, which adds a massive amount of code. It's like putting a full set of sensor armor on a lean athlete: the bulk naturally expands.
Here, bzImage is still around 18M (because it is built from a stripped copy of vmlinux and then compressed), while the size of vmlinux truly reflects the degree of "bloat."
Finally, install the modules, update the initramfs and boot menu—the process is exactly the same as for the production kernel:
sudo make modules_install && sudo make install
Window: How to View the Current Kernel's Configuration
Sometimes you'll encounter a running machine and need to figure out how its kernel was compiled, but you can't find the source directory. The kernel actually has a built-in "black box" feature, but it depends on a config option: CONFIG_IKCONFIG (together with CONFIG_IKCONFIG_PROC, which exposes the stored configuration as /proc/config.gz).
In our configuration, we typically set it like this:
- Debug kernel: CONFIG_IKCONFIG=y (built-in directly, always visible)
- Production kernel: CONFIG_IKCONFIG=m (built as a module, loaded only when needed, adding an extra layer of security cover)
If the production kernel has it set as a module (m), /proc/config.gz does not exist by default; it appears only after someone with root privileges loads the module. This is a neat arrangement: developers can load it when needed, while regular users cannot expose it on their own.
Let's try this out. On a machine running the production kernel, first try to view the config file:
$ ls -l /proc/config.gz
ls: cannot access '/proc/config.gz': No such file or directory
Can't see it? Right, because the module hasn't been loaded yet.
Now, let's plug in that "black box" module with sudo:
$ sudo modprobe configs
$ ls -l /proc/config.gz
-r--r--r-- 1 root root 34720 Oct 5 19:35 /proc/config.gz
There it is. Now we can use zcat to decompress and query its contents, even querying its own config option—it's almost like looking in a mirror:
$ zcat /proc/config.gz | grep IKCONFIG
CONFIG_IKCONFIG=m
CONFIG_IKCONFIG_PROC=y
Perfect. This confirms that the current kernel provides this functionality as a module (m).
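If you query options like this often, the lookup can be wrapped in a small function. This is a minimal sketch: kconfig_get is a hypothetical helper, it takes the option name without the CONFIG_ prefix, and it uses gzip -dc, which is equivalent to zcat.

```shell
#!/bin/sh
# Print the value of a kernel config option (name given without the CONFIG_
# prefix) from a gzip-compressed config file. Prints "unset" if the option is
# absent or recorded as "is not set". The file defaults to /proc/config.gz.
kconfig_get() {
    opt=$1
    file=${2:-/proc/config.gz}
    val=$(gzip -dc "$file" 2>/dev/null | sed -n "s/^CONFIG_${opt}=//p" | head -n 1)
    echo "${val:-unset}"
}

if [ -r /proc/config.gz ]; then
    echo "IKCONFIG:   $(kconfig_get IKCONFIG)"
    echo "DEBUG_INFO: $(kconfig_get DEBUG_INFO)"
else
    echo "/proc/config.gz missing; try: sudo modprobe configs" >&2
fi
```

Because the pattern anchors on "CONFIG_<name>=", asking for IKCONFIG does not accidentally match CONFIG_IKCONFIG_PROC.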
🍔 Food for Thought
In the configuration Table 1.1 above, I set the production kernel's CONFIG_KALLSYMS_ALL to <D> (left to the architect's discretion). You might ask: since this is a production kernel, shouldn't we disable the ability to view all kernel symbols for security reasons?
Intuition certainly says yes. But I want to tell an old story: Mars Pathfinder.
In 1997, shortly after landing, Mars Pathfinder began experiencing frequent resets. Glenn Reeves, the JPL software team lead, said something profound during the post-mortem:
"The software flying on Mars actually included many debug features used in the lab. Although we didn't use them in space (because the data volume was too large to transmit back to Earth), these features weren't left in by 'accident'—they were kept there deliberately. We firmly believed in one philosophy: Test what you fly and fly what you test."
Sometimes, in the harsh environment of production, retaining even a tiny bit of debugging capability and logging can save your life at a critical moment. This is a trade-off between "perfect security" and "survivability."