
6.7 Practical Tips for Developers

Honestly, no amount of debugging tools beats writing cleaner code in the first place.

We won't discuss debugging tools in this section. Instead, let's talk about something more pragmatic—how to fundamentally prevent kernel memory issues through better coding habits. It echoes the old saying: prevention is better than cure.

The Modern Approach to Preventing Leaks: Devres API

If you are writing modern Linux drivers, you must know about a "resource management" mechanism provided by the kernel, commonly known as the devres API.

Its biggest selling point is simple: you allocate, and it frees.

Although the kernel has many functions with the devm_* prefix, as driver authors, the two memory allocation interfaces we use most are:

void *devm_kmalloc(struct device *dev, size_t size, gfp_t gfp);
void *devm_kzalloc(struct device *dev, size_t size, gfp_t gfp);

You might have noticed the emphasis on "driver authors." Just look at the function signature: the first parameter is struct device *. A driver gets one for every device it binds to; kernel code that is not tied to a device has nothing to pass here.

The reason this mechanism is so handy is that the kernel guarantees: when a driver detaches, or when a kernel module is unloaded, the resource management framework will automatically free the memory allocated through these APIs.

This directly improves code robustness. Why? Because we are all human and we all make mistakes. And in the kernel, one of the easiest mistakes to make—especially when handling various error return paths—is forgetting to free memory.
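
To make "you allocate, and it frees" concrete, here is a minimal userspace sketch of the idea. It is not the kernel's devres implementation: struct toy_device, toy_devm_kzalloc() and toy_device_release_all() are hypothetical names invented for this illustration. The only point is that each allocation is recorded on the device, so a single release step at detach time can free all of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace toy model, NOT the kernel's struct device or devres code. */
struct toy_res {
    struct toy_res *next;   /* next managed allocation of the same device */
    void *mem;              /* payload handed to the caller */
};

struct toy_device {
    struct toy_res *managed;  /* head of this device's managed list */
    size_t nr_managed;        /* allocations currently recorded */
};

/* Allocate zeroed memory and record it on the device (cf. devm_kzalloc). */
static void *toy_devm_kzalloc(struct toy_device *dev, size_t size)
{
    struct toy_res *res;

    assert(dev != NULL);
    res = malloc(sizeof(*res));
    if (!res)
        return NULL;
    res->mem = calloc(1, size);
    if (!res->mem) {
        free(res);
        return NULL;
    }
    /* Link the new record at the head of the device's list. */
    res->next = dev->managed;
    dev->managed = res;
    dev->nr_managed++;
    return res->mem;
}

/* Called once when the device detaches: frees everything recorded. */
static size_t toy_device_release_all(struct toy_device *dev)
{
    size_t freed = 0;

    assert(dev != NULL);
    while (dev->managed) {
        struct toy_res *res = dev->managed;

        dev->managed = res->next;
        free(res->mem);
        free(res);
        freed++;
    }
    dev->nr_managed = 0;
    return freed;
}
```

In a real driver the equivalent call would look like devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL) inside probe(), and the driver core performs the release step for you; none of the error paths need a matching kfree.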

Lesser-Known Facts About the Devres API

While it feels like magic, there are a few things you should keep in mind when using it:

  1. Don't blindly replace everything. Don't assume devm_kzalloc is inherently superior and replace every kmalloc in your code with it. This "automatically managed" memory only suits allocations whose lifetime matches the device binding, typically those made during driver initialization and in probe(). Why? Because its lifecycle is tied to the device. If you use it in a regular function that gets called frequently, none of that memory is freed until the driver detaches from the device. Strictly speaking that is not a leak, but memory usage keeps growing all the same.

  2. Prefer devm_kzalloc. It not only allocates memory but also zeroes it out. This saves you from an entire class of rookie mistakes: Uninitialized Memory Reads (UMR). A count over the 5.10.60 kernel source finds more than 5,000 devm_kzalloc call sites, so developers clearly know a good deal when they see one.

  3. Standard parameters. The second parameter is the size in bytes, and the third is the GFP flag (e.g., GFP_KERNEL), which is no different from a standard kmalloc. The only new face is the first parameter, dev, which points to your device structure.

  4. Don't manually free, unless... Once you use this API, stop worrying about kfree. The kernel does provide devm_kfree() for manual freeing, but if you find yourself needing to free the memory before the driver detaches, it usually means you're using devm in the wrong place. If you're going to manage the memory manually, why use devm in the first place?

  5. GPL only. These functions are exported with EXPORT_SYMBOL_GPL, so only modules that declare a GPL-compatible license can use them. That is a deliberate choice by the kernel community.

A Few More "Fatal Mistakes" to Avoid

Beyond using the right tools, some pitfalls are purely human error. Here are a few of the most common memory-related bugs you'll encounter during development. We recommend keeping this page in mind during Code Review.

  • Wrong GFP flags. This is the classic way to crash: using GFP_KERNEL in an atomic context (e.g., inside an interrupt handler, or while holding a spinlock). A GFP_KERNEL allocation is allowed to sleep, and sleeping in atomic context can hang or crash the system. In these contexts, you must use GFP_ATOMIC. History repeats itself; here is a patch that fixes one real instance of this bug: https://lore.kernel.org/lkml/1420845382-25815-1-git-send-email-khoroshilov@ispras.ru/

  • Mismatched allocation and freeing. Never use vfree to free memory allocated with kmalloc, and vice versa: kfree pairs with kmalloc, vfree with vmalloc. Mixing them is like using a hammer to drive a screw, and it typically ends in allocator corruption or an Oops.

  • Not checking return values. Memory allocation can fail. If kmalloc returns NULL and you immediately dereference it, you get a NULL pointer dereference and a kernel Oops. Even if it feels a bit verbose, always write:

    if (unlikely(!p)) {
        /* handle the error, e.g. return -ENOMEM */
    }

  • Redundant if checks. Fans of defensive programming often write:

    if (p)
        kfree(p);

    Actually, kfree(NULL) is safe; the kernel already handles this. Writing it this way is harmless but redundant. Conversely, some people assume kfree(p) sets p to NULL. This is an illusion. kfree does not modify the pointer variable's value, so if you intend to use this pointer later, you should manually set it to NULL. Otherwise, you'll end up accessing freed memory (a dangling pointer).

  • Ignoring "internal fragmentation". kmalloc rounds every request up to the next available size class, which for larger sizes is a power of two. When you request 4097 bytes, the Slab Allocator will therefore typically give you 8192 bytes, which means you're wasting nearly half of the memory. How can you verify this? Use the ksize() API.

    p = kmalloc(4097, GFP_KERNEL);
    n = ksize(p); /* n is most likely 8192 */

    At this point, you should ask yourself: can you shrink the data structure so that it fits a size class exactly, such as 4096 bytes? You can also use slabinfo -L to check the system's overall fragmentation situation.
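
The rounding behind internal fragmentation can be approximated with a small userspace model. The size-class table below is an assumption that reflects a common configuration (powers of two plus the 96- and 192-byte classes); the real set depends on the architecture and kernel configuration, and toy_kmalloc_class() and toy_kmalloc_waste() are hypothetical helpers, not kernel APIs. On a live kernel, ksize() remains the way to get the actual number.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace model of common kmalloc size classes. This table is an
 * assumption; the real set depends on architecture and configuration.
 */
static const size_t toy_kmalloc_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the smallest modelled class that fits size, or 0 if none does. */
static size_t toy_kmalloc_class(size_t size)
{
    size_t i;
    size_t n = sizeof(toy_kmalloc_classes) / sizeof(toy_kmalloc_classes[0]);

    assert(size > 0);
    for (i = 0; i < n; i++) {
        if (size <= toy_kmalloc_classes[i])
            return toy_kmalloc_classes[i];
    }
    return 0;
}

/* Bytes handed out but never requested: the internal fragmentation. */
static size_t toy_kmalloc_waste(size_t size)
{
    size_t cls = toy_kmalloc_class(size);

    return cls ? cls - size : 0;
}
```

Under this model a 4097-byte request lands in the 8192-byte class and wastes 4095 bytes, while a 4096-byte request wastes nothing.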


The Ultimate Tool Showdown

Remember the test cases we ran in the previous chapter and this one? Now is the time to connect all the dots.

The table below (Table 6.4) is an enhanced version of our previous table, with a new rightmost column added—the performance of the SLUB Debug Framework.

Table 6.4 — The Ultimate Showdown of Common Memory Defect Detection Methods

(This corresponds to the content of Table 6.4 in the original text)

This table covers:

  • Plain kernel: Nothing enabled, running bare.
  • Compiler warnings: Relying on GCC's -Wall.
  • KASAN: Kernel Address Sanitizer.
  • UBSAN: Undefined Behavior Sanitizer.
  • SLUB Debug: The focus of this chapter, a debug kernel with slub_debug enabled.

Let's quickly recap the patterns behind the table:

  • KASAN is the most comprehensive guardian. It can catch almost all Out-Of-Bounds (OOB) accesses, whether on global variables, the stack, or dynamically allocated memory. In contrast, UBSAN is powerless against OOB on dynamic memory (slab).
  • UBSAN's specialty. It can't catch overflows on dynamically allocated memory, but it's an expert at catching "Undefined Behavior (UB)" (like test case 8.x). This is exactly KASAN's blind spot.
  • The ones that got away. Neither KASAN nor UBSAN can catch the first three categories of issues: Uninitialized Memory Reads (UMR), Use-After-Return (UAR), and Memory Leaks. UMR can still be somewhat flagged by compiler warnings and static analysis tools (like cppcheck).
  • SLUB Debug's home turf. It excels at catching slab-layer memory corruption, but is helpless against anything outside of that.
  • Kmemleak's signature move. It is specifically designed to tackle "leaks." As long as memory allocated via kmalloc, vmalloc, or kmem_cache_alloc isn't freed, it will dig it up for you.

A Few Additional Notes

The footnotes in the table are worth a closer look:

  • [V1]: If the system encounters an Oops (crash) or hang, even if it seems fine afterward, the kernel is actually in an unstable state. Don't push your luck.
  • [S1]: When SLUB debugging is enabled (slub_debug=FZPU), it can catch both write overflows and write underflows on slab memory. But like UBSAN, it only catches them when the access goes through an incorrect array index. If you write directly via an incorrect pointer offset, it might miss it. Furthermore, it usually only catches out-of-bounds writes; out-of-bounds reads go unnoticed.

Chapter Echoes

With this, we have finally completed the grand puzzle of catching kernel memory defects.

Think back on what we actually did in this chapter (and the previous one): we built a cognitive kaleidoscope. Every tool—KASAN, UBSAN, SLUB Debug, Kmemleak—is a facet of the lens. Looking at any single facet alone only reveals a distorted, partial truth; only by overlapping them can you see the complete picture of kernel memory errors.

Along the way, you'll discover one fact: there is no silver bullet. KASAN is powerful, but it has performance overhead and can't see UMR; kmemleak is clever, but it has false positives and can only detect leaks. As a kernel developer, your value lies not in memorizing which command to run, but in knowing exactly which scalpel to pick up the instant symptoms appear to dissect the problem.

Don't forget the fear we mentioned at the beginning of this chapter—memory leaks. Through kmemleak, you now have the ability to find things in the dark expanse of memory. And don't forget devm_kzalloc, which puts you on a safe starting line from the very beginning.

In the next chapter, we will pull our gaze back from the micro-battlefield of "memory" and look at the macro-picture when the entire system crashes: Kernel Oops. When the kernel finally gives up and spits out a pile of hex codes, how do we as developers read its dying words? That is an even more hardcore detective game.

Are you ready? See you in the next chapter.


Exercises

Exercise 1: Understanding

Question: While investigating a kernel crash log, you find a SLUB Allocator error report: 'BUG kmalloc-32: Right Redzone overwritten'. Please explain what type of memory access issue this error specifically indicates, and state what the Magic Value filled in the Red Zone is during SLUB debugging.

Answer & Analysis

Answer: This error indicates a buffer overflow, specifically a write out-of-bounds (right/forward overflow), where the program wrote past the end of the allocated object and intruded into the red zone. The red zone of an allocated object is filled with 0xcc (SLUB_RED_ACTIVE); the red zone of a free object holds 0xbb (SLUB_RED_INACTIVE).

Analysis: 'Right Redzone overwritten' is a type of error reported by the SLUB debugging mechanism. When Red Zone (the Z flag) is enabled, the allocator places guard areas before and after the object's memory. If the values in such an area are found to be modified during freeing or checking, it means the code performed an out-of-bounds write. According to the definitions in include/linux/poison.h, SLUB fills the red zones with SLUB_RED_ACTIVE (0xcc) while the object is in use and SLUB_RED_INACTIVE (0xbb) while it is free. If it detects that a red zone byte no longer holds the expected value, it triggers this error. Do not confuse this with 0x5a (POISON_INUSE, ASCII 'Z'), which SLUB uses for padding bytes rather than for the red zone.
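
The red zone check can be sketched in userspace as follows. This is a toy model, not SLUB's code: the layout, the 8-byte zone width, and the toy_* names are assumptions made for illustration. The only value taken from the kernel is 0xcc, which mirrors SLUB_RED_ACTIVE in include/linux/poison.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_RED_ACTIVE 0xcc  /* mirrors SLUB_RED_ACTIVE in poison.h */
#define TOY_OBJ_SIZE   32    /* payload size, as in a kmalloc-32 object */
#define TOY_RZ_SIZE    8     /* red zone width chosen for this sketch */

/* Userspace toy layout: [left red zone][object][right red zone]. */
struct toy_slab_obj {
    unsigned char left[TOY_RZ_SIZE];
    unsigned char data[TOY_OBJ_SIZE];
    unsigned char right[TOY_RZ_SIZE];
};

/* Arm both red zones when the object is handed out. */
static void toy_obj_alloc(struct toy_slab_obj *o)
{
    assert(o != NULL);
    memset(o->left, TOY_RED_ACTIVE, sizeof(o->left));
    memset(o->data, 0, sizeof(o->data));
    memset(o->right, TOY_RED_ACTIVE, sizeof(o->right));
}

/* Return 1 if every byte of the zone still holds the red zone pattern. */
static int toy_zone_intact(const unsigned char *zone, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (zone[i] != TOY_RED_ACTIVE)
            return 0;
    }
    return 1;
}

/* Check done at free time: 0 = clean, 1 = left zone hit, 2 = right zone hit. */
static int toy_obj_check(const struct toy_slab_obj *o)
{
    assert(o != NULL);
    if (!toy_zone_intact(o->left, sizeof(o->left)))
        return 1;
    if (!toy_zone_intact(o->right, sizeof(o->right)))
        return 2;
    return 0;
}
```

Note that the damage is only noticed when toy_obj_check() runs, just as SLUB reports a red zone overwrite when it next validates the object rather than at the moment of the stray write.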

Exercise 2: Application

Question: You need to enable the following features via the kernel boot parameter slub_debug: automatically filling freed memory with a specific value (to detect UAF), and performing metadata validation during allocation/freeing. Please provide the correct parameter format, and state what the original fill value (in hexadecimal) of the corrupted memory area is when a Use-After-Free (UAF) write operation occurs.

Answer & Analysis

Answer: Parameter format: slub_debug=PF. The corrupted fill value is 0x6b.

Analysis: According to the slub_debug parameter documentation, 'P' stands for Poisoning, which means filling objects with a specific byte pattern; 'F' stands for Sanity checks. Once the 'P' flag is enabled, objects are filled when freed and before they are first handed out. Based on the kernel's definitions, the Poison value used for Use-After-Free detection (POISON_FREE) is 0x6b (corresponding to ASCII 'k'), with the last byte of the object set to 0xa5 (POISON_END). When a UAF write occurs, SLUB will later detect that the value at that location is no longer 0x6b and will report 'Poison overwritten'.
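
As a rough userspace illustration of how poisoning exposes a use-after-free write, consider the sketch below. The toy_* helpers are hypothetical and this is not SLUB's implementation; only the two byte values mirror the kernel's POISON_FREE (0x6b) and POISON_END (0xa5).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_POISON_FREE 0x6b  /* mirrors POISON_FREE in poison.h */
#define TOY_POISON_END  0xa5  /* mirrors POISON_END in poison.h */

/* On free: fill the object with 0x6b and mark its last byte with 0xa5. */
static void toy_poison_on_free(unsigned char *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    memset(obj, TOY_POISON_FREE, size - 1);
    obj[size - 1] = TOY_POISON_END;
}

/*
 * Before reusing the object: verify the poison pattern.
 * Returns the offset of the first corrupted byte, or -1 if intact.
 */
static long toy_check_poison(const unsigned char *obj, size_t size)
{
    size_t i;

    assert(obj != NULL && size > 0);
    for (i = 0; i + 1 < size; i++) {
        if (obj[i] != TOY_POISON_FREE)
            return (long)i;
    }
    if (obj[size - 1] != TOY_POISON_END)
        return (long)(size - 1);
    return -1;
}
```

A write through a stale pointer after toy_poison_on_free() changes at least one byte of the pattern, so the next toy_check_poison() call returns its offset; this is the moment the real allocator would print 'Poison overwritten'.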

Exercise 3: Application

Question: While analyzing a driver module's memory usage efficiency, you use the slabinfo tool and find that although the kmalloc-192 cache is active, it has high internal fragmentation. Suppose the code calls kmalloc(140), and ksize() returns 192. What is this extra memory (52 bytes) called in slabinfo? When writing kernel code to release resources, which coding style should you adopt to avoid duplicating release logic across multiple error-handling paths?

Answer & Analysis

Answer: This is called Internal fragmentation. You should adopt the Centralized exiting of functions coding style.

Analysis: A request for 140 bytes does not fit the 128-byte size class, so the Slab Allocator serves it from the next one, kmalloc-192. The extra 52 bytes are allocated to the requester but essentially wasted; this is called internal fragmentation. In terms of code maintenance, to prevent memory leaks, the kernel community recommends using goto labels to consolidate cleanup code at the end of the function, known as the 'Centralized exiting of functions' style.
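
Here is a minimal sketch of the centralized-exit style, written as userspace C so it can be compiled and run anywhere. struct toy_ctx, toy_setup() and the fail_at failure-injection parameter are hypothetical, and malloc/free stand in for kmalloc/kfree. The pattern is what matters: each label undoes exactly one earlier step, and the labels are laid out in reverse order of acquisition.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_ctx {
    char *rx_buf;
    char *tx_buf;
};

/*
 * fail_at injects a failure for demonstration: 1 = first allocation fails,
 * 2 = second allocation fails, 3 = a later setup step fails, 0 = no failure.
 * Returns 0 on success or -1 on failure; on failure nothing stays allocated.
 */
static int toy_setup(struct toy_ctx *ctx, int fail_at)
{
    int ret = -1;

    assert(ctx != NULL);
    ctx->rx_buf = NULL;
    ctx->tx_buf = NULL;

    ctx->rx_buf = (fail_at == 1) ? NULL : malloc(256);
    if (!ctx->rx_buf)
        goto out;

    ctx->tx_buf = (fail_at == 2) ? NULL : malloc(256);
    if (!ctx->tx_buf)
        goto err_free_rx;

    if (fail_at == 3)   /* stands in for any later step that can fail */
        goto err_free_tx;

    return 0;

    /* Cleanup runs in reverse order of acquisition. */
err_free_tx:
    free(ctx->tx_buf);
    ctx->tx_buf = NULL;
err_free_rx:
    free(ctx->rx_buf);
    ctx->rx_buf = NULL;
out:
    return ret;
}
```

Adding a new resource means adding one allocation and one label, rather than revisiting every earlier error path.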

Exercise 4: Thinking

Question: Suppose you are a kernel developer maintaining a PCI device driver. The driver allocates a large amount of memory in the probe function via kmalloc for device register mapping, but forgets to free it in the remove function due to an early return, causing a memory leak. In addition to manually fixing the goto label, you decide to refactor the code to leverage modern kernel APIs to completely eliminate such human errors. Which API mechanism would you use? How does it work? Also, if you want to detect this type of leak, besides code review, which configuration option (CONFIG_...) can you use at runtime to help discover it?

Answer & Analysis

Answer: You should use Resource-managed memory allocation (Devres API), such as devm_kmalloc. It ties the memory lifecycle to the device, automatically freeing it when the device detaches or the driver is unloaded. For runtime detection, you can use CONFIG_DEBUG_KMEMLEAK.

Analysis: This is a comprehensive thinking question. For the common resource leak problem in drivers, the Devres API (like devm_kmalloc) is the best solution. It leverages the device model core to automatically manage resources, eliminating the need to manually write kfree. As a debugging tool, kmemleak (enabled via CONFIG_DEBUG_KMEMLEAK) can discover dynamically allocated memory that is no longer referenced but not freed by scanning memory and pointer references. This is perfect for verifying the fix during the testing phase.


Key Takeaways

This chapter focused on how to use the Linux kernel's built-in lightweight tools, SLUB and kmemleak, to track down "silent" and hard-to-reproduce memory corruption and leak issues. As the modern kernel's default Slab Allocator, SLUB can enforce "poisoning" and "red zone" monitoring on memory by enabling CONFIG_SLUB_DEBUG and using the slub_debug kernel parameter (e.g., FZPU). This converts random bugs like Uninitialized Memory Reads (UMR), Use-After-Free (UAF), Out-Of-Bounds (OOB) writes, and double frees into deterministic error reports at runtime.

The core mechanism of SLUB debugging lies in using specific magic numbers (like 0x6b representing uninitialized or freed memory) to fill objects and the areas surrounding them. When a program illegally writes over these "landmines," the kernel detects the damage the next time it checks the object (typically on allocation or free) and prints a detailed report with stack traces, helping developers pinpoint the culprit at the source code level; an illegal read, by contrast, simply returns the recognizable poison value. Compared to the more powerful but higher-overhead KASAN, SLUB debugging provides a more flexible "on-demand" strategy, capable of targeting specific caches for checks while catching most slab memory corruption without severely degrading system performance.

For memory leak issues, the guide introduced kmemleak, a scan-based detection tool. It discovers leaks by periodically traversing kernel memory to find allocated memory blocks that have no pointer references. The key to using kmemleak lies in correctly configuring the environment (building with CONFIG_DEBUG_KMEMLEAK, and passing the kmemleak=on boot parameter when the kernel is configured with kmemleak off by default) and mastering the standard troubleshooting workflow (trigger the leak -> manual scan -> view the report -> clear the records), leaving invisible memory black holes that "only allocate, never free" nowhere to hide.

Beyond tool usage, the guide also introduced userspace helper tools like slabinfo and slabratetop for monitoring slab cache usage and allocation rates. By analyzing the statistics output by these tools, developers can quickly identify which kernel objects are consuming too much memory in the system, or which caches have abnormal allocation frequencies. Combined with mechanisms like kprobe to trace specific call paths, this enables "X-ray vision" and precise analysis of kernel memory behavior.