
Chapter 4: Hello, Kernel — Linux Kernel Modules and Kernel Architecture Basics

Chapter Prologue: Privilege and Boundaries

Imagine this scenario: you're sitting in front of a running Linux machine with a freshly written piece of C code. In user space, you can freely compile and run it, watch it print "Hello, world" in the terminal, and exit — if it crashes, no big deal, the worst that happens is the process gets killed, and the operating system remains rock solid.

Now, I want you to imagine that code as a bullet about to be injected into the heart of the operating system.

This isn't an exaggeration. In the world of Linux kernel programming, once code enters kernel space, it is no longer a supervised process — it is the operating system. It runs at the highest privilege level, with the ability to access all physical memory, directly manipulate hardware ports, and even intercept system calls. There are no protective mechanisms to pull you back here; if a pointer goes astray, the consequence isn't a segmentation fault — it's a system crash or reboot. This is exactly why kernel development is both thrilling and terrifying.

But the traditional kernel development workflow is downright brutal: if you want to change a single line of driver code, you have to reconfigure the entire kernel tree, spend hours recompiling bzImage, and then reboot the machine. This "modify-compile-reboot" endless loop is enough to drain all your creativity.

Is there a way to load and unload code as flexibly as we do with user-space programs, while retaining the sheer power of kernel mode?

This is the core problem we'll solve in this chapter. Linux's answer is the LKM (Loadable Kernel Module) framework. It shatters the stereotype of the kernel as a "static monolith" and breathes a soul of dynamic extensibility into Linux.

In this chapter, we'll build a foundational understanding of kernel architecture — understanding what the "boundary" dividing user and kernel space truly means. Then, we'll write our first kernel module with our own hands, watching it transform from a simple .c file into a .ko binary, get fired into kernel memory by the insmod command, and leave its first cry in the dmesg logs.

This isn't just about writing code; it's about making "physical contact" with the underlying logic of an operating system.


4.1 Architectural Landscape: User Space and Kernel Space

Before we write any code, we need to take a step back and see what the "territory" we're entering actually looks like.

4.1.1 Two Worlds, Two Privilege Levels

Modern processors aren't just calculators that execute instructions — they're strict gatekeepers. They support different privilege levels.

On x86, these are called rings, from Ring 0 (most privileged) to Ring 3 (least privileged); on 64-bit ARM, they are Exception Levels EL0 through EL3, where EL0 is the least privileged. Regardless of the naming, the core idea is the same: security and isolation.

The operating system leverages these hardware features to divide the virtual address space into two distinct regions:

  1. User Space:

    • This is where all applications — your browser, text editor, database processes — live.
    • They run in unprivileged mode.
    • They are strictly restricted: they cannot directly access hardware, cannot arbitrarily read or write memory, and can only request services through specific "gates."
  2. Kernel Space:

    • This is the fortress where the core of the operating system — the kernel, drivers, and network stack — resides.
    • It runs in the highest privilege mode.
    • It is the master of the system: it can access all memory, manipulate all hardware, and execute any CPU instruction.

💡 Key Point Beyond the Code

Many beginners confuse the Root user with kernel privilege.

  • Root is a "superuser" defined at the operating system's software layer, with a UID of 0.
  • Kernel privilege is a state at the CPU's hardware layer.

Any process, Root or not, enters kernel mode when it makes a system call, but while there it only runs the kernel's code, never its own. The permissions of kernel code itself are therefore far greater than simply "running a script as Root." When you write a kernel module, your code wields this absolute power of life and death.

4.1.2 The Bridge: System Calls

Since user space and kernel space are isolated, how do applications ask the kernel to do work? For example, how does a program display "Hello, world" on the screen?

It can't directly manipulate the graphics card. It has to knock on the door.

The only legitimate entry point is a system call. APIs like open(), read(), write(), and fork() might look like ordinary library functions, but they are actually doors leading to kernel space. When a user process calls them, the CPU switches from unprivileged mode to privileged mode, jumps to predefined code in the kernel to execute, and then returns to user space.

This is the rule we must follow: user space communicates with the kernel through library APIs and system calls.
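
As a rough user-space illustration of "knocking on the door," the sketch below prints the greeting by calling write() directly instead of going through printf(). The helper name say_hello is made up for this example; write() is the standard POSIX wrapper that crosses into the kernel.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send a greeting to an open file descriptor.
 * write() is the thin libc wrapper around the write system call;
 * the output itself is performed by the kernel, not by this process. */
ssize_t say_hello(int fd)
{
	static const char msg[] = "Hello, world\n";

	assert(fd >= 0);                    /* caller must pass a valid descriptor */
	return write(fd, msg, strlen(msg)); /* bytes written, or -1 with errno set */
}
```

Calling say_hello(STDOUT_FILENO) from a small main() and running the program under strace should show the single write call crossing the boundary.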

4.1.3 Inside the Kernel: Monolithic Kernel and Modules

The Linux kernel uses a monolithic kernel architecture. This means all core components of the kernel — the scheduler, memory management, VFS, drivers, and network stack — share the same kernel address space. They can call each other's functions directly, making it highly efficient.

Although Linux is monolithic, it isn't rigid. Through the LKM framework, modern kernels allow us to write certain parts of the code (mainly drivers and filesystems) as independent modules that can be dynamically inserted into or removed from this massive address space.


4.2 The Dynamics: Why Do We Need LKM?

Let's return to our original motivation: why do kernel modules exist?

4.2.1 The Pain Points of the Traditional Approach

Suppose you need to write a driver for a newly purchased network card.

  • The old way: Throw the driver code into the drivers/ directory of the kernel source tree, modify Kconfig, configure it as Y (built-in), and then... recompile the entire kernel and reboot the system.
  • The cost: Even if you only changed one line of code, this entire process could take half an hour. Development efficiency is extremely low.

4.2.2 The Elegance of LKM

The LKM framework provides a dynamic extension mechanism. You can compile your driver code into an independent .ko (Kernel Object) file.

  • Insertion: Use a tool to load the .ko into kernel memory. The code instantly becomes part of the kernel with full kernel privileges.
  • Removal: If you don't want it anymore, or need to update to a new version, simply remove it from memory without even rebooting the system.

This "plug-and-play" capability makes Linux incredibly flexible.

  • Distribution-friendly: Ubuntu doesn't know exactly which network card your computer uses, so it compiles thousands of drivers as modules and automatically loads the correct one when you plug in the hardware, rather than bloating the kernel image to the size of a balloon.
  • Development-friendly: We can insmod -> rmmod -> edit -> rebuild -> insmod, iterating in a matter of seconds.

Analogy:

You can think of the kernel as a massive aircraft carrier.

  • Traditional built-in code is the hull itself — it's welded shut during construction, and modifying it requires a dry dock.
  • LKM modules are shipping containers. We can hoist a new container aboard or toss an old one overboard while sailing on the open sea. Once on board, it belongs to the ship's cargo (in the same address space), but it remains modular.

4.3 First Practical Lesson: Writing a Hello World Kernel Module

Enough theory — let's start building this "shipping container."

4.3.1 Preparation: Tools and Environment

Before we begin, you need to confirm two things. It's like cooking: you need a pot and rice first.

  1. Toolchain: The compiler and make. These are usually already installed on modern distributions. If not:
    sudo apt install gcc make
  2. Kernel headers: This is the most critical part. Because a kernel module interfaces intimately with the kernel, it must use the exact same data structure definitions and function prototypes as the currently running kernel, so install the headers package that matches it:
    sudo apt install linux-headers-$(uname -r)
    After installation, you'll see the /lib/modules/$(uname -r)/build symbolic link, which points to the installed headers directory (usually in /usr/src/linux-headers-...). This is the "cornerstone" for compiling our modules.

⚠️ Pitfall Warning

Never try to compile a kernel module C file directly using the gcc command! The compilation process for kernel modules is extremely complex and relies on the kernel build system. You must use a Makefile and the make command. We'll explain exactly why later.

4.3.2 First Look at Code: Breaking the main() Obsession

The hardest part is often the first line of code. Let's see what the simplest kernel module looks like. There is no main() function here.

File: helloworld_lkm.c

#include <linux/init.h>
#include <linux/module.h>

/* Module metadata: this information is visible through modinfo */
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("LKP2E book:ch4: hello, world LKM");
MODULE_LICENSE("Dual MIT/GPL");
MODULE_VERSION("0.2");

/* Initialization entry point: runs when the module is loaded */
static int __init helloworld_lkm_init(void)
{
	printk(KERN_INFO "Hello, world\n");
	return 0; /* returning 0 means success */
}

/* Cleanup exit point: runs when the module is unloaded */
static void __exit helloworld_lkm_exit(void)
{
	printk(KERN_INFO "Goodbye, world!\n");
}

/* Register our entry and exit functions */
module_init(helloworld_lkm_init);
module_exit(helloworld_lkm_exit);

Let's dissect this "organism" line by line:

  1. Header files:
    • <linux/init.h> and <linux/module.h> are the cornerstones of kernel modules. Notice there's no <stdio.h> here — that's for user space. The kernel space does not use the standard C library.
  2. Module metadata:
    • Those macros starting with MODULE_ aren't just comments. They get embedded into the compiled .ko file. Users can read this information with the modinfo ./helloworld_lkm.ko command. This is very important in formal product development for managing copyrights and versions.
  3. Entry and exit:
    • Kernel modules don't have a main(). They are event-driven.
    • module_init(helloworld_lkm_init): Tells the kernel, "When you load me, please execute this function."
    • module_exit(helloworld_lkm_exit): Tells the kernel, "When you unload me, please execute this function."
  4. The __init and __exit macros:
    • This is kernel optimization magic.
    • __init tells the linker: "Put this function into a special initialization section (.init.text)." Once initialization is complete, the kernel frees this memory, reclaiming precious RAM.
    • __exit marks cleanup code by placing it in its own section (.exit.text). For code built into the kernel (non-LKM), that section is simply discarded, because built-in code can never be unloaded and the function would never run.

4.3.3 The Philosophy of Return Values: The 0/-E Convention

Notice that the helloworld_lkm_init function returns 0.

Here is an ironclad rule of kernel programming, counterintuitive to user-space programming:

  • Success returns 0.
  • Failure returns a negative error code (such as -ENOMEM, -EINVAL).

Why? The initialization function runs on behalf of a user-space process (the one that called insmod). By POSIX convention, a failed system call returns -1 to the caller and reports the reason through errno. The kernel itself never touches errno: it returns a negative error code (like -12, which is -ENOMEM), and the system call wrapper in glibc negates it into a positive number (12), assigns it to errno, and returns -1 to user space.
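
The hand-off between the kernel's negative return value and user space's errno can be modeled in ordinary user-space C. This is only a sketch: demo_kernel_init and demo_syscall_wrapper are hypothetical stand-ins, not real kernel or glibc functions.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for a kernel-side init routine (hypothetical name).
 * Follows the 0/-E convention: 0 on success, negative errno on failure. */
static int demo_kernel_init(int simulate_oom)
{
	return simulate_oom ? -ENOMEM : 0;
}

/* Stand-in for a libc system call wrapper (hypothetical name).
 * A negative kernel result becomes errno = -result and a return of -1. */
int demo_syscall_wrapper(int simulate_oom)
{
	int ret = demo_kernel_init(simulate_oom);

	assert(ret <= 0); /* the convention never yields a positive value */
	if (ret < 0) {
		errno = -ret;
		return -1;
	}
	return 0;
}
```

A caller that sees -1 can then pass errno to strerror() or perror(), which is how a tool like insmod is able to print a readable reason for a failed load.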

🚫 Pitfall in Action

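If you write an init function and forget the return 0; at the end, the compiler will complain about the missing return value, and if the code builds anyway, the value handed back to the kernel is undefined. If you return a positive number (like 1), the kernel logs a warning that the init function "suspiciously returned" a value that does not follow the 0/-E convention and then loads the module anyway, which hides the failure you meant to report. Conclusion: On success, you must explicitly return 0; on failure, return a negative error code.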


4.4 The Build System: Magic in the Makefile

The code is written — how do we turn it into a .ko? We need a special recipe.

File: Makefile

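# Get the path of the current directory
PWD := $(shell pwd)

# Point at the kernel build directory (this is the key!)
KDIR := /lib/modules/$(shell uname -r)/build/

# Core target: tell Kbuild which module we want to build
# obj-m means "build as a module"
obj-m += helloworld_lkm.o

# Default target: build the module (recipe lines must start with a tab)
all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

# Clean target
clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean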

What exactly happens with this make -C $(KDIR) M=$(PWD) modules? This command is incredibly elegant and is the core of LKM building:

  1. make -C $(KDIR): We aren't running make in the current directory; instead, we're handing control over to the top-level Makefile of the kernel source tree (located at /lib/modules/.../build, which is that linux-headers directory). This ensures our module is built strictly according to the current kernel's configuration, compiler flags, and macro definitions.

  2. M=$(PWD): This is a parameter. We're telling the kernel's Makefile: "Hey, boss, even though you live in /usr/src/..., I want you to turn around and compile the module in my current directory."

  3. modules: This is the target of the kernel build system, telling it we want to build modules.

The obj-m variable: This is Kbuild system syntax.

  • obj-y: If you write obj-y += foo.o, it means foo.c will be compiled and linked into the static kernel image (vmlinux).
  • obj-m: If you write obj-m += foo.o, it means generating a loadable module foo.ko.

Now, run the build command:

make

If all goes well, you'll see compilation output and a file named helloworld_lkm.ko will be generated in the current directory. This is the "shipping container" we're about to inject into the kernel.


4.5 The Lifecycle: Loading, Running, and Unloading

Now, with the .ko file in hand, it's time to get to work.

4.5.1 Injecting into the Kernel: insmod

We use the insmod command to insert the module into the kernel.

$ sudo insmod ./helloworld_lkm.ko

What happens here?

  1. insmod is a user-space tool.
  2. It reads the contents of the .ko file.
  3. It initiates a system call (finit_module or the older init_module).
  4. The kernel takes over, loads the code and data segments into kernel memory, resolves symbols, and executes your helloworld_lkm_init function.

4.5.2 Invisible Output: printk and dmesg

Did the code run? Let's check the logs. Inside the kernel, we can't use printf. The kernel has no C library. It has its own printing function: printk. printk writes messages to the kernel log buffer.

To view this buffer, we use the dmesg tool:

$ sudo dmesg | tail -n 5
[ 4123.028252] Hello, world

⚠️ Permission Issues

On distributions like Ubuntu, running dmesg as a regular user might throw an error: "dmesg: read kernel buffer failed: Operation not permitted". This is because the dmesg_restrict security mechanism is enabled, preventing regular users from peeking into kernel details. Make sure to use sudo.

💡 About "Tainted Kernel"

In the logs, you might see a warning like this: kernel: loading out-of-tree module taints kernel. Don't panic! This is just a "taint" flag. Because you loaded an unsigned, out-of-tree module that isn't part of the official kernel tree, kernel developers want to express: "If the system blows up now, don't blame us, because we don't know what this module did." For learning and development, this is perfectly fine.

4.5.3 Checking Survival Status: lsmod

Your module is now alive as a kernel entity. We can use the lsmod command to view all modules residing in memory.

$ lsmod | grep helloworld
helloworld_lkm 16384 0

The output contains three columns:

  1. Module name
  2. Memory size (bytes)
  3. Reference count (0 means it's not depended on by other modules and can be safely unloaded)
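
lsmod gets these columns from /proc/modules, where each line begins with the same three fields. As a small sketch of how a tool might read them, the parser below pulls the fields out of one such line. The struct and function names are invented for this example, and it assumes the reference count field is numeric.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for the three columns lsmod prints. */
struct demo_mod_info {
	char name[64];
	unsigned long size; /* bytes */
	int refcount;       /* number of current users */
};

/* Parse one /proc/modules-style line: "<name> <size> <refcount> ...".
 * Returns 0 on success, -1 if the line does not match. */
int demo_parse_module_line(const char *line, struct demo_mod_info *out)
{
	assert(line && out);
	if (sscanf(line, "%63s %lu %d",
		   out->name, &out->size, &out->refcount) != 3)
		return -1;
	return 0;
}
```

Feeding it each line of /proc/modules (for example via fgets) and checking refcount == 0 is one way to tell whether a module is currently free to be unloaded.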

4.5.4 Exiting the Stage: rmmod

When the show is over, we unload it.

$ sudo rmmod helloworld_lkm

At this point, the kernel will call the helloworld_lkm_exit function you wrote. Let's verify this once more:

$ sudo dmesg | tail -n 5
[ 4123.028252] Hello, world
[ 40280.138269] Goodbye, world!

See that? "Goodbye, world!" has appeared. Your module has gracefully left memory.


4.6 Deep Dive: Kernel Logging and Advanced Usage of printk

You now know how to write the simplest module, but printk, this "console output," is actually a deep-sea leviathan — far more complex than it appears.

4.6.1 Log Levels: What is KERN_INFO?

Looking back at the code:

printk(KERN_INFO "Hello, world\n");

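Notice there is no comma between KERN_INFO and the message string. KERN_INFO isn't a separate argument; it relies on the compiler's concatenation of adjacent string literals. The macro expands to a two-character string, the special ASCII character SOH (Start of Heading, \001) followed by the digit "6", which is then glued onto the front of your message.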
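
To make the concatenation concrete, here is a user-space sketch that copies the shape of the kernel's definitions from include/linux/kern_levels.h under DEMO_ names and then recovers the level from a message. The helper demo_log_level is hypothetical and exists only for this illustration.

```c
#include <assert.h>

/* Demo copies of the kernel's level-prefix definitions. */
#define DEMO_KERN_SOH  "\001"            /* ASCII Start of Heading */
#define DEMO_KERN_INFO DEMO_KERN_SOH "6" /* adjacent literals are joined */

/* Hypothetical helper: return the level digit encoded at the front of msg,
 * or -1 if the message carries no level prefix. */
int demo_log_level(const char *msg)
{
	assert(msg);
	if (msg[0] == '\001' && msg[1] >= '0' && msg[1] <= '7')
		return msg[1] - '0';
	return -1;
}
```

With these definitions, DEMO_KERN_INFO "Hello, world\n" is a single string literal whose first two bytes are \001 and '6', so demo_log_level() reports level 6 for it.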

The kernel defines 8 log levels (0-7):

  • KERN_EMERG "0": System is unusable (about to crash).
  • KERN_ALERT "1": Action must be taken immediately.
  • KERN_CRIT "2": Critical conditions.
  • KERN_ERR "3": Errors.
  • KERN_WARNING "4": Warnings.
  • KERN_NOTICE "5": Normal but significant.
  • KERN_INFO "6": Informational.
  • KERN_DEBUG "7": Debug-level messages.

What is the point of this level? Console output control. The kernel has a parameter called console_loglevel (the first of the four values in /proc/sys/kernel/printk). Only messages whose level number is lower than console_loglevel are printed to the current console terminal (the screen you're looking at).

  • If the level is set to 4, then KERN_ERR (3) will print, but KERN_INFO (6) will not print to the screen — it will only sit in the buffer, waiting for you to view it with dmesg.
  • If you want all messages to flood the screen like printf, you can (temporarily) change the level to 8:
    sudo sh -c "echo '8 4 1 7' > /proc/sys/kernel/printk"
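
The filtering rule above can be sketched in user-space C. The code parses text in the format of /proc/sys/kernel/printk and applies the "numerically lower than console_loglevel" test; both function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the four values found in /proc/sys/kernel/printk-style text.
 * Order: console, default message, minimum console, default console level.
 * Returns 0 on success, -1 on malformed input. */
int demo_parse_printk(const char *text, int levels[4])
{
	assert(text && levels);
	if (sscanf(text, "%d %d %d %d",
		   &levels[0], &levels[1], &levels[2], &levels[3]) != 4)
		return -1;
	return 0;
}

/* A message reaches the console only if its level is numerically lower. */
int demo_reaches_console(int msg_level, int console_loglevel)
{
	return msg_level < console_loglevel;
}
```

With a console level of 4, demo_reaches_console(3, 4) is true for KERN_ERR and demo_reaches_console(6, 4) is false for KERN_INFO, matching the bullets above.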

4.6.2 Modern Style: The pr_info Family

Although you can keep using printk(KERN_INFO "msg"), kernel developers recommend using the more modern, convenient macro wrappers:

pr_info("Hello, world\n");
pr_err("Something went wrong: %d\n", err);

These macros are essentially still printk, but they automatically handle the log level strings and are incredibly powerful when used with the pr_fmt macro.

Standardizing your output: You can define pr_fmt at the very beginning of a file, so that all pr_xxx calls in that file will automatically carry a prefix (like the module name).

#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

#include <linux/module.h>

// ...
pr_info("Data loaded\n"); // Actual output: "helloworld_lkm: Data loaded"
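
The prefixing trick is plain preprocessor string concatenation, so it can be tried in user space. In the sketch below, DEMO_MODNAME stands in for KBUILD_MODNAME (which Kbuild defines for real modules), and demo_pr_info is a hypothetical look-alike that formats into a buffer instead of the kernel log. It uses the GNU ##__VA_ARGS__ extension, which GCC and Clang accept.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for KBUILD_MODNAME. */
#define DEMO_MODNAME "helloworld_lkm"

/* Same shape as the kernel idiom: prepend a prefix to every format string. */
#define demo_pr_fmt(fmt) DEMO_MODNAME ": " fmt

/* Hypothetical pr_info look-alike that formats into a caller's buffer. */
#define demo_pr_info(buf, size, fmt, ...) \
	snprintf((buf), (size), demo_pr_fmt(fmt), ##__VA_ARGS__)

/* Example caller: formats an error report with the automatic prefix. */
int demo_report_error(char *buf, size_t size, int err)
{
	assert(buf && size > 0);
	return demo_pr_info(buf, size, "Something went wrong: %d\n", err);
}
```

Because the prefix is applied inside the macro, changing DEMO_MODNAME in one place changes the prefix of every message in the file, which is the property that makes pr_fmt convenient.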

4.6.3 Dynamic Debugging: Dyndbg

If you think #ifdef DEBUG is too old-school, the kernel has a magical tool called Dynamic Debug. When you enable CONFIG_DYNAMIC_DEBUG in your kernel configuration, you can use debugfs at runtime to toggle any pr_debug() call statement on or off, even matching by filename or function name.

For example, to print all debug messages in the usb core code:

echo 'file *usb* +p' > /sys/kernel/debug/dynamic_debug/control

This is leagues better than recompiling the module.

4.6.4 Rate Limiting

If you put a printk inside a high-frequency execution path (like an interrupt handler triggered 1,000 times per second), your console will be flooded, your logs will be washed away, and it might even cause system stuttering.

The kernel provides rate-limited versions of the printing macros:

pr_info_ratelimited("High frequency event: %d\n", val);

This automatically limits the printing frequency (by default, a burst of up to 10 messages per 5-second interval) and later notes in the logs that "N callbacks suppressed," letting you know how many messages were dropped without being printed.
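
The idea behind interval-and-burst limiting can be modeled in a few lines of user-space C. This is a simplified sketch, not the kernel's implementation: the struct, the function name, and the use of plain seconds for time are all assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical user-space model of an interval/burst rate limiter. */
struct demo_ratelimit {
	long interval; /* window length, in seconds here */
	int burst;     /* messages allowed per window */
	long begin;    /* start time of the current window */
	int printed;   /* messages let through in this window */
	int missed;    /* messages suppressed in this window */
};

/* Returns 1 if a message at time `now` may be printed, 0 if suppressed.
 * When a new window opens, *reported receives the previous window's
 * suppressed count (the "N callbacks suppressed" figure). */
int demo_ratelimit_allow(struct demo_ratelimit *rs, long now, int *reported)
{
	assert(rs && reported);
	*reported = 0;
	if (now - rs->begin >= rs->interval) { /* window expired: start over */
		*reported = rs->missed;
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	rs->missed++;
	return 0;
}
```

Initialized as { 5, 10, 0, 0, 0 }, the model lets the first 10 calls in a window through, counts the rest as missed, and reports that count when the next window opens.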


Chapter Echoes

Remember the question we posed at the beginning of this chapter: How do we dynamically grant the kernel new capabilities while keeping it stable?

Now you have the answer. The LKM framework isn't just a tool for "loading drivers" — it's the physical embodiment of Linux's modular philosophy. Through insmod, we inject code into the kernel's high-privilege address space; through module_init, we insert initialization logic between the kernel's breaths; through printk, we establish a communication pipeline from the depths of the kernel to the human eye.

On the surface, we spent this chapter writing a Hello, world, but in reality, we were learning how to walk safely in the most heavily restricted forbidden zone. You learned why we can't use printf, why return values are negative, and why the Makefile has to borrow the kernel's build system.

In the next chapter, we'll dive deeper into this territory to handle more complex scenarios — inter-module dependencies, passing parameters, and how to pass real data between the kernel and user space, rather than just printing a string of characters.

When that time comes, you'll find that all the concepts we built today will come back into play in unexpected ways.


Exercises

Exercise 1: Understanding

Question: True or False: A user-space application (like a web browser) can call kernel functions by directly accessing kernel-space memory addresses, as long as it knows the correct memory address.

Answer and Explanation

Answer: False

Explanation: User space and kernel space are isolated virtual address spaces. Programs running in user mode are unprivileged and cannot directly access kernel space. All requests for kernel functionality must be completed through the legitimate entry point of system calls, which handle the switch from user mode to kernel mode. Directly accessing kernel memory will result in an invalid memory access exception.

Exercise 2: Application

Question: When writing the initialization function static int __init my_init(void) for a kernel module, if resource allocation fails, what value should the function return according to the kernel programming 0/-E return convention?

Answer and Explanation

Answer: A negative error code (such as -ENOMEM or -EINVAL)

Explanation: In kernel programming conventions, a function must return 0 on success. If an error occurs (such as out of memory, inability to acquire a lock, etc.), it must return a negative error code that names the cause (for example, -ENOMEM indicates out of memory, -EINVAL indicates an invalid parameter). Avoid a bare -1: it happens to equal -EPERM, so user space would report a misleading "Operation not permitted." Returning a positive value violates the convention: the kernel logs a warning and loads the module anyway, so the failure goes unreported.

Exercise 3: Application

Question: Scenario application: You are writing kernel module code for a high-frequency interrupt handler. This code path might be called thousands of times per second and contains a printk() log output statement. To prevent this log from causing system log overflow or disk I/O overload, which macro should you use to replace the standard printk()?

Answer and Explanation

Answer: pr_info_ratelimited() or printk_ratelimited()

Explanation: A standard printk will create a "log storm" under high-frequency calls. The kernel provides a rate-limiting mechanism through the pr_<foo>_ratelimited() family of macros (like pr_info_ratelimited), which lets only a bounded burst of messages from that call site through within each time window and suppresses the rest, thereby protecting system stability.

Exercise 4: Thinking

Question: Thought experiment: Why can't kernel modules (LKMs) simply link and use the printf() function from the standard C library (glibc) like user-space programs do, and why must they use the kernel-specific printk()? What essential difference regarding "context" and "dependency" in operating system design does this reflect?

Answer and Explanation

Answer: Primarily due to differences in the runtime environment (context) and dependent libraries. The kernel runs in an independent address space and lacks the support of user-space libraries.

Explanation: The core of this question lies in understanding the fundamental differences between the kernel and user space:

  1. Context environment: Kernel modules run in kernel space with the highest privilege level (Ring 0), directly managing hardware, without any runtime environment support.
  2. Library dependencies: glibc is a library that runs in user space and relies on system calls provided by the kernel to work (in fact, glibc's printf ultimately calls the write system call). The kernel exists at a lower level and cannot "depend" on an upper-layer library, otherwise it would create a circular dependency.
  3. Functional differences: printf needs to format buffers and handle stream I/O, whereas the kernel needs a lower-level logging tool that doesn't rely on complex buffering mechanisms and can safely run in interrupt context — namely, printk. This reflects the operating system's principle of layered design: lower-level modules cannot depend on upper-level implementations; they must be self-contained.

Key Takeaways

The Linux kernel strictly divides the virtual address space into user space and kernel space through privilege levels. The former runs applications restricted to unprivileged mode and can only request services through system calls; the latter is the highest privilege mode where the core of the operating system runs, with permissions to access all memory and hardware. Understanding this boundary is a prerequisite for kernel development, because once a kernel module's code is loaded, it will run directly in this unprotected, high-privilege environment.

To solve the inefficient "modify-compile-reboot" endless loop of traditional kernel development, Linux introduced the Loadable Kernel Module (LKM) mechanism. LKM allows driver or feature code to be compiled into independent .ko files, which can be dynamically inserted or removed via the insmod and rmmod commands, extending kernel functionality without rebooting the system. This mechanism gives the Linux monolithic kernel flexibility similar to a microkernel while maintaining high performance.

Kernel modules do not have a main function like user-space programs; instead, they are based on an event-driven model, specifying initialization and cleanup functions through the module_init and module_exit macros. The initialization function must return 0 to indicate success (returning a negative error code on failure), and using the __init macro at compile time places the code into a temporary memory segment to save RAM. This design clarifies the lifecycle management of modules and is deeply tied to the kernel's build process.

Kernel modules cannot be built by invoking gcc directly; the build must go through the kernel's Kbuild system. In the Makefile, you define the obj-m variable to specify the target module and use the make -C $(KDIR) M=$(PWD) modules command to borrow the configuration and headers of the currently running kernel. This keeps the module matched to that kernel's version and configuration, avoiding load failures or crashes caused by interface mismatches.

Kernel space cannot use standard C library functions, so printf is replaced by printk, which outputs logs to the kernel buffer instead of the terminal. The dmesg tool can be used to read these logs, and log levels (like KERN_INFO) determine the display strategy of messages on the console. Proficiency in using printk (or the modern pr_info) and its log levels is a key method for developers to obtain debugging information from inside the kernel.