
Chapter 5: Writing Your First Kernel Module — Part 2

This might be the first time you realize that "modularity" in real-world engineering doesn't just mean throwing things into separate folders. In the previous chapter, we got a minimal kernel module up and running—like learning to light a fire with a lighter. But if you're actually going to survive winter on that fire, you need a system that doesn't just start fires, but controls them, prevents backfires, and automatically adds fuel when it gets cold.

This brings us to the core mission of this chapter: how to treat kernel modules as serious engineering projects.

This isn't just about code style. When you try to port a module from x86_64 to an ARM board, when you switch between a debug kernel and a release kernel, or when you try to make multiple modules work together like building blocks, you'll find yourself dealing with an extremely strict set of rules—rules that don't care about your intuition, only whether your configuration, version, and Application Binary Interface (ABI) match perfectly.

Why the Old Approach Doesn't Work

The Makefile from the previous chapter was a "just make it run" minimal version. It compiles, it installs, but that's about it. In a real project, this rudimentary build system is like writing code in Notepad—it might work, but it will make your life miserable in actual engineering:

  • Lack of automated checks: You might bury a buffer overflow vulnerability in your code, or use an API deprecated by the kernel community, but the build system won't say a word.
  • Terrible debugging experience: If you want to enable some debug macros, you have to manually edit the code instead of simply passing a flag.
  • No security awareness: No code style checks, no static analysis, and the generated module might go live with a bunch of security holes.

This Chapter's Mission

We need to upgrade that "toy" module development environment into a professional development workflow.

This isn't just about writing a few more lines in a Makefile. We'll go through the entire process from configuring the environment to writing code, then on to security hardening and deployment. Along the way, you'll encounter some counterintuitive phenomena: why does a module that compiles perfectly flatly refuse to load on the board? Why can't you just use floating-point numbers in the kernel? And why does even sprintf() get frowned upon?

This is the necessary path from "it works" to "it works well," and finally to "it works well and is secure."


5.1 A "Better" Kernel Module Makefile Template

Let's start with a very practical question: how do we make the build process smarter and safer?

In the previous chapter, we used a very basic Makefile. It got the job done, but as I just mentioned, it wasn't smart enough. Now I'll show you a "better" Makefile template. This template isn't just for compiling; it's designed to help you fill in the pitfalls during the build phase itself.

The goal of this template is simple: force you to focus on code quality. It integrates static analysis, code style checks, automated cleanup, and packaging. You might think these features are something to "do later," but experience tells us that "later" usually means "never."

What This Makefile Can Do

You can think of this template as an automated code reviewer. When you're about to commit code, it helps you catch those obvious mistakes. Specifically, it includes the following categories of targets:

  1. Standard build targets: all (build the module), install, and clean. There's a clever trick here: the Makefile decides whether to strip debugging symbols from the module based on whether "debug mode" is enabled.
  2. Code style targets: indent (automatically format code) and checkpatch (run the kernel's official code style check script, scripts/checkpatch.pl).
  3. Static analysis targets: Integrates static checks from sparse, gcc, and tools like flawfinder. It also includes support for Coccinelle.
  4. Dynamic analysis placeholders: It defines some "fake" targets (da_kasan, da_lockdep, etc.) to remind you: if you really want to catch memory leaks or deadlocks, you need to configure and run a dedicated "debug kernel." We'll cover how to configure this kernel very soon.
  5. Packaging target: tarxz-pkg. This target packages your source code into a .tar.xz file. This is great for porting—you can transfer the archive to another machine, extract it, and compile directly.
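The strip decision mentioned in point 1 boils down to something like this (an illustrative sketch of the template's logic, not its exact contents; MYDEBUG, DBG_STRIP, and FNAME_C are variables this template uses, and recipe lines must be tab-indented):

```makefile
# Sketch: after building via kbuild, strip debug symbols from the module
# only for a non-debug build (MYDEBUG=n) with stripping enabled (DBG_STRIP=y)
all:
        make -C ${KDIR} M=${PWD} modules
        if [ "${MYDEBUG}" != "y" -a "${DBG_STRIP}" = "y" ]; then \
                ${CROSS_COMPILE}strip --strip-debug ./${FNAME_C}.ko ; \
        fi
```

You'll see exactly this shell conditional echoed in the build output later in the chapter.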

Seeing Its True Colors

This Makefile is located in the ch5/lkm_template directory. Let's see what it looks like in action. In your terminal, try pressing Tab twice:

lkm_template $ make <tab><tab>
all clean help install
checkpatch code-style indent nsdeps sa_cppcheck sa_gc
sa_sparse sa_flawfinder

Figure 5.1: Screenshot of the help target output from our "better" Makefile

This isn't just listing commands. Notice the highlighted FYI: line in Figure 5.1. It reveals the Makefile's current understanding of several key variables we've set:

FYI: KDIR=/lib/modules/6.1.25-lkp-kernel/build ARCH= CROSS_COMPILE=

Here are a few key points:

  1. MYDEBUG variable: In this Makefile, I use MYDEBUG to control whether to perform a "debug build." By default, it's set to n (off). If set to y, the build process retains debugging symbols and defines the DEBUG macro.
  2. DBG_STRIP variable: This variable controls whether to strip symbols; the default is y. Our Makefile is a bit smart: it only strips symbols when in non-debug mode (MYDEBUG=n) AND the kernel doesn't have module signing enabled. Why? Because a module's signature is appended to the .ko file, and stripping the file destroys that signature, causing the kernel to reject the module (we'll cover module signing later).
  3. KDIR variable: This is the path to the kernel source tree (specifically, the kernel headers). The default value is the build directory corresponding to our currently running kernel version.
  4. ARCH and CROSS_COMPILE: These variables default to empty because we haven't started cross-compilation yet.
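The way these defaults are derived can be sketched in a couple of lines of shell (illustrative; it mirrors the FYI: line shown above):

```shell
# Sketch: KDIR falls back to the build directory of the currently running
# kernel unless explicitly overridden; ARCH and CROSS_COMPILE default to empty
KDIR=${KDIR:-/lib/modules/$(uname -r)/build}
echo "FYI: KDIR=${KDIR} ARCH=${ARCH:-} CROSS_COMPILE=${CROSS_COMPILE:-}"
```

Overriding any of them is then just a matter of passing, say, KDIR=... or MYDEBUG=y on the make command line.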

Give It a Try

Talk is cheap. Let's use this "better" Makefile to build our template module, plug it into the kernel, and pull it out again to see if the printk output looks normal (see Figure 5.2).

Figure 5.2: Building the lkm_template module with the "better" Makefile and trying it out (on our x86_64 Ubuntu VM)

⚠️ Warning To get the most out of this Makefile, you need to have a few essential packages installed on your system:

  • indent(1) (code formatting)
  • linux-headers-$(uname -r) (kernel headers)
  • sparse(1) (static analysis tool)
  • flawfinder(1) (security scanning)
  • cppcheck(1) (C/C++ static analysis)
  • tar(1) (packaging tool)

These tools are usually mentioned in the "Setting Up the Kernel Workspace" chapter. If you're not sure, running the ch1/pkg_install4ubuntu_lkp.sh script (on Ubuntu) is the easy way to get them all installed.

One more detail: the "dynamic analysis" targets (da_*) mentioned in the Makefile are actually empty shells. Running them will only print a reminder message. Their purpose is like a sticky note on your monitor: constantly reminding you that you can't catch deep bugs with a normal kernel—you need to configure a dedicated "debug kernel."

Speaking of debug kernels, the next section covers how to configure one.


5.2 Configuring a "Debug" Kernel

During development, your life will be much happier if you can run your code on a kernel with various debug options enabled. Those random crashes that make you tear your hair out on a normal kernel will often show their true colors in front of a debug kernel—they might even stop immediately and sound the alarm the moment they happen.

I strongly recommend preparing two kernel environments during development:

  1. Production kernel: A carefully configured, fully optimized kernel for final release and daily use.
  2. Debug kernel: A kernel with a large number of kernel debug options intentionally enabled (and probably not optimized much), dedicated to catching bugs.

For specific details on configuring and compiling kernels, you can look back at Chapters 2 and 3. Here I assume you're already proficient with the basics of make menuconfig. Now, we need to enable some key debug configurations for our custom 6.1 kernel.

Configuration List (set all options below to y). Most options live under the Kernel hacking submenu.

  • General Debugging
    • CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO: The foundation, must have.
    • CONFIG_DEBUG_MISC: Miscellaneous debug support.
    • CONFIG_MAGIC_SYSRQ: Allows you to execute emergency commands via key combinations (like Alt+SysRq+...).
    • CONFIG_DEBUG_FS: Enables the debugfs pseudo-filesystem, where a lot of debug information lives.
    • CONFIG_KGDB: Kernel GDB support (optional, but recommended).
    • CONFIG_UBSAN: Undefined behavior checker.
    • CONFIG_KCSAN: Dynamic data race detector.
  • Memory Debugging
    • CONFIG_SLUB_DEBUG: Enable SLUB Allocator debugging features.
    • CONFIG_DEBUG_MEMORY_INIT: Memory initialization debugging.
    • CONFIG_KASAN: A magic tool. Kernel Address Sanitizer, specializes in curing various memory corruptions (out-of-bounds, use-after-free, etc.).
    • CONFIG_DEBUG_SHIRQ: Shared interrupt debugging.
    • CONFIG_SCHED_STACK_END_CHECK: Check for stack overflows.
    • CONFIG_DEBUG_PREEMPT: Preemption debugging.
  • Lock Debugging
    • CONFIG_PROVE_LOCKING: A magic tool. Lock dependency checker, helps you find potential deadlocks. Enabling it automatically turns on a series of other lock debugging options.
    • CONFIG_LOCK_STAT: Lock usage statistics.
    • CONFIG_DEBUG_ATOMIC_SLEEP: Check for the error of sleeping in atomic context.
  • Other
    • CONFIG_BUG_ON_DATA_CORRUPTION: Trigger a BUG on data corruption.
    • CONFIG_STACKTRACE: Stack trace support.
    • CONFIG_DEBUG_BUGVERBOSE: Verbose BUG reporting.
    • CONFIG_FTRACE: Kernel tracing framework. Enable at least a few tracers, like the "kernel function tracer."

Architecture-Specific Options (x86)

  • CONFIG_EARLY_PRINTK: Early console output.
  • CONFIG_DEBUG_BOOT_PARAMS: Boot parameter debugging.
  • CONFIG_UNWINDER_FRAME_POINTER: Select frame pointer unwinder and enable stack validation.
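Once set, these choices translate into .config lines like the following (an illustrative excerpt, not the complete set; you can flip them interactively via make menuconfig, or script it with the kernel's scripts/config helper):

```
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_INFO=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_FS=y
CONFIG_KASAN=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_FTRACE=y
```

After editing, remember to run make olddefconfig (or another config target) so that any dependencies of the newly enabled options get resolved.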

A few things to note here:

  • Don't be intimidated: If you don't understand what these options do right now, don't worry. By the time you finish this book, most of them will become clear.
  • The Ftrace pitfall: Although Ftrace itself might be enabled by default, its various "plugins" aren't necessarily all turned on. For things like CONFIG_IRQSOFF_TRACER, we need to manually enable them because we'll use them in later chapters.
  • Performance overhead: Enabling these options will definitely reduce system performance. But that's fine—our goal right now is to catch bugs, especially in the face of those hard-to-reproduce bugs, performance is an acceptable sacrifice.

Your development workflow should look like this: run the code on the debug kernel first to ensure there are no obvious memory errors or deadlocks; then verify performance and functionality on the production kernel.

Alright, with this equipment, we can finally tackle a real-world scenario: cross-compiling a kernel module for another device (usually an ARM board).


5.3 Cross-Compiling Kernel Modules

In Chapter 3, we demonstrated how to cross-compile the entire Linux kernel for a Raspberry Pi. Now, let's focus on a more specific task: cross-compiling a single kernel module.

This seems simple: isn't it just a matter of changing the compiler path? In reality, though, getting this process working will take us four attempts, the first three of which fail. Don't worry: each failure teaches us about a pitfall that must be understood.

To prepare you mentally, I've prepared a table at the end of this section summarizing the problems encountered and solutions found in these four attempts.

But before that, we need to lay the groundwork.

Preparation: Setting Up the Cross-Compilation Environment

To cross-compile a module, two things are essential:

  1. The target device's kernel source tree: You must have a complete kernel source tree for the target device (e.g., Raspberry Pi) on the host machine. Note: a complete source tree, not just headers, because we also need the Module.symvers file (we'll talk about what this weird thing is later).
  2. Cross-toolchain: You need a compiler that can generate ARM64 code from an x86_64 host. If you haven't installed it yet, you can run:
    sudo apt install gcc-aarch64-linux-gnu binutils-aarch64-linux-gnu

I assume you've followed the Chapter 3 guide, placed the Raspberry Pi 6.1.34 kernel source in the ~/rpi_work/kernel_rpi/linux directory, and installed the toolchain. The toolchain prefix is aarch64-linux-gnu-. We can do a quick verification:

$ aarch64-linux-gnu-gcc
aarch64-linux-gnu-gcc: fatal error: no input files
compilation terminated.

Great, the "no input files" error means the compiler is running. As a modern alternative, the Clang toolchain is also popular (Android uses it) and is even stronger than GCC in some aspects. However, for consistency in our demonstration, we'll stick with the classic GCC.

Alright, the environment is ready. Let's begin the first attempt.


Attempt 1: Setting ARCH and CROSS_COMPILE Environment Variables

In theory, this should be ridiculously simple. You just need to specify the architecture and cross-compiler prefix in the make command.

To avoid polluting the original code, let's create a new directory cross and copy the code over (the book's source repository actually has this ready, under ch5/cross):

cd <book-dir>/ch5
mkdir cross
cd cross
cp ../lkm_template/lkm_template.c ../lkm_template/Makefile .

Then try to compile:

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-

(The book's code repository provides a small script ch5/cross/buildit that does some pre-checks and then runs this command.)

But if you actually run this, it will most likely crash immediately:

--- Building : KDIR=~/arm64_prj/kernel/linux ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
[...]
make[1]: *** /home/c2kp/arm64_prj/kernel/linux: No such file or directory
make: *** [Makefile:93: all] Error 2

Why did it fail?

The clue is right in the error message: it's trying to find the kernel at the path /home/c2kp/arm64_prj/kernel/linux, which is obviously our host machine's path (and probably an invalid one at that). It has no idea we're compiling for the Raspberry Pi.

Solution: We need to tell the Makefile, stop guessing, go look for the target kernel source here.

This requires modifying the KDIR variable in the Makefile. Let's see how this corrected Makefile is written:

# ch5/cross/Makefile:
# To support cross-compilation, invoke as: make ARCH=<arch> CROSS_COMPILE=<toolchain-prefix>
[ ... ]
else ifeq ($(ARCH),arm64)
# *UPDATE* KDIR below to point to your ARM64 Linux kernel source tree
#KDIR ?= ~/arm64_prj/kernel/linux
KDIR ?= ~/rpi_work/kernel_rpi/linux
else ifeq ($(ARCH),powerpc)
[ ... ]
else
[ … ]
endif
[ ... ]
# IMPORTANT: set FNAME_C to the name of the kernel module source file
FNAME_C := lkm_template
PWD := $(shell pwd)
obj-m += ${FNAME_C}.o
[ ... ]
all:
@echo
@echo '--- Building : KDIR=${KDIR} ARCH=${ARCH} CROSS_COMPILE=${CROSS_COMPILE}'
@echo
make -C $(KDIR) M=$(PWD) modules
[...]

This "better" Makefile is now smart:

  • It automatically points KDIR to the correct kernel source directory based on the value of the ARCH environment variable.
  • It defines obj-m, specifying the module object file to be generated.
  • It adds the DEBUG macro definition via ccflags-y (don't use the deprecated EXTRA_CFLAGS), so macros like pr_debug() can work (though it's off by default, of course).
  • The @echo lines are there to print some useful information during compilation, making it easier for you to troubleshoot.
  • Finally, in targets like all, install, and clean, we use the standard make -C $(KDIR) M=$(PWD) modules syntax to ensure the build system switches to the correct kernel directory to do its work.
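The debug-flag wiring described above amounts to just a few lines (an illustrative sketch; the exact flags in the chapter's Makefile may differ):

```makefile
# Sketch: feed extra compiler flags to kbuild via ccflags-y, the modern
# replacement for the deprecated EXTRA_CFLAGS
ifeq (${MYDEBUG}, y)
  ccflags-y += -DDEBUG -g -ggdb
else
  ccflags-y += -UDEBUG
endif
```

With -DDEBUG defined, the kernel's pr_debug()/dev_dbg() style macros become active for this module.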

Now, let's try again with this corrected Makefile.


Attempt 2: Fixing the Makefile to Point to the Correct Source Tree

Now the Makefile knows where to find the Raspberry Pi's kernel source. Let's try building again:

$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
[ ... ]
CC [M] /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/lkm_template.o
MODPOST /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/Module.symvers
ERROR: modpost: "_printk" [/home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/lkm_template.ko] undefined!
make[2]: *** [scripts/Makefile.modpost:126: /home/c2kp/Linux-Kernel-Programming_2E/ch5/lkm_template.ko] Error 1
[ ... ]

Oops, failed again. This time it's an error at the modpost stage.

Reason: modpost is a stage in the build system responsible for checking the symbols exported by the module. This information is stored in the Module.symvers file at the root of the kernel source tree. If this file isn't there, or if the kernel tree hasn't been fully compiled yet, modpost won't be able to find kernel-exported functions like _printk.

Solution: Simple and brute-force—clean the target kernel tree and recompile it to ensure Module.symvers is generated:

# In the Raspberry Pi kernel source directory
make mrproper
# Reconfigure and rebuild
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j$(nproc)

Alright, now Module.symvers should be sitting there obediently. Try compiling the module again:

rpi $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
--- Building : KDIR=~/rpi_work/kernel_rpi/linux ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
aarch64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
make -C ~/rpi_work/kernel_rpi/linux M=/home/c2kp/Linux-Kernel-Programming_2E/ch5/cross modules
make[1]: Entering directory '/home/c2kp/rpi_work/kernel_rpi/linux'
CC [M] /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/lkm_template.o
MODPOST /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/Module.symvers
CC [M] /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/lkm_template.mod.o
LD [M] /home/c2kp/Linux-Kernel-Programming_2E/ch5/cross/lkm_template.ko
make[1]: Leaving directory '/home/c2kp/rpi_work/kernel_rpi/linux'
if [ "n" != "y" ]; then \
sudo aarch64-linux-gnu-strip --strip-debug lkm_template.ko ; \
fi

Excellent! Compilation succeeded. Let's look at the fruits of our labor:

$ ls -l ./lkm_template.ko
-rw-rw-r-- 1 c2kp c2kp [...] ./lkm_template.ko
$ file ./lkm_template.ko
./lkm_template.ko: ELF 64-bit LSB relocatable, ARM aarch64, version 1 (SYSV), not stripped

Note here: the file type shows as ARM aarch64, meaning we successfully generated an ARM64 architecture module!

Of course, in reality, you might encounter another pitfall: the target kernel source tree might be in a "virgin" state (doesn't even have a .config), or it was never configured. In that case, just bite the bullet and configure and compile the kernel once first, then come back to compile the module.


Attempt 3: Loading the Cross-Compiled Module on the Device

Now we have a module cross-compiled using a properly configured Raspberry Pi kernel source tree (and the Module.symvers is there too). Theoretically, it should run on the board.

The proof is in the pudding. Let's scp the module to the Raspberry Pi and try to load it (the following output comes directly from the device):

rpi $ sudo insmod ./lkm_template.ko
insmod: ERROR: could not insert module ./lkm_template.ko: Invalid module format

Failed again?

Yes, and the error message is very vague. This usually means the kernel refused to load the module. Let's use dmesg to see what actually happened:

rpi $ dmesg
[...]
[ 123.456789] lkm_template: version magic '6.1.34-v8+' should be '6.1.21-v8+'

Aha, found the culprit!

Reason: Version mismatch. The Raspberry Pi kernel we're currently running is version 6.1.21-v8+ (this is the kernel that comes by default with Raspberry Pi OS). But our module was compiled for 6.1.34-v8+ (the version we compiled in our source tree).

The kernel has an ironclad rule: a module can only be inserted into the exact kernel it was compiled for. The exact version number, compilation parameters, and even configuration options must all match.
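You can check for this mismatch up front, before ever running insmod (a hedged sketch; modinfo ships with the kmod package, and the .ko path is illustrative):

```shell
# On the target board: what kernel are we actually running?
kver=$(uname -r)
echo "running kernel: ${kver}"
# Compare it against the module's record of the kernel it was built for:
#   modinfo -F vermagic ./lkm_template.ko
# The vermagic string must begin with "${kver}", or insmod will refuse to
# load the module ("Invalid module format").
```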

This leads to the next topic we need to dive into: Linux kernel ABI compatibility issues.


Checking Linux Kernel ABI Compatibility Issues

The Linux kernel has a hard rule regarding the Application Binary Interface (ABI):

The kernel will only insert a module into memory if that module was built precisely for it.

"Built precisely for it" means:

  • Exact kernel version
  • Same compiler version and flags
  • Same kernel configuration options

This doesn't mean kernel modules have zero portability. They are source-level portable. As long as you have the source code, you can recompile and run it on any architecture. However, the binary file (that .ko file) is not portable. It can only run on the specific kernel it was compiled for.

Figure 5.1 demonstrates the error log when versions don't match.

Figure 5.1: Error message in the kernel log about version magic mismatch

Although we don't plan to use it right now, there is a framework called DKMS (Dynamic Kernel Module Support) specifically designed to solve the automatic recompilation problem for third-party modules. Its core idea is simple: automatically recompile the module when a new kernel is installed.

VirtualBox drivers are a typical example of using DKMS. Every time your host kernel upgrades, DKMS automatically helps you recompile modules like vboxdrv to ensure they can run on the new kernel.
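To give a feel for it, a DKMS-managed module is described by a small dkms.conf file; a minimal sketch for our template module (package name and version are hypothetical) could look like this:

```
PACKAGE_NAME="lkm_template"
PACKAGE_VERSION="1.0"
BUILT_MODULE_NAME[0]="lkm_template"
DEST_MODULE_LOCATION[0]="/updates"
AUTOINSTALL="yes"
```

With the source placed under /usr/src/lkm_template-1.0/, the usual dkms add / dkms build / dkms install cycle registers the module, and AUTOINSTALL="yes" tells DKMS to rebuild it automatically whenever a new kernel is installed.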


Attempt 4: Resolving the ABI Issue and Loading Successfully

Now we understand the problem: the module and the kernel running it must be "a perfect match." There are a few solutions:

  1. The hardcore approach (common in embedded development): Configure, cross-compile, and boot your own custom kernel, then compile all modules against this specific kernel source tree. This is the standard practice for embedded products.
  2. The DKMS approach (common on desktops/servers): Use the DKMS framework to let the system automatically recompile modules when the kernel updates.
  3. The compromise approach (for experimentation): Recompile your module to match the kernel currently running on the board.

Since we're doing embedded development, we'll adopt the first approach: make the board boot the custom 6.1 kernel we compiled in Chapter 3.

For how to copy the kernel image, device tree, and modules to the SD card and boot, the official documentation explains it clearly (https://www.raspberrypi.org/documentation/linux/kernel/building.md). Here I'll just mention a handy trick: how to easily switch between two kernels.

Assuming your device is a Raspberry Pi 4B running a 64-bit kernel:

  1. Copy your custom-compiled Image kernel binary to the /boot partition of the SD card, naming it kernel8.img. (For safety, rename the original to kernel8.img.orig as a backup first).
  2. Use scp to copy the freshly cross-compiled lkm_template.ko (ARM64 version) to the /home/pi directory on the SD card.
  3. (Optional) If you want to specify booting a particular kernel, you can edit the /boot/config.txt file on the SD card and use kernel= to specify the filename. However, by default, the bootloader will automatically load kernel8.img.
  4. Save and reboot.

After logging into the device, try loading the module again. Figure 5.2 shows a screenshot of the successful run.

Figure 5.2: The cross-compiled LKM running successfully on a Raspberry Pi 4B. Notice how the kernel version, hardware, and kernel configuration match perfectly.

Look! It worked this time! Notice the modinfo output: the vermagic field shows the module was compiled for 6.1.34-v8+, and our currently running kernel is also 6.1.34-v8+. A perfect match.

⚠️ Warning: If you encounter non-fatal errors during rmmod, or if the module state is abnormal after unloading, it might be because you haven't fully deployed the newly compiled kernel modules. You need to copy all kernel modules (under /lib/modules/<kernel-ver>/) over, and run depmod on the device.


Summary: Pitfalls of Cross-Compilation

For your future reference, I've summarized the problems encountered and solutions found in these attempts in the table below:

Attempt | Reason for Failure | Solution
1 | The Makefile tried to build against the host machine's (x86_64) kernel source tree instead of the target's (ARM64). | Modify the Makefile so that, based on the ARCH variable, KDIR points to the correct target kernel source directory: KDIR ?= ~/rpi_work/kernel_rpi/linux.
2 | Compilation failed at the modpost stage: the target kernel tree wasn't configured/compiled, so the Module.symvers file was missing. | Ensure the Module.symvers file exists; if it's missing, configure and build the target kernel first (make modules will generate it).
3 | Module loading failed with Invalid module format. | Version mismatch: the kernel version the module was built for (e.g., 6.1.34) doesn't match the kernel running on the board (e.g., 6.1.21).
4 | Same as above. | Boot the board with your custom-built kernel (the one matching the module), ensuring the module was compiled against that same kernel tree.

Table 5.1: Summary of module cross-compilation and execution attempts from an x86_64 host to an AArch64 (Raspberry Pi 4) target

The LKM framework is incredibly rich. Next, we'll explore how to retrieve some minimal system information from within a kernel module.


5.4 Retrieving Minimal System Information

Sometimes, when you write a module that needs to be ported across architectures, you need to conditionally execute some code based on the currently running CPU family. The kernel provides some macros and methods to help you "probe" these low-level details. Let's build a simple demo module (ch5/min_sysinfo/min_sysinfo.c) that shows how to detect the CPU architecture, bit width, and endianness.

To avoid being too verbose, I'll only show the most core function here:

// ch5/min_sysinfo/min_sysinfo.c
[ ... ]
void llkd_sysinfo(void)
{
char msg[128];
memset(msg, 0, 128);
my_snprintf_lkp(msg, 47, "%s(): minimal Platform Info:\nCPU: ", __func__);
/* Strictly speaking, the #if...#endif blocks below are a bit ugly; they're done this way for ease of demonstration */
/* In real-world code, such logic should be isolated as far as possible */

#ifdef CONFIG_X86
#if(BITS_PER_LONG == 32)
strncat(msg, "x86-32, ", 9);
#else
strncat(msg, "x86_64, ", 9);
#endif
#endif

#ifdef CONFIG_ARM
strncat(msg, "AArch32 (ARM-32), ", 19);
#endif

#ifdef CONFIG_ARM64
strncat(msg, "AArch64 (ARM-64), ", 19);
#endif
// ... (other architecture checks omitted)

#ifdef __BIG_ENDIAN
strncat(msg, "big-endian; ", 13);
#else
strncat(msg, "little-endian; ", 16);
#endif

#if(BITS_PER_LONG == 32)
strncat(msg, "32-bit OS.\n", 12);
#elif(BITS_PER_LONG == 64)
strncat(msg, "64-bit OS.\n", 12);
#endif

pr_info("%s", msg);
show_sizeof();
// ... (printing the ranges of various data types, omitted)
}
EXPORT_SYMBOL(llkd_sysinfo);

This module demonstrates how to write portable code. Remember, a kernel module's binary file is not portable, but its source code can be (and should be). As long as you recompile it on the target architecture, it's ready to deploy.

Figure 5.3: Kernel output from our simple and fun min_sysinfo module (on an x86_64 VM)

The highlighted part in Figure 5.3 is the output of the llkd_sysinfo() function. It's followed by the output of llkd_sysinfo2() (which is a safer version, covered in the next section). It also uses sizeof() to print the byte sizes of various data types, and finally shows the word size ranges for that architecture (including signed and unsigned 8/16/32/64-bit integers).

Similarly, we can cross-compile this module as an AArch64 version, transfer it to the Raspberry Pi, and run it:

Figure 5.4: Output when running on a Raspberry Pi 4B with our custom 64-bit 6.1.34-v8+ kernel

Look, this time the output shows information relevant to the AArch64 platform!

⚠️ Warning: In this demo module, we temporarily ignore the usage of the EXPORT_SYMBOL() macro, which we'll cover right in the next section. Also, my_snprintf_lkp() is just a simple snprintf() wrapper I put together to be a bit safer, temporarily defined in the current file (min_sysinfo.c). When we get to "simulating library functionality," we'll move it elsewhere. The next section will dive into more security details.


5.5 Being More Security-Conscious

Nowadays, security is a big deal. Professional developers must write secure code. In recent years, exploits against the Linux kernel have been commonplace. Parallel to this, the kernel community has been continuously working to improve security.

In our min_sysinfo.c module just now, we actually used some old-school, discouraged routines (like sprintf, strlen, etc.). Static analysis tools are great helpers for catching such security-related bugs. I strongly recommend you start using them.

We can use the sa_flawfinder target integrated into our "better" Makefile to run flawfinder (written by David Wheeler):

$ make sa_flawfinder
make clean
[...]
--- static analysis with flawfinder ---
flawfinder *.[ch]
Flawfinder version 2.0.19, (C) 2001-2019 David A. Wheeler.
[...]
Examining min_sysinfo.c
FINAL RESULTS:
min_sysinfo.c:54: [2] (buffer) char:
Statically-sized arrays can be improperly restricted, leading to pot. overflows
or other issues (CWE-119!/CWE-120). Perform bounds checking or use functions
that limit length, or ensure that the size is larger than the maximum possible
length.
[...]
min_sysinfo.c:136: [1] (buffer) strncat:
Easily used incorrectly (e.g., incorrectly computing the correct max size to add)
[MS-banned] (CWE-120). Consider strcat_s, strlcat, snprintf or automatically
resizing strings. Risk is low because the source is a constant string.

Look closely at flawfinder's warning about the strncat() function. Following its advice, we used strlcat() to replace it in the llkd_sysinfo2() function (safer code). Similarly, to prevent buffer overflows (the root cause of many vulnerabilities), when using snprintf(), you must also check its return value. So I wrote a simple wrapper, my_snprintf_lkp().
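The idea behind such a wrapper is easy to sketch. The code below is hypothetical (the book's actual my_snprintf_lkp() may differ) and is shown as plain C so it compiles in userspace; inside a module you'd build the same check on top of the kernel's own vsnprintf() from <linux/kernel.h>:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical "checked" snprintf: format into buf, but fail loudly on
 * truncation instead of silently producing a cut-off string.
 * Returns the number of characters written, or -1 if it didn't fit. */
static int my_snprintf_checked(char *buf, size_t maxsize, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, maxsize, fmt, args);
	va_end(args);

	/* vsnprintf() returns the length it *wanted* to produce; a value
	 * >= maxsize means the output was silently truncated. */
	if (n < 0 || (size_t)n >= maxsize)
		return -1;
	return n;
}
```

The key point is checking the return value: any result >= the buffer size signals truncation, which the caller can then treat as an error instead of passing a mangled string along.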

Figure 5.5: The CWE numbers (like CWE-120) mentioned in the flawfinder output. Search for them on https://cwe.mitre.org/ and you'll understand what category of security issue they refer to. CWE-120 is the "classic buffer overflow," one of the most common targets for hacker attacks.

Of course, there's much more to say about Linux kernel security and hardening techniques. You can refer to my talk at the Embedded IOT Summit 2023: "Mitigating Hackers with Hardening on Linux" (https://www.youtube.com/watch?v=KQa_XEiLGMc).

Now, let's shift to a slightly dry but absolutely important topic: licensing.


5.6 Licensing Kernel Modules

As is well known, the Linux kernel itself is released under the GNU GPL v2 license. As briefly mentioned in Chapter 4, choosing the correct license for your kernel code is mandatory and important. Let's break this down into two parts:

  1. In-tree kernel code (code directly contributed to the mainline kernel)
  2. Third-party out-of-tree modules (the kind most of us write)

Licensing for In-Tree Kernel Code

If your code is directly in the kernel source tree, or if you plan to contribute it to the mainline kernel, then you must release the code under the kernel's own license—GNU GPL-2.0. This is explicitly stated in the official documentation: https://docs.kernel.org/process/license-rules.html#linux-kernel-licensing-rules.

For consistency, the kernel now has a hard rule: the very first line of every source file must be an SPDX license identifier (https://spdx.org/). This is a concise way to declare the code's license. So, the first line of most C source files looks like this:

// SPDX-License-Identifier: GPL-2.0

Licensing for Out-of-Tree Kernel Modules

For out-of-tree modules, the situation is slightly more "flexible," but there's a bottom line: if you want help from the kernel community (which is a huge plus), you should (or are expected to) release the code under the GNU GPL-2.0 license (of course, dual licensing is also acceptable, like "Dual MIT/GPL").

There are two ways to declare a module's license:

  1. SPDX-License-Identifier tag: As a comment on the first line of the source file. Strictly speaking, this primarily applies to modules within the source tree.
  2. MODULE_LICENSE() macro: This is mandatory. The official documentation explicitly states: "Loadable kernel modules also require a MODULE_LICENSE tag. This tag is not a replacement for proper source code license information (SPDX-License-Identifier), nor is it used to express or determine the license of the source code itself."

Its sole purpose is to tell the kernel module loader and userspace tools whether this module is "free software" or "proprietary software." include/linux/module.h clearly lists which license identifiers are acceptable:

/*
 * The following license idents are currently accepted as indicating free
 * software modules
 * "GPL" [GNU Public License v2 or later]
 * "GPL v2" [GNU Public License v2]
 * "Dual BSD/GPL" [GNU Public License v2 or BSD license choice]
 * "Dual MIT/GPL" [GNU Public License v2 or MIT license choice]
 * "Dual MPL/GPL" [GNU Public License v2 or Mozilla license choice]
 *
 * The following other idents are available
 * "Proprietary" [Non free products]
 * [ … ]
 */

Obviously, the kernel community strongly encourages you to use GPL-2.0 or similar licenses (BSD/MIT/MPL). If you want to contribute code to mainline, then GPL-2.0 is the only choice.
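Putting both declaration mechanisms together, a minimal out-of-tree module might look like the following sketch. The module name, strings, and author text here are illustrative placeholders, not from the book's code; only the SPDX tag and the MODULE_LICENSE() macro usage reflect the rules above.

```c
// SPDX-License-Identifier: Dual MIT/GPL
/* hello_lic.c -- hypothetical minimal module showing license tagging */
#include <linux/init.h>
#include <linux/module.h>

static int __init hello_lic_init(void)
{
	pr_info("hello_lic: loaded\n");
	return 0;	/* success */
}

static void __exit hello_lic_exit(void)
{
	pr_info("hello_lic: unloaded\n");
}

module_init(hello_lic_init);
module_exit(hello_lic_exit);

/* Mandatory: the string must be one of the idents accepted in
 * include/linux/module.h (see the comment excerpt above). */
MODULE_LICENSE("Dual MIT/GPL");
MODULE_AUTHOR("<your name here>");
MODULE_DESCRIPTION("Illustrative module for license tagging");
```

With "Dual MIT/GPL", the kernel treats the module as free software; had we written "Proprietary", loading it would taint the kernel and make community support effectively unavailable.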

There are many complex details regarding licensing (it even requires legal knowledge). I strongly recommend consulting your company's legal department. As for the GPL Frequently Asked Questions (FAQ), you can check here: https://www.gnu.org/licenses/gpl-faq.html.

Alright, enough with the dry legal topics. Let's get back to the technical details: how to simulate "library" functionality in kernel space.


5.7 Simulating "Library" Functionality

A huge difference between userspace programming and kernel space programming is that there are no traditional "libraries" in the kernel. Although the lib/ folder has some library-like routines, they are compiled directly into the kernel image and can't be dynamically linked like .so files.

The good news is that we have two ways to achieve a similar effect:

  1. Explicitly linking multiple source files: Compile and link the "library" code and your module code into a single .ko file.
  2. Module stacking: This is the true "library" concept, where a "core" module exports symbols for other modules to use.

Spoiler alert: the first method is usually superior. But the second method also has its uses. Let's take a detailed look.

Simulating a Library by Linking Multiple Source Files

So far, our modules have only had a single .c file. What if the project gets bigger? For example, a project called projx contains prj1.c, prj2.c, and prj3.c. You want to compile them into a single module called projx.ko.

Just write it like this in the Makefile:

obj-m := projx.o
projx-objs := prj1.o prj2.o prj3.o

Note that the same base name, projx, appears in both obj-m and projx-objs; this is how the build system ties them together. It first compiles the three .c files into .o files, then links them into the final projx.ko.
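For context, a minimal complete Makefile for this hypothetical projx project might look as follows. It's a sketch following the standard kbuild "external module" convention; the KDIR path assumes you're building against the running kernel's headers.

```makefile
# Sketch of a Makefile for the hypothetical projx module (three source files).
obj-m := projx.o
projx-objs := prj1.o prj2.o prj3.o

# Assumption: kernel build directory of the currently running kernel.
KDIR ?= /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)

all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean
```

Invoking make here hands control to the kernel's own build system (via -C $(KDIR)), which reads the obj-m and projx-objs lines and produces projx.ko.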

In our book, we use this mechanism to build a small "kernel library" (the source code is in the root directory's klib.h and klib.c). Other modules (like lowlevel_mem in Chapter 8) can use the functions inside by linking against this klib.o. In the Makefile for lowlevel_mem, it's written like this:

FNAME_C := lowlevel_mem
[ … ]
PWD := $(shell pwd)
obj-m += ${FNAME_C}_lkm.o
lowlevel_mem_lkm-objs := ${FNAME_C}.o ../../klib.o

This line lowlevel_mem_lkm-objs := ... tells the build system: compile and link the current module's code together with the klib.c code from the parent directory into a single lowlevel_mem_lkm.ko.

The advantages of this method are obvious:

  • No need to explicitly mark every function/data with EXPORT_SYMBOL.
  • These functions and data are only visible to the module they're linked into; they don't pollute the global symbol table.

The downside is that the final .ko can be larger, since every module that links in the "library" objects carries its own copy of that code.

Before diving into "module stacking," we need to understand a more fundamental concept: the scope of functions and variables.


Understanding the Scope of Functions and Variables in Kernel Modules

Everyone knows C's scope rules:

  • Local variables inside a function... well, are local.
  • Variables and functions with the static keyword have their scope limited to the current file. This is good; it reduces namespace pollution.

In ancient Linux kernels (2.4 and earlier), all global variables and functions in a module were globally visible by default. This was obviously not a good idea.

Starting with the 2.6 kernel, the rules changed: all variables (including static and global data) and functions in a kernel module are private to the module and invisible outside it by default. So, if two modules