Chapter 9: Netfilter Frameworks

Chapter Intro: Setting Up Checkpoints on the Highway

If you think of the Network Stack as a busy highway and packets as the vehicles traveling along it, then Netfilter is the traffic police.

But this isn't just a metaphor. In the Linux kernel's design, network traffic truly does fly by: the NIC raises an interrupt, the kernel allocates an sk_buff, the protocol stack passes the packet upward, and the memory is finally freed. This process is astonishingly fast. But if you are a "traffic cop" wanting to pull over a vehicle to check its credentials, or even reroute it entirely, where should you intervene?

Early kernel designs didn't reserve this many "checkpoints." If you wanted to filter packets or perform NAT (Network Address Translation), you had to modify the core protocol stack code, hardcoding the logic directly into the processing flow. This was a terrible way to extend the kernel: with every new feature, the kernel grew a little fatter, and it became error-prone.

Netfilter emerged to solve this problem. Instead of modifying the main road, it hangs "hooks" at five key locations along it. Any kernel module can register callback functions on these hooks. When a packet passes by, the kernel pauses and asks, "Does anyone here want to process this packet?"
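The registration-and-callback idea described above can be sketched in a few lines of userspace Python. This is only a toy model, not kernel code: the hook names mirror the kernel's NF_INET_* constants for the five IPv4 hook points, while register_hook(), run_hooks(), and the packet dictionary are hypothetical names invented for this illustration.

```python
from enum import Enum

class Hook(Enum):
    # Names follow the kernel's NF_INET_* hook constants.
    PRE_ROUTING = 0
    LOCAL_IN = 1
    FORWARD = 2
    LOCAL_OUT = 3
    POST_ROUTING = 4

class Verdict(Enum):
    ACCEPT = "accept"
    DROP = "drop"

# One callback list per hook point, kept sorted by priority.
_hooks = {hook: [] for hook in Hook}

def register_hook(hook, priority, callback):
    """Attach a callback to a hook; lower priority values run first."""
    _hooks[hook].append((priority, callback))
    _hooks[hook].sort(key=lambda entry: entry[0])

def run_hooks(hook, packet):
    """Ask every registered callback in turn; the first DROP ends the walk."""
    for _priority, callback in _hooks[hook]:
        if callback(packet) is Verdict.DROP:
            return Verdict.DROP
    return Verdict.ACCEPT

def drop_telnet(packet):
    """Example callback (hypothetical): drop anything aimed at port 23."""
    return Verdict.DROP if packet.get("dport") == 23 else Verdict.ACCEPT

register_hook(Hook.PRE_ROUTING, 0, drop_telnet)
```

Calling run_hooks(Hook.PRE_ROUTING, {"dport": 23}) walks the callbacks for that hook and stops at the first DROP. A hook with no registered callbacks simply accepts the packet, which is why an idle Netfilter adds very little to the main road.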

In this chapter, we dive deep into this exact mechanism. We will see how it embeds itself into every corner of the Network Stack, and how it supports functionalities we use every day like iptables, connection tracking, and NAT.

Netfilter Frameworks

Now let's zoom out a bit. Before diving into the specific code implementation, let's get a clear view of the big picture.

The Netfilter subsystem is not just a firewall; it is a low-level traffic interception framework. Built on top of it, the kernel implements a series of features that are indispensable in network engineering:

  • Packet Selection: This is the core of iptables, determining which packets get pulled out for processing.
  • Packet Filtering: What we commonly refer to as firewall rules—let this one through, drop that one.
  • Network Address Translation (NAT): Modifying a packet's IP or port, which is the foundation for allowing a LAN to share a single public IP for internet access.
  • Packet Mangling: Modifying packet header contents before or after routing (such as adding a specific mark).
  • Connection Tracking: One of the most central mechanisms, through which the kernel remembers "this is an already established connection." Both NAT and stateful firewalls rely on it.
  • Network Accounting: Collecting traffic data for billing or monitoring purposes.
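To make the connection tracking item in the list above more concrete, here is a minimal sketch of a stateful filter built on a connection table. It assumes a simplified model in which a connection is identified by protocol, addresses, and ports. The ConnTable class, stateful_filter(), and the packet dictionary layout are hypothetical; the state names NEW and ESTABLISHED are borrowed from conntrack terminology.

```python
class ConnTable:
    """Toy connection table keyed by (proto, src, sport, dst, dport)."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def _key(packet):
        return (packet["proto"], packet["src"], packet["sport"],
                packet["dst"], packet["dport"])

    @staticmethod
    def _reply_key(packet):
        # A reply travels in the opposite direction, so swap the endpoints.
        return (packet["proto"], packet["dst"], packet["dport"],
                packet["src"], packet["sport"])

    def state(self, packet):
        """Classify a packet as part of a known connection or as new."""
        if self._key(packet) in self._seen or self._reply_key(packet) in self._seen:
            return "ESTABLISHED"
        return "NEW"

    def remember(self, packet):
        self._seen.add(self._key(packet))

def stateful_filter(table, packet, from_lan):
    """Let the LAN open connections; let the outside only answer them."""
    if table.state(packet) == "ESTABLISHED":
        return "accept"
    if from_lan:
        table.remember(packet)
        return "accept"
    return "drop"
```

The point of the sketch is the decision order: the table is consulted first, so an inbound packet is accepted only when it is the mirror image of something the LAN already sent.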

Our task in this chapter is to figure out exactly how these functionalities run while hooked into the Netfilter framework.

Three Major Frameworks Based on Netfilter

In the kernel source's net/netfilter directory, alongside the core framework, lie several well-known projects. If you've ever maintained production servers, you've most likely encountered LVS or IPVS—or at the very least, the traffic you've handled has flowed through them.

1. IPVS: Transport Layer Load Balancing

IPVS (IP Virtual Server) is a transport layer load balancing solution built on Netfilter. The code is located at net/netfilter/ipvs.

When you have only one entry server but ten web servers running behind it, you need to distribute incoming traffic evenly across those ten servers. IPVS intercepts traffic on Netfilter's hooks, modifies the destination address based on a scheduling algorithm (such as round-robin or least-connections), and forwards the packets to the real backend servers.
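The two scheduling algorithms named above can be sketched as follows. This is a simplified illustration under stated assumptions, not how IPVS is implemented: the RoundRobin and LeastConnections classes, forward(), and the packet dictionary are hypothetical names, and the real schedulers also account for weights and connection states.

```python
from itertools import cycle

class RoundRobin:
    """Hand out backends in a fixed rotation."""

    def __init__(self, servers):
        self._ring = cycle(servers)

    def pick(self):
        return next(self._ring)

class LeastConnections:
    """Pick the backend that currently has the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Called when a connection to this backend ends.
        self.active[server] -= 1

def forward(packet, scheduler):
    """Return a copy of the packet with the destination rewritten to a backend."""
    return {**packet, "dst": scheduler.pick()}
```

For example, forward({"dst": "192.0.2.1", "dport": 80}, RoundRobin(["10.0.0.1", "10.0.0.2"])) rewrites only the destination address and leaves the rest of the packet untouched.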

Historical note: Support for IPv4 IPVS has existed in the kernel for a very long time. IPv6 IPVS support, however, was merged in version 2.6.28, primarily developed by Google's Julius Volz and Vince Busam. If you want to dive deeper into the LVS (Linux Virtual Server) architecture, you can refer to www.linuxvirtualserver.org.

2. IP Sets: Managing Massive Numbers of IP Addresses

If you've ever written firewall rules, you might have encountered this pain point: to block 10,000 IP addresses with plain iptables, you have to write 10,000 rules. This is a performance disaster, because rules are evaluated one after another and a packet that matches none of them is compared against all 10,000.

IP sets were born to solve this problem. The framework consists of two parts:

  • Kernel module: net/netfilter/ipset
  • Userspace tool: ipset

IP sets allow you to maintain a set in the kernel that can contain IP addresses, port numbers, or even whole networks. In iptables, you only need to reference the name of this set, and the kernel looks it up efficiently using a data structure suited to the set type, such as a bitmap or a hash table. This framework was developed by Jozsef Kadlecsik. For details, see http://ipset.netfilter.org.
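The difference between one rule per address and a single set lookup can be illustrated with a small sketch. It assumes a hash-based set and uses Python's built-in set as a stand-in; linear_match() and HashIpSet are hypothetical names for this illustration and are not part of ipset.

```python
import ipaddress

def linear_match(rules, src):
    """One rule per address: return (matched, number of rules checked)."""
    for checked, blocked_address in enumerate(rules, start=1):
        if src == blocked_address:
            return True, checked
    return False, len(rules)

class HashIpSet:
    """Toy stand-in for a hash-based set of addresses."""

    def __init__(self):
        self._members = set()

    def add(self, address):
        self._members.add(ipaddress.ip_address(address))

    def __contains__(self, address):
        # A hash lookup does not depend on how many members the set has.
        return ipaddress.ip_address(address) in self._members

# 10,000 blocked addresses, from 10.0.0.0 upward.
blocked = [f"10.0.{i // 256}.{i % 256}" for i in range(10_000)]
blocklist = HashIpSet()
for address in blocked:
    blocklist.add(address)
```

With the linear version, a packet from an address that is not blocked is compared against every one of the 10,000 entries, while the set answers the same question with a single lookup. With the real tools, this corresponds roughly to creating a set with `ipset create blocklist hash:ip` and referencing it from one rule via `-m set --match-set blocklist src`.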

3. iptables: The Most Familiar Old Companion

When talking about Netfilter, we must mention iptables. It is probably the most popular Linux firewall tool in history.

It's crucial to understand this relationship: Netfilter is the framework inside the kernel, while iptables is the userspace frontend tool. You use the iptables command to write rules, and iptables tells the kernel about these rules through the setsockopt() system call. The Netfilter framework inside the kernel then uses these rules to match packets.

iptables provides a complete set of management layer functionalities:

  • Adding or deleting rules
  • Displaying statistics
  • Managing different tables
  • Zeroing counters

Depending on the protocol, there are several parallel implementations inside the kernel:

  • iptables: For IPv4 (code in net/ipv4/netfilter/ip_tables.c)
  • ip6tables: For IPv6 (code in net/ipv6/netfilter/ip6_tables.c)
  • arptables: For the ARP protocol (code in net/ipv4/netfilter/arp_tables.c)
  • ebtables: For Ethernet bridging (Layer 2 firewall, code in net/bridge/netfilter/ebtables.c)

In userspace, the corresponding command-line tools are iptables and ip6tables. They similarly communicate with the kernel through setsockopt() and getsockopt().

Next-Generation Technologies: xtables2 and nftables

Although iptables is a classic, this old interface has its breaking points—such as when you try to insert tens of thousands of rules. To solve this problem, the kernel community began exploring new architectures.

There are two directions here worth our attention.

1. xtables2

This is a project primarily developed by Jan Engelhardt (still in progress as of this writing). Its core improvement is discarding the aging setsockopt() interface in favor of a Netlink message-based interface for communicating with the kernel.

There is a technical detail here worth pondering: with the old setsockopt() approach, changing the rules often means copying the entire rule table from userspace into kernel space and then parsing it again. Once the number of rules reaches tens of thousands, the overhead of this "bulk copy" is staggering. Because xtables2 uses Netlink's message-based mechanism, it can perform incremental updates that transmit only the changed parts, which makes it much more graceful when dealing with large-scale rule sets. For project details, see http://xtables.de.
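A rough way to see the cost difference is to count the bytes that cross the userspace/kernel boundary when one rule is added to a table of 10,000. The sketch below is only an illustration: it uses JSON as a stand-in for the wire format, which is an assumption made purely for measurement (neither setsockopt() nor Netlink uses JSON), and bulk_replace_cost() and incremental_cost() are hypothetical names.

```python
import json

def bulk_replace_cost(rules):
    """Bulk-copy style update: the whole table is serialized and sent."""
    return len(json.dumps(rules).encode())

def incremental_cost(new_rule):
    """Message-based update: only the changed rule is sent."""
    return len(json.dumps({"op": "add", "rule": new_rule}).encode())

# An existing table of 10,000 rules, plus one rule we want to add.
rules = [{"src": f"10.0.{i // 256}.{i % 256}", "verdict": "drop"}
         for i in range(10_000)]
new_rule = {"src": "192.0.2.1", "verdict": "drop"}

bulk_bytes = bulk_replace_cost(rules + [new_rule])
delta_bytes = incremental_cost(new_rule)
```

The bulk cost grows with the size of the table, while the incremental cost depends only on the size of the change.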

2. nftables

This is the true "successor," aimed at completely replacing iptables. The core design ideas of nftables are:

  • Unified Implementation: No longer distinguishing between iptables, ip6tables, arptables, and ebtables; all protocol processing logic lives in a single unified codebase.
  • Virtual Machine: It introduces a simple virtual machine to execute rules, making rule expression more powerful and flexible.

The virtual machine is not there for show: it is how nftables gains flexibility without hardcoding every kind of match into the kernel. Traditional iptables rules rely on hardcoded matching functions, whereas with nftables a rule is expressed as a short sequence of assembly-like instructions that is built in userspace and executed in the kernel by a tiny interpreter. This means you can combine very complex logic without needing to modify the kernel code every time.
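The idea of a rule as a small instruction sequence can be sketched with a toy interpreter. The instruction names (load, cmp_eq, verdict), the register names, and the packet dictionary below are all hypothetical and chosen for readability; they are not the actual nftables instruction set.

```python
def run_rule(program, packet):
    """Execute one rule; return its verdict, or None if the rule did not match."""
    registers = {}
    for op, *args in program:
        if op == "load":        # load <field> into <register>
            field, reg = args
            registers[reg] = packet.get(field)
        elif op == "cmp_eq":    # stop evaluating unless register == value
            reg, value = args
            if registers.get(reg) != value:
                return None
        elif op == "verdict":   # every comparison passed: emit the verdict
            return args[0]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return None

# Roughly "drop TCP packets to port 22", written out by hand as instructions.
drop_ssh = [
    ("load", "protocol", "r1"),
    ("cmp_eq", "r1", "tcp"),
    ("load", "dport", "r2"),
    ("cmp_eq", "r2", 22),
    ("verdict", "drop"),
]
```

A new kind of match becomes a new combination of the same few instructions, so the interpreter itself does not need to change.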

This project was first proposed by Patrick McHardy at the 2008 Netfilter Workshop. The kernel infrastructure and userspace tools were primarily developed by Patrick McHardy and Pablo Neira Ayuso. If you want to read a forward-looking technical analysis, check out the classic article "Nftables: a new packet filtering engine" (LWN, http://lwn.net/Articles/324989/).


Extensions and Ecosystem

The charm of Netfilter lies in its extensibility. Beyond the core frameworks mentioned above, there are a massive number of kernel modules (under net/netfilter) that extend its functionality. We won't list all of these modules one by one in this book—after all, managing and using these rules is more of a system administrator's daily work, not a required course for kernel developers.

If you are interested in specific extension modules (such as conntrack helpers or specific NAT modules), the best reference besides the source code itself is the official Netfilter project website: www.netfilter.org.

We've now seen the big picture, but this is only the map from a userspace perspective. Next, we'll dive into the kernel and see exactly where these hooks are placed.