9.7 Network Address Translation (NAT)
Now we arrive at the most famous feature in the Netfilter world—NAT (Network Address Translation).
You probably use it every day—whenever you connect your phone to your home Wi-Fi, you're using NAT. It's the life-support system that has kept the modern internet from running out of addresses, and the "magic box" that lets a room full of devices share a single public IP.
But as a kernel developer, you can't stop at "it lets us share an internet connection." We need to crack open this box and see exactly where in the network stack it reaches in to modify packets.
This section is divided into two parts: first, we'll clarify the two common NAT modes (SNAT and DNAT), and then we'll dive into the kernel to see how it hooks itself onto the Hook points we discussed earlier.
What NAT is, where it comes from, and where it's going
As the name suggests, the NAT module's primary job is to translate network addresses—specifically, rewriting the source or destination address in the IP header, along with the L4 ports (TCP/UDP header).
The most common scenario looks like this: you have a router at home (what we often call a "modem" or "gateway"), with your phone, laptop, and fridge connected behind it. These devices all have private IPs (like 192.168.x.x). When your laptop visits Google, Google's servers obviously can't send the reply back to 192.168.1.5—that's a private address and invalid on the public internet.
This is where the NAT on your router steps in. It changes the source IP in your packet to the router's public IP, and simultaneously makes a note in memory: "The packet from 192.168.1.5:12345 has been masqueraded as PublicIP:54321." When Google's reply arrives back at the router, it looks up this record, changes the destination IP back to 192.168.1.5, and forwards the packet to your laptop.
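The bookkeeping described above can be sketched in a few lines of userspace C. This is purely an illustration—every name here (`snat_out`, `snat_in`, `nat_entry`) is made up for this sketch, not a kernel API; in the real kernel these mappings live in the connection-tracking table, which we'll meet again shortly:

```c
#include <stdint.h>

/* One SNAT binding: "priv_ip:priv_port is masqueraded as pub_ip:pub_port" */
struct nat_entry {
	uint32_t priv_ip;  uint16_t priv_port;
	uint32_t pub_ip;   uint16_t pub_port;
};

#define MAX_ENTRIES 64
static struct nat_entry table[MAX_ENTRIES];
static int n_entries;
static uint16_t next_pub_port = 54000;   /* naive public-port allocator */

/* Outbound: rewrite the source to the public address, remembering the mapping. */
static void snat_out(uint32_t pub_ip, uint32_t *src_ip, uint16_t *src_port)
{
	for (int i = 0; i < n_entries; i++)
		if (table[i].priv_ip == *src_ip && table[i].priv_port == *src_port) {
			*src_ip = table[i].pub_ip;
			*src_port = table[i].pub_port;
			return;          /* existing flow: reuse its binding */
		}
	struct nat_entry *e = &table[n_entries++];
	e->priv_ip = *src_ip; e->priv_port = *src_port;
	e->pub_ip  = pub_ip;  e->pub_port  = next_pub_port++;
	*src_ip = e->pub_ip;  *src_port = e->pub_port;
}

/* Inbound reply: look up the binding and restore the private address. */
static int snat_in(uint32_t *dst_ip, uint16_t *dst_port)
{
	for (int i = 0; i < n_entries; i++)
		if (table[i].pub_ip == *dst_ip && table[i].pub_port == *dst_port) {
			*dst_ip = table[i].priv_ip;
			*dst_port = table[i].priv_port;
			return 1;
		}
	return 0;   /* no binding: in a real NAT, the packet would be dropped */
}
```

Note the asymmetry: the outbound path *creates* state, while the inbound path can only *match* existing state—this is exactly why an unsolicited packet from the internet cannot reach a machine behind NAT.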
This is the core value of NAT: it allows a group of devices with "fake IDs" (private IPs) to access the outside world through a proxy that holds a "real ID" (a public IP).
The Netfilter subsystem implements NAT for both IPv4 and IPv6.
Although NAT exists primarily to alleviate IPv4 address exhaustion—which shouldn't be an issue in the IPv6 world—historical inertia is powerful. The kernel has officially supported IPv6 NAT since version 3.7 (code in net/ipv6/netfilter/ip6table_nat.c). Its implementation largely mirrors that of IPv4, and the user configuration experience is similar. Beyond sharing an internet connection, NAT is also commonly used for simple load balancing (by configuring DNAT to distribute traffic to different backend servers).
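To make the load-balancing use concrete: DNAT rules can spread incoming connections across backends. A sketch using the `statistic` match to alternate between two servers (the addresses are placeholders, and this is a minimal round-robin setup, not a production recipe):

```shell
# Send every second new connection on port 80 to backend 10.0.0.2 ...
iptables -t nat -A PREROUTING -p tcp --dport 80 \
    -m statistic --mode nth --every 2 --packet 0 \
    -j DNAT --to-destination 10.0.0.2:80
# ... and the remainder to backend 10.0.0.3
iptables -t nat -A PREROUTING -p tcp --dport 80 \
    -j DNAT --to-destination 10.0.0.3:80
```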
NAT configuration details vary wildly, and the documentation online could fill several hard drives. But in the kernel's eyes, there are really only two basic operations:
- SNAT (Source NAT): Rewrites the source IP.
  - This is what your home router does.
  - Usually performed when a packet is leaving the network stack (POST_ROUTING).
- DNAT (Destination NAT): Rewrites the destination IP.
  - This is what port forwarding does (e.g., forwarding a packet sent to PublicIP:80 to your internal 192.168.1.100:80).
  - Usually performed after a packet enters the network stack but before the routing decision is made (PRE_ROUTING).
When you specify a target with the -j parameter in iptables, you are choosing between these two:
# Change the source address to 1.2.3.4 (SNAT)
iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to-source 1.2.3.4
# Change the destination address to 10.0.0.1 (DNAT)
iptables -t nat -A PREROUTING -i eth0 -j DNAT --to-destination 10.0.0.1
Whether it's SNAT or DNAT, the underlying implementation logic lives in net/netfilter/xt_nat.c.
Note: The NAT table in iptables is different from the filter table; it only takes effect at specific Hook points. Below, we'll see how this table is initialized.
NAT Initialization: Hanging the Hooks
Remember the filter table we saw in the previous section? The NAT table (nat table) is essentially also an xt_table object. It is registered in a very similar way to the filter table, via ipt_register_table() and ipt_unregister_table().
But there is a key difference: the NAT table doesn't care about NF_INET_FORWARD.
Think about why.
If the goal is to share an internet connection (SNAT), by the time a packet reaches the forwarding node (FORWARD), the route has already determined where it's going. At this point, we only care about changing the source address right before it leaves (POST_ROUTING). If the goal is port forwarding (DNAT), we need to change the destination address before the routing decision (PRE_ROUTING) so that the routing table lookup directs the packet to the internal machine.
At the FORWARD point, NAT usually has nothing to do—the decision has already been made.
Let's look at the NAT table definition (code in net/ipv4/netfilter/iptable_nat.c):
static const struct xt_table nf_nat_ipv4_table = {
	.name = "nat",
	/* Only takes effect at the following four hook points */
	.valid_hooks = (1 << NF_INET_PRE_ROUTING) |  /* DNAT happens here */
		       (1 << NF_INET_POST_ROUTING) | /* SNAT happens here */
		       (1 << NF_INET_LOCAL_OUT) |    /* locally generated packets can be DNATed too */
		       (1 << NF_INET_LOCAL_IN),      /* locally received packets can be SNATed too */
	.me = THIS_MODULE,
	.af = NFPROTO_IPV4,
};
You can see clearly here: it excludes NF_INET_FORWARD. This aligns perfectly with our earlier logic that "NAT modifies the starting point or endpoint of a packet."
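For intuition: valid_hooks is just a bitmask, and deciding whether a table participates at a given hook is a single bit test. A userspace sketch—the enum values match the kernel's uapi definitions, but `table_applies()` is a made-up helper name for illustration:

```c
/* Hook numbers, matching enum nf_inet_hooks in the kernel's uapi headers */
enum nf_inet_hooks {
	NF_INET_PRE_ROUTING,   /* 0 */
	NF_INET_LOCAL_IN,      /* 1 */
	NF_INET_FORWARD,       /* 2 */
	NF_INET_LOCAL_OUT,     /* 3 */
	NF_INET_POST_ROUTING,  /* 4 */
};

/* The nat table's mask, built the same way as in nf_nat_ipv4_table */
static const unsigned int nat_valid_hooks =
	(1u << NF_INET_PRE_ROUTING) | (1u << NF_INET_POST_ROUTING) |
	(1u << NF_INET_LOCAL_OUT)   | (1u << NF_INET_LOCAL_IN);

/* table_applies() is a hypothetical name; the kernel does this test inline */
static int table_applies(unsigned int valid_hooks, unsigned int hooknum)
{
	return (valid_hooks & (1u << hooknum)) != 0;
}
```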
This xt_table object gets registered into the netns_ipv4 structure within the network namespace (struct net), via its nat_table pointer. We mentioned this structure when discussing iptables earlier, and now the nat table lives there too.
Registering Hook Callbacks
Having a table alone isn't enough; we need to tell the kernel: "When a packet reaches these Hook points, please call our functions to handle it."
This requires defining a nf_hook_ops array and registering them with the Netfilter framework. This array is slightly richer than the filter table's because NAT needs to do different things at different Hook points:
static struct nf_hook_ops nf_nat_ipv4_ops[] __read_mostly = {
	/* Before packet filtering, change destination (DNAT) */
	{
		.hook = nf_nat_ipv4_in,           /* handler function */
		.owner = THIS_MODULE,
		.pf = NFPROTO_IPV4,
		.hooknum = NF_INET_PRE_ROUTING,   /* hooked onto PRE_ROUTING */
		.priority = NF_IP_PRI_NAT_DST,    /* priority: DNAT must finish before routing */
	},
	/* After packet filtering, change source (SNAT) */
	{
		.hook = nf_nat_ipv4_out,
		.owner = THIS_MODULE,
		.pf = NFPROTO_IPV4,
		.hooknum = NF_INET_POST_ROUTING,  /* hooked onto POST_ROUTING */
		.priority = NF_IP_PRI_NAT_SRC,    /* priority: SNAT runs as the packet is about to leave */
	},
	/* Before packet filtering, change destination (DNAT) */
	{
		.hook = nf_nat_ipv4_local_fn,
		.owner = THIS_MODULE,
		.pf = NFPROTO_IPV4,
		.hooknum = NF_INET_LOCAL_OUT,     /* locally generated packets can be DNATed too */
		.priority = NF_IP_PRI_NAT_DST,
	},
	/* After packet filtering, change source (SNAT) */
	{
		.hook = nf_nat_ipv4_fn,           /* locally received packets can be SNATed too */
		.owner = THIS_MODULE,
		.pf = NFPROTO_IPV4,
		.hooknum = NF_INET_LOCAL_IN,
		.priority = NF_IP_PRI_NAT_SRC,
	},
};
Notice the logic in the comments:
- The callbacks hooked onto PRE_ROUTING and LOCAL_OUT have a priority of NF_IP_PRI_NAT_DST (destination address translation). This means that as soon as a packet arrives, before any complex filtering judgments are made, its destination is changed first: this is the hallmark of DNAT.
- The callbacks hooked onto POST_ROUTING and LOCAL_IN have a priority of NF_IP_PRI_NAT_SRC (source address translation). By then the packet has already been routed and filtered, and is about to leave or be handed to a local process; its source address is modified at the last moment: this is the hallmark of SNAT.
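The numbers behind these priority names come from enum nf_ip_hook_priorities in include/uapi/linux/netfilter_ipv4.h (abridged here). Netfilter invokes the callbacks registered at a hook in ascending priority order, so at PRE_ROUTING, for example, DNAT (-100) runs after conntrack (-200) but before any filter rules (0):

```c
/* Abridged from include/uapi/linux/netfilter_ipv4.h; lower value = runs earlier */
enum nf_ip_hook_priorities {
	NF_IP_PRI_RAW       = -300,  /* raw table */
	NF_IP_PRI_CONNTRACK = -200,  /* connection tracking */
	NF_IP_PRI_MANGLE    = -150,  /* mangle table */
	NF_IP_PRI_NAT_DST   = -100,  /* DNAT: before routing and filtering */
	NF_IP_PRI_FILTER    =    0,  /* filter table */
	NF_IP_PRI_NAT_SRC   =  100,  /* SNAT: after filtering, on the way out */
};
```

This ordering also explains why conntrack must run before NAT: the translation decision is recorded against a conntrack entry that has to exist first.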
The final registration action happens in the module initialization function iptable_nat_init():
static int __init iptable_nat_init(void)
{
	int err;
	...
	err = nf_register_hooks(nf_nat_ipv4_ops, ARRAY_SIZE(nf_nat_ipv4_ops));
	if (err < 0)
		goto err2;
	return 0;
	...
}
Once this code runs, your network stack has a set of "invisible hands" that quietly modify packet addresses as they pass through these critical intersections.
But this is just the infrastructure being put in place. In the next section, we'll dive inside these hook functions (like nf_nat_ipv4_fn) to see exactly how the kernel modifies the IP header step by step, recalculates the checksum, and updates the connection tracking entry when a packet is actually captured—after all, changing an address and forgetting to update the conntrack record would be a disaster.
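The checksum point deserves a concrete preview. When NAT rewrites an address, the IP header checksum must be fixed up, and the kernel does this incrementally (helpers like csum_replace4(), following RFC 1624) rather than re-summing the whole header. A minimal userspace sketch of the idea—the function names here are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Full one's-complement checksum over a buffer (the "from scratch" way). */
static uint16_t csum_full(const uint16_t *p, size_t words)
{
	uint32_t sum = 0;
	while (words--)
		sum += *p++;
	while (sum >> 16)                      /* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Incremental update per RFC 1624: HC' = ~(~HC + ~m + m'),
 * applied once per 16-bit word that changed. */
static uint16_t csum_update16(uint16_t check, uint16_t old, uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;
	sum += (uint16_t)~old;
	sum += new_word;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Rewriting a 32-bit IPv4 address is then two csum_update16() calls (one per 16-bit half), touching three fields instead of twenty bytes—exactly the kind of per-packet saving a router cares about.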