8.4 Extension Headers — Infinite Extension via Chaining

In the previous section, we spent a good while looking at IPv6's fixed 40-byte header. It's clean and pure, shedding all unnecessary baggage.

But you probably have a question: What if certain optional features absolutely must be added? For example, encryption, fragmentation, or route recording?

In the IPv4 era, this problem was solved with "Options"—stuffing a bunch of variable-length fields after the main header. This was the root of all evil: variable-length headers made hardware acceleration painful, forcing routers to peel open the entire packet just to parse a few option fields that might or might not be there.

IPv6's answer is: Don't put it inside; put it outside.

These are Extension Headers. They are one of the most elegant parts of the IPv6 design: they turn the protocol into a chain.

The Secret of the Chain: `nexthdr`

Remember the nexthdr field we mentioned in the previous section?

I called it IPv6's "Rosetta Stone." Now let's see how this stone turns.

The core logic of extension headers is chaining. In the IPv6 main header, nexthdr doesn't necessarily name TCP or UDP; it identifies whatever header sits immediately behind the IPv6 header, and that may well be an extension header.

It's like a treasure hunt:

You open the IPv6 box, and inside is a note saying nexthdr = 43 (Routing Header).
You find the Routing Header box, open it, and inside is another note saying nexthdr = 17 (UDP).
You then go grab the UDP box.

This is why every extension header must have a Next Header field. It's the key to the next box. Only at the very last extension header does Next Header point to the real star of the show—like TCP, UDP, or ICMPv6.
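To make the treasure hunt concrete, here is a minimal userspace sketch of walking the chain over a raw packet buffer. It is not the kernel's code: walk_nexthdr_chain() and the NH_* names are hypothetical, and the sketch only steps over Hop-by-Hop, Routing, Destination Options, and Fragment headers, treating every other value (including AH and ESP) as the end of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Next Header values for the extension headers this sketch understands. */
enum {
    NH_HOPOPTS  = 0,
    NH_ROUTING  = 43,
    NH_FRAGMENT = 44,
    NH_DSTOPTS  = 60
};

#define IPV6_HDR_LEN 40u

/*
 * Hypothetical helper: follow the Next Header chain of one IPv6 packet.
 * Returns the first Next Header value that is not one of the extension
 * headers above, or -1 if the packet is truncated. *upper_off receives
 * the offset of the header that value describes.
 */
int walk_nexthdr_chain(const uint8_t *pkt, size_t len, size_t *upper_off)
{
    assert(pkt != NULL && upper_off != NULL);
    if (len < IPV6_HDR_LEN)
        return -1;

    uint8_t nexthdr = pkt[6];          /* Next Header byte of the fixed header */
    size_t off = IPV6_HDR_LEN;

    for (;;) {
        size_t hdrlen;

        switch (nexthdr) {
        case NH_HOPOPTS:
        case NH_ROUTING:
        case NH_DSTOPTS:
            if (len - off < 8)
                return -1;
            /* Hdr Ext Len counts 8-byte units beyond the first 8 bytes. */
            hdrlen = ((size_t)pkt[off + 1] + 1) * 8;
            break;
        case NH_FRAGMENT:
            hdrlen = 8;                /* the Fragment header has a fixed size */
            break;
        default:
            *upper_off = off;
            return nexthdr;
        }
        if (len - off < hdrlen)
            return -1;
        nexthdr = pkt[off];            /* the "note" inside this box */
        off += hdrlen;
    }
}
```

For the example above (an IPv6 header with nexthdr = 43, followed by an 8-byte Routing Header whose Next Header is 17), the function returns 17 and reports offset 48.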

The greatest benefit of this design is that intermediate routers normally only need to look at the IPv6 header. Unless they encounter a Hop-by-Hop header, or are themselves a waypoint named by a Routing Header, routers don't care how many extension headers are chained behind it; they simply forward based on the IPv6 header. Compared with IPv4 Options, this keeps the forwarding fast path much simpler.

The Rules: Order in the Chain

Although extension headers are flexible, the chain is not chaotic. There are a few ironclad rules we must follow:

Strict sequential processing: We can't look at Fragment before Routing. We must process them in the exact order they appear in the packet.
Each extension header typically appears only once: The exception is Destination Options, which is allowed to appear twice (we'll explain why later).
8-byte alignment: For memory access efficiency, the length of all extension headers must be a multiple of 8. What if it falls short? Padding.
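The alignment rule is easy to express in code. The sketch below uses two hypothetical helpers: one computes how many padding bytes a header needs, the other encodes the Hdr Ext Len field, which counts 8-byte units and does not include the first 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of padding needed to round len up to the next multiple of 8. */
size_t ext_hdr_padding(size_t len)
{
    return (8 - (len % 8)) % 8;
}

/*
 * Encode the Hdr Ext Len field for a header of total_len bytes.
 * The field is 8 bits wide, so the largest header is (255 + 1) * 8 bytes.
 */
uint8_t ext_hdr_len_field(size_t total_len)
{
    assert(total_len >= 8 && total_len % 8 == 0 && total_len <= 2048);
    return (uint8_t)(total_len / 8 - 1);
}
```

A header that would otherwise be 13 bytes long needs 3 bytes of padding to reach 16, and a 16-byte header is encoded with Hdr Ext Len = 1.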

RFC 2460 provides a "recommended order." While not mandatory, it's best for everyone to follow it, or things will get messy:

IPv6 Header
Hop-by-Hop Options (this must be first, immediately after the IPv6 header)
Destination Options (first occurrence, read by the waypoints listed in the Routing Header)
Routing Header
Fragment Header
Authentication Header
Encapsulating Security Payload Header
Destination Options (second occurrence, for the destination host to read)
Upper-layer header (TCP/UDP/etc)
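As an illustration, the recommended order can be checked with a small ranking function. This is only a sketch: follows_recommended_order() is a hypothetical helper, not something the kernel provides, and it does not verify that a first Destination Options header is actually followed by a Routing Header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Position of a header type in the recommended order; -1 = upper layer. */
static int order_rank(uint8_t type, int current_rank)
{
    switch (type) {
    case 0:  return 1;                         /* Hop-by-Hop Options */
    case 60: return current_rank < 2 ? 2 : 7;  /* Destination Options */
    case 43: return 3;                         /* Routing Header */
    case 44: return 4;                         /* Fragment Header */
    case 51: return 5;                         /* Authentication Header */
    case 50: return 6;                         /* ESP */
    default: return -1;
    }
}

/*
 * types[] lists the Next Header values of a chain in packet order,
 * ending with the upper-layer protocol. Returns true when the chain
 * follows the recommended order with no header repeated too often.
 */
bool follows_recommended_order(const uint8_t *types, size_t n)
{
    assert(types != NULL || n == 0);
    int rank = 0;

    for (size_t i = 0; i < n; i++) {
        int r = order_rank(types[i], rank);
        if (r < 0)
            return i == n - 1;     /* upper layer must come last */
        if (r <= rank)
            return false;          /* out of order, or repeated */
        rank = r;
    }
    return true;
}
```

The chain {0, 60, 43, 44, 60, 6} passes, while {44, 43, 17} fails because Routing appears after Fragment.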

When the Chain Breaks: Handling Unknown Protocols

What if, while walking the chain, we suddenly encounter an unrecognized nexthdr value?

For example, a certain kernel version doesn't yet support a new protocol, or a malicious packet deliberately stuffs in a weird number.

We can't just silently drop it, nor can we force our way through. The correct approach is to discard the packet and file a complaint with the sender.

The kernel calls icmpv6_param_prob() to send back an ICMPv6 Parameter Problem message, with the error code set to ICMPV6_UNK_NEXTHDR (unknown Next Header).

Analogy time: Imagine receiving a package. You open a box inside, and it says "Please forward to Department XX," but you've never heard of that department. At this point, you can only return the package to the sender with a note: "We don't have that department here." That's exactly what ICMPv6 does.
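To see what that "note" looks like on the wire, here is a sketch that fills in the first 8 bytes of an ICMPv6 Parameter Problem message. The type and code values come from the ICMPv6 specification; build_param_prob() is a hypothetical helper, the checksum is left at zero, and the copy of the offending packet that normally follows is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ICMPV6_PARAMPROB   4  /* ICMPv6 type: Parameter Problem */
#define ICMPV6_UNK_NEXTHDR 1  /* code: unrecognized Next Header type */

/*
 * Hypothetical helper: write the fixed part of a Parameter Problem
 * message. pointer is the byte offset, inside the offending packet,
 * of the Next Header field that carried the unknown value.
 */
void build_param_prob(uint8_t out[8], uint32_t pointer)
{
    assert(out != NULL);
    out[0] = ICMPV6_PARAMPROB;
    out[1] = ICMPV6_UNK_NEXTHDR;
    out[2] = 0;                        /* checksum: not computed in this sketch */
    out[3] = 0;
    out[4] = (uint8_t)(pointer >> 24); /* Pointer field, network byte order */
    out[5] = (uint8_t)(pointer >> 16);
    out[6] = (uint8_t)(pointer >> 8);
    out[7] = (uint8_t)pointer;
}
```

If the unknown value sits in the Next Header field of the first extension header, which starts right after the 40-byte IPv6 header, the pointer would be 40.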

Diving into the Four Core Extension Headers

The Linux kernel defines all extension header types using constants (see Table 8-2 at the end of this book). Each extension header (except Hop-by-Hop) registers its own processing callback via inet6_add_protocol().
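The registration idea can be sketched in userspace as a table of callbacks indexed by Next Header value. This is a simplified model, not the kernel's implementation (the real table is concurrency-safe); add_protocol(), deliver(), and count_bytes_handler() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*proto_handler)(const uint8_t *data, size_t len);

static proto_handler handlers[256];   /* one slot per Next Header value */

/* Register a handler; returns -1 if the slot is already taken. */
int add_protocol(proto_handler h, uint8_t nexthdr)
{
    assert(h != NULL);
    if (handlers[nexthdr] != NULL)
        return -1;
    handlers[nexthdr] = h;
    return 0;
}

/* Hand data to the registered handler, or return -1 if there is none. */
int deliver(uint8_t nexthdr, const uint8_t *data, size_t len)
{
    proto_handler h = handlers[nexthdr];
    return h != NULL ? h(data, len) : -1;
}

/* Example handler: ignores the data and reports how many bytes it saw. */
int count_bytes_handler(const uint8_t *data, size_t len)
{
    (void)data;
    return (int)len;
}
```

Registering count_bytes_handler for value 44 succeeds once and fails on a second attempt, and deliver() returns -1 for any value with no handler, which is the situation where the kernel would complain via ICMPv6.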

Let's talk about the most commonly used ones.

1. Hop-by-Hop Options

This is the only privileged class.

It must sit immediately after the IPv6 header, and it is the only one that forces every router along the path to process it. Because of this, it must be used with extreme caution—once abused, routing efficiency across the entire internet would be dragged down.

Its privileged status means it can't go through the normal registration process. In the kernel, it has a dedicated ipv6_parse_hopopts() method for handling, which is called directly within ipv6_rcv().

It contains a set of options in TLV (Type-Length-Value) format. Here are a few typical examples:

Router Alert (IPV6_TLV_ROUTERALERT): Tells the router, "Hey, you need to take a look at this packet." Primarily used for RSVP (Resource Reservation) or multicast packets.
Jumbo (IPV6_TLV_JUMBO): The Payload Length field in the IPv6 header is only 16 bits wide, so a normal payload tops out at 65,535 bytes. To transmit "Jumbograms" larger than that, we need this option, which carries a 32-bit length of its own.
Pad1 / PadN: "Scrap paper" used purely for padding to meet the 8-byte alignment. Pad1 takes up 1 byte, and PadN takes up N bytes.
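The TLV layout above can be walked with a short loop. The sketch below scans the options area of a Hop-by-Hop header (the bytes after the 2-byte Next Header / Hdr Ext Len prefix). The option type numbers are the standard ones; scan_tlv_options() and the TLV_* names are hypothetical, and the sketch does not interpret option values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLV_PAD1        0    /* single byte, no length field */
#define TLV_PADN        1
#define TLV_ROUTERALERT 5
#define TLV_JUMBO       194

/*
 * Returns the number of non-padding options found, or -1 if an option
 * runs past the end of the area. *router_alert is set to 1 when a
 * Router Alert option is present.
 */
int scan_tlv_options(const uint8_t *opts, size_t len, int *router_alert)
{
    assert(opts != NULL && router_alert != NULL);
    size_t off = 0;
    int count = 0;

    *router_alert = 0;
    while (off < len) {
        uint8_t type = opts[off];

        if (type == TLV_PAD1) {        /* Pad1 is the only 1-byte option */
            off += 1;
            continue;
        }
        if (len - off < 2)
            return -1;
        size_t optlen = (size_t)opts[off + 1] + 2;  /* type + length + value */
        if (optlen > len - off)
            return -1;
        if (type == TLV_ROUTERALERT)
            *router_alert = 1;
        if (type != TLV_PADN)
            count++;
        off += optlen;
    }
    return count;
}
```

For the six bytes {5, 2, 0, 0, 1, 0} (a Router Alert followed by a 2-byte PadN), the function returns 1 and sets the router alert flag.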

2. Routing Header

This is the successor to IPv4 Source Routing.

If we want a packet to "pass through Router X, then come to me," we use this header. It allows the sender to specify one or more intermediate addresses that must be traversed.

Echoes of history: In the IPv4 era, this feature was blocked outright by many networks due to security risks (it could be used for IP spoofing attacks). IPv6 inherited the problem: the original Type 0 Routing Header was deprecated by RFC 5095 for security reasons, and many strict network operators still drop packets containing a Routing Header by default.

3. Fragment Header

This is the core of the IPv6 fragmentation mechanism.

In IPv4, if an intermediate router found that the interface MTU was too small, it could fragment the packet. This was the "fallback plan when Path MTU Discovery (PMTUD) fails."

But in IPv6, this fallback plan was removed. IPv6 dictates: intermediate routers are strictly forbidden from fragmenting. Fragmentation can only happen at the source host.

This means if a source host sends a 5000-byte packet, and a link along the path has an MTU of only 1500, the router in front of that link won't fragment it. Instead, it drops the packet and sends an ICMPv6 Packet Too Big message back to the source host, reporting the MTU of the link that was too small. Upon receiving this, the source host has no choice but to shrink the packet size (or fragment at the source) and retransmit.

This is IPv6's PMTUD mechanism—a "tough" negotiation.
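The source host's side of this negotiation can be sketched in a few lines. Both helpers below are hypothetical and much simpler than the kernel's logic: update_pmtu() applies the MTU reported by a Packet Too Big message to a cached value, and fragment_count() estimates how many fragments the source would need, assuming no extension headers other than the Fragment header.

```c
#include <assert.h>
#include <stdint.h>

#define IPV6_MIN_MTU 1280u   /* every IPv6 link must support at least this */
#define IPV6_HDR_LEN 40u
#define FRAG_HDR_LEN 8u

/* Apply a reported MTU to the cached path MTU; the value only shrinks. */
uint32_t update_pmtu(uint32_t current, uint32_t reported)
{
    if (reported < IPV6_MIN_MTU)
        reported = IPV6_MIN_MTU;   /* never go below the IPv6 minimum */
    return reported < current ? reported : current;
}

/*
 * Number of fragments needed to carry payload_len bytes over a path
 * with the given MTU. Each fragment's data (except the last) must be
 * a multiple of 8 bytes, because offsets are counted in 8-byte units.
 */
uint32_t fragment_count(uint32_t payload_len, uint32_t mtu)
{
    assert(mtu >= IPV6_MIN_MTU);
    uint32_t per_frag = (mtu - IPV6_HDR_LEN - FRAG_HDR_LEN) & ~7u;
    return (payload_len + per_frag - 1) / per_frag;
}
```

For the 5000-byte packet above (40 bytes of header plus 4960 bytes of payload) and an MTU of 1500, each fragment carries at most 1448 bytes of data, so the source needs 4 fragments.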

The kernel implements fragmentation in the ip6_fragment() method, while reassembly happens in net/ipv6/reassembly.c.

Here is a kernel code snippet that registers the Fragment protocol handler, showing how it hooks into the chain:

static const struct inet6_protocol frag_protocol = {
    .handler     = ipv6_frag_rcv,
    .flags       = INET6_PROTO_NOPOLICY,
};

int __init ipv6_frag_init(void)
{
    int ret;
    // Register the handler for IPPROTO_FRAGMENT (44)
    ret = inet6_add_protocol(&frag_protocol, IPPROTO_FRAGMENT);
    // ...
}

(net/ipv6/reassembly.c)

When a packet's nexthdr is 44, the kernel knows: "Oh, a fragment header follows; I need to collect all the fragments before passing them up."
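To collect the fragments, the receiver first has to read three things out of each 8-byte Fragment header: the fragment's offset, the "more fragments" flag, and the identification value shared by all fragments of one packet. The sketch below decodes them from raw bytes; struct frag_info and parse_frag_hdr() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct frag_info {
    uint8_t  nexthdr;   /* header type of the fragmented payload */
    uint16_t offset;    /* offset of this fragment's data, in bytes */
    bool     more;      /* true if further fragments follow */
    uint32_t id;        /* identification shared by all fragments */
};

/* Decode an 8-byte Fragment header; returns -1 if the buffer is too short. */
int parse_frag_hdr(const uint8_t *hdr, size_t len, struct frag_info *out)
{
    assert(hdr != NULL && out != NULL);
    if (len < 8)
        return -1;

    /* Bytes 2-3: 13-bit offset (in 8-byte units), 2 reserved bits, M flag. */
    uint16_t field = (uint16_t)((hdr[2] << 8) | hdr[3]);

    out->nexthdr = hdr[0];
    out->offset  = field & 0xFFF8;    /* masking yields the byte offset directly */
    out->more    = (field & 0x0001) != 0;
    out->id      = ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
                   ((uint32_t)hdr[6] << 8)  |  (uint32_t)hdr[7];
    return 0;
}
```

A header whose bytes 2-3 are 0x05 0xA9 decodes to a byte offset of 1448 with the "more fragments" flag set.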

4. Destination Options

This is the only extension header allowed to appear twice. Why? Because it has two completely different use cases:

Appearing before the Routing Header: At this point, its information is meant for those "transit routers" specified in the Routing Header.
Appearing after the Routing Header (or when there is no Routing Header): At this point, its information is meant for the final destination host.
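The two cases can be told apart purely by position. As a rough sketch, the hypothetical classify_dstopts() below takes the list of Next Header values in a chain and the index of a Destination Options header, and reports who is meant to read it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum dstopts_audience {
    DSTOPTS_INTERMEDIATE,   /* waypoints listed in the Routing Header */
    DSTOPTS_FINAL           /* the final destination host */
};

/*
 * types[] holds the Next Header values of the chain in packet order;
 * idx must point at a Destination Options entry (value 60).
 */
enum dstopts_audience classify_dstopts(const uint8_t *types, size_t n,
                                       size_t idx)
{
    assert(types != NULL && idx < n && types[idx] == 60);

    /* A Routing Header (43) later in the chain means we are before it. */
    for (size_t i = idx + 1; i < n; i++) {
        if (types[i] == 43)
            return DSTOPTS_INTERMEDIATE;
    }
    return DSTOPTS_FINAL;
}
```

In the chain {0, 60, 43, 60, 6}, the header at index 1 is for the waypoints and the one at index 3 is for the final destination.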

IPv6 Initialization: Starting from the Ethernet Type

We've covered so many structures; now let's see how this system boots up in the kernel.

Everything starts with inet6_init(). This function acts as the commander-in-chief of the IPv6 subsystem. It initializes procfs, registers the TCPv6/UDPv6 protocol handlers, and starts the Neighbor Discovery and routing subsystems.

But the most critical step is telling the network core: "Hand over any frame with an Ethernet type of 0x86DD to me."

This is accomplished via dev_add_pack(), almost exactly like the IPv4 approach:

static struct packet_type ipv6_packet_type __read_mostly = {
    .type = cpu_to_be16(ETH_P_IPV6), // 0x86DD
    .func = ipv6_rcv,                // receive callback
};

static int __init ipv6_packet_init(void)
{
    dev_add_pack(&ipv6_packet_type);
    return 0;
}

(net/ipv6/af_inet6.c)

From this point on, whenever the NIC receives an Ethernet frame whose Type field is 0x86DD, the kernel jumps directly to the ipv6_rcv() function.
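The test the network core applies is simple enough to reproduce in userspace. The sketch below checks the Type field of a raw, untagged Ethernet frame; frame_is_ipv6() is a hypothetical helper, and frames carrying a VLAN tag are deliberately not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN   14        /* destination MAC + source MAC + type */
#define ETH_P_IPV6 0x86DD

/* True if an untagged Ethernet frame carries an IPv6 packet. */
bool frame_is_ipv6(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < ETH_HLEN)
        return false;

    /* The Type field sits in bytes 12-13, in network byte order. */
    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    return ethertype == ETH_P_IPV6;
}
```

A frame with bytes 12-13 set to 0x86 0xDD matches; an IPv4 frame (0x08 0x00) does not.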

ipv6_rcv() is the entry point for the entire IPv6 receive path. Here, the kernel sanity-checks the header (the version number and the packet length; unlike IPv4, the IPv6 header carries no checksum of its own). From there, the receive path follows the chain linked by nexthdr that we discussed earlier, unwrapping the packet step by step and ultimately delivering the data to the upper-layer protocols.
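The first checks on an incoming packet might look like the sketch below. It is a loose userspace approximation, not the kernel's code: ipv6_header_sane() is a hypothetical name, and a Payload Length of zero is simply accepted here on the assumption that it would be resolved later by Hop-by-Hop (Jumbo) processing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40u

/* Basic sanity checks on a buffer that should start with an IPv6 header. */
bool ipv6_header_sane(const uint8_t *pkt, size_t len)
{
    assert(pkt != NULL);
    if (len < IPV6_HDR_LEN)
        return false;              /* not even a full fixed header */
    if ((pkt[0] >> 4) != 6)
        return false;              /* version nibble must be 6 */

    /* Payload Length: bytes 4-5, network byte order, excludes the header. */
    uint16_t payload_len = (uint16_t)((pkt[4] << 8) | pkt[5]);
    if (payload_len != 0 && (size_t)payload_len + IPV6_HDR_LEN > len)
        return false;              /* header claims more data than we have */
    return true;
}
```

A 48-byte buffer whose header declares an 8-byte payload passes; the same buffer declaring 16 bytes, or starting with version 4, is rejected.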

Looking Ahead

Up to now, all the problems we've tackled (fragmentation, routing) actually assume a premise: your address is already configured.

But IPv6 addresses are 128 bits long—nobody wants to type them in by hand. How do they end up on your machine? Is it DHCP? Or some kind of magic?

In the next section, we'll dive into IPv6's most representative feature: Autoconfiguration. We'll discover that even without a DHCP server, Linux can conjure up a globally unique address out of thin air.
