4.3 Receiving IPv4 Multicast Packets

At the end of the previous section, we mentioned that once ip_rcv_finish() completes route lookup and option processing, a packet's fate generally splits into three paths: local delivery, forwarding, or—multicast.

For ordinary unicast, the logic is straightforward: either the packet is for me, or I'm forwarding it for someone else. But multicast is different. The fate of a multicast packet is determined by a tangled web of "if you... or if..." conditions. In the kernel source, you'll find deeply nested logic just to figure out where a multicast packet should go.

Let's zoom in on the moment ip_rcv_finish() calls ip_route_input_noref(). At this point, the kernel already knows the destination address is a multicast address (224.0.0.0/4), but this raises a new question: Am I a member of this group? Or am I just a delivery driver (a multicast router) responsible for moving the packet along?

The Multicast Branch in Route Lookup

In the ip_route_input_noref() method, the kernel first uses ipv4_is_multicast(daddr) to confirm this is indeed a multicast packet. If it is, the very next step is to check whether the local device has joined this group.

This is done via ip_check_mc_rcu(). Its purpose is very direct: it takes the destination multicast address and checks the network interface's subscription list. If we find a match in the list, the variable our is set to 1.

Two conditions matter here, and they are not mutually exclusive:

I am a member (our == 1): I get to receive this packet.
I am a router (CONFIG_IP_MROUTE enabled + IN_DEV_MFORWARD set): Regardless of whether I've joined this group, as long as multicast forwarding is enabled, I must forward this packet to others.

As long as either condition is met, the kernel calls ip_route_input_mc() to build a routing cache entry (dst_entry) for this packet.

Look at this code (net/ipv4/route.c):

int ip_route_input_noref(struct sk_buff *skb, __be32 daddr, __be32 saddr,
                         u8 tos, struct net_device *dev)
{
        int res;
        rcu_read_lock();
        ...
        if (ipv4_is_multicast(daddr)) {
                struct in_device *in_dev = __in_dev_get_rcu(dev);
                if (in_dev) {
                        int our = ip_check_mc_rcu(in_dev, daddr, saddr,
                                                  ip_hdr(skb)->protocol);
                        if (our
#ifdef CONFIG_IP_MROUTE
                                ||
                            (!ipv4_is_local_multicast(daddr) &&
                             IN_DEV_MFORWARD(in_dev))
#endif
                           ) {
                                int res = ip_route_input_mc(skb, daddr, saddr,
                                                            tos, dev, our);
                                rcu_read_unlock();
                                return res;
                        }
                }
           ...
        }
        ...
}

Note the conditional check here. If it's not a local multicast address (!ipv4_is_local_multicast, i.e., not an address like 224.0.0.x that only roams the local subnet), and the device is configured for multicast forwarding (IN_DEV_MFORWARD), the kernel considers this a multicast packet that needs to be forwarded.
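To make the two address tests and the combined condition concrete, here is a minimal user-space sketch. The helper names is_multicast(), is_local_multicast(), and takes_mc_route() are hypothetical stand-ins for illustration, not kernel functions; the mforward argument stands in for IN_DEV_MFORWARD(in_dev), and CONFIG_IP_MROUTE is assumed to be enabled.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ipv4_is_multicast():
 * true for any address in 224.0.0.0/4 (address in network byte order). */
static inline bool is_multicast(uint32_t addr_be)
{
    return (addr_be & htonl(0xf0000000u)) == htonl(0xe0000000u);
}

/* Hypothetical stand-in for ipv4_is_local_multicast():
 * true for link-local groups in 224.0.0.0/24. */
static inline bool is_local_multicast(uint32_t addr_be)
{
    return (addr_be & htonl(0xffffff00u)) == htonl(0xe0000000u);
}

/* Mirrors the shape of the condition in ip_route_input_noref():
 * take the multicast route if we are a member, or if the group is
 * not link-local and multicast forwarding is on for this device. */
static inline bool takes_mc_route(uint32_t daddr_be, bool our, bool mforward)
{
    if (!is_multicast(daddr_be))
        return false;
    return our || (!is_local_multicast(daddr_be) && mforward);
}
```

For example, takes_mc_route(inet_addr("224.0.0.5"), false, true) is false: a router that has not joined a link-local group does not forward it, even with multicast forwarding switched on.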

Parting Ways: Deliver or Forward?

After entering ip_route_input_mc(), based on the lookup results just obtained, the kernel assigns the packet's "final destination" to the input callback function in dst_entry.

The logic goes like this (net/ipv4/route.c):

static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
                             u8 tos, struct net_device *dev, int our)
{
        struct rtable *rth;
        struct in_device *in_dev = __in_dev_get_rcu(dev);

        ...

        if (our) {
                rth->dst.input = ip_local_deliver;
                rth->rt_flags |= RTCF_LOCAL;
        }

#ifdef CONFIG_IP_MROUTE
        if (!ipv4_is_local_multicast(daddr) && IN_DEV_MFORWARD(in_dev))
                rth->dst.input = ip_mr_input;
#endif
        ...
}

There are two key branches here:

If our is true: The kernel sets dst.input to ip_local_deliver. This means the packet will be treated as data destined for the local machine, ultimately reaching the application listening on the multicast socket (such as your video conferencing software). At the same time, the flag RTCF_LOCAL is set.
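If multicast forwarding is enabled (IN_DEV_MFORWARD) and the address is not link-local: The kernel acts as a relay station and sets dst.input to ip_mr_input, short for IP multicast routing input. Because this assignment comes second, it overrides ip_local_deliver when both conditions hold. The packet then enters the multicast forwarding logic, and ip_mr_input() itself checks the RTCF_LOCAL flag to decide whether a copy should also be delivered locally.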
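The ordering of the two assignments is easy to miss, so here is a small user-space model of it. Everything in it is hypothetical scaffolding (struct demo_rtable, the demo_input enum, and the RTCF_LOCAL_DEMO bit are not kernel definitions); forward_ok stands for the combined "not link-local and IN_DEV_MFORWARD" test.

```c
#include <stdbool.h>

/* Placeholder flag bit for the demo; not the kernel's RTCF_LOCAL value. */
#define RTCF_LOCAL_DEMO 0x1u

/* Which handler ends up in dst.input in this model. */
enum demo_input {
    INPUT_UNCHANGED,      /* neither branch touched dst.input */
    INPUT_LOCAL_DELIVER,  /* models ip_local_deliver */
    INPUT_MR_INPUT        /* models ip_mr_input */
};

struct demo_rtable {
    enum demo_input input;
    unsigned int flags;
};

/* Same order as in ip_route_input_mc(): the membership branch runs
 * first, then the forwarding branch may overwrite the handler. */
static inline struct demo_rtable pick_input(bool our, bool forward_ok)
{
    struct demo_rtable rt = { INPUT_UNCHANGED, 0 };

    if (our) {
        rt.input = INPUT_LOCAL_DELIVER;
        rt.flags |= RTCF_LOCAL_DEMO;
    }
    if (forward_ok)
        rt.input = INPUT_MR_INPUT;

    return rt;
}
```

With pick_input(true, true), the handler is INPUT_MR_INPUT but the local flag is still set. That surviving flag is how a machine that is both a group member and a multicast router can forward the packet and still hand a copy to local sockets.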

⚠️ Note: The Read-Only Trap of mc_forwarding

There is a detail here that can easily drive newcomers crazy.

The IN_DEV_MFORWARD(in_dev) macro reads the mc_forwarding sysctl; it is true only when both /proc/sys/net/ipv4/conf/all/mc_forwarding and the interface's own mc_forwarding entry are set. But you'll find that you can't just run echo 1 > ... to set it the way you would enable unicast forwarding.

It is read-only.

This is an intentional kernel design. The kernel doesn't allow you to flip this switch manually because its state must be managed by a specific Multicast Routing Daemon. The most common implementation is pimd (PIM-SM v2 daemon).

When pimd starts and is ready to work, it notifies the kernel to set mc_forwarding to 1.
When pimd stops, the kernel automatically sets it back to 0.

If you're interested in this underlying mechanism, you can look at the pimd source code to see how it interacts with the kernel (https://github.com/troglobit/pimd/). It is responsible for maintaining complex multicast routing states, while the kernel simply follows orders and moves packets.
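For a rough idea of what "notifies the kernel" looks like from user space, here is a minimal sketch of the handshake a multicast routing daemon performs: it opens a raw IGMP socket and issues the MRT_INIT socket option, then MRT_DONE when it is finished. The function name mrt_init_roundtrip() is made up for this example; it is not how pimd structures its code, and it needs root (CAP_NET_ADMIN) and a kernel built with CONFIG_IP_MROUTE to succeed.

```c
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/mroute.h>   /* MRT_INIT, MRT_DONE */
#include <unistd.h>

/* Hypothetical helper: register as the multicast routing socket,
 * then immediately unregister. Returns 0 on success, or the errno
 * value of the step that failed (for example EPERM without root). */
static inline int mrt_init_roundtrip(void)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);

    if (fd < 0)
        return errno;

    if (setsockopt(fd, IPPROTO_IP, MRT_INIT, &one, sizeof(one)) < 0) {
        int err = errno;
        close(fd);
        return err;
    }

    /* While this socket stays registered, conf/all/mc_forwarding
     * reads as enabled. A real daemon would now add virtual
     * interfaces and MFC entries instead of quitting right away. */

    setsockopt(fd, IPPROTO_IP, MRT_DONE, NULL, 0);
    close(fd);
    return 0;
}
```

Run without privileges, the call simply reports an error code, which is itself a handy demonstration that this switch is not something an ordinary user can flip.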

Multicast Forwarding: The Behind-the-Scenes Work of the MFC

If the packet takes the ip_mr_input() path, the story isn't over yet. The multicast layer maintains a table called the Multicast Forwarding Cache (MFC).

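You can think of the MFC as the "routing table" for multicast forwarding, but it's more complex than a normal routing table. A normal routing table maps "destination -> egress," whereas an MFC entry is keyed on the pair (source address, group address) and yields the expected incoming interface plus a set of outgoing interfaces.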

Although we won't dive deep into the MFC details until Chapter 6 of this book, the ip_mr_input() logic here is simple: it takes the information from the packet header and looks up the MFC table.

Hit: This is a known multicast flow, and the kernel knows which interfaces to forward it to.
Miss: The kernel might pass this packet up to pimd, letting it decide how to establish a forwarding path.

If there is a hit in the MFC, the kernel calls ip_mr_forward(), which in turn calls ipmr_queue_xmit().
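The hit-or-miss lookup and the fan-out that follows can be modeled in a few lines. This is a deliberately simplified toy, not the kernel's data structure: struct demo_mfc_entry, demo_mfc_find(), and demo_mfc_out_vifs() are hypothetical names, the table is a plain array rather than a hash, and the per-interface TTL thresholds are assumed to work as "forward only if the packet's TTL is greater than the threshold, with 255 meaning never".

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAXVIFS 4

/* Toy MFC entry: keyed on (origin, group). */
struct demo_mfc_entry {
    uint32_t origin;              /* source address */
    uint32_t group;               /* multicast group address */
    int      parent;              /* expected incoming interface index */
    uint8_t  ttls[DEMO_MAXVIFS];  /* per-interface TTL threshold, 255 = never */
};

/* Linear lookup; returns NULL on a miss. */
static inline const struct demo_mfc_entry *
demo_mfc_find(const struct demo_mfc_entry *tbl, size_t n,
              uint32_t origin, uint32_t group)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].origin == origin && tbl[i].group == group)
            return &tbl[i];
    return NULL;
}

/* Returns a bitmask of interfaces the packet would be copied to.
 * A miss, or arrival on the wrong interface, forwards nowhere. */
static inline unsigned int
demo_mfc_out_vifs(const struct demo_mfc_entry *e, int in_vif, uint8_t ttl)
{
    unsigned int mask = 0;

    if (e == NULL || in_vif != e->parent)
        return 0;

    for (int v = 0; v < DEMO_MAXVIFS; v++)
        if (ttl > e->ttls[v])
            mask |= 1u << v;

    return mask;
}
```

In this model a miss is just a NULL return; in the real kernel that is the point where the packet may be reported to the routing daemon instead of being forwarded.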

The flow after this is strikingly similar to unicast forwarding, which we discuss later.

Recalculating TTL: A Required Course Before Forwarding

Multicast forwarding is still forwarding, so the TTL (Time To Live) must be decremented. Decrementing the TTL at every hop is the basic IP mechanism that keeps a looping packet from circulating forever.

Inside ipmr_queue_xmit(), you'll see a familiar face:

static void ipmr_queue_xmit(struct net *net, struct mr_table *mrt,
                             struct sk_buff *skb, struct mfc_cache *c, int vifi)
{
       ...

       ip_decrease_ttl(ip_hdr(skb));
       ...
       NF_HOOK(NFPROTO_IPV4, NF_INET_FORWARD, skb, skb->dev, dev,
                       ipmr_forward_finish);
       return;
}

That's right, it's ip_decrease_ttl().

It does two things:

Decrements the TTL field in the IPv4 header by 1.
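Updates the IPv4 header checksum to match. Rather than recomputing the checksum over the whole header, it adjusts the existing value for the changed TTL byte.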
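Why can the checksum be patched instead of recomputed? The TTL is the high byte of one 16-bit header word, so lowering it by 1 lowers the one's-complement sum by 0x0100, and the stored checksum (the complement of that sum) must rise by the same amount. The sketch below is a user-space illustration modeled on that idea; demo_ip_checksum() and demo_decrease_ttl() are hypothetical names that operate on a raw 20-byte header buffer rather than on the kernel's struct iphdr.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One's-complement checksum over an even-length header.
 * Over a header with a correct checksum field this returns 0. */
static inline uint16_t demo_ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2) {
        uint16_t word;
        memcpy(&word, hdr + i, sizeof(word));
        sum += word;
    }
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Decrement the TTL (byte 8) and patch the checksum (bytes 10-11)
 * incrementally: add 0x0100 in network order, folding the carry. */
static inline uint8_t demo_decrease_ttl(uint8_t *hdr)
{
    uint16_t stored;
    uint32_t check;

    memcpy(&stored, hdr + 10, sizeof(stored));
    check = stored;
    check += htons(0x0100);
    stored = (uint16_t)(check + (check >= 0xffffu));
    memcpy(hdr + 10, &stored, sizeof(stored));

    return --hdr[8];
}
```

A quick way to convince yourself: build a header, fill in its checksum with demo_ip_checksum(), call demo_decrease_ttl(), and run demo_ip_checksum() over the header again. The result is still 0, meaning the patched header validates without a full recomputation.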

Here we can also see that the Netfilter hook NF_INET_FORWARD is triggered. This means your iptables FORWARD rule chain will also process multicast forwarded packets.

Finally, ipmr_forward_finish() is called. This function is very short and mainly does three things:

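Updates statistical counters: it increments IPSTATS_MIB_OUTFORWDATAGRAMS and adds the packet length to IPSTATS_MIB_OUTOCTETS (reported as ForwDatagrams in /proc/net/snmp and OutOctets in /proc/net/netstat).
If there are IP options, calls ip_forward_options() to handle them.
Calls dst_output(skb) to hand the packet to the output path, which eventually queues it for the outgoing device.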

static inline int ipmr_forward_finish(struct sk_buff *skb)
{
        struct ip_options *opt = &(IPCB(skb)->opt);

        IP_INC_STATS_BH(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTFORWDATAGRAMS);
        IP_ADD_STATS_BH(dev_net(skb_dst(skb)->dev), IPSTATS_MIB_OUTOCTETS, skb->len);

        if (unlikely(opt->optlen))
                ip_forward_options(skb);

        return dst_output(skb);
}

With this, a multicast packet's journey through the kernel comes to an end. It goes through the baptism of ip_rcv, parts ways in the routing system—if you are a receiver, it arrives at the socket; if you are a forwarder, it modifies the TTL and heads to the next network.

In the next section, we will turn to an ancient and sometimes troublesome feature in the IPv4 protocol: IP Options. Although rare nowadays, they still pop up in certain route tracing or timestamp recording scenarios, and handling them requires a special code path.
