3.3 Quick Reference and Practical Supplements

In this chapter, we broke down the internal workings of ICMPv4 and ICMPv6, from initialization flows to packet transmission and reception, covering a significant amount of kernel code. Now it's time to consolidate these scattered details into a quick reference, and along the way, cover two "corners" that are easily overlooked in engineering practice.

This section might look like an "appendix," but I strongly recommend not treating it merely as a dictionary. The procfs parameters and iptables REJECT target mentioned later directly determine whether your production firewall's "rejection" behavior is a polite wave or an angry silence.


📚 Core Method Quick Reference

We repeatedly encountered these functions during our code walkthrough. They are the "limbs" of the ICMP subsystem.

Reception Handling

int icmp_rcv(struct sk_buff *skb);

Location: net/ipv4/icmp.c

This is the main entry function for ICMPv4. When an IP packet is stripped of its IP header and the protocol number is found to be IPPROTO_ICMP (1), the kernel jumps directly to this function. It handles checksum verification, packet dispatch, and triggers the generation of corresponding ICMP replies (such as Echo Reply).
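To make the checksum verification step concrete, here is a minimal user-space sketch of the Internet checksum (RFC 1071) that ICMPv4 uses. The kernel relies on its own optimized checksum helpers; the names `inet_checksum` and `icmp_checksum_ok` are illustrative assumptions, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of big-endian 16-bit words, folded and inverted.
 * The result is the value to store (big-endian) in the checksum field.
 * Intended for packet-sized buffers (up to 64 KiB). */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                      /* odd trailing byte, padded with zero */
        sum += (uint32_t)data[i] << 8;
    while (sum >> 16)                 /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A message whose checksum field is already filled in sums to zero. */
int icmp_checksum_ok(const uint8_t *icmp_msg, size_t len)
{
    return inet_checksum(icmp_msg, len) == 0;
}
```

A sender zeroes the checksum field, computes `inet_checksum()` over the whole ICMP message, and stores the result. A receiver runs the same sum over the received bytes and accepts the message only if the result is zero.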

Transmission Handling

void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);

Location: net/ipv4/icmp.c

The core function for the kernel to send ICMPv4 error messages (such as Destination Unreachable) outward.

  • skb_in: The original packet (provoking SKB) that triggered the error. The kernel extracts the IP header from this packet for diagnostics and embeds it into the data portion of the newly generated ICMP error packet.
  • type / code: The type and code of the ICMP message.
  • info: Typically used to pass the MTU (in the Fragmentation Needed scenario) or a pointer to the offending field.
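The role of info is easier to see on the wire. The following user-space sketch (the helper name `icmp_error_header` is hypothetical) copies a network-byte-order info value into bytes 4 to 7 of the ICMP header, the field that icmp_send() fills from its fourth argument. For Fragmentation Needed the MTU ends up in the low 16 bits; for Parameter Problem the caller is assumed to shift the pointer into the top byte.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl() */

/* Fill the 8-byte ICMPv4 header of an error message.
 * info_be is already in network byte order, like the __be32 info
 * argument of icmp_send(). The checksum is left as zero here. */
void icmp_error_header(uint8_t out[8], uint8_t type, uint8_t code,
                       uint32_t info_be)
{
    out[0] = type;
    out[1] = code;
    out[2] = 0;                      /* checksum, computed later */
    out[3] = 0;
    memcpy(out + 4, &info_be, 4);    /* bytes 4..7 carry info verbatim */
}
```

Calling `icmp_error_header(h, 3, 4, htonl(1400))` leaves bytes 4 and 5 zero and puts the MTU 0x0578 in bytes 6 and 7, which is the layout PMTU Discovery expects.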

void icmpv6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info);

Location: net/ipv6/icmp.c

The ICMPv6 version of the send function. Although the name is similar, its internal implementation calls ip6_append_data and ip6_push_pending_frames, following the IPv6 transmission path.

void icmpv6_param_prob(struct sk_buff *skb, u8 code, int pos);

Location: net/ipv6/icmp.c

This is a wrapper function specifically for sending ICMPv6 Parameter Problem messages. Its purpose is to save you the trouble of manually filling in the ICMPV6_PARAMPROB type, and it automatically frees skb after transmission (because the parameter issue is severe enough that the packet cannot be processed and must be discarded).

Utility Helpers

struct icmp6hdr *icmp6_hdr(const struct sk_buff *skb);

Location: include/linux/icmpv6.h

A simple inline helper function used to safely obtain the icmp6hdr pointer from skb. In IPv6 extension header processing, the SKB pointers might keep moving, and using this helper instead of hand-written pointer arithmetic prevents errors.


📋 Key Data Table: Error Code Dictionary

ICMP "error" messages are primarily distinguished by Type (major category) and Code (minor category). The following tables are the "dictionary" you need to constantly refer to during packet capture analysis.

1. ICMPv4 Destination Unreachable

This is the most common category of errors. When a router cannot find a route, or a host port is not open, you will receive these.

| Code | Kernel Symbol | Human-Readable Meaning |
|------|---------------|------------------------|
| 0 | ICMP_NET_UNREACH | Network unreachable (no such subnet in the routing table). |
| 1 | ICMP_HOST_UNREACH | Host unreachable (ARP resolution failed, or the host is down). |
| 2 | ICMP_PROT_UNREACH | Protocol unreachable (the target kernel has no handler registered for the packet's transport protocol, e.g., SCTP when the sctp module is not loaded). |
| 3 | ICMP_PORT_UNREACH | Port unreachable (most common, sending UDP to a port that isn't listening). |
| 4 | ICMP_FRAG_NEEDED | Fragmentation needed, but the DF flag is set (the key packet for PMTU Discovery). |
| 5 | ICMP_SR_FAILED | Source route failed (rarely used, indicating an issue with the specified source route path). |
| 9 | ICMP_NET_ANO | Network administratively prohibited (the firewall decided you shouldn't go here). |
| 10 | ICMP_HOST_ANO | Host administratively prohibited (the firewall considers this host protected). |
| 13 | ICMP_PKT_FILTERED | Packet filtered (usually generated by a firewall). |

(Note: Some rarely seen TOS-related and historical codes are omitted.)
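When reading packet captures, a small lookup helper saves trips back to the dictionary. This is a sketch only: it covers just the Destination Unreachable codes listed in this section, and the function name `icmp_unreach_symbol` is a hypothetical choice.

```c
#include <stddef.h>

struct unreach_entry {
    int code;
    const char *symbol;
};

/* Destination Unreachable (Type 3) codes from the table in this section. */
static const struct unreach_entry unreach_table[] = {
    { 0,  "ICMP_NET_UNREACH"  },
    { 1,  "ICMP_HOST_UNREACH" },
    { 2,  "ICMP_PROT_UNREACH" },
    { 3,  "ICMP_PORT_UNREACH" },
    { 4,  "ICMP_FRAG_NEEDED"  },
    { 5,  "ICMP_SR_FAILED"    },
    { 9,  "ICMP_NET_ANO"      },
    { 10, "ICMP_HOST_ANO"     },
    { 13, "ICMP_PKT_FILTERED" },
};

/* Returns the kernel symbol for a code, or NULL if it is not in the table. */
const char *icmp_unreach_symbol(int code)
{
    size_t i;

    for (i = 0; i < sizeof unreach_table / sizeof unreach_table[0]; i++)
        if (unreach_table[i].code == code)
            return unreach_table[i].symbol;
    return NULL;
}
```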

2. ICMPv4 Redirect

Suppose Host A sends traffic for Host C through Router R. If R sees that a better first hop for this destination exists on A's own link (for example, another router B on the same subnet), it still forwards the packet but also sends A a Redirect naming that better next hop, so that A can send later packets there directly.

| Code | Kernel Symbol | Meaning |
|------|---------------|---------|
| 0 | ICMP_REDIR_NET | Redirect Network. |
| 1 | ICMP_REDIR_HOST | Redirect Host. |
| 2 | ICMP_REDIR_NETTOS | Redirect Network (based on TOS). |
| 3 | ICMP_REDIR_HOSTTOS | Redirect Host (based on TOS). |

3. ICMPv4 Time Exceeded

| Code | Kernel Symbol | Meaning |
|------|---------------|---------|
| 0 | ICMP_EXC_TTL | TTL Exceeded (the core of how Traceroute works). |
| 1 | ICMP_EXC_FRAGTIME | Fragment Reassembly Time Exceeded (took too long to collect all fragments, discarded). |

4. ICMPv6 Destination Unreachable

IPv6's definitions are slightly different. Note that in IPv6, ICMPV6_PKT_TOOBIG is an independent Type, not a Code.

| Code | Kernel Symbol | Meaning |
|------|---------------|---------|
| 0 | ICMPV6_NOROUTE | No Route to destination. |
| 1 | ICMPV6_ADM_PROHIBITED | Administratively Prohibited (forbidden by admin). |
| 3 | ICMPV6_ADDR_UNREACH | Address Unreachable (usually means neighbor unreachable, NDP failed). |
| 4 | ICMPV6_PORT_UNREACH | Port Unreachable. |

5. ICMPv6 Parameter Problem

| Code | Kernel Symbol | Meaning |
|------|---------------|---------|
| 0 | ICMPV6_HDR_FIELD | Erroneous Header Field (unrecognized field). |
| 1 | ICMPV6_UNK_NEXTHDR | Unrecognized Next Header (extension header chain is broken). |
| 2 | ICMPV6_UNK_OPTION | Unrecognized IPv6 Option. |

⚙️ procfs Interfaces: Controlling ICMP from User Space

Linux allows you to adjust ICMP behavior via /proc/sys without recompiling the kernel. These variables all belong to the netns_ipv4 structure and are isolated per network namespace.

Key Configuration Items

1. icmp_echo_ignore_all

  • Path: /proc/sys/net/ipv4/icmp_echo_ignore_all
  • Default: 0 (disabled)
  • Purpose: When set to 1, the kernel will ignore all ICMP Echo Requests (Ping requests).
    • Scenario: This is the first step to "stealth mode." Note that this only ignores Pings; it doesn't mean the host is unreachable.

2. icmp_echo_ignore_broadcasts

  • Path: /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
  • Default: 1 (enabled)
  • Purpose: When set to 1, ignores Pings or Timestamp requests sent to broadcast/multicast addresses.
    • Scenario: Must remain enabled. With it off, a single Echo Request sent to the broadcast address makes every host on the subnet reply at once, which is exactly the amplification a Smurf attack relies on and can easily cause a network storm.

3. icmp_ratelimit

  • Path: /proc/sys/net/ipv4/icmp_ratelimit
  • Default: 1 * HZ (i.e., 1 second, 1000ms)
  • Purpose: Limits the sending rate of specific ICMP error messages (defined by icmp_ratemask). The unit is milliseconds. 0 means no limit.
    • Warning: If set to 0, an attacker can forge massive amounts of traffic to trigger your ICMP error replies, turning you into an accomplice in a reflection/amplification attack (although ICMP packets are small, they are still enough to congest a link).

4. icmp_ratemask

  • Path: /proc/sys/net/ipv4/icmp_ratemask
  • Default: 0x1818 (binary 0001 1000 0001 1000)
  • Purpose: A bitmask that determines which ICMP Types need to be rate-limited. Each bit corresponds to an ICMP Type.
    • Bit n of the mask corresponds to ICMP Type n. The default 0x1818 sets bits 3, 4, 11, and 12, so it rate-limits Destination Unreachable (Type 3), Source Quench (Type 4), Time Exceeded (Type 11), and Parameter Problem (Type 12). In other words, the default targets error messages only.
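The bit arithmetic behind icmp_ratemask can be sketched in a few lines of user-space C. Both function names are hypothetical; the only assumption about the kernel is that bit n of the mask selects ICMP Type n.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the given ICMP type is selected for rate limiting by the mask. */
bool icmp_type_in_ratemask(uint32_t ratemask, unsigned int type)
{
    return type < 32 && ((ratemask >> type) & 1u) != 0;
}

/* Build a mask value from a list of ICMP types (types >= 32 are ignored). */
uint32_t icmp_ratemask_from_types(const unsigned int *types, size_t count)
{
    uint32_t mask = 0;
    size_t i;

    for (i = 0; i < count; i++)
        if (types[i] < 32)
            mask |= (uint32_t)1 << types[i];
    return mask;
}
```

Passing the types 3, 4, 11, and 12 to `icmp_ratemask_from_types()` yields 0x1818, the default value shown above. Echo Reply (Type 0) is not in that mask, so ping replies are not rate-limited by default.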

5. icmp_errors_use_inbound_ifaddr

  • Path: /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr
  • Default: 0
  • Purpose: Determines how the source IP address is selected when sending ICMP error messages.
    • 0 (default): Uses the primary address of the outbound interface.
    • 1: Uses the primary address of the inbound interface (i.e., the interface that received the triggering packet).
    • Gotcha: In multi-homed hosts or complex asymmetric routing scenarios, if you select the wrong source address, the error packet received by the peer might be discarded as a forged packet, or the peer's reply might get lost again. Usually, keeping the default is fine unless your network topology is very special.
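All of the items above are plain text files containing a single integer, so a monitoring tool can read them without any special API. A minimal sketch follows; the helper name `read_sysctl_long` is an assumption, not an existing library call.

```c
#include <stdio.h>

/* Read one integer from a procfs sysctl file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_sysctl_long(const char *path, long *value)
{
    FILE *f = fopen(path, "r");
    int parsed;

    if (f == NULL)
        return -1;
    parsed = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return parsed ? 0 : -1;
}
```

For example, `read_sysctl_long("/proc/sys/net/ipv4/icmp_ratelimit", &v)` reads the current rate limit for the calling process's network namespace. Writing a new value works the same way in reverse, but requires root privileges.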

🛡️ Active Rejection: Using iptables/ip6tables REJECT

There is an iron rule for firewalls: REJECT is better than DROP.

  • DROP: Your machine simply discards the packet, and the client will keep waiting until it times out. The user experience is poor, and the sender (legitimate user or attacker alike) cannot tell whether a firewall blocked the packet or the machine is down.
  • REJECT: Your machine politely sends back an ICMP error packet ("Host Unreachable" or "Port Unreachable"). The client immediately knows "this path is blocked" and stops trying.

However, the details of this mechanism differ slightly between IPv4 and IPv6.

IPv4 (iptables)

The ipt_REJECT module allows you to specify what type of ICMP error to return.

Most common example: Deny access and inform the peer "administratively prohibited."

iptables -A INPUT -j REJECT --reject-with icmp-host-prohibited

--reject-with optional parameters:

| Parameter | ICMP Message Sent | Meaning |
|-----------|-------------------|---------|
| icmp-net-unreachable | ICMP_NET_UNREACH | Network unreachable. |
| icmp-host-unreachable | ICMP_HOST_UNREACH | Host unreachable. |
| icmp-port-unreachable | ICMP_PORT_UNREACH | Port unreachable (this is the default behavior). |
| icmp-proto-unreachable | ICMP_PROT_UNREACH | Protocol unreachable. |
| icmp-net-prohibited | ICMP_NET_ANO | Network prohibited. |
| icmp-host-prohibited | ICMP_HOST_ANO | Host prohibited (commonly used). |
| icmp-admin-prohibited | ICMP_PKT_FILTERED | Administratively filtered (explicitly informs that it's a filtering action). |

  • Extra bonus: --reject-with tcp-reset. On a rule that matches only TCP (-p tcp), the kernel sends back a forged TCP RST instead of an ICMP error. The client's connection attempt then fails immediately instead of lingering in the SYN_SENT state until it times out.

IPv6 (ip6tables)

Under IPv6, ip6t_REJECT works similarly, but the available error types are fewer (because IPv6's error types are inherently streamlined).

Example: Reject ICMPv6 requests from a certain IPv6 subnet.

ip6tables -A INPUT -s 2001::/64 -p ICMPv6 -j REJECT --reject-with icmp6-adm-prohibited

--reject-with optional parameters:

| Parameter | ICMPv6 Message Sent |
|-----------|---------------------|
| no-route / icmp6-no-route | ICMPV6_NOROUTE |
| adm-prohibited / icmp6-adm-prohibited | ICMPV6_ADM_PROHIBITED |
| addr-unreach / icmp6-addr-unreachable | ICMPV6_ADDR_UNREACH |
| port-unreach / icmp6-port-unreachable | ICMPV6_PORT_UNREACH |

Chapter Echoes

ICMP is often misunderstood. Many people equate it with "Ping" and even consider it a security risk, wishing to block it entirely on the first line of their firewall.

But through the source code walkthrough in this chapter, you should have built a new understanding: ICMP is the maintenance crew of the IP protocol. Without ICMP error messages, IP routing becomes a one-way blind delivery: packets are sent out with no receiver, or sent down the wrong path, and the sender never knows. PMTU Discovery breaks (oversized packets with DF set are silently dropped, so connections stall on large transfers), redirects fail (causing paths to remain suboptimal), and connectivity testing becomes "blind men feeling an elephant."

Those intuitions we mentioned at the beginning of the chapter—that Ping just tests connectivity, and firewalls should block Ping—now seem too crude.

  • Ping tests more than just connectivity; it also exercises routing integrity and address resolution (ARP on IPv4, NDP in an IPv6 environment).
  • Firewalls shouldn't blindly DROP all ICMP, but rather apply fine-grained control (allow Type 3 Code 4 for MTU Discovery, allow Type 11 for Traceroute), and prioritize using the REJECT target over DROP.

By understanding ICMP's transmission and reception mechanisms, data structures, and the inet protocol registration flow, you've mastered the "stethoscope" for debugging network layer issues. When a network problem occurs, don't just stare at the TCP three-way handshake; shift your perspective down to L3 and ask yourself: Did ICMP report an error? What error did it report? The truth is often hidden in that error packet.

Next, we will formally enter the core of the IPv4 network layer—those core engines responsible for routing, fragmentation, and forwarding. Now that we've figured out the "control signals," it's time to see how the "data traffic" is scheduled and transferred.


Exercises

Exercise 1: Understanding

Question: During the Linux kernel initialization phase, the icmp_sk_init() function creates a kernel socket (net->ipv4.icmp_sk[i]) for each CPU. What is the protocol type (protocol) specified when creating these sockets? What is their primary purpose?

Answer and Analysis

Answer: The protocol type is IPPROTO_ICMP (value 1). These sockets are used by the kernel to send ICMP error messages (such as Destination Unreachable, Time Exceeded) or reply messages (such as Echo Reply). They are raw sockets used internally by the kernel network stack, not for direct user-space communication.

Analysis: Based on the code snippet of icmp_sk_init in the text: err = inet_ctl_sock_create(&sk, PF_INET, SOCK_RAW, IPPROTO_ICMP, net); It can be seen that the fourth parameter, IPPROTO_ICMP, specifies the protocol type (the third, SOCK_RAW, is the socket type). These sockets are described in the text as "kernel ICMP socket for sending ICMP messages" and are used in icmp_push_reply() to construct and send kernel-generated ICMP packets.

Exercise 2: Application

Question: Suppose your host, acting as a router, is forwarding a packet whose length (1500 bytes) exceeds the next hop's link MTU (1400 bytes), and the IP header has the DF (Don't Fragment) flag set. What kind of ICMP message will the kernel return to the sender? Which function in the kernel code is responsible for sending this message, and what does its info parameter represent?

Answer and Analysis

Answer: It returns an ICMP "Destination Unreachable" (Type 3) message with code "Fragmentation Needed" (Code 4). The sending function is icmp_send() (called within ip_forward). The info parameter (corresponding to the next hop's MTU value) will be set to 1400 (i.e., dst_mtu(&rt->dst)).

Analysis: This is a typical MTU Discovery scenario. Based on the code snippet of the ip_forward function in the text: icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(dst_mtu(&rt->dst))); When the packet length is greater than the MTU and the DF flag is set, the kernel triggers this logic. The fourth parameter info of the icmp_send function is used in this scenario to inform the peer of the required MTU size, thereby implementing Path MTU Discovery.
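The decision described in this exercise can be modeled in user space as follows. This is a simplified sketch, not the ip_forward() code: the struct and function names are hypothetical, and details such as GSO packets are deliberately ignored.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl() */

enum { TYPE_DEST_UNREACH = 3, CODE_FRAG_NEEDED = 4 };

struct fwd_verdict {
    bool     send_error;  /* true: drop the packet and report an error */
    uint8_t  type;
    uint8_t  code;
    uint32_t info_be;     /* next-hop MTU in network byte order */
};

/* Model of the forwarding check: too big for the next hop and DF set. */
struct fwd_verdict check_forward_mtu(uint32_t pkt_len, uint32_t next_hop_mtu,
                                     bool df_set)
{
    struct fwd_verdict v = { false, 0, 0, 0 };

    if (pkt_len > next_hop_mtu && df_set) {
        v.send_error = true;
        v.type = TYPE_DEST_UNREACH;
        v.code = CODE_FRAG_NEEDED;
        v.info_be = htonl(next_hop_mtu);
    }
    return v;
}
```

With the numbers from the question, `check_forward_mtu(1500, 1400, true)` reports Type 3, Code 4 with an info value of 1400. With DF cleared, no error is produced because the router is allowed to fragment the packet instead.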

Exercise 3: Thinking

Question: When the icmp_rcv() function processes a received ICMPv4 packet, if the checksum check fails, the kernel usually discards the packet. Please think about and analyze: why does the kernel only update the statistical counter and discard the packet when the checksum fails, instead of returning a negative error value (like -EINVAL)?

Answer and Analysis

Answer: Because if the protocol dispatch code in ip_local_deliver_finish() receives a negative return value from a handler, it treats the negated value as a protocol number and resubmits the packet to that protocol's handler. However, a checksum error in the ICMP packet itself indicates that the packet is corrupted, and there is no need for recovery or reprocessing. The only correct action is to discard it. Returning 0 indicates that the protocol stack has finished processing (even if the processing method was to discard), avoiding needless system overhead.

Analysis: As mentioned in the text: The icmp_rcv() method does not return an error in this case... when a protocol handler returns a negative error, another attempt to process the packet is performed, and it is not needed in this case. This is part of the robustness design of the network stack. ICMP is primarily a control message, and a checksum error means the content is untrustworthy and must be silently discarded to prevent operations based on erroneous packets. If an error were returned, it might trigger upper-layer retransmissions or other side effects, which is unsafe.
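The resubmission behavior can be illustrated with a toy user-space model. It is only a sketch of the control flow described above: the handler table, the two demo handlers, and the function `deliver` are all hypothetical names, and the real kernel path does considerably more work.

```c
#include <stddef.h>

#define MAX_PROTO 256

typedef int (*proto_handler_t)(void *pkt);

/* Demo handler: consumes (or drops) the packet and reports "done". */
int handler_consume(void *pkt)
{
    (void)pkt;
    return 0;
}

/* Demo handler: asks the dispatcher to retry as protocol 17 (UDP). */
int handler_resubmit_udp(void *pkt)
{
    (void)pkt;
    return -17;
}

/* Dispatch a packet; returns how many handlers were invoked. */
int deliver(proto_handler_t handlers[MAX_PROTO], int protocol, void *pkt)
{
    int calls = 0;

    for (;;) {
        proto_handler_t h = NULL;
        int ret;

        if (protocol >= 0 && protocol < MAX_PROTO)
            h = handlers[protocol];
        if (h == NULL)
            return calls;        /* no handler registered */
        ret = h(pkt);
        calls++;
        if (ret >= 0)
            return calls;        /* processing finished */
        protocol = -ret;         /* negative return: resubmit as -ret */
    }
}
```

A handler that drops a corrupted packet and returns 0 ends the loop after one call. A handler that returned a negative value for the same packet would cause a second, pointless dispatch.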


Key Takeaways

ICMP is an indispensable "nervous system" in IP networks. It goes beyond the scope of a simple Ping tool, serving as the exception handling and feedback mechanism for the network layer. Its core value lies in breaking the black-box nature of IP's "best-effort" delivery. By reporting errors (such as Destination Unreachable, Time Exceeded) and transmitting diagnostic information, it allows the sender to perceive network state and make adjustments (like PMTU Discovery), acting as the cornerstone of maintaining network debuggability and robustness.

The kernel's core mechanism for handling ICMP relies on efficient dispatch and parallel architectures. For the receive path, the kernel uses a table lookup (the icmp_pointers array, indexed by ICMP type) to hand off different types of packets (like Echo, Time Exceeded, Unreachable) to specific handlers. For the send path, to avoid lock contention in multi-core systems, the Linux kernel adopts a design that creates independent raw sockets for each CPU, ensuring that control messages can be efficiently generated and sent even under heavy load.

Although ICMPv4 and ICMPv6 have similar names, there are fundamental differences in their responsibilities. ICMPv4 is primarily used to assist IPv4 with error reporting; whereas ICMPv6 holds a very high status in the IPv6 architecture. It not only inherits the error reporting function but also completely takes over the responsibilities of ARP (Address Resolution) and IGMP (Multicast Management), becoming the unified carrier for Neighbor Discovery (ND) and Multicast Listener Discovery (MLD), and is the absolute core of IPv6 network operations.

To prevent network storms, ICMP transmission is subject to strict rate limiting. The kernel rate-limits error messages by default to avoid congestion collapse caused by "avalanche" feedback triggered by network failures. However, there are two key exceptions to this rule: the messages required for PMTU Discovery (ICMPv4 "Fragmentation Needed" and ICMPv6 "Packet Too Big") and messages sent to the loopback device are never rate-limited, ensuring timely Path MTU updates and smooth local communication.

In engineering practice, security policies should favor "reject" over "drop." Blindly using a firewall to DROP all ICMP will cause MTU Discovery to fail (leading to large packet transmission stalls) and Traceroute to break; a wiser approach is to use the iptables REJECT target or fine-grained procfs configurations (like icmp_echo_ignore_all) to hide specific host information while allowing necessary error messages (like Fragmentation Needed) to pass through, ensuring network connectivity.