3.3 Quick Reference and Practical Supplements
In this chapter, we broke down the internal workings of ICMPv4 and ICMPv6, from initialization flows to packet transmission and reception, covering a significant amount of kernel code. Now it's time to consolidate these scattered details into a quick reference, and along the way, cover two "corners" that are easily overlooked in engineering practice.
This section might look like an "appendix," but I strongly recommend not treating it merely as a dictionary. The procfs parameters and iptables REJECT target mentioned later directly determine whether your production firewall's "rejection" behavior is a polite wave or an angry silence.
📚 Core Method Quick Reference
We repeatedly encountered these functions during our code walkthrough. They are the "limbs" of the ICMP subsystem.
Reception Handling
int icmp_rcv(struct sk_buff *skb);
Location:
net/ipv4/icmp.c
This is the main entry function for ICMPv4. When an IP packet is stripped of its IP header and the protocol number is found to be IPPROTO_ICMP (1), the kernel jumps directly to this function. It handles checksum verification, packet dispatch, and triggers the generation of corresponding ICMP replies (such as Echo Reply).
Transmission Handling
void icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info);
Location:
net/ipv4/icmp.c
The core function for the kernel to send ICMPv4 error messages (such as Destination Unreachable) outward.
- skb_in: The original packet (the provoking SKB) that triggered the error. The kernel extracts the IP header from this packet for diagnostics and embeds it into the data portion of the newly generated ICMP error packet.
- type/code: The type and code of the ICMP message.
- info: Typically used to pass the MTU (in the Fragmentation Needed scenario) or a pointer to the offending field.
void icmpv6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info);
Location:
net/ipv6/icmp.c
The ICMPv6 version of the send function. Although the name is similar, its internal implementation calls ip6_append_data and ip6_push_pending_frames, following the IPv6 transmission path.
void icmpv6_param_prob(struct sk_buff *skb, u8 code, int pos);
Location:
net/ipv6/icmp.c
This is a wrapper function specifically for sending ICMPv6 Parameter Problem messages. Its purpose is to save you the trouble of manually filling in the ICMPV6_PARAMPROB type, and it automatically frees skb after transmission (because the parameter issue is severe enough that the packet cannot be processed and must be discarded).
Utility Helpers
struct icmp6hdr *icmp6_hdr(const struct sk_buff *skb);
Location:
include/linux/icmpv6.h
A simple inline helper function that returns the icmp6hdr pointer for skb. During IPv6 extension header processing, the SKB's header pointers may keep moving, and using this helper instead of hand-written pointer arithmetic prevents mistakes.
📋 Key Data Table: Error Code Dictionary
ICMP "error" messages are primarily distinguished by Type (major category) and Code (minor category). The following tables are the "dictionary" you need to constantly refer to during packet capture analysis.
1. ICMPv4 Destination Unreachable
This is the most common category of errors. When a router cannot find a route, or a host port is not open, you will receive these.
| Code | Kernel Symbol | Human-Readable Meaning |
|---|---|---|
| 0 | ICMP_NET_UNREACH | Network unreachable (no such subnet in the routing table). |
| 1 | ICMP_HOST_UNREACH | Host unreachable (ARP resolution failed, or the host is down). |
| 2 | ICMP_PROT_UNREACH | Protocol unreachable (the target's kernel has no handler registered for the IP protocol number you sent, e.g., SCTP on a host without SCTP support). |
| 3 | ICMP_PORT_UNREACH | Port unreachable (most common, sending UDP to a port that isn't listening). |
| 4 | ICMP_FRAG_NEEDED | Fragmentation Needed, but the DF flag is set (the key packet for PMTU Discovery). |
| 5 | ICMP_SR_FAILED | Source Route Failed (rarely used, indicating an issue with the specified source route path). |
| 9 | ICMP_NET_ANO | Network administratively prohibited (the firewall decided you shouldn't go here). |
| 10 | ICMP_HOST_ANO | Host administratively prohibited (the firewall considers this host protected). |
| 13 | ICMP_PKT_FILTERED | Packet filtered (usually generated by a firewall). |
(Note: Some rarely seen TOS-related and Historical codes are omitted)
2. ICMPv4 Redirect
Host A sends a packet to Host C through Router R. R then notices that a better next hop exists on A's own subnet, for example another router B that A can reach directly. R still forwards the packet, but it also sends A a Redirect telling it to use B for this destination from now on.
| Code | Kernel Symbol | Meaning |
|---|---|---|
| 0 | ICMP_REDIR_NET | Redirect Network. |
| 1 | ICMP_REDIR_HOST | Redirect Host. |
| 2 | ICMP_REDIR_NETTOS | Redirect Network (based on TOS). |
| 3 | ICMP_REDIR_HOSTTOS | Redirect Host (based on TOS). |
3. ICMPv4 Time Exceeded
| Code | Kernel Symbol | Meaning |
|---|---|---|
| 0 | ICMP_EXC_TTL | TTL Exceeded (the core of how Traceroute works). |
| 1 | ICMP_EXC_FRAGTIME | Fragment Reassembly Time Exceeded (took too long to collect all fragments, discarded). |
4. ICMPv6 Destination Unreachable
IPv6's definitions are slightly different. Note that in IPv6, ICMPV6_PKT_TOOBIG is an independent Type, not a Code.
| Code | Kernel Symbol | Meaning |
|---|---|---|
| 0 | ICMPV6_NOROUTE | No Route to destination. |
| 1 | ICMPV6_ADM_PROHIBITED | Administratively Prohibited (forbidden by admin). |
| 3 | ICMPV6_ADDR_UNREACH | Address Unreachable (usually means neighbor unreachable, NDP failed). |
| 4 | ICMPV6_PORT_UNREACH | Port Unreachable. |
5. ICMPv6 Parameter Problem
| Code | Kernel Symbol | Meaning |
|---|---|---|
| 0 | ICMPV6_HDR_FIELD | Erroneous Header Field (unrecognized field). |
| 1 | ICMPV6_UNK_NEXTHDR | Unrecognized Next Header (extension header chain is broken). |
| 2 | ICMPV6_UNK_OPTION | Unrecognized IPv6 Option. |
⚙️ procfs Interfaces: Controlling ICMP from User Space
Linux allows you to adjust ICMP behavior via /proc/sys without recompiling the kernel. These variables all belong to the netns_ipv4 structure and are isolated per network namespace.
Key Configuration Items
1. icmp_echo_ignore_all
- Path: /proc/sys/net/ipv4/icmp_echo_ignore_all
- Default: 0 (disabled)
- Purpose: When set to 1, the kernel will ignore all ICMP Echo Requests (Ping requests).
- Scenario: This is the first step to "stealth mode." Note that this only ignores Pings; it doesn't mean the host is unreachable.
2. icmp_echo_ignore_broadcasts
- Path: /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
- Default: 1 (enabled)
- Purpose: When set to 1, ignores Pings or Timestamp requests sent to broadcast/multicast addresses.
- Scenario: Must remain enabled. If a virus on your network is frantically Pinging the broadcast address, turning this off will directly cause a network storm.
3. icmp_ratelimit
- Path: /proc/sys/net/ipv4/icmp_ratelimit
- Default: 1 * HZ (i.e., 1 second, shown as 1000 in procfs)
- Purpose: Limits the sending rate of the ICMP message types selected by icmp_ratemask. The value is the minimum interval, in milliseconds, between such messages to a given destination. 0 means no limit.
- Warning: If set to 0, an attacker can forge massive amounts of traffic to trigger your ICMP error replies, turning you into an accomplice in a reflection/amplification attack (although ICMP packets are small, they are still enough to congest a link).
4. icmp_ratemask
- Path: /proc/sys/net/ipv4/icmp_ratemask
- Default: 0x1818 (binary 0001 1000 0001 1000)
- Purpose: A bitmask that determines which ICMP Types are rate-limited. Bit n corresponds to ICMP Type n.
- Example: To limit Destination Unreachable (Type 3) and Time Exceeded (Type 11), set bits 3 and 11. The default value has bits 3, 4, 11, and 12 set (Destination Unreachable, Source Quench, Time Exceeded, Parameter Problem), so it primarily targets error messages.
5. icmp_errors_use_inbound_ifaddr
- Path: /proc/sys/net/ipv4/icmp_errors_use_inbound_ifaddr
- Default: 0
- Purpose: Determines how the source IP address is selected when sending ICMP error messages.
  - 0 (default): Uses the primary address of the outbound interface.
  - 1: Uses the primary address of the inbound interface (i.e., the interface that received the triggering packet).
- Gotcha: In multi-homed hosts or complex asymmetric routing scenarios, if you select the wrong source address, the error packet received by the peer might be discarded as a forged packet, or the peer's reply might get lost again. Usually, keeping the default is fine unless your network topology is very special.
🛡️ Active Rejection: Using iptables/ip6tables REJECT
There is an iron rule for firewalls: REJECT is better than DROP.
- DROP: Your machine simply discards the packet, and the client will keep waiting until it times out. The user experience is poor, and it's hard for an attacker to tell if they were blocked by a firewall or if the machine is down.
- REJECT: Your machine politely sends back an ICMP error packet ("Host Unreachable" or "Port Unreachable"). The client immediately knows "this path is blocked" and stops trying.
However, the details of this mechanism differ slightly between IPv4 and IPv6.
IPv4 (iptables)
The ipt_REJECT module allows you to specify what type of ICMP error to return.
Most common example: Deny access and inform the peer "administratively prohibited."
iptables -A INPUT -j REJECT --reject-with icmp-host-prohibited
--reject-with optional parameters:
| Parameter | ICMP Message Sent | Meaning |
|---|---|---|
| icmp-net-unreachable | ICMP_NET_UNREACH | Network unreachable. |
| icmp-host-unreachable | ICMP_HOST_UNREACH | Host unreachable. |
| icmp-port-unreachable | ICMP_PORT_UNREACH | Port unreachable (this is the default behavior). |
| icmp-proto-unreachable | ICMP_PROT_UNREACH | Protocol unreachable. |
| icmp-net-prohibited | ICMP_NET_ANO | Network prohibited. |
| icmp-host-prohibited | ICMP_HOST_ANO | Host prohibited (commonly used). |
| icmp-admin-prohibited | ICMP_PKT_FILTERED | Administratively filtered (explicitly informs that it's a filtering action). |
- Extra bonus: --reject-with tcp-reset. For TCP packets (the rule must match -p tcp), the kernel forges a TCP RST in reply instead of using ICMP. This is very effective for breaking stale connections stuck in the SYN_SENT state.
IPv6 (ip6tables)
Under IPv6, ip6t_REJECT works similarly, but the available error types are fewer (because IPv6's error types are inherently streamlined).
Example: Reject ICMPv6 requests from a certain IPv6 subnet.
ip6tables -A INPUT -s 2001::/64 -p icmpv6 -j REJECT --reject-with icmp6-adm-prohibited
--reject-with optional parameters:
| Parameter | ICMPv6 Message Sent |
|---|---|
| no-route / icmp6-no-route | ICMPV6_NOROUTE |
| adm-prohibited / icmp6-adm-prohibited | ICMPV6_ADM_PROHIBITED |
| addr-unreach / icmp6-addr-unreachable | ICMPV6_ADDR_UNREACH |
| port-unreach / icmp6-port-unreachable | ICMPV6_PORT_UNREACH |
Chapter Echoes
ICMP is often misunderstood. Many people equate it with "Ping" and even consider it a security risk, wishing to block it entirely on the first line of their firewall.
But through the source code walkthrough in this chapter, you should have built a new understanding: ICMP is the maintenance crew of the IP protocol. Without ICMP error messages, IP routing becomes a one-way blind delivery: packets are sent out with no receiver, or sent down the wrong path, and the sender never knows. PMTU Discovery breaks (large packets with DF set are silently dropped and connections stall), redirects fail (causing paths to remain suboptimal), and connectivity testing becomes "blind men feeling an elephant."
Those intuitions we mentioned at the beginning of the chapter—that Ping just tests connectivity, and firewalls should block Ping—now seem too crude.
- Ping tests more than just connectivity; it also reflects routing integrity and the efficiency of address resolution (NDP, in an IPv6 environment).
- Firewalls shouldn't blindly DROP all ICMP, but rather apply fine-grained control (allow Type 3 Code 4 for MTU Discovery, allow Type 11 for Traceroute), and prioritize using the REJECT target over DROP.
By understanding ICMP's transmission and reception mechanisms, data structures, and the inet protocol registration flow, you've mastered the "stethoscope" for debugging network layer issues. When a network problem occurs, don't just stare at the TCP three-way handshake; shift your perspective down to L3 and ask yourself: Did ICMP report an error? What error did it report? The truth is often hidden in that error packet.
Next, we will formally enter the core of the IPv4 network layer—those core engines responsible for routing, fragmentation, and forwarding. Now that we've figured out the "control signals," it's time to see how the "data traffic" is scheduled and transferred.
Exercises
Exercise 1: Understanding
Question: During the Linux kernel initialization phase, the icmp_sk_init() function creates a kernel socket (net->ipv4.icmp_sk[i]) for each CPU. What is the protocol type (protocol) specified when creating these sockets? What is their primary purpose?
Answer and Analysis
Answer: The protocol type is IPPROTO_ICMP (value 1). These sockets are used by the kernel to send ICMP error messages (such as Destination Unreachable, Time Exceeded) or reply messages (such as Echo Reply). They are raw sockets, are not exposed for direct user-space communication, and are used only inside the kernel network stack.
Analysis: Based on the code snippet of icmp_sk_init in the text:
err = inet_ctl_sock_create(&sk, PF_INET, SOCK_RAW, IPPROTO_ICMP, net);
It can be seen that the fourth parameter, IPPROTO_ICMP, specifies the protocol type (the third parameter, SOCK_RAW, is the socket type).
These sockets are described in the text as "kernel ICMP socket for sending ICMP messages" and are used in icmp_push_reply() to construct and send kernel-generated ICMP packets.
Exercise 2: Application
Question: Suppose your host, acting as a router, is forwarding a packet. If the packet's length (1500 bytes) exceeds the next hop's link MTU (1400 bytes) and the IP header has the DF (Don't Fragment) flag set, what kind of ICMP message will the kernel return to the sender? Which function in the kernel code is responsible for sending this message, and what does its info parameter represent?
Answer and Analysis
Answer: It returns an ICMP "Destination Unreachable" (Type 3) message with code "Fragmentation Needed" (Code 4).
The sending function is icmp_send() (called within ip_forward).
The info parameter (corresponding to the next hop's MTU value) will be set to 1400 (i.e., dst_mtu(&rt->dst)).
Analysis: This is a typical MTU Discovery scenario. Based on the code snippet of the ip_forward function in the text:
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(dst_mtu(&rt->dst)));
When the packet length is greater than the MTU and the DF flag is set, the kernel triggers this logic. The fourth parameter info of the icmp_send function is used in this scenario to inform the peer of the required MTU size, thereby implementing Path MTU Discovery.
Exercise 3: Thinking
Question: When the icmp_rcv() function processes a received ICMPv4 packet, if the checksum check fails, the kernel usually discards the packet. Please think about and analyze: why does the kernel only update the statistical counter and discard the packet when the checksum fails, instead of returning a negative error value (like -EINVAL)?
Answer and Analysis
Answer: Because the protocol dispatch code in ip_local_deliver_finish() treats a negative return value from a handler as a request to process the packet again: it negates the value, uses the result as a protocol number, and resubmits the packet. However, a checksum error in the ICMP packet itself indicates that the packet is corrupted, and there is nothing to recover or reprocess. The only correct action is to discard it. Returning 0 indicates that the protocol stack has finished processing (even if the processing method was to discard), avoiding needless system overhead.
Analysis: As mentioned in the text: The icmp_rcv() method does not return an error in this case... when a protocol handler returns a negative error, another attempt to process the packet is performed, and it is not needed in this case.
This is part of the robustness design of the network stack. ICMP is primarily a control message, and a checksum error means the content is untrustworthy and must be silently discarded to prevent operations based on erroneous packets. If a negative value were returned, the stack would make a pointless second attempt to deliver a packet that is already known to be corrupt.
Key Takeaways
ICMP is an indispensable "nervous system" in IP networks. It goes beyond the scope of a simple Ping tool, serving as the exception handling and feedback mechanism for the network layer. Its core value lies in breaking the black-box nature of IP's "best-effort" delivery. By reporting errors (such as Destination Unreachable, Time Exceeded) and transmitting diagnostic information, it allows the sender to perceive network state and make adjustments (like PMTU Discovery), acting as the cornerstone of maintaining network debuggability and robustness.
The kernel's core mechanism for handling ICMP relies on efficient dispatch and parallel architectures. For the receive path, the kernel indexes a handler table by message type (the icmp_pointers[] array in ICMPv4) to hand off different types of packets (like Echo, Time Exceeded, Unreachable) to specific handlers. For the send path, to avoid lock contention in multi-core systems, the Linux kernel adopts a design that creates independent raw sockets for each CPU, ensuring that control messages can be efficiently generated and sent even under heavy load.
Although ICMPv4 and ICMPv6 have similar names, there are fundamental differences in their responsibilities. ICMPv4 is primarily used to assist IPv4 with error reporting; whereas ICMPv6 holds a very high status in the IPv6 architecture. It not only inherits the error reporting function but also completely takes over the responsibilities of ARP (Address Resolution) and IGMP (Multicast Management), becoming the unified carrier for Neighbor Discovery (ND) and Multicast Listener Discovery (MLD), and is the absolute core of IPv6 network operations.
To prevent network storms, ICMP transmission is subject to strict rate limiting. The kernel rate-limits error messages by default to avoid congestion collapse caused by "avalanche" feedback triggered by network failures. However, there are two key exceptions to this rule: the messages required for PMTU Discovery (Fragmentation Needed in ICMPv4, Packet Too Big in ICMPv6) and messages sent to the loopback device must be sent immediately, ensuring timely Path MTU updates and smooth local communication.
In engineering practice, security policies should favor "reject" over "drop." Blindly using a firewall to DROP all ICMP will cause MTU Discovery to fail (leading to large packet transmission stalls) and Traceroute to break; a wiser approach is to use the iptables REJECT target or fine-grained procfs configurations (like icmp_echo_ignore_all) to hide specific host information while allowing necessary error messages (like Fragmentation Needed) to pass through, ensuring network connectivity.