4.4 Managing Network Namespaces: From a God's-Eye View to Hands-On Operations
In the previous section, we traced the past and present of struct net from the kernel's perspective. Now, let's return to user space.
With the theory in place, it's time for hands-on practice. We don't need to write C code to call unshare(), though that would certainly be cool. In 99% of cases, we use the ip netns command provided by the iproute2 package. It's the Swiss Army knife for manipulating network namespaces—handy and intuitive, but with a few hidden pitfalls.
In this section, we'll step on every single one of those pitfalls from the command line.
4.4.1 Creating and Destroying: More Than Just Adding a Name
Creating a network namespace named ns1 is deceptively simple:
ip netns add ns1
But there's a lot more happening behind the scenes than meets the eye. Think of it as a "household registration" process.
First, the system creates a file named ns1 under /var/run/netns/.
Immediately after, ip netns uses the unshare() system call (with the CLONE_NEWNET flag) to tell the kernel to create a new network namespace.
Finally, and most critically, it bind mounts /proc/self/ns/net (the current process's network namespace handle) onto the /var/run/netns/ns1 file just created.
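Step 2 of that sequence boils down to a single system call. Here's a minimal ctypes sketch, assuming glibc on Linux (the helper name new_netns and the error handling are mine, not iproute2's). Steps 1 and 3 need root and a mount() call, so they're left out:

```python
import ctypes
import os

libc = ctypes.CDLL("libc.so.6", use_errno=True)
CLONE_NEWNET = 0x40000000  # same flag clone()/unshare() use for net namespaces

def new_netns():
    """Ask the kernel for a fresh network namespace for this process.
    Requires CAP_SYS_ADMIN; expect EPERM as an ordinary user."""
    if libc.unshare(CLONE_NEWNET) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Without the bind mount that follows in step 3, the namespace created here would die with the process, which is exactly the motivation for the "anchor" files discussed next.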
Why do it this way? Here's a useful analogy:
You can think of the files under /var/run/netns/ as "portal anchors." The network namespaces in the kernel are data structures floating in memory. Without a file in the filesystem referencing them, a namespace would vanish like a ghost the moment the process that created it exits.
But the "portal" analogy has a limitation: it implies the file is the namespace. It isn't. The file is merely a handle (via a bind mount) pointing to a kernel object. If you delete the /var/run/netns/ns1 file directly (using rm instead of ip netns del), then as long as there are still processes alive inside that namespace, the namespace itself won't be destroyed; you just won't be able to find the door back in.
With this "anchor" in place, even if the process that created the namespace exits, we can still "return" to that isolated network environment through this file.
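You can see the "handle" nature of these files for yourself without root: every process exposes its own handle at /proc/&lt;pid&gt;/ns/net, and the symlink's target names the nsfs inode that identifies the namespace object in the kernel. A small sketch (ns_handle is a hypothetical helper, not an iproute2 function):

```python
import os
import re

def ns_handle(pid="self"):
    """Read a process's network-namespace handle from /proc.
    The symlink target, e.g. "net:[4026531840]", names the nsfs
    inode that identifies the namespace inside the kernel."""
    link = os.readlink(f"/proc/{pid}/ns/net")
    inode = int(re.fullmatch(r"net:\[(\d+)\]", link).group(1))
    return link, inode
```

Two processes are in the same network namespace exactly when their handles resolve to the same inode; that single fact underpins identify and pids later in this section.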
Of course, network namespaces can be nested. You can create another ns2 inside ns1, like Russian nesting dolls, where each layer has its own lo device and routing table.
Destroying a Namespace
Deleting it is just as intuitive:
ip netns del ns1
But there's a catch here.
The deletion is less final than it looks. ip netns del unmounts and removes the /var/run/netns/ns1 anchor file, but if there are still living processes inside the namespace (say, a bash shell you left open), the kernel won't tear down a house that still has "residents." The namespace lives on, nameless, until the last process exits.
Only when the reference count drops to zero does the kernel actually clean up the namespace.
Something interesting happens at this point:
When a namespace is destroyed, all network devices inside it are forcibly "relocated"—moved back to the initial default namespace, init_net.
Exception: Devices marked with NETIF_F_NETNS_LOCAL (such as the lo device, VXLAN, bridges, etc.) have "local residency" and cannot be relocated; they are destroyed along with the namespace. We'll dive into this flag later when we discuss ip link set.
4.4.2 Viewing, Monitoring, and PID Magic Tricks
Seeing All the Worlds
List all namespaces created via ip netns add:
ip netns list
The implementation behind this is quite simple: it just lists the filenames under the /var/run/netns/ directory.
What does this mean? It means if you created a namespace using a method like unshare --net bash, it won't appear in this list—because this method doesn't leave an "anchor" under /var/run/netns/. It's like an invisible space, perceivable only to the current process.
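A minimal sketch of that listing logic (netns_list is a hypothetical helper, not iproute2 code):

```python
import os

def netns_list(netns_dir="/var/run/netns"):
    """Sketch of how `ip netns list` works: it simply lists the anchor
    filenames. A namespace created by a bare unshare() leaves no file
    here, so it never appears in the output."""
    try:
        return sorted(os.listdir(netns_dir))
    except FileNotFoundError:
        return []  # nothing has ever been added via `ip netns add`
```

Note the FileNotFoundError branch: the directory itself is only created on the first ip netns add, a detail that also explains the monitor error below.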
Real-Time Monitoring
If you want to watch namespace creation and destruction from another terminal, you can use this:
ip netns monitor
This is implemented by monitoring the /var/run/netns/ directory via the inotify mechanism.
When you run ip netns add ns2, the monitoring terminal prints add ns2.
When you run ip netns del ns2, it prints delete ns2.
⚠️ Note: If you run the monitor before creating any namespaces, you'll get an error:
inotify_add_watch failed: No such file or directory
The reason is simple: the /var/run/netns/ directory doesn't exist yet, so inotify has nowhere to attach its hooks.
Similarly, only operations performed via the ip netns command will trigger the monitor. If you manually write code to unshare(), it won't show up here—because that directory wasn't touched.
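Here's a rough sketch of that inotify mechanism via ctypes, assuming glibc on Linux (watch_once is a hypothetical helper that reports a single add/delete event, where the real monitor loops forever):

```python
import ctypes
import os
import struct

libc = ctypes.CDLL("libc.so.6", use_errno=True)
IN_CREATE, IN_DELETE = 0x100, 0x200  # event masks from <sys/inotify.h>

def watch_once(directory):
    """Block until one file is created or deleted in `directory`,
    then report it the way `ip netns monitor` would."""
    fd = libc.inotify_init()
    wd = libc.inotify_add_watch(fd, directory.encode(), IN_CREATE | IN_DELETE)
    if wd < 0:
        # This is the failure from the text: the watched directory
        # (e.g. /var/run/netns/) does not exist yet.
        raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
    data = os.read(fd, 4096)  # blocks until an event arrives
    # struct inotify_event: int wd; u32 mask; u32 cookie; u32 len; char name[]
    _wd, mask, _cookie, name_len = struct.unpack_from("iIII", data, 0)
    name = data[16:16 + name_len].split(b"\0", 1)[0].decode()
    os.close(fd)
    return ("add" if mask & IN_CREATE else "delete", name)
```

Since inotify watches the directory, not the kernel, anything that touches the files triggers it, and anything that bypasses them (a bare unshare()) stays invisible, just as described above.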
Hide and Seek Between Processes and Namespaces
iproute2 also gives us two very useful commands for handling the relationship between processes and namespaces.
1. Finding a Name for a PID
You see a process (say, PID 1234) and want to know which network namespace it's currently in?
ip netns identify 1234
What happens behind the scenes: it reads /proc/1234/ns/net, then iterates through the files under /var/run/netns/, comparing inode numbers using the stat() system call. Once it finds a match, it tells you the name. If there's no match (for instance, if it's an invisible space created via unshare), the command stays silent.
2. Finding PIDs for a Name
You want to know who's hiding inside ns1?
ip netns pids ns1
This is like taking roll call. It reads the inode of /var/run/netns/ns1, then iterates through /proc/<pid>/ns/net, comparing inode numbers one by one. Any matching PIDs get listed.
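Both lookups boil down to the same inode comparison, just walked in opposite directions. A sketch of the two (ns_key, identify, and netns_pids are hypothetical helpers, not iproute2 code):

```python
import os

NETNS_DIR = "/var/run/netns"

def ns_key(path):
    """A namespace's identity: the (device, inode) pair of its nsfs file."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino)

def identify(pid, netns_dir=NETNS_DIR):
    """Sketch of `ip netns identify`: match /proc/<pid>/ns/net against
    every anchor file. Returns None for unnamed (bare unshare) spaces."""
    target = ns_key(f"/proc/{pid}/ns/net")
    if not os.path.isdir(netns_dir):
        return None
    for name in os.listdir(netns_dir):
        try:
            if ns_key(os.path.join(netns_dir, name)) == target:
                return name
        except OSError:
            continue
    return None

def netns_pids(anchor_path):
    """Sketch of `ip netns pids`: the reverse roll call, walking /proc
    and comparing every process's ns/net against the anchor file."""
    target = ns_key(anchor_path)
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            if ns_key(f"/proc/{entry}/ns/net") == target:
                pids.append(int(entry))
        except OSError:
            continue  # process exited, or permission denied
    return pids
```

Note that identify is cheap (one stat per anchor file) while netns_pids has to scan all of /proc, which is why roll call gets slower on busy machines.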
4.4.3 Entering a Namespace: The Magic of Exec
Creation is just the first step; the real goal is to get inside and play.
The most classic operation is opening a shell in ns1:
ip netns exec ns1 bash
This exec command is extremely powerful. It essentially does three things:
- Opens the /var/run/netns/ns1 file (obtaining the bind mount's fd).
- Uses the setns() system call to associate the current process with the namespace pointed to by this fd.
- fork() + execve() the command you specified (in this case, bash).
Once that bash starts up, you are inside ns1.
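The first two steps can be sketched with ctypes (enter_netns is a hypothetical helper; setns() on a network namespace needs CAP_SYS_ADMIN, so expect EPERM as an ordinary user):

```python
import ctypes
import os

libc = ctypes.CDLL("libc.so.6", use_errno=True)
CLONE_NEWNET = 0x40000000  # tells setns() we expect a *network* namespace fd

def enter_netns(anchor_path):
    """Steps 1 and 2 of `ip netns exec`: open the anchor file, then
    setns() the current process onto the namespace it pins.
    The real tool would fork() + execve() the command afterwards."""
    fd = os.open(anchor_path, os.O_RDONLY)
    try:
        if libc.setns(fd, CLONE_NEWNET) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
    finally:
        os.close(fd)
```

Passing CLONE_NEWNET makes the kernel reject the call with EINVAL if the fd points at some other namespace type, a cheap sanity check the flag exists for.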
If you run ifconfig -a now, you'll see a very minimalist world:
ip netns exec ns1 ifconfig -a
lo Link encap:Local Loopback
LOOPBACK MTU:65536 Metric:1
(Usually just a lone lo device, and it's in the DOWN state).
4.4.4 Moving Devices: You Can't Take All Your Luggage
Right now, your ns1 is empty, with nothing but lo. That's boring. We usually need to plug a physical or virtual network interface into it.
For example, moving eth0 into ns1:
ip link set eth0 netns ns1
The moment you run this command, your main shell will immediately lose control of eth0—it disappears from your view. You must ip netns exec ns1 to look inside and find it over there.
But there's a hard limit here.
Remember the net_device structure we mentioned in the previous section? It has a feature flag called NETIF_F_NETNS_LOCAL.
If this bit is set, the device has "local residency" and refuses to be moved.
You can check this flag using ethtool:
ethtool -k eth0 | grep netns-local
netns-local: off [fixed]
If it's on (like the lo device, vxlan, pppoe, etc.), the kernel will directly return -EINVAL (Invalid argument) when you try to move it.
Let's see how the kernel source code rejects this:
/* net/core/dev.c */
int dev_change_net_namespace(struct net_device *dev, struct net *net, const char *pat)
{
int err;
ASSERT_RTNL();
/* Don't allow namespace local devices to be moved. */
err = -EINVAL;
if (dev->features & NETIF_F_NETNS_LOCAL)
goto out;
/* ... the actual switching logic ... */
dev_net_set(dev, net);
/* ... */
out:
return err;
}
Those are the rules. You want to take it with you? No way.
An alternative way to move: Sometimes you might not know the namespace's name, but you know a process (PID 666) is running in that space. You can also just throw the network interface at that process:
ip link set eth1 netns 666
The kernel will find the namespace where PID 666 resides via get_net_ns_by_pid() and stuff the interface in. This achieves the same result as get_net_ns_by_fd() (finding by name).
The Wireless Exception:
If you want to move a wireless interface (wlan0) to another namespace, ip link might not be enough. You'll need to use the iw command, because configuring wireless devices is much more complex than regular Ethernet (involving phy, etc.).
iw phy phy0 set netns <pidNumber>
4.4.5 Bridging the Void: Making Two Worlds Talk
Now that we've separated the devices, we still need them to communicate. Otherwise, what's the point of isolation? There are usually two ways for two isolated namespaces to communicate:
- Unix Sockets: The granddaddy of inter-process communication, ignoring network isolation entirely.
- VETH Pairs: More like a "cross-world network cable."
VETH (Virtual Ethernet) always comes in pairs. The pair acts like two ends of a pipe—data going in one end comes out the other immediately.
We can leave one end in init_net (the main world) and throw the other into ns1.
Suppose we have two empty namespaces, ns1 and ns2:
# create the namespaces
ip netns add ns1
ip netns add ns2
Now let's create a VETH pair in the current (main) space:
ip link add name if_one type veth peer name if_one_peer
Now the main space has two new devices: if_one and if_one_peer.
Next, we throw if_one_peer into ns1:
ip link set dev if_one_peer netns ns1
Now, the main space only has if_one, while ns1 has if_one_peer.
You can then throw if_one into ns2 as well (or leave it in the main space), configure IPs on if_one and on if_one_peer inside ns1, and bring both ends up; with that, you've laid a direct network cable between two namespaces.
You can use ifconfig or ip addr to configure their IPs, then ping each other. This is the foundation of building container networks.
4.4.6 Conclusion: But This Is Just the Beginning
In this chapter, we dissected the implementation of network namespaces: from nsproxy in the kernel, to ip netns in user space.
We also saw how many compromises and changes the kernel made to achieve all of this: the new CLONE_NEW* flag, new system calls, the pernet_operations callback mechanism...
These things aren't magic; they are layers of abstraction stacked up by engineers to solve the age-old problem of "isolation."
But this isn't everything. Namespaces solve the problem of "visibility isolation," but they don't solve the problem of "resource contention." If a process inside a container starts hogging the CPU, the host machine will still freeze. The solution to this problem is the star of the next chapter: Cgroups. After that, we'll see how specific network modules attach themselves to the larger Cgroups framework.
Are you ready? We're about to enter the deep waters of resource management.