
6.5 Using slabinfo and Its Companion Tools

A New Tool in the Box

Up to this point, we've been operating strictly "inside" the kernel—reading kernel logs, deciphering panic messages, or staring at hex dumps. This is essential, but sometimes we need to step back and observe from the outside, like a detached spectator.

For that, we need a handy "external scope."

You might ask: haven't we already mentioned /proc/slabinfo several times? Yes, but reading that file directly is like reading a long sentence with no punctuation: all the information is there, but it's exhausting for a human. We need a tool that structures this data and can even analyze it for us.

That tool is slabinfo.

Counterintuitively, even though this is a userspace program, it lives right inside the kernel source tree: tools/vm/slabinfo.c

Compiling it is straightforward—no need to wrestle with the complex Kbuild system. Just enter the tools/vm directory in the source tree and type make. This produces a binary, and for convenience, we can create a symlink to /usr/bin/slabinfo.
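
If you want to reproduce that setup, here's a minimal sketch (assuming the source tree is unpacked under ~/linux-5.10.60; adjust the path for your system):

# Build the slabinfo userspace tool from within the kernel source tree
cd ~/linux-5.10.60/tools/vm
make
# Optional: symlink the resulting binary into the PATH
sudo ln -sf $(pwd)/slabinfo /usr/bin/slabinfo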

$ ls -l $(which slabinfo)
lrwxrwxrwx 1 root root 71 Nov 20 16:26 /usr/bin/slabinfo ->
<...>/linux-5.10.60/tools/vm/slabinfo

Before we start, run slabinfo -h (or --help). You'll see a massive list of parameters. Don't be intimidated; many of these are for specific scenarios. The figure below shows all parameters for kernel version 5.10.60 (Figure 6.6):

(Figure 6.6 – The help screen of the kernel slabinfo utility)

A few things to keep in mind:

  • Default behavior: slabinfo only displays slab caches with data by default (equivalent to passing the -l flag). To see empty caches, run slabinfo -e.
  • Prerequisites: Not all parameters work out of the box. Most features require a kernel built with CONFIG_SLUB_DEBUG=y. The good news is that most distribution kernels enable this by default. Some features additionally require specific flags to be passed via slub_debug in the kernel boot parameters (a quick way to check both follows this list).
  • Permissions: You must be root.
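
A quick way to run those checks on your own system (a sketch; the config file location varies by distro):

# Is CONFIG_SLUB_DEBUG enabled in the running kernel?
grep CONFIG_SLUB_DEBUG /boot/config-$(uname -r)
# Were any debug flags passed via slub_debug on the kernel command line?
grep -o 'slub_debug\S*' /proc/cmdline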

A Quick "Health Check"

Let's run it once without any arguments and see what it spits out.

Here we'll just look at the header and one specific example (the kmalloc-32 cache):

$ sudo slabinfo |head -n1
Name                   Objects Objsize    Space Slabs/Part/Cpu  O/S O %Fr %Ef Flg
$ sudo slabinfo | grep "^kmalloc-32"
kmalloc-32               35072      32     1.1M       224/0/50  128 0   0 100
$

This line of output is a bit cramped, so let's break it down piece by piece. Each column represents a "health metric" for this cache:

  • Name: The cache name (here, kmalloc-32).
  • Objects: How many objects are currently allocated (35,072).
  • Objsize: The size of a single object (32 bytes, which is expected).
  • Space: The total kernel memory footprint of this cache (about 1.1 MB here; with tight packing this is roughly Objects * Objsize, which we sanity-check right after this list).
  • Slabs/Part/Cpu: A triplet: the number of slabs (excluding the per-CPU ones), partially filled slabs, and per-CPU slabs. A high partial count indicates fragmentation.
  • O/S (Objects per Slab): How many objects fit in a single slab (128).
  • O (Order): The order of memory requested from the page allocator. 0 means 1 page ($2^0$), 1 means 2 pages ($2^1$), and so on. An order of 0 here means single pages are used to hold these objects.
  • %Fr: The percentage of the cache's slabs that are partial, that is, that still contain free slots.
  • %Ef: The memory efficiency: the share of the cache's total space actually occupied by live objects.
  • Flg: Flags. This shows the special attributes enabled for this cache (we'll dive into this below).
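
Two quick sanity checks on the kmalloc-32 row above, using nothing but shell arithmetic:

# Space: 35072 objects x 32 bytes each
echo $(( 35072 * 32 ))    # 1122304 bytes, i.e. ~1.1M, matching the Space column
# O/S at order 0: 128 objects x 32 bytes fill exactly one 4 KB page
echo $(( 128 * 32 ))      # 4096 bytes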

If you want to see exactly how this line is generated, you can check the source code directly: https://elixir.bootlin.com/linux/v5.10.60/source/tools/vm/slabinfo.c#L640

Regarding the Flg flags on the far right, they represent various attributes of the slab cache with the following meanings:

  • *: Aliases exist
  • d: DMA memory
  • A: Hardware cache line aligned
  • P: Poisoned (used to detect uninitialized access)
  • a: Reclaim statistics activated
  • Z: Red Zoned (used to detect out-of-bounds access)
  • F: Sanity checks on
  • U: User tracking (records who allocated it)
  • T: Traced

Additionally, if you add the -D (display activity) option, the output switches to a different, wider line format that shows per-cache activity details.

Who Are the Memory Hogs?

A common engineering question arises here: out of all these slab caches, which one is consuming the most kernel memory?

slabinfo provides two approaches to answer this:

  1. Use the -B parameter, which displays the bytes occupied, making it easy to sort manually.
  2. An even simpler method: use the -S parameter directly. This makes slabinfo sort by space usage in descending order and automatically appends human-readable units (KB, MB, etc.).
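
For example, a quick one-liner for the top ten (a sketch; the extra head line accounts for the column header):

# Top 10 slab caches by total space consumed
sudo slabinfo -S | head -n 11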

The figure below shows using -S to view the top 10 slab caches by memory usage (Figure 6.7):

(Figure 6.7 – The top 10 slab caches sorted by total kernel memory space taken)

As you can see, the top ranks are usually occupied by generic kmalloc-* caches or certain high-frequency kernel data structures.

Here's an interesting side note: the -U (Show unreclaimable slabs only) option was born out of a real system panic. The story goes like this: unreclaimable slab memory usage on a system once approached 100%, so the OOM Killer could not find any candidate process to kill, which ultimately led to a panic. To make such desperate situations easier to troubleshoot in the future, a developer submitted a patch that not only taught slabinfo to display these stubborn caches but also updated the OOM Killer code path to print them. The patch was merged in kernel 4.15; you can read the commit here: https://github.com/torvalds/linux/commit/7ad3f188aac15772c97523dc4ca3e8e5b6294b9c

Where Did the Waste Go? (Internal Fragmentation)

Next up is the -L (sort by loss) option. This "loss" might be more familiar to you by another name—internal fragmentation.

The slab layer uses a "best fit" model. When you request 100 bytes, the kernel can't carve out exactly 100 bytes for you; instead, it stuffs you into a kmalloc-128 cache (because 96 bytes isn't enough). The result: you requested 100 bytes but actually consumed 128 bytes. That 28-byte difference is the so-called Loss or Waste.

Running sudo slabinfo -L |head displays a list of caches sorted by waste in descending order (look at the fourth column, Loss). This is incredibly helpful for evaluating your system's memory efficiency—sometimes you'll find that slightly optimizing the size of certain structures can save a surprising amount of fragmented space.

Digging into a Single Cache

If you notice a cache that looks suspicious, or you simply want to study it in depth, the -r (report) option is your best friend.

By default, it spits out detailed statistics for all caches. A more practical approach is to pair it with a regular expression, quoting the pattern so the shell doesn't try to expand it. For example, to look only at caches related to virtual memory:

sudo slabinfo -r 'vm.*'

This displays details for all caches matching the vm.* pattern. If SLUB debug flags (like U) were enabled earlier, you can even see who allocated and freed these objects, and where.

Sometimes you'll encounter a cache with an unfamiliar name and won't know what it's for. That's where the -a (or --aliases) option comes in handy—it shows which kernel object this cache is an alias for, pulling back the curtain.
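
For example, to discover what the generic 32-byte cache is standing in for (a sketch; the exact output format may differ):

# List alias mappings involving kmalloc-32
sudo slabinfo -a | grep kmalloc-32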

Want a global overview? The -T option displays a summary snapshot of all caches: how many caches exist, how many are active, how much total memory is used, etc. If that's still not enough, the -X option provides an extended version with detailed information (Figure 6.8).

(Figure 6.8 – Screenshot showing extended summary information via slabinfo -X)

Finally, to see all caches (including empty ones), use the -z (zero) parameter.

The Option That Could Save Your Life

Here we have two options dedicated to debugging: -d and -v. Both share a hard prerequisite: your system must boot with the slub_debug kernel parameter, and this parameter must have a non-empty value.

Let's look at something concrete first. When you boot with slub_debug=FZPU, you'll notice that the flags for all slab caches—even those that previously had none—will now have at least the letters PZFU attached (Figure 6.9).

(Figure 6.9 – Partial screenshot – the focus is on the SLUB debug flags being set)

This means the kernel has forcibly added these debug attributes to all caches.

Regarding the -d option:

  • Running slabinfo -d alone turns debugging off. This isn't very intuitive, right? But it's consistent with the behavior of the slub_debug kernel parameter.
  • If you want to turn on certain debug flags, you need to pass them explicitly, such as --debug=fzput.

The underlying mechanism is actually quite simple: when you execute --debug=fzput, the slabinfo tool (as root) writes to the corresponding pseudo-files under /sys/kernel/slab/<slabname>/, setting them to 1.

The specific mapping is as follows:

  • f|F -> writes to /sys/kernel/slab/<slabname>/sanity_checks
  • z|Z -> writes to /sys/kernel/slab/<slabname>/red_zone
  • And so on...
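
You can poke these pseudo-files by hand, too. A minimal sketch, using kmalloc-64 as an arbitrary example (as root; note that the kernel may refuse some of these writes with EBUSY if the cache already holds objects):

# Manual equivalent of slabinfo --debug=zf for a single cache
echo 1 > /sys/kernel/slab/kmalloc-64/red_zone
echo 1 > /sys/kernel/slab/kmalloc-64/sanity_checks
cat /sys/kernel/slab/kmalloc-64/red_zone    # verify: prints 1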

The code that does the dirty work is here: https://elixir.bootlin.com/linux/v5.10.60/source/tools/vm/slabinfo.c#L717

Then there's the -v (validate) option. This option forces SLUB to traverse all objects in a specified cache and check the validity of their metadata. If it finds anything wrong, it blasts diagnostic information directly into the kernel log.

The format is identical to the "kernel reporting an error" messages you've seen before. In fact, the essence of slabinfo -v is simply writing a 1 to /sys/kernel/slab/<slabcache>/validate.
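
So the manual equivalent is a sketch like this (kmalloc-32 chosen arbitrarily):

# Ask SLUB to walk and validate every object in the cache...
echo 1 | sudo tee /sys/kernel/slab/kmalloc-32/validate
# ...then look for any resulting error reports in the kernel log
sudo dmesg | tail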

This is extremely useful for troubleshooting production systems suspected of having memory corruption—proactively triggering a full scan to see if there are any hidden bombs lurking deep inside.

By the way, the kernel source tree even includes a slabinfo-gnuplot.sh script that can plot slab runtime behavior as graphs. If you're interested in visualization, check out the instructions in the kernel documentation: Documentation/vm/slub.rst (the "Extended slabinfo mode and plotting" section)

The Old Reliable: /proc/slabinfo

Of course, the kernel itself exposes this information through procfs, namely /proc/slabinfo. This is actually the data source for tools like slabtop. It also requires root privileges to read.

Its format is quite detailed (perhaps even a bit verbose):

$ sudo head -n2 /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>

The figure below offers a glimpse of actual data (Figure 6.10):

(Figure 6.10 – A screenshot showing some data from /proc/slabinfo)

As for how to interpret these columns, the slabinfo(5) man page explains them clearly. Note, however, that the man page is slightly dated, and some of the statistics (the tunables in particular) apply mainly to the legacy SLAB allocator.

There's also an old friend, vmstat. With the -m parameter, it can also display slab statistics. It essentially reads /proc/slabinfo as well.

sudo vmstat -m

Watching Slab Like top: slabtop

Since we have top for watching CPU, is there a top for watching slab? Of course there is, and it's called slabtop.

Its usage is almost identical to top, refreshing in real time and sorting by object count by default (you can use -s to change the sort field). It's also based on /proc/slabinfo data, so it requires root.
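
If you just want a one-shot snapshot rather than an interactive session, a sketch:

# Print a single snapshot sorted by cache size ('c') and exit
sudo slabtop -o -s c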

Once running, you'll notice that besides the caches for specific kernel structures, the small generic caches (the kmalloc-* series) are usually the system's workhorses, seeing the most frequent allocation activity.

A More Cutting-Edge Perspective: slabratetop

Finally, there's a relatively new member—the eBPF-based slabratetop (which might be called slabratetop-bpfcc on your system).

Most of the tools mentioned above provide "stock statistics," whereas slabratetop shows "incremental rates." It displays the allocation rate (allocations per second) and total bytes of kernel slab caches in real time, with a default refresh interval of one second.

Internally, it achieves this by tracing the kmem_cache_alloc kernel API.

For example, to take 3 samples at 5-second intervals:

sudo slabratetop-bpfcc 5 3

This is incredibly helpful for analyzing the dynamic behavior of a system, letting you see which memory area is in a state of "high-concurrency allocation."

Hands-On Exercise: Who's Eating My Memory?

Alright, the toolbox is open—how do we use these tools to solve real problems?

One of the most common questions is: Where did the memory go?

This breaks down into two layers: userspace and kernel space. If it's a userspace process, tools like smem, ps, or simply reading /proc/*/status will do the trick. For example, to find the top 10 processes by physical memory (RSS) usage:

grep -r "^VmRSS" /proc/*/status |sed 's/kB$//'|sort -t: -k3n |tail

But if you suspect the kernel is secretly gobbling up memory, especially slab memory, things get interesting. Let's walk through a troubleshooting scenario:

  1. Step 1: Use slabratetop to find which cache has an abnormally high allocation frequency or byte count.
  2. Step 2: Use dynamic kprobes to capture kernel stacks and see exactly which code path is frantically allocating from this cache.

Let's try this in practice.

First, run slabratetop (or slabratetop-bpfcc):

sudo slabratetop-bpfcc
[...]
CACHE                    ALLOCS      BYTES
names_cache                  18      78336
vm_area_struct              176      46464
...

In this output (top half of Figure 6.11), I noticed a cache called vm_area_struct with a particularly high allocation rate (176 times per second). That's suspicious.

(Figure 6.11 – Output from slabinfo and slabratetop)

Now the question is: Who is allocating vm_area_struct?

We know that kernel allocation of specific cache objects ultimately calls kmem_cache_alloc(). If we could see the kernel stack for this function in real time, wouldn't we know exactly who's calling it?

This is where kprobe comes in (recalling what we covered in Chapter 4). We can use the kprobe-perf script to capture the call stack of kmem_cache_alloc.

But there's a detail: the first parameter of kmem_cache_alloc is a pointer to a struct kmem_cache. How do we know which cache this pointer actually refers to? On x86_64, the first parameter is passed via the RDI register. Inside the struct kmem_cache structure, at an offset of 96 bytes, lies the name member (the cache name).
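
That 96-byte offset is specific to this kernel version and configuration, so verify it on your own build. One way is with the pahole utility, assuming your kernel carries BTF type info (CONFIG_DEBUG_INFO_BTF=y):

# Dump struct kmem_cache's layout and locate the byte offset of 'name'
pahole -C kmem_cache /sys/kernel/btf/vmlinux | grep -w name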

So we can write our command like this:

sudo kprobe-perf -s 'p:kmem_cache_alloc name=+0(+96(%di)):string'

This prints the stack for all slab allocations. To filter for only the vm_area_struct we care about, pipe it through grep:

sudo kprobe-perf -s 'p:kmem_cache_alloc name=+0(+96(%di)):string' | grep -A10 "name=.*vm_area_struct"

You'll get output similar to the following (bottom half of Figure 6.11):

(Figure 6.11 – Kernel stack trace showing the call path to kmem_cache_alloc)

This stack trace is very clear: sys_brk() --> do_brk() --> do_brk_flags() --> vm_area_alloc() --> kmem_cache_alloc()

sys_brk is the entry point for the userspace brk() system call, typically used to extend the heap memory. To manage a process's memory areas, the kernel needs to maintain VMA (Virtual Memory Area) structures. Whenever a new memory mapping needs to be created or an existing one extended, the kernel must allocate an object from the dedicated vm_area_struct cache. This is the source of this chain of allocations.
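
If you want to reproduce this yourself, any workload that keeps creating short-lived processes or mappings will do. A crude sketch to run in one terminal while slabratetop-bpfcc watches in another:

# Each exec of /bin/true sets up (and tears down) several VMAs,
# driving a steady stream of vm_area_struct allocations
while true; do /bin/true; done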

If you dig into the source code, you'll find this exact line in mm/mmap.c: https://elixir.bootlin.com/linux/v5.10.60/source/mm/mmap.c#L3110

At this point, you should have a sense of "X-ray vision" into your system's memory behavior.


Security Tip

Finally, although slightly off-topic, this is important: security. To ensure that slab memory contents are thoroughly wiped upon both allocation and freeing (preventing information leaks), you can add the following to your kernel boot parameters: init_on_alloc=1 init_on_free=1
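
Once set, you can confirm the options took effect after a reboot (a sketch):

# Check the running kernel's command line for the wiping options
grep -o 'init_on_\w*=\w*' /proc/cmdline
# The kernel also reports its auto-init policy at boot (if the log hasn't rotated)
sudo dmesg | grep 'auto-init'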

This incurs a performance penalty, but for high-security environments, it's worth it. For reference, see the recommendations from the Kernel Self Protection Project: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings