From patchwork Mon Aug 12 05:21:06 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13760152
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau, Eduard Zingerman, Daniel Hodges
Subject: [PATCH bpf 1/2]
 bpf: Fix a kernel verifier crash in stacksafe()
Date: Sun, 11 Aug 2024 22:21:06 -0700
Message-ID: <20240812052106.3980303-1-yonghong.song@linux.dev>
X-Mailer: git-send-email 2.43.5

Daniel Hodges reported a kernel verifier crash when playing with sched-ext.
The crash dump looks like below:

  [ 65.874474] BUG: kernel NULL pointer dereference, address: 0000000000000088
  [ 65.888406] #PF: supervisor read access in kernel mode
  [ 65.898682] #PF: error_code(0x0000) - not-present page
  [ 65.908957] PGD 0 P4D 0
  [ 65.914020] Oops: 0000 [#1] SMP
  [ 65.920300] CPU: 19 PID: 9364 Comm: scx_layered Kdump: loaded Tainted: G S E 6.9.5-g93cea04637ea-dirty #7
  [ 65.941874] Hardware name: Quanta Delta Lake MP 29F0EMA01D0/Delta Lake-Class1, BIOS F0E_3A19 04/27/2023
  [ 65.960664] RIP: 0010:states_equal+0x3ee/0x770
  [ 65.969559] Code: 33 85 ed 89 e8 41 0f 48 c7 83 e0 f8 89 e9 29 c1 48 63 c1 4c 89 e9 48 c1 e1 07 49 8d 14 08 0f b6 54 10 78 49 03 8a 58 05 00 00 <3a> 54 08 78 0f 85 60 03 00 00 49 c1 e5 07 43 8b 44 28 70 83 e0 03
  [ 66.007120] RSP: 0018:ffffc9000ebeb8b8 EFLAGS: 00010202
  [ 66.017570] RAX: 0000000000000000 RBX: ffff888149719680 RCX: 0000000000000010
  [ 66.031843] RDX: 0000000000000000 RSI: ffff88907f4e0c08 RDI: ffff8881572f0000
  [ 66.046115] RBP: 0000000000000000 R08: ffff8883d5014000 R09: ffffffff83065d50
  [ 66.060386] R10: ffff8881bf9a1800 R11: 0000000000000002 R12: 0000000000000000
  [ 66.074659] R13: 0000000000000000 R14: ffff888149719a40 R15: 0000000000000007
  [ 66.088932] FS: 00007f5d5da96800(0000) GS:ffff88907f4c0000(0000) knlGS:0000000000000000
  [ 66.105114] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 66.116606] CR2: 0000000000000088 CR3: 0000000388261001 CR4: 00000000007706f0
  [ 66.130873] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [ 66.145145] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [ 66.159416] PKRU: 55555554
  [ 66.164823] Call Trace:
  [ 66.169709]
  [ 66.173906] ? __die_body+0x66/0xb0
  [ 66.180890] ? page_fault_oops+0x370/0x3d0
  [ 66.189082] ? console_unlock+0xb5/0x140
  [ 66.196926] ? exc_page_fault+0x4f/0xb0
  [ 66.204597] ? asm_exc_page_fault+0x22/0x30
  [ 66.212974] ? states_equal+0x3ee/0x770
  [ 66.220643] ? states_equal+0x529/0x770
  [ 66.228312] do_check+0x60f/0x5240
  [ 66.235114] do_check_common+0x388/0x840
  [ 66.242960] do_check_subprogs+0x101/0x150
  [ 66.251150] bpf_check+0x5d5/0x4b60
  [ 66.258134] ? __mod_memcg_state+0x79/0x110
  [ 66.266506] ? pcpu_alloc+0x892/0xba0
  [ 66.273829] bpf_prog_load+0x5bb/0x660
  [ 66.281324] ? bpf_prog_bind_map+0x1e1/0x290
  [ 66.289862] __sys_bpf+0x29d/0x3a0
  [ 66.296664] __x64_sys_bpf+0x18/0x20
  [ 66.303811] do_syscall_64+0x6a/0x140
  [ 66.311133] entry_SYSCALL_64_after_hwframe+0x4b/0x53

Further investigation shows that the crash is due to an invalid memory
access in stacksafe(). More specifically, it is the following code:

    if (exact != NOT_EXACT &&
        old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
        cur->stack[spi].slot_type[i % BPF_REG_SIZE])
            return false;

If cur->allocated_stack is 0, cur->stack will be a ZERO_SIZE_PTR. In that
case, reading cur->stack[spi].slot_type[i % BPF_REG_SIZE] crashes the
kernel because the memory address is illegal. This is exactly what happened
in the above crash dump. If cur->allocated_stack is not 0, the above code
could still trigger an out-of-bounds array access.

The patch adds the condition 'i < cur->allocated_stack' to ensure that the
cur->stack[spi].slot_type[i % BPF_REG_SIZE] memory access is always legal.
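
For illustration only, the guarded comparison can be modeled in userspace C.
This is a minimal sketch under stated assumptions, not the verifier code: the
model_* names and REG_SIZE are hypothetical stand-ins for the kernel's stack
state structures and BPF_REG_SIZE, a NULL pointer stands in for
ZERO_SIZE_PTR, and only the one comparison changed by this patch is modeled.
The rest of stacksafe() is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REG_SIZE 8 /* hypothetical stand-in for BPF_REG_SIZE */

/* Simplified, hypothetical stand-ins for the verifier's stack bookkeeping. */
struct model_stack_slot {
	uint8_t slot_type[REG_SIZE];
};

struct model_func_state {
	struct model_stack_slot *stack; /* NULL models ZERO_SIZE_PTR in this sketch */
	int allocated_stack;            /* in bytes, a multiple of REG_SIZE */
};

/*
 * Model of the exact slot_type comparison with the added bound.
 * Returns false when a byte that cur actually has differs from old.
 * Bytes of old at or beyond cur->allocated_stack are skipped by this
 * check, so cur->stack is never dereferenced for them.
 */
static bool model_exact_slot_types_match(const struct model_func_state *old,
					 const struct model_func_state *cur)
{
	assert(old && cur);

	for (int i = 0; i < old->allocated_stack; i++) {
		int spi = i / REG_SIZE;

		/* The bound is tested first; && short-circuits before any read. */
		if (i < cur->allocated_stack &&
		    old->stack[spi].slot_type[i % REG_SIZE] !=
		    cur->stack[spi].slot_type[i % REG_SIZE])
			return false;
	}
	return true;
}
```

With old holding 16 bytes of stack and cur holding none, the loop in this
model never touches cur->stack and reports no mismatch. Without the bound,
the first iteration would read through the invalid pointer.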

Fixes: 2793a8b015f7 ("bpf: exact states comparison for iterator convergence checks")
Cc: Eduard Zingerman
Reported-by: Daniel Hodges
Signed-off-by: Yonghong Song
Acked-by: Eduard Zingerman
---
 kernel/bpf/verifier.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 4cb5441ad75f..1e3d7794bf13 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -16883,7 +16883,7 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 		spi = i / BPF_REG_SIZE;

-		if (exact != NOT_EXACT &&
+		if (exact != NOT_EXACT && i < cur->allocated_stack &&
 		    old->stack[spi].slot_type[i % BPF_REG_SIZE] !=
 		    cur->stack[spi].slot_type[i % BPF_REG_SIZE])
 			return false;

From patchwork Mon Aug 12 05:21:12 2024
X-Patchwork-Submitter: Yonghong Song
X-Patchwork-Id: 13760151
X-Patchwork-Delegate: bpf@iogearbox.net
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team@fb.com, Martin KaFai Lau
Subject: [PATCH bpf 2/2] selftests/bpf: Add a test to verify previous stacksafe() fix
Date: Sun, 11 Aug 2024 22:21:12 -0700
Message-ID: <20240812052112.3980530-1-yonghong.song@linux.dev>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240812052106.3980303-1-yonghong.song@linux.dev>
References: <20240812052106.3980303-1-yonghong.song@linux.dev>

A selftest is added such that without the previous patch, a crash can
happen. With the previous patch, the test runs successfully.

The new test is written in a way that mimics the original crash case:

  main_prog
    static_prog_1
      static_prog_2

where static_prog_1 has different paths to static_prog_2, and some paths
have stack allocated while others do not. A stacksafe() check in
static_prog_2() triggered the crash.

Signed-off-by: Yonghong Song
---
 tools/testing/selftests/bpf/progs/iters.c | 54 +++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/iters.c b/tools/testing/selftests/bpf/progs/iters.c
index 16bdc3e25591..8d3b75147617 100644
--- a/tools/testing/selftests/bpf/progs/iters.c
+++ b/tools/testing/selftests/bpf/progs/iters.c
@@ -1432,4 +1432,58 @@ int iter_arr_with_actual_elem_count(const void *ctx)
 	return sum;
 }

+__u32 upper, select_n, result;
+__u64 global;
+
+static __noinline bool nest_2(char *str, int len)
+{
+	/* some insns (including branch insns) to ensure stacksafe() is triggered
+	 * in nest_2(). This way, stacksafe() can compare frame associated with nest_1().
+	 */
+	if (str[0] == 't')
+		return true;
+	if (str[1] == 'e')
+		return true;
+	if (str[2] == 's')
+		return true;
+	if (str[3] == 't')
+		return true;
+	return false;
+}
+
+static __noinline bool nest_1(int n)
+{
+	/* case 0: allocate stack, case 1: no allocate stack */
+	switch (n) {
+	case 0: {
+		char comm[16];
+
+		if (bpf_get_current_comm(comm, 16))
+			return false;
+		return nest_2(comm, 16);
+	}
+	case 1:
+		return nest_2((char *)&global, sizeof(global));
+	default:
+		return false;
+	}
+}
+
+SEC("raw_tp")
+__success
+int iter_subprog_check_stacksafe(const void *ctx)
+{
+	long i;
+
+	bpf_for(i, 0, upper) {
+		if (!nest_1(select_n)) {
+			result = 1;
+			return 0;
+		}
+	}
+
+	result = 2;
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";