[bpf,v2,0/2] Don't trust r0 bounds after BPF to BPF calls with abnormal returns

Message ID	20241213212717.1830565-1-afabre@cloudflare.com (mailing list archive)
Headers	show Received: from mail-wm1-f47.google.com (mail-wm1-f47.google.com [209.85.128.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC7071DA5F for <bpf@vger.kernel.org>; Fri, 13 Dec 2024 21:27:46 +0000 (UTC) From: Arthur Fabre <afabre@cloudflare.com> To: bpf@vger.kernel.org Cc: Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, John Fastabend <john.fastabend@gmail.com>, Andrii Nakryiko <andrii@kernel.org>, Martin KaFai Lau <martin.lau@linux.dev>, Eduard Zingerman <eddyz87@gmail.com>, Song Liu <song@kernel.org>, Yonghong Song <yonghong.song@linux.dev>, KP Singh <kpsingh@kernel.org>, Stanislav Fomichev <sdf@fomichev.me>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, kernel-team@cloudflare.com, Arthur Fabre <afabre@cloudflare.com> Subject: [PATCH bpf v2 0/2] Don't trust r0 bounds after BPF to BPF calls with abnormal returns Date: Fri, 13 Dec 2024 22:27:15 +0100 Message-Id: <20241213212717.1830565-1-afabre@cloudflare.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	Don't trust r0 bounds after BPF to BPF calls with abnormal returns \| expand [bpf,v2,0/2] Don't trust r0 bounds after BPF to BPF calls with abnormal returns [bpf,v2,1/2] bpf: Don't trust r0 bounds after BPF to BPF calls with abnormal returns [bpf,v2,2/2] selftests/bpf: Test r0 bounds after BPF to BPF call with abnormal return

Message ID

20241213212717.1830565-1-afabre@cloudflare.com (mailing list archive)

Headers

From: Arthur Fabre <afabre@cloudflare.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Eduard Zingerman <eddyz87@gmail.com>,
	Song Liu <song@kernel.org>,
	Yonghong Song <yonghong.song@linux.dev>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Hao Luo <haoluo@google.com>,
	Jiri Olsa <jolsa@kernel.org>,
	kernel-team@cloudflare.com,
	Arthur Fabre <afabre@cloudflare.com>
Subject: [PATCH bpf v2 0/2] Don't trust r0 bounds after BPF to BPF calls with
 abnormal returns
Date: Fri, 13 Dec 2024 22:27:15 +0100
Message-Id: <20241213212717.1830565-1-afabre@cloudflare.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Series

Don't trust r0 bounds after BPF to BPF calls with abnormal returns | expand

Message

Arthur Fabre Dec. 13, 2024, 9:27 p.m. UTC

A BPF function can return before its exit instruction: LD_ABS, LD_IND,
and tail_call() can all cause it to return prematurely.

When such a function is called by another BPF function, the verifier
doesn't take this into account when calculating the bounds of r0 in the
caller after the callee returns.

---
Changes in v2:
- Handle LD_ABS and LD_IND, not just tail_call()
- Split tests out
- Use inline asm for tests

---
Arthur Fabre (2):
  bpf: Don't trust r0 bounds after BPF to BPF calls with abnormal
    returns
  selftests/bpf: Test r0 bounds after BPF to BPF call with abnormal
    return

 kernel/bpf/verifier.c                         | 18 ++--
 .../selftests/bpf/prog_tests/verifier.c       |  2 +
 .../bpf/progs/verifier_abnormal_ret.c         | 88 +++++++++++++++++++
 3 files changed, 102 insertions(+), 6 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_abnormal_ret.c

Comments

Arthur Fabre Dec. 16, 2024, 5:45 p.m. UTC | #1

On Fri Dec 13, 2024 at 10:27 PM CET, Arthur Fabre wrote:
> A BPF function can return before its exit instruction: LD_ABS, LD_IND,
> and tail_call() can all cause it to return prematurely.
>
> When such a function is called by another BPF function, the verifier
> doesn't take this into account when calculating the bounds of r0 in the
> caller after the callee returns.

I've just realized r0 isn't he only problem: a caller can pass a
reference to its stack to a callee, and the verifier also tracks the
value of this.

If the callee writes to the caller's stack after the abnormal return
(tail_call, ld_abs), the verifier will also incorrectly assume the 
write always happens.

I think we need to treat these abnormal returns as a branch that can
exit. That way the verifier will explore both possibilities, and the
combined result will really reflect what can happen.
I'll try that out for a v3.

Alexei Starovoitov Dec. 16, 2024, 6:03 p.m. UTC | #2

On Mon, Dec 16, 2024 at 9:45 AM Arthur Fabre <afabre@cloudflare.com> wrote:
>
> On Fri Dec 13, 2024 at 10:27 PM CET, Arthur Fabre wrote:
> > A BPF function can return before its exit instruction: LD_ABS, LD_IND,
> > and tail_call() can all cause it to return prematurely.
> >
> > When such a function is called by another BPF function, the verifier
> > doesn't take this into account when calculating the bounds of r0 in the
> > caller after the callee returns.
>
> I've just realized r0 isn't he only problem: a caller can pass a
> reference to its stack to a callee, and the verifier also tracks the
> value of this.
>
> If the callee writes to the caller's stack after the abnormal return
> (tail_call, ld_abs), the verifier will also incorrectly assume the
> write always happens.
>
> I think we need to treat these abnormal returns as a branch that can
> exit. That way the verifier will explore both possibilities, and the
> combined result will really reflect what can happen.
> I'll try that out for a v3.

Good idea. Makes sense to me.