
[bpf,v7,1/2] lib/strncpy_from_user.c: Don't overcopy bytes after NUL terminator

Message ID: 21efc982b3e9f2f7b0379eed642294caaa0c27a7.1605642949.git.dxu@dxuuu.xyz
State: Accepted
Commit: 33b97ea52713943e82c3e954acc5f7a2fd979376
Delegated to: BPF
Series: Fix bpf_probe_read_user_str() overcopying

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for bpf
netdev/subject_prefix           success  Link
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 33 this patch: 33
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               warning  CHECK: spaces preferred around that '+' (ctx:VxV)
netdev/build_allmodconfig_warn  success  Errors and warnings before: 33 this patch: 33
netdev/header_inline            success  Link
netdev/stable                   success  Stable not CCed

Commit Message

Daniel Xu Nov. 17, 2020, 8:05 p.m. UTC
do_strncpy_from_user() may copy some extra bytes after the NUL
terminator into the destination buffer. This usually does not matter for
normal string operations. However, when BPF programs key BPF maps with
strings, this matters a lot.

A BPF program may read strings from user memory by calling the
bpf_probe_read_user_str() helper which eventually calls
do_strncpy_from_user(). The program can then key a map with the
resulting string. BPF map keys are fixed-width and string-agnostic,
meaning that map keys are treated as a set of bytes.

The issue is that when do_strncpy_from_user() overcopies bytes after the
NUL terminator, seemingly identical strings can occupy multiple slots in
a BPF map. This behavior is subtle and totally unexpected by the user.

This commit uses the proper word-at-a-time APIs to avoid overcopying.

Fixes: 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel} and probe_read_{user, kernel}_str helpers")
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
---
As mentioned in the v6 discussion, I didn't think it would make a lot
of sense to put a comment in kernel/bpf/hashtab.c:alloc_htab_elem().
I opted to add the comment to bpf_probe_read_user_str_common() because
it seems like the next best place. Just let me know if you want it
somewhere else.

 kernel/trace/bpf_trace.c | 10 ++++++++++
 lib/strncpy_from_user.c  | 19 +++++++++++++++++--
 2 files changed, 27 insertions(+), 2 deletions(-)
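
The duplicate-key effect described in the commit message can be reproduced in user space. The following is a minimal sketch, not kernel code: `KEY_SIZE` and all helper names are hypothetical, and the "overcopy" is simulated by pre-filling the key buffer with a junk byte. It assumes the string plus its NUL fits in `KEY_SIZE`. It shows that two keys can hold the same C string yet differ as raw bytes, which is how a byte-wise map compares them.

```c
#include <string.h>

#define KEY_SIZE 8 /* hypothetical fixed key width, for illustration only */

/* Simulate an overcopying string copy: stale `junk` bytes remain after the NUL.
 * Assumes strlen(s) < KEY_SIZE. */
static void make_key_overcopied(unsigned char key[KEY_SIZE], const char *s,
                                unsigned char junk)
{
    memset(key, junk, KEY_SIZE);
    memcpy(key, s, strlen(s) + 1);
}

/* Simulate a copy that leaves only zero bytes after the NUL. */
static void make_key_clean(unsigned char key[KEY_SIZE], const char *s)
{
    memset(key, 0, KEY_SIZE);
    memcpy(key, s, strlen(s) + 1);
}

/* String-aware comparison: stops at the NUL terminator. */
static int same_string(const unsigned char *a, const unsigned char *b)
{
    return strcmp((const char *)a, (const char *)b) == 0;
}

/* String-agnostic comparison: looks at every byte of the fixed-width key. */
static int same_key(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, KEY_SIZE) == 0;
}
```

With two overcopied keys built from "abc" but different junk bytes, `same_string()` reports a match while `same_key()` does not, so a byte-wise map would store them in separate slots. Two clean keys match under both comparisons.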

Comments

Alexei Starovoitov Nov. 17, 2020, 8:14 p.m. UTC | #1
On Tue, Nov 17, 2020 at 12:05 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
>
> This commit uses the proper word-at-a-time APIs to avoid overcopying.

that part of the commit log is no longer correct. I can fix it up while applying
if Linus doesn't have an issue with the rest.

Alexei Starovoitov Nov. 19, 2020, 6:33 p.m. UTC | #2
On Tue, Nov 17, 2020 at 12:14 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Nov 17, 2020 at 12:05 PM Daniel Xu <dxu@dxuuu.xyz> wrote:
> >
> > This commit uses the proper word-at-a-time APIs to avoid overcopying.
>
> that part of the commit log is no longer correct. I can fix it up while applying
> if Linus doesn't have an issue with the rest.

Linus,
ping.

Linus Torvalds Nov. 19, 2020, 6:40 p.m. UTC | #3
On Thu, Nov 19, 2020 at 10:34 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> ping.

I'm ok with this series that adds explanations for why you care and
what bpf does that makes it valid.

So this one you can put in the bpf tree.

Or, if you want me to just apply it as a series, I can do that too, I
just generally assume that when there's a git tree I usually get
things from, that's the default, so then it needs to be a very loud
and explicit "Linus, can you apply this directly".

              Linus

Alexei Starovoitov Nov. 19, 2020, 6:44 p.m. UTC | #4
On Thu, Nov 19, 2020 at 10:40:21AM -0800, Linus Torvalds wrote:
> On Thu, Nov 19, 2020 at 10:34 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > ping.
> 
> I'm ok with this series that adds explanations for why you care and
> what bpf does that makes it valid.

Great.

> So this one you can put in the bpf tree.
> 
> Or, if you want me to just apply it as a series, I can do that too, I
> just generally assume that when there's a git tree I usually get
> things from, that's the default, so then it needs to be a very loud
> and explicit "Linus, can you apply this directly".

Right. The set will go through the normal bpf.git->net.git route to make
sure it is tested by humans and CIs.
Thanks!

Patch

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 5113fd423cdf..048c655315f1 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -181,6 +181,16 @@  bpf_probe_read_user_str_common(void *dst, u32 size,
 {
 	int ret;
 
+	/*
+	 * NB: We rely on strncpy_from_user() not copying junk past the NUL
+	 * terminator into `dst`.
+	 *
+	 * strncpy_from_user() does long-sized strides in the fast path. If the
+	 * strncpy does not mask out the bytes after the NUL in `unsafe_ptr`,
+	 * then there could be junk after the NUL in `dst`. If user takes `dst`
+	 * and keys a hash map with it, then semantically identical strings can
+	 * occupy multiple entries in the map.
+	 */
 	ret = strncpy_from_user_nofault(dst, unsafe_ptr, size);
 	if (unlikely(ret < 0))
 		memset(dst, 0, size);
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index e6d5fcc2cdf3..122d8d0e253c 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -35,17 +35,32 @@  static inline long do_strncpy_from_user(char *dst, const char __user *src,
 		goto byte_at_a_time;
 
 	while (max >= sizeof(unsigned long)) {
-		unsigned long c, data;
+		unsigned long c, data, mask;
 
 		/* Fall back to byte-at-a-time if we get a page fault */
 		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
 
-		*(unsigned long *)(dst+res) = c;
+		/*
+		 * Note that we mask out the bytes following the NUL. This is
+		 * important to do because string oblivious code may read past
+		 * the NUL. For those routines, we don't want to give them
+		 * potentially random bytes after the NUL in `src`.
+		 *
+		 * One example of such code is BPF map keys. BPF treats map keys
+		 * as an opaque set of bytes. Without the post-NUL mask, any BPF
+		 * maps keyed by strings returned from strncpy_from_user() may
+		 * have multiple entries for semantically identical strings.
+		 */
 		if (has_zero(c, &data, &constants)) {
 			data = prep_zero_mask(c, data, &constants);
 			data = create_zero_mask(data);
+			mask = zero_bytemask(data);
+			*(unsigned long *)(dst+res) = c & mask;
 			return res + find_zero(data);
 		}
+
+		*(unsigned long *)(dst+res) = c;
+
 		res += sizeof(unsigned long);
 		max -= sizeof(unsigned long);
 	}
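
For readers unfamiliar with the word-at-a-time helpers, the masking step in the patch can be approximated in portable user-space C. This is a sketch under stated assumptions, not the kernel implementation: the real `has_zero()`, `create_zero_mask()`, `zero_bytemask()` and `find_zero()` are architecture-specific, whereas the code below fixes the word at 64 bits, builds it in little-endian byte order by hand, and uses hypothetical names (`load_le64`, `store_le64`, `copy_word_masked`).

```c
#include <stdint.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Assemble 8 bytes into a word with p[0] in the least significant byte. */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 7; i >= 0; i--)
        w = (w << 8) | p[i];
    return w;
}

/* Write a word back out, least significant byte first. */
static void store_le64(unsigned char *p, uint64_t w)
{
    for (int i = 0; i < 8; i++) {
        p[i] = (unsigned char)(w & 0xff);
        w >>= 8;
    }
}

/*
 * Copy one 8-byte word from src to dst. If the word contains a NUL, every
 * byte from the NUL onward is stored as zero. Returns the index of the
 * first NUL, or 8 if the word has none.
 */
static int copy_word_masked(unsigned char *dst, const unsigned char *src)
{
    uint64_t c = load_le64(src);
    /* Nonzero iff some byte is zero; the lowest set bit marks the first one. */
    uint64_t bits = (c - ONES) & ~c & HIGHS;

    if (bits) {
        uint64_t below = (bits - 1) & ~bits; /* all bits below the lowest set bit */
        uint64_t mask = below >> 7;          /* 0xff for each byte before the NUL */
        int idx = 0;

        store_le64(dst, c & mask);
        while (mask & 0xff) {                /* count the bytes before the NUL */
            mask >>= 8;
            idx++;
        }
        return idx;
    }

    store_le64(dst, c);
    return 8;
}
```

Given the source bytes `a b c \0 X Y Z W`, the sketch stores `a b c` followed by five zero bytes and returns 3, so the bytes after the terminator never reach the destination.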