diff mbox series

[v2,05/10] mm/util: Fix possible race condition in kstrdup()

Message ID 20240613023044.45873-6-laoar.shao@gmail.com (mailing list archive)
State Superseded
Headers show
Series Improve the copy of task comm | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 853 this patch: 853
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 3 of 3 maintainers
netdev/build_clang success Errors and warnings before: 854 this patch: 854
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 857 this patch: 857
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 5 this patch: 5
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-0 success Logs for Lint
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-2 success Logs for Unittests
bpf/vmtest-bpf-next-VM_Test-3 success Logs for Validate matrix.py
bpf/vmtest-bpf-next-VM_Test-5 success Logs for aarch64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-4 success Logs for aarch64-gcc / build / build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for aarch64-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-12 success Logs for s390x-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-9 success Logs for aarch64-gcc / test (test_verifier, false, 360) / test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for s390x-gcc / build / build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-17 success Logs for s390x-gcc / veristat
bpf/vmtest-bpf-next-VM_Test-18 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-19 success Logs for x86_64-gcc / build / build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for x86_64-gcc / build-release
bpf/vmtest-bpf-next-VM_Test-28 success Logs for x86_64-llvm-17 / build / build for x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-29 success Logs for x86_64-llvm-17 / build-release / build for x86_64 with llvm-17-O2
bpf/vmtest-bpf-next-VM_Test-34 success Logs for x86_64-llvm-17 / veristat
bpf/vmtest-bpf-next-VM_Test-35 success Logs for x86_64-llvm-18 / build / build for x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-36 success Logs for x86_64-llvm-18 / build-release / build for x86_64 with llvm-18-O2
bpf/vmtest-bpf-next-VM_Test-42 success Logs for x86_64-llvm-18 / veristat
bpf/vmtest-bpf-next-VM_Test-6 success Logs for aarch64-gcc / test (test_maps, false, 360) / test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for aarch64-gcc / test (test_progs, false, 360) / test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-8 success Logs for aarch64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-21 success Logs for x86_64-gcc / test (test_maps, false, 360) / test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-24 success Logs for x86_64-gcc / test (test_progs_no_alu32_parallel, true, 30) / test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for x86_64-gcc / test (test_progs_parallel, true, 30) / test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for x86_64-gcc / test (test_verifier, false, 360) / test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-27 success Logs for x86_64-gcc / veristat / veristat on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for x86_64-llvm-17 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-33 success Logs for x86_64-llvm-17 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-37 success Logs for x86_64-llvm-18 / test (test_maps, false, 360) / test_maps on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-41 success Logs for x86_64-llvm-18 / test (test_verifier, false, 360) / test_verifier on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-22 success Logs for x86_64-gcc / test (test_progs, false, 360) / test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for x86_64-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for x86_64-llvm-17 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-32 success Logs for x86_64-llvm-17 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-17
bpf/vmtest-bpf-next-VM_Test-38 success Logs for x86_64-llvm-18 / test (test_progs, false, 360) / test_progs on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-39 success Logs for x86_64-llvm-18 / test (test_progs_cpuv4, false, 360) / test_progs_cpuv4 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-40 success Logs for x86_64-llvm-18 / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on x86_64 with llvm-18
bpf/vmtest-bpf-next-VM_Test-13 success Logs for s390x-gcc / test (test_maps, false, 360) / test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for s390x-gcc / test (test_verifier, false, 360) / test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for s390x-gcc / test (test_progs_no_alu32, false, 360) / test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-14 success Logs for s390x-gcc / test (test_progs, false, 360) / test_progs on s390x with gcc

Commit Message

Yafang Shao June 13, 2024, 2:30 a.m. UTC
In kstrdup(), it is critical to ensure that the dest string is always
NUL-terminated. However, potential race condidtion can occur between a
writer and a reader.

Consider the following scenario involving task->comm:

    reader                    writer

  len = strlen(s) + 1;
                             strlcpy(tsk->comm, buf, sizeof(tsk->comm));
  memcpy(buf, s, len);

In this case, there is a race condition between the reader and the
writer. The reader calculate the length of the string `s` based on the
old value of task->comm. However, during the memcpy(), the string `s`
might be updated by the writer to a new value of task->comm.

If the new task->comm is larger than the old one, the `buf` might not be
NUL-terminated. This can lead to undefined behavior and potential
security vulnerabilities.

Let's fix it by explicitly adding a NUL-terminator.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/util.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Andrew Morton June 13, 2024, 9:14 p.m. UTC | #1
On Thu, 13 Jun 2024 10:30:39 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> In kstrdup(), it is critical to ensure that the dest string is always
> NUL-terminated. However, potential race condidtion can occur between a
> writer and a reader.
> 
> Consider the following scenario involving task->comm:
> 
>     reader                    writer
> 
>   len = strlen(s) + 1;
>                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
>   memcpy(buf, s, len);
> 
> In this case, there is a race condition between the reader and the
> writer. The reader calculate the length of the string `s` based on the
> old value of task->comm. However, during the memcpy(), the string `s`
> might be updated by the writer to a new value of task->comm.
> 
> If the new task->comm is larger than the old one, the `buf` might not be
> NUL-terminated. This can lead to undefined behavior and potential
> security vulnerabilities.
> 
> Let's fix it by explicitly adding a NUL-terminator.

The concept sounds a little strange.  If some code takes a copy of a
string while some other code is altering it, yes, the result will be a
mess.  This is why get_task_comm() exists, and why it uses locking.

I get that "your copy is a mess" is less serious than "your string
isn't null-terminated" but still.  Whichever outcome we get, the
calling code is buggy and should be fixed.

Are there any other problematic scenarios we're defending against here?

>
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -60,8 +60,10 @@ char *kstrdup(const char *s, gfp_t gfp)
>  
>  	len = strlen(s) + 1;
>  	buf = kmalloc_track_caller(len, gfp);
> -	if (buf)
> +	if (buf) {
>  		memcpy(buf, s, len);
> +		buf[len - 1] = '\0';
> +	}
>  	return buf;
>  }

Now I'll start receiving patches to remove this again.  Let's have a
code comment please.

And kstrdup() is now looking awfully similar to kstrndup().  Perhaps
there's a way to reduce duplication?
Linus Torvalds June 13, 2024, 10:17 p.m. UTC | #2
On Thu, 13 Jun 2024 at 14:14, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> The concept sounds a little strange.  If some code takes a copy of a
> string while some other code is altering it, yes, the result will be a
> mess.  This is why get_task_comm() exists, and why it uses locking.

The thing is, get_task_comm() is terminally broken.

Nobody sane uses it, and sometimes it's literally _because_ it uses locking.

Let's look at the numbers:

 - 39 uses of get_task_comm()

 - 2 uses of __get_task_comm() because the locking doesn't work

 - 447 uses of raw "current->comm"

 - 112 uses of raw 'ta*sk->comm' (and possibly

IOW, we need to just accept the fact that nobody actually wants to use
"get_task_comm()". It's a broken interface. It's inconvenient, and the
locking makes it worse.

Now, I'm not convinced that kstrdup() is what anybody should use
should, but of the 600 "raw" uses of ->comm, four of them do seem to
be kstrdup.

Not great, I think they could be removed, but they are examples of
people doing this. And I think it *would* be good to have the
guarantee that yes, the kstrdup() result is always a proper string,
even if it's used for unstable sources. Who knows what other unstable
sources exist?

I do suspect that most of the raw uses of 'xyz->comm' is for
printouts. And I think we would be better with a '%pTSK' vsnprintf()
format thing for that.

Sadly, I don't think coccinelle can do the kinds of transforms that
involve printf format strings.

And no, a printk() string still couldn't use the locking version.

               Linus
Yafang Shao June 14, 2024, 2:33 a.m. UTC | #3
On Fri, Jun 14, 2024 at 5:14 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 13 Jun 2024 10:30:39 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > In kstrdup(), it is critical to ensure that the dest string is always
> > NUL-terminated. However, potential race condidtion can occur between a
> > writer and a reader.
> >
> > Consider the following scenario involving task->comm:
> >
> >     reader                    writer
> >
> >   len = strlen(s) + 1;
> >                              strlcpy(tsk->comm, buf, sizeof(tsk->comm));
> >   memcpy(buf, s, len);
> >
> > In this case, there is a race condition between the reader and the
> > writer. The reader calculate the length of the string `s` based on the
> > old value of task->comm. However, during the memcpy(), the string `s`
> > might be updated by the writer to a new value of task->comm.
> >
> > If the new task->comm is larger than the old one, the `buf` might not be
> > NUL-terminated. This can lead to undefined behavior and potential
> > security vulnerabilities.
> >
> > Let's fix it by explicitly adding a NUL-terminator.
>
> The concept sounds a little strange.  If some code takes a copy of a
> string while some other code is altering it, yes, the result will be a
> mess.  This is why get_task_comm() exists, and why it uses locking.
>
> I get that "your copy is a mess" is less serious than "your string
> isn't null-terminated" but still.  Whichever outcome we get, the
> calling code is buggy and should be fixed.
>
> Are there any other problematic scenarios we're defending against here?
>
> >
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -60,8 +60,10 @@ char *kstrdup(const char *s, gfp_t gfp)
> >
> >       len = strlen(s) + 1;
> >       buf = kmalloc_track_caller(len, gfp);
> > -     if (buf)
> > +     if (buf) {
> >               memcpy(buf, s, len);
> > +             buf[len - 1] = '\0';
> > +     }
> >       return buf;
> >  }
>
> Now I'll start receiving patches to remove this again.  Let's have a
> code comment please.

I will add a comment for it.

>
> And kstrdup() is now looking awfully similar to kstrndup().  Perhaps
> there's a way to reduce duplication?

Yes, I believe we can add a common helper for them :

  static char *__kstrndup(const char *s, size_t max, gfp_t gfp)
Yafang Shao June 14, 2024, 2:41 a.m. UTC | #4
On Fri, Jun 14, 2024 at 6:18 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Thu, 13 Jun 2024 at 14:14, Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > The concept sounds a little strange.  If some code takes a copy of a
> > string while some other code is altering it, yes, the result will be a
> > mess.  This is why get_task_comm() exists, and why it uses locking.
>
> The thing is, get_task_comm() is terminally broken.
>
> Nobody sane uses it, and sometimes it's literally _because_ it uses locking.
>
> Let's look at the numbers:
>
>  - 39 uses of get_task_comm()
>
>  - 2 uses of __get_task_comm() because the locking doesn't work
>
>  - 447 uses of raw "current->comm"
>
>  - 112 uses of raw 'ta*sk->comm' (and possibly
>
> IOW, we need to just accept the fact that nobody actually wants to use
> "get_task_comm()". It's a broken interface. It's inconvenient, and the
> locking makes it worse.
>
> Now, I'm not convinced that kstrdup() is what anybody should use
> should, but of the 600 "raw" uses of ->comm, four of them do seem to
> be kstrdup.
>
> Not great, I think they could be removed, but they are examples of
> people doing this. And I think it *would* be good to have the
> guarantee that yes, the kstrdup() result is always a proper string,
> even if it's used for unstable sources. Who knows what other unstable
> sources exist?
>
> I do suspect that most of the raw uses of 'xyz->comm' is for
> printouts. And I think we would be better with a '%pTSK' vsnprintf()
> format thing for that.

I will implement this change in the next step if no one else handles it.

>
> Sadly, I don't think coccinelle can do the kinds of transforms that
> involve printf format strings.

Yes, we need to carefully check them one by one.

>
> And no, a printk() string still couldn't use the locking version.
>
>                Linus
diff mbox series

Patch

diff --git a/mm/util.c b/mm/util.c
index c9e519e6811f..3b383f790208 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -60,8 +60,10 @@  char *kstrdup(const char *s, gfp_t gfp)
 
 	len = strlen(s) + 1;
 	buf = kmalloc_track_caller(len, gfp);
-	if (buf)
+	if (buf) {
 		memcpy(buf, s, len);
+		buf[len - 1] = '\0';
+	}
 	return buf;
 }
 EXPORT_SYMBOL(kstrdup);