Message ID | 20240817025624.13157-6-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Improve the copy of task comm | expand |
Hi Yafang, On Sat, Aug 17, 2024 at 10:56:21AM GMT, Yafang Shao wrote: > In kstrdup(), it is critical to ensure that the dest string is always > NUL-terminated. However, potential race condidtion can occur between a > writer and a reader. > > Consider the following scenario involving task->comm: > > reader writer > > len = strlen(s) + 1; > strlcpy(tsk->comm, buf, sizeof(tsk->comm)); > memcpy(buf, s, len); > > In this case, there is a race condition between the reader and the > writer. The reader calculate the length of the string `s` based on the > old value of task->comm. However, during the memcpy(), the string `s` > might be updated by the writer to a new value of task->comm. > > If the new task->comm is larger than the old one, the `buf` might not be > NUL-terminated. This can lead to undefined behavior and potential > security vulnerabilities. > > Let's fix it by explicitly adding a NUL-terminator. > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > --- > mm/util.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/mm/util.c b/mm/util.c > index 983baf2bd675..4542d8a800d9 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) > > len = strlen(s) + 1; > buf = kmalloc_track_caller(len, gfp); > - if (buf) > + if (buf) { > memcpy(buf, s, len); > + /* During memcpy(), the string might be updated to a new value, > + * which could be longer than the string when strlen() is > + * called. Therefore, we need to add a null termimator. > + */ > + buf[len - 1] = '\0'; > + } I would compact the above to: len = strlen(s); buf = kmalloc_track_caller(len + 1, gfp); if (buf) strcpy(mempcpy(buf, s, len), ""); It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses less screen. It also has less moving parts. (You'd need to write a mempcpy() for the kernel, but that's as easy as the following:) #define mempcpy(d, s, n) (memcpy(d, s, n) + n) In shadow utils, I did a global replacement of all buf[...] = '\0'; by strcpy(..., "");. It ends up being optimized by the compiler to the same code (at least in the experiments I did). Have a lovely day! Alex > return buf; > } > EXPORT_SYMBOL(kstrdup); > -- > 2.43.5 >
On Sat, 17 Aug 2024 at 01:48, Alejandro Colomar <alx@kernel.org> wrote: > > I would compact the above to: > > len = strlen(s); > buf = kmalloc_track_caller(len + 1, gfp); > if (buf) > strcpy(mempcpy(buf, s, len), ""); No, we're not doing this kind of horror. If _FORTIFY_SOURCE has problems with a simple "memcpy and add NUL", then _FORTIFY_SOURCE needs to be fixed. We don't replace a "buf[len] = 0" with strcpy(,""). Yes, compilers may simplify it, but dammit, it's an unreadable incomprehensible mess to humans, and humans still matter a LOT more. Linus
Hi Linus, On Sat, Aug 17, 2024 at 09:26:21AM GMT, Linus Torvalds wrote: > On Sat, 17 Aug 2024 at 01:48, Alejandro Colomar <alx@kernel.org> wrote: > > > > I would compact the above to: > > > > len = strlen(s); > > buf = kmalloc_track_caller(len + 1, gfp); > > if (buf) > > strcpy(mempcpy(buf, s, len), ""); > > No, we're not doing this kind of horror. Ok. > If _FORTIFY_SOURCE has problems with a simple "memcpy and add NUL", > then _FORTIFY_SOURCE needs to be fixed. _FORTIFY_SOURCE works (AFAIK) by replacing the usual string calls by oneis that do some extra work to learn the real size of the buffers. This means that for _FORTIFY_SOURCE to work, you need to actually call a function. Since the "add NUL" is not done in a function call, it's unprotected (except that sanitizers may protect it via other means). Here's the fortified version of strcpy(3) in the kernel: $ grepc -h -B15 strcpy ./include/linux/fortify-string.h /** * strcpy - Copy a string into another string buffer * * @p: pointer to destination of copy * @q: pointer to NUL-terminated source string to copy * * Do not use this function. While FORTIFY_SOURCE tries to avoid * overflows, this is only possible when the sizes of @q and @p are * known to the compiler. Prefer strscpy(), though note its different * return values for detecting truncation. * * Returns @p. * */ /* Defined after fortified strlen to reuse it. */ __FORTIFY_INLINE __diagnose_as(__builtin_strcpy, 1, 2) char *strcpy(char * const POS p, const char * const POS q) { const size_t p_size = __member_size(p); const size_t q_size = __member_size(q); size_t size; /* If neither buffer size is known, immediately give up. */ if (__builtin_constant_p(p_size) && __builtin_constant_p(q_size) && p_size == SIZE_MAX && q_size == SIZE_MAX) return __underlying_strcpy(p, q); size = strlen(q) + 1; /* Compile-time check for const size overflow. */ if (__compiletime_lessthan(p_size, size)) __write_overflow(); /* Run-time check for dynamic size overflow. */ if (p_size < size) fortify_panic(FORTIFY_FUNC_strcpy, FORTIFY_WRITE, p_size, size, p); __underlying_memcpy(p, q, size); return p; } > We don't replace a "buf[len] = 0" with strcpy(,""). Yes, compilers may > simplify it, but dammit, it's an unreadable incomprehensible mess to > humans, and humans still matter a LOT more. I understand. While I don't consider it unreadable anymore (I guess I got used to it), it felt strange at first. > > Linus Have a lovely day! Alex
On Thu, Sep 26, 2024 at 7:44 PM Yafang Shao <laoar.shao@gmail.com> wrote: > > In kstrdup(), it is critical to ensure that the dest string is always > NUL-terminated. However, potential race condidtion can occur between a condition > writer and a reader. > > Consider the following scenario involving task->comm: > > reader writer > > len = strlen(s) + 1; > strlcpy(tsk->comm, buf, sizeof(tsk->comm)); > memcpy(buf, s, len); > > In this case, there is a race condition between the reader and the > writer. The reader calculate the length of the string `s` based on the calculates > old value of task->comm. However, during the memcpy(), the string `s` > might be updated by the writer to a new value of task->comm. > > If the new task->comm is larger than the old one, the `buf` might not be > NUL-terminated. This can lead to undefined behavior and potential > security vulnerabilities. > > Let's fix it by explicitly adding a NUL-terminator. memcpy() is not atomic AFAIK, meaning that the new string can be also shorter and when memcpy() already copied past the new NUL. I would amend the explanation to include this as well. ... > + /* During memcpy(), the string might be updated to a new value, > + * which could be longer than the string when strlen() is > + * called. Therefore, we need to add a null termimator. /* * The wrong comment style. Besides that a typo * in the word 'terminator'. Please, run codespell on your changes. * Also use the same form: NUL-terminator when you are talking * about '\0' and not NULL. */ > + */
On Fri, Sep 27, 2024 at 1:35 AM Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > > On Thu, Sep 26, 2024 at 7:44 PM Yafang Shao <laoar.shao@gmail.com> wrote: > > > > In kstrdup(), it is critical to ensure that the dest string is always > > NUL-terminated. However, potential race condidtion can occur between a > > condition > > > writer and a reader. > > > > Consider the following scenario involving task->comm: > > > > reader writer > > > > len = strlen(s) + 1; > > strlcpy(tsk->comm, buf, sizeof(tsk->comm)); > > memcpy(buf, s, len); > > > > In this case, there is a race condition between the reader and the > > writer. The reader calculate the length of the string `s` based on the > > calculates > > > old value of task->comm. However, during the memcpy(), the string `s` > > might be updated by the writer to a new value of task->comm. > > > > If the new task->comm is larger than the old one, the `buf` might not be > > NUL-terminated. This can lead to undefined behavior and potential > > security vulnerabilities. > > > > Let's fix it by explicitly adding a NUL-terminator. > > memcpy() is not atomic AFAIK, meaning that the new string can be also > shorter and when memcpy() already copied past the new NUL. I would > amend the explanation to include this as well. > > ... > > > + /* During memcpy(), the string might be updated to a new value, > > + * which could be longer than the string when strlen() is > > + * called. Therefore, we need to add a null termimator. > > /* > * The wrong comment style. Besides that a typo > * in the word 'terminator'. Please, run codespell on your changes. > * Also use the same form: NUL-terminator when you are talking > * about '\0' and not NULL. > */ Thank you for pointing out these errors and for recommending the use of codespell. Will fix them in the next version.
On Sat, Aug 17, 2024 at 10:56:21AM +0800, Yafang Shao wrote: > In kstrdup(), it is critical to ensure that the dest string is always > NUL-terminated. However, potential race condidtion can occur between a > writer and a reader. > > Consider the following scenario involving task->comm: > > reader writer > > len = strlen(s) + 1; > strlcpy(tsk->comm, buf, sizeof(tsk->comm)); > memcpy(buf, s, len); > > In this case, there is a race condition between the reader and the > writer. The reader calculate the length of the string `s` based on the > old value of task->comm. However, during the memcpy(), the string `s` > might be updated by the writer to a new value of task->comm. > > If the new task->comm is larger than the old one, the `buf` might not be > NUL-terminated. This can lead to undefined behavior and potential > security vulnerabilities. > > Let's fix it by explicitly adding a NUL-terminator. So, I'm fine with adding this generally, but I'm not sure we can construct these helpers to be universally safe against the strings changing out from under them. This situation is distinct to the 'comm' member, so I'd like to focus on helpers around 'comm' access behaving in a reliable fashion. -Kees > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > --- > mm/util.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/mm/util.c b/mm/util.c > index 983baf2bd675..4542d8a800d9 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) > > len = strlen(s) + 1; > buf = kmalloc_track_caller(len, gfp); > - if (buf) > + if (buf) { > memcpy(buf, s, len); > + /* During memcpy(), the string might be updated to a new value, > + * which could be longer than the string when strlen() is > + * called. Therefore, we need to add a null termimator. > + */ > + buf[len - 1] = '\0'; > + } > return buf; > } > EXPORT_SYMBOL(kstrdup); > -- > 2.43.5 >
On Sat, Aug 17, 2024 at 10:48:15AM +0200, Alejandro Colomar wrote: > Hi Yafang, > > On Sat, Aug 17, 2024 at 10:56:21AM GMT, Yafang Shao wrote: > > In kstrdup(), it is critical to ensure that the dest string is always > > NUL-terminated. However, potential race condidtion can occur between a > > writer and a reader. > > > > Consider the following scenario involving task->comm: > > > > reader writer > > > > len = strlen(s) + 1; > > strlcpy(tsk->comm, buf, sizeof(tsk->comm)); > > memcpy(buf, s, len); > > > > In this case, there is a race condition between the reader and the > > writer. The reader calculate the length of the string `s` based on the > > old value of task->comm. However, during the memcpy(), the string `s` > > might be updated by the writer to a new value of task->comm. > > > > If the new task->comm is larger than the old one, the `buf` might not be > > NUL-terminated. This can lead to undefined behavior and potential > > security vulnerabilities. > > > > Let's fix it by explicitly adding a NUL-terminator. > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > Cc: Andrew Morton <akpm@linux-foundation.org> > > --- > > mm/util.c | 8 +++++++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/mm/util.c b/mm/util.c > > index 983baf2bd675..4542d8a800d9 100644 > > --- a/mm/util.c > > +++ b/mm/util.c > > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) > > > > len = strlen(s) + 1; > > buf = kmalloc_track_caller(len, gfp); > > - if (buf) > > + if (buf) { > > memcpy(buf, s, len); > > + /* During memcpy(), the string might be updated to a new value, > > + * which could be longer than the string when strlen() is > > + * called. Therefore, we need to add a null termimator. > > + */ > > + buf[len - 1] = '\0'; > > + } > > I would compact the above to: > > len = strlen(s); > buf = kmalloc_track_caller(len + 1, gfp); > if (buf) > strcpy(mempcpy(buf, s, len), ""); > > It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses > less screen. It also has less moving parts. (You'd need to write a > mempcpy() for the kernel, but that's as easy as the following:) > > #define mempcpy(d, s, n) (memcpy(d, s, n) + n) > > In shadow utils, I did a global replacement of all buf[...] = '\0'; by > strcpy(..., "");. It ends up being optimized by the compiler to the > same code (at least in the experiments I did). Just to repeat what's already been said: no, please, don't complicate this with yet more wrappers. And I really don't want to add more str/mem variants -- we're working really hard to _remove_ them. :P -Kees
[CC += Andy, Gustavo] On Sat, Sep 28, 2024 at 02:17:30PM GMT, Kees Cook wrote: > > > diff --git a/mm/util.c b/mm/util.c > > > index 983baf2bd675..4542d8a800d9 100644 > > > --- a/mm/util.c > > > +++ b/mm/util.c > > > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) > > > > > > len = strlen(s) + 1; > > > buf = kmalloc_track_caller(len, gfp); > > > - if (buf) > > > + if (buf) { > > > memcpy(buf, s, len); > > > + /* During memcpy(), the string might be updated to a new value, > > > + * which could be longer than the string when strlen() is > > > + * called. Therefore, we need to add a null termimator. > > > + */ > > > + buf[len - 1] = '\0'; > > > + } > > > > I would compact the above to: > > > > len = strlen(s); > > buf = kmalloc_track_caller(len + 1, gfp); > > if (buf) > > strcpy(mempcpy(buf, s, len), ""); > > > > It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses > > less screen. It also has less moving parts. (You'd need to write a > > mempcpy() for the kernel, but that's as easy as the following:) > > > > #define mempcpy(d, s, n) (memcpy(d, s, n) + n) > > > > In shadow utils, I did a global replacement of all buf[...] = '\0'; by > > strcpy(..., "");. It ends up being optimized by the compiler to the > > same code (at least in the experiments I did). > > Just to repeat what's already been said: no, please, don't complicate > this with yet more wrappers. And I really don't want to add more str/mem > variants -- we're working really hard to _remove_ them. :P Hi Kees, I assume by "[no] more str/mem variants" you're referring to mempcpy(3). mempcpy(3) is a libc function available in several systems (at least glibc, musl, FreeBSD, and NetBSD). It's not in POSIX nor in OpenBSD, but it's relatively widely available. Availability is probably pointless to the kernel, but I mention it because it's not something random I came up with, but rather something that several projects have found useful. I find it quite useful to copy the non-zero part of a string. See string_copying(7). <https://www.man7.org/linux/man-pages/man7/string_copying.7.html> Regarding "we're working really hard to remove them [mem/str wrappers]", I think it's more like removing those that are prone to misuse, not just blinly reducing the amount of wrappers. Some of them are really useful. I've done a randomized search of kernel code, and found several places where mempcpy(3) would be useful for simplifying code: ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(pwps_ie, pwps_ie_src, wps_ielen + 2); ./drivers/staging/rtl8723bs/core/rtw_ap.c- pwps_ie += (wps_ielen+2); equivalent to: pwps_ie = mempcpy(pwps_ie, pwps_ie_src, wps_ielen + 2); ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(supportRate + supportRateNum, p + 2, ie_len); ./drivers/staging/rtl8723bs/core/rtw_ap.c- supportRateNum += ie_len; equivalent to: supportRateNum = mempcpy(supportRate + supportRateNum, p + 2, ie_len); ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(dst_ie, &tim_bitmap_le, 2); ./drivers/staging/rtl8723bs/core/rtw_ap.c- dst_ie += 2; equivalent to: dst_ie = mempcpy(dst_ie, &tim_bitmap_le, 2); And there are many cases like this. Using mempcpy(3) would make this pattern less repetitive. Have a lovely day! Alex
On Sun, Sep 29, 2024 at 09:58:30AM GMT, Alejandro Colomar wrote: > [CC += Andy, Gustavo] > > On Sat, Sep 28, 2024 at 02:17:30PM GMT, Kees Cook wrote: > > > > diff --git a/mm/util.c b/mm/util.c > > > > index 983baf2bd675..4542d8a800d9 100644 > > > > --- a/mm/util.c > > > > +++ b/mm/util.c > > > > @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) > > > > > > > > len = strlen(s) + 1; > > > > buf = kmalloc_track_caller(len, gfp); > > > > - if (buf) > > > > + if (buf) { > > > > memcpy(buf, s, len); > > > > + /* During memcpy(), the string might be updated to a new value, > > > > + * which could be longer than the string when strlen() is > > > > + * called. Therefore, we need to add a null termimator. > > > > + */ > > > > + buf[len - 1] = '\0'; > > > > + } > > > > > > I would compact the above to: > > > > > > len = strlen(s); > > > buf = kmalloc_track_caller(len + 1, gfp); > > > if (buf) > > > strcpy(mempcpy(buf, s, len), ""); > > > > > > It allows _FORTIFY_SOURCE to track the copy of the NUL, and also uses > > > less screen. It also has less moving parts. (You'd need to write a > > > mempcpy() for the kernel, but that's as easy as the following:) > > > > > > #define mempcpy(d, s, n) (memcpy(d, s, n) + n) > > > > > > In shadow utils, I did a global replacement of all buf[...] = '\0'; by > > > strcpy(..., "");. It ends up being optimized by the compiler to the > > > same code (at least in the experiments I did). > > > > Just to repeat what's already been said: no, please, don't complicate > > this with yet more wrappers. And I really don't want to add more str/mem > > variants -- we're working really hard to _remove_ them. :P > > Hi Kees, > > I assume by "[no] more str/mem variants" you're referring to mempcpy(3). > > mempcpy(3) is a libc function available in several systems (at least > glibc, musl, FreeBSD, and NetBSD). It's not in POSIX nor in OpenBSD, > but it's relatively widely available. Availability is probably > pointless to the kernel, but I mention it because it's not something > random I came up with, but rather something that several projects have > found useful. I find it quite useful to copy the non-zero part of a > string. See string_copying(7). > <https://www.man7.org/linux/man-pages/man7/string_copying.7.html> > > Regarding "we're working really hard to remove them [mem/str wrappers]", > I think it's more like removing those that are prone to misuse, not just > blinly reducing the amount of wrappers. Some of them are really useful. > > I've done a randomized search of kernel code, and found several places > where mempcpy(3) would be useful for simplifying code: > > ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(pwps_ie, pwps_ie_src, wps_ielen + 2); > ./drivers/staging/rtl8723bs/core/rtw_ap.c- pwps_ie += (wps_ielen+2); > > equivalent to: > > pwps_ie = mempcpy(pwps_ie, pwps_ie_src, wps_ielen + 2); > > ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(supportRate + supportRateNum, p + 2, ie_len); > ./drivers/staging/rtl8723bs/core/rtw_ap.c- supportRateNum += ie_len; > > equivalent to: > > supportRateNum = mempcpy(supportRate + supportRateNum, p + 2, ie_len); Oops, I misread the original in the above. I didn't notice that the += is being done on the count, not the pointer. The other equivalences are good, though. > > ./drivers/staging/rtl8723bs/core/rtw_ap.c: memcpy(dst_ie, &tim_bitmap_le, 2); > ./drivers/staging/rtl8723bs/core/rtw_ap.c- dst_ie += 2; > > equivalent to: > > dst_ie = mempcpy(dst_ie, &tim_bitmap_le, 2); > > > And there are many cases like this. Using mempcpy(3) would make this > pattern less repetitive.
diff --git a/mm/util.c b/mm/util.c index 983baf2bd675..4542d8a800d9 100644 --- a/mm/util.c +++ b/mm/util.c @@ -62,8 +62,14 @@ char *kstrdup(const char *s, gfp_t gfp) len = strlen(s) + 1; buf = kmalloc_track_caller(len, gfp); - if (buf) + if (buf) { memcpy(buf, s, len); + /* During memcpy(), the string might be updated to a new value, + * which could be longer than the string when strlen() is + * called. Therefore, we need to add a null termimator. + */ + buf[len - 1] = '\0'; + } return buf; } EXPORT_SYMBOL(kstrdup);
In kstrdup(), it is critical to ensure that the dest string is always NUL-terminated. However, potential race condidtion can occur between a writer and a reader. Consider the following scenario involving task->comm: reader writer len = strlen(s) + 1; strlcpy(tsk->comm, buf, sizeof(tsk->comm)); memcpy(buf, s, len); In this case, there is a race condition between the reader and the writer. The reader calculate the length of the string `s` based on the old value of task->comm. However, during the memcpy(), the string `s` might be updated by the writer to a new value of task->comm. If the new task->comm is larger than the old one, the `buf` might not be NUL-terminated. This can lead to undefined behavior and potential security vulnerabilities. Let's fix it by explicitly adding a NUL-terminator. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> --- mm/util.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)