Message ID | 20220831230006.1016236-1-keescook@chromium.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | string: Introduce strtomem() and strtomem_pad() | expand |
Hi Kees, CC torvalds Thanks for your patch! On Thu, Sep 1, 2022 at 1:00 AM Kees Cook <keescook@chromium.org> wrote: > One of the "legitimate" uses of strncpy() is copying a NUL-terminated > string into a fixed-size non-NUL-terminated character array. To avoid > the weaknesses and ambiguity of intent when using strncpy(), provide > replacement functions that explicitly distinguish between trailing > padding and not, and require the destination buffer size be discoverable > by the compiler. > > For example: > > struct obj { > int foo; > char small[4] __nonstring; > char big[8] __nonstring; > int bar; > }; > > struct obj p; > > /* This will truncate to 4 chars with no trailing NUL */ > strncpy(p.small, "hello", sizeof(p.small)); > /* p.small contains 'h', 'e', 'l', 'l' */ > > /* This will NUL pad to 8 chars. */ > strncpy(p.big, "hello", sizeof(p.big)); > /* p.big contains 'h', 'e', 'l', 'l', 'o', '\0', '\0', '\0' */ > > When the "__nonstring" attributes are missing, the intent of the > programmer becomes ambiguous for whether the lack of a trailing NUL > in the p.small copy is a bug. Additionally, it's not clear whether > the trailing padding in the p.big copy is _needed_. Both cases > become unambiguous with: > > strtomem(p.small, "hello"); > strtomem_pad(p.big, "hello"); strtomem_pad(p.big, "hello", 0); > See also https://github.com/KSPP/linux/issues/90 > > Expand the memcpy KUnit tests to include these functions. > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> > Cc: Nick Desaulniers <ndesaulniers@google.com> > Cc: Geert Uytterhoeven <geert@linux-m68k.org> > Cc: Guenter Roeck <linux@roeck-us.net> > Signed-off-by: Kees Cook <keescook@chromium.org> The idea looks good to me, but I guess Linus has something to say, too. > --- a/include/linux/string.h > +++ b/include/linux/string.h > @@ -260,6 +260,49 @@ static inline const char *kbasename(const char *path) > void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, > int pad); > > +/** > + * strtomem_pad - Copy NUL-terminated string to non-NUL-terminated buffer > + * > + * @dest: Pointer of destination character array (marked as __nonstring) > + * @src: Pointer to NUL-terminated string > + * @pad: Padding character to fill any remaining bytes of @dest after copy > + * > + * This is a replacement for strncpy() uses where the destination is not > + * a NUL-terminated string, but with bounds checking on the source size, and > + * an explicit padding character. If padding is not required, use strtomem(). > + * > + * Note that the size of @dest is not an argument, as the length of @dest > + * must be discoverable by the compiler. > + */ > +#define strtomem_pad(dest, src, pad) do { \ > + const size_t _dest_len = __builtin_object_size(dest, 1); \ > + \ > + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ > + _dest_len == (size_t)-1); \ I think you want to include __must_be_array(dest) here. > + memcpy_and_pad(dest, _dest_len, src, strnlen(src, _dest_len), pad); \ > +} while (0) > + > +/** > + * strtomem - Copy NUL-terminated string to non-NUL-terminated buffer > + * > + * @dest: Pointer of destination character array (marked as __nonstring) > + * @src: Pointer to NUL-terminated string > + * > + * This is a replacement for strncpy() uses where the destination is not > + * a NUL-terminated string, but with bounds checking on the source size, and > + * without trailing padding. If padding is required, use strtomem_pad(). > + * > + * Note that the size of @dest is not an argument, as the length of @dest > + * must be discoverable by the compiler. > + */ > +#define strtomem(dest, src) do { \ > + const size_t _dest_len = __builtin_object_size(dest, 1); \ > + \ > + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ > + _dest_len == (size_t)-1); \ I think you want to include __must_be_array(dest) here. > + memcpy(dest, src, min(_dest_len, strnlen(src, _dest_len))); \ > +} while (0) > + > /** > * memset_after - Set a value after a struct member to the end of a struct > * > diff --git a/lib/memcpy_kunit.c b/lib/memcpy_kunit.c > index 62f8ffcbbaa3..2aeb8643e1b0 100644 > --- a/lib/memcpy_kunit.c > +++ b/lib/memcpy_kunit.c > @@ -272,10 +272,63 @@ static void memset_test(struct kunit *test) > #undef TEST_OP > } > > +static void strtomem_test(struct kunit *test) > +{ > + static const char input[] = "hi"; > + static const char truncate[] = "this is too long"; > + struct { > + unsigned long canary1; > + unsigned char output[sizeof(unsigned long)] __nonstring; > + unsigned long canary2; > + } wrap; > + > + memset(&wrap, 0xFF, sizeof(wrap)); > + KUNIT_EXPECT_EQ_MSG(test, wrap.canary1, -1UL, -1L or ULONG_MAX (everywhere) > + "bad initial canary value"); > + KUNIT_EXPECT_EQ_MSG(test, wrap.canary2, -1UL, > + "bad initial canary value"); > + > + /* Check unpadded copy leaves surroundings untouched. */ > + strtomem(wrap.output, input); > + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); > + KUNIT_EXPECT_EQ(test, wrap.output[0], input[0]); > + KUNIT_EXPECT_EQ(test, wrap.output[1], input[1]); > + for (int i = 2; i < sizeof(wrap.output); i++) unsigned int i (everywhere) > + KUNIT_EXPECT_EQ(test, wrap.output[i], 0xFF); > + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); > + > + /* Check truncated copy leaves surroundings untouched. */ > + memset(&wrap, 0xFF, sizeof(wrap)); > + strtomem(wrap.output, truncate); > + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); > + for (int i = 0; i < sizeof(wrap.output); i++) > + KUNIT_EXPECT_EQ(test, wrap.output[i], truncate[i]); > + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); > + > + /* Check padded copy leaves only string padded. */ > + memset(&wrap, 0xFF, sizeof(wrap)); > + strtomem_pad(wrap.output, input, 0xAA); > + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); > + KUNIT_EXPECT_EQ(test, wrap.output[0], input[0]); > + KUNIT_EXPECT_EQ(test, wrap.output[1], input[1]); > + for (int i = 2; i < sizeof(wrap.output); i++) > + KUNIT_EXPECT_EQ(test, wrap.output[i], 0xAA); > + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); > + > + /* Check truncated padded copy has no padding. */ > + memset(&wrap, 0xFF, sizeof(wrap)); > + strtomem(wrap.output, truncate); > + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); > + for (int i = 0; i < sizeof(wrap.output); i++) > + KUNIT_EXPECT_EQ(test, wrap.output[i], truncate[i]); > + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); > +} > + > static struct kunit_case memcpy_test_cases[] = { > KUNIT_CASE(memset_test), > KUNIT_CASE(memcpy_test), > KUNIT_CASE(memmove_test), > + KUNIT_CASE(strtomem_test), > {} > }; Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Thu, Sep 01, 2022 at 10:39:19AM +0200, Geert Uytterhoeven wrote: > [...] > > When the "__nonstring" attributes are missing, the intent of the > > programmer becomes ambiguous for whether the lack of a trailing NUL > > in the p.small copy is a bug. Additionally, it's not clear whether > > the trailing padding in the p.big copy is _needed_. Both cases > > become unambiguous with: > > > > strtomem(p.small, "hello"); > > strtomem_pad(p.big, "hello"); > > strtomem_pad(p.big, "hello", 0); Oops, thanks. I will adjust the example. And actually, instead of these notes just living in commit logs, I realize I can update the kerndoc for strncpy with a "here's now to pick a replacement" table... > > See also https://github.com/KSPP/linux/issues/90 > > > > Expand the memcpy KUnit tests to include these functions. > > > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> > > Cc: Nick Desaulniers <ndesaulniers@google.com> > > Cc: Geert Uytterhoeven <geert@linux-m68k.org> > > Cc: Guenter Roeck <linux@roeck-us.net> > > Signed-off-by: Kees Cook <keescook@chromium.org> > > The idea looks good to me, but I guess Linus has something to > say, too. > > > --- a/include/linux/string.h > > +++ b/include/linux/string.h > > @@ -260,6 +260,49 @@ static inline const char *kbasename(const char *path) > > void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, > > int pad); > > > > +/** > > + * strtomem_pad - Copy NUL-terminated string to non-NUL-terminated buffer > > + * > > + * @dest: Pointer of destination character array (marked as __nonstring) > > + * @src: Pointer to NUL-terminated string > > + * @pad: Padding character to fill any remaining bytes of @dest after copy > > + * > > + * This is a replacement for strncpy() uses where the destination is not > > + * a NUL-terminated string, but with bounds checking on the source size, and > > + * an explicit padding character. If padding is not required, use strtomem(). > > + * > > + * Note that the size of @dest is not an argument, as the length of @dest > > + * must be discoverable by the compiler. > > + */ > > +#define strtomem_pad(dest, src, pad) do { \ > > + const size_t _dest_len = __builtin_object_size(dest, 1); \ > > + \ > > + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ > > + _dest_len == (size_t)-1); \ > > I think you want to include __must_be_array(dest) here. I didn't do that for the cases where we may be writing to non-array destinations (e.g. see the cast from u64 in the strncpy use in tools/perf/arch/x86/util/intel-pt.c). Since what we need to know is the object size, it does not strictly need to be an array. > > [...] > > + memset(&wrap, 0xFF, sizeof(wrap)); > > + KUNIT_EXPECT_EQ_MSG(test, wrap.canary1, -1UL, > > -1L or ULONG_MAX (everywhere) Yeah, ULONG_MAX looks best. Thanks! > > > + "bad initial canary value"); > > + KUNIT_EXPECT_EQ_MSG(test, wrap.canary2, -1UL, > > + "bad initial canary value"); > > + > > + /* Check unpadded copy leaves surroundings untouched. */ > > + strtomem(wrap.output, input); > > + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); > > + KUNIT_EXPECT_EQ(test, wrap.output[0], input[0]); > > + KUNIT_EXPECT_EQ(test, wrap.output[1], input[1]); > > + for (int i = 2; i < sizeof(wrap.output); i++) > > unsigned int i (everywhere) I guess, but why? This could even be u8. Thanks for the review!
Hi Kees, On Thu, Sep 1, 2022 at 8:35 PM Kees Cook <keescook@chromium.org> wrote: > On Thu, Sep 01, 2022 at 10:39:19AM +0200, Geert Uytterhoeven wrote: > > > --- a/include/linux/string.h > > > +++ b/include/linux/string.h > > > @@ -260,6 +260,49 @@ static inline const char *kbasename(const char *path) > > > void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, > > > int pad); > > > > > > +/** > > > + * strtomem_pad - Copy NUL-terminated string to non-NUL-terminated buffer > > > + * > > > + * @dest: Pointer of destination character array (marked as __nonstring) > > > + * @src: Pointer to NUL-terminated string > > > + * @pad: Padding character to fill any remaining bytes of @dest after copy > > > + * > > > + * This is a replacement for strncpy() uses where the destination is not > > > + * a NUL-terminated string, but with bounds checking on the source size, and > > > + * an explicit padding character. If padding is not required, use strtomem(). > > > + * > > > + * Note that the size of @dest is not an argument, as the length of @dest > > > + * must be discoverable by the compiler. > > > + */ > > > +#define strtomem_pad(dest, src, pad) do { \ > > > + const size_t _dest_len = __builtin_object_size(dest, 1); \ > > > + \ > > > + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ > > > + _dest_len == (size_t)-1); \ > > > > I think you want to include __must_be_array(dest) here. > > I didn't do that for the cases where we may be writing to non-array > destinations (e.g. see the cast from u64 in the strncpy use in > tools/perf/arch/x86/util/intel-pt.c). Since what we need to know is the > object size, it does not strictly need to be an array. IC. That does mean we cannot catch silly mistakes where the caller passes a pointer instead of the address of an array? > > > + for (int i = 2; i < sizeof(wrap.output); i++) > > > > unsigned int i (everywhere) > > I guess, but why? This could even be u8. sizeof() is unsigned, so using int may cause signed/unsigned comparison warnings. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Thu, Sep 01, 2022 at 09:14:26PM +0200, Geert Uytterhoeven wrote: > Hi Kees, > > On Thu, Sep 1, 2022 at 8:35 PM Kees Cook <keescook@chromium.org> wrote: > > On Thu, Sep 01, 2022 at 10:39:19AM +0200, Geert Uytterhoeven wrote: > > > > --- a/include/linux/string.h > > > > +++ b/include/linux/string.h > > > > @@ -260,6 +260,49 @@ static inline const char *kbasename(const char *path) > > > > void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, > > > > int pad); > > > > > > > > +/** > > > > + * strtomem_pad - Copy NUL-terminated string to non-NUL-terminated buffer > > > > + * > > > > + * @dest: Pointer of destination character array (marked as __nonstring) > > > > + * @src: Pointer to NUL-terminated string > > > > + * @pad: Padding character to fill any remaining bytes of @dest after copy > > > > + * > > > > + * This is a replacement for strncpy() uses where the destination is not > > > > + * a NUL-terminated string, but with bounds checking on the source size, and > > > > + * an explicit padding character. If padding is not required, use strtomem(). > > > > + * > > > > + * Note that the size of @dest is not an argument, as the length of @dest > > > > + * must be discoverable by the compiler. > > > > + */ > > > > +#define strtomem_pad(dest, src, pad) do { \ > > > > + const size_t _dest_len = __builtin_object_size(dest, 1); \ > > > > + \ > > > > + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ > > > > + _dest_len == (size_t)-1); \ > > > > > > I think you want to include __must_be_array(dest) here. > > > > I didn't do that for the cases where we may be writing to non-array > > destinations (e.g. see the cast from u64 in the strncpy use in > > tools/perf/arch/x86/util/intel-pt.c). Since what we need to know is the > > object size, it does not strictly need to be an array. > > IC. That does mean we cannot catch silly mistakes where the > caller passes a pointer instead of the address of an array? It's okay to pass a pointer as long as the compiler can reason about the size of the object being pointed at (which is what __bos() does here). > > > > + for (int i = 2; i < sizeof(wrap.output); i++) > > > > > > unsigned int i (everywhere) > > > > I guess, but why? This could even be u8. > > sizeof() is unsigned, so using int may cause signed/unsigned comparison > warnings. Do we have those warnings enabled anywhere? I thought solving that warning was Sisyphean. But I guess, yes, better to avoid adding yet more. :)
diff --git a/include/linux/string.h b/include/linux/string.h index 61ec7e4f6311..cf7607b32102 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -260,6 +260,49 @@ static inline const char *kbasename(const char *path) void memcpy_and_pad(void *dest, size_t dest_len, const void *src, size_t count, int pad); +/** + * strtomem_pad - Copy NUL-terminated string to non-NUL-terminated buffer + * + * @dest: Pointer of destination character array (marked as __nonstring) + * @src: Pointer to NUL-terminated string + * @pad: Padding character to fill any remaining bytes of @dest after copy + * + * This is a replacement for strncpy() uses where the destination is not + * a NUL-terminated string, but with bounds checking on the source size, and + * an explicit padding character. If padding is not required, use strtomem(). + * + * Note that the size of @dest is not an argument, as the length of @dest + * must be discoverable by the compiler. + */ +#define strtomem_pad(dest, src, pad) do { \ + const size_t _dest_len = __builtin_object_size(dest, 1); \ + \ + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ + _dest_len == (size_t)-1); \ + memcpy_and_pad(dest, _dest_len, src, strnlen(src, _dest_len), pad); \ +} while (0) + +/** + * strtomem - Copy NUL-terminated string to non-NUL-terminated buffer + * + * @dest: Pointer of destination character array (marked as __nonstring) + * @src: Pointer to NUL-terminated string + * + * This is a replacement for strncpy() uses where the destination is not + * a NUL-terminated string, but with bounds checking on the source size, and + * without trailing padding. If padding is required, use strtomem_pad(). + * + * Note that the size of @dest is not an argument, as the length of @dest + * must be discoverable by the compiler. + */ +#define strtomem(dest, src) do { \ + const size_t _dest_len = __builtin_object_size(dest, 1); \ + \ + BUILD_BUG_ON(!__builtin_constant_p(_dest_len) || \ + _dest_len == (size_t)-1); \ + memcpy(dest, src, min(_dest_len, strnlen(src, _dest_len))); \ +} while (0) + /** * memset_after - Set a value after a struct member to the end of a struct * diff --git a/lib/memcpy_kunit.c b/lib/memcpy_kunit.c index 62f8ffcbbaa3..2aeb8643e1b0 100644 --- a/lib/memcpy_kunit.c +++ b/lib/memcpy_kunit.c @@ -272,10 +272,63 @@ static void memset_test(struct kunit *test) #undef TEST_OP } +static void strtomem_test(struct kunit *test) +{ + static const char input[] = "hi"; + static const char truncate[] = "this is too long"; + struct { + unsigned long canary1; + unsigned char output[sizeof(unsigned long)] __nonstring; + unsigned long canary2; + } wrap; + + memset(&wrap, 0xFF, sizeof(wrap)); + KUNIT_EXPECT_EQ_MSG(test, wrap.canary1, -1UL, + "bad initial canary value"); + KUNIT_EXPECT_EQ_MSG(test, wrap.canary2, -1UL, + "bad initial canary value"); + + /* Check unpadded copy leaves surroundings untouched. */ + strtomem(wrap.output, input); + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); + KUNIT_EXPECT_EQ(test, wrap.output[0], input[0]); + KUNIT_EXPECT_EQ(test, wrap.output[1], input[1]); + for (int i = 2; i < sizeof(wrap.output); i++) + KUNIT_EXPECT_EQ(test, wrap.output[i], 0xFF); + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); + + /* Check truncated copy leaves surroundings untouched. */ + memset(&wrap, 0xFF, sizeof(wrap)); + strtomem(wrap.output, truncate); + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); + for (int i = 0; i < sizeof(wrap.output); i++) + KUNIT_EXPECT_EQ(test, wrap.output[i], truncate[i]); + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); + + /* Check padded copy leaves only string padded. */ + memset(&wrap, 0xFF, sizeof(wrap)); + strtomem_pad(wrap.output, input, 0xAA); + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); + KUNIT_EXPECT_EQ(test, wrap.output[0], input[0]); + KUNIT_EXPECT_EQ(test, wrap.output[1], input[1]); + for (int i = 2; i < sizeof(wrap.output); i++) + KUNIT_EXPECT_EQ(test, wrap.output[i], 0xAA); + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); + + /* Check truncated padded copy has no padding. */ + memset(&wrap, 0xFF, sizeof(wrap)); + strtomem(wrap.output, truncate); + KUNIT_EXPECT_EQ(test, wrap.canary1, -1UL); + for (int i = 0; i < sizeof(wrap.output); i++) + KUNIT_EXPECT_EQ(test, wrap.output[i], truncate[i]); + KUNIT_EXPECT_EQ(test, wrap.canary2, -1UL); +} + static struct kunit_case memcpy_test_cases[] = { KUNIT_CASE(memset_test), KUNIT_CASE(memcpy_test), KUNIT_CASE(memmove_test), + KUNIT_CASE(strtomem_test), {} };
One of the "legitimate" uses of strncpy() is copying a NUL-terminated string into a fixed-size non-NUL-terminated character array. To avoid the weaknesses and ambiguity of intent when using strncpy(), provide replacement functions that explicitly distinguish between trailing padding and not, and require the destination buffer size be discoverable by the compiler. For example: struct obj { int foo; char small[4] __nonstring; char big[8] __nonstring; int bar; }; struct obj p; /* This will truncate to 4 chars with no trailing NUL */ strncpy(p.small, "hello", sizeof(p.small)); /* p.small contains 'h', 'e', 'l', 'l' */ /* This will NUL pad to 8 chars. */ strncpy(p.big, "hello", sizeof(p.big)); /* p.big contains 'h', 'e', 'l', 'l', 'o', '\0', '\0', '\0' */ When the "__nonstring" attributes are missing, the intent of the programmer becomes ambiguous for whether the lack of a trailing NUL in the p.small copy is a bug. Additionally, it's not clear whether the trailing padding in the p.big copy is _needed_. Both cases become unambiguous with: strtomem(p.small, "hello"); strtomem_pad(p.big, "hello"); See also https://github.com/KSPP/linux/issues/90 Expand the memcpy KUnit tests to include these functions. Cc: Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Kees Cook <keescook@chromium.org> --- include/linux/string.h | 43 ++++++++++++++++++++++++++++++++++ lib/memcpy_kunit.c | 53 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 96 insertions(+)