diff mbox series

lib/string_helpers: rework overflow-dependent code

Message ID 20240808-b4-string_helpers_caa133-v1-1-686a455167c4@google.com (mailing list archive)
State Mainlined
Commit bbf3c7ff9dfa45be51500d23a1276991a7cd8c6e
Headers show
Series lib/string_helpers: rework overflow-dependent code | expand

Commit Message

Justin Stitt Aug. 8, 2024, 9:43 p.m. UTC
When @size is 0, the desired behavior is to allow unlimited bytes to be
parsed. Currently, this relies on some intentional arithmetic overflow
where --size gives us SIZE_MAX when size is 0.

Explicitly spell out the desired behavior without relying on intentional
overflow/underflow.

Signed-off-by: Justin Stitt <justinstitt@google.com>
---
 lib/string_helpers.c | 3 +++
 1 file changed, 3 insertions(+)


---
base-commit: ee9a43b7cfe2d8a3520335fea7d8ce71b8cabd9d
change-id: 20240808-b4-string_helpers_caa133-53effde85a18

Best regards,
--
Justin Stitt <justinstitt@google.com>

Comments

Andy Shevchenko Aug. 8, 2024, 10:07 p.m. UTC | #1
On Fri, Aug 9, 2024 at 12:44 AM Justin Stitt <justinstitt@google.com> wrote:
>
> When @size is 0, the desired behavior is to allow unlimited bytes to be
> parsed. Currently, this relies on some intentional arithmetic overflow
> where --size gives us SIZE_MAX when size is 0.
>
> Explicitly spell out the desired behavior without relying on intentional
> overflow/underflow.

Hmm... but why? Overflow for the _unsigned_ types is okay. No?
Kees Cook Aug. 8, 2024, 11:11 p.m. UTC | #2
On Fri, Aug 09, 2024 at 01:07:21AM +0300, Andy Shevchenko wrote:
> On Fri, Aug 9, 2024 at 12:44 AM Justin Stitt <justinstitt@google.com> wrote:
> >
> > When @size is 0, the desired behavior is to allow unlimited bytes to be
> > parsed. Currently, this relies on some intentional arithmetic overflow
> > where --size gives us SIZE_MAX when size is 0.
> >
> > Explicitly spell out the desired behavior without relying on intentional
> > overflow/underflow.
> 
> Hmm... but why? Overflow for the _unsigned_ types is okay. No?

Yes, it's well defined, but in trying to find a place to start making a
meaningful impact on unexpected wrap-around, after discussions with
Linus and Peter Zijlstra, we're going taking a stab at defining size_t
as not expecting to wrap. Justin has been collecting false positive
fixes while working on the compiler side of this, and I had asked him to
send this one now since I think it additionally helps with readability.
Andy Shevchenko Aug. 9, 2024, 11:07 a.m. UTC | #3
On Fri, Aug 9, 2024 at 2:11 AM Kees Cook <kees@kernel.org> wrote:
>
> On Fri, Aug 09, 2024 at 01:07:21AM +0300, Andy Shevchenko wrote:
> > On Fri, Aug 9, 2024 at 12:44 AM Justin Stitt <justinstitt@google.com> wrote:
> > >
> > > When @size is 0, the desired behavior is to allow unlimited bytes to be
> > > parsed. Currently, this relies on some intentional arithmetic overflow
> > > where --size gives us SIZE_MAX when size is 0.
> > >
> > > Explicitly spell out the desired behavior without relying on intentional
> > > overflow/underflow.
> >
> > Hmm... but why? Overflow for the _unsigned_ types is okay. No?
>
> Yes, it's well defined, but in trying to find a place to start making a
> meaningful impact on unexpected wrap-around, after discussions with
> Linus and Peter Zijlstra, we're going taking a stab at defining size_t
> as not expecting to wrap. Justin has been collecting false positive
> fixes while working on the compiler side of this, and I had asked him to
> send this one now since I think it additionally helps with readability.

Okay, but the patch has an off-by-one error (which has no impact on
the behavior as it's close to unrealistic to have the SIZE_MAX array).
I prefer that patch can be reconsidered to keep original behaviour,
otherwise it might be not so clear why 0 is SIZE_MAX - 1 in _this_
case.
Justin Stitt Aug. 9, 2024, 11:53 p.m. UTC | #4
On Fri, Aug 09, 2024 at 02:07:57PM GMT, Andy Shevchenko wrote:
> On Fri, Aug 9, 2024 at 2:11 AM Kees Cook <kees@kernel.org> wrote:
> >
> > On Fri, Aug 09, 2024 at 01:07:21AM +0300, Andy Shevchenko wrote:
> > > On Fri, Aug 9, 2024 at 12:44 AM Justin Stitt <justinstitt@google.com> wrote:
> > > >
> > > > When @size is 0, the desired behavior is to allow unlimited bytes to be
> > > > parsed. Currently, this relies on some intentional arithmetic overflow
> > > > where --size gives us SIZE_MAX when size is 0.
> > > >
> > > > Explicitly spell out the desired behavior without relying on intentional
> > > > overflow/underflow.
> > >
> > > Hmm... but why? Overflow for the _unsigned_ types is okay. No?
> >
> > Yes, it's well defined, but in trying to find a place to start making a
> > meaningful impact on unexpected wrap-around, after discussions with
> > Linus and Peter Zijlstra, we're going taking a stab at defining size_t
> > as not expecting to wrap. Justin has been collecting false positive
> > fixes while working on the compiler side of this, and I had asked him to
> > send this one now since I think it additionally helps with readability.
> 
> Okay, but the patch has an off-by-one error (which has no impact on
> the behavior as it's close to unrealistic to have the SIZE_MAX array).
> I prefer that patch can be reconsidered to keep original behaviour,
> otherwise it might be not so clear why 0 is SIZE_MAX - 1 in _this_
> case.

Right, it is technically different but still functionally provides the
"unlimited" behavior.

But, we could  do this too:

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 69ba49b853c7..0f76b5288833 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -320,11 +320,13 @@ static bool unescape_special(char **src, char **dst)
 int string_unescape(char *src, char *dst, size_t size, unsigned int flags)
 {
 	char *out = dst;
+	bool unlimited = !size;
 
-	while (*src && --size) {
-		if (src[0] == '\\' && src[1] != '\0' && size > 1) {
+	while (*src && (unlimited || --size)) {
+		if (src[0] == '\\' && src[1] != '\0' &&
+		    (unlimited || size > 1)) {
 			src++;
-			size--;
+			size -= !unlimited;
 
 			if (flags & UNESCAPE_SPACE &&
 					unescape_space(&src, &out))



Really, I am fine with either.

Thanks
Justin
Andy Shevchenko Aug. 10, 2024, 3:39 p.m. UTC | #5
On Sat, Aug 10, 2024 at 2:53 AM Justin Stitt <justinstitt@google.com> wrote:
> On Fri, Aug 09, 2024 at 02:07:57PM GMT, Andy Shevchenko wrote:
> > On Fri, Aug 9, 2024 at 2:11 AM Kees Cook <kees@kernel.org> wrote:

...

> > Okay, but the patch has an off-by-one error (which has no impact on
> > the behavior as it's close to unrealistic to have the SIZE_MAX array).
> > I prefer that patch can be reconsidered to keep original behaviour,
> > otherwise it might be not so clear why 0 is SIZE_MAX - 1 in _this_
> > case.
>
> Right, it is technically different but still functionally provides the
> "unlimited" behavior.
>
> But, we could  do this too:

>  int string_unescape(char *src, char *dst, size_t size, unsigned int flags)
>  {
>         char *out = dst;
> +       bool unlimited = !size;
>
> -       while (*src && --size) {
> -               if (src[0] == '\\' && src[1] != '\0' && size > 1) {
> +       while (*src && (unlimited || --size)) {
> +               if (src[0] == '\\' && src[1] != '\0' &&
> +                   (unlimited || size > 1)) {
>                         src++;
> -                       size--;
> +                       size -= !unlimited;
>
>                         if (flags & UNESCAPE_SPACE &&
>                                         unescape_space(&src, &out))
>
> Really, I am fine with either.

This one is worse, I think.
Let's take time and not hurry up and think more about better approaches.
Andy Shevchenko Aug. 10, 2024, 3:41 p.m. UTC | #6
On Sat, Aug 10, 2024 at 6:39 PM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Sat, Aug 10, 2024 at 2:53 AM Justin Stitt <justinstitt@google.com> wrote:
> > On Fri, Aug 09, 2024 at 02:07:57PM GMT, Andy Shevchenko wrote:

...

> > But, we could  do this too:

> > -       while (*src && --size) {
> > -               if (src[0] == '\\' && src[1] != '\0' && size > 1) {

> This one is worse, I think.
> Let's take time and not hurry up and think more about better approaches.

Btw, have you played with the "do {} while (size);" approach?
Justin Stitt Aug. 12, 2024, 6:20 p.m. UTC | #7
On Sat, Aug 10, 2024 at 8:42 AM Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
>
> On Sat, Aug 10, 2024 at 6:39 PM Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > On Sat, Aug 10, 2024 at 2:53 AM Justin Stitt <justinstitt@google.com> wrote:
> > > On Fri, Aug 09, 2024 at 02:07:57PM GMT, Andy Shevchenko wrote:
>
> ...
>
> > > But, we could  do this too:
>
> > > -       while (*src && --size) {
> > > -               if (src[0] == '\\' && src[1] != '\0' && size > 1) {
>
> > This one is worse, I think.
> > Let's take time and not hurry up and think more about better approaches.
>
> Btw, have you played with the "do {} while (size);" approach?

Nope. Although, lots of ideas can work here. We can reorder the logic
any which way, really.

I am happy to send a patch following any idea -- so long as the
overflow-dependent code is gone :)

>
> --
> With Best Regards,
> Andy Shevchenko
Kees Cook Aug. 12, 2024, 6:43 p.m. UTC | #8
On Thu, 08 Aug 2024 14:43:56 -0700, Justin Stitt wrote:
> When @size is 0, the desired behavior is to allow unlimited bytes to be
> parsed. Currently, this relies on some intentional arithmetic overflow
> where --size gives us SIZE_MAX when size is 0.
> 
> Explicitly spell out the desired behavior without relying on intentional
> overflow/underflow.
> 
> [...]

Since we can't have allocations larger than INT_MAX, SIZE_MAX is
sufficiently large for this use-case, and the "off by one" introduce
here isn't meaningful. I prefer the improved readability and lack of
wrapping pre-decrement.

Applied to for-next/hardening, thanks!

[1/1] lib/string_helpers: rework overflow-dependent code
      https://git.kernel.org/kees/c/5d6b91b74ccd

Take care,
Justin Stitt Aug. 12, 2024, 8:05 p.m. UTC | #9
On Mon, Aug 12, 2024 at 11:44 AM Kees Cook <kees@kernel.org> wrote:
>
> On Thu, 08 Aug 2024 14:43:56 -0700, Justin Stitt wrote:
> > When @size is 0, the desired behavior is to allow unlimited bytes to be
> > parsed. Currently, this relies on some intentional arithmetic overflow
> > where --size gives us SIZE_MAX when size is 0.
> >
> > Explicitly spell out the desired behavior without relying on intentional
> > overflow/underflow.
> >
> > [...]
>
> Since we can't have allocations larger than INT_MAX, SIZE_MAX is
> sufficiently large for this use-case, and the "off by one" introduce
> here isn't meaningful. I prefer the improved readability and lack of
> wrapping pre-decrement.

Great! Thanks.

>
> Applied to for-next/hardening, thanks!
>
> [1/1] lib/string_helpers: rework overflow-dependent code
>       https://git.kernel.org/kees/c/5d6b91b74ccd
>
> Take care,
>
> --
> Kees Cook
>
diff mbox series

Patch

diff --git a/lib/string_helpers.c b/lib/string_helpers.c
index 69ba49b853c7..4f887aa62fa0 100644
--- a/lib/string_helpers.c
+++ b/lib/string_helpers.c
@@ -321,6 +321,9 @@  int string_unescape(char *src, char *dst, size_t size, unsigned int flags)
 {
 	char *out = dst;
 
+	if (!size)
+		size = SIZE_MAX;
+
 	while (*src && --size) {
 		if (src[0] == '\\' && src[1] != '\0' && size > 1) {
 			src++;