[3/3] git_parse_signed(): avoid integer overflow

Message ID	f058f391c3821b341a15fda9ae9fd20dda6a0494.1666359915.git.gitgitgadget@gmail.com (mailing list archive)
State	Accepted
Commit	14770cf0de218cc373e7d286b864f526e5ea2840
Headers	show Return-Path: <git-owner@kernel.org> Message-Id: <f058f391c3821b341a15fda9ae9fd20dda6a0494.1666359915.git.gitgitgadget@gmail.com> In-Reply-To: <pull.1389.git.1666359915.gitgitgadget@gmail.com> References: <pull.1389.git.1666359915.gitgitgadget@gmail.com> Date: Fri, 21 Oct 2022 13:45:14 +0000 Subject: [PATCH 3/3] git_parse_signed(): avoid integer overflow Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: =?utf-8?b?UmVuw6k=?= Scharfe <l.s.r@web.de>, Phillip Wood <phillip.wood@dunelm.org.uk>, Phillip Wood <phillip.wood@dunelm.org.uk> Precedence: bulk From: Phillip Wood <phillip.wood@dunelm.org.uk>
Series	a few config integer parsing fixes \| expand [0/3] a few config integer parsing fixes [1/3] git_parse_unsigned: reject negative values [2/3] config: require at least one digit when parsing numbers [3/3] git_parse_signed(): avoid integer overflow

Phillip Wood Oct. 21, 2022, 1:45 p.m. UTC

From: Phillip Wood <phillip.wood@dunelm.org.uk>

git_parse_signed() checks that the absolute value of the parsed string
is less than or equal to a caller supplied maximum value. When
calculating the absolute value there is a integer overflow if `val ==
INTMAX_MIN`. To fix this avoid negating `val` when it is negative by
having separate overflow checks for positive and negative values.

An alternative would be to special case INTMAX_MIN before negating `val`
as it is always out of range. That would enable us to keep the existing
code but I'm not sure that the current two-stage check is any clearer
than the new version.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 config.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

Junio C Hamano Oct. 21, 2022, 6:31 p.m. UTC | #1

"Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> git_parse_signed() checks that the absolute value of the parsed string
> is less than or equal to a caller supplied maximum value. When
> calculating the absolute value there is a integer overflow if `val ==
> INTMAX_MIN`.

It is a problem that is worth looking into.

> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>  config.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/config.c b/config.c
> index b7fb68026d8..aad3e00341d 100644
> --- a/config.c
> +++ b/config.c
> @@ -1160,8 +1160,10 @@ static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
>  	if (value && *value) {
>  		char *end;
>  		intmax_t val;
> -		uintmax_t uval;
> -		uintmax_t factor;
> +		intmax_t factor;
> +
> +		if (max < 0)
> +			BUG("max must be a positive integer");

In parse_signed(), would we expect to accept end-user input that is
a negative integer?  We must.  Otherwise we would not be calling a
"signed" parser.  Now, are there cases where the valid value range
is bounded by a negative integer at the top?  No current callers may
pass such a value, but is it reasonable to add such a new constraints
to an existing API function?

René Scharfe Oct. 22, 2022, 8:09 a.m. UTC | #2

Am 21.10.22 um 20:31 schrieb Junio C Hamano:
> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> git_parse_signed() checks that the absolute value of the parsed string
>> is less than or equal to a caller supplied maximum value. When
>> calculating the absolute value there is a integer overflow if `val ==
>> INTMAX_MIN`.
>
> It is a problem that is worth looking into.
>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>>  config.c | 11 ++++++-----
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/config.c b/config.c
>> index b7fb68026d8..aad3e00341d 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -1160,8 +1160,10 @@ static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
>>  	if (value && *value) {
>>  		char *end;
>>  		intmax_t val;
>> -		uintmax_t uval;
>> -		uintmax_t factor;
>> +		intmax_t factor;
>> +
>> +		if (max < 0)
>> +			BUG("max must be a positive integer");
>
> In parse_signed(), would we expect to accept end-user input that is
> a negative integer?  We must.  Otherwise we would not be calling a
> "signed" parser.  Now, are there cases where the valid value range
> is bounded by a negative integer at the top?  No current callers may
> pass such a value, but is it reasonable to add such a new constraints
> to an existing API function?

Hmm, if minimum and maximum are not symmetric, then we need to supply
both, don't we?

--- >8 ---
Subject: [PATCH] config: let git_parse_signed() check minimum

git_parse_signed() checks that the absolute value of the parsed string
is less than or equal to a caller supplied maximum value.  When
calculating the absolute value there is a integer overflow if `val ==
INTMAX_MIN`.

Avoid overflow during sign inversion by supplying the minimum value to
the function as well.  Make `factor` signed to avoid promoting the
division results in the limit check line to unsigned, but check whether
it's positive.

Add a new macro, minimum_signed_value_of_type, and use it in the callers
of git_parse_signed() to provide the newly required minimum value.  It
calculates the minimum for a given type using division instead of right
shift because the latter is implementation-defined for signed types,
and we need an arithmetic right shift here, not a logical one.

Original-patch-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
git_config_get_expiry_in_days() could use git_parse_int(), but that's
a different topic.

 config.c          | 28 +++++++++++++++++-----------
 git-compat-util.h |  2 ++
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index cbb5a3bab7..7cf196dc84 100644
--- a/config.c
+++ b/config.c
@@ -1155,26 +1155,24 @@ static uintmax_t get_unit_factor(const char *end)
 	return 0;
 }

-static int git_parse_signed(const char *value, intmax_t *ret, intmax_t max)
+static int git_parse_signed(const char *value, intmax_t *ret,
+			    intmax_t min, intmax_t max)
 {
 	if (value && *value) {
 		char *end;
 		intmax_t val;
-		uintmax_t uval;
-		uintmax_t factor;
+		intmax_t factor;

 		errno = 0;
 		val = strtoimax(value, &end, 0);
 		if (errno == ERANGE)
 			return 0;
 		factor = get_unit_factor(end);
-		if (!factor) {
+		if (factor < 1) {
 			errno = EINVAL;
 			return 0;
 		}
-		uval = val < 0 ? -val : val;
-		if (unsigned_mult_overflows(factor, uval) ||
-		    factor * uval > max) {
+		if (val < min / factor || val > max / factor) {
 			errno = ERANGE;
 			return 0;
 		}
@@ -1218,7 +1216,9 @@ static int git_parse_unsigned(const char *value, uintmax_t *ret, uintmax_t max)
 static int git_parse_int(const char *value, int *ret)
 {
 	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int)))
+	if (!git_parse_signed(value, &tmp,
+			      minimum_signed_value_of_type(int),
+			      maximum_signed_value_of_type(int)))
 		return 0;
 	*ret = tmp;
 	return 1;
@@ -1227,7 +1227,9 @@ static int git_parse_int(const char *value, int *ret)
 static int git_parse_int64(const char *value, int64_t *ret)
 {
 	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(int64_t)))
+	if (!git_parse_signed(value, &tmp,
+			      minimum_signed_value_of_type(int64_t),
+			      maximum_signed_value_of_type(int64_t)))
 		return 0;
 	*ret = tmp;
 	return 1;
@@ -1245,7 +1247,9 @@ int git_parse_ulong(const char *value, unsigned long *ret)
 int git_parse_ssize_t(const char *value, ssize_t *ret)
 {
 	intmax_t tmp;
-	if (!git_parse_signed(value, &tmp, maximum_signed_value_of_type(ssize_t)))
+	if (!git_parse_signed(value, &tmp,
+			      minimum_signed_value_of_type(ssize_t),
+			      maximum_signed_value_of_type(ssize_t)))
 		return 0;
 	*ret = tmp;
 	return 1;
@@ -2751,7 +2755,9 @@ int git_config_get_expiry_in_days(const char *key, timestamp_t *expiry, timestam
 	if (git_config_get_string_tmp(key, &expiry_string))
 		return 1; /* no such thing */

-	if (git_parse_signed(expiry_string, &days, maximum_signed_value_of_type(int))) {
+	if (git_parse_signed(expiry_string, &days,
+			     minimum_signed_value_of_type(int),
+			     maximum_signed_value_of_type(int))) {
 		const int scale = 86400;
 		*expiry = now - days * scale;
 		return 0;
diff --git a/git-compat-util.h b/git-compat-util.h
index ea53ea4a78..35425c373b 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -112,6 +112,8 @@ struct strbuf;

 #define bitsizeof(x)  (CHAR_BIT * sizeof(x))

+#define minimum_signed_value_of_type(a) \
+    (INTMAX_MIN / ((intmax_t)1 << (bitsizeof(intmax_t) - bitsizeof(a))))
 #define maximum_signed_value_of_type(a) \
     (INTMAX_MAX >> (bitsizeof(intmax_t) - bitsizeof(a)))

--
2.38.1

Junio C Hamano Oct. 22, 2022, 4:51 p.m. UTC | #3

René Scharfe <l.s.r@web.de> writes:

>>> +		if (max < 0)
>>> +			BUG("max must be a positive integer");
>>
>> In parse_signed(), would we expect to accept end-user input that is
>> a negative integer?  We must.  Otherwise we would not be calling a
>> "signed" parser.  Now, are there cases where the valid value range
>> is bounded by a negative integer at the top?  No current callers may
>> pass such a value, but is it reasonable to add such a new constraints
>> to an existing API function?
>
> Hmm, if minimum and maximum are not symmetric, then we need to supply
> both, don't we?

Ah, thanks for injecting doze of sanity---I totally missed that the
bound was about the absolute value, so we can say "this is signed,
and the allowed values are (-3, -2, -1, 0, 1, 2, 3).  If so, then the
"reject negative max" in the posted patch is not a problem as I said
above.  I somehow thought that giving -1 as "max" would allow callers
to say "non-negative numbers are not allowed".  But that is not what
is going on.

Allowing callers to specify both lower and uppoer bounds so that
they can say "the allowed values are (-1, 0, 1, 2, 3)", while it
might make it more useful, is a separate new feature development and
outside the scope of "let's tighten the parsing of end user input"
Phillip has here.

Sorry about the thinko, and thanks for a new and interesting
tangent.

René Scharfe Oct. 23, 2022, 5:57 a.m. UTC | #4

Am 22.10.22 um 18:51 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>>>> +		if (max < 0)
>>>> +			BUG("max must be a positive integer");
>>>
>>> In parse_signed(), would we expect to accept end-user input that is
>>> a negative integer?  We must.  Otherwise we would not be calling a
>>> "signed" parser.  Now, are there cases where the valid value range
>>> is bounded by a negative integer at the top?  No current callers may
>>> pass such a value, but is it reasonable to add such a new constraints
>>> to an existing API function?
>>
>> Hmm, if minimum and maximum are not symmetric, then we need to supply
>> both, don't we?
>
> Ah, thanks for injecting doze of sanity---I totally missed that the
> bound was about the absolute value, so we can say "this is signed,
> and the allowed values are (-3, -2, -1, 0, 1, 2, 3).  If so, then the
> "reject negative max" in the posted patch is not a problem as I said
> above.  I somehow thought that giving -1 as "max" would allow callers
> to say "non-negative numbers are not allowed".  But that is not what
> is going on.

Right, currently the value of `max` is used to check the absolute value,
i.e. it imposes a limit in both the positive and negative direction.

> Allowing callers to specify both lower and uppoer bounds so that
> they can say "the allowed values are (-1, 0, 1, 2, 3)", while it
> might make it more useful, is a separate new feature development and
> outside the scope of "let's tighten the parsing of end user input"
> Phillip has here.

Allowing arbitrary limits in both directions might be a new feature, but
it's required if we want to support the full range of values.  E.g. on
my system INT_MAX is 2147483647 and INT_MIN is -2147483647-1.  Currently
git_parse_int() rejects INT_MIN as out of range.

In my eyes the assumption that a single limit can be used to check both
directions of the signed number line is flawed and caused the undefined
behavior.  Dropping it avoids tricky calculations that try to infer the
lower limit somehow and opens the full range of values to us.

That said, I'm not sure how useful the values INT_MIN, INT64_MIN and
SSIZE_MIN (which unlike SSIZE_MAX is not defined by POSIX [*]) actually
are. But doing the checks properly requires separate min and max values.

René

[*] Perhaps git_parse_ssize_t() should reject values less than -1; only
that single negative number is needed to represent errors or unlimited.

Phillip Wood Oct. 25, 2022, 10 a.m. UTC | #5

On 23/10/2022 06:57, René Scharfe wrote:
> Am 22.10.22 um 18:51 schrieb Junio C Hamano:
>> René Scharfe <l.s.r@web.de> writes:
>>
>>>>> +		if (max < 0)
>>>>> +			BUG("max must be a positive integer");
>>>>
>>>> In parse_signed(), would we expect to accept end-user input that is
>>>> a negative integer?  We must.  Otherwise we would not be calling a
>>>> "signed" parser.  Now, are there cases where the valid value range
>>>> is bounded by a negative integer at the top?  No current callers may
>>>> pass such a value, but is it reasonable to add such a new constraints
>>>> to an existing API function?
>>>
>>> Hmm, if minimum and maximum are not symmetric, then we need to supply
>>> both, don't we?
>>
>> Ah, thanks for injecting doze of sanity---I totally missed that the
>> bound was about the absolute value, so we can say "this is signed,
>> and the allowed values are (-3, -2, -1, 0, 1, 2, 3).  If so, then the
>> "reject negative max" in the posted patch is not a problem as I said
>> above.  I somehow thought that giving -1 as "max" would allow callers
>> to say "non-negative numbers are not allowed".  But that is not what
>> is going on.
> 
> Right, currently the value of `max` is used to check the absolute value,
> i.e. it imposes a limit in both the positive and negative direction.
> 
>> Allowing callers to specify both lower and uppoer bounds so that
>> they can say "the allowed values are (-1, 0, 1, 2, 3)", while it
>> might make it more useful, is a separate new feature development and
>> outside the scope of "let's tighten the parsing of end user input"
>> Phillip has here.
> 
> Allowing arbitrary limits in both directions might be a new feature, but
> it's required if we want to support the full range of values.  E.g. on
> my system INT_MAX is 2147483647 and INT_MIN is -2147483647-1.  Currently
> git_parse_int() rejects INT_MIN as out of range.
> 
> In my eyes the assumption that a single limit can be used to check both
> directions of the signed number line is flawed and caused the undefined
> behavior.  Dropping it avoids tricky calculations that try to infer the
> lower limit somehow and opens the full range of values to us.
> 
> That said, I'm not sure how useful the values INT_MIN, INT64_MIN and
> SSIZE_MIN (which unlike SSIZE_MAX is not defined by POSIX [*]) actually
> are. But doing the checks properly requires separate min and max values.

I'm happy to go either way, while I agree passing separate limits to 
allow INT_MIN is technically correct I'm not sure anyone has complained 
that the current code is too restrictive.

Best Wishes

Phillip

> René
> 
> 
> [*] Perhaps git_parse_ssize_t() should reject values less than -1; only
> that single negative number is needed to represent errors or unlimited.

René Scharfe Oct. 26, 2022, 11:01 a.m. UTC | #6

Am 25.10.22 um 12:00 schrieb Phillip Wood:
> On 23/10/2022 06:57, René Scharfe wrote:
>>
>> That said, I'm not sure how useful the values INT_MIN, INT64_MIN
>> and SSIZE_MIN (which unlike SSIZE_MAX is not defined by POSIX [*])
>> actually are. But doing the checks properly requires separate min
>> and max values.
>
> I'm happy to go either way, while I agree passing separate limits to
> allow INT_MIN is technically correct I'm not sure anyone has
> complained that the current code is too restrictive.
Right.  Having separate patches for the different aspects (your fix,
simplification due to adding a min parameter, range extension made
possible by that) is probably best to discuss the merits of each.

René

[3/3] git_parse_signed(): avoid integer overflow

Commit Message

Comments

Patch