[2/2] sha1-name: check for overflow of N in "foo^N" and "foo~N"

Message ID 2be6e3ee-209e-9cd1-eb43-284f9a8462b3@web.de
State New
Series
  • [1/2] rev-parse: demonstrate overflow of N for "foo^N" and "foo~N"

Commit Message

René Scharfe Sept. 15, 2019, 12:10 p.m. UTC
Reject values that don't fit into an int, as get_parent() and
get_nth_ancestor() cannot handle them.  That's better than potentially
returning a random object.

If this restriction turns out to be too tight then we can switch to a
wider data type, but we'd still have to check for overflow.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 sha1-name.c                    | 15 ++++++++++++---
 t/t1506-rev-parse-diagnosis.sh |  4 ++--
 2 files changed, 14 insertions(+), 5 deletions(-)

--
2.23.0

Comments

brian m. carlson Sept. 15, 2019, 3:15 p.m. UTC | #1
On 2019-09-15 at 12:10:28, René Scharfe wrote:
> Reject values that don't fit into an int, as get_parent() and
> get_nth_ancestor() cannot handle them.  That's better than potentially
> returning a random object.
> 
> If this restriction turns out to be too tight then we can switch to a
> wider data type, but we'd still have to check for overflow.

Certainly we want Git to perform as well as possible on large
repositories, but I doubt it will scale to more than 2 billion
revisions, even with significant effort.  I think this restriction
should be fine.

> diff --git a/sha1-name.c b/sha1-name.c
> index c665e3f96d..7a047e9e2b 100644
> --- a/sha1-name.c
> +++ b/sha1-name.c
> @@ -1160,13 +1160,22 @@ static enum get_oid_result get_oid_1(struct repository *r,
>  	}
> 
>  	if (has_suffix) {
> -		int num = 0;
> +		unsigned int num = 0;
>  		int len1 = cp - name;
>  		cp++;
> -		while (cp < name + len)
> -			num = num * 10 + *cp++ - '0';
> +		while (cp < name + len) {
> +			unsigned int digit = *cp++ - '0';
> +			if (unsigned_mult_overflows(num, 10))
> +				return MISSING_OBJECT;
> +			num *= 10;
> +			if (unsigned_add_overflows(num, digit))
> +				return MISSING_OBJECT;

I was worried whether these functions only handled size_t or if they
also handle unsigned int, but I checked and they seem to be fine for any
unsigned type.
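
To illustrate why such checks can work for any unsigned type, here is a
minimal sketch of type-generic overflow macros.  The sketch_ names are
hypothetical stand-ins; Git's real helpers live in git-compat-util.h and
may differ in detail.  The idea is to derive the maximum value from the
size of the first operand and to compare before doing the arithmetic.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for Git's helpers; the sketch_ names are not Git's. */
#define sketch_bitsizeof(x) (CHAR_BIT * sizeof(x))

/* Largest value representable in the unsigned type of expression a. */
#define sketch_max_unsigned_of(a) \
	(UINTMAX_MAX >> (sketch_bitsizeof(uintmax_t) - sketch_bitsizeof(a)))

/* True if a + b would not fit in the type of a. */
#define sketch_add_overflows(a, b) \
	((b) > sketch_max_unsigned_of(a) - (a))

/* True if a * b would not fit in the type of a (a == 0 never overflows). */
#define sketch_mult_overflows(a, b) \
	((a) && (b) > sketch_max_unsigned_of(a) / (a))
```

Because the limit comes from sizeof(a), the same macros apply to an
unsigned char, an unsigned int or a size_t.  They assume that the first
operand has an unsigned type.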

> +			num += digit;
> +		}
>  		if (!num && len1 == len - 1)
>  			num = 1;
> +		else if (num > INT_MAX)
> +			return MISSING_OBJECT;
>  		if (has_suffix == '^')
>  			return get_parent(r, name, len1, oid, num);
>  		/* else if (has_suffix == '~') -- goes without saying */

This approach seems reasonable.  I must admit some curiosity as to how
you discovered this issue, though.  Did you have a cat assisting you in
typing revisions?

René Scharfe Sept. 15, 2019, 4:12 p.m. UTC | #2
Am 15.09.19 um 17:15 schrieb brian m. carlson:
> This approach seems reasonable.  I must admit some curiosity as to how
> you discovered this issue, though.  Did you have a cat assisting you in
> typing revisions?

Found it by reading the code, but I'm not sure anymore what I was
actually looking for.

Would a fuzzer (or a cat) be able to catch that?  The function is
happily eating extra digits -- it's not crashing for me.

René
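
A small sketch of why extra digits do not crash: the accumulation simply
wraps around and yields some other number.  Both helpers below are
hypothetical and not part of Git.  They use uint32_t so that the
wrap-around is well defined, whereas the original loop used a signed int,
for which overflow is undefined behavior.

```c
#include <stdint.h>

/*
 * Hypothetical helper: accumulate decimal digits without any overflow
 * check, as the old loop did.  The caller must pass only ASCII digits.
 */
static uint32_t sketch_parse_wrapping(const char *digits)
{
	uint32_t num = 0;
	while (*digits)
		num = num * 10 + (uint32_t)(*digits++ - '0');
	return num;
}

/*
 * Hypothetical checked variant following the patch's logic: returns 0 and
 * stores the value in *out, or returns -1 if the value would not fit
 * into a 32-bit signed int.
 */
static int sketch_parse_checked(const char *digits, uint32_t *out)
{
	uint32_t num = 0;
	while (*digits) {
		uint32_t digit = (uint32_t)(*digits++ - '0');
		if (num > UINT32_MAX / 10)
			return -1; /* num * 10 would wrap */
		num *= 10;
		if (digit > UINT32_MAX - num)
			return -1; /* num + digit would wrap */
		num += digit;
	}
	if (num > INT32_MAX)
		return -1;
	*out = num;
	return 0;
}
```

With 32-bit arithmetic, a 1 followed by 32 zeros wraps to 0, because
10^32 = 2^32 * 5^32, and "4294967297" (2^32 + 1) wraps to 1.  The
unchecked loop would therefore hand an ordinary small number to its
caller, while the checked variant returns -1 for both inputs.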

Patch

diff --git a/sha1-name.c b/sha1-name.c
index c665e3f96d..7a047e9e2b 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1160,13 +1160,22 @@  static enum get_oid_result get_oid_1(struct repository *r,
 	}

 	if (has_suffix) {
-		int num = 0;
+		unsigned int num = 0;
 		int len1 = cp - name;
 		cp++;
-		while (cp < name + len)
-			num = num * 10 + *cp++ - '0';
+		while (cp < name + len) {
+			unsigned int digit = *cp++ - '0';
+			if (unsigned_mult_overflows(num, 10))
+				return MISSING_OBJECT;
+			num *= 10;
+			if (unsigned_add_overflows(num, digit))
+				return MISSING_OBJECT;
+			num += digit;
+		}
 		if (!num && len1 == len - 1)
 			num = 1;
+		else if (num > INT_MAX)
+			return MISSING_OBJECT;
 		if (has_suffix == '^')
 			return get_parent(r, name, len1, oid, num);
 		/* else if (has_suffix == '~') -- goes without saying */
diff --git a/t/t1506-rev-parse-diagnosis.sh b/t/t1506-rev-parse-diagnosis.sh
index 5c4df47401..6a938b205b 100755
--- a/t/t1506-rev-parse-diagnosis.sh
+++ b/t/t1506-rev-parse-diagnosis.sh
@@ -215,11 +215,11 @@  test_expect_success 'arg before dashdash must be a revision (ambiguous)' '
 	test_cmp expect actual
 '

-test_expect_failure 'reject Nth parent if N is too high' '
+test_expect_success 'reject Nth parent if N is too high' '
 	test_must_fail git rev-parse HEAD^100000000000000000000000000000000
 '

-test_expect_failure 'reject Nth ancestor if N is too high' '
+test_expect_success 'reject Nth ancestor if N is too high' '
 	test_must_fail git rev-parse HEAD~100000000000000000000000000000000
 '