diff mbox series

parse-options: avoid arithmetic on pointer that's potentially NULL

Message ID 39a0b622-f725-9284-ea50-19cf4078209d@web.de (mailing list archive)
State New, archived
Headers show
Series parse-options: avoid arithmetic on pointer that's potentially NULL | expand

Commit Message

René Scharfe Nov. 12, 2019, 9:41 p.m. UTC
parse_options_dup() counts the number of elements in the given array
without the end marker, allocates enough memory to hold all of them plus
an end marker, then copies them and terminates the new array.  The
counting part is done by advancing a pointer through the array, and the
original pointer is reconstructed using pointer subtraction before the
copy operation.

The function is also prepared to handle a NULL pointer passed to it.
None of its callers do that currently, but this feature was used by
46e91b663b ("checkout: split part of it to new command 'restore'",
2019-04-25); it seems worth keeping.

It ends up doing arithmetic on that NULL pointer, though, which is
undefined in standard C, when it tries to calculate "NULL - 0".  Better
avoid doing that by remembering the originally given pointer value.

There is another issue, though.  memcpy(3) does not support NULL
pointers, even for empty arrays.  Use COPY_ARRAY instead, which does
support such empty arrays.  Its call is also shorter and safer by
inferring the element type automatically.

Coccinelle and contrib/coccinelle/array.cocci did not propose to use
COPY_ARRAY because of the pointer subtraction and because the source is
const -- the semantic patch cautiously only considers pointers and array
references of the same type.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
Patch formatted with --function-context for easier review.

 parse-options-cb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--
2.24.0

Comments

Junio C Hamano Nov. 13, 2019, 2:43 a.m. UTC | #1
René Scharfe <l.s.r@web.de> writes:

> parse_options_dup() counts the number of elements in the given array
> without the end marker, allocates enough memory to hold all of them plus
> an end marker, then copies them and terminates the new array.  The
> counting part is done by advancing a pointer through the array, and the
> original pointer is reconstructed using pointer subtraction before the
> copy operation.
>
> The function is also prepared to handle a NULL pointer passed to it.
> None of its callers do that currently, but this feature was used by
> 46e91b663b ("checkout: split part of it to new command 'restore'",
> 2019-04-25); it seems worth keeping.

... meaning it was used but since then we rewrote the user and there
is no caller that uses it?

Gee, how are you finding such an instance of use?  I am quite
impressed.

> It ends up doing arithmetic on that NULL pointer, though, which is
> undefined in standard C, when it tries to calculate "NULL - 0".  Better
> avoid doing that by remembering the originally given pointer value.
>
> There is another issue, though.  memcpy(3) does not support NULL
> pointers, even for empty arrays.  Use COPY_ARRAY instead, which does
> support such empty arrays.  Its call is also shorter and safer by
> inferring the element type automatically.

Nicely explained.

>  struct option *parse_options_dup(const struct option *o)
>  {
> +	const struct option *orig = o;

If I were doing this patch, I'd give the incoming parameter a more
meaningful name and the temporary/local variable that is used to
scan would get/keep the shorter name.  IOW

	struct parse_options_dup(const struct option *original)
	{
		const struct option *o = original;
		struct option *copied;
		int nr = 0;

		... loop to count ...
		ALLOC_ARRAY(copied, nr + 1);
		COPY_ARRAY(copied, original, nr);
		...

But your patch is a strict improvement from the current code.  Will
queue without any tweak.

Thanks.

>  	struct option *opts;
>  	int nr = 0;
>
>  	while (o && o->type != OPTION_END) {
>  		nr++;
>  		o++;
>  	}
>
>  	ALLOC_ARRAY(opts, nr + 1);
> -	memcpy(opts, o - nr, sizeof(*o) * nr);
> +	COPY_ARRAY(opts, orig, nr);
>  	memset(opts + nr, 0, sizeof(*opts));
>  	opts[nr].type = OPTION_END;
>  	return opts;
>  }
> --
> 2.24.0
René Scharfe Nov. 13, 2019, 4:48 p.m. UTC | #2
Am 13.11.19 um 03:43 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> The function is also prepared to handle a NULL pointer passed to it.
>> None of its callers do that currently, but this feature was used by
>> 46e91b663b ("checkout: split part of it to new command 'restore'",
>> 2019-04-25); it seems worth keeping.
>
> ... meaning it was used but since then we rewrote the user and there
> is no caller that uses it?
>
> Gee, how are you finding such an instance of use?  I am quite
> impressed.

I wondered why the function checks for NULL, noticed that both users are
in builtin/checkout.c and pass a pointer to a struct, so I ran
"git log -p builtin/checkout.c" and searched for "= NULL" in its output
because I suspected or vaguely remembered that the options were added
one by one by a patch series.  Sloppy, but good enough in this case..

>>  struct option *parse_options_dup(const struct option *o)
>>  {
>> +	const struct option *orig = o;
>
> If I were doing this patch, I'd give the incoming parameter a more
> meaningful name and the temporary/local variable that is used to
> scan would get/keep the shorter name.  IOW
>
> 	struct parse_options_dup(const struct option *original)
> 	{
> 		const struct option *o = original;
> 		struct option *copied;
> 		int nr = 0;
>
> 		... loop to count ...
> 		ALLOC_ARRAY(copied, nr + 1);
> 		COPY_ARRAY(copied, original, nr);
> 		...

Right.  And I just rediscovered an unsent patch series from last summer
that refactors this function further.  It fixed the NULL issue as well,
but only by accident.  Will rethink/reintegrate these patches and send
them later, but here's one that's ready and fits the COPY_ARRAY theme:

-- >8 --
Subject: [PATCH] parse-options: use COPY_ARRAY in parse_options_concat()

Use COPY_ARRAY to copy whole arrays instead of iterating through their
elements.  That's shorter, simpler and bit more efficient.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options-cb.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/parse-options-cb.c b/parse-options-cb.c
index c2062ae742..8b443938b8 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -188,10 +188,8 @@ struct option *parse_options_concat(struct option *a, struct option *b)
 		b_len++;

 	ALLOC_ARRAY(ret, st_add3(a_len, b_len, 1));
-	for (i = 0; i < a_len; i++)
-		ret[i] = a[i];
-	for (i = 0; i < b_len; i++)
-		ret[a_len + i] = b[i];
+	COPY_ARRAY(ret, a, a_len);
+	COPY_ARRAY(ret + a_len, b, b_len);
 	ret[a_len + b_len] = b[b_len]; /* final OPTION_END */

 	return ret;
--
2.24.0
Martin Ågren Nov. 13, 2019, 6:25 p.m. UTC | #3
On Wed, 13 Nov 2019 at 18:19, René Scharfe <l.s.r@web.de> wrote:
>         ALLOC_ARRAY(ret, st_add3(a_len, b_len, 1));
> -       for (i = 0; i < a_len; i++)
> -               ret[i] = a[i];
> -       for (i = 0; i < b_len; i++)
> -               ret[a_len + i] = b[i];
> +       COPY_ARRAY(ret, a, a_len);
> +       COPY_ARRAY(ret + a_len, b, b_len);
>         ret[a_len + b_len] = b[b_len]; /* final OPTION_END */

Maybe include that last one in the COPY_ARRAY with something like this
on top?

-       COPY_ARRAY(ret + a_len, b, b_len);
-       ret[a_len + b_len] = b[b_len]; /* final OPTION_END */
+       /* 1 more to include the final OPTION_END */
+       COPY_ARRAY(ret + a_len, b, st_add(b_len, 1));

Or maybe even something like the below (directly on top of your patch)?

(Plus, you could drop `i` entirely, but I'm not sure that's worth
golfing on.)

Martin

diff --git a/parse-options-cb.c b/parse-options-cb.c
index 2bde78b726..11196cfb96 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -185,11 +185,11 @@ struct option *parse_options_concat(struct option *a, struct option *b)
 		a_len++;
 	for (i = 0; b[i].type != OPTION_END; i++)
 		b_len++;
+	b_len++; /* final OPTION_END */
 
-	ALLOC_ARRAY(ret, st_add3(a_len, b_len, 1));
+	ALLOC_ARRAY(ret, st_add(a_len, b_len));
 	COPY_ARRAY(ret, a, a_len);
 	COPY_ARRAY(ret + a_len, b, b_len);
-	ret[a_len + b_len] = b[b_len]; /* final OPTION_END */
 
 	return ret;
 }
René Scharfe Nov. 14, 2019, 4:01 p.m. UTC | #4
Am 13.11.19 um 19:25 schrieb Martin Ågren:
> On Wed, 13 Nov 2019 at 18:19, René Scharfe <l.s.r@web.de> wrote:
>>         ALLOC_ARRAY(ret, st_add3(a_len, b_len, 1));
>> -       for (i = 0; i < a_len; i++)
>> -               ret[i] = a[i];
>> -       for (i = 0; i < b_len; i++)
>> -               ret[a_len + i] = b[i];
>> +       COPY_ARRAY(ret, a, a_len);
>> +       COPY_ARRAY(ret + a_len, b, b_len);
>>         ret[a_len + b_len] = b[b_len]; /* final OPTION_END */
>
> Maybe include that last one in the COPY_ARRAY with something like this
> on top?
>
> -       COPY_ARRAY(ret + a_len, b, b_len);
> -       ret[a_len + b_len] = b[b_len]; /* final OPTION_END */
> +       /* 1 more to include the final OPTION_END */
> +       COPY_ARRAY(ret + a_len, b, st_add(b_len, 1));

I'd rather move towards allowing a and b to be NULL, for consistency and
safety.  Will send patches eventually, but preferably only once the
original patch hits master.  These follow-ups that I have in mind aren't
terribly urgent.

René
diff mbox series

Patch

diff --git a/parse-options-cb.c b/parse-options-cb.c
index 1240a8514e..c2062ae742 100644
--- a/parse-options-cb.c
+++ b/parse-options-cb.c
@@ -161,17 +161,18 @@  int parse_opt_tertiary(const struct option *opt, const char *arg, int unset)

 struct option *parse_options_dup(const struct option *o)
 {
+	const struct option *orig = o;
 	struct option *opts;
 	int nr = 0;

 	while (o && o->type != OPTION_END) {
 		nr++;
 		o++;
 	}

 	ALLOC_ARRAY(opts, nr + 1);
-	memcpy(opts, o - nr, sizeof(*o) * nr);
+	COPY_ARRAY(opts, orig, nr);
 	memset(opts + nr, 0, sizeof(*opts));
 	opts[nr].type = OPTION_END;
 	return opts;
 }