diff mbox series

[03/14] tree-diff: clear parent array in path_appendnew()

Message ID 20250109083310.GC2748836@coredump.intra.peff.net (mailing list archive)
State New
Headers show
Series combine-diff cleanups | expand

Commit Message

Jeff King Jan. 9, 2025, 8:33 a.m. UTC
All of the other functions which allocate a combine_diff_path struct
zero out the parent array, but this code path does not. There's no bug,
since our caller will fill in most of the fields. But leaving the unused
fields (like combine_diff_parent.path) uninitialized makes working with
the struct more error-prone than it needs to be.

Let's just zero the parent field to be consistent with the
combine_diff_path_new() allocator.

Signed-off-by: Jeff King <peff@peff.net>
---
 tree-diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Jan. 9, 2025, 6:28 p.m. UTC | #1
Jeff King <peff@peff.net> writes:

> All of the other functions which allocate a combine_diff_path struct
> zero out the parent array, but this code path does not. There's no bug,
> since our caller will fill in most of the fields. But leaving the unused
> fields (like combine_diff_parent.path) uninitialized makes working with
> the struct more error-prone than it needs to be.

OK.  We however will still not use the array at all when we do not
need it, so it would be between accessing uninitialized bytes vs
accessing 0-bytes by mistake?  With my devil's advocate hat on, I
wonder if this would lead to more sloppy users saying "I am not
following the pointer; I am merely stopping when I see a NULL
pointer at the end of the array" or something silly like that
without checking the validity of the array itself (which presumably
can be inferred by inspecting some other member in the containing
struct, right?)".

> Let's just zero the parent field to be consistent with the
> combine_diff_path_new() allocator.

But I like the "let's be consistent" reasoning, so I wouldn't
complain ;-)

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  tree-diff.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tree-diff.c b/tree-diff.c
> index d9237ffd9b..24f7b5912c 100644
> --- a/tree-diff.c
> +++ b/tree-diff.c
> @@ -151,8 +151,6 @@ static int emit_diff_first_parent_only(struct diff_options *opt, struct combine_
>   *	process(p);
>   *	p = pprev;
>   *	; don't forget to free tail->next in the end
> - *
> - * p->parent[] remains uninitialized.
>   */
>  static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
>  	int nparent, const struct strbuf *base, const char *path, int pathlen,
> @@ -187,6 +185,8 @@ static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
>  	p->mode = mode;
>  	oidcpy(&p->oid, oid ? oid : null_oid());
>  
> +	memset(p->parent, 0, sizeof(p->parent[0]) * nparent);
> +
>  	return p;
>  }
Jeff King Jan. 10, 2025, 10:54 a.m. UTC | #2
On Thu, Jan 09, 2025 at 10:28:05AM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > All of the other functions which allocate a combine_diff_path struct
> > zero out the parent array, but this code path does not. There's no bug,
> > since our caller will fill in most of the fields. But leaving the unused
> > fields (like combine_diff_parent.path) uninitialized makes working with
> > the struct more error-prone than it needs to be.
> 
> OK.  We however will still not use the array at all when we do not
> need it, so it would be between accessing uninitialized bytes vs
> accessing 0-bytes by mistake?  With my devil's advocate hat on, I
> wonder if this would lead to more sloppy users saying "I am not
> following the pointer; I am merely stopping when I see a NULL
> pointer at the end of the array" or something silly like that
> without checking the validity of the array itself (which presumably
> can be inferred by inspecting some other member in the containing
> struct, right?)".

Yes, code may be equally wrong to look at uninitialized versus zero
bytes, depending on what it's doing. I don't think "stop when you see
NULL" is a danger here; this is an array of structs, one of which now
happens to be NULL (rather than an array of char pointers, which might
imply that NULL is the end).

Some of that sloppiness already exists. For instance, before my series,
check out intersect_paths(). If we are removing an element from the
list, we clean it up like this:

	for (j = 0; j < num_parent; j++)
		if (combined_all_paths &&
		    filename_changed(p->parent[j].status))
		strbuf_release(&p->parent[j].path);

but if we allocated for 3 parents and have only gotten to the second
pass, all of parent[2] will never have been filled in. We zero
initialize the parents in that function, so there's no memory error. But
it is relying on the fact that filename_changed() will reject a zero
status to avoid calling strbuf_release() on a zero'd strbuf (which
incidentally also works, but violates the strbuf API).

Now in that case we are zero-ing, so it is not one of the uninitialized
cases that Wink ran into. But even if he had tried to be careful with:

  if (filename_changed(p->parent[i].status))
	/* ok to look at p->parent[i].path */

it would not have worked, because that status would have been
uninitialized, too.

> > Let's just zero the parent field to be consistent with the
> > combine_diff_path_new() allocator.
> 
> But I like the "let's be consistent" reasoning, so I wouldn't
> complain ;-)

So yeah. This is the part that I think is really helping new code.
Changing the strbuf to a pointer makes it even simpler (you do not even
have to check the status at all), but this is the commit that is
preventing undefined behavior. ;)

-Peff
diff mbox series

Patch

diff --git a/tree-diff.c b/tree-diff.c
index d9237ffd9b..24f7b5912c 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -151,8 +151,6 @@  static int emit_diff_first_parent_only(struct diff_options *opt, struct combine_
  *	process(p);
  *	p = pprev;
  *	; don't forget to free tail->next in the end
- *
- * p->parent[] remains uninitialized.
  */
 static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
 	int nparent, const struct strbuf *base, const char *path, int pathlen,
@@ -187,6 +185,8 @@  static struct combine_diff_path *path_appendnew(struct combine_diff_path *last,
 	p->mode = mode;
 	oidcpy(&p->oid, oid ? oid : null_oid());
 
+	memset(p->parent, 0, sizeof(p->parent[0]) * nparent);
+
 	return p;
 }