diff mbox series

dir: fix malloc of root untracked_cache_dir

Message ID pull.884.git.1614177117508.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 6347d649bcddf531f82d400103e23d99ea8f2fd4
Headers show
Series dir: fix malloc of root untracked_cache_dir | expand

Commit Message

Jeff Hostetler Feb. 24, 2021, 2:31 p.m. UTC
From: Jeff Hostetler <jeffhost@microsoft.com>

Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
for the root directory.  Get rid of unsafe code that might fail to
initialize the `name` field (if FLEX_ARRAY is not 1).  This will
make it clear that we intend to have a structure with an empty
string following it.

A problem was observed on Windows where the length of the memset() was
too short, so the first byte of the name field was not zeroed.  This
resulted in the name field having garbage from a previous use of that
area of memory.

The record for the root directory was then written to the untracked-cache
extension in the index.  This garbage would then be visible to future
commands when they reloaded the untracked-cache extension.

Since the directory record for the root directory had garbage in the
`name` field, the `t/helper/test-tool dump-untracked-cache` tool
printed this garbage as the path prefix (rather than '/') for each
directory in the untracked cache as it recursed.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
    dir: fix malloc of root untracked_cache_dir

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-884%2Fjeffhostetler%2Funtracked-cache-corruption-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-884/jeffhostetler/untracked-cache-corruption-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/884

 dir.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)


base-commit: 966e671106b2fd38301e7c344c754fd118d0bb07

Comments

Taylor Blau Feb. 24, 2021, 4:56 p.m. UTC | #1
On Wed, Feb 24, 2021 at 02:31:57PM +0000, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> for the root directory.  Get rid of unsafe code that might fail to
> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> make it clear that we intend to have a structure with an empty
> string following it.

I had to read this patch message a few times over to make sure that I
wasn't missing something in the patch itself, but it looks all good to
me.

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor
Junio C Hamano Feb. 24, 2021, 8:08 p.m. UTC | #2
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> for the root directory.  Get rid of unsafe code that might fail to
> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> make it clear that we intend to have a structure with an empty
> string following it.
>
> A problem was observed on Windows where the length of the memset() was
> too short, so the first byte of the name field was not zeroed.  This
> resulted in the name field having garbage from a previous use of that
> area of memory.
>
> The record for the root directory was then written to the untracked-cache
> extension in the index.  This garbage would then be visible to future
> commands when they reloaded the untracked-cache extension.
>
> Since the directory record for the root directory had garbage in the
> `name` field, the `t/helper/test-tool dump-untracked-cache` tool
> printed this garbage as the path prefix (rather than '/') for each
> directory in the untracked cache as it recursed.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>     dir: fix malloc of root untracked_cache_dir

Nicely spotted.

The problematic code was introduced in 2015, a year before these
FLEX_ALLOC_*() helpers were introduced.  The result is of course
correct and much easier to read, as the necessary "ask for a region
of calloc'ed memory with an additional byte for terminating NUL
beyond strlen()" is hidden in the helper.

Will queue; thanks.

> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-884%2Fjeffhostetler%2Funtracked-cache-corruption-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-884/jeffhostetler/untracked-cache-corruption-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/884
>
>  dir.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index d153a63bbd14..fd8aa7c40faa 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2730,11 +2730,8 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
>  		return NULL;
>  	}
>  
> -	if (!dir->untracked->root) {
> -		const int len = sizeof(*dir->untracked->root);
> -		dir->untracked->root = xmalloc(len);
> -		memset(dir->untracked->root, 0, len);
> -	}
> +	if (!dir->untracked->root)
> +		FLEX_ALLOC_STR(dir->untracked->root, name, "");
>  
>  	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
>  	root = dir->untracked->root;
>
> base-commit: 966e671106b2fd38301e7c344c754fd118d0bb07
Jeff King Feb. 24, 2021, 9:05 p.m. UTC | #3
On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote:

> > Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> > for the root directory.  Get rid of unsafe code that might fail to
> > initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> > make it clear that we intend to have a structure with an empty
> > string following it.
> [...]
> The problematic code was introduced in 2015, a year before these
> FLEX_ALLOC_*() helpers were introduced.  The result is of course
> correct and much easier to read, as the necessary "ask for a region
> of calloc'ed memory with an additional byte for terminating NUL
> beyond strlen()" is hidden in the helper.

When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
convert. But I did so by looking for places where we were doing manual
size computations. The bug here was that it was not doing any
computation at all (when it need to be doing "+1"). So that's my guess
why it got overlooked (which is not super important, but may give a hint
about how to look for similar bugs).

-Peff
Jeff Hostetler Feb. 24, 2021, 9:15 p.m. UTC | #4
On 2/24/21 4:05 PM, Jeff King wrote:
> On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote:
> 
>>> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
>>> for the root directory.  Get rid of unsafe code that might fail to
>>> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
>>> make it clear that we intend to have a structure with an empty
>>> string following it.
>> [...]
>> The problematic code was introduced in 2015, a year before these
>> FLEX_ALLOC_*() helpers were introduced.  The result is of course
>> correct and much easier to read, as the necessary "ask for a region
>> of calloc'ed memory with an additional byte for terminating NUL
>> beyond strlen()" is hidden in the helper.
> 
> When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
> convert. But I did so by looking for places where we were doing manual
> size computations. The bug here was that it was not doing any
> computation at all (when it need to be doing "+1"). So that's my guess
> why it got overlooked (which is not super important, but may give a hint
> about how to look for similar bugs).


There's another call to xmalloc() at [1] that does the st_add3()
thing that I didn't change.  It was correctly computing the size
so I didn't want to change it for no reason.  That and the 2 memcpy()s
made it feel like we should have a different FLEX_ macro for this
case -- more of a FLEX_DUP... or something.  But I didn't want to
distract from my bug fix here.

[1] https://github.com/git/git/blob/master/dir.c#L3290

Jeff
Jeff King Feb. 24, 2021, 11:51 p.m. UTC | #5
On Wed, Feb 24, 2021 at 04:15:11PM -0500, Jeff Hostetler wrote:

> > When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
> > convert. But I did so by looking for places where we were doing manual
> > size computations. The bug here was that it was not doing any
> > computation at all (when it need to be doing "+1"). So that's my guess
> > why it got overlooked (which is not super important, but may give a hint
> > about how to look for similar bugs).
> 
> There's another call to xmalloc() at [1] that does the st_add3()
> thing that I didn't change.  It was correctly computing the size
> so I didn't want to change it for no reason.  That and the 2 memcpy()s
> made it feel like we should have a different FLEX_ macro for this
> case -- more of a FLEX_DUP... or something.  But I didn't want to
> distract from my bug fix here.

Thanks for pointing it out. Even if we change it, I agree it's better to
keep it out of your existing bugfix patch.

But I do think that spot gets weird. We are copying all of the elements
of the struct _except_ the name field, which comes from somewhere else.
So it's tempting to do:

diff --git a/dir.c b/dir.c
index d153a63bbd..ab16db3f76 100644
--- a/dir.c
+++ b/dir.c
@@ -3287,9 +3287,9 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	if (!eos || eos == end)
 		return -1;
 
-	*untracked_ = untracked = xmalloc(st_add3(sizeof(*untracked), eos - data, 1));
+	FLEX_ALLOC_MEM(untracked, name, data, eos - data);
 	memcpy(untracked, &ud, sizeof(ud));
-	memcpy(untracked->name, data, eos - data + 1);
+	*untracked_ = untracked;
 	data = eos + 1;
 
 	for (i = 0; i < untracked->untracked_nr; i++) {

But that's wrong! The remaining memcpy using sizeof(ud) might or might
not touch the first byte of the name field, depending on struct padding
and the value of FLEX_ARRAY. And if it does, then we'd be overwriting
the early bytes of that field with whatever was in "ud" (which I guess
is uninitialized cruft, because that is a stack variable).

So it would have to be:

  memcpy(untracked, &ud, offsetof(struct untracked_cache_dir, name));

Quite subtle. Even with a comment, I think I prefer the existing code.
If we had a macro as you suggest for "allocate a flex struct, dup most of
the contents from this other struct, and then copy in the flex bytes
from this other spot", that would make it more readable. But I'm not
sure we'd find more than one spot to use such a macro. :)

-Peff
diff mbox series

Patch

diff --git a/dir.c b/dir.c
index d153a63bbd14..fd8aa7c40faa 100644
--- a/dir.c
+++ b/dir.c
@@ -2730,11 +2730,8 @@  static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
 		return NULL;
 	}
 
-	if (!dir->untracked->root) {
-		const int len = sizeof(*dir->untracked->root);
-		dir->untracked->root = xmalloc(len);
-		memset(dir->untracked->root, 0, len);
-	}
+	if (!dir->untracked->root)
+		FLEX_ALLOC_STR(dir->untracked->root, name, "");
 
 	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
 	root = dir->untracked->root;