diff mbox series

sparse-index: pass string length to index_file_exists()

Message ID pull.1649.git.1706897095273.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 156e28b36d424a26b4548de636fa548b14defa71
Headers show
Series sparse-index: pass string length to index_file_exists() | expand

Commit Message

Jeff Hostetler Feb. 2, 2024, 6:04 p.m. UTC
From: Jeff Hostetler <jeffhostetler@github.com>

The call to index_file_exists() in the loop in expand_to_path() passes
the wrong string length.  Let's fix that.

The loop in expand_to_path() searches the name-hash for each
sub-directory prefix in the provided pathname. That is, by searching
for "dir1/" then "dir1/dir2/" then "dir1/dir2/dir3/" and so on until
it finds a cache-entry representing a sparse directory.

The code creates "strbuf path_mutable" to contain the working pathname
and modifies the buffer in-place by temporarily replacing the character
following each successive "/" with NUL for the duration of the call to
index_file_exists().

It does not update the strbuf.len during this substitution.

Pass the patched length of the prefix path instead.

Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
---
    sparse-index: pass string length to index_file_exists()
    
    The call to index_file_exists() in the loop in expand_to_path() passes
    the wrong string length. Let's fix that.
    
    The loop in expand_to_path() searches the name-hash for each
    sub-directory prefix in the provided pathname. That is, by searching for
    "dir1/" then "dir1/dir2/" then "dir1/dir2/dir3/" and so on until it
    finds a cache-entry representing a sparse directory.
    
    The code creates "strbuf path_mutable" to contain the working pathname
    and modifies the buffer in-place by temporarily replacing the character
    following each successive "/" with NUL for the duration of the call to
    index_file_exists().
    
    It does not update the strbuf.len during this substitution.
    
    Pass the patched length of the prefix path instead.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1649%2Fjeffhostetler%2Fsparse-index-string-length-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1649/jeffhostetler/sparse-index-string-length-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1649

 sparse-index.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: 43c8a30d150ecede9709c1f2527c8fba92c65f40

Comments

Junio C Hamano Feb. 2, 2024, 6:24 p.m. UTC | #1
"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhostetler@github.com>
>
> The call to index_file_exists() in the loop in expand_to_path() passes
> the wrong string length.  Let's fix that.
>
> The loop in expand_to_path() searches the name-hash for each
> sub-directory prefix in the provided pathname. That is, by searching
> for "dir1/" then "dir1/dir2/" then "dir1/dir2/dir3/" and so on until
> it finds a cache-entry representing a sparse directory.
>
> The code creates "strbuf path_mutable" to contain the working pathname
> and modifies the buffer in-place by temporarily replacing the character
> following each successive "/" with NUL for the duration of the call to
> index_file_exists().
>
> It does not update the strbuf.len during this substitution.

Meaning we memihash() the full pathname munged with '/' -> NUL through
to the end of the original, when we should memihash() the truncated
leading pathname.  This is bad, and the ...

>
> Pass the patched length of the prefix path instead.

... fix looks quite straight-forward.

> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
> ---

The problem description and the fix makes sense, but did you
actually see an end-user visible breakage due to this bug?  I am
wondering how you found it, and if it is reasonable to have a test
demonstrate the breakage.

>  sparse-index.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index 1fdb07a9e69..093708f6220 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -579,8 +579,9 @@ void expand_to_path(struct index_state *istate,
>  		replace++;
>  		temp = *replace;
>  		*replace = '\0';
> +		substr_len = replace - path_mutable.buf;
>  		if (index_file_exists(istate, path_mutable.buf,
> -				      path_mutable.len, icase)) {
> +				      substr_len, icase)) {

There is a break out of this loop when the condition for this "if"
statement holds, but the value of substr_len does not affect what
happens after this index_file_exists() call (correctly) computes its
result.  The fix looks good.

Thanks.

>  			/*
>  			 * We found a parent directory in the name-hash
>  			 * hashtable, because only sparse directory entries
> @@ -593,7 +594,6 @@ void expand_to_path(struct index_state *istate,
>  		}
>  
>  		*replace = temp;
> -		substr_len = replace - path_mutable.buf;
>  	}
>  
>  cleanup:
>
> base-commit: 43c8a30d150ecede9709c1f2527c8fba92c65f40
Jeff Hostetler Feb. 2, 2024, 7:19 p.m. UTC | #2
On 2/2/24 1:24 PM, Junio C Hamano wrote:
> "Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Jeff Hostetler <jeffhostetler@github.com>
>>
>> The call to index_file_exists() in the loop in expand_to_path() passes
>> the wrong string length.  Let's fix that.
>>
>> The loop in expand_to_path() searches the name-hash for each
>> sub-directory prefix in the provided pathname. That is, by searching
>> for "dir1/" then "dir1/dir2/" then "dir1/dir2/dir3/" and so on until
>> it finds a cache-entry representing a sparse directory.
>>
>> The code creates "strbuf path_mutable" to contain the working pathname
>> and modifies the buffer in-place by temporarily replacing the character
>> following each successive "/" with NUL for the duration of the call to
>> index_file_exists().
>>
>> It does not update the strbuf.len during this substitution.
> 
> Meaning we memihash() the full pathname munged with '/' -> NUL through
> to the end of the original, when we should memihash() the truncated
> leading pathname.  This is bad, and the ...
> 
>>
>> Pass the patched length of the prefix path instead.
> 
> ... fix looks quite straight-forward.
> 
>> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
>> ---
> 
> The problem description and the fix makes sense, but did you
> actually see an end-user visible breakage due to this bug?  I am
> wondering how you found it, and if it is reasonable to have a test
> demonstrate the breakage.

I'm working on a bug where the fsmonitor client doesn't
invalidate the CE flags when there is a case discrepancy between
the expected and observed case on a case-insensitive file system.
(Fix due shortly.)

And I was single-stepping in the debugger inside of
`index_file_exists()` while tracking that down and noticed that the
length argument was bogus.  Something like { name="dir1/", len=10 }

I don't remember if this bug caused visible problems or not. It felt
like it caused a few extra rounds of mutually-recursive calls between
`index_file_exists()` and `expand_to_path()`, but I can't say that
for certain (and I was focused on a different problem at the time).

I do have some test code in `t/helper/lazy-init-name-hash.c` that
I suppose we could extend to beat on it, but I'm not sure it is
worth it in this case.

Jeff


> 
>>   sparse-index.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/sparse-index.c b/sparse-index.c
>> index 1fdb07a9e69..093708f6220 100644
>> --- a/sparse-index.c
>> +++ b/sparse-index.c
>> @@ -579,8 +579,9 @@ void expand_to_path(struct index_state *istate,
>>   		replace++;
>>   		temp = *replace;
>>   		*replace = '\0';
>> +		substr_len = replace - path_mutable.buf;
>>   		if (index_file_exists(istate, path_mutable.buf,
>> -				      path_mutable.len, icase)) {
>> +				      substr_len, icase)) {
> 
> There is a break out of this loop when the condition for this "if"
> statement holds, but the value of substr_len does not affect what
> happens after this index_file_exists() call (correctly) computes its
> result.  The fix looks good.
> 
> Thanks.
> 
>>   			/*
>>   			 * We found a parent directory in the name-hash
>>   			 * hashtable, because only sparse directory entries
>> @@ -593,7 +594,6 @@ void expand_to_path(struct index_state *istate,
>>   		}
>>   
>>   		*replace = temp;
>> -		substr_len = replace - path_mutable.buf;
>>   	}
>>   
>>   cleanup:
>>
>> base-commit: 43c8a30d150ecede9709c1f2527c8fba92c65f40
>
Junio C Hamano Feb. 2, 2024, 7:30 p.m. UTC | #3
Jeff Hostetler <git@jeffhostetler.com> writes:

> I don't remember if this bug caused visible problems or not. It felt
> like it caused a few extra rounds of mutually-recursive calls between
> `index_file_exists()` and `expand_to_path()`, but I can't say that
> for certain (and I was focused on a different problem at the time).
>
> I do have some test code in `t/helper/lazy-init-name-hash.c` that
> I suppose we could extend to beat on it, but I'm not sure it is
> worth it in this case.

Yeah, if we had a reproduction already handy that would have been a
different story, but I agree that it is not worth it.

Thanks for a fix.  Queued.
diff mbox series

Patch

diff --git a/sparse-index.c b/sparse-index.c
index 1fdb07a9e69..093708f6220 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -579,8 +579,9 @@  void expand_to_path(struct index_state *istate,
 		replace++;
 		temp = *replace;
 		*replace = '\0';
+		substr_len = replace - path_mutable.buf;
 		if (index_file_exists(istate, path_mutable.buf,
-				      path_mutable.len, icase)) {
+				      substr_len, icase)) {
 			/*
 			 * We found a parent directory in the name-hash
 			 * hashtable, because only sparse directory entries
@@ -593,7 +594,6 @@  void expand_to_path(struct index_state *istate,
 		}
 
 		*replace = temp;
-		substr_len = replace - path_mutable.buf;
 	}
 
 cleanup: