diff mbox series

[v3,02/12] sparse-index: include EXTENDED flag when expanding

Message ID 8aa41e749471df3bd9d593b8f55db6506eafea12.1621017072.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Sparse-index: integrate with status | expand

Commit Message

Derrick Stolee May 14, 2021, 6:31 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

When creating a full index from a sparse one, we create cache entries
for every blob within a given sparse directory entry. These are
correctly marked with the CE_SKIP_WORKTREE flag, but they must also be
marked with the CE_EXTENDED flag to ensure that the skip-worktree bit is
correctly written to disk in the case that the index is not converted
back down to a sparse-index.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 sparse-index.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Elijah Newren May 18, 2021, 1:33 a.m. UTC | #1
On Fri, May 14, 2021 at 11:31 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> When creating a full index from a sparse one, we create cache entries
> for every blob within a given sparse directory entry. These are
> correctly marked with the CE_SKIP_WORKTREE flag, but they must also be
> marked with the CE_EXTENDED flag to ensure that the skip-worktree bit is
> correctly written to disk in the case that the index is not converted
> back down to a sparse-index.

This seems odd to me.  When sparse-index is not involved and we are
just doing simple sparse checkouts, do we mark CE_SKIP_WORKTREE
entries with CE_EXTENDED?  I can't find any code that does so.

Is it possible that the setting of CE_EXTENDED is just a workaround
that happens to force the index to be written in cases where the logic
is otherwise thinking it can get away without one?  Or is there
something I'm missing about why the CE_EXTENDED flag is actually
needed here?

>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  sparse-index.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sparse-index.c b/sparse-index.c
> index 1b49898d0cb7..b2b3fbd75050 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -222,7 +222,7 @@ static int add_path_to_index(const struct object_id *oid,
>         strbuf_addstr(base, path);
>
>         ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
> -       ce->ce_flags |= CE_SKIP_WORKTREE;
> +       ce->ce_flags |= CE_SKIP_WORKTREE | CE_EXTENDED;
>         set_index_entry(istate, istate->cache_nr++, ce);
>
>         strbuf_setlen(base, len);
> --
> gitgitgadget
>
Derrick Stolee May 18, 2021, 2:57 p.m. UTC | #2
On 5/17/2021 9:33 PM, Elijah Newren wrote:
> On Fri, May 14, 2021 at 11:31 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> When creating a full index from a sparse one, we create cache entries
>> for every blob within a given sparse directory entry. These are
>> correctly marked with the CE_SKIP_WORKTREE flag, but they must also be
>> marked with the CE_EXTENDED flag to ensure that the skip-worktree bit is
>> correctly written to disk in the case that the index is not converted
>> back down to a sparse-index.
> 
> This seems odd to me.  When sparse-index is not involved and we are
> just doing simple sparse checkouts, do we mark CE_SKIP_WORKTREE
> entries with CE_EXTENDED?  I can't find any code that does so.
> 
> Is it possible that the setting of CE_EXTENDED is just a workaround
> that happens to force the index to be written in cases where the logic
> is otherwise thinking it can get away without one?  Or is there
> something I'm missing about why the CE_EXTENDED flag is actually
> needed here?

This is happening within the context of ensure_full_index(), so we
are creating new cache entries and want to mimic what they would
look like on-disk. Something within do_write_index() discovers that
since CE_SKIP_WORKTREE is set, then also CE_EXTENDED should be set
in order to ensure that the on-disk representation has enough room
for the CE_SKIP_WORKTREE bit.

I suppose this might not have a meaningful purpose other than when
I compare a full index against an expanded sparse-index and check
if their flags match.

Thanks,
-Stolee
Elijah Newren May 18, 2021, 5:48 p.m. UTC | #3
On Tue, May 18, 2021 at 7:57 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 5/17/2021 9:33 PM, Elijah Newren wrote:
> > On Fri, May 14, 2021 at 11:31 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: Derrick Stolee <dstolee@microsoft.com>
> >>
> >> When creating a full index from a sparse one, we create cache entries
> >> for every blob within a given sparse directory entry. These are
> >> correctly marked with the CE_SKIP_WORKTREE flag, but they must also be
> >> marked with the CE_EXTENDED flag to ensure that the skip-worktree bit is
> >> correctly written to disk in the case that the index is not converted
> >> back down to a sparse-index.
> >
> > This seems odd to me.  When sparse-index is not involved and we are
> > just doing simple sparse checkouts, do we mark CE_SKIP_WORKTREE
> > entries with CE_EXTENDED?  I can't find any code that does so.
> >
> > Is it possible that the setting of CE_EXTENDED is just a workaround
> > that happens to force the index to be written in cases where the logic
> > is otherwise thinking it can get away without one?  Or is there
> > something I'm missing about why the CE_EXTENDED flag is actually
> > needed here?
>
> This is happening within the context of ensure_full_index(), so we
> are creating new cache entries and want to mimic what they would
> look like on-disk. Something within do_write_index() discovers that
> since CE_SKIP_WORKTREE is set, then also CE_EXTENDED should be set
> in order to ensure that the on-disk representation has enough room
> for the CE_SKIP_WORKTREE bit.

Yeah, I think it's this part:

        /* reduce extended entries if possible */
        cache[i]->ce_flags &= ~CE_EXTENDED;
        if (cache[i]->ce_flags & CE_EXTENDED_FLAGS) {
            extended++;
            cache[i]->ce_flags |= CE_EXTENDED;
        }

>
> I suppose this might not have a meaningful purpose other than when
> I compare a full index against an expanded sparse-index and check
> if their flags match.

Ah, you're just setting this flag in advance of do_write_index() being
called so that you can compare in memory values and check they match
without doing a write-to-disk-and-read-back cycle.  Makes sense, but
it'd be nice to see this in the commit message.
Derrick Stolee May 18, 2021, 6:16 p.m. UTC | #4
On 5/18/2021 1:48 PM, Elijah Newren wrote:
> On Tue, May 18, 2021 at 7:57 AM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 5/17/2021 9:33 PM, Elijah Newren wrote:
>>> Is it possible that the setting of CE_EXTENDED is just a workaround
>>> that happens to force the index to be written in cases where the logic
>>> is otherwise thinking it can get away without one?  Or is there
>>> something I'm missing about why the CE_EXTENDED flag is actually
>>> needed here?
>>
>> This is happening within the context of ensure_full_index(), so we
>> are creating new cache entries and want to mimic what they would
>> look like on-disk. Something within do_write_index() discovers that
>> since CE_SKIP_WORKTREE is set, then also CE_EXTENDED should be set
>> in order to ensure that the on-disk representation has enough room
>> for the CE_SKIP_WORKTREE bit.
> 
> Yeah, I think it's this part:
> 
>         /* reduce extended entries if possible */
>         cache[i]->ce_flags &= ~CE_EXTENDED;
>         if (cache[i]->ce_flags & CE_EXTENDED_FLAGS) {
>             extended++;
>             cache[i]->ce_flags |= CE_EXTENDED;
>         }
> 
>>
>> I suppose this might not have a meaningful purpose other than when
>> I compare a full index against an expanded sparse-index and check
>> if their flags match.
> 
> Ah, you're just setting this flag in advance of do_write_index() being
> called so that you can compare in memory values and check they match
> without doing a write-to-disk-and-read-back cycle.  Makes sense, but
> it'd be nice to see this in the commit message.

Will do. Thanks,

-Stolee
diff mbox series

Patch

diff --git a/sparse-index.c b/sparse-index.c
index 1b49898d0cb7..b2b3fbd75050 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -222,7 +222,7 @@  static int add_path_to_index(const struct object_id *oid,
 	strbuf_addstr(base, path);
 
 	ce = make_cache_entry(istate, mode, oid, base->buf, 0, 0);
-	ce->ce_flags |= CE_SKIP_WORKTREE;
+	ce->ce_flags |= CE_SKIP_WORKTREE | CE_EXTENDED;
 	set_index_entry(istate, istate->cache_nr++, ce);
 
 	strbuf_setlen(base, len);