diff mbox series

[v2,2/2] fix(gc): make --prune=now compatible with --expire-to

Message ID 579757957d21faaa8dd9228a191d82f663e93c03.1735611513.git.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series gc: add --expire-to option | expand

Commit Message

ZheNing Hu Dec. 31, 2024, 2:18 a.m. UTC
From: ZheNing Hu <adlternative@gmail.com>

The original `git gc --prune=now` attempted to delete all
unreachable objects. However, after the introduction of
`--cruft` and `--expire-to=<dir>` in git gc, `--prune=now`
can now compress unreachable objects into a cruft pack and
store them in the specified <dir> instead of deleting them
directly. This is beneficial for recovery in case of data
corruption during repository GC. Therefore, update the
handling logic of `--prune=now` in gc so that `-a` parameter
is only passed to the repack command when neither `--cruft`
nor `--expire-to` are used.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
 builtin/gc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Jeff King Jan. 13, 2025, 9:17 a.m. UTC | #1
On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:

> diff --git a/builtin/gc.c b/builtin/gc.c
> index 77904694c9f..8656e1caff0 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
>  static void add_repack_all_option(struct gc_config *cfg,
>  				  struct string_list *keep_pack)
>  {
> -	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> +	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> +		&& !(cfg->cruft_packs && cfg->repack_expire_to))
>  		strvec_push(&repack, "-a");

I expected to see a mention of repack_expire_to here, but not
cfg->cruft_packs. These two are AND-ed together so we are only disabling
"repack -a" when both options ("--expire-to" and "--cruft") are passed.
Can we --expire-to without cruft? I.e., what should happen with:

  git gc --expire-to=some-path --prune=now --no-cruft

Looking at the underlying git-repack, it seems that we only respect
--expire-to at all when used with "--cruft", and don't otherwise
consider it. Which is what the manpage says ("Only useful with --cruft
-d").

But if we look at this proposed patch for example:

  https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/

then it is expanding how --expire-to is used during the pruning step.
OTOH, I think the way your patch 1 is structured means that we'd always
pass --expire-to to git-repack anyway, and I _think_ even with the patch
linked above that "repack -a -d --expire-to=whatever" would do the right
thing.

In which case the problem really is the combination of cruft packs and
expire-to. Just cruft packs by themselves do not need to override using
"-a" for "--prune=now" because we know that any such cruft pack would be
empty.

So I think this logic is correct. Taylor might have more thoughts,
though (and ideas on whether he intends to revisit that earlier patch).

I do think this change should probably be done as part of patch 1,
rather than introducing a buggy state and then fixing it in patch 2.

-Peff
ZheNing Hu Jan. 15, 2025, 7:56 a.m. UTC | #2
Jeff King <peff@peff.net> 于2025年1月13日周一 17:17写道:
>
> On Tue, Dec 31, 2024 at 02:18:33AM +0000, ZheNing Hu via GitGitGadget wrote:
>
> > diff --git a/builtin/gc.c b/builtin/gc.c
> > index 77904694c9f..8656e1caff0 100644
> > --- a/builtin/gc.c
> > +++ b/builtin/gc.c
> > @@ -433,7 +433,8 @@ static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
> >  static void add_repack_all_option(struct gc_config *cfg,
> >                                 struct string_list *keep_pack)
> >  {
> > -     if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
> > +     if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
> > +             && !(cfg->cruft_packs && cfg->repack_expire_to))
> >               strvec_push(&repack, "-a");
>
> I expected to see a mention of repack_expire_to here, but not
> cfg->cruft_packs. These two are AND-ed together so we are only disabling
> "repack -a" when both options ("--expire-to" and "--cruft") are passed.
> Can we --expire-to without cruft? I.e., what should happen with:
>
>   git gc --expire-to=some-path --prune=now --no-cruft
>
> Looking at the underlying git-repack, it seems that we only respect
> --expire-to at all when used with "--cruft", and don't otherwise
> consider it. Which is what the manpage says ("Only useful with --cruft
> -d").
>

Yes, this is the current state of git-repack. The --expire-to option can
only be used with --cruft, which is why I use cruft_packs && repack_expire_to
as a double safeguard.

When using --no-cruft, the option --expire-to becomes irrelevant.
So leaving `git gc --prune=now` as is at this point: passing -a as a
parameter to repack seems reasonable.

> But if we look at this proposed patch for example:
>
>   https://lore.kernel.org/git/48438876fb42a889110e100a6c42ca84e93aac49.1733011259.git.me@ttaylorr.com/
>
> then it is expanding how --expire-to is used during the pruning step.
> OTOH, I think the way your patch 1 is structured means that we'd always
> pass --expire-to to git-repack anyway, and I _think_ even with the patch
> linked above that "repack -a -d --expire-to=whatever" would do the right
> thing.
>

I've taken a look at the patch, and I believe Taylor's changes are primarily
aimed at extending the --expire-to functionality within the --cruft feature,
rather than expecting --expire-to to be used on its own.

> In which case the problem really is the combination of cruft packs and
> expire-to. Just cruft packs by themselves do not need to override using
> "-a" for "--prune=now" because we know that any such cruft pack would be
> empty.
>
> So I think this logic is correct. Taylor might have more thoughts,
> though (and ideas on whether he intends to revisit that earlier patch).
>
> I do think this change should probably be done as part of patch 1,
> rather than introducing a buggy state and then fixing it in patch 2.
>

Yes, I agree with that, and perhaps a single patch will suffice.

> -Peff

- ZheNing Hu
diff mbox series

Patch

diff --git a/builtin/gc.c b/builtin/gc.c
index 77904694c9f..8656e1caff0 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -433,7 +433,8 @@  static int keep_one_pack(struct string_list_item *item, void *data UNUSED)
 static void add_repack_all_option(struct gc_config *cfg,
 				  struct string_list *keep_pack)
 {
-	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now"))
+	if (cfg->prune_expire && !strcmp(cfg->prune_expire, "now")
+		&& !(cfg->cruft_packs && cfg->repack_expire_to))
 		strvec_push(&repack, "-a");
 	else if (cfg->cruft_packs) {
 		strvec_push(&repack, "--cruft");