[2/1] repack: warn if bitmaps are explicitly enabled with keep files

Message ID 20190629191600.nipp2ut37xd3mx56@dcvr (mailing list archive)
State New, archived
Series repack: disable bitmaps-by-default if .keep files exist

Commit Message

Eric Wong June 29, 2019, 7:16 p.m. UTC
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> I have the feedback I posted before this patch in
> https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/
> 
> In particular "b" there since "a" is clearly more work. I.e. shouldn't
> we at least in interactive mode on a "gc" print something about skipping
> what we'd otherwise do.
> 
> Maybe that's tricky with the gc.log functionality, but I think we should
> at least document this before the next guy shows up with "sometimes my
> .bitmap files aren't generated...".

I'm not sure if the warning should be present by default,
because we'll silently stop using bitmaps now.  But warning
if '-b' is specified seems right:

-------8<----------
Subject: [PATCH] repack: warn if bitmaps are explicitly enabled with keep
 files

If a user explicitly enables bitmaps, we should warn if .keep
files exist or are specified via --keep-pack

Signed-off-by: Eric Wong <e@80x24.org>
---
 builtin/repack.c  |  8 ++++++++
 t/t7700-repack.sh | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

Comments

Junio C Hamano July 1, 2019, 6:15 p.m. UTC | #1
Eric Wong <e@80x24.org> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> I have the feedback I posted before this patch in
>> https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/
>> 
>> In particular "b" there since "a" is clearly more work. I.e. shouldn't
>> we at least in interactive mode on a "gc" print something about skipping
>> what we'd otherwise do.
>> 
>> Maybe that's tricky with the gc.log functionality, but I think we should
>> at least document this before the next guy shows up with "sometimes my
>> .bitmap files aren't generated...".
>
> I'm not sure if the warning should be present by default,
> because we'll silently stop using bitmaps now.  But warning
> if '-b' is specified seems right:

Hmph...  write_bitmaps can come either from the command line or from
the configuration.  When repack.writebitmaps configuration is set,
and some .keep files exist, the user probably is not even aware of
doing something that is unsupported.

The keep_pack_list will not be empty only if the "--keep-pack"
option is explicitly given from the command line, so making it and
an explicit "-b" on the command line a condition to trigger the
warning (or even "fatal: can't do both at the same time") is safe
enough, I think.  But anything more than that, I am not so sure.
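
A minimal sketch of the narrower condition described above, assuming the
option parser could record where the bitmap setting came from.  The enum
and the function name are hypothetical; the patch under discussion only
has the single write_bitmaps integer and cannot tell the two sources
apart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: where the request to write bitmaps came from. */
enum bitmap_source {
	BITMAP_UNSET,
	BITMAP_FROM_CONFIG,
	BITMAP_FROM_CMDLINE
};

/*
 * Warn (or die) only when both the bitmap request and --keep-pack
 * were given explicitly on the command line.  A bitmap request that
 * comes from configuration stays silent.
 */
int should_warn_keep_pack_with_bitmaps(enum bitmap_source src,
				       size_t keep_pack_nr)
{
	assert(src >= BITMAP_UNSET && src <= BITMAP_FROM_CMDLINE);
	return src == BITMAP_FROM_CMDLINE && keep_pack_nr > 0;
}
```

With this shape, a repository that merely has repack.writebitmaps set
never triggers the message; only an explicit "-b --keep-pack=..." does.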

Also, is it safe to unconditionally emit strings to the standard
error stream from the gc codepath these days?  I vaguely recall that
we once had trouble with the logic to skip repeatedly failing
"gc --auto", but maybe I am remembering something else.

Thanks.


> -------8<----------
> Subject: [PATCH] repack: warn if bitmaps are explicitly enabled with keep
>  files
>
> If a user explicitly enables bitmaps, we should warn if .keep
> files exist or are specified via --keep-pack
>
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  builtin/repack.c  |  8 ++++++++
>  t/t7700-repack.sh | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 73250b2431..b1eeee88a7 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -359,7 +359,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  				 is_bare_repository() &&
>  				 keep_pack_list.nr == 0 &&
>  				 !has_pack_keep_file();
> +	} else if (write_bitmaps > 0) {
> +		if (keep_pack_list.nr)
> +			fprintf(stderr,
> +				_("WARNING: --keep-pack is incompatible with bitmaps\n"));
> +		if (has_pack_keep_file())
> +			fprintf(stderr,
> +				_("WARNING: .keep files are incompatible with bitmaps\n"));
>  	}
> +
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;
>  
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0e9af832c9..839484c7dc 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -249,4 +249,20 @@ test_expect_success 'no bitmaps created if .keep files present' '
>  	test_must_be_empty actual
>  '
>  
> +test_expect_success '-b warns with .keep files present' '
> +	pack=$(ls bare.git/objects/pack/*.pack) &&
> +	test_path_is_file "$pack" &&
> +	keep=${pack%.pack}.keep &&
> +	>"$keep" &&
> +	git -C bare.git repack -adb 2>err &&
> +	test_i18ngrep -F ".keep files are incompatible" err &&
> +	rm -f "$keep"
> +'
> +
> +test_expect_success '-b warns with --keep-pack specified' '
> +	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
> +	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
> +	test_i18ngrep -F "keep-pack is incompatible" err
> +'
> +
>  test_done
Jeff King July 3, 2019, 5:38 p.m. UTC | #2
On Mon, Jul 01, 2019 at 11:15:38AM -0700, Junio C Hamano wrote:

> >> Maybe that's tricky with the gc.log functionality, but I think we should
> >> at least document this before the next guy shows up with "sometimes my
> >> .bitmap files aren't generated...".
> >
> > I'm not sure if the warning should be present by default,
> > because we'll silently stop using bitmaps now.  But warning
> > if '-b' is specified seems right:
> 
> Hmph...  write_bitmaps can come either from the command line or from
> the configuration.  When repack.writebitmaps configuration is set,
> and some .keep files exist, the user probably is not even aware of
> doing something that is unsupported.

I think one tricky thing here is that we do not know if the .keep files
are meant to be there, or if they are simply transient locks.

The whole point of the current behavior is that we should be ignoring
the transient locks, and if you are explicitly using bitmaps you
understand the tradeoff you are making.

I think the important case to cover is the one where the user didn't
explicitly ask for them, and the initial patch (to disable them when
there's a .keep around) covers that.


A much more robust solution would be to stop conflating user-provided
permanent .keep files with temporary locks. I think that was a mistaken
design added many years ago. We probably could introduce a different
filename for the temporary locks (though I am not entirely convinced
they are necessary in the first place, as gc expiration-times would
generally save a racily-written packfile anyway).

Or perhaps we could differentiate our temporary locks from "real" .keep
files by looking at the content; I think our locks always say something
like "(receive|fetch)-pack \d+ on .*", and it wouldn't be too onerous
to commit to that, I think (or even adjust it to something even more
unambiguous).
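
A rough sketch of what such a content check could look like.  It assumes
the lock text has the shape "receive-pack <pid> on <host>" or
"fetch-pack <pid> on <host>"; that exact format is an assumption here,
not something verified against the sources, and the names keep_kind and
classify_keep_content are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum keep_kind { KEEP_USER, KEEP_LOCK };

/* Return a pointer just past prefix if s starts with it, else NULL. */
static const char *skip_literal(const char *s, const char *prefix)
{
	size_t n = strlen(prefix);
	return strncmp(s, prefix, n) ? NULL : s + n;
}

/*
 * Classify the text of a .keep file.  Anything that does not look like
 * "<tool>-pack <digits> on ..." is treated as a user-provided keep,
 * including an empty file.
 */
enum keep_kind classify_keep_content(const char *content)
{
	const char *p;

	assert(content != NULL);
	p = skip_literal(content, "receive-pack ");
	if (!p)
		p = skip_literal(content, "fetch-pack ");
	if (!p || !isdigit((unsigned char)*p))
		return KEEP_USER;
	while (isdigit((unsigned char)*p))
		p++;
	return skip_literal(p, " on ") ? KEEP_LOCK : KEEP_USER;
}
```

Defaulting to KEEP_USER on any mismatch errs on the side of treating the
pack as permanently kept, which is the conservative choice.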

It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
would want to consider it kept in either case (i.e., for purposes of
pruning, we delete neither) and some would want it kept only for
non-locks (for packing, duplicating the objects is OK). So I think we'd
end up with two bits there, and callers would have to use one or the
other as appropriate.

-Peff
Junio C Hamano July 3, 2019, 6:10 p.m. UTC | #3
Jeff King <peff@peff.net> writes:

>
> A much more robust solution would be to stop conflating user-provided
> permanent .keep files with temporary locks. I think that was a mistaken
> design added many years ago. We probably could introduce a different
> filename for the temporary locks (though I am not entirely convinced
> they are necessary in the first place, as gc expiration-times would
> generally save a racily-written packfile anyway).

True, true (and I tend to agree).

> Or perhaps we could differentiate our temporary locks from "real" .keep
> files by looking at the content; I think our locks always say something
> like "(receive|fetch)-pack \d+ on .*", and it wouldn't be too onerous
> to commit to that, I think (or even adjust it to something even more
> unambiguous).

True, but it may be overkill to open and read.

> It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
> would want to consider it kept in either case (i.e., for purposes of
> pruning, we delete neither) and some would want it kept only for
> non-locks (for packing, duplicating the objects is OK). So I think we'd
> end up with two bits there, and callers would have to use one or the
> other as appropriate.

Yeah, I agree that we'd need to treat them separately in the longer
run.

Thanks.
Junio C Hamano July 3, 2019, 6:37 p.m. UTC | #4
Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>>
>> A much more robust solution would be to stop conflating user-provided
>> permanent .keep files with temporary locks. I think that was a mistaken
>> design added many years ago. We probably could introduce a different
>> filename for the temporary locks (though I am not entirely convinced
>> they are necessary in the first place, as gc expiration-times would
>> generally save a racily-written packfile anyway).
>
> True, true (and I tend to agree).
>
>> Or perhaps we could differentiate our temporary locks from "real" .keep
>> files by looking at the content; I think our locks always say something
>> like "(receive|fetch)-pack \d+ on .*", and it wouldn't be too onerous
>> to commit to that, I think (or even adjust it to something even more
>> unambiguous).
>
> True, but it may be overkill to open and read.
>
>> It does muddy the meaning of packed_git.pack_keep a bit.  Some callers
>> would want to consider it kept in either case (i.e., for purposes of
>> pruning, we delete neither) and some would want it kept only for
>> non-locks (for packing, duplicating the objects is OK). So I think we'd
>> end up with two bits there, and callers would have to use one or the
>> other as appropriate.
>
> Yeah, I agree that we'd need to treat them separately in the longer
> run.
>
> Thanks.

In the meantime, this is about patch 2/1; the underlying "when .keep
is there, disable bitmaps" is probably good to go, still.

-- >8 --
From: Eric Wong <e@80x24.org>
Date: Sat, 29 Jun 2019 19:13:59 +0000
Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.

So detect when .keep files exist and stop enabling bitmaps
by default in that case.

Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:

  https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/

v2: avoid subshell in test case, be multi-index aware

Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Jeff King <peff@peff.net>
Reported-by: Janos Farkas <chexum@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repack.c  | 18 ++++++++++++++++--
 t/t7700-repack.sh | 10 ++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..73250b2431 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	struct packed_git *p;
+
+	for (p = get_all_packs(the_repository); p; p = p->next) {
+		if (p->pack_keep)
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 86d05160a3..0e9af832c9 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
 	test -z "$bitmap"
 '
 
+test_expect_success 'no bitmaps created if .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -ad &&
+	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
+	test_must_be_empty actual
+'
+
 test_done
Jeff King July 3, 2019, 9:23 p.m. UTC | #5
On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:

> > Or perhaps we could differentiate our temporary locks from "real" .keep
> > files by looking at the content; I think our locks always say something
> > like "(receive|fetch)-pack \d+ on .*", and it wouldn't be too onerous
> > to commit to that, I think (or even adjust it to something even more
> > unambiguous).
> 
> True, but it may be overkill to open and read.

Yeah, that crossed my mind as well, but:

  1. We'd only need to open them when we _see_ them. And they're pretty
     rare anyway.

  2. Effort-wise, we're already opening and mmap-ing the .idx files, so
     this is on par.

  3. Most callers don't care about keep-files anyway. We could turn
     packed_git.pack_keep into:

       enum {
         PACK_KEEP_NONE,
         PACK_KEEP_LOCK,
         PACK_KEEP_USER
       } check_packed_keep(struct packed_git *pack);

     and then most programs wouldn't pay anything.
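
A sketch of how point 3 might work as a lazy, cached check, so that only
callers who ask pay for opening the file.  Everything here is
illustrative: struct pack_sketch stands in for struct packed_git, the
PACK_KEEP_UNKNOWN value and the keep_path field are invented, and the
lock prefixes are assumed rather than taken from the sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum pack_keep_kind {
	PACK_KEEP_UNKNOWN = -1,	/* not looked at yet (hypothetical) */
	PACK_KEEP_NONE,
	PACK_KEEP_LOCK,
	PACK_KEEP_USER
};

/* Stand-in for struct packed_git; both fields are hypothetical. */
struct pack_sketch {
	const char *keep_path;		/* path of the would-be .keep file */
	enum pack_keep_kind keep_kind;	/* cached answer */
};

static int has_prefix(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

/* Open and read the .keep file at most once per pack. */
enum pack_keep_kind check_packed_keep(struct pack_sketch *pack)
{
	char buf[64];
	FILE *fp;

	assert(pack != NULL);
	if (pack->keep_kind != PACK_KEEP_UNKNOWN)
		return pack->keep_kind;

	fp = fopen(pack->keep_path, "r");
	if (!fp) {
		pack->keep_kind = PACK_KEEP_NONE;
		return pack->keep_kind;
	}
	/* Assumed lock prefixes; an empty or unrelated file is a user keep. */
	if (fgets(buf, sizeof(buf), fp) &&
	    (has_prefix(buf, "receive-pack ") || has_prefix(buf, "fetch-pack ")))
		pack->keep_kind = PACK_KEEP_LOCK;
	else
		pack->keep_kind = PACK_KEEP_USER;
	fclose(fp);
	return pack->keep_kind;
}
```

Programs that never call the function never touch the file, and a
caller that asks repeatedly only triggers one open and read per pack.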

Just some thoughts. I don't have immediate plans to work on it, but
maybe somebody else is excited about it. :)

-Peff
Jeff King July 3, 2019, 9:24 p.m. UTC | #6
On Wed, Jul 03, 2019 at 11:37:52AM -0700, Junio C Hamano wrote:

> In the meantime, this is about patch 2/1; the underlying "when .keep
> is there, disable bitmaps" is probably good to go, still.
> 
> -- >8 --
> From: Eric Wong <e@80x24.org>
> Date: Sat, 29 Jun 2019 19:13:59 +0000
> Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Yeah, this one looks good to me. Thanks for keeping things moving.

-Peff
Junio C Hamano July 8, 2019, 5:40 p.m. UTC | #7
Jeff King <peff@peff.net> writes:

> On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:
>
>> > Or perhaps we could differentiate our temporary locks from "real" .keep
>> > files by looking at the content; I think our locks always say something
>> > like "(receive|fetch)-pack \d+ on .*", and it wouldn't be too onerous
>> > to commit to that, I think (or even adjust it to something even more
>> > unambiguous).
>> 
>> True, but it may be overkill to open and read.
>
> Yeah, that crossed my mind as well, but:
>
>   1. We'd only need to open them when we _see_ them. And they're pretty
>      rare anyway.
>
>   2. Effort-wise, we're already opening and mmap-ing the .idx files, so
>      this is on par.
>
>   3. Most callers don't care about keep-files anyway. We could turn
>      packed_git.pack_keep into:
>
>        enum {
>          PACK_KEEP_NONE,
>          PACK_KEEP_LOCK,
>          PACK_KEEP_USER
>        } check_packed_keep(struct packed_git *pack);
>
>      and then most programs wouldn't pay anything.
>
> Just some thoughts. I don't have immediate plans to work on it, but
> maybe somebody else is excited about it. :)

OK.  I do agree that .keep files would be rare enough to justify
spending a few extra cycles, as long as the benefit is big enough
(and in this case it may be a good trade-off).

Patch

diff --git a/builtin/repack.c b/builtin/repack.c
index 73250b2431..b1eeee88a7 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -359,7 +359,15 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 				 is_bare_repository() &&
 				 keep_pack_list.nr == 0 &&
 				 !has_pack_keep_file();
+	} else if (write_bitmaps > 0) {
+		if (keep_pack_list.nr)
+			fprintf(stderr,
+				_("WARNING: --keep-pack is incompatible with bitmaps\n"));
+		if (has_pack_keep_file())
+			fprintf(stderr,
+				_("WARNING: .keep files are incompatible with bitmaps\n"));
 	}
+
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 0e9af832c9..839484c7dc 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -249,4 +249,20 @@  test_expect_success 'no bitmaps created if .keep files present' '
 	test_must_be_empty actual
 '
 
+test_expect_success '-b warns with .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -adb 2>err &&
+	test_i18ngrep -F ".keep files are incompatible" err &&
+	rm -f "$keep"
+'
+
+test_expect_success '-b warns with --keep-pack specified' '
+	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
+	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
+	test_i18ngrep -F "keep-pack is incompatible" err
+'
+
 test_done