Message ID | 20190629191600.nipp2ut37xd3mx56@dcvr (mailing list archive) |
---|---|
State | New, archived |
Series | repack: disable bitmaps-by-default if .keep files exist |
Eric Wong <e@80x24.org> writes:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>> I have the feedback I posted before this patch in
>> https://public-inbox.org/git/874l4f8h4c.fsf@evledraar.gmail.com/
>>
>> In particular "b" there since "a" is clearly more work. I.e. shouldn't
>> we at least in interactive mode on a "gc" print something about skipping
>> what we'd otherwise do.
>>
>> Maybe that's tricky with the gc.log functionality, but I think we should
>> at least document this before the next guy shows up with "sometimes my
>> .bitmap files aren't generated...".
>
> I'm not sure if the warning should be present by default;
> because we'll silently stop using bitmaps, now.  But warning
> if '-b' is specified seems right:

Hmph... write_bitmaps can come either from the command line or from
the configuration.  When repack.writebitmaps configuration is set,
and some .keep files exist, the user probably is not even aware of
doing something that is unsupported.

The keep_pack_list will not be empty only if the "--keep-pack"
option is explicitly given from the command line, so making it and
an explicit "-b" on the command line a condition to trigger the
warning (or even "fatal: can't do both at the same time") is safe
enough, I think.  But anything more than that, I am not so sure.

Also, is it safe to unconditionally emit strings to the standard
error stream from the gc codepath these days?  I recall that once we
had trouble with the logic to skip repeatedly failing "gc --auto"
vaguely, but maybe I am remembering something else.

Thanks.

> -------8<----------
> Subject: [PATCH] repack: warn if bitmaps are explicitly enabled with keep
>  files
>
> If a user explicitly enables bitmaps, we should warn if .keep
> files exist or are specified via --keep-pack
>
> Signed-off-by: Eric Wong <e@80x24.org>
> ---
>  builtin/repack.c  |  8 ++++++++
>  t/t7700-repack.sh | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 73250b2431..b1eeee88a7 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -359,7 +359,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  				 is_bare_repository() &&
>  				 keep_pack_list.nr == 0 &&
>  				 !has_pack_keep_file();
> +	} else if (write_bitmaps > 0) {
> +		if (keep_pack_list.nr)
> +			fprintf(stderr,
> +			  _("WARNING: --keep-pack is incompatible with bitmaps\n"));
> +		if (has_pack_keep_file())
> +			fprintf(stderr,
> +			  _("WARNING: .keep files are incompatible with bitmaps\n"));
>  	}
> +
>  	if (pack_kept_objects < 0)
>  		pack_kept_objects = write_bitmaps;
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0e9af832c9..839484c7dc 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -249,4 +249,20 @@ test_expect_success 'no bitmaps created if .keep files present' '
>  	test_must_be_empty actual
>  '
>
> +test_expect_success '-b warns with .keep files present' '
> +	pack=$(ls bare.git/objects/pack/*.pack) &&
> +	test_path_is_file "$pack" &&
> +	keep=${pack%.pack}.keep &&
> +	>"$keep" &&
> +	git -C bare.git repack -adb 2>err &&
> +	test_i18ngrep -F ".keep files are incompatible" err &&
> +	rm -f "$keep"
> +'
> +
> +test_expect_success '-b warns with --keep-pack specified' '
> +	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
> +	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
> +	test_i18ngrep -F "keep-pack is incompatible" err
> +'
> +
>  test_done
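
The narrower condition suggested in the reply above (warn only when both "-b" and "--keep-pack" were given explicitly on the command line) could be sketched roughly as follows. This is only an illustration, not code from builtin/repack.c: the `opt_source` enum and `should_warn_keep_pack()` are hypothetical names, and the sketch assumes the option parser could record where the final value of write_bitmaps came from.

```c
/* Hypothetical bookkeeping: where the final value of an option came from. */
enum opt_source { OPT_UNSET, OPT_FROM_CONFIG, OPT_FROM_CMDLINE };

/*
 * Warn only when bitmaps were requested with an explicit "-b" on the
 * command line *and* at least one "--keep-pack" was given. A value of
 * write_bitmaps that came from repack.writebitmaps stays silent, since
 * the user may not know they combined two incompatible features.
 */
static int should_warn_keep_pack(int write_bitmaps,
				 enum opt_source bitmaps_from,
				 int keep_pack_nr)
{
	return write_bitmaps > 0 &&
	       bitmaps_from == OPT_FROM_CMDLINE &&
	       keep_pack_nr > 0;
}
```

Keeping the decision in a side-effect-free helper would also leave the open question about writing to stderr from the gc codepath to the caller.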

On Mon, Jul 01, 2019 at 11:15:38AM -0700, Junio C Hamano wrote:

> >> Maybe that's tricky with the gc.log functionality, but I think we should
> >> at least document this before the next guy shows up with "sometimes my
> >> .bitmap files aren't generated...".
> >
> > I'm not sure if the warning should be present by default;
> > because we'll silently stop using bitmaps, now.  But warning
> > if '-b' is specified seems right:
>
> Hmph... write_bitmaps can come either from the command line or from
> the configuration.  When repack.writebitmaps configuration is set,
> and some .keep files exist, the user probably is not even aware of
> doing something that is unsupported.

I think one tricky thing here is that we do not know if the .keep files
are meant to be there, or if they are simply transient locks. The whole
point of the current behavior is that we should be ignoring the
transient locks, and if you are explicitly using bitmaps you understand
the tradeoff you are making.

I think the important case to cover is the one where the user didn't
explicitly ask for them, and the initial patch (to disable them when
there's a .keep around) covers that.

A much more robust solution would be to stop conflating user-provided
permanent .keep files with temporary locks. I think that was a mistaken
design added many years ago. We probably could introduce a different
filename for the temporary locks (though I am not entirely convinced
they are necessary in the first place, as gc expiration-times would
generally save a racily-written packfile anyway).

Or perhaps we could differentiate our temporary locks from "real" .keep
files by looking at the content; I think our locks always say something
like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
to commit to that, I think (or even adjust it to something even more
unambiguous).

It does muddy the meaning of packed_git.pack_keep a bit. Some callers
would want to consider it kept in either case (i.e., for purposes of
pruning, we delete neither) and some would want it kept only for
non-locks (for packing, duplicating the objects is OK). So I think we'd
end up with two bits there, and callers would have to use one or the
other as appropriate.

-Peff
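
The content-based check proposed above could look roughly like the sketch below. It is an illustration under stated assumptions, not git's implementation: `classify_keep_content()` and `skip_prefix_str()` are hypothetical helpers, and the "fetch-pack" alternative is an assumption added here (the message as archived spells "receive" twice). Anything that does not match "<tool>-pack <digits> on <anything>", including an empty file, is treated as a user-created keep.

```c
#include <ctype.h>
#include <string.h>

enum keep_kind { KEEP_USER, KEEP_LOCK };

/* If *s starts with prefix, advance *s past it and return 1; else return 0. */
static int skip_prefix_str(const char **s, const char *prefix)
{
	size_t n = strlen(prefix);

	if (strncmp(*s, prefix, n))
		return 0;
	*s += n;
	return 1;
}

/*
 * Classify the text of a .keep file. Content shaped like
 * "receive-pack <digits> on <anything>" (or, as an assumption of this
 * sketch, "fetch-pack ...") is reported as a transient lock; everything
 * else is reported as a permanent, user-provided keep.
 */
static enum keep_kind classify_keep_content(const char *buf)
{
	const char *p = buf;

	if (!skip_prefix_str(&p, "receive-pack ") &&
	    !skip_prefix_str(&p, "fetch-pack "))
		return KEEP_USER;
	if (!isdigit((unsigned char)*p))
		return KEEP_USER;	/* the pattern needs at least one digit */
	while (isdigit((unsigned char)*p))
		p++;
	return skip_prefix_str(&p, " on ") ? KEEP_LOCK : KEEP_USER;
}
```

Defaulting to KEEP_USER on any mismatch is the conservative choice: an unrecognized file is never mistaken for a lock that may be ignored.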

Jeff King <peff@peff.net> writes:

> A much more robust solution would be to stop conflating user-provided
> permanent .keep files with temporary locks. I think that was a mistaken
> design added many years ago. We probably could introduce a different
> filename for the temporary locks (though I am not entirely convinced
> they are necessary in the first place, as gc expiration-times would
> generally save a racily-written packfile anyway).

True, true (and I tend to agree).

> Or perhaps we could differentiate our temporary locks from "real" .keep
> files by looking at the content; I think our locks always say something
> like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
> to commit to that, I think (or even adjust it to something even more
> unambiguous).

True, but it may be overkill to open and read.

> It does muddy the meaning of packed_git.pack_keep a bit. Some callers
> would want to consider it kept in either case (i.e., for purposes of
> pruning, we delete neither) and some would want it kept only for
> non-locks (for packing, duplicating the objects is OK). So I think we'd
> end up with two bits there, and callers would have to use one or the
> other as appropriate.

Yeah, I agree that we'd need to treat them separately in the longer
run.

Thanks.

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> A much more robust solution would be to stop conflating user-provided
>> permanent .keep files with temporary locks. I think that was a mistaken
>> design added many years ago. We probably could introduce a different
>> filename for the temporary locks (though I am not entirely convinced
>> they are necessary in the first place, as gc expiration-times would
>> generally save a racily-written packfile anyway).
>
> True, true (and I tend to agree).
>
>> Or perhaps we could differentiate our temporary locks from "real" .keep
>> files by looking at the content; I think our locks always say something
>> like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
>> to commit to that, I think (or even adjust it to something even more
>> unambiguous).
>
> True, but it may be overkill to open and read.
>
>> It does muddy the meaning of packed_git.pack_keep a bit. Some callers
>> would want to consider it kept in either case (i.e., for purposes of
>> pruning, we delete neither) and some would want it kept only for
>> non-locks (for packing, duplicating the objects is OK). So I think we'd
>> end up with two bits there, and callers would have to use one or the
>> other as appropriate.
>
> Yeah, I agree that we'd need to treat them separately in the longer
> run.
>
> Thanks.

In the meantime, this is about patch 2/1; the underlying "when .keep
is there, disable bitmaps" is probably good to go, still.

-- >8 --
From: Eric Wong <e@80x24.org>
Date: Sat, 29 Jun 2019 19:13:59 +0000
Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Bitmaps aren't useful with multiple packs, and users with
.keep files ended up with redundant packs when bitmaps
got enabled by default in bare repos.  So detect when .keep
files exist and stop enabling bitmaps by default in that case.

Wasteful (but otherwise harmless) race conditions with .keep files
documented by Jeff King still apply and there's a chance we'd
still end up with redundant data on the FS:

  https://public-inbox.org/git/20190623224244.GB1100@sigill.intra.peff.net/

v2: avoid subshell in test case, be multi-index aware

Fixes: 36eba0323d3288a8 ("repack: enable bitmaps by default on bare repos")
Signed-off-by: Eric Wong <e@80x24.org>
Helped-by: Jeff King <peff@peff.net>
Reported-by: Janos Farkas <chexum@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repack.c  | 18 ++++++++++++++++--
 t/t7700-repack.sh | 10 ++++++++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index caca113927..73250b2431 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -89,6 +89,17 @@ static void remove_pack_on_signal(int signo)
 	raise(signo);
 }
 
+static int has_pack_keep_file(void)
+{
+	struct packed_git *p;
+
+	for (p = get_all_packs(the_repository); p; p = p->next) {
+		if (p->pack_keep)
+			return 1;
+	}
+	return 0;
+}
+
 /*
  * Adds all packs hex strings to the fname list, which do not
  * have a corresponding .keep file. These packs are not to
@@ -343,9 +354,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	    (unpack_unreachable || (pack_everything & LOOSEN_UNREACHABLE)))
 		die(_("--keep-unreachable and -A are incompatible"));
 
-	if (write_bitmaps < 0)
+	if (write_bitmaps < 0) {
 		write_bitmaps = (pack_everything & ALL_INTO_ONE) &&
-				 is_bare_repository();
+				 is_bare_repository() &&
+				 keep_pack_list.nr == 0 &&
+				 !has_pack_keep_file();
+	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 86d05160a3..0e9af832c9 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -239,4 +239,14 @@ test_expect_success 'bitmaps can be disabled on bare repos' '
 	test -z "$bitmap"
 '
 
+test_expect_success 'no bitmaps created if .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -ad &&
+	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
+	test_must_be_empty actual
+'
+
 test_done
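
To make the effect of the patch above easier to see, its default-selection rule can be restated as a small pure function. This is a sketch for illustration only: `resolve_write_bitmaps()` is a hypothetical name, and the flags that cmd_repack() reads from pack_everything, is_bare_repository(), keep_pack_list.nr and has_pack_keep_file() are passed in as plain integers.

```c
/*
 * write_bitmaps follows the tri-state used in the patch:
 *   < 0  not set by the command line or the configuration
 *   0    explicitly disabled
 *   > 0  explicitly enabled
 * Only the unset case picks a default, and that default is now "off"
 * as soon as any pack is kept, whether through --keep-pack
 * (keep_pack_nr) or through a .keep file on disk (has_keep_file).
 */
static int resolve_write_bitmaps(int write_bitmaps, int all_into_one,
				 int bare, int keep_pack_nr,
				 int has_keep_file)
{
	if (write_bitmaps >= 0)
		return write_bitmaps;	/* an explicit choice is left alone */
	return all_into_one && bare && keep_pack_nr == 0 && !has_keep_file;
}
```

An explicit request still wins even when .keep files exist, which is the case the separate warning patch in this thread is about.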

On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:

> > Or perhaps we could differentiate our temporary locks from "real" .keep
> > files by looking at the content; I think our locks always say something
> > like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
> > to commit to that, I think (or even adjust it to something even more
> > unambiguous).
>
> True, but it may be overkill to open and read.

Yeah, that crossed my mind as well, but:

  1. We'd only need to open them when we _see_ them. And they're pretty
     rare anyway.

  2. Effort-wise, we're already opening and mmap-ing the .idx files, so
     this is on par.

  3. Most callers don't care about keep-files anyway. We could turn
     packed_git.pack_keep into:

       enum {
               PACK_KEEP_NONE,
               PACK_KEEP_LOCK,
               PACK_KEEP_USER
       } check_packed_keep(struct packed_git *pack);

     and then most programs wouldn't pay anything.

Just some thoughts. I don't have immediate plans to work on it, but
maybe somebody else is excited about it. :)

-Peff
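
A rough sketch of how the lazy enum above could serve both kinds of callers follows. Everything except the three PACK_KEEP_* names is hypothetical: `struct pack_stub` stands in for struct packed_git, the .keep content is held in memory instead of being read from disk, and the lock test is simplified to a "receive-pack " prefix check.

```c
#include <stddef.h>
#include <string.h>

enum pack_keep_kind { PACK_KEEP_NONE, PACK_KEEP_LOCK, PACK_KEEP_USER };

/* Hypothetical stand-in for struct packed_git. */
struct pack_stub {
	const char *keep_content;	/* NULL: no .keep file next to the pack */
	int checked;			/* has keep_kind been computed yet? */
	int inspections;		/* how often the content was examined */
	enum pack_keep_kind keep_kind;
};

/* Classify on first use and cache the answer, so most callers pay nothing. */
static enum pack_keep_kind check_packed_keep(struct pack_stub *p)
{
	if (!p->checked) {
		p->checked = 1;
		if (!p->keep_content) {
			p->keep_kind = PACK_KEEP_NONE;
		} else {
			p->inspections++;
			/* simplified lock test; 13 == strlen("receive-pack ") */
			p->keep_kind = strncmp(p->keep_content,
					       "receive-pack ", 13)
				? PACK_KEEP_USER : PACK_KEEP_LOCK;
		}
	}
	return p->keep_kind;
}

/* Pruning: any .keep, lock or not, protects the pack from deletion. */
static int keep_for_prune(struct pack_stub *p)
{
	return check_packed_keep(p) != PACK_KEEP_NONE;
}

/* Repacking: only a user-provided .keep excludes the pack's objects. */
static int keep_for_repack(struct pack_stub *p)
{
	return check_packed_keep(p) == PACK_KEEP_USER;
}
```

The two predicates correspond to the "two bits" mentioned earlier in the thread; a single cached enum can back both of them.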

On Wed, Jul 03, 2019 at 11:37:52AM -0700, Junio C Hamano wrote:

> In the meantime, this is about patch 2/1; the underlying "when .keep
> is there, disable bitmaps" is probably good to go, still.
>
> -- >8 --
> From: Eric Wong <e@80x24.org>
> Date: Sat, 29 Jun 2019 19:13:59 +0000
> Subject: [PATCH] repack: disable bitmaps-by-default if .keep files exist

Yeah, this one looks good to me. Thanks for keeping things moving.

-Peff

Jeff King <peff@peff.net> writes:

> On Wed, Jul 03, 2019 at 11:10:22AM -0700, Junio C Hamano wrote:
>
>> > Or perhaps we could differentiate our temporary locks from "real" .keep
>> > files by looking at the content; I think our locks always say something
>> > like "(receive|receive)-pack \d+ on .*", and it wouldn't be too onerous
>> > to commit to that, I think (or even adjust it to something even more
>> > unambiguous).
>>
>> True, but it may be overkill to open and read.
>
> Yeah, that crossed my mind as well, but:
>
>   1. We'd only need to open them when we _see_ them. And they're pretty
>      rare anyway.
>
>   2. Effort-wise, we're already opening and mmap-ing the .idx files, so
>      this is on par.
>
>   3. Most callers don't care about keep-files anyway. We could turn
>      packed_git.pack_keep into:
>
>        enum {
>                PACK_KEEP_NONE,
>                PACK_KEEP_LOCK,
>                PACK_KEEP_USER
>        } check_packed_keep(struct packed_git *pack);
>
>      and then most programs wouldn't pay anything.
>
> Just some thoughts. I don't have immediate plans to work on it, but
> maybe somebody else is excited about it. :)

OK.  I do agree that .keep would be rare enough to justify spending
a few extra cycles, as long as the benefit is big enough (and in
this case it may be a good trade-off).

diff --git a/builtin/repack.c b/builtin/repack.c
index 73250b2431..b1eeee88a7 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -359,7 +359,15 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				 is_bare_repository() &&
 				 keep_pack_list.nr == 0 &&
 				 !has_pack_keep_file();
+	} else if (write_bitmaps > 0) {
+		if (keep_pack_list.nr)
+			fprintf(stderr,
+			  _("WARNING: --keep-pack is incompatible with bitmaps\n"));
+		if (has_pack_keep_file())
+			fprintf(stderr,
+			  _("WARNING: .keep files are incompatible with bitmaps\n"));
 	}
+
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps;
 
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 0e9af832c9..839484c7dc 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -249,4 +249,20 @@ test_expect_success 'no bitmaps created if .keep files present' '
 	test_must_be_empty actual
 '
 
+test_expect_success '-b warns with .keep files present' '
+	pack=$(ls bare.git/objects/pack/*.pack) &&
+	test_path_is_file "$pack" &&
+	keep=${pack%.pack}.keep &&
+	>"$keep" &&
+	git -C bare.git repack -adb 2>err &&
+	test_i18ngrep -F ".keep files are incompatible" err &&
+	rm -f "$keep"
+'
+
+test_expect_success '-b warns with --keep-pack specified' '
+	keep=$(cd bare.git/objects/pack/ && ls *.pack) &&
+	git -C bare.git repack -adb --keep-pack="$keep" 2>err &&
+	test_i18ngrep -F "keep-pack is incompatible" err
+'
+
 test_done