Message ID | 1ed91f6d255b76bdbdcccea7e1effcebbb263ced.1639758526.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Two small 'git repack' fixes | expand |
On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote: > From: Derrick Stolee <dstolee@microsoft.com> > > Historically, we needed a single packfile in order to have reachability > bitmaps. This introduced logic that when 'git repack' had a '-b' option > that we should stop sending the '--honor-pack-keep' option to the 'git > pack-objects' child process, ensuring that we create a packfile > containing all reachable objects. > > In the world of multi-pack-index bitmaps, we no longer need to repack > all objects into a single pack to have valid bitmaps. Thus, we should > continue sending the '--honor-pack-keep' flag to 'git pack-objects'. > > The fix is very simple: only disable the flag when writing bitmaps but > also _not_ writing the multi-pack-index. > > This opens the door to new repacking strategies that might want to keep > some historical set of objects in a stable pack-file while only > repacking more recent objects. That all makes sense. Another way of thinking about it: we disable --honor-pack-keep so we that keep packs do not interfere with writing a pack bitmap. But with --write-midx, we skip the pack bitmap entirely. In the end it's the same, but I wanted to emphasize that the important hing is not so much that we are writing a midx bitmap as that we are _not_ writing a pack bitmap. And that is what makes this OK to do (that the repack code already disables the pack bitmap when writing a midx one). > diff --git a/builtin/repack.c b/builtin/repack.c > index 9b0be6a6ab3..1f128b7c90b 100644 > --- a/builtin/repack.c > +++ b/builtin/repack.c > @@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) > write_bitmaps = 0; > } > if (pack_kept_objects < 0) > - pack_kept_objects = write_bitmaps > 0; > + pack_kept_objects = write_bitmaps > 0 && !write_midx; So the code change here looks good. > diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh > index 0260ad6f0e0..8c4ba6500be 100755 > --- a/t/t7700-repack.sh > +++ b/t/t7700-repack.sh > @@ -372,4 +372,19 @@ test_expect_success '--write-midx with preferred bitmap tips' ' > ) > ' > > +test_expect_success '--write-midx -b packs non-kept objects' ' > + git init midx-kept && > + test_when_finished "rm -fr midx-kept" && > + ( > + cd midx-kept && > + test_commit_bulk 100 && > + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \ > + git repack --write-midx -a -b && > + cat trace.txt | \ > + grep \"event\":\"start\" | \ > + grep pack-objects | \ > + grep \"--honor-pack-keep\" > + ) > +' This looks correct, though: - do we really need this separate repo directory? The test before it uses one, but only because it needs a very specific set of commits. There is a long-running "midx" directory we could use, though I think just the regular test repo would be fine, too. - likewise, do we need 100 commits? They are not too expensive these days with test_commit_bulk, but I think we don't even care about the repo contents at all. - there is no actual .keep file in your test. That's OK, as we are just checking the args passed to pack-objects. - useless use of cat. :) Also, you could probably do it all with one grep. This is bikeshedding, of course, but it's nice to keep process counts low in the test suite. Also, your middle grep is looser than the others (it doesn't check for surrounding quotes). So something like this would work: test_expect_success '--write-midx -b packs non-kept objects' ' GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \ git -C midx repack --write-midx -a -b && grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \ midx-keep-bitmap.trace ' One could perhaps argue that the combined grep is less readable. If that's a concern, I'd suggest wrapping it in a function like: # usage: check_trace2_arg <trace_file> <cmd> <arg> check_trace2_arg () { grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1" } All just suggestions, of course. I'd be happy enough to see the patch go in as-is. -Peff
On Fri, Dec 17 2021, Derrick Stolee via GitGitGadget wrote: > +test_expect_success '--write-midx -b packs non-kept objects' ' > + git init midx-kept && > + test_when_finished "rm -fr midx-kept" && Nit: Better the other way around, so we'll clean up things if "git init" dies midway, but unlikely to ever happen here. > + ( > + cd midx-kept && > + test_commit_bulk 100 && > + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \ > + git repack --write-midx -a -b && > + cat trace.txt | \ > + grep \"event\":\"start\" | \ The "cat" here should go in favor of "grep <pat> trace.txt", and can't this all be on one grep line with "." between the relevant parts you're extracting out?
On 12/17/2021 12:24 PM, Jeff King wrote: > On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote: ... >> +test_expect_success '--write-midx -b packs non-kept objects' ' >> + git init midx-kept && >> + test_when_finished "rm -fr midx-kept" && >> + ( >> + cd midx-kept && >> + test_commit_bulk 100 && >> + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \ >> + git repack --write-midx -a -b && >> + cat trace.txt | \ >> + grep \"event\":\"start\" | \ >> + grep pack-objects | \ >> + grep \"--honor-pack-keep\" >> + ) >> +' > > This looks correct, though: > > - do we really need this separate repo directory? The test before it > uses one, but only because it needs a very specific set of commits. > There is a long-running "midx" directory we could use, though I > think just the regular test repo would be fine, too. > > - likewise, do we need 100 commits? They are not too expensive these > days with test_commit_bulk, but I think we don't even care about the > repo contents at all. > > - there is no actual .keep file in your test. That's OK, as we are > just checking the args passed to pack-objects. > > - useless use of cat. :) Also, you could probably do it all with one > grep. This is bikeshedding, of course, but it's nice to keep process > counts low in the test suite. Also, your middle grep is looser than > the others (it doesn't check for surrounding quotes). > > So something like this would work: > > test_expect_success '--write-midx -b packs non-kept objects' ' > GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \ > git -C midx repack --write-midx -a -b && > grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \ > midx-keep-bitmap.trace > ' > > One could perhaps argue that the combined grep is less readable. If > that's a concern, I'd suggest wrapping it in a function like: > > # usage: check_trace2_arg <trace_file> <cmd> <arg> > check_trace2_arg () { > grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1" > } > > All just suggestions, of course. I'd be happy enough to see the patch go > in as-is. Thanks for the suggestions. I decided to adapt the 'test_subcommand' helper into a 'test_subcommand_inexact' helper. The existing helper requires the full argument list in exact order, but the new one only cares about the given arguments (in that relative order). Thanks, -Stolee
On Mon, Dec 20, 2021 at 08:40:09AM -0500, Derrick Stolee wrote: > > One could perhaps argue that the combined grep is less readable. If > > that's a concern, I'd suggest wrapping it in a function like: > > > > # usage: check_trace2_arg <trace_file> <cmd> <arg> > > check_trace2_arg () { > > grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1" > > } > > > > All just suggestions, of course. I'd be happy enough to see the patch go > > in as-is. > > Thanks for the suggestions. I decided to adapt the 'test_subcommand' > helper into a 'test_subcommand_inexact' helper. The existing helper > requires the full argument list in exact order, but the new one only > cares about the given arguments (in that relative order). Heh, I should have looked to see if we had faced this problem before. Extending that family of helpers is better still than my suggestion. -Peff
diff --git a/builtin/repack.c b/builtin/repack.c index 9b0be6a6ab3..1f128b7c90b 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) write_bitmaps = 0; } if (pack_kept_objects < 0) - pack_kept_objects = write_bitmaps > 0; + pack_kept_objects = write_bitmaps > 0 && !write_midx; if (write_bitmaps && !(pack_everything & ALL_INTO_ONE) && !write_midx) die(_(incremental_bitmap_conflict_error)); diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh index 0260ad6f0e0..8c4ba6500be 100755 --- a/t/t7700-repack.sh +++ b/t/t7700-repack.sh @@ -372,4 +372,19 @@ test_expect_success '--write-midx with preferred bitmap tips' ' ) ' +test_expect_success '--write-midx -b packs non-kept objects' ' + git init midx-kept && + test_when_finished "rm -fr midx-kept" && + ( + cd midx-kept && + test_commit_bulk 100 && + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \ + git repack --write-midx -a -b && + cat trace.txt | \ + grep \"event\":\"start\" | \ + grep pack-objects | \ + grep \"--honor-pack-keep\" + ) +' + test_done