Message ID | 20241206124248.160494-1-christian.couder@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Introduce a "promisor-remote" capability | expand |
Christian Couder <christian.couder@gmail.com> writes: > This work is part of some effort to better handle large files/blobs in > a client-server context using promisor remotes dedicated to storing > large blobs. To help understand this effort, this series now contains > a patch (patch 5/5) that adds design documentation about this effort. https://github.com/git/git/actions/runs/12229786922/job/34110073072 is a CI-run on 'seen' with this topic. linux-TEST-vars job is failing. A CI-run for the same topics in 'seen' but without this topic is https://github.com/git/git/actions/runs/12230853182/job/34112864500 This topic seems to break linux-TEST-vars CI job (where different settings like + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master is used).
On Mon, Dec 9, 2024 at 9:04 AM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > This work is part of some effort to better handle large files/blobs in > > a client-server context using promisor remotes dedicated to storing > > large blobs. To help understand this effort, this series now contains > > a patch (patch 5/5) that adds design documentation about this effort. > > https://github.com/git/git/actions/runs/12229786922/job/34110073072 > is a CI-run on 'seen' with this topic. linux-TEST-vars job is failing. > > A CI-run for the same topics in 'seen' but without this topic is > https://github.com/git/git/actions/runs/12230853182/job/34112864500 > > This topic seems to break linux-TEST-vars CI job (where different > settings like + export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master > is used). Yeah, in the "CI tests" section in the cover letter I wrote: > One test, linux-TEST-vars, failed much earlier, in what doesn't look > like a CI issue as I could reproduce the failure locally when setting > GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL to 1. I will investigate, > but in the meantime I think I can send this as-is so we can start > discussing. I noticed that fcb2205b77 (midx: implement support for writing incremental MIDX chains, 2024-08-06) which introduced GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL adds lines like: GIT_TEST_MULTI_PACK_INDEX=0 GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 at the top of a number of repack related test scripts like t7700-repack.sh, so I guess that it should be OK to add the same lines at the top of the t5710 test script added by this series. This should fix the CI failures. I have made this change in my current version. Thanks. Yeah, not sure why
On Mon, Dec 9, 2024 at 11:40 AM Christian Couder
<christian.couder@gmail.com> wrote:
> Yeah, not sure why
Sorry for this. It's an editing mistake.
Christian Couder <christian.couder@gmail.com> writes: > I noticed that fcb2205b77 (midx: implement support for writing > incremental MIDX chains, 2024-08-06) > which introduced GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL adds lines like: > > GIT_TEST_MULTI_PACK_INDEX=0 > GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 > > at the top of a number of repack related test scripts like > t7700-repack.sh, so I guess that it should be OK to add the same lines > at the top of the t5710 test script added by this series. This should > fix the CI failures. > > I have made this change in my current version. Thanks. Is it because the feature is fundamentally incompatible with the multi-pack index (or its incremental writing), or is it merely because the way the feature is verified assumes that the multi-pack index is not used, even though the protocol exchange, capability selection, and the actual behaviour adjustment for the capability are all working just fine? I am assuming it is the latter, but just to make sure we know where we stand... Thanks, again.
On Tue, Dec 10, 2024 at 12:01 AM Junio C Hamano <gitster@pobox.com> wrote: > > Christian Couder <christian.couder@gmail.com> writes: > > > I noticed that fcb2205b77 (midx: implement support for writing > > incremental MIDX chains, 2024-08-06) > > which introduced GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL adds lines like: > > > > GIT_TEST_MULTI_PACK_INDEX=0 > > GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 > > > > at the top of a number of repack related test scripts like > > t7700-repack.sh, so I guess that it should be OK to add the same lines > > at the top of the t5710 test script added by this series. This should > > fix the CI failures. > > > > I have made this change in my current version. > > Thanks. > > Is it because the feature is fundamentally incompatible with the > multi-pack index (or its incremental writing), It's not an incompatibility with the feature developed in this series. Adding the following test script on top of master or even fcb2205b77 (midx: implement support for writing incremental MIDX chains, 2024-08-06), shows that it fails in the same way without any code change to `git` itself from this series: diff --git a/t/t5709-midx-increment-write.sh b/t/t5709-midx-increment-write.sh new file mode 100755 index 0000000000..8801222374 --- /dev/null +++ b/t/t5709-midx-increment-write.sh @@ -0,0 +1,132 @@ +#!/bin/sh + +test_description='test midx incremental write' + +. ./test-lib.sh + +export GIT_TEST_MULTI_PACK_INDEX=1 +export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=1 + +# Setup the repository with three commits, this way HEAD is always +# available and we can hide commit 1 or 2. +test_expect_success 'setup: create "template" repository' ' + git init template && + test_commit -C template 1 && + test_commit -C template 2 && + test_commit -C template 3 && + test-tool genrandom foo 10240 >template/foo && + git -C template add foo && + git -C template commit -m foo +' + +# A bare repo will act as a server repo with unpacked objects. +test_expect_success 'setup: create bare "server" repository' ' + git clone --bare --no-local template server && + mv server/objects/pack/pack-* . && + packfile=$(ls pack-*.pack) && + git -C server unpack-objects --strict <"$packfile" +' + +check_missing_objects () { + git -C "$1" rev-list --objects --all --missing=print > all.txt && + perl -ne 'print if s/^[?]//' all.txt >missing.txt && + test_line_count = "$2" missing.txt && + if test "$2" -lt 2 + then + test "$3" = "$(cat missing.txt)" + else + test -f "$3" && + sort <"$3" >expected_sorted && + sort <missing.txt >actual_sorted && + test_cmp expected_sorted actual_sorted + fi +} + +initialize_server () { + count="$1" + missing_oids="$2" + + # Repack everything first + git -C server -c repack.writebitmaps=false repack -a -d && + + # Remove promisor file in case they exist, useful when reinitializing + rm -rf server/objects/pack/*.promisor && + + # Repack without the largest object and create a promisor pack on server + git -C server -c repack.writebitmaps=false repack -a -d \ + --filter=blob:limit=5k --filter-to="$(pwd)/pack" && + promisor_file=$(ls server/objects/pack/*.pack | sed "s/\.pack/.promisor/") && + >"$promisor_file" && + + # Check objects missing on the server + check_missing_objects server "$count" "$missing_oids" +} + +copy_to_server2 () { + oid_path="$(test_oid_to_path $1)" && + path="server/objects/$oid_path" && + path2="server2/objects/$oid_path" && + mkdir -p $(dirname "$path2") && + cp "$path" "$path2" +} + +test_expect_success "setup for testing promisor remote advertisement" ' + # Create another bare repo called "server2" + git init --bare server2 && + + # Copy the largest object from server to server2 + obj="HEAD:foo" && + oid="$(git -C server rev-parse $obj)" && + copy_to_server2 "$oid" && + + initialize_server 1 "$oid" && + + # Configure server2 as promisor remote for server + git -C server remote add server2 "file://$(pwd)/server2" && + git -C server config remote.server2.promisor true && + + git -C server2 config uploadpack.allowFilter true && + git -C server2 config uploadpack.allowAnySHA1InWant true && + git -C server config uploadpack.allowFilter true && + git -C server config uploadpack.allowAnySHA1InWant true +' + +test_expect_success "setup for subsequent fetches" ' + # Generate new commit with large blob + test-tool genrandom bar 10240 >template/bar && + git -C template add bar && + git -C template commit -m bar && + + # Fetch new commit with large blob + git -C server fetch origin && + git -C server update-ref HEAD FETCH_HEAD && + git -C server rev-parse HEAD >expected_head && + + # Repack everything twice and remove .promisor files before + # each repack. This makes sure everything gets repacked + # into a single packfile. The second repack is necessary + # because the first one fetches from server2 and creates a new + # packfile and its associated .promisor file. + + rm -f server/objects/pack/*.promisor && + git -C server -c repack.writebitmaps=false repack -a -d && + rm -f server/objects/pack/*.promisor && + git -C server -c repack.writebitmaps=false repack -a -d && + + # Unpack everything + rm pack-* && + mv server/objects/pack/pack-* . && + packfile=$(ls pack-*.pack) && + git -C server unpack-objects --strict <"$packfile" && + + # Copy new large object to server2 + obj_bar="HEAD:bar" && + oid_bar="$(git -C server rev-parse $obj_bar)" && + copy_to_server2 "$oid_bar" && + + # Reinitialize server so that the 2 largest objects are missing + printf "%s\n" "$oid" "$oid_bar" >expected_missing.txt && + initialize_server 2 expected_missing.txt +' + +test_done Changing `export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=1` into `export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0` at the top of the file makes it work. This could probably be simplified, but I think it shows that it's just the incremental writing of the multi-pack index that is incompatible or has a bug when doing some repacking. > or is it merely > because the way the feature is verified assumes that the multi-pack > index is not used, even though the protocol exchange, capability > selection, and the actual behaviour adjustment for the capability > are all working just fine? I am assuming it is the latter, but just > to make sure we know where we stand... Let me know if you need more than the above, but I think it's fair for now to just use: GIT_TEST_MULTI_PACK_INDEX=0 GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 at the top of the tests, like it's done in the version 4 of this series I will send soon.
Christian Couder <christian.couder@gmail.com> writes: >> or is it merely >> because the way the feature is verified assumes that the multi-pack >> index is not used, even though the protocol exchange, capability >> selection, and the actual behaviour adjustment for the capability >> are all working just fine? I am assuming it is the latter, but just >> to make sure we know where we stand... > > Let me know if you need more than the above, Hard to say if I got a test script when I asked for a simple yes-or-no question. > but I think it's fair for > now to just use: > > GIT_TEST_MULTI_PACK_INDEX=0 > GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 > > at the top of the tests, like it's done in the version 4 of this > series I will send soon. Doesn't it mean that people should not use multi-pack-index or incremental writing with this feature? If we cannot make both of them work together even in our controlled testing environment, how would the users know what combinations of features are safe to use and what are incompatible? That sounds far from fair at least to me. I see Taylor is included in the Cc: list, so hopefully, we'll get the anomalies you found in the multi-pack stuff resolved and see how well these two things would work together. Thanks.