Message ID | 65557f2d-9de0-49ae-a858-80476aa52b68@web.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Test breakage with zlib-ng | expand |
On Wed, Dec 13, 2023 at 01:28:56PM +0100, René Scharfe wrote: > Add test_object_size and its helpers test_loose_object_size and > test_packed_object_size, which allow determining the size of a Git > object using only the low-level Git commands rev-parse and show-index. > > Use it in t6300 to replace the bare-bones function test_object_file_size > as a motivating example. There it provides the expected output of the > high-level Git command for-each-ref. This adds a packed-object function, but I doubt anybody actually calls it. If we're going to do that, it's probably worth adding some tests for "cat-file --batch-check" or similar. At which point I wonder if rather than having a function for a single object, we are better off just testing the result of: git cat-file --batch-all-objects --unordered --batch-check='%(objectsize:disk)' against a single post-processed "show-index" invocation. > So how about this? I'm a bit nervous about all the rules about output > descriptors and error propagation and whatnot in the test library, but > this implementation seems simple enough and might be useful in more than > one test. No idea how to add support for alternate object directories, > but I doubt we'll ever need it. I'm not sure that we need to do anything special with output redirection. Shouldn't these functions just send errors to stderr as usual? If they are run inside a test_expect block, that goes to descriptor 4 (which is either /dev/null or the original stderr, depending on whether "-v" was used). > +test_loose_object_size () { > + test "$#" -ne 1 && BUG "1 param" > + local path=$(test_oid_to_path "$1") > + test_file_size "$(git rev-parse --git-path "objects/$path")" 2>&4 > +} OK. We lose the exit code from "rev-parse" but that is probably OK for our purposes. > +test_packed_object_size () { > + test "$#" -ne 2 && BUG "2 params" > + local oid=$1 idx=$2 packsize rawsz end > + > + packsize=$(test_file_size "${idx%.idx}.pack") > + rawsz=$(test_oid rawsz) > + end=$(($packsize - $rawsz)) OK, this $end is the magic required for the final entry. Makes sense. > + git show-index <"$idx" | > + awk -v oid="$oid" -v end="$end" ' > + $2 == oid {start = $1} > + {offsets[$1] = 1} > + END { > + if (!start || start >= end) > + exit 1 > + for (o in offsets) > + if (start < o && o < end) > + end = o > + print end - start > + } > + ' && return 0 I was confused at first, because I didn't see any sorting happening. But if I understand correctly, you're just looking for the smallest "end" that comes after the start of the object we're looking for. Which I think works. -Peff
Am 14.12.23 um 21:59 schrieb Jeff King: > On Wed, Dec 13, 2023 at 01:28:56PM +0100, René Scharfe wrote: > >> Add test_object_size and its helpers test_loose_object_size and >> test_packed_object_size, which allow determining the size of a Git >> object using only the low-level Git commands rev-parse and show-index. >> >> Use it in t6300 to replace the bare-bones function test_object_file_size >> as a motivating example. There it provides the expected output of the >> high-level Git command for-each-ref. > > This adds a packed-object function, but I doubt anybody actually calls > it. If we're going to do that, it's probably worth adding some tests for > "cat-file --batch-check" or similar. Yes, and I was assuming that someone else would be eager to add such tests. *ahem* > At which point I wonder if rather than having a function for a single > object, we are better off just testing the result of: > > git cat-file --batch-all-objects --unordered --batch-check='%(objectsize:disk)' > > against a single post-processed "show-index" invocation. Sure, we might want to optimize for bulk-processing and possibly end up only checking the size of single objects in t6300, making new library functions unnecessary. When dumping size information of multiple objects, it's probably a good idea to include "%(objectname)" as well in the format. You'd need one show-index call for each .idx file. A simple test would only have a single one; a library function might need to handle multiple packs. >> So how about this? I'm a bit nervous about all the rules about output >> descriptors and error propagation and whatnot in the test library, but >> this implementation seems simple enough and might be useful in more than >> one test. No idea how to add support for alternate object directories, >> but I doubt we'll ever need it. > > I'm not sure that we need to do anything special with output > redirection. Shouldn't these functions just send errors to stderr as > usual? If they are run inside a test_expect block, that goes to > descriptor 4 (which is either /dev/null or the original stderr, > depending on whether "-v" was used). Good point. My bad excuse is that I copied the redirection to fd 4 from test_grep. >> + git show-index <"$idx" | >> + awk -v oid="$oid" -v end="$end" ' >> + $2 == oid {start = $1} >> + {offsets[$1] = 1} >> + END { >> + if (!start || start >= end) >> + exit 1 >> + for (o in offsets) >> + if (start < o && o < end) >> + end = o >> + print end - start >> + } >> + ' && return 0 > > I was confused at first, because I didn't see any sorting happening. But > if I understand correctly, you're just looking for the smallest "end" > that comes after the start of the object we're looking for. Which I > think works. Yes, calculating the minimum offset suffices when handling a single object -- no sorting required. For bulk mode we'd better sort, of course: git show-index <"$idx" | sort -n | awk -v end="$end" ' NR > 1 {print oid, $1 - start} {start = $1; oid = $2} END {print oid, end - start} ' No idea how to make such a thing robust against malformed or truncated output from show-index, but perhaps that's not necessary, depending on how the result is used. René
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh index 843a7fe143..4687660f38 100755 --- a/t/t6300-for-each-ref.sh +++ b/t/t6300-for-each-ref.sh @@ -20,12 +20,6 @@ setdate_and_increment () { export GIT_COMMITTER_DATE GIT_AUTHOR_DATE } -test_object_file_size () { - oid=$(git rev-parse "$1") - path=".git/objects/$(test_oid_to_path $oid)" - test_file_size "$path" -} - test_expect_success setup ' # setup .mailmap cat >.mailmap <<-EOF && @@ -40,7 +34,11 @@ test_expect_success setup ' git branch -M main && setdate_and_increment && git tag -a -m "Tagging at $datestamp" testtag && + testtag_oid=$(git rev-parse refs/tags/testtag) && + testtag_disksize=$(test_object_size $testtag_oid) && git update-ref refs/remotes/origin/main main && + commit_oid=$(git rev-parse refs/heads/main) && + commit_disksize=$(test_object_size $commit_oid) && git remote add origin nowhere && git config branch.main.remote origin && git config branch.main.merge refs/heads/main && @@ -129,7 +127,7 @@ test_atom head push:strip=1 remotes/myfork/main test_atom head push:strip=-1 main test_atom head objecttype commit test_atom head objectsize $((131 + hexlen)) -test_atom head objectsize:disk $(test_object_file_size refs/heads/main) +test_atom head objectsize:disk $commit_disksize test_atom head deltabase $ZERO_OID test_atom head objectname $(git rev-parse refs/heads/main) test_atom head objectname:short $(git rev-parse --short refs/heads/main) @@ -203,8 +201,8 @@ test_atom tag upstream '' test_atom tag push '' test_atom tag objecttype tag test_atom tag objectsize $((114 + hexlen)) -test_atom tag objectsize:disk $(test_object_file_size refs/tags/testtag) -test_atom tag '*objectsize:disk' $(test_object_file_size refs/heads/main) +test_atom tag objectsize:disk $testtag_disksize +test_atom tag '*objectsize:disk' $commit_disksize test_atom tag deltabase $ZERO_OID test_atom tag '*deltabase' $ZERO_OID test_atom tag objectname $(git rev-parse refs/tags/testtag) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index 9c3cf12b26..9b49645f77 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1733,6 +1733,53 @@ test_oid_to_path () { echo "${1%$basename}/$basename" } +test_loose_object_size () { + test "$#" -ne 1 && BUG "1 param" + local path=$(test_oid_to_path "$1") + test_file_size "$(git rev-parse --git-path "objects/$path")" 2>&4 +} + +test_packed_object_size () { + test "$#" -ne 2 && BUG "2 params" + local oid=$1 idx=$2 packsize rawsz end + + packsize=$(test_file_size "${idx%.idx}.pack") + rawsz=$(test_oid rawsz) + end=$(($packsize - $rawsz)) + + git show-index <"$idx" | + awk -v oid="$oid" -v end="$end" ' + $2 == oid {start = $1} + {offsets[$1] = 1} + END { + if (!start || start >= end) + exit 1 + for (o in offsets) + if (start < o && o < end) + end = o + print end - start + } + ' && return 0 + + echo >&4 "error: '$oid' not found in '$idx'" + return 1 +} + +test_object_size () { + test "$#" -ne 1 && BUG "1 param" + local oid=$1 + + test_loose_object_size "$oid" 4>/dev/null && return 0 + + for idx in "$(git rev-parse --git-path objects/pack)"/pack-*.idx + do + test_packed_object_size "$oid" "$idx" 4>/dev/null && return 0 + done + + echo >&4 "error: '$oid' not found" + return 1 +} + # Parse oids from git ls-files --staged output test_parse_ls_files_stage_oids () { awk '{print $2}' -
Add test_object_size and its helpers test_loose_object_size and test_packed_object_size, which allow determining the size of a Git object using only the low-level Git commands rev-parse and show-index. Use it in t6300 to replace the bare-bones function test_object_file_size as a motivating example. There it provides the expected output of the high-level Git command for-each-ref. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> --- So how about this? I'm a bit nervous about all the rules about output descriptors and error propagation and whatnot in the test library, but this implementation seems simple enough and might be useful in more than one test. No idea how to add support for alternate object directories, but I doubt we'll ever need it. --- t/t6300-for-each-ref.sh | 16 ++++++-------- t/test-lib-functions.sh | 47 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+), 9 deletions(-) -- 2.43.0