From patchwork Fri Jun 2 13:02:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13265295 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E604C7EE2C for ; Fri, 2 Jun 2023 13:02:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235465AbjFBNCj (ORCPT ); Fri, 2 Jun 2023 09:02:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52224 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234366AbjFBNCi (ORCPT ); Fri, 2 Jun 2023 09:02:38 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 993771A8 for ; Fri, 2 Jun 2023 06:02:35 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id 0215B5C010C; Fri, 2 Jun 2023 09:02:35 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Fri, 02 Jun 2023 09:02:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1685710954; x=1685797354; bh=Lv DySHOaOYRtODWhOADAcFFWRZ5TricVu+nCWn71HlU=; b=RhUf/LrJX/Pk06mmfD R/rwltdy4tbH8wpFDhl6OoIzcPLV1UocOp9I6tEaZ6pSRnhAEnUtUGtZsQ5h7D6w TcTt1243XQuoBj3LDQEVubfSzxs/LT5l07RWVxiwhRRCK+bUhIobUzw1Fbj6ZOPw FSS/WoOQa2EsPDR6Ld0fiC0707UeotKhbkV0ffw78PaM8IbR8uVvxfUn78Hn1RNW irrj5B+li5FcTrGiQcL8Fohm9MXLIQfXRkeUTJZZZBNsGsEmHjbCY46STwCAKp2t /x5TeJJV39zxZmwKkqksNFs+ovzXGCr97w02A98tOkiLWD2KWVXHhKbY4cBVttzW 3GUw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1685710954; x=1685797354; bh=LvDySHOaOYRtO DWhOADAcFFWRZ5TricVu+nCWn71HlU=; b=aIhbvcbMtXXVnPuE2xbe1tp4qtfFM iokHaIGi1yc/3qUiuDPt+Z+jtqim3qhFyCOCD1gF68ygPH9p42/lBTrdeuMjD0ik BdYOgA9pGRd4+pXahl+Ip2BrkTspk23jwnd8qHrWvyXCfbC1G4ROCaWZP2xd6QFu rDjCARIQM5FmiYcSvK2qCrGUSTbRhz9nxyd8nV/x3LxvvzjxrIqeE8CsSw+qtWlu Nl+tusR646dEeDXOjuyMl0/BpQac4Wf7aQyNTfUo2Yj3CLCzvl2U4n7jiMRSWbHd okGguc5ESJL9RXoYz+tLhaHJen+sg/IJE6KRp6SboIEvbB078fACAJvOg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeelfedgheejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 2 Jun 2023 09:02:32 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id d8cbbbf8 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Fri, 2 Jun 2023 13:01:37 +0000 (UTC) Date: Fri, 2 Jun 2023 15:02:30 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH 1/5] t1006: don't strip timestamps from expected results Message-ID: <5c8b4a1d70b170e3426cdd537edc0c076115be0a.1685710884.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In t1006 we have a bunch of tests that verify the output format of the git-cat-file(1) command. But while part of the output for some tests would include commit timestamps, we don't verify those but instead strip them before comparing expected with actual results. This is done by the function `maybe_remove_timestamp`, which goes all the way back to the ancient commit b335d3f121 (Add tests for git cat-file, 2008-04-23). Our tests had been in a different shape back then. Most importantly we didn't yet have the infrastructure to create objects with deterministic timestamps. Nowadays we do though, and thus there is no reason anymore to strip the timestamps. Refactor the tests to not strip the timestamp anymore. Signed-off-by: Patrick Steinhardt --- t/t1006-cat-file.sh | 68 +++++++++++++++++---------------------------- 1 file changed, 26 insertions(+), 42 deletions(-) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 8eac74b59c..f139d56fb4 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -109,26 +109,12 @@ strlen () { echo_without_newline "$1" | wc -c | sed -e 's/^ *//' } -maybe_remove_timestamp () { - if test -z "$2"; then - echo_without_newline "$1" - else - echo_without_newline "$(printf '%s\n' "$1" | remove_timestamp)" - fi -} - -remove_timestamp () { - sed -e 's/ [0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$//' -} - - run_tests () { type=$1 sha1=$2 size=$3 content=$4 pretty_content=$5 - no_ts=$6 batch_output="$sha1 $type $size $content" @@ -163,21 +149,21 @@ $content" test -z "$content" || test_expect_success "Content of $type is correct" ' - maybe_remove_timestamp "$content" $no_ts >expect && - maybe_remove_timestamp "$(git cat-file $type $sha1)" $no_ts >actual && + echo_without_newline "$content" >expect && + git cat-file $type $sha1 >actual && test_cmp expect actual ' test_expect_success "Pretty content of $type is correct" ' - maybe_remove_timestamp "$pretty_content" $no_ts >expect && - maybe_remove_timestamp "$(git cat-file -p $sha1)" $no_ts >actual && + echo_without_newline "$pretty_content" >expect && + git cat-file -p $sha1 >actual && test_cmp expect actual ' test -z "$content" || test_expect_success "--batch output of $type is correct" ' - maybe_remove_timestamp "$batch_output" $no_ts >expect && - maybe_remove_timestamp "$(echo $sha1 | git cat-file --batch)" $no_ts >actual && + echo "$batch_output" >expect && + echo $sha1 | git cat-file --batch >actual && test_cmp expect actual ' @@ -191,9 +177,8 @@ $content" do test -z "$content" || test_expect_success "--batch-command $opt output of $type content is correct" ' - maybe_remove_timestamp "$batch_output" $no_ts >expect && - maybe_remove_timestamp "$(test_write_lines "contents $sha1" | - git cat-file --batch-command $opt)" $no_ts >actual && + echo "$batch_output" >expect && + test_write_lines "contents $sha1" | git cat-file --batch-command $opt >actual && test_cmp expect actual ' @@ -228,10 +213,9 @@ $content" test_expect_success "--batch without type ($type)" ' { echo "$size" && - maybe_remove_timestamp "$content" $no_ts + echo "$content" } >expect && - echo $sha1 | git cat-file --batch="%(objectsize)" >actual.full && - maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual && + echo $sha1 | git cat-file --batch="%(objectsize)" >actual && test_cmp expect actual ' @@ -239,10 +223,9 @@ $content" test_expect_success "--batch without size ($type)" ' { echo "$type" && - maybe_remove_timestamp "$content" $no_ts + echo "$content" } >expect && - echo $sha1 | git cat-file --batch="%(objecttype)" >actual.full && - maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual && + echo $sha1 | git cat-file --batch="%(objecttype)" >actual && test_cmp expect actual ' } @@ -284,7 +267,7 @@ test_expect_success '--batch-check without %(rest) considers whole line' ' tree_sha1=$(git write-tree) tree_size=$(($(test_oid rawsz) + 13)) -tree_pretty_content="100644 blob $hello_sha1 hello" +tree_pretty_content="100644 blob $hello_sha1 hello${LF}" run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content" @@ -292,12 +275,12 @@ commit_message="Initial commit" commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1) commit_size=$(($(test_oid hexsz) + 137)) commit_content="tree $tree_sha1 -author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> 0 +0000 -committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 0 +0000 +author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> $GIT_AUTHOR_DATE +committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE $commit_message" -run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" 1 +run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" tag_header_without_timestamp="object $hello_sha1 type blob @@ -311,7 +294,7 @@ $tag_description" tag_sha1=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w) tag_size=$(strlen "$tag_content") -run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" 1 +run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" test_expect_success \ "Reach a blob from a tag pointing to it" \ @@ -400,13 +383,16 @@ deadbeef missing missing" test_expect_success '--batch with multiple sha1s gives correct format' ' - test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)" + echo "$batch_output" >expect && + echo_without_newline "$batch_input" | git cat-file --batch >actual && + test_cmp expect actual ' test_expect_success '--batch, -z with multiple sha1s gives correct format' ' echo_without_newline_nul "$batch_input" >in && - test "$(maybe_remove_timestamp "$batch_output" 1)" = \ - "$(maybe_remove_timestamp "$(git cat-file --batch -z expect && + git cat-file --batch -z actual && + test_cmp expect actual ' batch_check_input="$hello_sha1 @@ -480,7 +466,7 @@ contents deadbeef flush" test_expect_success '--batch-command with multiple command calls gives correct format' ' - remove_timestamp >expect <<-EOF && + cat >expect <<-EOF && $hello_sha1 blob $hello_size $hello_content $commit_sha1 commit $commit_size @@ -491,15 +477,13 @@ test_expect_success '--batch-command with multiple command calls gives correct f EOF echo "$batch_command_multiple_contents" >in && - git cat-file --batch-command --buffer actual_raw && + git cat-file --batch-command --buffer actual && - remove_timestamp actual && test_cmp expect actual && echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && - git cat-file --batch-command --buffer -z actual_raw && + git cat-file --batch-command --buffer -z actual && - remove_timestamp actual && test_cmp expect actual ' From patchwork Fri Jun 2 13:02:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13265296 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9732C77B7A for ; Fri, 2 Jun 2023 13:02:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235560AbjFBNCn (ORCPT ); Fri, 2 Jun 2023 09:02:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234366AbjFBNCl (ORCPT ); Fri, 2 Jun 2023 09:02:41 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71BB11A5 for ; Fri, 2 Jun 2023 06:02:39 -0700 (PDT) Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailout.nyi.internal (Postfix) with ESMTP id D72685C00D0; Fri, 2 Jun 2023 09:02:38 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute2.internal (MEProxy); Fri, 02 Jun 2023 09:02:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1685710958; x=1685797358; bh=wi tQNPkEpk9lsExNDP5sXcR+Em8YDwNKWjK0s5ycZHw=; b=kgaaXK9ex7O+La3O7i Ddz8bBRA1GzrELm0bl0trobWW0BWcFg4VrH+qkxUca0vAui1VHu+fmxHPdYGkPXz pGavBUE0feYLRZP81kXV4ExMbR0XP1BgjXxYTMGOkc9EcDyTjwngWxx/1xPMH5N3 7WL3Qv9qcztrtOkPb55dtY+3XLe9T/Uyc8GqSv4aFu8mZ9rmYVOf9u9JGowoiKZA 9esDLNY2FVdZRMMnPNG0NsTE6yGEcP7KWRkbVKxO47AiAuxL81CWGJbpyUd63Qca a/evkZN5omlNrS4xC8yh1OdBIEhhjlCIjGJvZB8PV1i8fPf0438QNTD8vDs5s/Z3 1ZsA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1685710958; x=1685797358; bh=witQNPkEpk9ls ExNDP5sXcR+Em8YDwNKWjK0s5ycZHw=; b=pfevERob8uy9jdubFtE0LGWp7MiwH +B6tkA+kkOCzH/8JdZr7QsjL33Wbv4V3uZDF20Z5kBZvdJJfIS6Wec60lACJrdvV LwUunVUpX6ksixab8CokdfU86wOzPLw1PA+JiRU4JwHJ9r//4RW1hyKvCK2LQthX ad9n7C9VNU3VsEM+dfou77SDLvdztcEdWufQxPbp4wSFw5f3mTHJgizE39lgOSlV c9NAZ5+rzQHGLaUloTzqLTW8dEpGbkHl6ffpANqmZZsho7mi+9IZuuDg0tL6f4zu n8kgK7c6el0PIApiPUMAx1ggGmOwCsGZGlNIhLBdU6kG6bLRJ7uKvcJSg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeelfedgheejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 2 Jun 2023 09:02:37 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id 2b69fca8 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Fri, 2 Jun 2023 13:01:41 +0000 (UTC) Date: Fri, 2 Jun 2023 15:02:34 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH 2/5] t1006: modernize test style to use `test_cmp` Message-ID: <251fc2a387067bb6d575cfa39f088d4474844a59.1685710884.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The tests for git-cat-file(1) are quite old and haven't ever been updated since they were introduced. They thus tend to use old idioms that have since grown outdated. Most importantly, many of the tests use `test $A = $B` to compare expected and actual output. This has the downside that it is impossible to tell what exactly is different between both versions in case the test fails. Refactor the tests to instead use `test_cmp`. While more verbose, it both tends to be more readable and will result in a nice diff in case states don't match. Signed-off-by: Patrick Steinhardt --- t/t1006-cat-file.sh | 70 ++++++++++++++++++++++++++++++--------------- 1 file changed, 47 insertions(+), 23 deletions(-) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index f139d56fb4..7b985cfded 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -296,9 +296,11 @@ tag_size=$(strlen "$tag_content") run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" -test_expect_success \ - "Reach a blob from a tag pointing to it" \ - "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\"" +test_expect_success "Reach a blob from a tag pointing to it" ' + echo_without_newline "$hello_content" >expect && + git cat-file blob $tag_sha1 >actual && + test_cmp expect actual +' for batch in batch batch-check batch-command do @@ -334,30 +336,47 @@ do done test_expect_success "--batch-check for a non-existent named object" ' - test "foobar42 missing -foobar84 missing" = \ - "$( ( echo foobar42 && echo_without_newline foobar84 ) | git cat-file --batch-check)" + cat >expect <<-EOF && + foobar42 missing + foobar84 missing + EOF + + printf "foobar42\nfoobar84" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch-check for a non-existent hash" ' - test "0000000000000000000000000000000000000042 missing -0000000000000000000000000000000000000084 missing" = \ - "$( ( echo 0000000000000000000000000000000000000042 && - echo_without_newline 0000000000000000000000000000000000000084 ) | - git cat-file --batch-check)" + cat >expect <<-EOF && + 0000000000000000000000000000000000000042 missing + 0000000000000000000000000000000000000084 missing + EOF + + printf "0000000000000000000000000000000000000042\n0000000000000000000000000000000000000084" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch for an existent and a non-existent hash" ' - test "$tag_sha1 tag $tag_size -$tag_content -0000000000000000000000000000000000000000 missing" = \ - "$( ( echo $tag_sha1 && - echo_without_newline 0000000000000000000000000000000000000000 ) | - git cat-file --batch)" + cat >expect <<-EOF && + $tag_sha1 tag $tag_size + $tag_content + 0000000000000000000000000000000000000000 missing + EOF + + printf "$tag_sha1\n0000000000000000000000000000000000000000" >in && + git cat-file --batch actual && + test_cmp expect actual ' test_expect_success "--batch-check for an empty line" ' - test " missing" = "$(echo | git cat-file --batch-check)" + cat >expect <<-EOF && + missing + EOF + + echo >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success 'empty --batch-check notices missing object' ' @@ -384,7 +403,8 @@ deadbeef missing test_expect_success '--batch with multiple sha1s gives correct format' ' echo "$batch_output" >expect && - echo_without_newline "$batch_input" | git cat-file --batch >actual && + echo_without_newline "$batch_input" >in && + git cat-file --batch actual && test_cmp expect actual ' @@ -411,13 +431,17 @@ deadbeef missing missing" test_expect_success "--batch-check with multiple sha1s gives correct format" ' - test "$batch_check_output" = \ - "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)" + echo "$batch_check_output" >expect && + echo_without_newline "$batch_check_input" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch-check, -z with multiple sha1s gives correct format" ' - echo_without_newline_nul "$batch_check_input" >in && - test "$batch_check_output" = "$(git cat-file --batch-check -z expect && + echo_without_newline_nul "$batch_check_input" >in && + git cat-file --batch-check -z actual && + test_cmp expect actual ' test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' From patchwork Fri Jun 2 13:02:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13265297 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8213FC77B7A for ; Fri, 2 Jun 2023 13:02:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235594AbjFBNCs (ORCPT ); Fri, 2 Jun 2023 09:02:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235525AbjFBNCp (ORCPT ); Fri, 2 Jun 2023 09:02:45 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D24B51B3 for ; Fri, 2 Jun 2023 06:02:43 -0700 (PDT) Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailout.nyi.internal (Postfix) with ESMTP id 4E61A5C00CA; Fri, 2 Jun 2023 09:02:43 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute5.internal (MEProxy); Fri, 02 Jun 2023 09:02:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1685710963; x=1685797363; bh=gS rqGwai2YoT4E5P+7bATvRnXf1bno75BnQgcusaOxk=; b=mV/ppVXmTvybOPrxfJ RyrpIVg4QjR1OYP8Ft8bE2vxBYdWA5oVeuK+K1oo1/XQmgMlfZES7Rnd5Zpq0mHd vLnounbyUVsW5wVz7VcM2BfzolkLZEkOn519LMbZNiwIr6b+Uds/raQPL3UwW0oC 0FZO+t+2K4t/hfFt0pCFVnC/iYLECAq+QIO3AJwpkzeXHamdORVLxPxfYlox/Lut KGwqb/Wg2XE7hw83xgXyiHl2OXlpwYdGxZcUj8cEYmgL3VCms+tYbrNTBsPBCBFG 0rzOWT2Ql1DKE12q8gxkgJVTMOlg/E/F+HnKNBMxprKkbLdf4l58ksqhXyvFgLHy LtyQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1685710963; x=1685797363; bh=gSrqGwai2YoT4 E5P+7bATvRnXf1bno75BnQgcusaOxk=; b=D/TDxl55LeBbrcw6h8jOX9kR+9dsc 71A7eLbmL86UhAoM9kNgcLghNYDb5TzJQ/2Wel6SJIIcfa9lDvu+1zIbTJ0dCyl6 AR8hZ8Oo5EII1yrUsxqMyb4g9/ZqG3GZDDgkRqPYG0+yZQJN3CxsUJjURFZhAaVw 2G1VZ+ImQYCipEpeAii73bJaHbDQy0WQuP3/9p+YPwqY4rPrZP1KO251Cgmqi+Uk rYeTv3ZfgkeD5OMsxZVIv5LMDlTcdL6yTEVCgQUOTo5zRhhaPEpTkqjMhCpry7s9 2Lt+YoUshv3Rq5+BD8IKaDh664hUWUOTDE0yxvHqQk6DrbrER58Dt2pdg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeelfedgheejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 2 Jun 2023 09:02:41 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id ba69e51e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Fri, 2 Jun 2023 13:01:45 +0000 (UTC) Date: Fri, 2 Jun 2023 15:02:38 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Message-ID: <8127eeac97200da9aafccdf16cb7ba06f68b0121.1685710884.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Many of our commands support reading input that is separated either via newlines or via NUL characters. Furthermore, in order to be a better cross platform citizen, these commands typically know to strip the CRLF sequence so that we also support reading newline-separated inputs on e.g. the Windows platform. This results in the following kind of awkward pattern: ``` struct strbuf input = STRBUF_INIT; while (1) { int ret; if (nul_terminated) ret = strbuf_getline_nul(&input, stdin); else ret = strbuf_getline(&input, stdin); if (ret) break; ... } ``` Introduce a new CRLF-aware helper function that can read up to a user specified delimiter. If the delimiter is `\n` the function knows to also strip CRLF, otherwise it will only strip the specified delimiter. This results in the following, much more readable code pattern: ``` struct strbuf input = STRBUF_INIT; while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) { ... } ``` The new function will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt --- strbuf.c | 11 ++++++++--- strbuf.h | 12 ++++++++++++ 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/strbuf.c b/strbuf.c index 08eec8f1d8..31dc48c0ab 100644 --- a/strbuf.c +++ b/strbuf.c @@ -721,11 +721,11 @@ static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term) return 0; } -int strbuf_getline(struct strbuf *sb, FILE *fp) +int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term) { - if (strbuf_getwholeline(sb, fp, '\n')) + if (strbuf_getwholeline(sb, fp, term)) return EOF; - if (sb->buf[sb->len - 1] == '\n') { + if (term == '\n' && sb->buf[sb->len - 1] == '\n') { strbuf_setlen(sb, sb->len - 1); if (sb->len && sb->buf[sb->len - 1] == '\r') strbuf_setlen(sb, sb->len - 1); @@ -733,6 +733,11 @@ int strbuf_getline(struct strbuf *sb, FILE *fp) return 0; } +int strbuf_getline(struct strbuf *sb, FILE *fp) +{ + return strbuf_getdelim_strip_crlf(sb, fp, '\n'); +} + int strbuf_getline_lf(struct strbuf *sb, FILE *fp) { return strbuf_getdelim(sb, fp, '\n'); diff --git a/strbuf.h b/strbuf.h index 3dfeadb44c..0e69b656bc 100644 --- a/strbuf.h +++ b/strbuf.h @@ -475,6 +475,18 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint); */ ssize_t strbuf_write(struct strbuf *sb, FILE *stream); +/** + * Read from a FILE * until the specified terminator is encountered, + * overwriting the existing contents of the strbuf. + * + * Reading stops after the terminator or at EOF. The terminator is + * removed from the buffer before returning. If the terminator is LF + * and if it is preceded by a CR, then the whole CRLF is stripped. + * Returns 0 unless there was nothing left before EOF, in which case + * it returns `EOF`. + */ +int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term); + /** * Read a line from a FILE *, overwriting the existing contents of * the strbuf. The strbuf_getline*() family of functions share From patchwork Fri Jun 2 13:02:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13265298 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC206C7EE24 for ; Fri, 2 Jun 2023 13:02:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235761AbjFBNC4 (ORCPT ); Fri, 2 Jun 2023 09:02:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52358 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235595AbjFBNCu (ORCPT ); Fri, 2 Jun 2023 09:02:50 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 350811B5 for ; Fri, 2 Jun 2023 06:02:48 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 752AB5C010C; Fri, 2 Jun 2023 09:02:47 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Fri, 02 Jun 2023 09:02:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1685710967; x=1685797367; bh=q7 GCfA8V7HDnBKB1Tk915p6kG3FU/Mzd1fXGookrldg=; b=ALJEmeBqDEIIgyPm1f KNynlvhwQpASJGCmr+qsbV/GJazqh3I1psc4mx1gAp3w1Vp0FjKhHc5oCH+JP2oM 29mhyVA2XC7l56TKtP+YiMC2O/h+zEjeYob6dtL+bDS7fjPtTWGVX7K1Fs8ujhNy BkeBffhlLZiiUpmOtGXG+2foieNmbJI89BUBzWJ7X+NHu0yxpvVayzofWXSeQVXC VkzlChxppkoy3S93HLaNnAYFUd/lxAIkYWKPCnfy5A5mRVPaC9sym8/9NIEhMsTO xLwJ82aMbj7UX2pavlCOti0ynV9itqDFkBp7WZnQN1KzxiUzSx3vx5TAaUuOEegj z0GA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1685710967; x=1685797367; bh=q7GCfA8V7HDnB KB1Tk915p6kG3FU/Mzd1fXGookrldg=; b=nLPBtL9N94jISAWIqBsAHNhw/A/ID xu6p9llSHcecaKj5fd8UsZEHEgTmbJ7Kxur8TExIZnLxjdo0cr1kRg+O+bNix0DT mzEYNxKiX9nvBs//pqPqBCodGTUdGrY08zTTfqkuJp/39ayrdiRfsZaUnvuT2FA7 WiyHyKePLCF4jT3FpA0cLoUM1KKi5tByyX4yQNa1PGsbvBxnw96CFnSJ6bOat0d4 qaBqMd4HM8sePEdkAIDJmOVSeZGqkUD5AmUJNYTHM1iibQCDFY5ohy5f5Fy6wv5N ynqpTqvHjrYHGopmzMwKG9o/zkMjVe9KKsumA7A2lephsmbi4mZH8C2MQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeelfedgheejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 2 Jun 2023 09:02:45 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id 7593346e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Fri, 2 Jun 2023 13:01:49 +0000 (UTC) Date: Fri, 2 Jun 2023 15:02:42 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH 4/5] cat-file: simplify reading from standard input Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The batch modes of git-cat-file(1) read queries from stantard input that are either newline- or NUL-delimited. This code was introduced via db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), which notes that: """ The refactoring here is slightly unfortunate, since we turn loops like: while (strbuf_getline(&buf, stdin) != EOF) into: while (1) { int ret; if (opt->nul_terminated) ret = strbuf_getline_nul(&input, stdin); else ret = strbuf_getline(&input, stdin); if (ret == EOF) break; } """ The commit proposed introducing a helper function that is easier to use, which is just what we have done in the preceding commit. Refactor the code to use this new helper to simplify the loop. Signed-off-by: Patrick Steinhardt --- builtin/cat-file.c | 32 +++++++++----------------------- 1 file changed, 9 insertions(+), 23 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 0bafc14e6c..001dcb24d6 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -42,7 +42,7 @@ struct batch_options { int all_objects; int unordered; int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */ - int nul_terminated; + char input_delim; const char *format; }; @@ -694,20 +694,12 @@ static void batch_objects_command(struct batch_options *opt, struct queued_cmd *queued_cmd = NULL; size_t alloc = 0, nr = 0; - while (1) { - int i, ret; + while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) { + int i; const struct parse_cmd *cmd = NULL; const char *p = NULL, *cmd_end; struct queued_cmd call = {0}; - if (opt->nul_terminated) - ret = strbuf_getline_nul(&input, stdin); - else - ret = strbuf_getline(&input, stdin); - - if (ret) - break; - if (!input.len) die(_("empty command in input")); if (isspace(*input.buf)) @@ -851,16 +843,7 @@ static int batch_objects(struct batch_options *opt) goto cleanup; } - while (1) { - int ret; - if (opt->nul_terminated) - ret = strbuf_getline_nul(&input, stdin); - else - ret = strbuf_getline(&input, stdin); - - if (ret == EOF) - break; - + while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) { if (data.split_on_whitespace) { /* * Split at first whitespace, tying off the beginning @@ -929,6 +912,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) const char *exp_type = NULL, *obj_name = NULL; struct batch_options batch = {0}; int unknown_type = 0; + int input_nul_terminated = 0; const char * const usage[] = { N_("git cat-file "), @@ -965,7 +949,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) N_("like --batch, but don't emit "), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, batch_option_callback), - OPT_BOOL('z', NULL, &batch.nul_terminated, N_("stdin is NUL-terminated")), + OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")), OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"), N_("read commands from stdin"), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, @@ -1024,10 +1008,12 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) else if (batch.all_objects) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "--batch-all-objects"); - else if (batch.nul_terminated) + else if (input_nul_terminated) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "-z"); + batch.input_delim = input_nul_terminated ? '\0' : '\n'; + /* Batch defaults */ if (batch.buffer_output < 0) batch.buffer_output = batch.all_objects; From patchwork Fri Jun 2 13:02:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13265299 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4120AC7EE24 for ; Fri, 2 Jun 2023 13:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235782AbjFBNDF (ORCPT ); Fri, 2 Jun 2023 09:03:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52464 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235661AbjFBNC4 (ORCPT ); Fri, 2 Jun 2023 09:02:56 -0400 Received: from out5-smtp.messagingengine.com (out5-smtp.messagingengine.com [66.111.4.29]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 798E8E46 for ; Fri, 2 Jun 2023 06:02:52 -0700 (PDT) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id E26805C010E; Fri, 2 Jun 2023 09:02:51 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Fri, 02 Jun 2023 09:02:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1685710971; x=1685797371; bh=SG vkvV6qq8m0+FfiD7A5z1nJWZ8W1QQLVQp8nf48ohM=; b=RAji2Cpj365p2sauE2 WsJD4c4Kk+M8FI64npMqGVkeW61tH0P24gm7gNTTHhgz448DONZ+mvl3VTUWYLl8 So1s9FpQ74ehFpdTJlbIw7RMbR5LIEvfiF499d6ega6Md8xUJALbfpUoA/wXHbDQ Iy319EtdBCV/D6Ftb0vqOzN7iotdaJYw3K22BMO1JtA4kFb2kDwg6NK130NK2QWU Aie/NgazG+rYMZrpfbeb0mZxB8miUBh2zvyxdBEmOFXOEL5JE0a8ME6KdfPVP/Zu 3D9aravTfMw52od6k26GdPusbybscpflamFS6pdfHGsdKNKLaC4Ec59ek8HfGrc1 RJPA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1685710971; x=1685797371; bh=SGvkvV6qq8m0+ FfiD7A5z1nJWZ8W1QQLVQp8nf48ohM=; b=e6m+d0ENhVV+cpi+kHp9CLOeyphte fCkqhZ/nYtP6VIRvQIgg0+bxae/C7mRMbzd1tsJcfW0rpJRFjjD+BdVJXsUOcfdy VJANCG5TVBMxNcGc5E7ZPJDDd2OcqaIWJPqigEOryN+mHyIdY7iz322e6poDs3U2 GAwXhoElT3EPdyZ79WX/Rwz+u/TsAP0V1QouZPlWyQ/P/kma9+BOR220iW2TxQ0n 5op8FZChqot72sAO0jjLE1HgTGnxPG6U9DcJ7qwZWxcNZ15mg9WlER4jcgLs5nM4 reQt+UrPtledPeYOzwFdKavMCzaScJTq06CwGPCIsbtuxVMPZU1/SflXg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrfeelfedgheejucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Fri, 2 Jun 2023 09:02:49 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id 631e4911 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Fri, 2 Jun 2023 13:01:53 +0000 (UTC) Date: Fri, 2 Jun 2023 15:02:46 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters Message-ID: <07a7c34615ec68fa42c725fd34d6144b6b191f03.1685710884.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), we have introduced a new mode to read the input via NUL-delimited records instead of newline-delimited records. This allows the user to query for revisions that have newlines in their path component. While unusual, such queries are perfectly valid and thus it is clear that we should be able to support them properly. Unfortunately, the commit only changed the input to be NUL-delimited, but didn't change the output at the same time. While this is fine for queries that are processed successfully, it is less so for queries that aren't. In the case of missing commits for example the result can become entirely unparsable: ``` $ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" | git cat-file --batch -z 7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10 1234567890 commit missing ``` This is of course a crafted query that is intentionally gaming the deficiency, but more benign queries that contain newlines would have similar problems. Ideally, we should have also changed the output to be NUL-delimited when `-z` is specified to avoid this problem. As the input is NUL-delimited, it is clear that the output in this case cannot ever contain NUL characters by itself. Furthermore, Git does not allow NUL characters in revisions anyway, further stressing the point that using NUL-delimited output is safe. The only exception is of course the object data itself, but as git-cat-file(1) prints the size of the object data clients should read until that specified size has been consumed. But even though `-z` has only been introduced a few releases ago in Git v2.38.0, changing the output format retroactively to also NUL-delimit output would be a backwards incompatible change. And while one could make the argument that the output is inherently broken already, we need to assume that there are existing users out there that use it just fine given that revisions containing newlines are quite exotic. Instead, introduce a new option `-Z` that switches to NUL-delimited input and output. The old `-z` option is marked as deprecated with a hint that its output may become unparsable. Co-authored-by: Toon Claes Signed-off-by: Patrick Steinhardt --- Documentation/git-cat-file.txt | 13 +++- builtin/cat-file.c | 55 +++++++++------ t/t1006-cat-file.sh | 123 ++++++++++++++++++++++++--------- 3 files changed, 137 insertions(+), 54 deletions(-) diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt index 411de2e27d..b1f48fdfb1 100644 --- a/Documentation/git-cat-file.txt +++ b/Documentation/git-cat-file.txt @@ -14,7 +14,7 @@ SYNOPSIS 'git cat-file' (-t | -s) [--allow-unknown-type] 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects] [--buffer] [--follow-symlinks] [--unordered] - [--textconv | --filters] [-z] + [--textconv | --filters] [-z] [-Z] 'git cat-file' (--textconv | --filters) [: | --path= ] @@ -246,6 +246,12 @@ respectively print: -z:: Only meaningful with `--batch`, `--batch-check`, or `--batch-command`; input is NUL-delimited instead of + newline-delimited. This option is deprecated in favor of + `-Z` as the output can otherwise be ambiguous. + +-Z:: + Only meaningful with `--batch`, `--batch-check`, or + `--batch-command`; input and output is NUL-delimited instead of newline-delimited. @@ -384,6 +390,11 @@ notdir SP LF is printed when, during symlink resolution, a file is used as a directory name. +Alternatively, when `-Z` is passed, the line feeds in any of the above examples +are replaced with NUL terminators. This ensures that output will be parsable if +the output itself would contain a linefeed and is thus recommended for +scripting purposes. + CAVEATS ------- diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 001dcb24d6..90ef407d30 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -43,6 +43,7 @@ struct batch_options { int unordered; int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */ char input_delim; + char output_delim; const char *format; }; @@ -437,11 +438,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d } } -static void print_default_format(struct strbuf *scratch, struct expand_data *data) +static void print_default_format(struct strbuf *scratch, struct expand_data *data, + struct batch_options *opt) { - strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid), + strbuf_addf(scratch, "%s %s %"PRIuMAX"%c", oid_to_hex(&data->oid), type_name(data->type), - (uintmax_t)data->size); + (uintmax_t)data->size, opt->output_delim); } /* @@ -470,8 +472,8 @@ static void batch_object_write(const char *obj_name, &data->oid, &data->info, OBJECT_INFO_LOOKUP_REPLACE); if (ret < 0) { - printf("%s missing\n", - obj_name ? obj_name : oid_to_hex(&data->oid)); + printf("%s missing%c", + obj_name ? obj_name : oid_to_hex(&data->oid), opt->output_delim); fflush(stdout); return; } @@ -492,17 +494,18 @@ static void batch_object_write(const char *obj_name, strbuf_reset(scratch); if (!opt->format) { - print_default_format(scratch, data); + print_default_format(scratch, data, opt); } else { strbuf_expand(scratch, opt->format, expand_format, data); - strbuf_addch(scratch, '\n'); + strbuf_addch(scratch, opt->output_delim); } batch_write(opt, scratch->buf, scratch->len); if (opt->batch_mode == BATCH_MODE_CONTENTS) { + char buf[] = {opt->output_delim}; print_object_or_die(opt, data); - batch_write(opt, "\n", 1); + batch_write(opt, buf, 1); } } @@ -520,22 +523,25 @@ static void batch_one_object(const char *obj_name, if (result != FOUND) { switch (result) { case MISSING_OBJECT: - printf("%s missing\n", obj_name); + printf("%s missing%c", obj_name, opt->output_delim); break; case SHORT_NAME_AMBIGUOUS: - printf("%s ambiguous\n", obj_name); + printf("%s ambiguous%c", obj_name, opt->output_delim); break; case DANGLING_SYMLINK: - printf("dangling %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("dangling %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; case SYMLINK_LOOP: - printf("loop %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("loop %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; case NOT_DIR: - printf("notdir %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("notdir %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; default: BUG("unknown get_sha1_with_context result %d\n", @@ -547,9 +553,9 @@ static void batch_one_object(const char *obj_name, } if (ctx.mode == 0) { - printf("symlink %"PRIuMAX"\n%s\n", + printf("symlink %"PRIuMAX"%c%s%c", (uintmax_t)ctx.symlink_path.len, - ctx.symlink_path.buf); + opt->output_delim, ctx.symlink_path.buf, opt->output_delim); fflush(stdout); return; } @@ -913,6 +919,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) struct batch_options batch = {0}; int unknown_type = 0; int input_nul_terminated = 0; + int nul_terminated = 0; const char * const usage[] = { N_("git cat-file "), @@ -920,7 +927,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) N_("git cat-file (-t | -s) [--allow-unknown-type] "), N_("git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]\n" " [--buffer] [--follow-symlinks] [--unordered]\n" - " [--textconv | --filters] [-z]"), + " [--textconv | --filters] [-z] [-Z]"), N_("git cat-file (--textconv | --filters)\n" " [: | --path= ]"), NULL @@ -950,6 +957,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) PARSE_OPT_OPTARG | PARSE_OPT_NONEG, batch_option_callback), OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")), + OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")), OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"), N_("read commands from stdin"), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, @@ -1011,8 +1019,15 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) else if (input_nul_terminated) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "-z"); + else if (nul_terminated) + usage_msg_optf(_("'%s' requires a batch mode"), usage, options, + "-Z"); - batch.input_delim = input_nul_terminated ? '\0' : '\n'; + batch.input_delim = batch.output_delim = '\n'; + if (input_nul_terminated) + batch.input_delim = '\0'; + if (nul_terminated) + batch.input_delim = batch.output_delim = '\0'; /* Batch defaults */ if (batch.buffer_output < 0) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 7b985cfded..d73a0be1b9 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -89,7 +89,8 @@ done for opt in --buffer \ --follow-symlinks \ --batch-all-objects \ - -z + -z \ + -Z do test_expect_success "usage: bad option combination: $opt without batch mode" ' test_incompatible_usage git cat-file $opt && @@ -392,17 +393,18 @@ deadbeef " -batch_output="$hello_sha1 blob $hello_size -$hello_content -$commit_sha1 commit $commit_size -$commit_content -$tag_sha1 tag $tag_size -$tag_content -deadbeef missing - missing" +printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$hello_content" \ + "$commit_sha1 commit $commit_size" \ + "$commit_content" \ + "$tag_sha1 tag $tag_size" \ + "$tag_content" \ + "deadbeef missing" \ + " missing" >batch_output test_expect_success '--batch with multiple sha1s gives correct format' ' - echo "$batch_output" >expect && + tr "\0" "\n" expect && echo_without_newline "$batch_input" >in && git cat-file --batch actual && test_cmp expect actual @@ -410,11 +412,17 @@ test_expect_success '--batch with multiple sha1s gives correct format' ' test_expect_success '--batch, -z with multiple sha1s gives correct format' ' echo_without_newline_nul "$batch_input" >in && - echo "$batch_output" >expect && + tr "\0" "\n" expect && git cat-file --batch -z actual && test_cmp expect actual ' +test_expect_success '--batch, -Z with multiple sha1s gives correct format' ' + echo_without_newline_nul "$batch_input" >in && + git cat-file --batch -Z actual && + test_cmp batch_output actual +' + batch_check_input="$hello_sha1 $tree_sha1 $commit_sha1 @@ -423,40 +431,55 @@ deadbeef " -batch_check_output="$hello_sha1 blob $hello_size -$tree_sha1 tree $tree_size -$commit_sha1 commit $commit_size -$tag_sha1 tag $tag_size -deadbeef missing - missing" +printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$tree_sha1 tree $tree_size" \ + "$commit_sha1 commit $commit_size" \ + "$tag_sha1 tag $tag_size" \ + "deadbeef missing" \ + " missing" >batch_check_output test_expect_success "--batch-check with multiple sha1s gives correct format" ' - echo "$batch_check_output" >expect && + tr "\0" "\n" expect && echo_without_newline "$batch_check_input" >in && git cat-file --batch-check actual && test_cmp expect actual ' test_expect_success "--batch-check, -z with multiple sha1s gives correct format" ' - echo "$batch_check_output" >expect && + tr "\0" "\n" expect && echo_without_newline_nul "$batch_check_input" >in && git cat-file --batch-check -z actual && test_cmp expect actual ' -test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' +test_expect_success "--batch-check, -Z with multiple sha1s gives correct format" ' + echo_without_newline_nul "$batch_check_input" >in && + git cat-file --batch-check -Z actual && + test_cmp batch_check_output actual +' + +test_expect_success FUNNYNAMES 'setup with newline in input' ' touch -- "newline${LF}embedded" && git add -- "newline${LF}embedded" && git commit -m "file with newline embedded" && test_tick && - printf "HEAD:newline${LF}embedded" >in && - git cat-file --batch-check -z actual && + printf "HEAD:newline${LF}embedded" >in +' +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' + git cat-file --batch-check -z actual && echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect && test_cmp expect actual ' +test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' ' + git cat-file --batch-check -Z actual && + printf "%s\0" "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect && + test_cmp expect actual +' + batch_command_multiple_info="info $hello_sha1 info $tree_sha1 info $commit_sha1 @@ -480,7 +503,13 @@ test_expect_success '--batch-command with multiple info calls gives correct form echo "$batch_command_multiple_info" | tr "\n" "\0" >in && git cat-file --batch-command --buffer -z actual && - test_cmp expect actual + test_cmp expect actual && + + echo "$batch_command_multiple_info" | tr "\n" "\0" >in && + tr "\n" "\0" expect_nul && + git cat-file --batch-command --buffer -Z actual && + + test_cmp expect_nul actual ' batch_command_multiple_contents="contents $hello_sha1 @@ -490,15 +519,15 @@ contents deadbeef flush" test_expect_success '--batch-command with multiple command calls gives correct format' ' - cat >expect <<-EOF && - $hello_sha1 blob $hello_size - $hello_content - $commit_sha1 commit $commit_size - $commit_content - $tag_sha1 tag $tag_size - $tag_content - deadbeef missing - EOF + printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$hello_content" \ + "$commit_sha1 commit $commit_size" \ + "$commit_content" \ + "$tag_sha1 tag $tag_size" \ + "$tag_content" \ + "deadbeef missing" >expect_nul && + tr "\0" "\n" expect && echo "$batch_command_multiple_contents" >in && git cat-file --batch-command --buffer actual && @@ -508,7 +537,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && git cat-file --batch-command --buffer -z actual && - test_cmp expect actual + test_cmp expect actual && + + echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && + git cat-file --batch-command --buffer -Z actual && + + test_cmp expect_nul actual ' test_expect_success 'setup blobs which are likely to delta' ' @@ -848,6 +882,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for brok test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for broken in-repo, same-dir links' ' + printf "HEAD:broken-same-dir-link\0" >in && + printf "dangling 25\0HEAD:broken-same-dir-link\0" >expect && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch-check --follow-symlinks works for same-dir links-to-links' ' echo HEAD:link-to-link | git cat-file --batch-check --follow-symlinks >actual && test_cmp found actual @@ -862,6 +903,15 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for pare test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for parent-dir links' ' + echo HEAD:dir/parent-dir-link | git cat-file --batch-check --follow-symlinks >actual && + test_cmp found actual && + printf "notdir 29\0HEAD:dir/parent-dir-link/nope\0" >expect && + printf "HEAD:dir/parent-dir-link/nope\0" >in && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch-check --follow-symlinks works for .. links' ' echo dangling 22 >expect && echo HEAD:dir/link-dir/nope >>expect && @@ -976,6 +1026,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlink breaks loops' ' test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlink -Z breaks loops' ' + printf "loop 10\0HEAD:loop1\0" >expect && + printf "HEAD:loop1\0" >in && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch --follow-symlink returns correct sha and mode' ' echo HEAD:morx | git cat-file --batch >expect && echo HEAD:morx | git cat-file --batch --follow-symlinks >actual &&