From patchwork Tue Jun 6 05:19:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13268245 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CA8AC7EE24 for ; Tue, 6 Jun 2023 05:19:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232492AbjFFFTk (ORCPT ); Tue, 6 Jun 2023 01:19:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230252AbjFFFTg (ORCPT ); Tue, 6 Jun 2023 01:19:36 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6DFEA123 for ; Mon, 5 Jun 2023 22:19:34 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.west.internal (Postfix) with ESMTP id A78393200AEC; Tue, 6 Jun 2023 01:19:33 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Tue, 06 Jun 2023 01:19:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1686028773; x=1686115173; bh=bM rPXm5DM+rxk5IeZ05/hKLcor+ZSYlqMv6RjbZo91Y=; b=ch7XFgmQWCJTJQd14F xKkq8z9ug2Y4MWTuzI22BysvsVHKjQbdWAskdsvuQaLLdt2VpueWNSSyihtUPZqg pL/RpfBtiIm3PrrXec/GQvZrrqvSRvb9XfDEnXuYjqDF1AS4W69/iwFMgTKW2CN3 v5vmB7dPPykVsCBEs75FwfCH+ycRJE7dCQ946vcCJKos10HSZc+xbKIPjPCc74rb T/qbrEJa53NErLrKaW640cbV1ine5cVY9hSP7m7M8tcSlxGoyVAeXiG9OCaFzhMt q9Ctxq9kL9Bjfy2EaUEdzuu/P67C/dwoZ7ymmJezx+PGqV+wbUDlWasmISeVLAXE n4JQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1686028773; x=1686115173; bh=bMrPXm5DM+rxk 5IeZ05/hKLcor+ZSYlqMv6RjbZo91Y=; b=znu8GwA3lK/WnEW+MUOwodGYnpYWe iabuuM8MwHd6usQHOJMxWjYi9j+aKEdfCT+wePwMxQvcKPKZ15CA8s7jTQ1K7LIm zcSSAoENhW8BOwHbw5UyGWD9uKOjyKNxnbil3QgQ7iWWpCAewsDTDaeJdDGJODEq KuUJ5h6wUFxbCFwgBvsel+87eFLX3LH1atdJJla5usFVVUnqRtVUH6r9L0HM1WOJ BAWJx3o2HRpkylTW6OoxM2m0eC8y/ZzM6YUd5SLfMI9IlbmYLYPMV5b6qIPgyyfH /xnJ5MhflBGT9bd/9a6df5kxzXMPk+u8XQM/uoHPmLfe3oFk7BiFTd5Hg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrgedttddgleduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 6 Jun 2023 01:19:31 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id 73960995 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 6 Jun 2023 05:18:27 +0000 (UTC) Date: Tue, 6 Jun 2023 07:19:29 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH v2 1/5] t1006: don't strip timestamps from expected results Message-ID: <5c8b4a1d70b170e3426cdd537edc0c076115be0a.1686028409.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In t1006 we have a bunch of tests that verify the output format of the git-cat-file(1) command. But while part of the output for some tests would include commit timestamps, we don't verify those but instead strip them before comparing expected with actual results. This is done by the function `maybe_remove_timestamp`, which goes all the way back to the ancient commit b335d3f121 (Add tests for git cat-file, 2008-04-23). Our tests had been in a different shape back then. Most importantly we didn't yet have the infrastructure to create objects with deterministic timestamps. Nowadays we do though, and thus there is no reason anymore to strip the timestamps. Refactor the tests to not strip the timestamp anymore. Signed-off-by: Patrick Steinhardt --- t/t1006-cat-file.sh | 68 +++++++++++++++++---------------------------- 1 file changed, 26 insertions(+), 42 deletions(-) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 8eac74b59c..f139d56fb4 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -109,26 +109,12 @@ strlen () { echo_without_newline "$1" | wc -c | sed -e 's/^ *//' } -maybe_remove_timestamp () { - if test -z "$2"; then - echo_without_newline "$1" - else - echo_without_newline "$(printf '%s\n' "$1" | remove_timestamp)" - fi -} - -remove_timestamp () { - sed -e 's/ [0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$//' -} - - run_tests () { type=$1 sha1=$2 size=$3 content=$4 pretty_content=$5 - no_ts=$6 batch_output="$sha1 $type $size $content" @@ -163,21 +149,21 @@ $content" test -z "$content" || test_expect_success "Content of $type is correct" ' - maybe_remove_timestamp "$content" $no_ts >expect && - maybe_remove_timestamp "$(git cat-file $type $sha1)" $no_ts >actual && + echo_without_newline "$content" >expect && + git cat-file $type $sha1 >actual && test_cmp expect actual ' test_expect_success "Pretty content of $type is correct" ' - maybe_remove_timestamp "$pretty_content" $no_ts >expect && - maybe_remove_timestamp "$(git cat-file -p $sha1)" $no_ts >actual && + echo_without_newline "$pretty_content" >expect && + git cat-file -p $sha1 >actual && test_cmp expect actual ' test -z "$content" || test_expect_success "--batch output of $type is correct" ' - maybe_remove_timestamp "$batch_output" $no_ts >expect && - maybe_remove_timestamp "$(echo $sha1 | git cat-file --batch)" $no_ts >actual && + echo "$batch_output" >expect && + echo $sha1 | git cat-file --batch >actual && test_cmp expect actual ' @@ -191,9 +177,8 @@ $content" do test -z "$content" || test_expect_success "--batch-command $opt output of $type content is correct" ' - maybe_remove_timestamp "$batch_output" $no_ts >expect && - maybe_remove_timestamp "$(test_write_lines "contents $sha1" | - git cat-file --batch-command $opt)" $no_ts >actual && + echo "$batch_output" >expect && + test_write_lines "contents $sha1" | git cat-file --batch-command $opt >actual && test_cmp expect actual ' @@ -228,10 +213,9 @@ $content" test_expect_success "--batch without type ($type)" ' { echo "$size" && - maybe_remove_timestamp "$content" $no_ts + echo "$content" } >expect && - echo $sha1 | git cat-file --batch="%(objectsize)" >actual.full && - maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual && + echo $sha1 | git cat-file --batch="%(objectsize)" >actual && test_cmp expect actual ' @@ -239,10 +223,9 @@ $content" test_expect_success "--batch without size ($type)" ' { echo "$type" && - maybe_remove_timestamp "$content" $no_ts + echo "$content" } >expect && - echo $sha1 | git cat-file --batch="%(objecttype)" >actual.full && - maybe_remove_timestamp "$(cat actual.full)" $no_ts >actual && + echo $sha1 | git cat-file --batch="%(objecttype)" >actual && test_cmp expect actual ' } @@ -284,7 +267,7 @@ test_expect_success '--batch-check without %(rest) considers whole line' ' tree_sha1=$(git write-tree) tree_size=$(($(test_oid rawsz) + 13)) -tree_pretty_content="100644 blob $hello_sha1 hello" +tree_pretty_content="100644 blob $hello_sha1 hello${LF}" run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content" @@ -292,12 +275,12 @@ commit_message="Initial commit" commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1) commit_size=$(($(test_oid hexsz) + 137)) commit_content="tree $tree_sha1 -author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> 0 +0000 -committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 0 +0000 +author $GIT_AUTHOR_NAME <$GIT_AUTHOR_EMAIL> $GIT_AUTHOR_DATE +committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE $commit_message" -run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" 1 +run_tests 'commit' $commit_sha1 $commit_size "$commit_content" "$commit_content" tag_header_without_timestamp="object $hello_sha1 type blob @@ -311,7 +294,7 @@ $tag_description" tag_sha1=$(echo_without_newline "$tag_content" | git hash-object -t tag --stdin -w) tag_size=$(strlen "$tag_content") -run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" 1 +run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" test_expect_success \ "Reach a blob from a tag pointing to it" \ @@ -400,13 +383,16 @@ deadbeef missing missing" test_expect_success '--batch with multiple sha1s gives correct format' ' - test "$(maybe_remove_timestamp "$batch_output" 1)" = "$(maybe_remove_timestamp "$(echo_without_newline "$batch_input" | git cat-file --batch)" 1)" + echo "$batch_output" >expect && + echo_without_newline "$batch_input" | git cat-file --batch >actual && + test_cmp expect actual ' test_expect_success '--batch, -z with multiple sha1s gives correct format' ' echo_without_newline_nul "$batch_input" >in && - test "$(maybe_remove_timestamp "$batch_output" 1)" = \ - "$(maybe_remove_timestamp "$(git cat-file --batch -z expect && + git cat-file --batch -z actual && + test_cmp expect actual ' batch_check_input="$hello_sha1 @@ -480,7 +466,7 @@ contents deadbeef flush" test_expect_success '--batch-command with multiple command calls gives correct format' ' - remove_timestamp >expect <<-EOF && + cat >expect <<-EOF && $hello_sha1 blob $hello_size $hello_content $commit_sha1 commit $commit_size @@ -491,15 +477,13 @@ test_expect_success '--batch-command with multiple command calls gives correct f EOF echo "$batch_command_multiple_contents" >in && - git cat-file --batch-command --buffer actual_raw && + git cat-file --batch-command --buffer actual && - remove_timestamp actual && test_cmp expect actual && echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && - git cat-file --batch-command --buffer -z actual_raw && + git cat-file --batch-command --buffer -z actual && - remove_timestamp actual && test_cmp expect actual ' From patchwork Tue Jun 6 05:19:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13268246 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8332C77B7A for ; Tue, 6 Jun 2023 05:19:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234017AbjFFFTn (ORCPT ); Tue, 6 Jun 2023 01:19:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234550AbjFFFTj (ORCPT ); Tue, 6 Jun 2023 01:19:39 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2082A1BB for ; Mon, 5 Jun 2023 22:19:38 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.west.internal (Postfix) with ESMTP id 565F132009B1; Tue, 6 Jun 2023 01:19:37 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Tue, 06 Jun 2023 01:19:37 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1686028776; x=1686115176; bh=k9 5WaMVskBhrjniFlDSdZMCJDGDtd9dyoGz6P7+Qw40=; b=zXYcVWcZP+3NA122eH iRDTS9swfPyYkTvJ6yfEHfFmw6tfeNw3/3+Kdblxu+5XcWatwHsDl0qFCqsYm+/+ Kj2IbDFMUqWVxFh+pqnqr1ut8vndMnCRDEo1hwjoBv+/PY+1j0T8v2RyTz4W0pAP yvutVK+xm7Zdvl8dA9O2KMHEKP6TgBP/V0ikR4mO+0CYDXP4s/Hp1Eqd0eUWTzEn Fbs1AzENnsy0BETfE3RTc20XyqgMVB7Ro/fZvTie3LJeT5ZFHlllxILCfuUQz8jM ugWxH8t8PcRvdJFOPt4mBHF64nEUjmRviSviuZ6bvddfRfB0HG0jd5qUpQVge1Na WH8g== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1686028776; x=1686115176; bh=k95WaMVskBhrj niFlDSdZMCJDGDtd9dyoGz6P7+Qw40=; b=XMTTwd94KO7QNjsnzCVZ1Hg9csccw 4ZucNR83fBQivIZn+4PZmtl8ivkK1PtONfyOSBK9qd705tMKY2r3uWmFbuFen1fI c3mG4sBPIRURoN2VwCvcx/xWTJJthRwnG+HSpisrgs9CHEuZICUPN8l+W2mvpPqb I3AAxN8ENcvh2GhmsaU0VeE3auvNGhS9QZ9ogQbOkaiGJ7yZ9QW9TIvqEEJwKKbQ GLks1acgoi6HIyIc7QMOEY6fe6nIMcitOG+1cj8gdk/TXCU7Ha8KmJav9qidyqyE eYsSMy+fDOKz49H+Ip3BPK/+E8UpvgL34l/oVLXa33xq6OwKJ6saEebcA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrgedttddgleduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepudenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 6 Jun 2023 01:19:35 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id c96fb15e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 6 Jun 2023 05:18:31 +0000 (UTC) Date: Tue, 6 Jun 2023 07:19:33 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH v2 2/5] t1006: modernize test style to use `test_cmp` Message-ID: <251fc2a387067bb6d575cfa39f088d4474844a59.1686028409.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The tests for git-cat-file(1) are quite old and haven't ever been updated since they were introduced. They thus tend to use old idioms that have since grown outdated. Most importantly, many of the tests use `test $A = $B` to compare expected and actual output. This has the downside that it is impossible to tell what exactly is different between both versions in case the test fails. Refactor the tests to instead use `test_cmp`. While more verbose, it both tends to be more readable and will result in a nice diff in case states don't match. Signed-off-by: Patrick Steinhardt --- t/t1006-cat-file.sh | 70 ++++++++++++++++++++++++++++++--------------- 1 file changed, 47 insertions(+), 23 deletions(-) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index f139d56fb4..7b985cfded 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -296,9 +296,11 @@ tag_size=$(strlen "$tag_content") run_tests 'tag' $tag_sha1 $tag_size "$tag_content" "$tag_content" -test_expect_success \ - "Reach a blob from a tag pointing to it" \ - "test '$hello_content' = \"\$(git cat-file blob $tag_sha1)\"" +test_expect_success "Reach a blob from a tag pointing to it" ' + echo_without_newline "$hello_content" >expect && + git cat-file blob $tag_sha1 >actual && + test_cmp expect actual +' for batch in batch batch-check batch-command do @@ -334,30 +336,47 @@ do done test_expect_success "--batch-check for a non-existent named object" ' - test "foobar42 missing -foobar84 missing" = \ - "$( ( echo foobar42 && echo_without_newline foobar84 ) | git cat-file --batch-check)" + cat >expect <<-EOF && + foobar42 missing + foobar84 missing + EOF + + printf "foobar42\nfoobar84" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch-check for a non-existent hash" ' - test "0000000000000000000000000000000000000042 missing -0000000000000000000000000000000000000084 missing" = \ - "$( ( echo 0000000000000000000000000000000000000042 && - echo_without_newline 0000000000000000000000000000000000000084 ) | - git cat-file --batch-check)" + cat >expect <<-EOF && + 0000000000000000000000000000000000000042 missing + 0000000000000000000000000000000000000084 missing + EOF + + printf "0000000000000000000000000000000000000042\n0000000000000000000000000000000000000084" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch for an existent and a non-existent hash" ' - test "$tag_sha1 tag $tag_size -$tag_content -0000000000000000000000000000000000000000 missing" = \ - "$( ( echo $tag_sha1 && - echo_without_newline 0000000000000000000000000000000000000000 ) | - git cat-file --batch)" + cat >expect <<-EOF && + $tag_sha1 tag $tag_size + $tag_content + 0000000000000000000000000000000000000000 missing + EOF + + printf "$tag_sha1\n0000000000000000000000000000000000000000" >in && + git cat-file --batch actual && + test_cmp expect actual ' test_expect_success "--batch-check for an empty line" ' - test " missing" = "$(echo | git cat-file --batch-check)" + cat >expect <<-EOF && + missing + EOF + + echo >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success 'empty --batch-check notices missing object' ' @@ -384,7 +403,8 @@ deadbeef missing test_expect_success '--batch with multiple sha1s gives correct format' ' echo "$batch_output" >expect && - echo_without_newline "$batch_input" | git cat-file --batch >actual && + echo_without_newline "$batch_input" >in && + git cat-file --batch actual && test_cmp expect actual ' @@ -411,13 +431,17 @@ deadbeef missing missing" test_expect_success "--batch-check with multiple sha1s gives correct format" ' - test "$batch_check_output" = \ - "$(echo_without_newline "$batch_check_input" | git cat-file --batch-check)" + echo "$batch_check_output" >expect && + echo_without_newline "$batch_check_input" >in && + git cat-file --batch-check actual && + test_cmp expect actual ' test_expect_success "--batch-check, -z with multiple sha1s gives correct format" ' - echo_without_newline_nul "$batch_check_input" >in && - test "$batch_check_output" = "$(git cat-file --batch-check -z expect && + echo_without_newline_nul "$batch_check_input" >in && + git cat-file --batch-check -z actual && + test_cmp expect actual ' test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' From patchwork Tue Jun 6 05:19:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13268247 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1DBECC77B7A for ; Tue, 6 Jun 2023 05:20:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234580AbjFFFTv (ORCPT ); Tue, 6 Jun 2023 01:19:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44250 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232252AbjFFFTm (ORCPT ); Tue, 6 Jun 2023 01:19:42 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A822819C for ; Mon, 5 Jun 2023 22:19:41 -0700 (PDT) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.west.internal (Postfix) with ESMTP id E0FC232009B1; Tue, 6 Jun 2023 01:19:40 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Tue, 06 Jun 2023 01:19:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1686028780; x=1686115180; bh=K2 PN3MpDnK55j7CUgeae+piI8EtMmNMlkf9ukebI/zA=; b=gOx+MltpyZjt4HubEQ A+FB+ZUAHRuhyRuUhtLF4+vfa1IXtNHnUXh7Ce9rt4KAeHFSPHl8mQe5RL5EtJyK jFhnFBOT6Z3WZRoNAxO48jyXBP3+nZLfXbya/HTNJgqxzfS8CCx5LYpPgtERfw5i XqVL83UlbZLimQfTPTFfW8TVkK5nRih30msmqDoBvrgca+FDgezROjRpCKYc0Wwj RKQvmDY0G5Rtj9UkywlR78RqHoSAF6mVCExP6cbCvN12ZoOUPyMncmQfQxM1YIpC mVi08UXsWAedrs0afqCfoKGtldK5DVTqLBDx7np88cwC2eMqVPCCtohI0Zl+5ChK 6TJA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1686028780; x=1686115180; bh=K2PN3MpDnK55j 7CUgeae+piI8EtMmNMlkf9ukebI/zA=; b=hCeFkDsKN3Dwl11CDpJZXtawz0XYX MSV4sVxaQet5eCap8HIMP5wb1BA7G/Xnuj4UPBdO31syltlDQZVow94/hwjrpLxz CaQdfGzHVLnKugckD328vL8XUWjM7K3UDAmPweUL2l+rtnDLznwIghB65Q2/bxfb uYYThQnh2M/qTcSXOTBucn7jOlNGBtrJQwaMLO+Qqi2aBoC8k1KPEepvRDndgdgS Gcx32PvYq7zKltBmeJPkAiKMqpCAXO3E76s48n2E9bdlz5Gy/el7bf+0nx9SqnlU n01X7xnw9bKdAdtDz5U6YOMYZ0QxQnMQSLsvkFziREPfXf7XCbJ8SbHLA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrgedttddgledvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 6 Jun 2023 01:19:39 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id e7fd1dbe (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 6 Jun 2023 05:18:35 +0000 (UTC) Date: Tue, 6 Jun 2023 07:19:37 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Message-ID: <8127eeac97200da9aafccdf16cb7ba06f68b0121.1686028409.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Many of our commands support reading input that is separated either via newlines or via NUL characters. Furthermore, in order to be a better cross platform citizen, these commands typically know to strip the CRLF sequence so that we also support reading newline-separated inputs on e.g. the Windows platform. This results in the following kind of awkward pattern: ``` struct strbuf input = STRBUF_INIT; while (1) { int ret; if (nul_terminated) ret = strbuf_getline_nul(&input, stdin); else ret = strbuf_getline(&input, stdin); if (ret) break; ... } ``` Introduce a new CRLF-aware helper function that can read up to a user specified delimiter. If the delimiter is `\n` the function knows to also strip CRLF, otherwise it will only strip the specified delimiter. This results in the following, much more readable code pattern: ``` struct strbuf input = STRBUF_INIT; while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) { ... } ``` The new function will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt --- strbuf.c | 11 ++++++++--- strbuf.h | 12 ++++++++++++ 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/strbuf.c b/strbuf.c index 08eec8f1d8..31dc48c0ab 100644 --- a/strbuf.c +++ b/strbuf.c @@ -721,11 +721,11 @@ static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term) return 0; } -int strbuf_getline(struct strbuf *sb, FILE *fp) +int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term) { - if (strbuf_getwholeline(sb, fp, '\n')) + if (strbuf_getwholeline(sb, fp, term)) return EOF; - if (sb->buf[sb->len - 1] == '\n') { + if (term == '\n' && sb->buf[sb->len - 1] == '\n') { strbuf_setlen(sb, sb->len - 1); if (sb->len && sb->buf[sb->len - 1] == '\r') strbuf_setlen(sb, sb->len - 1); @@ -733,6 +733,11 @@ int strbuf_getline(struct strbuf *sb, FILE *fp) return 0; } +int strbuf_getline(struct strbuf *sb, FILE *fp) +{ + return strbuf_getdelim_strip_crlf(sb, fp, '\n'); +} + int strbuf_getline_lf(struct strbuf *sb, FILE *fp) { return strbuf_getdelim(sb, fp, '\n'); diff --git a/strbuf.h b/strbuf.h index 3dfeadb44c..0e69b656bc 100644 --- a/strbuf.h +++ b/strbuf.h @@ -475,6 +475,18 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint); */ ssize_t strbuf_write(struct strbuf *sb, FILE *stream); +/** + * Read from a FILE * until the specified terminator is encountered, + * overwriting the existing contents of the strbuf. + * + * Reading stops after the terminator or at EOF. The terminator is + * removed from the buffer before returning. If the terminator is LF + * and if it is preceded by a CR, then the whole CRLF is stripped. + * Returns 0 unless there was nothing left before EOF, in which case + * it returns `EOF`. + */ +int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term); + /** * Read a line from a FILE *, overwriting the existing contents of * the strbuf. The strbuf_getline*() family of functions share From patchwork Tue Jun 6 05:19:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13268248 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3146DC77B7A for ; Tue, 6 Jun 2023 05:20:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233617AbjFFFUL (ORCPT ); Tue, 6 Jun 2023 01:20:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234597AbjFFFTw (ORCPT ); Tue, 6 Jun 2023 01:19:52 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57C8CE48 for ; Mon, 5 Jun 2023 22:19:46 -0700 (PDT) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.west.internal (Postfix) with ESMTP id 708043200AD8; Tue, 6 Jun 2023 01:19:45 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute6.internal (MEProxy); Tue, 06 Jun 2023 01:19:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1686028785; x=1686115185; bh=vf cPmJ4efmXjiTOW83TQsU5+kFT87977jH3HBHHPl7Q=; b=OPxlFyl0VKV+Huii3N 9yHfu0s80BAtIbuzYYvwqaeAYh+tlfY1n4spXMV0aJxp8BPsVypf8kp37VjIk7xu bqIo4LrTfoVrHsE47sUIK6fpQtTa8AQgoFaiCt8Jj0dnm22d5brF8IAJ2AFoz/ia m8tKfS/jmILmAbLP2m3uN+GA9Qu+j3sN3uc5JB3TS0MFa/V6V2he2UAMQhvr4xuz gAYHWP4Cr8OrjNiY434B3NA3Y+XPip56E1xVzn3760bYOYsfuniz7avN7+nC9MtO bvC96RetcANlHRQu+WVqwkI7FD+DTTOSaCrZXfLAhYZ4VN1rukRUptBTXgcY5KvZ 5big== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1686028785; x=1686115185; bh=vfcPmJ4efmXji TOW83TQsU5+kFT87977jH3HBHHPl7Q=; b=v+1y2g888m1bm9kEYGOaZCP3J3lHu qu01pPGexT22W9LQNEe7s6+0/qXSGdDFRz+HrCFOaeeZTlw+y8W82zJUfAPhFryk nH8QTNFJTxnvxthhzC/gFk4m9UMKchM/SFVEmuIPKIRmUansqtvQeQ5TJmcbHMew rIqnVSKhcW7fVW/8shzmTcJN/P3oriWKs08ENAF63hmoXBr2iExhki8ZWs2FZhNO UYd8nj2GCeT1ryBxW6rrqSf13VaUZJ2MdPch9cAkMInxWI07kTUW7FSLiVHDnFbz HvYUV8/JnGqGtxIbgrcHO88blAUoFch1Dr/tCGt7TnE3O/WRyRT+KMteQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrgedttddgleduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgepvdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 6 Jun 2023 01:19:43 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id e761944c (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 6 Jun 2023 05:18:40 +0000 (UTC) Date: Tue, 6 Jun 2023 07:19:41 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH v2 4/5] cat-file: simplify reading from standard input Message-ID: References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The batch modes of git-cat-file(1) read queries from stantard input that are either newline- or NUL-delimited. This code was introduced via db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), which notes that: """ The refactoring here is slightly unfortunate, since we turn loops like: while (strbuf_getline(&buf, stdin) != EOF) into: while (1) { int ret; if (opt->nul_terminated) ret = strbuf_getline_nul(&input, stdin); else ret = strbuf_getline(&input, stdin); if (ret == EOF) break; } """ The commit proposed introducing a helper function that is easier to use, which is just what we have done in the preceding commit. Refactor the code to use this new helper to simplify the loop. Signed-off-by: Patrick Steinhardt --- builtin/cat-file.c | 32 +++++++++----------------------- 1 file changed, 9 insertions(+), 23 deletions(-) diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 0bafc14e6c..001dcb24d6 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -42,7 +42,7 @@ struct batch_options { int all_objects; int unordered; int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */ - int nul_terminated; + char input_delim; const char *format; }; @@ -694,20 +694,12 @@ static void batch_objects_command(struct batch_options *opt, struct queued_cmd *queued_cmd = NULL; size_t alloc = 0, nr = 0; - while (1) { - int i, ret; + while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) { + int i; const struct parse_cmd *cmd = NULL; const char *p = NULL, *cmd_end; struct queued_cmd call = {0}; - if (opt->nul_terminated) - ret = strbuf_getline_nul(&input, stdin); - else - ret = strbuf_getline(&input, stdin); - - if (ret) - break; - if (!input.len) die(_("empty command in input")); if (isspace(*input.buf)) @@ -851,16 +843,7 @@ static int batch_objects(struct batch_options *opt) goto cleanup; } - while (1) { - int ret; - if (opt->nul_terminated) - ret = strbuf_getline_nul(&input, stdin); - else - ret = strbuf_getline(&input, stdin); - - if (ret == EOF) - break; - + while (strbuf_getdelim_strip_crlf(&input, stdin, opt->input_delim) != EOF) { if (data.split_on_whitespace) { /* * Split at first whitespace, tying off the beginning @@ -929,6 +912,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) const char *exp_type = NULL, *obj_name = NULL; struct batch_options batch = {0}; int unknown_type = 0; + int input_nul_terminated = 0; const char * const usage[] = { N_("git cat-file "), @@ -965,7 +949,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) N_("like --batch, but don't emit "), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, batch_option_callback), - OPT_BOOL('z', NULL, &batch.nul_terminated, N_("stdin is NUL-terminated")), + OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")), OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"), N_("read commands from stdin"), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, @@ -1024,10 +1008,12 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) else if (batch.all_objects) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "--batch-all-objects"); - else if (batch.nul_terminated) + else if (input_nul_terminated) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "-z"); + batch.input_delim = input_nul_terminated ? '\0' : '\n'; + /* Batch defaults */ if (batch.buffer_output < 0) batch.buffer_output = batch.all_objects; From patchwork Tue Jun 6 05:19:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13268249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35104C7EE24 for ; Tue, 6 Jun 2023 05:20:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233794AbjFFFUP (ORCPT ); Tue, 6 Jun 2023 01:20:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44374 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234645AbjFFFTx (ORCPT ); Tue, 6 Jun 2023 01:19:53 -0400 Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBE47E55 for ; Mon, 5 Jun 2023 22:19:50 -0700 (PDT) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.west.internal (Postfix) with ESMTP id 1ECA63200AD8; Tue, 6 Jun 2023 01:19:50 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 06 Jun 2023 01:19:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:sender :subject:subject:to:to; s=fm1; t=1686028789; x=1686115189; bh=Jx gsPn1P0tf3QT4DMDo7IbSR4C79qdDsKbbRdKilBH4=; b=Z2wEZKJr9yBZbVdXwA qGiLO19GpgomFB+5Mp2x1OI1asOAYoWtppKo3z5t8OjZAPUCgd9eh77xE9dhH8S3 aeruLxzJ9qdns/3Vv+3gQTMEhuFr1szeT2RBreYKDjV8Y5FgtamGNmZgNk6IhqZe /mylCqsKC1s0R4NUvYxJIM58wYNI3dF6Vr/mb3MGaN+3sD5gLGLlcz7L97q0zy46 sNkQUZn9uNxLCU5ppKXmuL/hxOvi9D5pHFTEy7aiJdqw2E2DnlRs4REbreGb1Nvn gVW7gN9KRReIdT1b2v48h3Xu0/UqJ6fsxld1Z108rpYVAaA88+/+1ZSudoO6hEE/ Ipag== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:sender:subject :subject:to:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; t=1686028789; x=1686115189; bh=JxgsPn1P0tf3Q T4DMDo7IbSR4C79qdDsKbbRdKilBH4=; b=qi7izz2ff5pEJC8+xu8Jd+NqA4owd WP5yNaL4L/vu0gut73gV+HQKr2Zn/0t4u8MAltb551/tVYv9yPeOz2Y/GMStMzcK arND8CaqfDcPZ41ias0jMm4c0mXX2zx1AYAlnxz2wsUpy53rBFZSXZ1/inlFK2ZQ 9jK7F2n//kF0a+ps3YIJ0wP5eNAsO1NXlOkHXQq10nsp7tdsUEIW13jjQMG+fgEM MbQnwhfSKxGsmNpam6ae0op22PHiRXRw7YOTRCzXrj814sLjtqXeJZ5GCI6lnCbC aBgWqlqDcNs8oFFrtO5G3ZJ7TCCXlHEdUVfo00X+ppWW6N8rlnFJnMv1w== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrgedttddgledvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhepfffhvfevuffkfhggtggujgesghdtreertddtvdenucfhrhhomheprfgrthhr ihgtkhcuufhtvghinhhhrghrughtuceophhssehpkhhsrdhimheqnecuggftrfgrthhtvg hrnhepueektdevtdffveeljeetgfehheeigeekleduvdeffeeghefgledttdehjeelffet necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Tue, 6 Jun 2023 01:19:48 -0400 (EDT) Received: by pks.im (OpenSMTPD) with ESMTPSA id 2f7046e2 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Tue, 6 Jun 2023 05:18:44 +0000 (UTC) Date: Tue, 6 Jun 2023 07:19:45 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Taylor Blau , Toon Claes , Phillip Wood , Junio C Hamano Subject: [PATCH v2 5/5] cat-file: introduce option to delimit input and output with NUL Message-ID: <79ed618c845fae3f06b6ff4b3ade58e1fcccb758.1686028409.git.ps@pks.im> References: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In db9d67f2e9 (builtin/cat-file.c: support NUL-delimited input with `-z`, 2022-07-22), we have introduced a new mode to read the input via NUL-delimited records instead of newline-delimited records. This allows the user to query for revisions that have newlines in their path component. While unusual, such queries are perfectly valid and thus it is clear that we should be able to support them properly. Unfortunately, the commit only changed the input to be NUL-delimited, but didn't change the output at the same time. While this is fine for queries that are processed successfully, it is less so for queries that aren't. In the case of missing commits for example the result can become entirely unparsable: ``` $ printf "7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10\n1234567890\n\n\commit000" | git cat-file --batch -z 7ce4f05bae8120d9fa258e854a8669f6ea9cb7b1 blob 10 1234567890 commit missing ``` This is of course a crafted query that is intentionally gaming the deficiency, but more benign queries that contain newlines would have similar problems. Ideally, we should have also changed the output to be NUL-delimited when `-z` is specified to avoid this problem. As the input is NUL-delimited, it is clear that the output in this case cannot ever contain NUL characters by itself. Furthermore, Git does not allow NUL characters in revisions anyway, further stressing the point that using NUL-delimited output is safe. The only exception is of course the object data itself, but as git-cat-file(1) prints the size of the object data clients should read until that specified size has been consumed. But even though `-z` has only been introduced a few releases ago in Git v2.38.0, changing the output format retroactively to also NUL-delimit output would be a backwards incompatible change. And while one could make the argument that the output is inherently broken already, we need to assume that there are existing users out there that use it just fine given that revisions containing newlines are quite exotic. Instead, introduce a new option `-Z` that switches to NUL-delimited input and output. While this new option could arguably only switch the output format to be NUL-delimited, the consequence would be that users have to always specify both `-z` and `-Z` when the input may contain newlines. On the other hand, if the user knows that there never will be newlines in the input, they don't have to use either of those options. There is thus no usecase that would warrant treating input and output format separately, which is why we instead opt to "do the right thing" and have `-Z` mean to NUL-terminate both formats. The old `-z` option is marked as deprecated with a hint that its output may become unparsable. It is thus hidden both from the synopsis as well as the command's help output. Co-authored-by: Toon Claes Signed-off-by: Patrick Steinhardt --- Documentation/git-cat-file.txt | 15 +++- builtin/cat-file.c | 57 +++++++++------ t/t1006-cat-file.sh | 123 ++++++++++++++++++++++++--------- 3 files changed, 139 insertions(+), 56 deletions(-) diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt index 411de2e27d..0e4936d182 100644 --- a/Documentation/git-cat-file.txt +++ b/Documentation/git-cat-file.txt @@ -14,7 +14,7 @@ SYNOPSIS 'git cat-file' (-t | -s) [--allow-unknown-type] 'git cat-file' (--batch | --batch-check | --batch-command) [--batch-all-objects] [--buffer] [--follow-symlinks] [--unordered] - [--textconv | --filters] [-z] + [--textconv | --filters] [-Z] 'git cat-file' (--textconv | --filters) [: | --path= ] @@ -243,10 +243,16 @@ respectively print: /etc/passwd -- +-Z:: + Only meaningful with `--batch`, `--batch-check`, or + `--batch-command`; input and output is NUL-delimited instead of + newline-delimited. + -z:: Only meaningful with `--batch`, `--batch-check`, or `--batch-command`; input is NUL-delimited instead of - newline-delimited. + newline-delimited. This option is deprecated in favor of + `-Z` as the output can otherwise be ambiguous. OUTPUT @@ -384,6 +390,11 @@ notdir SP LF is printed when, during symlink resolution, a file is used as a directory name. +Alternatively, when `-Z` is passed, the line feeds in any of the above examples +are replaced with NUL terminators. This ensures that output will be parsable if +the output itself would contain a linefeed and is thus recommended for +scripting purposes. + CAVEATS ------- diff --git a/builtin/cat-file.c b/builtin/cat-file.c index 001dcb24d6..e5f72229df 100644 --- a/builtin/cat-file.c +++ b/builtin/cat-file.c @@ -43,6 +43,7 @@ struct batch_options { int unordered; int transform_mode; /* may be 'w' or 'c' for --filters or --textconv */ char input_delim; + char output_delim; const char *format; }; @@ -437,11 +438,12 @@ static void print_object_or_die(struct batch_options *opt, struct expand_data *d } } -static void print_default_format(struct strbuf *scratch, struct expand_data *data) +static void print_default_format(struct strbuf *scratch, struct expand_data *data, + struct batch_options *opt) { - strbuf_addf(scratch, "%s %s %"PRIuMAX"\n", oid_to_hex(&data->oid), + strbuf_addf(scratch, "%s %s %"PRIuMAX"%c", oid_to_hex(&data->oid), type_name(data->type), - (uintmax_t)data->size); + (uintmax_t)data->size, opt->output_delim); } /* @@ -470,8 +472,8 @@ static void batch_object_write(const char *obj_name, &data->oid, &data->info, OBJECT_INFO_LOOKUP_REPLACE); if (ret < 0) { - printf("%s missing\n", - obj_name ? obj_name : oid_to_hex(&data->oid)); + printf("%s missing%c", + obj_name ? obj_name : oid_to_hex(&data->oid), opt->output_delim); fflush(stdout); return; } @@ -492,17 +494,17 @@ static void batch_object_write(const char *obj_name, strbuf_reset(scratch); if (!opt->format) { - print_default_format(scratch, data); + print_default_format(scratch, data, opt); } else { strbuf_expand(scratch, opt->format, expand_format, data); - strbuf_addch(scratch, '\n'); + strbuf_addch(scratch, opt->output_delim); } batch_write(opt, scratch->buf, scratch->len); if (opt->batch_mode == BATCH_MODE_CONTENTS) { print_object_or_die(opt, data); - batch_write(opt, "\n", 1); + batch_write(opt, &opt->output_delim, 1); } } @@ -520,22 +522,25 @@ static void batch_one_object(const char *obj_name, if (result != FOUND) { switch (result) { case MISSING_OBJECT: - printf("%s missing\n", obj_name); + printf("%s missing%c", obj_name, opt->output_delim); break; case SHORT_NAME_AMBIGUOUS: - printf("%s ambiguous\n", obj_name); + printf("%s ambiguous%c", obj_name, opt->output_delim); break; case DANGLING_SYMLINK: - printf("dangling %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("dangling %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; case SYMLINK_LOOP: - printf("loop %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("loop %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; case NOT_DIR: - printf("notdir %"PRIuMAX"\n%s\n", - (uintmax_t)strlen(obj_name), obj_name); + printf("notdir %"PRIuMAX"%c%s%c", + (uintmax_t)strlen(obj_name), + opt->output_delim, obj_name, opt->output_delim); break; default: BUG("unknown get_sha1_with_context result %d\n", @@ -547,9 +552,9 @@ static void batch_one_object(const char *obj_name, } if (ctx.mode == 0) { - printf("symlink %"PRIuMAX"\n%s\n", + printf("symlink %"PRIuMAX"%c%s%c", (uintmax_t)ctx.symlink_path.len, - ctx.symlink_path.buf); + opt->output_delim, ctx.symlink_path.buf, opt->output_delim); fflush(stdout); return; } @@ -913,6 +918,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) struct batch_options batch = {0}; int unknown_type = 0; int input_nul_terminated = 0; + int nul_terminated = 0; const char * const usage[] = { N_("git cat-file "), @@ -920,7 +926,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) N_("git cat-file (-t | -s) [--allow-unknown-type] "), N_("git cat-file (--batch | --batch-check | --batch-command) [--batch-all-objects]\n" " [--buffer] [--follow-symlinks] [--unordered]\n" - " [--textconv | --filters] [-z]"), + " [--textconv | --filters] [-Z]"), N_("git cat-file (--textconv | --filters)\n" " [: | --path= ]"), NULL @@ -949,7 +955,9 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) N_("like --batch, but don't emit "), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, batch_option_callback), - OPT_BOOL('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated")), + OPT_BOOL_F('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated"), + PARSE_OPT_HIDDEN), + OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")), OPT_CALLBACK_F(0, "batch-command", &batch, N_("format"), N_("read commands from stdin"), PARSE_OPT_OPTARG | PARSE_OPT_NONEG, @@ -1011,8 +1019,15 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix) else if (input_nul_terminated) usage_msg_optf(_("'%s' requires a batch mode"), usage, options, "-z"); + else if (nul_terminated) + usage_msg_optf(_("'%s' requires a batch mode"), usage, options, + "-Z"); - batch.input_delim = input_nul_terminated ? '\0' : '\n'; + batch.input_delim = batch.output_delim = '\n'; + if (input_nul_terminated) + batch.input_delim = '\0'; + if (nul_terminated) + batch.input_delim = batch.output_delim = '\0'; /* Batch defaults */ if (batch.buffer_output < 0) diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 7b985cfded..d73a0be1b9 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -89,7 +89,8 @@ done for opt in --buffer \ --follow-symlinks \ --batch-all-objects \ - -z + -z \ + -Z do test_expect_success "usage: bad option combination: $opt without batch mode" ' test_incompatible_usage git cat-file $opt && @@ -392,17 +393,18 @@ deadbeef " -batch_output="$hello_sha1 blob $hello_size -$hello_content -$commit_sha1 commit $commit_size -$commit_content -$tag_sha1 tag $tag_size -$tag_content -deadbeef missing - missing" +printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$hello_content" \ + "$commit_sha1 commit $commit_size" \ + "$commit_content" \ + "$tag_sha1 tag $tag_size" \ + "$tag_content" \ + "deadbeef missing" \ + " missing" >batch_output test_expect_success '--batch with multiple sha1s gives correct format' ' - echo "$batch_output" >expect && + tr "\0" "\n" expect && echo_without_newline "$batch_input" >in && git cat-file --batch actual && test_cmp expect actual @@ -410,11 +412,17 @@ test_expect_success '--batch with multiple sha1s gives correct format' ' test_expect_success '--batch, -z with multiple sha1s gives correct format' ' echo_without_newline_nul "$batch_input" >in && - echo "$batch_output" >expect && + tr "\0" "\n" expect && git cat-file --batch -z actual && test_cmp expect actual ' +test_expect_success '--batch, -Z with multiple sha1s gives correct format' ' + echo_without_newline_nul "$batch_input" >in && + git cat-file --batch -Z actual && + test_cmp batch_output actual +' + batch_check_input="$hello_sha1 $tree_sha1 $commit_sha1 @@ -423,40 +431,55 @@ deadbeef " -batch_check_output="$hello_sha1 blob $hello_size -$tree_sha1 tree $tree_size -$commit_sha1 commit $commit_size -$tag_sha1 tag $tag_size -deadbeef missing - missing" +printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$tree_sha1 tree $tree_size" \ + "$commit_sha1 commit $commit_size" \ + "$tag_sha1 tag $tag_size" \ + "deadbeef missing" \ + " missing" >batch_check_output test_expect_success "--batch-check with multiple sha1s gives correct format" ' - echo "$batch_check_output" >expect && + tr "\0" "\n" expect && echo_without_newline "$batch_check_input" >in && git cat-file --batch-check actual && test_cmp expect actual ' test_expect_success "--batch-check, -z with multiple sha1s gives correct format" ' - echo "$batch_check_output" >expect && + tr "\0" "\n" expect && echo_without_newline_nul "$batch_check_input" >in && git cat-file --batch-check -z actual && test_cmp expect actual ' -test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' +test_expect_success "--batch-check, -Z with multiple sha1s gives correct format" ' + echo_without_newline_nul "$batch_check_input" >in && + git cat-file --batch-check -Z actual && + test_cmp batch_check_output actual +' + +test_expect_success FUNNYNAMES 'setup with newline in input' ' touch -- "newline${LF}embedded" && git add -- "newline${LF}embedded" && git commit -m "file with newline embedded" && test_tick && - printf "HEAD:newline${LF}embedded" >in && - git cat-file --batch-check -z actual && + printf "HEAD:newline${LF}embedded" >in +' +test_expect_success FUNNYNAMES '--batch-check, -z with newline in input' ' + git cat-file --batch-check -z actual && echo "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect && test_cmp expect actual ' +test_expect_success FUNNYNAMES '--batch-check, -Z with newline in input' ' + git cat-file --batch-check -Z actual && + printf "%s\0" "$(git rev-parse "HEAD:newline${LF}embedded") blob 0" >expect && + test_cmp expect actual +' + batch_command_multiple_info="info $hello_sha1 info $tree_sha1 info $commit_sha1 @@ -480,7 +503,13 @@ test_expect_success '--batch-command with multiple info calls gives correct form echo "$batch_command_multiple_info" | tr "\n" "\0" >in && git cat-file --batch-command --buffer -z actual && - test_cmp expect actual + test_cmp expect actual && + + echo "$batch_command_multiple_info" | tr "\n" "\0" >in && + tr "\n" "\0" expect_nul && + git cat-file --batch-command --buffer -Z actual && + + test_cmp expect_nul actual ' batch_command_multiple_contents="contents $hello_sha1 @@ -490,15 +519,15 @@ contents deadbeef flush" test_expect_success '--batch-command with multiple command calls gives correct format' ' - cat >expect <<-EOF && - $hello_sha1 blob $hello_size - $hello_content - $commit_sha1 commit $commit_size - $commit_content - $tag_sha1 tag $tag_size - $tag_content - deadbeef missing - EOF + printf "%s\0" \ + "$hello_sha1 blob $hello_size" \ + "$hello_content" \ + "$commit_sha1 commit $commit_size" \ + "$commit_content" \ + "$tag_sha1 tag $tag_size" \ + "$tag_content" \ + "deadbeef missing" >expect_nul && + tr "\0" "\n" expect && echo "$batch_command_multiple_contents" >in && git cat-file --batch-command --buffer actual && @@ -508,7 +537,12 @@ test_expect_success '--batch-command with multiple command calls gives correct f echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && git cat-file --batch-command --buffer -z actual && - test_cmp expect actual + test_cmp expect actual && + + echo "$batch_command_multiple_contents" | tr "\n" "\0" >in && + git cat-file --batch-command --buffer -Z actual && + + test_cmp expect_nul actual ' test_expect_success 'setup blobs which are likely to delta' ' @@ -848,6 +882,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for brok test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for broken in-repo, same-dir links' ' + printf "HEAD:broken-same-dir-link\0" >in && + printf "dangling 25\0HEAD:broken-same-dir-link\0" >expect && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch-check --follow-symlinks works for same-dir links-to-links' ' echo HEAD:link-to-link | git cat-file --batch-check --follow-symlinks >actual && test_cmp found actual @@ -862,6 +903,15 @@ test_expect_success 'git cat-file --batch-check --follow-symlinks works for pare test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlinks -Z works for parent-dir links' ' + echo HEAD:dir/parent-dir-link | git cat-file --batch-check --follow-symlinks >actual && + test_cmp found actual && + printf "notdir 29\0HEAD:dir/parent-dir-link/nope\0" >expect && + printf "HEAD:dir/parent-dir-link/nope\0" >in && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch-check --follow-symlinks works for .. links' ' echo dangling 22 >expect && echo HEAD:dir/link-dir/nope >>expect && @@ -976,6 +1026,13 @@ test_expect_success 'git cat-file --batch-check --follow-symlink breaks loops' ' test_cmp expect actual ' +test_expect_success 'git cat-file --batch-check --follow-symlink -Z breaks loops' ' + printf "loop 10\0HEAD:loop1\0" >expect && + printf "HEAD:loop1\0" >in && + git cat-file --batch-check --follow-symlinks -Z actual && + test_cmp expect actual +' + test_expect_success 'git cat-file --batch --follow-symlink returns correct sha and mode' ' echo HEAD:morx | git cat-file --batch >expect && echo HEAD:morx | git cat-file --batch --follow-symlinks >actual &&