[1/6] t4061: use POSIX compliance regex(7)

Message ID	c45d6383173d8d3e73cdcdd6e993d3259d519a68.1584625896.git.congdanhqx@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=WCE0=5E=vger.kernel.org=git-owner@kernel.org> From: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?= <congdanhqx@gmail.com> To: git@vger.kernel.org Cc: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?= <congdanhqx@gmail.com> Subject: [PATCH 1/6] t4061: use POSIX compliance regex(7) Date: Thu, 19 Mar 2020 21:00:02 +0700 Message-Id: <c45d6383173d8d3e73cdcdd6e993d3259d519a68.1584625896.git.congdanhqx@gmail.com> In-Reply-To: <cover.1584625896.git.congdanhqx@gmail.com> References: <cover.1584625896.git.congdanhqx@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: git-owner@vger.kernel.org Precedence: bulk
Series	fix test failure with busybox \| expand [0/6] fix test failure with busybox [1/6] t4061: use POSIX compliance regex(7) [2/6] test-lib-functions: test_cmp: eval $GIT_TEST_CMP [3/6] t5003: skip conversion test if unzip -a is unavailable [4/6] t5616: use rev-parse instead to get HEAD's object_id [5/6] t7063: use POSIX find(1) syntax [6/6] t4124: fix test for non-compliance diff

Message ID

c45d6383173d8d3e73cdcdd6e993d3259d519a68.1584625896.git.congdanhqx@gmail.com (mailing list archive)

State

New, archived

Headers

From: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?=  <congdanhqx@gmail.com>
To: git@vger.kernel.org
Cc: =?utf-8?b?xJBvw6BuIFRy4bqnbiBDw7RuZyBEYW5o?=  <congdanhqx@gmail.com>
Subject: [PATCH 1/6] t4061: use POSIX compliance regex(7)
Date: Thu, 19 Mar 2020 21:00:02 +0700
Message-Id: 
 <c45d6383173d8d3e73cdcdd6e993d3259d519a68.1584625896.git.congdanhqx@gmail.com>
In-Reply-To: <cover.1584625896.git.congdanhqx@gmail.com>
References: <cover.1584625896.git.congdanhqx@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: git-owner@vger.kernel.org
Precedence: bulk

Series

fix test failure with busybox | expand

Commit Message

Đoàn Trần Công Danh March 19, 2020, 2 p.m. UTC

BRE interprets `+` literally, and
`\+` is undefined for POSIX BRE, from:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_02

> The interpretation of an ordinary character preceded
> by an unescaped <backslash> ( '\\' ) is undefined, except for:
> - The characters ')', '(', '{', and '}'
> - The digits 1 to 9 inclusive
> - A character inside a bracket expression

This test is failing with busybox sed, the default sed of Alpine Linux

Fix it by using literal `+` instead.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
---
 t/t4061-diff-indent.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jeff King March 19, 2020, 3:53 p.m. UTC | #1

On Thu, Mar 19, 2020 at 09:00:02PM +0700, Đoàn Trần Công Danh wrote:

> BRE interprets `+` literally, and
> `\+` is undefined for POSIX BRE, from:
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_02
> 
> > The interpretation of an ordinary character preceded
> > by an unescaped <backslash> ( '\\' ) is undefined, except for:
> > - The characters ')', '(', '{', and '}'
> > - The digits 1 to 9 inclusive
> > - A character inside a bracket expression
> 
> This test is failing with busybox sed, the default sed of Alpine Linux
> 
> Fix it by using literal `+` instead.

This makes sense, I think. It could hurt a sed which is expected ERE and
needs the "+" escaped, but I think such a sed would be wrong (and I
imagine would break things elsewhere).

-Peff

Eric Sunshine March 19, 2020, 4:01 p.m. UTC | #2

On Thu, Mar 19, 2020 at 11:53 AM Jeff King <peff@peff.net> wrote:
> On Thu, Mar 19, 2020 at 09:00:02PM +0700, Đoàn Trần Công Danh wrote:
> > Fix it by using literal `+` instead.
>
> This makes sense, I think. It could hurt a sed which is expected ERE and
> needs the "+" escaped, but I think such a sed would be wrong (and I
> imagine would break things elsewhere).

I had the same thought and considered suggesting a character class:

    sed -n -e "1,4d" -e "s/^[+]//p" <"$1" >.tmp-1

to make it painfully obvious that "+" is not special in the
expression. But then I thought better of it -- for the same reason as
you (to wit: such a 'sed' would be wrong) -- and decided against
saying anything.

Junio C Hamano March 19, 2020, 10:02 p.m. UTC | #3

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Thu, Mar 19, 2020 at 11:53 AM Jeff King <peff@peff.net> wrote:
>> On Thu, Mar 19, 2020 at 09:00:02PM +0700, Đoàn Trần Công Danh wrote:
>> > Fix it by using literal `+` instead.
>>
>> This makes sense, I think. It could hurt a sed which is expected ERE and
>> needs the "+" escaped, but I think such a sed would be wrong (and I
>> imagine would break things elsewhere).
>
> I had the same thought and considered suggesting a character class:
>
>     sed -n -e "1,4d" -e "s/^[+]//p" <"$1" >.tmp-1
>
> to make it painfully obvious that "+" is not special in the
> expression. But then I thought better of it -- for the same reason as
> you (to wit: such a 'sed' would be wrong) -- and decided against
> saying anything.

I have only one thing that needs fixing, which is s/compliance/compliant/;
on the title.  Other than that, it looks good.

Having said that, I would have done the [+] thing if I were doing
this patch myself.  As long as we see no "wrong" sed that is broken
by this change, I am OK with it, though.

Thanks.

Đoàn Trần Công Danh March 20, 2020, 1:35 a.m. UTC | #4

On 2020-03-19 15:02:37-0700, Junio C Hamano <gitster@pobox.com> wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> 
> > On Thu, Mar 19, 2020 at 11:53 AM Jeff King <peff@peff.net> wrote:
> >> On Thu, Mar 19, 2020 at 09:00:02PM +0700, Đoàn Trần Công Danh wrote:
> >> > Fix it by using literal `+` instead.
> >>
> >> This makes sense, I think. It could hurt a sed which is expected ERE and
> >> needs the "+" escaped, but I think such a sed would be wrong (and I
> >> imagine would break things elsewhere).
> >
> > I had the same thought and considered suggesting a character class:
> >
> >     sed -n -e "1,4d" -e "s/^[+]//p" <"$1" >.tmp-1
> >
> > to make it painfully obvious that "+" is not special in the
> > expression. But then I thought better of it -- for the same reason as
> > you (to wit: such a 'sed' would be wrong) -- and decided against
> > saying anything.
> 
> I have only one thing that needs fixing, which is s/compliance/compliant/;
> on the title.  Other than that, it looks good.
> 
> Having said that, I would have done the [+] thing if I were doing
> this patch myself.  As long as we see no "wrong" sed that is broken
> by this change, I am OK with it, though.

Well, `[+]` thing was my first version for this change,
but I change in to this version afterward.

However, your comment in a later patch:

> IOW, I do not have trouble changing the test so that it works with
> noncompliant "diff".  But then in the same series, I would prefer to
> see the existing test keeps working with a possibly noncompliant
> "sed" implementation that has been working well with the tests.

changed my mind.

I will use `[+]` here in the reroll.

diff --git a/t/t4061-diff-indent.sh b/t/t4061-diff-indent.sh
index 2affd7a100..0f7a6d97a8 100755
--- a/t/t4061-diff-indent.sh
+++ b/t/t4061-diff-indent.sh
@@ -17,7 +17,7 @@  compare_diff () {
 # Compare blame output using the expectation for a diff as reference.
 # Only look for the lines coming from non-boundary commits.
 compare_blame () {
-	sed -n -e "1,4d" -e "s/^\+//p" <"$1" >.tmp-1
+	sed -n -e "1,4d" -e "s/^+//p" <"$1" >.tmp-1
 	sed -ne "s/^[^^][^)]*) *//p" <"$2" >.tmp-2
 	test_cmp .tmp-1 .tmp-2 && rm -f .tmp-1 .tmp-2
 }

[1/6] t4061: use POSIX compliance regex(7)

Commit Message

Comments

Patch