Message ID | 285c6830c5182cb602d4fe559525083f69a158e9.1584625896.git.congdanhqx@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fix test failure with busybox | expand |
On Thu, Mar 19, 2020 at 09:00:07PM +0700, Đoàn Trần Công Danh wrote: > POSIX's diff(1) requires output in normal diff format. > However, busybox's diff's output is written in unified format. That's a pretty big difference. I'm surprised this only produces one problem in the test scripts. ;) > POSIX requires no option for normal-diff format. > > A hint in test-lib-functions::test_cmp said `diff -u` isn't available > everywhere. > > Workaround this problem by assuming `diff(1)` output is unified > if we couldn't make anything from normal-diff format. I wonder if we could use "git diff" here. We have to be careful about circular reasoning in our tests (i.e., making sure we're not verifying output with the same code that we're testing), but I think here we're checking how "apply --whitespace=fix" works. But if this is the only spot, then adjusting to handle unified or normal diff isn't too bad. > diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh > index 971a5a7512..2a54ce96b5 100755 > --- a/t/t4124-apply-ws-rule.sh > +++ b/t/t4124-apply-ws-rule.sh > @@ -52,6 +52,12 @@ test_fix () { > > # find touched lines > $DIFF file target | sed -n -e "s/^> //p" >fixed > + if ! test -s fixed; then > + $DIFF file target | > + grep '^+' | > + grep -v '^+++' | > + sed -e "s/+//" >fixed > + fi I think those greps could be lumped into sed like: sed -ne "s/^+[^+]//p" (at the cost of missing blank lines, but I think that's OK for our purposes here; it could be fixed with an ERE). Could we then make a single invocation that covers both diff formats? We can further observe that the only thing we do with the "fixed" file is count the lines, so we can leave the markers. Which means we could ditch sed entirely and use grep. Something like: diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index 971a5a7512..15cb0c81b7 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -50,8 +50,9 @@ test_fix () { # fix should not barf apply_patch --whitespace=fix || return 1 - # find touched lines - $DIFF file target | sed -n -e "s/^> //p" >fixed + # find touched lines; handle either normal or unified + # diff, as system diff may generate either + $DIFF file target | grep '^[>+][^+]' >fixed # the changed lines are all expected to change fixed_cnt=$(wc -l <fixed) seems to work for with both busybox diff and GNU diff. -Peff
Jeff King <peff@peff.net> writes: > On Thu, Mar 19, 2020 at 09:00:07PM +0700, Đoàn Trần Công Danh wrote: > >> POSIX's diff(1) requires output in normal diff format. >> However, busybox's diff's output is written in unified format. > > That's a pretty big difference. I'm surprised this only produces one > problem in the test scripts. ;) > >> POSIX requires no option for normal-diff format. >> >> A hint in test-lib-functions::test_cmp said `diff -u` isn't available >> everywhere. >> >> Workaround this problem by assuming `diff(1)` output is unified >> if we couldn't make anything from normal-diff format. I do not mind working it around, but I am a bit disturbed by an uneven attitude towards POSIX noncompliance this series has. If we were willing to break other people's "sed" that does not do BRE correctly, instead of using '[+]' trick to accomodate them while making sure that an implementation that does not use nonstandard extension and does only BRE, we should just similarly be writing such an implementation of noncompliant diff off as broken, yet we bend backwards over to make sure we can work with them here. IOW, I do not have trouble changing the test so that it works with noncompliant "diff". But then in the same series, I would prefer to see the existing test keeps working with a possibly noncompliant "sed" implementation that has been working well with the tests. > But if this is the only spot, then adjusting to handle unified or normal > diff isn't too bad. Yup. > Could we then make a single invocation that covers both diff formats? We > can further observe that the only thing we do with the "fixed" file is > count the lines, so we can leave the markers. Which means we could ditch > sed entirely and use grep. Something like: > > diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh > index 971a5a7512..15cb0c81b7 100755 > --- a/t/t4124-apply-ws-rule.sh > +++ b/t/t4124-apply-ws-rule.sh > @@ -50,8 +50,9 @@ test_fix () { > # fix should not barf > apply_patch --whitespace=fix || return 1 > > - # find touched lines > - $DIFF file target | sed -n -e "s/^> //p" >fixed > + # find touched lines; handle either normal or unified > + # diff, as system diff may generate either > + $DIFF file target | grep '^[>+][^+]' >fixed > > # the changed lines are all expected to change > fixed_cnt=$(wc -l <fixed) > > seems to work for with both busybox diff and GNU diff. Sounds OK to me.
On 2020-03-19 12:33:34-0400, Jeff King <peff@peff.net> wrote: > On Thu, Mar 19, 2020 at 09:00:07PM +0700, Đoàn Trần Công Danh wrote: > > > POSIX's diff(1) requires output in normal diff format. > > However, busybox's diff's output is written in unified format. > > That's a pretty big difference. I'm surprised this only produces one > problem in the test scripts. ;) > > > POSIX requires no option for normal-diff format. > > > > A hint in test-lib-functions::test_cmp said `diff -u` isn't available > > everywhere. > > > > Workaround this problem by assuming `diff(1)` output is unified > > if we couldn't make anything from normal-diff format. > > I wonder if we could use "git diff" here. We have to be careful about > circular reasoning in our tests (i.e., making sure we're not verifying > output with the same code that we're testing), but I think here we're > checking how "apply --whitespace=fix" works. > > But if this is the only spot, then adjusting to handle unified or normal > diff isn't too bad. > > > diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh > > index 971a5a7512..2a54ce96b5 100755 > > --- a/t/t4124-apply-ws-rule.sh > > +++ b/t/t4124-apply-ws-rule.sh > > @@ -52,6 +52,12 @@ test_fix () { > > > > # find touched lines > > $DIFF file target | sed -n -e "s/^> //p" >fixed > > + if ! test -s fixed; then > > + $DIFF file target | > > + grep '^+' | > > + grep -v '^+++' | > > + sed -e "s/+//" >fixed > > + fi > > I think those greps could be lumped into sed like: > > sed -ne "s/^+[^+]//p" > > (at the cost of missing blank lines, but I think that's OK for our > purposes here; it could be fixed with an ERE). > > Could we then make a single invocation that covers both diff formats? We > can further observe that the only thing we do with the "fixed" file is > count the lines, so we can leave the markers. Which means we could ditch > sed entirely and use grep. Something like: > > diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh > index 971a5a7512..15cb0c81b7 100755 > --- a/t/t4124-apply-ws-rule.sh > +++ b/t/t4124-apply-ws-rule.sh > @@ -50,8 +50,9 @@ test_fix () { > # fix should not barf > apply_patch --whitespace=fix || return 1 > > - # find touched lines > - $DIFF file target | sed -n -e "s/^> //p" >fixed > + # find touched lines; handle either normal or unified > + # diff, as system diff may generate either > + $DIFF file target | grep '^[>+][^+]' >fixed > > # the changed lines are all expected to change > fixed_cnt=$(wc -l <fixed) > > seems to work for with both busybox diff and GNU diff. 3 lines after this one: ?*) expect_cnt=$(grep "[$1]" <fixed | wc -l) ;; As of now, we could simply replace sed with grep entirely, because ! "$1" ~"[>+]". Considering the complicated of: test_expect_success "rule=$rule" ' git config core.whitespace "$rule" && test_fix "$tt$ts$ti$th" ' I think it's better to use sed here.
On Thu, Mar 19, 2020 at 03:58:51PM -0700, Junio C Hamano wrote: > >> Workaround this problem by assuming `diff(1)` output is unified > >> if we couldn't make anything from normal-diff format. > > I do not mind working it around, but I am a bit disturbed by an > uneven attitude towards POSIX noncompliance this series has. If > we were willing to break other people's "sed" that does not do BRE > correctly, instead of using '[+]' trick to accomodate them while > making sure that an implementation that does not use nonstandard > extension and does only BRE, we should just similarly be writing > such an implementation of noncompliant diff off as broken, yet we > bend backwards over to make sure we can work with them here. > > IOW, I do not have trouble changing the test so that it works with > noncompliant "diff". But then in the same series, I would prefer to > see the existing test keeps working with a possibly noncompliant > "sed" implementation that has been working well with the tests. I don't think it's inconsistent. Real-world experience trumps standards. We _know_ that there is a real-world diff that generates only unified diffs, and it is not too hard to work around it. So we should do so. A sed that uses ERE and requires backslash-escaping pluses is theoretical at this point. POSIX forbids it, and I would guess that working around it would be more than just the "[+]" we found, because other patterns probably need it, too. But we won't know until we find one to test on. So I'm not entirely against "[+]" as a defensive measure. But I have a slight preference to avoid it until we know it's needed, not because it's hard to do once, but because I don't want to grow too many defensive superstitions if we don't know they're warranted. -Peff
On Fri, Mar 20, 2020 at 08:52:23AM +0700, Danh Doan wrote: > > diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh > > index 971a5a7512..15cb0c81b7 100755 > > --- a/t/t4124-apply-ws-rule.sh > > +++ b/t/t4124-apply-ws-rule.sh > > @@ -50,8 +50,9 @@ test_fix () { > > # fix should not barf > > apply_patch --whitespace=fix || return 1 > > > > - # find touched lines > > - $DIFF file target | sed -n -e "s/^> //p" >fixed > > + # find touched lines; handle either normal or unified > > + # diff, as system diff may generate either > > + $DIFF file target | grep '^[>+][^+]' >fixed > > > > # the changed lines are all expected to change > > fixed_cnt=$(wc -l <fixed) > > > > seems to work for with both busybox diff and GNU diff. > > 3 lines after this one: > > ?*) expect_cnt=$(grep "[$1]" <fixed | wc -l) ;; > > As of now, we could simply replace sed with grep entirely, > because ! "$1" ~"[>+]". > > Considering the complicated of: > > test_expect_success "rule=$rule" ' > git config core.whitespace "$rule" && > test_fix "$tt$ts$ti$th" > ' > > I think it's better to use sed here. Fair enough. I think it's OK now, but I agree that it puts a pretty subtle assumption into the test_fix function (and that's far removed from what actual tests will call it with, so it's easy to miss). Using sed should be more maintainable. -Peff
diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index 971a5a7512..2a54ce96b5 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -52,6 +52,12 @@ test_fix () { # find touched lines $DIFF file target | sed -n -e "s/^> //p" >fixed + if ! test -s fixed; then + $DIFF file target | + grep '^+' | + grep -v '^+++' | + sed -e "s/+//" >fixed + fi # the changed lines are all expected to change fixed_cnt=$(wc -l <fixed)
POSIX's diff(1) requires output in normal diff format. However, busybox's diff's output is written in unified format. POSIX requires no option for normal-diff format. A hint in test-lib-functions::test_cmp said `diff -u` isn't available everywhere. Workaround this problem by assuming `diff(1)` output is unified if we couldn't make anything from normal-diff format. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> --- t/t4124-apply-ws-rule.sh | 6 ++++++ 1 file changed, 6 insertions(+)