[v3,1/5] CodingGuidelines: add shell piping guidelines

Message ID	7dd9a1120726dee34aeebdfa4bf45a232c185500.1537493611.git.matvore@google.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Date: Thu, 20 Sep 2018 18:43:27 -0700 In-Reply-To: <cover.1537493611.git.matvore@google.com> Message-Id: <7dd9a1120726dee34aeebdfa4bf45a232c185500.1537493611.git.matvore@google.com> Mime-Version: 1.0 References: <cover.1536969438.git.matvore@google.com> <cover.1537493611.git.matvore@google.com> Subject: [PATCH v3 1/5] CodingGuidelines: add shell piping guidelines From: Matthew DeVore <matvore@google.com> To: git@vger.kernel.org Cc: Matthew DeVore <matvore@google.com>, peff@peff.net, jonathantanmy@google.com, gitster@pobox.com, jrn@google.com, sunshine@sunshineco.com Content-Type: text/plain; charset="UTF-8" Sender: git-owner@vger.kernel.org Precedence: bulk
Series	Clean up tests for test_cmp arg ordering and pipe placement \| expand [v3,0/5] Clean up tests for test_cmp arg ordering and pipe placement [v3,1/5] CodingGuidelines: add shell piping guidelines [v3,2/5] tests: standardize pipe placement [v3,3/5] t/*: fix ordering of expected/observed arguments [v3,4/5] tests: don't swallow Git errors upstream of pipes [v3,5/5] t9109: don't swallow Git errors upstream of pipes

Matthew DeVore Sept. 21, 2018, 1:43 a.m. UTC

Add two guidelines:

 - pipe characters should appear at the end of lines, and not cause
   indentation
 - pipes should be avoided when they swallow exit codes that can
   potentially fail
---
 Documentation/CodingGuidelines | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Eric Sunshine Sept. 21, 2018, 2:06 a.m. UTC | #1

On Thu, Sep 20, 2018 at 9:43 PM Matthew DeVore <matvore@google.com> wrote:
> Add two guidelines:

Probably s/two/three/ or s/two/several/ since the patch now adds three
guidelines.

>  - pipe characters should appear at the end of lines, and not cause
>    indentation

The "not cause indentation" bit is outdated since the added guideline
no longer says this.

>  - pipes should be avoided when they swallow exit codes that can
>    potentially fail

And:

    - $(git ...) should be avoided ...

Matthew DeVore Sept. 21, 2018, 9:03 p.m. UTC | #2

On Thu, Sep 20, 2018 at 7:06 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Thu, Sep 20, 2018 at 9:43 PM Matthew DeVore <matvore@google.com> wrote:
> > Add two guidelines:
>
> Probably s/two/three/ or s/two/several/ since the patch now adds three
> guidelines.
>
> >  - pipe characters should appear at the end of lines, and not cause
> >    indentation
>
> The "not cause indentation" bit is outdated since the added guideline
> no longer says this.
>
> >  - pipes should be avoided when they swallow exit codes that can
> >    potentially fail
>
> And:
>
>     - $(git ...) should be avoided ...

Thank you for pointing this out - I obviously forgot to update the
commit message. Here is the revised message:

CodingGuidelines: add shell piping guidelines

Add the following guidelines:

 - pipe characters should appear at the end of lines, not the beginning,
   and the \ line continuation character should be omitted
 - pipes and $(git ...) should be avoided when they swallow exit codes
   of processes that can potentially fail

SZEDER Gábor Sept. 24, 2018, 9:03 p.m. UTC | #3

On Thu, Sep 20, 2018 at 06:43:27PM -0700, Matthew DeVore wrote:
> Add two guidelines:
> 
>  - pipe characters should appear at the end of lines, and not cause
>    indentation
>  - pipes should be avoided when they swallow exit codes that can
>    potentially fail
> ---
>  Documentation/CodingGuidelines | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 48aa4edfb..6d265327c 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive):
>  		do this
>  	fi
>  
> + - If a command sequence joined with && or || or | spans multiple
> +   lines, put each command on a separate line and put && and || and |
> +   operators at the end of each line, rather than the start. This
> +   means you don't need to use \ to join lines, since the above
> +   operators imply the sequence isn't finished.
> +
> +	(incorrect)
> +	grep blob verify_pack_result \
> +	| awk -f print_1.awk \
> +	| sort >actual &&
> +	...
> +
> +	(correct)
> +	grep blob verify_pack_result |
> +	awk -f print_1.awk |
> +	sort >actual &&
> +	...
> +

The above are general shell coding style guidelines, so it makes sense
to include them in 'Documentation/CodingGuidelines'.


>   - We prefer "test" over "[ ... ]".
>  
>   - We do not write the noiseword "function" in front of shell
> @@ -163,6 +181,15 @@ For shell scripts specifically (not exhaustive):
>  
>     does not have such a problem.
>  
> + - In a piped chain such as "grep blob objects | sort", the exit codes

Let's make an example with git in it, e.g. something like this:

  git cmd | grep important | sort

since just two lines below the new text mentions git crashing.

> +   returned by processes besides the last are ignored. This means that
> +   if git crashes at the beginning or middle of a chain, it may go
> +   undetected. Prefer writing the output of that command to a
> +   temporary file with '>' rather than pipe it.
> +
> + - The $(git ...) construct also discards git's exit code, so if the

This contruct is called command substitution, and it does preserve the
command's exit code, when the expanded text is assigned to a variable:

  $ var=$(exit 42) ; echo $?
  42

Note, however, that even in that case only the exit code of the last
command substitution is preserved:

  $ var=$(exit 1)foo$(exit 2)bar$(exit 3) ; echo $?
  3

> +   goal is to test that particular command, redirect its output to a
> +   temporary file rather than wrap it with $( ).

I find this a bit vague, and to me it implies that ignoring the exit
code of a git command that is not the main focus of the given test is
acceptable, e.g. (made up pseudo example):

  test_expect_success 'fetch gets what it should' '
    git fetch $remote &&
    test "$(git rev-parse just-fetched)" = $expected_oid
  '

In my opinion no tests should ignore the exit code of any git
command, ever.


These last two points, however, are specific to test scripts,
therefore I think they would be better placed in 't/README', where the
rest of the test-specific guidelines are.

>  For C programs:
>  
> -- 
> 2.19.0.444.g18242da7ef-goog
>

Matthew DeVore Sept. 25, 2018, 9:58 p.m. UTC | #4

On Mon, Sep 24, 2018 at 2:03 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> > + - In a piped chain such as "grep blob objects | sort", the exit codes
>
> Let's make an example with git in it, e.g. something like this:
>
>   git cmd | grep important | sort
>
> since just two lines below the new text mentions git crashing.
Done.

> > + - The $(git ...) construct also discards git's exit code, so if the
>
> This contruct is called command substitution, and it does preserve the
> command's exit code, when the expanded text is assigned to a variable:
>
>   $ var=$(exit 42) ; echo $?
>   42
>
> Note, however, that even in that case only the exit code of the last
> command substitution is preserved:
>
>   $ var=$(exit 1)foo$(exit 2)bar$(exit 3) ; echo $?
>   3
>
OK, I've changed this guideline to allow for setting a variable with
command substitution, but not in other contexts. It's worded
sufficiently openly such that your latter example will be forbidden.

> > +   goal is to test that particular command, redirect its output to a
> > +   temporary file rather than wrap it with $( ).
>
> I find this a bit vague, and to me it implies that ignoring the exit
> code of a git command that is not the main focus of the given test is
> acceptable, e.g. (made up pseudo example):
>
>   test_expect_success 'fetch gets what it should' '
>     git fetch $remote &&
>     test "$(git rev-parse just-fetched)" = $expected_oid
>   '
>
> In my opinion no tests should ignore the exit code of any git
> command, ever.

This seems like a pretty strong assertion, but something very similar
is written in t/README (in the "don't" section):

 - use '! git cmd' when you want to make sure the git command exits
   with failure in a controlled way by calling "die()".  Instead,
   use 'test_must_fail git cmd'.  This will signal a failure if git
   dies in an unexpected way (e.g. segfault).

So I've changed this to basically say you should never ignore git's exit code.

Here is the new commit with updated message (I will wait for a day or
two before I send a reroll):

    Documentation: add shell guidelines

    Add the following guideline to Documentation/CodingGuidelines:

            &&, ||, and | should appear at the end of lines, not the
            beginning, and the \ line continuation character should be
            omitted

    And the following to t/README (since it is specific to writing tests):

            pipes and $(git ...) should be avoided when they swallow exit
            codes of Git processes

    Signed-off-by: Matthew DeVore <matvore@google.com>

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 48aa4edfb..3d2cfea9b 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive):
                 do this
         fi

+ - If a command sequence joined with && or || or | spans multiple
+   lines, put each command on a separate line and put && and || and |
+   operators at the end of each line, rather than the start. This
+   means you don't need to use \ to join lines, since the above
+   operators imply the sequence isn't finished.
+
+        (incorrect)
+        grep blob verify_pack_result \
+        | awk -f print_1.awk \
+        | sort >actual &&
+        ...
+
+        (correct)
+        grep blob verify_pack_result |
+        awk -f print_1.awk |
+        sort >actual &&
+        ...
+
  - We prefer "test" over "[ ... ]".

  - We do not write the noiseword "function" in front of shell
@@ -163,7 +181,6 @@ For shell scripts specifically (not exhaustive):

    does not have such a problem.

-
 For C programs:

  - We use tabs to indent, and interpret tabs as taking up to
diff --git a/t/README b/t/README
index 9028b47d9..3e28b72c4 100644
--- a/t/README
+++ b/t/README
@@ -461,6 +461,32 @@ Don't:
    platform commands; just use '! cmd'.  We are not in the business
    of verifying that the world given to us sanely works.

+ - Use Git upstream in the non-final position in a piped chain, as in:
+
+     git -C repo ls-files |
+     xargs -n 1 basename |
+     grep foo
+
+   which will discard git's exit code and may mask a crash. In the
+   above example, all exit codes are ignored except grep's.
+
+   Instead, write the output of that command to a temporary
+   file with ">" or assign it to a variable with "x=$(git ...)" rather
+   than pipe it.
+
+ - Use command substitution in a way that discards git's exit code.
+   When assigning to a variable, the exit code is not discarded, e.g.:
+
+     x=$(git cat-file -p $sha) &&
+     ...
+
+   is OK because a crash in "git cat-file" will cause the "&&" chain
+   to fail, but:
+
+     test_cmp expect $(git cat-file -p $sha)
+
+   is not OK and a crash in git could go undetected.
+
  - use perl without spelling it as "$PERL_PATH". This is to help our
    friends on Windows where the platform Perl often adds CR before
    the end of line, and they bundle Git with a version of Perl that


>
>
> These last two points, however, are specific to test scripts,
> therefore I think they would be better placed in 't/README', where the
> rest of the test-specific guidelines are.
>
> >  For C programs:
> >
> > --
> > 2.19.0.444.g18242da7ef-goog
> >

SZEDER Gábor Sept. 27, 2018, 9:18 p.m. UTC | #5

On Tue, Sep 25, 2018 at 02:58:08PM -0700, Matthew DeVore wrote:
> Here is the new commit with updated message (I will wait for a day or
> two before I send a reroll):
> 
>     Documentation: add shell guidelines
> 
>     Add the following guideline to Documentation/CodingGuidelines:
> 
>             &&, ||, and | should appear at the end of lines, not the
>             beginning, and the \ line continuation character should be
>             omitted
> 
>     And the following to t/README (since it is specific to writing tests):
> 
>             pipes and $(git ...) should be avoided when they swallow exit
>             codes of Git processes
> 
>     Signed-off-by: Matthew DeVore <matvore@google.com>
> 
> diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
> index 48aa4edfb..3d2cfea9b 100644
> --- a/Documentation/CodingGuidelines
> +++ b/Documentation/CodingGuidelines
> @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive):
>                  do this
>          fi
> 
> + - If a command sequence joined with && or || or | spans multiple
> +   lines, put each command on a separate line and put && and || and |
> +   operators at the end of each line, rather than the start. This
> +   means you don't need to use \ to join lines, since the above
> +   operators imply the sequence isn't finished.
> +
> +        (incorrect)
> +        grep blob verify_pack_result \
> +        | awk -f print_1.awk \
> +        | sort >actual &&
> +        ...
> +
> +        (correct)
> +        grep blob verify_pack_result |
> +        awk -f print_1.awk |
> +        sort >actual &&
> +        ...
> +
>   - We prefer "test" over "[ ... ]".
> 
>   - We do not write the noiseword "function" in front of shell
> @@ -163,7 +181,6 @@ For shell scripts specifically (not exhaustive):
> 
>     does not have such a problem.
> 
> -
>  For C programs:
> 
>   - We use tabs to indent, and interpret tabs as taking up to
> diff --git a/t/README b/t/README
> index 9028b47d9..3e28b72c4 100644
> --- a/t/README
> +++ b/t/README
> @@ -461,6 +461,32 @@ Don't:
>     platform commands; just use '! cmd'.  We are not in the business
>     of verifying that the world given to us sanely works.
> 
> + - Use Git upstream in the non-final position in a piped chain, as in:

Note the starting upper case 'U'.

> +
> +     git -C repo ls-files |
> +     xargs -n 1 basename |
> +     grep foo
> +
> +   which will discard git's exit code and may mask a crash. In the
> +   above example, all exit codes are ignored except grep's.
> +
> +   Instead, write the output of that command to a temporary
> +   file with ">" or assign it to a variable with "x=$(git ...)" rather
> +   than pipe it.
> +
> + - Use command substitution in a way that discards git's exit code.

'U' again.

> +   When assigning to a variable, the exit code is not discarded, e.g.:
> +
> +     x=$(git cat-file -p $sha) &&
> +     ...
> +
> +   is OK because a crash in "git cat-file" will cause the "&&" chain
> +   to fail, but:
> +
> +     test_cmp expect $(git cat-file -p $sha)
> +
> +   is not OK and a crash in git could go undetected.

Well, this is not OK indeed, because it doesn't make any sense in the
first place :)  'test_cmp' requires two paths as argumens, but the
output of 'git cat-file -p' is the whole _content_ of the given object.

>   - use perl without spelling it as "$PERL_PATH". This is to help our

Note the starting lower case 'u'.

This is because these are the continuation of the "Don't:" some lines
earlier, so your new points should start with a lower case 'u' as
well.


Sidenote: I think we should consider reformatting this whole section
as:

  - Don't do this.
  - Don't do that.

because it grew so much that when I look at the last points, then that
starting "Don't:" has already scrolled out of my screen.

>     friends on Windows where the platform Perl often adds CR before
>     the end of line, and they bundle Git with a version of Perl that
> 
> 
> >
> >
> > These last two points, however, are specific to test scripts,
> > therefore I think they would be better placed in 't/README', where the
> > rest of the test-specific guidelines are.
> >
> > >  For C programs:
> > >
> > > --
> > > 2.19.0.444.g18242da7ef-goog
> > >

Matthew DeVore Oct. 1, 2018, 11:07 p.m. UTC | #6

On Thu, Sep 27, 2018 at 2:18 PM SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Tue, Sep 25, 2018 at 02:58:08PM -0700, Matthew DeVore wrote:
> > + - Use Git upstream in the non-final position in a piped chain, as in:
>
> Note the starting upper case 'U'.
>
> > + - Use command substitution in a way that discards git's exit code.
>
> 'U' again.
>
...
> > +
> > +     test_cmp expect $(git cat-file -p $sha)
> > +
> > +   is not OK and a crash in git could go undetected.
>
> Well, this is not OK indeed, because it doesn't make any sense in the
> first place :)  'test_cmp' requires two paths as argumens, but the
> output of 'git cat-file -p' is the whole _content_ of the given object.

I've replaced the example and added "Don't" in front of each new item
(explanation for this below). Here is a new diff for this file - the
rest of the commit is the same:

    platform commands; just use '! cmd'.  We are not in the business
    of verifying that the world given to us sanely works.

+ - Don't use Git upstream in the non-final position in a piped chain, as
+   in:
+
+     git -C repo ls-files |
+     xargs -n 1 basename |
+     grep foo
+
+   which will discard git's exit code and may mask a crash. In the
+   above example, all exit codes are ignored except grep's.
+
+   Instead, write the output of that command to a temporary
+   file with ">" or assign it to a variable with "x=$(git ...)" rather
+   than pipe it.
+
+ - Don't use command substitution in a way that discards git's exit
+   code. When assigning to a variable, the exit code is not discarded,
+   e.g.:
+
+     x=$(git cat-file -p $sha) &&
+     ...
+
+   is OK because a crash in "git cat-file" will cause the "&&" chain
+   to fail, but:
+
+     test "refs/heads/foo" = "$(git symbolic-ref HEAD)"
+
+   is not OK and a crash in git could go undetected.
+
  - Don't use perl without spelling it as "$PERL_PATH". This is to help
    our friends on Windows where the platform Perl often adds CR before
    the end of line, and they bundle Git with a version of Perl that


>
> >   - use perl without spelling it as "$PERL_PATH". This is to help our
>
> Note the starting lower case 'u'.
>
> This is because these are the continuation of the "Don't:" some lines
> earlier, so your new points should start with a lower case 'u' as
> well.
>
>
> Sidenote: I think we should consider reformatting this whole section
> as:
>
>   - Don't do this.
>   - Don't do that.
>
> because it grew so much that when I look at the last points, then that
> starting "Don't:" has already scrolled out of my screen.
>
I didn't like how easy it was to mistake a "Don't" item for a "Do"
(although temporarily until you read the first sentence, but it's
still confusing). So I added a new commit to clean this section up.
Here it is:

Author: Matthew DeVore <matvore@google.com>
Date:   Mon Oct 1 15:30:49 2018 -0700

    t/README: reformat Do, Don't, Keep in mind lists

    The list of Don'ts for test writing has grown large such that it is hard
    to see at a glance which section an item is in. In other words, if I
    ignore a little bit of surrounding context, the "don'ts" look like
    "do's."

    To make the list more readable, prefix "Don't" in front of every first
    sentence in the items.

    Also, the "Keep in mind" list is out of place and awkward, because it
    was a very short "list" beneath two very long ones, and it seemed easy
    to miss under the list of "don'ts," and it only had one item. So move
    this item to the list of "do's" and phrase as "Remember..."

    Signed-off-by: Matthew DeVore <matvore@google.com>

diff --git a/t/README b/t/README
index 9028b47d9..85024aba6 100644
--- a/t/README
+++ b/t/README
@@ -393,13 +393,13 @@ This test harness library does the following things:
    consistently when command line arguments --verbose (or -v),
    --debug (or -d), and --immediate (or -i) is given.

-Do's, don'ts & things to keep in mind
+Do's & don'ts
 -------------------------------------

 Here are a few examples of things you probably should and shouldn't do
 when writing tests.

-Do:
+Here are the "do's:"

  - Put all code inside test_expect_success and other assertions.

@@ -444,16 +444,21 @@ Do:
    Windows, where the shell (MSYS bash) mangles absolute path names.
    For details, see the commit message of 4114156ae9.

-Don't:
+ - Remember that inside the <script> part, the standard output and
+   standard error streams are discarded, and the test harness only
+   reports "ok" or "not ok" to the end user running the tests. Under
+   --verbose, they are shown to help debug the tests.

- - exit() within a <script> part.
+And here are the "don'ts:"
+
+ - Don't exit() within a <script> part.

    The harness will catch this as a programming error of the test.
    Use test_done instead if you need to stop the tests early (see
    "Skipping tests" below).

- - use '! git cmd' when you want to make sure the git command exits
-   with failure in a controlled way by calling "die()".  Instead,
+ - Don't use '! git cmd' when you want to make sure the git command
+   exits with failure in a controlled way by calling "die()".  Instead,
    use 'test_must_fail git cmd'.  This will signal a failure if git
    dies in an unexpected way (e.g. segfault).

@@ -461,8 +466,8 @@ Don't:
    platform commands; just use '! cmd'.  We are not in the business
    of verifying that the world given to us sanely works.

- - use perl without spelling it as "$PERL_PATH". This is to help our
-   friends on Windows where the platform Perl often adds CR before
+ - Don't use perl without spelling it as "$PERL_PATH". This is to help
+   our friends on Windows where the platform Perl often adds CR before
    the end of line, and they bundle Git with a version of Perl that
    does not do so, whose path is specified with $PERL_PATH. Note that we
    provide a "perl" function which uses $PERL_PATH under the hood, so
@@ -470,17 +475,17 @@ Don't:
    (but you do, for example, on a shebang line or in a sub script
    created via "write_script").

- - use sh without spelling it as "$SHELL_PATH", when the script can
-   be misinterpreted by broken platform shell (e.g. Solaris).
+ - Don't use sh without spelling it as "$SHELL_PATH", when the script
+   can be misinterpreted by broken platform shell (e.g. Solaris).

- - chdir around in tests.  It is not sufficient to chdir to
+ - Don't chdir around in tests.  It is not sufficient to chdir to
    somewhere and then chdir back to the original location later in
    the test, as any intermediate step can fail and abort the test,
    causing the next test to start in an unexpected directory.  Do so
    inside a subshell if necessary.

- - save and verify the standard error of compound commands, i.e. group
-   commands, subshells, and shell functions (except test helper
+ - Don't save and verify the standard error of compound commands, i.e.
+   group commands, subshells, and shell functions (except test helper
    functions like 'test_must_fail') like this:

      ( cd dir && git cmd ) 2>error &&
@@ -495,7 +500,7 @@ Don't:
      ( cd dir && git cmd 2>../error ) &&
      test_cmp expect error

- - Break the TAP output
+ - Don't break the TAP output

    The raw output from your test may be interpreted by a TAP harness. TAP
    harnesses will ignore everything they don't know about, but don't step
@@ -515,13 +520,6 @@ Don't:
    but the best indication is to just run the tests with prove(1),
    it'll complain if anything is amiss.

-Keep in mind:
-
- - Inside the <script> part, the standard output and standard error
-   streams are discarded, and the test harness only reports "ok" or
-   "not ok" to the end user running the tests. Under --verbose, they
-   are shown to help debugging the tests.
-

 Skipping tests
 --------------

[v3,1/5] CodingGuidelines: add shell piping guidelines

Commit Message

Comments

Patch