Message ID | c625bfe2205d51b3158ef71e4bf472708642c146.1537223021.git.matvore@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2,1/6] CodingGuidelines: add shell piping guidelines | expand |
On Mon, Sep 17, 2018 at 6:24 PM Matthew DeVore <matvore@google.com> wrote: > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > @@ -163,6 +163,35 @@ For shell scripts specifically (not exhaustive): > + - In a piped sequence which spans multiple lines, put each statement > + on a separate line and put pipes on the end of each line, rather > + than the start. This means you don't need to use \ to join lines, > + since | implies a join already. Also, do not indent subsequent > + lines; if you need a sequence to visually stand apart from the > + surrounding code, use a blank line before and/or after the piped > + sequence. > + > + (incorrect) > + [...] > + (correct) > + echo '...' > expected Existing tests seem to favor the name "expect" over "expected", so perhaps use that instead. $ git grep '>expect\b' -- t | wc -l 2674 $ git grep '>expected\b' -- t | wc -l 1406 > + git ls-files -s file.1 file.2 file.3 file.4 file.5 | > + awk '{print $1}' | > + sort >observed This is not a great example since it flatly contradicts the very next bit of advice added by this patch about not placing a Git command upstream in a pipe. Perhaps come up with an example which doesn't suffer this shortcoming. I've seen the advice earlier in the thread of not indenting the sub-commands in a pipe, but I find that the result makes it far more difficult to see which commands are part of the pipe sequence than with them indented, so I'm not convinced that this advice should be in the guidelines. (But that just my opinion.) > + - In a pipe, any non-zero exit codes returned by processes besides > + the last will be ignored. If there is any possibility some > + non-final command in the pipe will raise an error, prefer writing > + the output of that command to a temporary file with '>' rather than > + pipe it. It's not so much that we care about losing a non-zero exit code (which might be perfectly acceptable depending upon the context) but that we care about missing a Git command which outright crashes. So, it might make sense to make this text more specific by saying that ("exit code indicating a crash" and "Git command") rather than being generic in saying only "exit code" and "command". Also, what about expression like $(git foo) by which a crash of a Git command can also be lost? Do we want to talk about that, as well?
On Mon, Sep 17, 2018 at 5:16 PM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Mon, Sep 17, 2018 at 6:24 PM Matthew DeVore <matvore@google.com> wrote: > > diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines > > @@ -163,6 +163,35 @@ For shell scripts specifically (not exhaustive): > > + - In a piped sequence which spans multiple lines, put each statement > > + on a separate line and put pipes on the end of each line, rather > > + than the start. This means you don't need to use \ to join lines, > > + since | implies a join already. Also, do not indent subsequent > > + lines; if you need a sequence to visually stand apart from the > > + surrounding code, use a blank line before and/or after the piped > > + sequence. > > + > > + (incorrect) > > + [...] > > + (correct) > > + echo '...' > expected > > Existing tests seem to favor the name "expect" over "expected", so > perhaps use that instead. > > $ git grep '>expect\b' -- t | wc -l > 2674 > $ git grep '>expected\b' -- t | wc -l > 1406 Thank you for clarifying that out for me, but I'm not longer using that example, so it's moot. > > > + git ls-files -s file.1 file.2 file.3 file.4 file.5 | > > + awk '{print $1}' | > > + sort >observed > > This is not a great example since it flatly contradicts the very next > bit of advice added by this patch about not placing a Git command > upstream in a pipe. Perhaps come up with an example which doesn't > suffer this shortcoming. Done. > > I've seen the advice earlier in the thread of not indenting the > sub-commands in a pipe, but I find that the result makes it far more > difficult to see which commands are part of the pipe sequence than > with them indented, so I'm not convinced that this advice should be in > the guidelines. (But that just my opinion.) I'm not totally sure either way, nor do I have a strong opinion. I agree it's probably better to not codify this in the documentation until there's a great reason to. > > > + - In a pipe, any non-zero exit codes returned by processes besides > > + the last will be ignored. If there is any possibility some > > + non-final command in the pipe will raise an error, prefer writing > > + the output of that command to a temporary file with '>' rather than > > + pipe it. > > It's not so much that we care about losing a non-zero exit code (which > might be perfectly acceptable depending upon the context) but that we > care about missing a Git command which outright crashes. So, it might > make sense to make this text more specific by saying that ("exit code > indicating a crash" and "Git command") rather than being generic in > saying only "exit code" and "command". Fixed. > > Also, what about expression like $(git foo) by which a crash of a Git > command can also be lost? Do we want to talk about that, as well? Yes, it's probably better to add a point about that. Here is the new documentation after applying your suggestions: - If a piped sequence which spans multiple lines, put each statement on a separate line and put pipes on the end of each line, rather than the start. This means you don't need to use \ to join lines, since | implies a join already. (incorrect) grep blob verify_pack_result \ | awk -f print_1.awk \ | sort >actual && ... (correct) grep blob verify_pack_result | awk -f print_1.awk | sort >actual && ... - In a pipe, any exit codes returned by processes besides the last are ignored. This means that if git crashes at the beginning or middle of a pipe, it may go undetected. Prefer writing the output of that command to a temporary file with '>' rather than pipe it. - The $(git ...) construct also discards git's exit code, so if the goal is to test that particular command, redirect its output to a temporary file rather than wrap it with $( ).
On Tue, Sep 18, 2018 at 10:11 PM Matthew DeVore <matvore@google.com> wrote: > Yes, it's probably better to add a point about that. Here is the new > documentation after applying your suggestions: > > - If a piped sequence which spans multiple lines, put each statement s/which// > on a separate line and put pipes on the end of each line, rather > than the start. This means you don't need to use \ to join lines, > since | implies a join already. > [...] > - In a pipe, any exit codes returned by processes besides the last > are ignored. This means that if git crashes at the beginning or > middle of a pipe, it may go undetected. Prefer writing the output > of that command to a temporary file with '>' rather than pipe it. > > - The $(git ...) construct also discards git's exit code, so if the > goal is to test that particular command, redirect its output to a > temporary file rather than wrap it with $( ). This all sounds better.
Matthew DeVore <matvore@google.com> writes: > Yes, it's probably better to add a point about that. Here is the new > documentation after applying your suggestions: > > - If a piped sequence which spans multiple lines, put each statement > on a separate line and put pipes on the end of each line, rather > than the start. This means you don't need to use \ to join lines, > since | implies a join already. > > (incorrect) > grep blob verify_pack_result \ > | awk -f print_1.awk \ > | sort >actual && > ... > > (correct) > grep blob verify_pack_result | > awk -f print_1.awk | > sort >actual && > ... The formatting advice to place '|' at the end applies equally to '&&' and '||' because these three syntactic elements share exactly the same trait: the shell knows you haven't finished speaking when it sees them at the end of the line and keeps listening, and humans would know that too, so there is no need for explicitly continuing the line with backslash. Organizationally speaking, I wonder if the above about formatting would better appear separate from the latter two points that are about semantics. > - In a pipe, any exit codes returned by processes besides the last > are ignored. This means that if git crashes at the beginning or > middle of a pipe, it may go undetected. Prefer writing the output > of that command to a temporary file with '>' rather than pipe it. > > - The $(git ...) construct also discards git's exit code, so if the > goal is to test that particular command, redirect its output to a > temporary file rather than wrap it with $( ).
On Wed, Sep 19, 2018 at 5:36 AM Eric Sunshine <sunshine@sunshineco.com> wrote: > > On Tue, Sep 18, 2018 at 10:11 PM Matthew DeVore <matvore@google.com> wrote: > > Yes, it's probably better to add a point about that. Here is the new > > documentation after applying your suggestions: > > > > - If a piped sequence which spans multiple lines, put each statement > > s/which// Done. On Wed, Sep 19, 2018 at 7:24 AM Junio C Hamano <gitster@pobox.com> wrote: > The formatting advice to place '|' at the end applies equally to > '&&' and '||' because these three syntactic elements share exactly > the same trait: the shell knows you haven't finished speaking when > it sees them at the end of the line and keeps listening, and humans > would know that too, so there is no need for explicitly continuing > the line with backslash. > I've reworded the text to indicate the advice applies to && and || as well. > Organizationally speaking, I wonder if the above about formatting > would better appear separate from the latter two points that are > about semantics. > I moved the formatting point to right under the point about formatting if statements, which does seem like a more natural progression. Here is the new patch to summarize the changes (warning: tabs are mangled): -------------------------------------------------------------------------------- CodingGuidelines: add shell piping guidelines Add two guidelines: - pipe characters should appear at the end of lines, and not cause indentation - pipes should be avoided when they swallow exit codes that can potentially fail diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 48aa4edfb..6d265327c 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -118,6 +118,24 @@ For shell scripts specifically (not exhaustive): do this fi + - If a command sequence joined with && or || or | spans multiple + lines, put each command on a separate line and put && and || and | + operators at the end of each line, rather than the start. This + means you don't need to use \ to join lines, since the above + operators imply the sequence isn't finished. + + (incorrect) + grep blob verify_pack_result \ + | awk -f print_1.awk \ + | sort >actual && + ... + + (correct) + grep blob verify_pack_result | + awk -f print_1.awk | + sort >actual && + ... + - We prefer "test" over "[ ... ]". - We do not write the noiseword "function" in front of shell @@ -163,6 +181,15 @@ For shell scripts specifically (not exhaustive): does not have such a problem. + - In a piped chain such as "grep blob objects | sort", the exit codes + returned by processes besides the last are ignored. This means that + if git crashes at the beginning or middle of a chain, it may go + undetected. Prefer writing the output of that command to a + temporary file with '>' rather than pipe it. + + - The $(git ...) construct also discards git's exit code, so if the + goal is to test that particular command, redirect its output to a + temporary file rather than wrap it with $( ). For C programs:
diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines index 48aa4edfb..7f903c1aa 100644 --- a/Documentation/CodingGuidelines +++ b/Documentation/CodingGuidelines @@ -163,6 +163,35 @@ For shell scripts specifically (not exhaustive): does not have such a problem. + - In a piped sequence which spans multiple lines, put each statement + on a separate line and put pipes on the end of each line, rather + than the start. This means you don't need to use \ to join lines, + since | implies a join already. Also, do not indent subsequent + lines; if you need a sequence to visually stand apart from the + surrounding code, use a blank line before and/or after the piped + sequence. + + (incorrect) + echo '...' > expected + git ls-files -s file.1 file.2 file.3 file.4 file.5 \ + | awk '{print $1}' \ + | sort >observed + test_cmp expected actual + + (correct) + echo '...' > expected + + git ls-files -s file.1 file.2 file.3 file.4 file.5 | + awk '{print $1}' | + sort >observed + + test_cmp expected actual + + - In a pipe, any non-zero exit codes returned by processes besides + the last will be ignored. If there is any possibility some + non-final command in the pipe will raise an error, prefer writing + the output of that command to a temporary file with '>' rather than + pipe it. For C programs: