diff mbox series

add usage-strings ci check and amend remaining usage strings

Message ID pull.1147.git.1645030949730.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series add usage-strings ci check and amend remaining usage strings | expand

Commit Message

Abhradeep Chakraborty Feb. 16, 2022, 5:02 p.m. UTC
From: Abhra303 <chakrabortyabhradeep79@gmail.com>

Usage strings for git (sub)command flags has a style guide that
suggests - first letter should not capitalized (unless requied)
and it should skip full-stop at the end of line. But there are
some files where usage-strings do not follow the above mentioned
guide. Moreover, there are no checks to verify if all usage strings
are following the guide/convention or not.

Amend the usage strings that don't follow the convention/guide and
add a `CI` check for checking the usage strings (whether the first
letter is capital or it ends with full-stop). If the `check` find
such strings then print those strings and return a non-zero status.

Also provide a script that takes an optional argument (a valid <tree>
string), to check the usage strings in the given <tree> (`HEAD` is
the default argument).

Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
---
    add usage-strings ci check and amend remaining usage strings
    
    This patch series completely fixes #636.
    
    The issue is about amending the usage-strings (for command flags such as
    -h, -v etc.) which do not follow the style convention/guide. There was a
    PR [https://github.com/gitgitgadget/git/pull/920] addressing this issue
    but as Johannes [https://github.com/dscho] said in his comment
    [https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
    there are some files that still have those kind of usage strings.
    Johannes also suggested to add a CI check under ci/test-documentation.sh
    to check the usage strings.
    
    So, in this patch, all remaining usage strings are corrected. I also
    added a check-usage-strings target in Makefile which can be used to
    check usage strings. It uses check-usage-strings.sh.
    
     1. If check-usage-strings.sh is run on a valid git repo - it will check
        the validity of usage-strings in the tree specified by an argument
        or in the HEAD if no argument provided.
     2. If the current repo is not a git repo (i.e. if it doesn't find any
        .git folder), it will check the usage string in the current root
        directory.
    
    For the first case, output of make check-usage-strings or
    ./check-usage-strings.sh would be similar to -
    
    HEAD:builtin/bisect--helper.c:1212                        N_("use <cmd>... to automatically bisect."), BISECT_RUN),
    HEAD:builtin/submodule--helper.c:1877            OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
    
    
    If an argument provided - ./check-usage-strings.sh 'v2.34.0' , it will
    search for usage-strings in v2.34.0 and v2.34.0 will be prefixed before
    filenames instead of HEAD.
    
    In the second case, output would be similar to -
    
    diff.c:5596                            N_("select files by diff type."),
    diff.c:5599               N_("Output to a specific file"),
    builtin/branch.c:666            OPT_BIT('C', NULL, &copy, N_("copy a branch, even if target exists."), 2),
    make: *** [check-usage-strings] Error 1
    
    
    Note in the last case - arguments provided to it will be useless.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1147

 Makefile                    |  5 +++++
 builtin/bisect--helper.c    |  2 +-
 builtin/reflog.c            |  6 +++---
 builtin/submodule--helper.c |  2 +-
 check-usage-strings.sh      | 33 +++++++++++++++++++++++++++++++++
 ci/test-documentation.sh    |  1 +
 diff.c                      |  2 +-
 t/helper/test-run-command.c |  6 +++---
 8 files changed, 48 insertions(+), 9 deletions(-)
 create mode 100755 check-usage-strings.sh


base-commit: b80121027d1247a0754b3cc46897fee75c050b44

Comments

Abhradeep Chakraborty Feb. 21, 2022, 2:51 p.m. UTC | #1
Hello reviewers and community members,

This patch request is not reviewed yet (i.e. no response/suggestions came about this patch request; most probably because it was lost in the PR ocean).

So, could you please review my patch request?

thanks :)
Ævar Arnfjörð Bjarmason Feb. 21, 2022, 3:39 p.m. UTC | #2
On Wed, Feb 16 2022, Abhradeep Chakraborty via GitGitGadget wrote:

> From: Abhra303 <chakrabortyabhradeep79@gmail.com>
>
> Usage strings for git (sub)command flags has a style guide that
> suggests - first letter should not capitalized (unless requied)
> and it should skip full-stop at the end of line. But there are
> some files where usage-strings do not follow the above mentioned
> guide. Moreover, there are no checks to verify if all usage strings
> are following the guide/convention or not.
>
> Amend the usage strings that don't follow the convention/guide and
> add a `CI` check for checking the usage strings (whether the first
> letter is capital or it ends with full-stop). If the `check` find
> such strings then print those strings and return a non-zero status.
>
> Also provide a script that takes an optional argument (a valid <tree>
> string), to check the usage strings in the given <tree> (`HEAD` is
> the default argument).
>
> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com>
> ---
>     add usage-strings ci check and amend remaining usage strings
>     
>     This patch series completely fixes #636.
>     
>     The issue is about amending the usage-strings (for command flags such as
>     -h, -v etc.) which do not follow the style convention/guide. There was a
>     PR [https://github.com/gitgitgadget/git/pull/920] addressing this issue
>     but as Johannes [https://github.com/dscho] said in his comment
>     [https://github.com/gitgitgadget/git/issues/636#issuecomment-1018660439],
>     there are some files that still have those kind of usage strings.
>     Johannes also suggested to add a CI check under ci/test-documentation.sh
>     to check the usage strings.
>     
>     So, in this patch, all remaining usage strings are corrected. I also
>     added a check-usage-strings target in Makefile which can be used to
>     check usage strings. It uses check-usage-strings.sh.
>     
>      1. If check-usage-strings.sh is run on a valid git repo - it will check
>         the validity of usage-strings in the tree specified by an argument
>         or in the HEAD if no argument provided.
>      2. If the current repo is not a git repo (i.e. if it doesn't find any
>         .git folder), it will check the usage string in the current root
>         directory.
>     
>     For the first case, output of make check-usage-strings or
>     ./check-usage-strings.sh would be similar to -
>     
>     HEAD:builtin/bisect--helper.c:1212                        N_("use <cmd>... to automatically bisect."), BISECT_RUN),
>     HEAD:builtin/submodule--helper.c:1877            OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
>     
>     
>     If an argument provided - ./check-usage-strings.sh 'v2.34.0' , it will
>     search for usage-strings in v2.34.0 and v2.34.0 will be prefixed before
>     filenames instead of HEAD.
>     
>     In the second case, output would be similar to -
>     
>     diff.c:5596                            N_("select files by diff type."),
>     diff.c:5599               N_("Output to a specific file"),
>     builtin/branch.c:666            OPT_BIT('C', NULL, &copy, N_("copy a branch, even if target exists."), 2),
>     make: *** [check-usage-strings] Error 1
>     
>     
>     Note in the last case - arguments provided to it will be useless.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1147%2FAbhra303%2Fusage_command_amend-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1147/Abhra303/usage_command_amend-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1147

Sorry about leaving this patch submission hanging. I read this at the
time, but forgot to find time to loop back to it.

>  Makefile                    |  5 +++++
>  builtin/bisect--helper.c    |  2 +-
>  builtin/reflog.c            |  6 +++---
>  builtin/submodule--helper.c |  2 +-
>  check-usage-strings.sh      | 33 +++++++++++++++++++++++++++++++++
>  ci/test-documentation.sh    |  1 +
>  diff.c                      |  2 +-
>  t/helper/test-run-command.c |  6 +++---
>  8 files changed, 48 insertions(+), 9 deletions(-)
>  create mode 100755 check-usage-strings.sh
>
> diff --git a/Makefile b/Makefile
> index 186f9ab6190..93faed51da0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -3416,6 +3416,11 @@ check-docs::
>  check-builtins::
>  	./check-builtins.sh
>  
> +### Make sure all the usage strings follow usage string style guide
> +#
> +check-usage-strings::
> +	./check-usage-strings.sh
> +
>  ### Test suite coverage testing
>  #
>  .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report
> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index 28a2e6a5750..614d95b022c 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c
> @@ -1209,7 +1209,7 @@ int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
>  		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
>  			 N_("visualize the bisection"), BISECT_VISUALIZE),
>  		OPT_CMDMODE(0, "bisect-run", &cmdmode,
> -			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
> +			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
>  		OPT_BOOL(0, "no-log", &nolog,
>  			 N_("no log for BISECT_WRITE")),
>  		OPT_END()
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 85b838720c3..28372c5e2b5 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -600,7 +600,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>  		OPT_BIT(0, "updateref", &flags,
>  			N_("update the reference to the value of the top reflog entry"),
>  			EXPIRE_REFLOGS_UPDATE_REF),
> -		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
> +		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
>  		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
>  			       N_("prune entries older than the specified time"),
>  			       PARSE_OPT_NONEG,
> @@ -613,7 +613,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
>  			 N_("prune any reflog entries that point to broken commits")),
>  		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
>  		OPT_BOOL(1, "single-worktree", &all_worktrees,
> -			 N_("limits processing to reflogs from the current worktree only.")),
> +			 N_("limits processing to reflogs from the current worktree only")),
>  		OPT_END()
>  	};
>  
> @@ -736,7 +736,7 @@ static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
>  		OPT_BIT(0, "updateref", &flags,
>  			N_("update the reference to the value of the top reflog entry"),
>  			EXPIRE_REFLOGS_UPDATE_REF),
> -		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
> +		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
>  		OPT_END()
>  	};
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index c5d3fc3817f..9864ec1427d 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -1874,7 +1874,7 @@ static int module_clone(int argc, const char **argv, const char *prefix)
>  		OPT_STRING(0, "depth", &clone_data.depth,
>  			   N_("string"),
>  			   N_("depth for shallow clones")),
> -		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
> +		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
>  		OPT_BOOL(0, "progress", &progress,
>  			   N_("force cloning progress")),
>  		OPT_BOOL(0, "require-init", &require_init,

It's really good to have these fixed! Ditto for the remaining ones I
elided.

> diff --git a/check-usage-strings.sh b/check-usage-strings.sh
> new file mode 100755
> index 00000000000..a4028e0d00d
> --- /dev/null
> +++ b/check-usage-strings.sh
> @@ -0,0 +1,33 @@
> +{
> +  if test -d ".git"
> +  then
> +    rev=${1:-"HEAD"}
> +    for entry in $(git grep -l 'struct option .* = {$' "$rev" -- \*.c);
> +    do
> +      git show "$entry" |
> +      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
> +      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed 's/\//\\&/g'):\\1/";
> +    done
> +  else
> +    for entry in $(grep -rl --include="*.c" 'struct option .* = {$' . );
> +    do
> +      cat "$entry" |
> +      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
> +      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed -e 's/\//\\&/g' -e 's/^\.\\\///'):\\1/";
> +    done
> +  fi
> +} |
> +grep -Pe '((?<!OPT_GROUP\(N_\(|OPT_GROUP\()"(?!GPG|DEPRECATED|SHA1|HEAD)[A-Z]|(?<!"|\.\.)\.")' |
> +{
> +  status=0
> +  while read content;
> +  do
> +    if test -n "$content"
> +    then
> +      echo "$content";
> +      status=1;
> +    fi
> +  done
> +
> +  exit $status
> +}
> diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
> index de41888430a..f66848dfc66 100755
> --- a/ci/test-documentation.sh
> +++ b/ci/test-documentation.sh
> @@ -15,6 +15,7 @@ filter_log () {
>  }

As much as I like the idea, I really don't want us to have this method
of doing it though, i.e. to start parsing our C code with a
hard-to-maintain shellscript.

But the good news is that there's much easier way to add this!

Aside: if we did want to do the "parse C" method the right way to do it
would be to have a coccinelle script do it. We don't currently, but we
use coccicheck, and if you look at the linux kernel's use of it there's
multiple such checks there. I.e. you can have it parse the C and run
your checks with an arbitrary script.

But in this case there's really a much easier way to do this, to just
extend something like this:

diff --git a/parse-options.c b/parse-options.c
index 2437ad3bcdd..90d8da6ad4c 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -492,6 +492,8 @@ static void parse_options_check(const struct option *opts)
 		default:
 			; /* ok. (usually accepts an argument) */
 		}
+		if (opts->help && ends_with(opts->help, "."))
+			err |= optbug(opts, xstrfmt("argh should not end with a dot: %s", opts->help));
 		if (opts->argh &&
 		    strcspn(opts->argh, " _") != strlen(opts->argh))
 			err |= optbug(opts, "multi-word argh should use dash to separate words");

Then the t0012-help.sh test will catch these, and that's where these
sorts of checks belong in our tree.

See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
or _, 2014-03-23) for the existing code shown in the context where we
already check "argh" like that, i.e. we're just missing a test for
"help".

Obviously such a function would need to hardcode some of the logic you
added in your shellscript. E.g. this fires on a string ending in "...",
but yours doesn't.

That should be fairly easy to do though, and if not we could always just
dump these to stderr or something if a
git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
do the testing itself in t0012-help.sh.

>  make check-builtins
> +make check-usage-strings
>  make check-docs

Good to have this as a "make" target, not something e.g. peculiar to CI.
Junio C Hamano Feb. 21, 2022, 5:15 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
>> index de41888430a..f66848dfc66 100755
>> --- a/ci/test-documentation.sh
>> +++ b/ci/test-documentation.sh
>> @@ -15,6 +15,7 @@ filter_log () {
>>  }
>
> As much as I like the idea, I really don't want us to have this method
> of doing it though, i.e. to start parsing our C code with a
> hard-to-maintain shellscript.
>
> But the good news is that there's much easier way to add this!

;-)  Good suggestions.  I have nothing to add.
Abhradeep Chakraborty Feb. 21, 2022, 5:33 p.m. UTC | #4
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> Sorry about leaving this patch submission hanging. I read this at the
> time, but forgot to find time to loop back to it.

No worries. Thanks for reviewing :)

> But in this case there's really a much easier way to do this, to just
> extend something like this:
> ...
> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
> or _, 2014-03-23) for the existing code shown in the context where we
> already check "argh" like that, i.e. we're just missing a test for
> "help".
>
> Obviously such a function would need to hardcode some of the logic you
> added in your shellscript. E.g. this fires on a string ending in "...",
> but yours doesn't.

Thank you so much for the suggestion. Didn't aware of it before. I will
try to implement the logic in parse-options.c` (as you suggested).

> That should be fairly easy to do though, and if not we could always just
> dump these to stderr or something if a
> git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
> do the testing itself in t0012-help.sh.

Okay but if the logic can't be implented in the `parse-options.c` file
(most probably I will be able to implement the logic), would you allow me
to try the `coccinelle script` method you mentioned?

Thanks :)
Ævar Arnfjörð Bjarmason Feb. 21, 2022, 6:52 p.m. UTC | #5
On Mon, Feb 21 2022, Abhradeep Chakraborty wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>> Sorry about leaving this patch submission hanging. I read this at the
>> time, but forgot to find time to loop back to it.
>
> No worries. Thanks for reviewing :)
>
>> But in this case there's really a much easier way to do this, to just
>> extend something like this:
>> ...
>> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
>> or _, 2014-03-23) for the existing code shown in the context where we
>> already check "argh" like that, i.e. we're just missing a test for
>> "help".
>>
>> Obviously such a function would need to hardcode some of the logic you
>> added in your shellscript. E.g. this fires on a string ending in "...",
>> but yours doesn't.
>
> Thank you so much for the suggestion. Didn't aware of it before. I will
> try to implement the logic in parse-options.c` (as you suggested).
>
>> That should be fairly easy to do though, and if not we could always just
>> dump these to stderr or something if a
>> git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true, and
>> do the testing itself in t0012-help.sh.
>
> Okay but if the logic can't be implented in the `parse-options.c` file
> (most probably I will be able to implement the logic), would you allow me
> to try the `coccinelle script` method you mentioned?

In this case I think there's definitely no reason for why it can't/won't
work in parse-options.c.

If you're doing something like that with coccicheck I'm afraid I can't
help much, I've only seen that the kernel is doing it (it's referenced
in some of the coccinelle docs), but I haven't personally used it for
anything close to that.
Abhradeep Chakraborty Feb. 22, 2022, 10:25 a.m. UTC | #6
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> But in this case there's really a much easier way to do this, to just
> extend something like this:
> ...
> See b6c2a0d45d4 (parse-options: make sure argh string does not have SP
> or _, 2014-03-23) for the existing code shown in the context where we
> already check "argh" like that, i.e. we're just missing a test for
> "help".
>
> Obviously such a function would need to hardcode some of the logic you
> added in your shellscript. E.g. this fires on a string ending in "...",
> but yours doesn't.

Hello Ævar, I have some query related to this method. I have implemented
the logic locally and tests are also passing. However, I think the test
you mentioned is only running against the builtin files and files that
are used in builtin commands (e.g. `diff.c`, `builtin/add.c` etc.). But
some files from `t/helper` (e.g. t/helper/test-run-command.c) also uses
parse option API and it seems that there are no test files (pardon me if I
am wrong) for checking `parse option usage strings check` for `t/helper`
test-tool commands.

E.g. `grep -r --include="*.c" 'struct option .*\[] = {$' .` command gives
the following output - 

./helper/test-parse-options.c:  struct option options[] = {
./helper/test-lazy-init-name-hash.c:    struct option options[] = {
./helper/test-serve-v2.c:       struct option options[] = {
./helper/test-simple-ipc.c:     struct option options[] = {
./helper/test-parse-pathspec-file.c:    struct option options[] = {
./helper/test-getcwd.c: struct option options[] = {
./helper/test-run-command.c:    struct option options[] = {
./helper/test-run-command.c:    struct option options[] = {
./helper/test-proc-receive.c:   struct option options[] = {
./helper/test-progress.c:       struct option options[] = {
./helper/test-tool.c:   struct option options[] = {

So, these files are using parse-options and there is a chance that in
future, usage strings from these files may violate the style guide. In
this case, all tests will be passing even if there are some style
violations. What do you think?

Thanks :)
Johannes Schindelin Feb. 22, 2022, 10:57 a.m. UTC | #7
Hi Julia,

I would like to loop you in here because you have helped us with
Coccinelle questions in the past.

On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> > That should be fairly easy to do though, and if not we could always
> > just dump these to stderr or something if a
> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
> > and do the testing itself in t0012-help.sh.
>
> Okay but if the logic can't be implented in the `parse-options.c` file
> (most probably I will be able to implement the logic), would you allow
> me to try the `coccinelle script` method you mentioned?

The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
other, similar macros) that get a fourth argument of the form

	N_("<some string>")

The problem is to identify `<some string>` that ends in a `.` (which we
want to avoid) or that starts with some prefix and a colon but follows
with an upper-case character.

In other words, we want to suggest replacing

	N_("log: Use something")

or

	N_("log: use something.")

by

	N_("log: use something")

Ævar suggested that Coccinelle can do that. Could you give us a hand how
this would be possible using `spatch`?

Thank you,
Johannes
Ævar Arnfjörð Bjarmason Feb. 22, 2022, 12:37 p.m. UTC | #8
On Tue, Feb 22 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> I would like to loop you in here because you have helped us with
> Coccinelle questions in the past.

Thanks. Probably better to CC the relevant ML, adding it.

> On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> > That should be fairly easy to do though, and if not we could always
>> > just dump these to stderr or something if a
>> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
>> > and do the testing itself in t0012-help.sh.
>>
>> Okay but if the logic can't be implented in the `parse-options.c` file
>> (most probably I will be able to implement the logic), would you allow
>> me to try the `coccinelle script` method you mentioned?
>
> The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
> other, similar macros) that get a fourth argument of the form
>
> 	N_("<some string>")
>
> The problem is to identify `<some string>` that ends in a `.` (which we
> want to avoid) or that starts with some prefix and a colon but follows
> with an upper-case character.
>
> In other words, we want to suggest replacing
>
> 	N_("log: Use something")
>
> or
>
> 	N_("log: use something.")
>
> by
>
> 	N_("log: use something")
>
> Ævar suggested that Coccinelle can do that. Could you give us a hand how
> this would be possible using `spatch`?

I probably shouldn't have mentioned that at all, and I think this is
academic in this context, because as noted we can just add this to
parse_options_check() (linking it here again for off-git-ml context):

    https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/

We now pay that trivial runtime overhead every time, we could run it
only during the tests if we ever got worried about it.

And it's a lot less fragile and easy to understand than running
coccicheck, i.e. as nice as it is it's still takes a while to run, is
its own mini-language, needs to be kept in sync with code changes etc.

So by doing it at runtime we can adjust messages, code & tests in an
atomic patch more easily (i.e. not assume that you ran some cocci target
to validate it).

It also makes it really easy to do things that are really hard (or
impossible?) with coccinelle. I.e. some of these checks are run as a
function of what flag gets passed into some function later on, which in
the general case would require coccinelle to have some runtime emulator
for C code just to see *what* checks it wants to run.

That being said (and with the caveat that I've only looked at this code,
not done this myself) if you clone linux.git and browse through:

    git grep -C100 -F coccilib.report '*.cocci'

You can see a lot of examples of using cocci for these sorts of checks.

And the same goes if you clone coccinelle.git and do:

    git grep -C100 @script: -- tests

For linux.git it's documented here:
https://github.com/torvalds/linux/blob/master/Documentation/dev-tools/coccinelle.rst

I.e. it's basically writing the sort of cocci rules we have in-tree with
a callback script that complaints about the required change.

For our use it would probably better (in lieu of parse_options_check(),
which is the right thing here) to just have a normal *.cocci file and
complain if it applies changes. We already error in the CI if those need
to apply any changes.

But I don't off-hand know how to do that. E.g. I was trying the other
day to come up with some coccinelle rule that converted:

    die("BUG: blah blah");

To:

    BUG("blah blah");

And while I'm sure there's some way to do this, couldn't find a way to
write a rule to "reach in" to a constant string, apply some check or
search/replacement on it, and to do a subsequent transformation on it.

In this case the OPT_BLAH(1, 2, 3, "string here") has OPT_BLAH() as a
macro. I can't remember if there's extra caveats around using coccinelle
for macros v.s. symbols.

Disclaimer: By "couldn't" I mean I grepped the above examples for all of
a few minutes quickly skimmed the coccinelle docs, didn't find a
template I could copy, then ended up writing some nasty grep/xargs/perl
for-loop instead :)
Julia Lawall Feb. 22, 2022, 1:42 p.m. UTC | #9
On Tue, 22 Feb 2022, Ævar Arnfjörð Bjarmason wrote:

>
> On Tue, Feb 22 2022, Johannes Schindelin wrote:
>
> > Hi Julia,
> >
> > I would like to loop you in here because you have helped us with
> > Coccinelle questions in the past.
>
> Thanks. Probably better to CC the relevant ML, adding it.
>
> > On Mon, 21 Feb 2022, Abhradeep Chakraborty wrote:
> >
> >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >>
> >> > That should be fairly easy to do though, and if not we could always
> >> > just dump these to stderr or something if a
> >> > git_env_bool("GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP", 0) was true,
> >> > and do the testing itself in t0012-help.sh.
> >>
> >> Okay but if the logic can't be implented in the `parse-options.c` file
> >> (most probably I will be able to implement the logic), would you allow
> >> me to try the `coccinelle script` method you mentioned?
> >
> > The task at hand is to identify calls to the macro `OPT_CMDMODE()` (and
> > other, similar macros) that get a fourth argument of the form
> >
> > 	N_("<some string>")
> >
> > The problem is to identify `<some string>` that ends in a `.` (which we
> > want to avoid) or that starts with some prefix and a colon but follows
> > with an upper-case character.
> >
> > In other words, we want to suggest replacing
> >
> > 	N_("log: Use something")
> >
> > or
> >
> > 	N_("log: use something.")
> >
> > by
> >
> > 	N_("log: use something")
> >
> > Ævar suggested that Coccinelle can do that. Could you give us a hand how
> > this would be possible using `spatch`?

Hello,

I'm not sure to follow all of the following.

Of there are some cases that are useful to do statically, with only local
information, then using Coccinelle could be useful to get the problem out
of the way once and for all.  Coccinelle doesn't support much processing
of strings directly, but you can always write some python code to test the
contents of a string and to create a new one.

Let me know if you want to try this.  You can also check, eg the demo
demos/pythontococci.cocci to see how to create code in a python script and
then use it in a normal SmPL rule.

If some context has to be taken into account and the context in the same
function, then that can also be done with Coccinelle, eg

A
...
B

matches the case where after an A there is a B on all execution paths
(except perhaps those that end in an error exit) and

A
... when exists
B

matches the case where there is a B sometime after executing A, even if
that does not always occur.

If the context that you are interested in is in a called function or is in
the calling context, then Coccinelle might not be the ideal choice.
Coccinelle works on one function at a time, so to do anything
interprocedural, you have to do some hacks.

julia
>
> I probably shouldn't have mentioned that at all, and I think this is
> academic in this context, because as noted we can just add this to
> parse_options_check() (linking it here again for off-git-ml context):
>
>     https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/
>
> We now pay that trivial runtime overhead every time, we could run it
> only during the tests if we ever got worried about it.
>
> And it's a lot less fragile and easy to understand than running
> coccicheck, i.e. as nice as it is it's still takes a while to run, is
> its own mini-language, needs to be kept in sync with code changes etc.
>
> So by doing it at runtime we can adjust messages, code & tests in an
> atomic patch more easily (i.e. not assume that you ran some cocci target
> to validate it).
>
> It also makes it really easy to do things that are really hard (or
> impossible?) with coccinelle. I.e. some of these checks are run as a
> function of what flag gets passed into some function later on, which in
> the general case would require coccinelle to have some runtime emulator
> for C code just to see *what* checks it wants to run.
>
> That being said (and with the caveat that I've only looked at this code,
> not done this myself) if you clone linux.git and browse through:
>
>     git grep -C100 -F coccilib.report '*.cocci'
>
> You can see a lot of examples of using cocci for these sorts of checks.
>
> And the same goes if you clone coccinelle.git and do:
>
>     git grep -C100 @script: -- tests
>
> For linux.git it's documented here:
> https://github.com/torvalds/linux/blob/master/Documentation/dev-tools/coccinelle.rst
>
> I.e. it's basically writing the sort of cocci rules we have in-tree with
> a callback script that complaints about the required change.
>
> For our use it would probably better (in lieu of parse_options_check(),
> which is the right thing here) to just have a normal *.cocci file and
> complain if it applies changes. We already error in the CI if those need
> to apply any changes.
>
> But I don't off-hand know how to do that. E.g. I was trying the other
> day to come up with some coccinelle rule that converted:
>
>     die("BUG: blah blah");
>
> To:
>
>     BUG("blah blah");
>
> And while I'm sure there's some way to do this, couldn't find a way to
> write a rule to "reach in" to a constant string, apply some check or
> search/replacement on it, and to do a subsequent transformation on it.
>
> In this case the OPT_BLAH(1, 2, 3, "string here") has OPT_BLAH() as a
> macro. I can't remember if there's extra caveats around using coccinelle
> for macros v.s. symbols.
>
> Disclaimer: By "couldn't" I mean I grepped the above examples for all of
> a few minutes quickly skimmed the coccinelle docs, didn't find a
> template I could copy, then ended up writing some nasty grep/xargs/perl
> for-loop instead :)
>
Abhradeep Chakraborty Feb. 22, 2022, 2:03 p.m. UTC | #10
Julia Lawall wrote:

> Of there are some cases that are useful to do statically, with only local
> information, then using Coccinelle could be useful to get the problem out
> of the way once and for all.  Coccinelle doesn't support much processing
> of strings directly, but you can always write some python code to test the
> contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
> ...
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Thank you Julia for this helpful info. Looking at your description, I think
the `add check to parse-options.c` (that Ævar suggested as the most ideal
method for it) method is more simpler than this. Moreover,as this is only
about checking usage-strings, so adding complexity to it will not be a
good idea.

Thanks :)
Abhradeep Chakraborty Feb. 22, 2022, 3:47 p.m. UTC | #11
Julia Lawall wrote:

> Of there are some cases that are useful to do statically, with only local
> information, then using Coccinelle could be useful to get the problem out
> of the way once and for all.  Coccinelle doesn't support much processing
> of strings directly, but you can always write some python code to test the
> contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
> ...
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Though in this case, `parse-options.c check` method is better, but in other
cases, this might be a good fit. In those cases, I would also like to help
you (i.e. you, Johannes, Ævar and other devs) to fix those cases.

Thanks :)
Johannes Schindelin Feb. 25, 2022, 3:03 p.m. UTC | #12
Hi Julia,

On Tue, 22 Feb 2022, Julia Lawall wrote:

> [I]f there are some cases that are useful to do statically, with only
> local information, then using Coccinelle could be useful to get the
> problem out of the way once and for all.  Coccinelle doesn't support
> much processing of strings directly, but you can always write some
> python code to test the contents of a string and to create a new one.
>
> Let me know if you want to try this.  You can also check, eg the demo
> demos/pythontococci.cocci to see how to create code in a python script and
> then use it in a normal SmPL rule.
>
> If some context has to be taken into account and the context in the same
> function, then that can also be done with Coccinelle, eg
>
> A
> ...
> B
>
> matches the case where after an A there is a B on all execution paths
> (except perhaps those that end in an error exit) and
>
> A
> ... when exists
> B
>
> matches the case where there is a B sometime after executing A, even if
> that does not always occur.
>
> If the context that you are interested in is in a called function or is in
> the calling context, then Coccinelle might not be the ideal choice.
> Coccinelle works on one function at a time, so to do anything
> interprocedural, you have to do some hacks.

Right. The code in question is not actually calling a function, but a
macro, and passes a literal string to the macro that we would want to
check statically.

I did have my doubts that it would be easy with Coccinelle, but since Ævar
seemed so confident, I tried it, struggled, and decided to follow up with
you.

Thank you for confirming my suspicion!
Johannes
Johannes Schindelin Feb. 25, 2022, 3:30 p.m. UTC | #13
Hi,

On Tue, 22 Feb 2022, Abhradeep Chakraborty wrote:

> Julia Lawall wrote:
>
> > Of there are some cases that are useful to do statically, with only local
> > information, then using Coccinelle could be useful to get the problem out
> > of the way once and for all.  Coccinelle doesn't support much processing
> > of strings directly, but you can always write some python code to test the
> > contents of a string and to create a new one.
> >
> > Let me know if you want to try this.  You can also check, eg the demo
> > demos/pythontococci.cocci to see how to create code in a python script and
> > then use it in a normal SmPL rule.
> > ...
> > If the context that you are interested in is in a called function or is in
> > the calling context, then Coccinelle might not be the ideal choice.
> > Coccinelle works on one function at a time, so to do anything
> > interprocedural, you have to do some hacks.
>
> Though in this case, `parse-options.c check` method is better [...]

I fear that this is incorrect.

In general, it is my experience that it is a mistake any time a static
check is replaced by a runtime check.

I was ready to let it slide in this instance, but in this case I now have
proof that the `parse-options.c` check is worse than the originally
suggested `sed` chain.

That concrete proof is in the output of
https://github.com/git/git/actions/runs/1890665968, where the combination
of `ac/usage-string-fixups` and `jh/builtin-fsmonitor-part2` causes many,
many failures, but all of those failures have the same root cause: the
runtime check.

With the original `check-usage-strings.sh`, the user inspecting any
failure would see precisely what the issue is, in the `static-analysis`
job's logs. It would display something like this:

	HEAD:builtin/fsmonitor--daemon.c:1507:          N_("Max seconds to wait for background daemon startup")),

With v4 of the patch series, it does not spell out anything in
`static-analysis`. Instead, it causes 8 separate jobs to fail,
it causes failures not only in `t0012-help.sh` but also in
`t7519-status-fsmonitor.sh` and in `t7527-builtin-fsmonitor.sh`.

The purpose of t7519 and t7527 is _not_ to verify those usage strings,
though.

The worst part? Look at the relevant output of t0012 (see
https://github.com/git/git/runs/5312844492?check_suite_focus=true#step:5:4902):

	[...]
	++ git -C sub fsmonitor--daemon -h
	++ exit_code=128
	++ test 128 = 129
	++ echo 'test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h'
	test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h
	++ return 1
	error: last command exited with $?=1
	not ok 81 - fsmonitor--daemon can handle -h
	[...]

Do you see what usage string caused the failure? You can't. And that's
even by design:

	(
		GIT_CEILING_DIRECTORIES=$(pwd) &&
		export GIT_CEILING_DIRECTORIES &&
		test_expect_code 129 git -C sub $builtin -h >output 2>&1
	) &&
	test_i18ngrep usage output

The output is redirected, and since the runtime check added to
`parse-options.c` causes the exit code to be different from the expected
one, the output is never shown.

Arguably the most important job of a regression test is to help software
engineers to diagnose and fix the regression. As quickly and as
conveniently as possible. That means that it is not enough to point out
that there is a regression, the output should be as helpful and concise as
possible to facilitate fixing the problem.

In the above-mentioned case, it was neither as helpful nor as concise as
possible because in the test case that was supposed to identify the
problem, the actual error message was swallowed, and instead of causing
one test failure, it caused a whopping 42 test cases to fail (some of
which even show the error message, but that's not even the purpose of
those test cases).

Since the entire point of this here patch series is to help enforce Git's
rules regarding usage strings, it should expect things like the issue with
`fsmonitor--daemon` _and_ make it as painless to address such an issue.

I am afraid that I have to NAK the `parse-options.c` approach because v1
of this patch series did so much better a job.

Ciao,
Dscho
Julia Lawall Feb. 25, 2022, 3:36 p.m. UTC | #14
On Fri, 25 Feb 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> On Tue, 22 Feb 2022, Julia Lawall wrote:
>
> > [I]f there are some cases that are useful to do statically, with only
> > local information, then using Coccinelle could be useful to get the
> > problem out of the way once and for all.  Coccinelle doesn't support
> > much processing of strings directly, but you can always write some
> > python code to test the contents of a string and to create a new one.
> >
> > Let me know if you want to try this.  You can also check, eg the demo
> > demos/pythontococci.cocci to see how to create code in a python script and
> > then use it in a normal SmPL rule.
> >
> > If some context has to be taken into account and the context in the same
> > function, then that can also be done with Coccinelle, eg
> >
> > A
> > ...
> > B
> >
> > matches the case where after an A there is a B on all execution paths
> > (except perhaps those that end in an error exit) and
> >
> > A
> > ... when exists
> > B
> >
> > matches the case where there is a B sometime after executing A, even if
> > that does not always occur.
> >
> > If the context that you are interested in is in a called function or is in
> > the calling context, then Coccinelle might not be the ideal choice.
> > Coccinelle works on one function at a time, so to do anything
> > interprocedural, you have to do some hacks.
>
> Right. The code in question is not actually calling a function, but a
> macro, and passes a literal string to the macro that we would want to
> check statically.

Coccinelle doesn't care about whether a function is called or whether a
macro is called.  It considers everything to be a function.

>
> I did have my doubts that it would be easy with Coccinelle, but since Ævar
> seemed so confident, I tried it, struggled, and decided to follow up with
> you.

Something like this:

@r1@
expression e;
@@

N(e)

@script:python s@
e << r1.e;
replacement;
@@

if string_ok e
then cocci.include_match(False)
else coccinelle.replacement = "\"better string\""

@@
expression r1.e;
expression s.replacement;
@@
- N(e)
+ N(replacement)

------------------

You can fill in the definition of string_ok and better string.  I think
the \" will be necessary, because the value of an expression metavariable
at the python level is a string, so there should be a string inside of it
to make it a string expression.

julia
Ævar Arnfjörð Bjarmason Feb. 25, 2022, 4:16 p.m. UTC | #15
On Fri, Feb 25 2022, Johannes Schindelin wrote:

> Hi,
>
> On Tue, 22 Feb 2022, Abhradeep Chakraborty wrote:
>
>> Julia Lawall wrote:
>>
>> > Of there are some cases that are useful to do statically, with only local
>> > information, then using Coccinelle could be useful to get the problem out
>> > of the way once and for all.  Coccinelle doesn't support much processing
>> > of strings directly, but you can always write some python code to test the
>> > contents of a string and to create a new one.
>> >
>> > Let me know if you want to try this.  You can also check, eg the demo
>> > demos/pythontococci.cocci to see how to create code in a python script and
>> > then use it in a normal SmPL rule.
>> > ...
>> > If the context that you are interested in is in a called function or is in
>> > the calling context, then Coccinelle might not be the ideal choice.
>> > Coccinelle works on one function at a time, so to do anything
>> > interprocedural, you have to do some hacks.
>>
>> Though in this case, `parse-options.c check` method is better [...]
>
> I fear that this is incorrect.
>
> In general, it is my experience that it is a mistake any time a static
> check is replaced by a runtime check.
>
> I was ready to let it slide in this instance, but in this case I now have
> proof that the `parse-options.c` check is worse than the originally
> suggested `sed` chain.
>
> That concrete proof is in the output of
> https://github.com/git/git/actions/runs/1890665968, where the combination
> of `ac/usage-string-fixups` and `jh/builtin-fsmonitor-part2` causes many,
> many failures, but all of those failures have the same root cause: the
> runtime check.
>
> With the original `check-usage-strings.sh`, the user inspecting any
> failure would see precisely what the issue is, in the `static-analysis`
> job's logs. It would display something like this:
>
> 	HEAD:builtin/fsmonitor--daemon.c:1507:          N_("Max seconds to wait for background daemon startup")),
>
> With v4 of the patch series, it does not spell out anything in
> `static-analysis`. Instead, it causes 8 separate jobs to fail,
> it causes failures not only in `t0012-help.sh` but also in
> `t7519-status-fsmonitor.sh` and in `t7527-builtin-fsmonitor.sh`.
>
> The purpose of t7519 and t7527 is _not_ to verify those usage strings,
> though.
>
> The worst part? Look at the relevant output of t0012 (see
> https://github.com/git/git/runs/5312844492?check_suite_focus=true#step:5:4902):
>
> 	[...]
> 	++ git -C sub fsmonitor--daemon -h
> 	++ exit_code=128
> 	++ test 128 = 129
> 	++ echo 'test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h'
> 	test_expect_code: command exited with 128, we wanted 129 git -C sub fsmonitor--daemon -h
> 	++ return 1
> 	error: last command exited with $?=1
> 	not ok 81 - fsmonitor--daemon can handle -h
> 	[...]
>
> Do you see what usage string caused the failure? You can't. And that's
> even by design:
>
> 	(
> 		GIT_CEILING_DIRECTORIES=$(pwd) &&
> 		export GIT_CEILING_DIRECTORIES &&
> 		test_expect_code 129 git -C sub $builtin -h >output 2>&1
> 	) &&
> 	test_i18ngrep usage output
>
> The output is redirected, and since the runtime check added to
> `parse-options.c` causes the exit code to be different from the expected
> one, the output is never shown.
>
> Arguably the most important job of a regression test is to help software
> engineers to diagnose and fix the regression. As quickly and as
> conveniently as possible. That means that it is not enough to point out
> that there is a regression, the output should be as helpful and concise as
> possible to facilitate fixing the problem.
>
> In the above-mentioned case, it was neither as helpful nor as concise as
> possible because in the test case that was supposed to identify the
> problem, the actual error message was swallowed, and instead of causing
> one test failure, it caused a whopping 42 test cases to fail (some of
> which even show the error message, but that's not even the purpose of
> those test cases).
>
> Since the entire point of this here patch series is to help enforce Git's
> rules regarding usage strings, it should expect things like the issue with
> `fsmonitor--daemon` _and_ make it as painless to address such an issue.
>
> I am afraid that I have to NAK the `parse-options.c` approach because v1
> of this patch series did so much better a job.

I think that's a bit of an overreaction to what I think is a solid v2 in
<pull.1147.v2.git.1645545507689.gitgitgadget@gmail.com>, i.e. that we
must go back to v1 because we encountered this issue.

A. I think you're right about the t0012-help.sh output being bad,
   but that's rather easily fixed with something like the [1] below.

   I've run into that a few times, wished it was better, and manually
   grepped or cat'd the "output" file.

   Part of that is ultimately because we're mixing and matching whether this
   "usage" output goes on stdout or stderr in various commands.

B. The fsmonitor--daemon case is worse than most because it's only running on
   OS or Windows, i.e. the error we'd get in various other CI jobs is ifdef'd
   away, even though we could run the parse_options() part there.

   IIRC that's something I commented on in previous rounds of that series...

C. The case of 42 tests failing because of this could be addressed by just having
   t0012-help.sh do these checks if we wanted, although in that case we'd need to
   make sure we deal with other test blind spots. I.e. the
   "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.

D. These sorts of check, by their nature, have an initial period of growing
   pains due to other in-flight topics. Once we move past that it's usually a
   non-issue going forward, as issues will be caught locally before patch
   submission.

   Data in favor of that is various other checks in parse_options_check() being
   mostly a non-issue, e.g. Junio's b6c2a0d45d4 (parse-options: make sure argh
   string does not have SP or _, 2014-03-23).

In this case I don't see how some minor issues when merging this with
"seen" would have us abandon the v1 and commit to a fragile parsing of C
code in shellscript instead, or with some coccinelle check that would
have inherent issues finding the full context we need (passed-down flags
etc.).

1. 

diff --git a/t/t0012-help.sh b/t/t0012-help.sh
index 6c3e1f7159d..5474d463467 100755
--- a/t/t0012-help.sh
+++ b/t/t0012-help.sh
@@ -237,15 +237,24 @@ test_expect_success 'generate builtin list' '
 	git --list-cmds=builtins >builtins
 '
 
+builtin_in_sub () {
+	(
+		GIT_CEILING_DIRECTORIES=$(pwd) &&
+		export GIT_CEILING_DIRECTORIES &&
+		"$@"
+	)
+}
+
+
 while read builtin
 do
-	test_expect_success "$builtin can handle -h" '
-		(
-			GIT_CEILING_DIRECTORIES=$(pwd) &&
-			export GIT_CEILING_DIRECTORIES &&
-			test_expect_code 129 git -C sub $builtin -h >output 2>&1
-		) &&
-		test_i18ngrep usage output
+	test_expect_success "invoking '$builtin -h' yields exit code 129" '
+		builtin_in_sub test_expect_code 129 git -C sub $builtin -h
+	'
+
+	test_expect_success "invoking '$builtin -h' output" '
+		builtin_in_sub test_expect_code 129 git -C sub $builtin -h >output 2>&1 &&
+		grep usage output
 	'
 done <builtins
Ævar Arnfjörð Bjarmason Feb. 25, 2022, 4:28 p.m. UTC | #16
On Fri, Feb 25 2022, Johannes Schindelin wrote:

> Hi Julia,
>
> On Tue, 22 Feb 2022, Julia Lawall wrote:
>
>> [I]f there are some cases that are useful to do statically, with only
>> local information, then using Coccinelle could be useful to get the
>> problem out of the way once and for all.  Coccinelle doesn't support
>> much processing of strings directly, but you can always write some
>> python code to test the contents of a string and to create a new one.
>>
>> Let me know if you want to try this.  You can also check, eg the demo
>> demos/pythontococci.cocci to see how to create code in a python script and
>> then use it in a normal SmPL rule.
>>
>> If some context has to be taken into account and the context in the same
>> function, then that can also be done with Coccinelle, eg
>>
>> A
>> ...
>> B
>>
>> matches the case where after an A there is a B on all execution paths
>> (except perhaps those that end in an error exit) and
>>
>> A
>> ... when exists
>> B
>>
>> matches the case where there is a B sometime after executing A, even if
>> that does not always occur.
>>
>> If the context that you are interested in is in a called function or is in
>> the calling context, then Coccinelle might not be the ideal choice.
>> Coccinelle works on one function at a time, so to do anything
>> interprocedural, you have to do some hacks.
>
> Right. The code in question is not actually calling a function, but a
> macro, and passes a literal string to the macro that we would want to
> check statically.
>
> I did have my doubts that it would be easy with Coccinelle, but since Ævar
> seemed so confident, I tried it, struggled, and decided to follow up with
> you.
>
> Thank you for confirming my suspicion!
> Johannes

In case it's not clear from the upthread (and I thought my [1] explained
it well enough) I never thought it would be easy or even possible to do
this particular thing with coccinelle.

I.e. I mentioned in [1]:

    Aside: if we did want to do the "parse C" method the right way to do it
    would be to have a coccinelle script do it

So it's intended as a side-note to explain to a new contributor that
*if* we do end up wanting to parse or transform C we have coccinelle,
and it's a great tool for those cases where such static transformations
are easy. I and others have added some in-tree in the past where
appropriate.

But I then went on to say (and elaborated on later in [2]) that in this
case the right thing to do is runtime checking.

So, I'm sorry if you wasted time on it. In either case it seems we've
ended up in agreement in this case about appropriate uses of coccinelle.

I.e. it's a fantastic tool as a semantic patch engine, but it
understandably has limitations where you'd effectively need it to
execute your program to decide what to do, as is the case with the
parse_options() API and the eventual parse_options_check() etc. doing
assertions depending on flags that got passed down.

1. https://lore.kernel.org/git/220221.86tucsb4oy.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220222.867d9n83ir.gmgdl@evledraar.gmail.com/
Abhradeep Chakraborty Feb. 26, 2022, 4:22 a.m. UTC | #17
Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:

> A. I think you're right about the t0012-help.sh output being bad,
>    but that's rather easily fixed with something like the [1] below.

Do you think the fix you suggested should be a part of this Patch series
or a dedicated patch request is needed for this?

> C. The case of 42 tests failing because of this could be addressed by just having
>    t0012-help.sh do these checks if we wanted, although in that case we'd need to
>    make sure we deal with other test blind spots. I.e. the
>    "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.

Pardon me, I am having problem to understand the 
`"GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.` part
here. Could you please explain a little bit?

Thanks :)
Julia Lawall Feb. 26, 2022, 8:55 a.m. UTC | #18
Hello,

Since it seems that Coccinelle is not useful for your problem, could you
remove me from the CC list on this discussion?

thanks,
julia

On Sat, 26 Feb 2022, Abhradeep Chakraborty wrote:

>
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> > A. I think you're right about the t0012-help.sh output being bad,
> >    but that's rather easily fixed with something like the [1] below.
>
> Do you think the fix you suggested should be a part of this Patch series
> or a dedicated patch request is needed for this?
>
> > C. The case of 42 tests failing because of this could be addressed by just having
> >    t0012-help.sh do these checks if we wanted, although in that case we'd need to
> >    make sure we deal with other test blind spots. I.e. the
> >    "GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.
>
> Pardon me, I am having problem to understand the
> `"GIT_TEST_PARSE_OPTIONS_DUMP_FIELD_HELP" suggestion I had.` part
> here. Could you please explain a little bit?
>
> Thanks :)
>
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index 186f9ab6190..93faed51da0 100644
--- a/Makefile
+++ b/Makefile
@@ -3416,6 +3416,11 @@  check-docs::
 check-builtins::
 	./check-builtins.sh
 
+### Make sure all the usage strings follow usage string style guide
+#
+check-usage-strings::
+	./check-usage-strings.sh
+
 ### Test suite coverage testing
 #
 .PHONY: coverage coverage-clean coverage-compile coverage-test coverage-report
diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 28a2e6a5750..614d95b022c 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -1209,7 +1209,7 @@  int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 		OPT_CMDMODE(0, "bisect-visualize", &cmdmode,
 			 N_("visualize the bisection"), BISECT_VISUALIZE),
 		OPT_CMDMODE(0, "bisect-run", &cmdmode,
-			 N_("use <cmd>... to automatically bisect."), BISECT_RUN),
+			 N_("use <cmd>... to automatically bisect"), BISECT_RUN),
 		OPT_BOOL(0, "no-log", &nolog,
 			 N_("no log for BISECT_WRITE")),
 		OPT_END()
diff --git a/builtin/reflog.c b/builtin/reflog.c
index 85b838720c3..28372c5e2b5 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -600,7 +600,7 @@  static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_CALLBACK_F(0, "expire", &cmd, N_("timestamp"),
 			       N_("prune entries older than the specified time"),
 			       PARSE_OPT_NONEG,
@@ -613,7 +613,7 @@  static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 			 N_("prune any reflog entries that point to broken commits")),
 		OPT_BOOL(0, "all", &do_all, N_("process the reflogs of all references")),
 		OPT_BOOL(1, "single-worktree", &all_worktrees,
-			 N_("limits processing to reflogs from the current worktree only.")),
+			 N_("limits processing to reflogs from the current worktree only")),
 		OPT_END()
 	};
 
@@ -736,7 +736,7 @@  static int cmd_reflog_delete(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "updateref", &flags,
 			N_("update the reference to the value of the top reflog entry"),
 			EXPIRE_REFLOGS_UPDATE_REF),
-		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen.")),
+		OPT_BOOL(0, "verbose", &verbose, N_("print extra information on screen")),
 		OPT_END()
 	};
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c5d3fc3817f..9864ec1427d 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1874,7 +1874,7 @@  static int module_clone(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "depth", &clone_data.depth,
 			   N_("string"),
 			   N_("depth for shallow clones")),
-		OPT__QUIET(&quiet, "Suppress output for cloning a submodule"),
+		OPT__QUIET(&quiet, "suppress output for cloning a submodule"),
 		OPT_BOOL(0, "progress", &progress,
 			   N_("force cloning progress")),
 		OPT_BOOL(0, "require-init", &require_init,
diff --git a/check-usage-strings.sh b/check-usage-strings.sh
new file mode 100755
index 00000000000..a4028e0d00d
--- /dev/null
+++ b/check-usage-strings.sh
@@ -0,0 +1,33 @@ 
+{
+  if test -d ".git"
+  then
+    rev=${1:-"HEAD"}
+    for entry in $(git grep -l 'struct option .* = {$' "$rev" -- \*.c);
+    do
+      git show "$entry" |
+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed 's/\//\\&/g'):\\1/";
+    done
+  else
+    for entry in $(grep -rl --include="*.c" 'struct option .* = {$' . );
+    do
+      cat "$entry" |
+      sed -n '/struct option .* = {/,/OPT_END/{=;p;}' |
+      sed "N;s/^\\([0-9]*\\)\\n/$(echo "$entry" | sed -e 's/\//\\&/g' -e 's/^\.\\\///'):\\1/";
+    done
+  fi
+} |
+grep -Pe '((?<!OPT_GROUP\(N_\(|OPT_GROUP\()"(?!GPG|DEPRECATED|SHA1|HEAD)[A-Z]|(?<!"|\.\.)\.")' |
+{
+  status=0
+  while read content;
+  do
+    if test -n "$content"
+    then
+      echo "$content";
+      status=1;
+    fi
+  done
+
+  exit $status
+}
diff --git a/ci/test-documentation.sh b/ci/test-documentation.sh
index de41888430a..f66848dfc66 100755
--- a/ci/test-documentation.sh
+++ b/ci/test-documentation.sh
@@ -15,6 +15,7 @@  filter_log () {
 }
 
 make check-builtins
+make check-usage-strings
 make check-docs
 
 # Build docs with AsciiDoc
diff --git a/diff.c b/diff.c
index c862771a589..000be3bf232 100644
--- a/diff.c
+++ b/diff.c
@@ -5596,7 +5596,7 @@  static void prep_parse_options(struct diff_options *options)
 			       N_("select files by diff type"),
 			       PARSE_OPT_NONEG, diff_opt_diff_filter),
 		{ OPTION_CALLBACK, 0, "output", options, N_("<file>"),
-		  N_("Output to a specific file"),
+		  N_("output to a specific file"),
 		  PARSE_OPT_NONEG, NULL, 0, diff_opt_output },
 
 		OPT_END()
diff --git a/t/helper/test-run-command.c b/t/helper/test-run-command.c
index 913775a14b7..8f370cd89f1 100644
--- a/t/helper/test-run-command.c
+++ b/t/helper/test-run-command.c
@@ -221,9 +221,9 @@  static int quote_stress_test(int argc, const char **argv)
 	struct strbuf out = STRBUF_INIT;
 	struct strvec args = STRVEC_INIT;
 	struct option options[] = {
-		OPT_INTEGER('n', "trials", &trials, "Number of trials"),
-		OPT_INTEGER('s', "skip", &skip, "Skip <n> trials"),
-		OPT_BOOL('m', "msys2", &msys2, "Test quoting for MSYS2's sh"),
+		OPT_INTEGER('n', "trials", &trials, "number of trials"),
+		OPT_INTEGER('s', "skip", &skip, "skip <n> trials"),
+		OPT_BOOL('m', "msys2", &msys2, "test quoting for MSYS2's sh"),
 		OPT_END()
 	};
 	const char * const usage[] = {