diff mbox series

[02/17] cocci: fix incorrect & verbose "the_repository" rules

Message ID patch-02.17-1b1fc5d41f5-20230317T152724Z-avarab@gmail.com (mailing list archive)
State Accepted
Commit 6f1436ba2a72bfcbeac9688fa7fe374870a49779
Headers show
Series cocci: remove "the_index" wrapper macros | expand

Commit Message

Ævar Arnfjörð Bjarmason March 17, 2023, 3:35 p.m. UTC
When these rules started being added in [1] they didn't use a ";"
after the ")", and would thus catch uses of these macros within
expressions. But as of [2] the new additions were broken in that
they'd only match a subset of the users of these macros.

Rather than narrowly fixing that, let's have these use the much less
verbose pattern introduced in my recent [3]: There's no need to
exhaustively enumerate arguments if we use the "..." syntax. This
means that we can fold all of these different rules into one.

1. afd69dcc219 (object-store: prepare read_object_file to deal with
   any repo, 2018-11-13)
2. 21a9651ba3f (commit-reach: prepare get_merge_bases to handle any
   repo, 2018-11-13)
3. 0e6550a2c63 (cocci: add a index-compatibility.pending.cocci,
   2022-11-19)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .../coccinelle/the_repository.pending.cocci   | 160 +++++-------------
 1 file changed, 46 insertions(+), 114 deletions(-)

Comments

Elijah Newren March 19, 2023, 5:55 a.m. UTC | #1
On Fri, Mar 17, 2023 at 8:53 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> When these rules started being added in [1] they didn't use a ";"
> after the ")", and would thus catch uses of these macros within
> expressions. But as of [2] the new additions were broken in that
> they'd only match a subset of the users of these macros.
>
> Rather than narrowly fixing that, let's have these use the much less
> verbose pattern introduced in my recent [3]: There's no need to
> exhaustively enumerate arguments if we use the "..." syntax. This
> means that we can fold all of these different rules into one.
>
> 1. afd69dcc219 (object-store: prepare read_object_file to deal with
>    any repo, 2018-11-13)
> 2. 21a9651ba3f (commit-reach: prepare get_merge_bases to handle any
>    repo, 2018-11-13)
> 3. 0e6550a2c63 (cocci: add a index-compatibility.pending.cocci,
>    2022-11-19)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  .../coccinelle/the_repository.pending.cocci   | 160 +++++-------------
>  1 file changed, 46 insertions(+), 114 deletions(-)
>
> diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
> index 23b97536da5..99e192736ee 100644
> --- a/contrib/coccinelle/the_repository.pending.cocci
> +++ b/contrib/coccinelle/the_repository.pending.cocci
> @@ -3,118 +3,50 @@
>  // our code base.
>
>  @@
> -expression E;
> -expression F;
> -expression G;
>  @@
> -- read_object_file(
> -+ repo_read_object_file(the_repository,
> -  E, F, G)
> -
> -@@
> -expression E;
> -@@
> -- has_object_file(
> -+ repo_has_object_file(the_repository,
> -  E)
> -
> -@@
> -expression E;
> -@@
> -- has_object_file_with_flags(
> -+ repo_has_object_file_with_flags(the_repository,
> -  E)
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -@@
> -- parse_commit_internal(
> -+ repo_parse_commit_internal(the_repository,
> -  E, F, G)
> -
> -@@
> -expression E;
> -@@
> -- parse_commit(
> -+ repo_parse_commit(the_repository,
> -  E)
> -
> -@@
> -expression E;
> -expression F;
> -@@
> -- get_merge_bases(
> -+ repo_get_merge_bases(the_repository,
> -  E, F);
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -@@
> -- get_merge_bases_many(
> -+ repo_get_merge_bases_many(the_repository,
> -  E, F, G);
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -@@
> -- get_merge_bases_many_dirty(
> -+ repo_get_merge_bases_many_dirty(the_repository,
> -  E, F, G);
> -
> -@@
> -expression E;
> -expression F;
> -@@
> -- in_merge_bases(
> -+ repo_in_merge_bases(the_repository,
> -  E, F);
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -@@
> -- in_merge_bases_many(
> -+ repo_in_merge_bases_many(the_repository,
> -  E, F, G);
> -
> -@@
> -expression E;
> -expression F;
> -@@
> -- get_commit_buffer(
> -+ repo_get_commit_buffer(the_repository,
> -  E, F);
> -
> -@@
> -expression E;
> -expression F;
> -@@
> -- unuse_commit_buffer(
> -+ repo_unuse_commit_buffer(the_repository,
> -  E, F);
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -@@
> -- logmsg_reencode(
> -+ repo_logmsg_reencode(the_repository,
> -  E, F, G);
> -
> -@@
> -expression E;
> -expression F;
> -expression G;
> -expression H;
> -@@
> -- format_commit_message(
> -+ repo_format_commit_message(the_repository,
> -  E, F, G, H);
> +(
> +- read_object_file
> ++ repo_read_object_file
> +|
> +- has_object_file
> ++ repo_has_object_file
> +|
> +- has_object_file_with_flags
> ++ repo_has_object_file_with_flags
> +|
> +- parse_commit_internal
> ++ repo_parse_commit_internal
> +|
> +- parse_commit
> ++ repo_parse_commit
> +|
> +- get_merge_bases
> ++ repo_get_merge_bases
> +|
> +- get_merge_bases_many
> ++ repo_get_merge_bases_many
> +|
> +- get_merge_bases_many_dirty
> ++ repo_get_merge_bases_many_dirty
> +|
> +- in_merge_bases
> ++ repo_in_merge_bases
> +|
> +- in_merge_bases_many
> ++ repo_in_merge_bases_many
> +|
> +- get_commit_buffer
> ++ repo_get_commit_buffer
> +|
> +- unuse_commit_buffer
> ++ repo_unuse_commit_buffer
> +|
> +- logmsg_reencode
> ++ repo_logmsg_reencode
> +|
> +- format_commit_message
> ++ repo_format_commit_message
> +)
> +  (
> ++ the_repository,
> +  ...)
> --
> 2.40.0.rc1.1034.g5867a1b10c5

I know virtually nothing about cocci, but the commit message explains
well enough that it's clear what you're doing and what's happening.
:-)
Glen Choo March 22, 2023, 10:46 p.m. UTC | #2
Every time I try to read cocci and spatch docs, I'm impressed at how
impenetrable they are ;) Nevertheless, I'd still like to understand how
the pattern works. I'll take a stab in the dark, and perhaps you can
correct me.

Ævar Arnfjörð Bjarmason         <avarab@gmail.com> writes:

> +(
> +- read_object_file
> ++ repo_read_object_file
> +|
> +- has_object_file
> ++ repo_has_object_file
> +|
> +- has_object_file_with_flags
> ++ repo_has_object_file_with_flags
> +|
> +- parse_commit_internal
> ++ repo_parse_commit_internal
> +|
> +- parse_commit
> ++ repo_parse_commit
> +|
> +- get_merge_bases
> ++ repo_get_merge_bases
> +|
> +- get_merge_bases_many
> ++ repo_get_merge_bases_many
> +|
> +- get_merge_bases_many_dirty
> ++ repo_get_merge_bases_many_dirty
> +|
> +- in_merge_bases
> ++ repo_in_merge_bases
> +|
> +- in_merge_bases_many
> ++ repo_in_merge_bases_many
> +|
> +- get_commit_buffer
> ++ repo_get_commit_buffer
> +|
> +- unuse_commit_buffer
> ++ repo_unuse_commit_buffer
> +|
> +- logmsg_reencode
> ++ repo_logmsg_reencode
> +|
> +- format_commit_message
> ++ repo_format_commit_message
> +)

I assume that `|` characters in parentheses are a logical OR, and each
of the expressions checks for the `-` side in the original and replaces
it with the `+` side.

> +  (
> ++ the_repository,
> +  ...)

Then this is another expression that matches literal `()` after the
previous expression? `+the_repository` adds `the_repository` right after
the opening `(`, then leaves the uninteresting `...` in place.

If so, I don't know how cocci/spatch tells the difference between
literal `()` vs an expression in the syntax (preceding whitespace?).

Either way, as Elijah said, your plain explanation is clear enough that
I feel comfortable with this.

> -- 
> 2.40.0.rc1.1034.g5867a1b10c5
Ævar Arnfjörð Bjarmason March 26, 2023, 5:02 a.m. UTC | #3
On Wed, Mar 22 2023, Glen Choo wrote:

> Every time I try to read cocci and spatch docs, I'm impressed at how
> impenetrable they are ;)

FWIW you should ignore the manpage, which and instead read the
"Coccinelle User’s manual", and particularly "The SmPL Grammar", both of
which are available as PDFs on their website.

But their docs are rather terse, and sometimes even incomplete. I've
often resorted to grepping their own test cases to figure out how
something works.

> Nevertheless, I'd still like to understand how
> the pattern works. I'll take a stab in the dark, and perhaps you can
> correct me.
>
> Ævar Arnfjörð Bjarmason         <avarab@gmail.com> writes:
>
>> +(
>> +- read_object_file
>> ++ repo_read_object_file
>> +|
>> +- has_object_file
>> ++ repo_has_object_file
>> +|
>> +- has_object_file_with_flags
>> ++ repo_has_object_file_with_flags
>> +|
>> +- parse_commit_internal
>> ++ repo_parse_commit_internal
>> +|
>> +- parse_commit
>> ++ repo_parse_commit
>> +|
>> +- get_merge_bases
>> ++ repo_get_merge_bases
>> +|
>> +- get_merge_bases_many
>> ++ repo_get_merge_bases_many
>> +|
>> +- get_merge_bases_many_dirty
>> ++ repo_get_merge_bases_many_dirty
>> +|
>> +- in_merge_bases
>> ++ repo_in_merge_bases
>> +|
>> +- in_merge_bases_many
>> ++ repo_in_merge_bases_many
>> +|
>> +- get_commit_buffer
>> ++ repo_get_commit_buffer
>> +|
>> +- unuse_commit_buffer
>> ++ repo_unuse_commit_buffer
>> +|
>> +- logmsg_reencode
>> ++ repo_logmsg_reencode
>> +|
>> +- format_commit_message
>> ++ repo_format_commit_message
>> +)
>
> I assume that `|` characters in parentheses are a logical OR, and each
> of the expressions checks for the `-` side in the original and replaces
> it with the `+` side.

Yes, just a simple "replace A with B".

>> +  (
>> ++ the_repository,
>> +  ...)
>
> Then this is another expression that matches literal `()` after the
> previous expression? `+the_repository` adds `the_repository` right after
> the opening `(`, then leaves the uninteresting `...` in place.
>
> If so, I don't know how cocci/spatch tells the difference between
> literal `()` vs an expression in the syntax (preceding whitespace?).

Yes, whitespace is significant in the coccinelle syntax, generally its
own "()" grouping goes at the beginning of a line, wheras you indent
program text in the "diff" with whitespace.

E.g. our equals-null.cocci has two rules that use "(" and ")" in a way
that would be ambiguous if this whitespace-disambiguation weren't being
used.
diff mbox series

Patch

diff --git a/contrib/coccinelle/the_repository.pending.cocci b/contrib/coccinelle/the_repository.pending.cocci
index 23b97536da5..99e192736ee 100644
--- a/contrib/coccinelle/the_repository.pending.cocci
+++ b/contrib/coccinelle/the_repository.pending.cocci
@@ -3,118 +3,50 @@ 
 // our code base.
 
 @@
-expression E;
-expression F;
-expression G;
 @@
-- read_object_file(
-+ repo_read_object_file(the_repository,
-  E, F, G)
-
-@@
-expression E;
-@@
-- has_object_file(
-+ repo_has_object_file(the_repository,
-  E)
-
-@@
-expression E;
-@@
-- has_object_file_with_flags(
-+ repo_has_object_file_with_flags(the_repository,
-  E)
-
-@@
-expression E;
-expression F;
-expression G;
-@@
-- parse_commit_internal(
-+ repo_parse_commit_internal(the_repository,
-  E, F, G)
-
-@@
-expression E;
-@@
-- parse_commit(
-+ repo_parse_commit(the_repository,
-  E)
-
-@@
-expression E;
-expression F;
-@@
-- get_merge_bases(
-+ repo_get_merge_bases(the_repository,
-  E, F);
-
-@@
-expression E;
-expression F;
-expression G;
-@@
-- get_merge_bases_many(
-+ repo_get_merge_bases_many(the_repository,
-  E, F, G);
-
-@@
-expression E;
-expression F;
-expression G;
-@@
-- get_merge_bases_many_dirty(
-+ repo_get_merge_bases_many_dirty(the_repository,
-  E, F, G);
-
-@@
-expression E;
-expression F;
-@@
-- in_merge_bases(
-+ repo_in_merge_bases(the_repository,
-  E, F);
-
-@@
-expression E;
-expression F;
-expression G;
-@@
-- in_merge_bases_many(
-+ repo_in_merge_bases_many(the_repository,
-  E, F, G);
-
-@@
-expression E;
-expression F;
-@@
-- get_commit_buffer(
-+ repo_get_commit_buffer(the_repository,
-  E, F);
-
-@@
-expression E;
-expression F;
-@@
-- unuse_commit_buffer(
-+ repo_unuse_commit_buffer(the_repository,
-  E, F);
-
-@@
-expression E;
-expression F;
-expression G;
-@@
-- logmsg_reencode(
-+ repo_logmsg_reencode(the_repository,
-  E, F, G);
-
-@@
-expression E;
-expression F;
-expression G;
-expression H;
-@@
-- format_commit_message(
-+ repo_format_commit_message(the_repository,
-  E, F, G, H);
+(
+- read_object_file
++ repo_read_object_file
+|
+- has_object_file
++ repo_has_object_file
+|
+- has_object_file_with_flags
++ repo_has_object_file_with_flags
+|
+- parse_commit_internal
++ repo_parse_commit_internal
+|
+- parse_commit
++ repo_parse_commit
+|
+- get_merge_bases
++ repo_get_merge_bases
+|
+- get_merge_bases_many
++ repo_get_merge_bases_many
+|
+- get_merge_bases_many_dirty
++ repo_get_merge_bases_many_dirty
+|
+- in_merge_bases
++ repo_in_merge_bases
+|
+- in_merge_bases_many
++ repo_in_merge_bases_many
+|
+- get_commit_buffer
++ repo_get_commit_buffer
+|
+- unuse_commit_buffer
++ repo_unuse_commit_buffer
+|
+- logmsg_reencode
++ repo_logmsg_reencode
+|
+- format_commit_message
++ repo_format_commit_message
+)
+  (
++ the_repository,
+  ...)