diff mbox series

cocci: remove 'unused.cocci'

Message ID 20230420205350.600760-1-szeder.dev@gmail.com (mailing list archive)
State Accepted
Commit 138ef8068c54d72c9bd1b09753408fde7750ec86
Headers show
Series cocci: remove 'unused.cocci' | expand

Commit Message

SZEDER Gábor April 20, 2023, 8:53 p.m. UTC
When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a
rule to find "unused" strbufs, 2022-07-05) it found three unused
strbufs, and when it was generalized in the next commit it managed to
find an unused string_list as well.  That's four unused variables in
over 17 years, so apparently we rarely make this mistake.

Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it
increases the from-scratch runtime of 'make coccicheck' by over 5:30
minutes or over 160%:

  $ make -s cocciclean
  $ time make -s coccicheck
      * new spatch flags

  real    8m56.201s
  user    0m0.420s
  sys     0m0.406s
  $ rm contrib/coccinelle/unused.cocci contrib/coccinelle/tests/unused.*
  $ make -s cocciclean
  $ time make -s coccicheck
      * new spatch flags

  real    3m23.893s
  user    0m0.228s
  sys     0m0.247s

That's a lot of runtime spent for not much in return, and arguably an
unused struct instance sneaking in is not that big of a deal to
justify the significantly increased runtime.

Remove 'unused.cocci', because we are not getting our CPU cycles'
worth.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---
 contrib/coccinelle/tests/unused.c   | 82 -----------------------------
 contrib/coccinelle/tests/unused.res | 45 ----------------
 contrib/coccinelle/unused.cocci     | 43 ---------------
 3 files changed, 170 deletions(-)
 delete mode 100644 contrib/coccinelle/tests/unused.c
 delete mode 100644 contrib/coccinelle/tests/unused.res
 delete mode 100644 contrib/coccinelle/unused.cocci

Comments

Junio C Hamano April 21, 2023, 2:43 a.m. UTC | #1
SZEDER Gábor <szeder.dev@gmail.com> writes:

> When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a
> rule to find "unused" strbufs, 2022-07-05) it found three unused
> strbufs, and when it was generalized in the next commit it managed to
> find an unused string_list as well.  That's four unused variables in
> over 17 years, so apparently we rarely make this mistake.
>
> Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it
> increases the from-scratch runtime of 'make coccicheck' by over 5:30
> minutes or over 160%:
> ...
> That's a lot of runtime spent for not much in return, and arguably an
> unused struct instance sneaking in is not that big of a deal to
> justify the significantly increased runtime.
>
> Remove 'unused.cocci', because we are not getting our CPU cycles'
> worth.

Will queue.  Thanks.
Ævar Arnfjörð Bjarmason May 1, 2023, 1:27 p.m. UTC | #2
On Thu, Apr 20 2023, SZEDER Gábor wrote:

> When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a
> rule to find "unused" strbufs, 2022-07-05) it found three unused
> strbufs, and when it was generalized in the next commit it managed to
> find an unused string_list as well.  That's four unused variables in
> over 17 years, so apparently we rarely make this mistake.
>
> Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it
> increases the from-scratch runtime of 'make coccicheck' by over 5:30
> minutes or over 160%:
>
>   $ make -s cocciclean
>   $ time make -s coccicheck
>       * new spatch flags
>
>   real    8m56.201s
>   user    0m0.420s
>   sys     0m0.406s
>   $ rm contrib/coccinelle/unused.cocci contrib/coccinelle/tests/unused.*
>   $ make -s cocciclean
>   $ time make -s coccicheck
>       * new spatch flags
>
>   real    3m23.893s
>   user    0m0.228s
>   sys     0m0.247s
>
> That's a lot of runtime spent for not much in return, and arguably an
> unused struct instance sneaking in is not that big of a deal to
> justify the significantly increased runtime.
>
> Remove 'unused.cocci', because we are not getting our CPU cycles'
> worth.

It wasn't something I intended at the time, but arguably the main use of
this rule since it was added was that it served as a canary for the tree
becoming completely broken with coccinelle, due to adding C syntax it
didn't understand:
https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/

So, whatever you think of of how worthwhile it is to spot unused
variables, I think that weighs heavily in its favor. There *are* other
ways to detect those sorts of issues, but as it's currently our only
canary for that issue I don't thin we should remove it.

If we hadn't had unused.cocci we wouldn't be able to apply rules the
functions that use "UNUSED", which have increased a lot in number since
then, and we wouldn't have any way of spotting similar parsing issues.

But it's unfortunate that it's this slow in the from-scratch case.

When we last discussed this I pointed out to you that the main
contribution to the runtime of unused.cocci is parsing
builtin/{log,rebase}.c is pathalogical, but in your reply to that you
seem to not have spotted that (or glossed over it):
https://lore.kernel.org/git/20220831180526.GA1802@szeder.dev/

When I test this locally, doing:

	time make contrib/coccinelle/unused.cocci.patch SPATCH=spatch SPATCH_USE_O_DEPENDENCIES=

Takes ~2m, but if I first do:

	>builtin/log.c; >builtin/rebase.c

It takes ~1m.

So, even without digging into those issues, if we just skipped those two
files we'd speed this part up by 100%.

I think such an approach would be much better than just removing this
outright, which feels rather heavy-handed.

We could formalize that by creating a "coccicheck-full" category or
whatever, just as we now have "coccicheck-pending".

Then I and the CI could run "full", and you could run "coccicheck" (or
maybe we'd call that "coccicheck-cheap" or something).

I also submitted patches to both make "coccicheck" incremental, and
added an "spatchcache", both of which have since been merged (that tool
is in contrib/).

I understand from previous discussion that you wanted to use "make -s"
all the time, but does your use-case also preclude using spatchcache?

I run "coccicheck" a lot, and haven't personally be bothered by this
particular slowdown since that got merged, since it'll only affect me in
the cases where builtin/{log,rebase}.c and a small list of other files
are changed, and it's otherwise unnoticeable.

It would also be rather trivial to just add some way to specify patterns
on the "make" command-line that we'd "$(filter-out)", would that also
address your particular use-case? I.e.:

	make coccicheck COCCI_RULES_EXCLUDE=*unused*

Or whatever.
Junio C Hamano May 1, 2023, 3:55 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> It wasn't something I intended at the time, but arguably the main use of
> this rule since it was added was that it served as a canary for the tree
> becoming completely broken with coccinelle, due to adding C syntax it
> didn't understand:
> https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/

If it weren't Coccinelle, we could have used the much nicer looking
UNUSED(var) notation, and the compilers were all fine.

Only because Coccinelle did not understand the "cute" syntax trick,
we couldn't.  Yes, it caught us when we used a syntax it couldn't
understand, but is that a good thing in the first place?
Ævar Arnfjörð Bjarmason May 1, 2023, 5:28 p.m. UTC | #4
On Mon, May 01 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> It wasn't something I intended at the time, but arguably the main use of
>> this rule since it was added was that it served as a canary for the tree
>> becoming completely broken with coccinelle, due to adding C syntax it
>> didn't understand:
>> https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
>
> If it weren't Coccinelle, we could have used the much nicer looking
> UNUSED(var) notation, and the compilers were all fine.
>
> Only because Coccinelle did not understand the "cute" syntax trick,
> we couldn't.  Yes, it caught us when we used a syntax it couldn't
> understand, but is that a good thing in the first place?

I think it's unambiguously a good thing that we spotted an otherwise
unknown side-effect of the proposed UNUSED(var) syntax on coccinelle.

We might also say that some bit of syntax that coccinelle doesn't
understand is so valuable that we'd like to make coccinelle itself
significantly less useful (as it wouldn't reach into those functions),
or stop using it altogether.

But that's a seperate question. I'm just pointing out that we'd be
losing a very valuable check on future syntax incompatibilities,
particularly when it comes to clever use of macros.

A better way to spot that would be to start parsing the coccinelle logs,
and detect when we have unknown parsing issues, and error on those. But
until then...
Junio C Hamano May 10, 2023, 10:45 p.m. UTC | #5
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> A better way to spot that would be to start parsing the coccinelle logs,
> and detect when we have unknown parsing issues, and error on those. But
> until then...

Until then, I do not think a rather costly test that has found only
4 instances of the mistakes the test was designed to find is a good
way to stand in as a replacement.

Let's drop it, as it is easy to resurrect if somebody wants to run
it from time to time from an old version of Git.  Or is it a valid
alternative to move it to "pending"?

Thanks.
diff mbox series

Patch

diff --git a/contrib/coccinelle/tests/unused.c b/contrib/coccinelle/tests/unused.c
deleted file mode 100644
index 8294d734ba..0000000000
--- a/contrib/coccinelle/tests/unused.c
+++ /dev/null
@@ -1,82 +0,0 @@ 
-void test_strbuf(void)
-{
-	struct strbuf sb1 = STRBUF_INIT;
-	struct strbuf sb2 = STRBUF_INIT;
-	struct strbuf sb3 = STRBUF_INIT;
-	struct strbuf sb4 = STRBUF_INIT;
-	struct strbuf sb5;
-	struct strbuf sb6 = { 0 };
-	struct strbuf sb7 = STRBUF_INIT;
-	struct strbuf sb8 = STRBUF_INIT;
-	struct strbuf *sp1;
-	struct strbuf *sp2;
-	struct strbuf *sp3;
-	struct strbuf *sp4 = xmalloc(sizeof(struct strbuf));
-	struct strbuf *sp5 = xmalloc(sizeof(struct strbuf));
-	struct strbuf *sp6 = xmalloc(sizeof(struct strbuf));
-	struct strbuf *sp7;
-
-	strbuf_init(&sb5, 0);
-	strbuf_init(sp1, 0);
-	strbuf_init(sp2, 0);
-	strbuf_init(sp3, 0);
-	strbuf_init(sp4, 0);
-	strbuf_init(sp5, 0);
-	strbuf_init(sp6, 0);
-	strbuf_init(sp7, 0);
-	sp7 = xmalloc(sizeof(struct strbuf));
-
-	use_before(&sb3);
-	use_as_str("%s", sb7.buf);
-	use_as_str("%s", sp1->buf);
-	use_as_str("%s", sp6->buf);
-	pass_pp(&sp3);
-
-	strbuf_release(&sb1);
-	strbuf_reset(&sb2);
-	strbuf_release(&sb3);
-	strbuf_release(&sb4);
-	strbuf_release(&sb5);
-	strbuf_release(&sb6);
-	strbuf_release(&sb7);
-	strbuf_release(sp1);
-	strbuf_release(sp2);
-	strbuf_release(sp3);
-	strbuf_release(sp4);
-	strbuf_release(sp5);
-	strbuf_release(sp6);
-	strbuf_release(sp7);
-
-	use_after(&sb4);
-
-	if (when_strict())
-		return;
-	strbuf_release(&sb8);
-}
-
-void test_other(void)
-{
-	struct string_list l = STRING_LIST_INIT_DUP;
-	struct strbuf sb = STRBUF_INIT;
-
-	string_list_clear(&l, 0);
-	string_list_clear(&sb, 0);
-}
-
-void test_worktrees(void)
-{
-	struct worktree **w1 = get_worktrees();
-	struct worktree **w2 = get_worktrees();
-	struct worktree **w3;
-	struct worktree **w4;
-
-	w3 = get_worktrees();
-	w4 = get_worktrees();
-
-	use_it(w4);
-
-	free_worktrees(w1);
-	free_worktrees(w2);
-	free_worktrees(w3);
-	free_worktrees(w4);
-}
diff --git a/contrib/coccinelle/tests/unused.res b/contrib/coccinelle/tests/unused.res
deleted file mode 100644
index 6d3e745683..0000000000
--- a/contrib/coccinelle/tests/unused.res
+++ /dev/null
@@ -1,45 +0,0 @@ 
-void test_strbuf(void)
-{
-	struct strbuf sb3 = STRBUF_INIT;
-	struct strbuf sb4 = STRBUF_INIT;
-	struct strbuf sb7 = STRBUF_INIT;
-	struct strbuf *sp1;
-	struct strbuf *sp3;
-	struct strbuf *sp6 = xmalloc(sizeof(struct strbuf));
-	strbuf_init(sp1, 0);
-	strbuf_init(sp3, 0);
-	strbuf_init(sp6, 0);
-
-	use_before(&sb3);
-	use_as_str("%s", sb7.buf);
-	use_as_str("%s", sp1->buf);
-	use_as_str("%s", sp6->buf);
-	pass_pp(&sp3);
-
-	strbuf_release(&sb3);
-	strbuf_release(&sb4);
-	strbuf_release(&sb7);
-	strbuf_release(sp1);
-	strbuf_release(sp3);
-	strbuf_release(sp6);
-
-	use_after(&sb4);
-
-	if (when_strict())
-		return;
-}
-
-void test_other(void)
-{
-}
-
-void test_worktrees(void)
-{
-	struct worktree **w4;
-
-	w4 = get_worktrees();
-
-	use_it(w4);
-
-	free_worktrees(w4);
-}
diff --git a/contrib/coccinelle/unused.cocci b/contrib/coccinelle/unused.cocci
deleted file mode 100644
index d84046f82e..0000000000
--- a/contrib/coccinelle/unused.cocci
+++ /dev/null
@@ -1,43 +0,0 @@ 
-// This rule finds sequences of "unused" declerations and uses of a
-// variable, where "unused" is defined to include only calling the
-// equivalent of alloc, init & free functions on the variable.
-@@
-type T;
-identifier I;
-// STRBUF_INIT, but also e.g. STRING_LIST_INIT_DUP (so no anchoring)
-constant INIT_MACRO =~ "_INIT";
-identifier MALLOC1 =~ "^x?[mc]alloc$";
-identifier INIT_ASSIGN1 =~ "^get_worktrees$";
-identifier INIT_CALL1 =~ "^[a-z_]*_init$";
-identifier REL1 =~ "^[a-z_]*_(release|reset|clear|free)$";
-identifier REL2 =~ "^(release|clear|free)_[a-z_]*$";
-@@
-
-(
-- T I;
-|
-- T I = { 0 };
-|
-- T I = INIT_MACRO;
-|
-- T I = MALLOC1(...);
-|
-- T I = INIT_ASSIGN1(...);
-)
-
-<... when != \( I \| &I \)
-(
-- \( INIT_CALL1 \)( \( I \| &I \), ...);
-|
-- I = \( INIT_ASSIGN1 \)(...);
-|
-- I = MALLOC1(...);
-)
-...>
-
-(
-- \( REL1 \| REL2 \)( \( I \| &I \), ...);
-|
-- \( REL1 \| REL2 \)( \( &I \| I \) );
-)
-  ... when != \( I \| &I \)