Message ID | 20230420205350.600760-1-szeder.dev@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 138ef8068c54d72c9bd1b09753408fde7750ec86 |
Headers | show |
Series | cocci: remove 'unused.cocci' | expand |
SZEDER Gábor <szeder.dev@gmail.com> writes: > When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a > rule to find "unused" strbufs, 2022-07-05) it found three unused > strbufs, and when it was generalized in the next commit it managed to > find an unused string_list as well. That's four unused variables in > over 17 years, so apparently we rarely make this mistake. > > Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it > increases the from-scratch runtime of 'make coccicheck' by over 5:30 > minutes or over 160%: > ... > That's a lot of runtime spent for not much in return, and arguably an > unused struct instance sneaking in is not that big of a deal to > justify the significantly increased runtime. > > Remove 'unused.cocci', because we are not getting our CPU cycles' > worth. Will queue. Thanks.
On Thu, Apr 20 2023, SZEDER Gábor wrote: > When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a > rule to find "unused" strbufs, 2022-07-05) it found three unused > strbufs, and when it was generalized in the next commit it managed to > find an unused string_list as well. That's four unused variables in > over 17 years, so apparently we rarely make this mistake. > > Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it > increases the from-scratch runtime of 'make coccicheck' by over 5:30 > minutes or over 160%: > > $ make -s cocciclean > $ time make -s coccicheck > * new spatch flags > > real 8m56.201s > user 0m0.420s > sys 0m0.406s > $ rm contrib/coccinelle/unused.cocci contrib/coccinelle/tests/unused.* > $ make -s cocciclean > $ time make -s coccicheck > * new spatch flags > > real 3m23.893s > user 0m0.228s > sys 0m0.247s > > That's a lot of runtime spent for not much in return, and arguably an > unused struct instance sneaking in is not that big of a deal to > justify the significantly increased runtime. > > Remove 'unused.cocci', because we are not getting our CPU cycles' > worth. It wasn't something I intended at the time, but arguably the main use of this rule since it was added was that it served as a canary for the tree becoming completely broken with coccinelle, due to adding C syntax it didn't understand: https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ So, whatever you think of of how worthwhile it is to spot unused variables, I think that weighs heavily in its favor. There *are* other ways to detect those sorts of issues, but as it's currently our only canary for that issue I don't thin we should remove it. If we hadn't had unused.cocci we wouldn't be able to apply rules the functions that use "UNUSED", which have increased a lot in number since then, and we wouldn't have any way of spotting similar parsing issues. But it's unfortunate that it's this slow in the from-scratch case. When we last discussed this I pointed out to you that the main contribution to the runtime of unused.cocci is parsing builtin/{log,rebase}.c is pathalogical, but in your reply to that you seem to not have spotted that (or glossed over it): https://lore.kernel.org/git/20220831180526.GA1802@szeder.dev/ When I test this locally, doing: time make contrib/coccinelle/unused.cocci.patch SPATCH=spatch SPATCH_USE_O_DEPENDENCIES= Takes ~2m, but if I first do: >builtin/log.c; >builtin/rebase.c It takes ~1m. So, even without digging into those issues, if we just skipped those two files we'd speed this part up by 100%. I think such an approach would be much better than just removing this outright, which feels rather heavy-handed. We could formalize that by creating a "coccicheck-full" category or whatever, just as we now have "coccicheck-pending". Then I and the CI could run "full", and you could run "coccicheck" (or maybe we'd call that "coccicheck-cheap" or something). I also submitted patches to both make "coccicheck" incremental, and added an "spatchcache", both of which have since been merged (that tool is in contrib/). I understand from previous discussion that you wanted to use "make -s" all the time, but does your use-case also preclude using spatchcache? I run "coccicheck" a lot, and haven't personally be bothered by this particular slowdown since that got merged, since it'll only affect me in the cases where builtin/{log,rebase}.c and a small list of other files are changed, and it's otherwise unnoticeable. It would also be rather trivial to just add some way to specify patterns on the "make" command-line that we'd "$(filter-out)", would that also address your particular use-case? I.e.: make coccicheck COCCI_RULES_EXCLUDE=*unused* Or whatever.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > It wasn't something I intended at the time, but arguably the main use of > this rule since it was added was that it served as a canary for the tree > becoming completely broken with coccinelle, due to adding C syntax it > didn't understand: > https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ If it weren't Coccinelle, we could have used the much nicer looking UNUSED(var) notation, and the compilers were all fine. Only because Coccinelle did not understand the "cute" syntax trick, we couldn't. Yes, it caught us when we used a syntax it couldn't understand, but is that a good thing in the first place?
On Mon, May 01 2023, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >> It wasn't something I intended at the time, but arguably the main use of >> this rule since it was added was that it served as a canary for the tree >> becoming completely broken with coccinelle, due to adding C syntax it >> didn't understand: >> https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ > > If it weren't Coccinelle, we could have used the much nicer looking > UNUSED(var) notation, and the compilers were all fine. > > Only because Coccinelle did not understand the "cute" syntax trick, > we couldn't. Yes, it caught us when we used a syntax it couldn't > understand, but is that a good thing in the first place? I think it's unambiguously a good thing that we spotted an otherwise unknown side-effect of the proposed UNUSED(var) syntax on coccinelle. We might also say that some bit of syntax that coccinelle doesn't understand is so valuable that we'd like to make coccinelle itself significantly less useful (as it wouldn't reach into those functions), or stop using it altogether. But that's a seperate question. I'm just pointing out that we'd be losing a very valuable check on future syntax incompatibilities, particularly when it comes to clever use of macros. A better way to spot that would be to start parsing the coccinelle logs, and detect when we have unknown parsing issues, and error on those. But until then...
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > A better way to spot that would be to start parsing the coccinelle logs, > and detect when we have unknown parsing issues, and error on those. But > until then... Until then, I do not think a rather costly test that has found only 4 instances of the mistakes the test was designed to find is a good way to stand in as a replacement. Let's drop it, as it is easy to resurrect if somebody wants to run it from time to time from an old version of Git. Or is it a valid alternative to move it to "pending"? Thanks.
diff --git a/contrib/coccinelle/tests/unused.c b/contrib/coccinelle/tests/unused.c deleted file mode 100644 index 8294d734ba..0000000000 --- a/contrib/coccinelle/tests/unused.c +++ /dev/null @@ -1,82 +0,0 @@ -void test_strbuf(void) -{ - struct strbuf sb1 = STRBUF_INIT; - struct strbuf sb2 = STRBUF_INIT; - struct strbuf sb3 = STRBUF_INIT; - struct strbuf sb4 = STRBUF_INIT; - struct strbuf sb5; - struct strbuf sb6 = { 0 }; - struct strbuf sb7 = STRBUF_INIT; - struct strbuf sb8 = STRBUF_INIT; - struct strbuf *sp1; - struct strbuf *sp2; - struct strbuf *sp3; - struct strbuf *sp4 = xmalloc(sizeof(struct strbuf)); - struct strbuf *sp5 = xmalloc(sizeof(struct strbuf)); - struct strbuf *sp6 = xmalloc(sizeof(struct strbuf)); - struct strbuf *sp7; - - strbuf_init(&sb5, 0); - strbuf_init(sp1, 0); - strbuf_init(sp2, 0); - strbuf_init(sp3, 0); - strbuf_init(sp4, 0); - strbuf_init(sp5, 0); - strbuf_init(sp6, 0); - strbuf_init(sp7, 0); - sp7 = xmalloc(sizeof(struct strbuf)); - - use_before(&sb3); - use_as_str("%s", sb7.buf); - use_as_str("%s", sp1->buf); - use_as_str("%s", sp6->buf); - pass_pp(&sp3); - - strbuf_release(&sb1); - strbuf_reset(&sb2); - strbuf_release(&sb3); - strbuf_release(&sb4); - strbuf_release(&sb5); - strbuf_release(&sb6); - strbuf_release(&sb7); - strbuf_release(sp1); - strbuf_release(sp2); - strbuf_release(sp3); - strbuf_release(sp4); - strbuf_release(sp5); - strbuf_release(sp6); - strbuf_release(sp7); - - use_after(&sb4); - - if (when_strict()) - return; - strbuf_release(&sb8); -} - -void test_other(void) -{ - struct string_list l = STRING_LIST_INIT_DUP; - struct strbuf sb = STRBUF_INIT; - - string_list_clear(&l, 0); - string_list_clear(&sb, 0); -} - -void test_worktrees(void) -{ - struct worktree **w1 = get_worktrees(); - struct worktree **w2 = get_worktrees(); - struct worktree **w3; - struct worktree **w4; - - w3 = get_worktrees(); - w4 = get_worktrees(); - - use_it(w4); - - free_worktrees(w1); - free_worktrees(w2); - free_worktrees(w3); - free_worktrees(w4); -} diff --git a/contrib/coccinelle/tests/unused.res b/contrib/coccinelle/tests/unused.res deleted file mode 100644 index 6d3e745683..0000000000 --- a/contrib/coccinelle/tests/unused.res +++ /dev/null @@ -1,45 +0,0 @@ -void test_strbuf(void) -{ - struct strbuf sb3 = STRBUF_INIT; - struct strbuf sb4 = STRBUF_INIT; - struct strbuf sb7 = STRBUF_INIT; - struct strbuf *sp1; - struct strbuf *sp3; - struct strbuf *sp6 = xmalloc(sizeof(struct strbuf)); - strbuf_init(sp1, 0); - strbuf_init(sp3, 0); - strbuf_init(sp6, 0); - - use_before(&sb3); - use_as_str("%s", sb7.buf); - use_as_str("%s", sp1->buf); - use_as_str("%s", sp6->buf); - pass_pp(&sp3); - - strbuf_release(&sb3); - strbuf_release(&sb4); - strbuf_release(&sb7); - strbuf_release(sp1); - strbuf_release(sp3); - strbuf_release(sp6); - - use_after(&sb4); - - if (when_strict()) - return; -} - -void test_other(void) -{ -} - -void test_worktrees(void) -{ - struct worktree **w4; - - w4 = get_worktrees(); - - use_it(w4); - - free_worktrees(w4); -} diff --git a/contrib/coccinelle/unused.cocci b/contrib/coccinelle/unused.cocci deleted file mode 100644 index d84046f82e..0000000000 --- a/contrib/coccinelle/unused.cocci +++ /dev/null @@ -1,43 +0,0 @@ -// This rule finds sequences of "unused" declerations and uses of a -// variable, where "unused" is defined to include only calling the -// equivalent of alloc, init & free functions on the variable. -@@ -type T; -identifier I; -// STRBUF_INIT, but also e.g. STRING_LIST_INIT_DUP (so no anchoring) -constant INIT_MACRO =~ "_INIT"; -identifier MALLOC1 =~ "^x?[mc]alloc$"; -identifier INIT_ASSIGN1 =~ "^get_worktrees$"; -identifier INIT_CALL1 =~ "^[a-z_]*_init$"; -identifier REL1 =~ "^[a-z_]*_(release|reset|clear|free)$"; -identifier REL2 =~ "^(release|clear|free)_[a-z_]*$"; -@@ - -( -- T I; -| -- T I = { 0 }; -| -- T I = INIT_MACRO; -| -- T I = MALLOC1(...); -| -- T I = INIT_ASSIGN1(...); -) - -<... when != \( I \| &I \) -( -- \( INIT_CALL1 \)( \( I \| &I \), ...); -| -- I = \( INIT_ASSIGN1 \)(...); -| -- I = MALLOC1(...); -) -...> - -( -- \( REL1 \| REL2 \)( \( I \| &I \), ...); -| -- \( REL1 \| REL2 \)( \( &I \| I \) ); -) - ... when != \( I \| &I \)
When 'unused.cocci' was added in 4f40f6cb73 (cocci: add and apply a rule to find "unused" strbufs, 2022-07-05) it found three unused strbufs, and when it was generalized in the next commit it managed to find an unused string_list as well. That's four unused variables in over 17 years, so apparently we rarely make this mistake. Unfortunately, applying 'unused.cocci' is quite expensive, e.g. it increases the from-scratch runtime of 'make coccicheck' by over 5:30 minutes or over 160%: $ make -s cocciclean $ time make -s coccicheck * new spatch flags real 8m56.201s user 0m0.420s sys 0m0.406s $ rm contrib/coccinelle/unused.cocci contrib/coccinelle/tests/unused.* $ make -s cocciclean $ time make -s coccicheck * new spatch flags real 3m23.893s user 0m0.228s sys 0m0.247s That's a lot of runtime spent for not much in return, and arguably an unused struct instance sneaking in is not that big of a deal to justify the significantly increased runtime. Remove 'unused.cocci', because we are not getting our CPU cycles' worth. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> --- contrib/coccinelle/tests/unused.c | 82 ----------------------------- contrib/coccinelle/tests/unused.res | 45 ---------------- contrib/coccinelle/unused.cocci | 43 --------------- 3 files changed, 170 deletions(-) delete mode 100644 contrib/coccinelle/tests/unused.c delete mode 100644 contrib/coccinelle/tests/unused.res delete mode 100644 contrib/coccinelle/unused.cocci