
[v2,07/10] generate-cmdlist.sh: stop sorting category lines

Message ID patch-v2-07.10-f2f37c2963b-20211022T193027Z-avarab@gmail.com
State New, archived
Series Makefile: make generate-cmdlist.sh much faster

Commit Message

Ævar Arnfjörð Bjarmason Oct. 22, 2021, 7:36 p.m. UTC
In a preceding commit we changed the print_command_list() loop to use
printf's auto-repeat feature. Let's now get rid of get_category_line()
entirely by not sorting the categories.

This will change the output of the generated code from e.g.:

    -       { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_complete | CAT_plumbingmanipulators },

To:

    +       { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_plumbingmanipulators | CAT_complete },

I.e. the categories are no longer sorted, but as they're OR'd together
it won't matter for the end result.
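
To illustrate why the order is irrelevant, here is a minimal sketch in
POSIX sh. The bit values assigned to the two CAT_* names are made up for
the example; they are not taken from the generated header.

```shell
# Hypothetical bit values, chosen only for this illustration.
CAT_complete=1
CAT_plumbingmanipulators=2

# Same flags, OR'd in the old (sorted) and the new (unsorted) order.
sorted=$(( 0 | CAT_complete | CAT_plumbingmanipulators ))
unsorted=$(( 0 | CAT_plumbingmanipulators | CAT_complete ))

# "sort -u" also dropped duplicates; a repeated flag changes nothing either.
dup=$(( 0 | CAT_complete | CAT_complete ))

echo "$sorted $unsorted $dup"   # prints: 3 3 1
```

Bitwise OR is commutative and idempotent, so neither the ordering nor
the de-duplication that "sort -u" provided affects the resulting mask.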

This speeds up generate-cmdlist.sh a bit. Comparing this code against
HEAD~ (old) and "master":

  'sh generate-cmdlist.sh command-list.txt' ran
    1.07 ± 0.33 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
    1.15 ± 0.36 times faster than 'sh generate-cmdlist.sh.master command-list.txt'

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 generate-cmdlist.sh | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

Comments

Jeff King Oct. 25, 2021, 4:39 p.m. UTC | #1
On Fri, Oct 22, 2021 at 09:36:11PM +0200, Ævar Arnfjörð Bjarmason wrote:

> In a preceding commit we changed the print_command_list() loop to use
> printf's auto-repeat feature. Let's now get rid of get_category_line()
> entirely by not sorting the categories.
> 
> This will change the output of the generated code from e.g.:
> 
>     -       { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_complete | CAT_plumbingmanipulators },
> 
> To:
> 
>     +       { "git-apply", N_("Apply a patch to files and/or to the index"), 0 | CAT_plumbingmanipulators | CAT_complete },
> 
> I.e. the categories are no longer sorted, but as they're OR'd together
> it won't matter for the end result.

Thanks for picking this up. The commit message here is well explained.

> This speeds up generate-cmdlist.sh a bit. Comparing this code against
> HEAD~ (old) and "master":
> 
>   'sh generate-cmdlist.sh command-list.txt' ran
>     1.07 ± 0.33 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
>     1.15 ± 0.36 times faster than 'sh generate-cmdlist.sh.master command-list.txt'

Curious. I get much more dramatic results (as I'd expect, as we are
cutting out 2 of 3 process spawns in the loop):

    'sh generate-cmdlist.sh command-list.txt' ran
    2.16 ± 0.17 times faster than 'sh generate-cmdlist.sh.old command-list.txt'
    2.37 ± 0.28 times faster than 'sh generate-cmdlist.sh.master command-list.txt'

Either way, I think it's a good idea (and it paves the way for the next
patch, where we get the biggest speedup because we stop spawning any
processes at all).

-Peff
Patch

diff --git a/generate-cmdlist.sh b/generate-cmdlist.sh
index a1ab2b1f077..f50112c50f8 100755
--- a/generate-cmdlist.sh
+++ b/generate-cmdlist.sh
@@ -9,11 +9,6 @@  command_list () {
 	eval "grep -ve '^#' $exclude_programs" <"$1"
 }
 
-get_category_line () {
-	tr ' ' '\012' |
-	LC_ALL=C sort -u
-}
-
 category_list () {
 	command_list "$1" |
 	cut -c 40- |
@@ -67,7 +62,7 @@  print_command_list () {
 	while read cmd rest
 	do
 		printf "	{ \"$cmd\", $(get_synopsis $cmd), 0"
-		printf " | CAT_%s" $(echo "$rest" | get_category_line)
+		printf " | CAT_%s" $rest
 		echo " },"
 	done
 	echo "};"
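
For reference, the effect of the one-line change in print_command_list()
can be reproduced outside the script. This is only a sketch: the sample
value of "rest" is an assumption modeled on the git-apply example in the
commit message, and the old pipeline is inlined here instead of calling
get_category_line().

```shell
# Sample category field; "rest" mirrors the loop variable in the patch.
rest="plumbingmanipulators complete"

# Old form: split on spaces and sort, which runs tr and sort for each line.
old=$(printf " | CAT_%s" $(echo "$rest" | tr ' ' '\012' | LC_ALL=C sort -u))

# New form: the unquoted $rest is word-split by the shell, and printf
# reuses the format string once per argument.
new=$(printf " | CAT_%s" $rest)

echo "old:$old"   # old: | CAT_complete | CAT_plumbingmanipulators
echo "new:$new"   # new: | CAT_plumbingmanipulators | CAT_complete
```

Both forms emit the same set of CAT_* terms; only the order differs, and
the new form needs no extra processes beyond the shell itself.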