Message ID | 20250107091932.126673-2-kuforiji98@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | t/unit-tests: convert hash tests to use clar | expand |
Seyi Kuforiji <kuforiji98@gmail.com> writes: > The `generate-clar-decls.sh` script extracts signatures of test > functions from our unit tests, which will later get used by the clar to > automatically wire up tests. The sed command only matches lines that > ended immediately after `void)`, causing it to miss declarations with > additional content such as comments or annotations. > > Relax the regular expression by making it match lines with trailing data > after the function signature. This ensures that all valid function > declarations are captured and formatted as `extern` declarations > regardless of their formatting style, improving the robustness of the > script when parsing `$suite` files. > > This will be used in subsequent commits to match and capture the > function signature correctly, regardless of any trailing content. I am not sure if this is going in the right direction, though. Especially for things like test suites that are looked at and worked on only by develoeprs *and* these tools, being uniform and consistent weighs more than being more flexible. Let me state it in another way. How many of the existing test pieces are picked up by the current pattern, and among them how many of them would see vast improvements if they are allowed to have arbitrary garbage after their "I do not take any arguments" function signature? Are new tests you are migrating from outside the clar world lose a lot if they are no longer allowed to have comments there, or would it be suffice to have the comments before the functions (which many of our function definition do anyway)? A quick peek at [PATCH 2/2] tells me that this is not even something that would make it easier to port the existing tests by allowing more straight line-by-line copies or something. The patch splits many in-line test pieces in the "main" into separate functions, and it does so in a rather unusual format, e.g., void test_hash__multi_character(void) TEST_HASH_STR("abc", "a9993e364706816aba3e25717850c26c9cd0d89d", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad") where TEST_HASH_STR() expands to the function body that starts with a "{" and ends with a "}". It can well be written more like void test_hash__multi_character(void) { TEST_HASH_STR("abc", "a9993e364706816aba3e25717850c26c9cd0d89d", "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"); } and we do not need this step at all if we did so. Such a construct would be a lot friendlier to the editors that auto-indent, too. So, I do not quite see much value in this particular change. > Mentored-by: Patrick Steinhardt <ps@pks.im> > Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> > --- > t/unit-tests/generate-clar-decls.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/t/unit-tests/generate-clar-decls.sh b/t/unit-tests/generate-clar-decls.sh > index 3b315c64b3..02e45cf0ba 100755 > --- a/t/unit-tests/generate-clar-decls.sh > +++ b/t/unit-tests/generate-clar-decls.sh > @@ -14,6 +14,6 @@ do > suite_name=$(basename "$suite") > suite_name=${suite_name%.c} > suite_name=${suite_name#u-} > - sed -ne "s/^\(void test_${suite_name}__[a-zA-Z_0-9][a-zA-Z_0-9]*(void)\)$/extern \1;/p" "$suite" || > + sed -ne "s/^\(void test_${suite_name}__[a-zA-Z_0-9][a-zA-Z_0-9]*(void)\).*/extern \1;/p" "$suite" || > exit 1 > done >"$OUTPUT" > -- > 2.34.1
Junio C Hamano <gitster@pobox.com> writes: > A quick peek at [PATCH 2/2] tells me that this is not even something > that would make it easier to port the existing tests by allowing > more straight line-by-line copies or something. The patch splits > many in-line test pieces in the "main" into separate functions, and > it does so in a rather unusual format, e.g., > > void test_hash__multi_character(void) TEST_HASH_STR("abc", > "a9993e364706816aba3e25717850c26c9cd0d89d", > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad") > > where TEST_HASH_STR() expands to the function body that starts with > a "{" and ends with a "}". It can well be written more like > > void test_hash__multi_character(void) > { > TEST_HASH_STR("abc", > "a9993e364706816aba3e25717850c26c9cd0d89d", > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"); > } > > and we do not need this step at all if we did so. Such a construct > would be a lot friendlier to the editors that auto-indent, too. > > So, I do not quite see much value in this particular change. Having said that, if this were more like that you write a series of DEF_HASH_TEST(multi_character, "abc", "a9993e...", "ba7816bf...") and they expand to void test_hash__multi_character(void) { const char *expected[] = {"a9993e...", "ba7816bf..."}; check_hash_data("abc", strlen("abc"), expected); } then a preparatory step like this patch _might_ be justifiable. You may want to avoid having to write too many boilerplate, and a special rule to find "DEF_HASH_TEST(name, ...)" and it might make sense to add support to extract the name of the test function being defined by the macro automatically. Not that I think such a sequence of DEF_HASH_TEST(), one per line, is an improvement at all (it also is unfriendly to editors that auto-indent the same way as your original version). I just wanted to say that a change to the pattern to pick up the function name may be justifiable if it were so. Thanks.
On Tue, Jan 07, 2025 at 10:16:33AM -0800, Junio C Hamano wrote: > Seyi Kuforiji <kuforiji98@gmail.com> writes: > > > The `generate-clar-decls.sh` script extracts signatures of test > > functions from our unit tests, which will later get used by the clar to > > automatically wire up tests. The sed command only matches lines that > > ended immediately after `void)`, causing it to miss declarations with > > additional content such as comments or annotations. > > > > Relax the regular expression by making it match lines with trailing data > > after the function signature. This ensures that all valid function > > declarations are captured and formatted as `extern` declarations > > regardless of their formatting style, improving the robustness of the > > script when parsing `$suite` files. > > > > This will be used in subsequent commits to match and capture the > > function signature correctly, regardless of any trailing content. > > I am not sure if this is going in the right direction, though. > > Especially for things like test suites that are looked at and worked > on only by develoeprs *and* these tools, being uniform and consistent > weighs more than being more flexible. > > Let me state it in another way. How many of the existing test > pieces are picked up by the current pattern, and among them how many > of them would see vast improvements if they are allowed to have > arbitrary garbage after their "I do not take any arguments" function > signature? Are new tests you are migrating from outside the clar > world lose a lot if they are no longer allowed to have comments > there, or would it be suffice to have the comments before the > functions (which many of our function definition do anyway)? > > A quick peek at [PATCH 2/2] tells me that this is not even something > that would make it easier to port the existing tests by allowing > more straight line-by-line copies or something. The patch splits > many in-line test pieces in the "main" into separate functions, and > it does so in a rather unusual format, e.g., > > void test_hash__multi_character(void) TEST_HASH_STR("abc", > "a9993e364706816aba3e25717850c26c9cd0d89d", > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad") > > where TEST_HASH_STR() expands to the function body that starts with > a "{" and ends with a "}". It can well be written more like > > void test_hash__multi_character(void) > { > TEST_HASH_STR("abc", > "a9993e364706816aba3e25717850c26c9cd0d89d", > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"); > } > > and we do not need this step at all if we did so. Such a construct > would be a lot friendlier to the editors that auto-indent, too. > > So, I do not quite see much value in this particular change. Yeah. This was something I proposed, but I already mentioned to Seyi that I'm not all that happy with the outcome as it has a couple of downsides, for example broken syntax highlighting in lots of editors. I said he can send that version to the mailing list anyway and get some feedback on it to figure out whether my discomfort with my own idea is warranted or not. And your comment here basically confirms that my idea wasn't that great after all :) So I agree with you, let's scrap the idea and have proper function bodies instead. Patrick
On Wed, 8 Jan 2025 at 07:14, Patrick Steinhardt <ps@pks.im> wrote: > > On Tue, Jan 07, 2025 at 10:16:33AM -0800, Junio C Hamano wrote: > > Seyi Kuforiji <kuforiji98@gmail.com> writes: > > > > > The `generate-clar-decls.sh` script extracts signatures of test > > > functions from our unit tests, which will later get used by the clar to > > > automatically wire up tests. The sed command only matches lines that > > > ended immediately after `void)`, causing it to miss declarations with > > > additional content such as comments or annotations. > > > > > > Relax the regular expression by making it match lines with trailing data > > > after the function signature. This ensures that all valid function > > > declarations are captured and formatted as `extern` declarations > > > regardless of their formatting style, improving the robustness of the > > > script when parsing `$suite` files. > > > > > > This will be used in subsequent commits to match and capture the > > > function signature correctly, regardless of any trailing content. > > > > I am not sure if this is going in the right direction, though. > > > > Especially for things like test suites that are looked at and worked > > on only by develoeprs *and* these tools, being uniform and consistent > > weighs more than being more flexible. > > > > Let me state it in another way. How many of the existing test > > pieces are picked up by the current pattern, and among them how many > > of them would see vast improvements if they are allowed to have > > arbitrary garbage after their "I do not take any arguments" function > > signature? Are new tests you are migrating from outside the clar > > world lose a lot if they are no longer allowed to have comments > > there, or would it be suffice to have the comments before the > > functions (which many of our function definition do anyway)? > > > > A quick peek at [PATCH 2/2] tells me that this is not even something > > that would make it easier to port the existing tests by allowing > > more straight line-by-line copies or something. The patch splits > > many in-line test pieces in the "main" into separate functions, and > > it does so in a rather unusual format, e.g., > > > > void test_hash__multi_character(void) TEST_HASH_STR("abc", > > "a9993e364706816aba3e25717850c26c9cd0d89d", > > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad") > > > > where TEST_HASH_STR() expands to the function body that starts with > > a "{" and ends with a "}". It can well be written more like > > > > void test_hash__multi_character(void) > > { > > TEST_HASH_STR("abc", > > "a9993e364706816aba3e25717850c26c9cd0d89d", > > "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"); > > } > > > > and we do not need this step at all if we did so. Such a construct > > would be a lot friendlier to the editors that auto-indent, too. > > > > So, I do not quite see much value in this particular change. > > Yeah. This was something I proposed, but I already mentioned to Seyi > that I'm not all that happy with the outcome as it has a couple of > downsides, for example broken syntax highlighting in lots of editors. I > said he can send that version to the mailing list anyway and get some > feedback on it to figure out whether my discomfort with my own idea is > warranted or not. And your comment here basically confirms that my idea > wasn't that great after all :) > > So I agree with you, let's scrap the idea and have proper function > bodies instead. > > Patrick Thank you for the feedback and insights. I'll update the patch to use standard function bodies. Thanks Seyi
Patrick Steinhardt <ps@pks.im> writes: > So I agree with you, let's scrap the idea and have proper function > bodies instead. Yup, sometimes, simple, stupid, and good enough is the way to go. We could do -- >8 -- #define T(testname, input, expect1, expect256) \ void test_hash__ ## testname(void) \ { \ const char *expect[] = { expect1, expect256 }; \ check_hash_data(input, strlen(input), expect); \ } extern void test_hash__ ## testname() T(empty_string, "", "da39...", "e3b0c4..."); T(single_character, "a", "86f7e4...", "ca97811..."); -- 8< -- which may not upset syntax-aware editors too much. Unless there are more than several dozens of them, I do not think it is worth it, though ;-)
On Wed, Jan 08, 2025 at 07:27:37AM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > So I agree with you, let's scrap the idea and have proper function > > bodies instead. > > Yup, sometimes, simple, stupid, and good enough is the way to go. > > We could do > > -- >8 -- > > #define T(testname, input, expect1, expect256) \ > void test_hash__ ## testname(void) \ > { \ > const char *expect[] = { expect1, expect256 }; \ > check_hash_data(input, strlen(input), expect); \ > } extern void test_hash__ ## testname() > > T(empty_string, "", "da39...", "e3b0c4..."); > T(single_character, "a", "86f7e4...", "ca97811..."); > > -- 8< -- > > which may not upset syntax-aware editors too much. > > Unless there are more than several dozens of them, I do not think it > is worth it, though ;-) Agreed, I'd go with the simple solution for now, which is to have function bodies. Patrick
diff --git a/t/unit-tests/generate-clar-decls.sh b/t/unit-tests/generate-clar-decls.sh index 3b315c64b3..02e45cf0ba 100755 --- a/t/unit-tests/generate-clar-decls.sh +++ b/t/unit-tests/generate-clar-decls.sh @@ -14,6 +14,6 @@ do suite_name=$(basename "$suite") suite_name=${suite_name%.c} suite_name=${suite_name#u-} - sed -ne "s/^\(void test_${suite_name}__[a-zA-Z_0-9][a-zA-Z_0-9]*(void)\)$/extern \1;/p" "$suite" || + sed -ne "s/^\(void test_${suite_name}__[a-zA-Z_0-9][a-zA-Z_0-9]*(void)\).*/extern \1;/p" "$suite" || exit 1 done >"$OUTPUT"
The `generate-clar-decls.sh` script extracts signatures of test functions from our unit tests, which will later get used by the clar to automatically wire up tests. The sed command only matches lines that ended immediately after `void)`, causing it to miss declarations with additional content such as comments or annotations. Relax the regular expression by making it match lines with trailing data after the function signature. This ensures that all valid function declarations are captured and formatted as `extern` declarations regardless of their formatting style, improving the robustness of the script when parsing `$suite` files. This will be used in subsequent commits to match and capture the function signature correctly, regardless of any trailing content. Mentored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com> --- t/unit-tests/generate-clar-decls.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.34.1