Message ID | 20250324021101.7483-1-lucasseikioshiro@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [GSoC,v2] userdiff: add builtin driver for gitconfig syntax | expand |
Am 24.03.25 um 03:11 schrieb Lucas Seiki Oshiro: > From Documentation/config.adoc: > > Add a new builtin driver for gitconfig files, where: > > - the funcname regular expression matches sections and subsections, > i. e. the pattern [SECTION] or [SECTION "SUBSECTION"], where the > section is composed by alphanumeric numbers, `-` and `.`, and > subsection names may be composed by any characters; > > - word_regex is more permissive than the syntax specification, matching > any word with one or more non-whitespace characters without checking > if it is a valid variable name or value. > > A more detailed description on the format of gitconfig syntax can be > seen by running `git show cfd409:Documentation/config.txt`. Can we please have a more recent reference? The difference of config.txt here and config.adoc above is very surprising. > Also add tests for the new userdiff driver. These files define sections > and subsections, with and without indentation. > > Helped-by: Patrick Steinhardt <ps@pks.im> > Helped-by: D. Ben Knoble <ben.knoble@gmail.com> > Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> Thank you for your contribution. The file format of .git/config files isn't specific to .git/config; it's called "ini-file" and is already very old. Wouldn't it make sense to generalize the format? It would be just a matter of choosing a different name; the regular expressions would not have to change. > diff --git a/t/t4018/gitconfig-section b/t/t4018/gitconfig-section > new file mode 100644 > index 0000000000..18c85eb613 > --- /dev/null > +++ b/t/t4018/gitconfig-section > @@ -0,0 +1,6 @@ > +[RIGHT] > + # comment > + ; comment > + name = value > + ChangeMe > + This could test two sections in a row and ensure that the later one is chosen. You have now managed to avoid the "No newline at end of file", but have added a blank line instead. Not a big deal, but unconventional. > diff --git a/t/t4018/gitconfig-section-noindent b/t/t4018/gitconfig-section-noindent > new file mode 100644 > index 0000000000..5c58a7ac92 > --- /dev/null > +++ b/t/t4018/gitconfig-section-noindent > @@ -0,0 +1,6 @@ > +[RIGHT] > +# comment > +; comment > +name = value > +ChangeMe > + > diff --git a/t/t4018/gitconfig-subsection b/t/t4018/gitconfig-subsection > new file mode 100644 > index 0000000000..569be04a32 > --- /dev/null > +++ b/t/t4018/gitconfig-subsection > @@ -0,0 +1,8 @@ > +[LEFT] > + > +[LEFT "RIGHT"] > + # comment > + ; comment > + name = value > + ChangeMe > + This could test two sub-sections in a row and ensure that the later one is chosen. What happens if there is an *indented* header after the "RIGHT" one? Should it be chosen or not? Can this happen in a valid file? > diff --git a/t/t4018/gitconfig-subsection-noindent b/t/t4018/gitconfig-subsection-noindent > new file mode 100644 > index 0000000000..85c5074f47 > --- /dev/null > +++ b/t/t4018/gitconfig-subsection-noindent > @@ -0,0 +1,8 @@ > +[LEFT] > + > +[LEFT "RIGHT"] > +# comment > +; comment > +name = value > +ChangeMe > + > diff --git a/userdiff.c b/userdiff.c > index 340c4eb4f7..5bbcc2b690 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -198,6 +198,10 @@ IPATTERN("fountain", > "^((\\.[^.]|(int|ext|est|int\\.?/ext|i/e)[. ]).*)$", > /* -- */ > "[^ \t-]+"), > +PATTERNS("gitconfig", > + "^\\[[a-zA-Z0-9]+\\]|\\[[a-zA-Z0-9]+[ \t]+\".+\"\\]$", The regular expression can assume that the syntax of the processed file is correct. For example, [!not a section!] cannot be a section header and will not occur in a valid file. Or can it? Therefore, it would be sufficient to just take everything after the '[' at the beginning of the line without further inspection. Furthermore, a valid file can look like this: [section] key = value another_key = more values but your pattern would not pick up this header, because it insists in that the closing bracket is at the end of the line. Having a test for this case would be great. > + /* -- */ > + "[^ \t]+"), > PATTERNS("golang", > /* Functions */ > "^[ \t]*(func[ \t]*.*(\\{[ \t]*)?)\n" -- Hannes
Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> writes: > This v2 removes the quoted text from the commit message, using a reference to > the documentation file and the commit that last changed the lines where it was > copied before. > > I'm also adding the missing newlines at the end of the test files. > > t/t4018/gitconfig-section | 6 ++++++ > t/t4018/gitconfig-section-noindent | 6 ++++++ > t/t4018/gitconfig-subsection | 8 ++++++++ > t/t4018/gitconfig-subsection-noindent | 8 ++++++++ > userdiff.c | 4 ++++ > 5 files changed, 32 insertions(+) > create mode 100644 t/t4018/gitconfig-section > create mode 100644 t/t4018/gitconfig-section-noindent > create mode 100644 t/t4018/gitconfig-subsection > create mode 100644 t/t4018/gitconfig-subsection-noindent .git/rebase-apply/patch:83: indent with spaces. "^\\[[a-zA-Z0-9]+\\]|\\[[a-zA-Z0-9]+[ \t]+\".+\"\\]$", .git/rebase-apply/patch:84: indent with spaces. /* -- */ .git/rebase-apply/patch:85: indent with spaces. "[^ \t]+"), warning: 3 lines applied after fixing whitespace errors. Applying: userdiff: add builtin driver for gitconfig syntax
On Mon, Mar 24, 2025 at 3:18 AM Johannes Sixt <j6t@kdbg.org> wrote: > > > You have now managed to avoid the "No newline at end of file", but have > added a blank line instead. Not a big deal, but unconventional. > I missed the opportunity to reply to the original patch on this front, but here's the information I usually give out: - VS Code's default settings do not respect POSIX end-of-file newline standards, so you should probably change them [1] - There's a similar setting for Intellij IDEA [2] and Vim [3], but they might have saner defaults (Vim certainly does) - If your editor or IDE picks up on editorconfig settings, you'll be able to avoid this - Git can warn you with diff or show saying "No newline at end of file," so a careful self-review also catches these. [1]: https://stackoverflow.com/q/44704968/4400820 [2]: https://stackoverflow.com/q/16761227/4400820 [3]: https://vimhelp.org/options.txt.html#%27endofline%27
> Can we please have a more recent reference? The difference of config.txt > here and config.adoc above is very surprising. Hmmmm... My idea was to reference the last change in the paragraphs of the documentation, but I'll change to the last change of this file. > The file format of .git/config files isn't specific to .git/config; it's > called "ini-file" and is already very old. Wouldn't it make sense to > generalize the format? It would be just a matter of choosing a different > name; the regular expressions would not have to change. Indeed. This is was written having the gitconfig in mind, but perhaps I could use a different approach and make a little more flexible for other INI flavors and perhaps even TOML > This could test two sub-sections in a row and ensure that the later one > is chosen. Nice! I'll do that. > What happens if there is an *indented* header after the "RIGHT" one? > Should it be chosen or not? Can this happen in a valid file? I just tested here, it is valid file. I'll take indentation into acoount in v3. > The regular expression can assume that the syntax of the processed file > is correct. For example, > > [!not a section!] > cannot be a section header and will not occur in a valid file. Or can it? Following the gitconfig syntax specification it can't be in a valid file, and this regex won't match it. > Therefore, it would be sufficient to just take everything after the '[' > at the beginning of the line without further inspection. I can't see any harm in just dropping the section name matching and using a generic /\[.+\]/. It may be also useful for more generic INI files that you mentioned before. > Furthermore, a valid file can look like this: > > [section] key = value > another_key = more values To be honest, I didn't know that it could was a valid file. I'll include that in a v3. Thanks for your extensive review!
diff --git a/t/t4018/gitconfig-section b/t/t4018/gitconfig-section new file mode 100644 index 0000000000..18c85eb613 --- /dev/null +++ b/t/t4018/gitconfig-section @@ -0,0 +1,6 @@ +[RIGHT] + # comment + ; comment + name = value + ChangeMe + diff --git a/t/t4018/gitconfig-section-noindent b/t/t4018/gitconfig-section-noindent new file mode 100644 index 0000000000..5c58a7ac92 --- /dev/null +++ b/t/t4018/gitconfig-section-noindent @@ -0,0 +1,6 @@ +[RIGHT] +# comment +; comment +name = value +ChangeMe + diff --git a/t/t4018/gitconfig-subsection b/t/t4018/gitconfig-subsection new file mode 100644 index 0000000000..569be04a32 --- /dev/null +++ b/t/t4018/gitconfig-subsection @@ -0,0 +1,8 @@ +[LEFT] + +[LEFT "RIGHT"] + # comment + ; comment + name = value + ChangeMe + diff --git a/t/t4018/gitconfig-subsection-noindent b/t/t4018/gitconfig-subsection-noindent new file mode 100644 index 0000000000..85c5074f47 --- /dev/null +++ b/t/t4018/gitconfig-subsection-noindent @@ -0,0 +1,8 @@ +[LEFT] + +[LEFT "RIGHT"] +# comment +; comment +name = value +ChangeMe + diff --git a/userdiff.c b/userdiff.c index 340c4eb4f7..5bbcc2b690 100644 --- a/userdiff.c +++ b/userdiff.c @@ -198,6 +198,10 @@ IPATTERN("fountain", "^((\\.[^.]|(int|ext|est|int\\.?/ext|i/e)[. ]).*)$", /* -- */ "[^ \t-]+"), +PATTERNS("gitconfig", + "^\\[[a-zA-Z0-9]+\\]|\\[[a-zA-Z0-9]+[ \t]+\".+\"\\]$", + /* -- */ + "[^ \t]+"), PATTERNS("golang", /* Functions */ "^[ \t]*(func[ \t]*.*(\\{[ \t]*)?)\n"
From Documentation/config.adoc: Add a new builtin driver for gitconfig files, where: - the funcname regular expression matches sections and subsections, i. e. the pattern [SECTION] or [SECTION "SUBSECTION"], where the section is composed by alphanumeric numbers, `-` and `.`, and subsection names may be composed by any characters; - word_regex is more permissive than the syntax specification, matching any word with one or more non-whitespace characters without checking if it is a valid variable name or value. A more detailed description on the format of gitconfig syntax can be seen by running `git show cfd409:Documentation/config.txt`. Also add tests for the new userdiff driver. These files define sections and subsections, with and without indentation. Helped-by: Patrick Steinhardt <ps@pks.im> Helped-by: D. Ben Knoble <ben.knoble@gmail.com> Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com> --- Hi! This v2 removes the quoted text from the commit message, using a reference to the documentation file and the commit that last changed the lines where it was copied before. I'm also adding the missing newlines at the end of the test files. t/t4018/gitconfig-section | 6 ++++++ t/t4018/gitconfig-section-noindent | 6 ++++++ t/t4018/gitconfig-subsection | 8 ++++++++ t/t4018/gitconfig-subsection-noindent | 8 ++++++++ userdiff.c | 4 ++++ 5 files changed, 32 insertions(+) create mode 100644 t/t4018/gitconfig-section create mode 100644 t/t4018/gitconfig-section-noindent create mode 100644 t/t4018/gitconfig-subsection create mode 100644 t/t4018/gitconfig-subsection-noindent