mbox series

[RFC,0/4] make headway towards a clean "clang-format"

Message ID RFC-cover-0.4-00000000000-20220711T110019Z-avarab@gmail.com (mailing list archive)
Headers show
Series make headway towards a clean "clang-format" | expand

Message

Ævar Arnfjörð Bjarmason July 11, 2022, 11:37 a.m. UTC
On Sun, Jul 10 2022, brian m. carlson wrote:

> Having a code formatting tool means that the work for a contributor to
> format the file properly consists of about two keystrokes.  This
> substantially reduces the amount of time that contributors must spend
> thinking about code formatting and simplifies it to an automatic process
> that, if we choose, can even be verified in CI.

I'm generally in favor of this, or more specifically I'm generally in
favor of whatever we can do to make
Documentation/{SubmittingPatches,CodingGuidelines}, t/README,
po/README.md and whatever other "checklist" document we have more
automated.

So we could generally waste less time in asking contributors to read and
internalize these documents, or in fixing submission mistakes, and more
time on worthwhile work.

Whether using an automatic formatter would land us on the right side of
that trade-off is something I'm not sure about, but if it would: Great!

But!

> I should note that we already have a .clang-format file, so we can
> already use clang-format.  However, we cannot blindly apply it because
> it produces output that is not always conformant with our style.  My
> proposal here is to define our style in terms of the formatter to avoid
> this problem.

I think this is very premature at this point.

I experimented a while ago (low number of months) with clang-format for
the first time, and found that the delta between what it proposed as
changes and what it *should* propose was needlessly high.

This RFC is a cleaned-up & updated version of those WIP experiments I
made. I think this series is "ready" in that it's a sensible
improvement, and having a "make style-all" family of rules is very
useful to check how .clang-format behaves.

But I think we're far short of declaring that we should simply refer
to clang-format against our shipped .clang-format configuration as our
canonical style.

At the end of this series we're at a 60% shorter suggested diff from
clang-format, but that still leaves ~13k lines of diff on top of
"master".

And, as 0[2-4]/04 note adjusting those rules isn't always clear, and
there's the major question of what to do about "ColumnLimit", and
things like "BitFieldColonSpacing" which require a much newer version
of clang than we're probably comfortable with requiring.

Someone more interested than me in this (i.e. you:) should really be
building on top of this, i.e. running:

	make style-all-diff-apply

And seeing what general categories these differences fall into, and
whether clang-format has any more options that we've set in the wrong
way, or which we're missing.

There's clearly a lot of changes in the resulting diff that we should
have per our style guide, e.g. there's a lot of "}\nelse {" (bad!)
v.s. "} else {" (good!) and the like.

Before we get to a "clean diff" we could at some point "invert" our
style guide. I.e. instead of saying "code should be like XYZ", we
could say:

	run clang-format, but it might report some differences, as
	long as it's one of the below known-cases that's OK.

I.e. we could one-off search-replace all the "}\nelse {" etc. cases to
the "} else {" form, and then we wouldn't have to waste words on that
in Documentation/CodingGuidelines.

But I *really* don't think we can declare that clang-format is some
source of truth before doing much or all of that work. If you think
nits are annoying now consider e.g. this change to advice.c:
	
	diff --git a/advice.c b/advice.c
	index 6fda9edbc24..e0e453d68d1 100644
	--- a/advice.c
	+++ b/advice.c
	@@ -35,6 +35,7 @@ static struct {
	 	const char *key;
	 	int enabled;
	 } advice_setting[] = {
	+	[ADVICE_FOO]					= { "foo", 1 },f
	 	[ADVICE_ADD_EMBEDDED_REPO]			= { "addEmbeddedRepo", 1 },
	 	[ADVICE_ADD_EMPTY_PATHSPEC]			= { "addEmptyPathspec", 1 },
	 	[ADVICE_ADD_IGNORED_FILE]			= { "addIgnoredFile", 1 },

I'll spare you the full diff, but if you do that and run "make style"
it'll spew out a ~100 line diff at you, as it re-indents and
re-formats that entire block.

That *is* a rather extreme example, most of its suggestions are 1-5
line diffs.

But if there's one hill I'm willing to die on it's that we should
never introduce any sort of "landmine checks" in this codebase.

Those are checks where we don't e.g. assert that the style of git.c is
OK, but rather just run a "diff" on your specific change.

Then if it happens to change any code that was *already in violation*
will make it your problem to fix that. I.e. the landmine was already
there, but you just happened to walk over that path, and *boom*.

But I'm not an unreasonable zealot about that:) E.g. as running the
"make style-all-diff-apply" target here shows you around 40% of our
*.[ch] files don't have any suggested changes.

That's around 30% with just 1/4, i.e. a 10% "gain" is made with the
2-4/4 rule changes here.

So, in combination with updating .clang-format we could start making
headway earlier by e.g. whitelisting files (or globs, or directories)
that are already "style clean".

It would be a bit annoying if we needed to move code from a "dirty" to
a "clean" file, but we mostly don't do that, so I'd think that would
be OK.

But I think like other "partial asserts" (such as the "linux-leaks" CI
job) such an approach really would need to proceed in parallel with
fixing the remaining cases, even though that might take a while.

I.e. we'd want to have some assurances that we'd either be willing to
live with a fully clang-format'd codebase, or at least that the
special-casing of the style rules wasn't too widespread. We have a bit
of "/* clang format (off|on) */" already, and e.g. for the likes of
advice.c that would be OK, but having it be a common pattern would be
annoying.

Ævar Arnfjörð Bjarmason (4):
  Makefile: add a style-all targets for .clang-format testing
  .clang-format: Add a BitFieldColonSpacing=None rule
  .clang-format: do not enforce a ColumnLimit
  .clang-format: don't indent "goto" labels

 .clang-format | 17 +++++++++----
 Makefile      | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 79 insertions(+), 5 deletions(-)