diff mbox series

[2/2] fsck: document msg-id

Message ID 3aec3d2c9ca65a37a367c3a7c9081bbd4cd44ae0.1666623639.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Document fsck msg ids | expand

Commit Message

John Cai Oct. 24, 2022, 3 p.m. UTC
From: John Cai <johncai86@gmail.com>

The git-config documentation lacks mention of specific <msg-id> that
are supported. While git-help --config will display a list of these options,
often developers' first instinct is to consult the git docs to find valid
config values.

Add a section under the docs for fsck.<msg-id> with the msg-ids that
git-fsck recognizes.

Signed-off-by: John Cai <johncai86@gmail.com>
---
 Documentation/config/fsck.txt |   5 ++
 Documentation/fsck-msgids.txt | 133 ++++++++++++++++++++++++++++++++++
 2 files changed, 138 insertions(+)
 create mode 100644 Documentation/fsck-msgids.txt

Comments

Junio C Hamano Oct. 24, 2022, 5:33 p.m. UTC | #1
"John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: John Cai <johncai86@gmail.com>
>
> The git-config documentation lacks mention of specific <msg-id> that
> are supported. While git-help --config will display a list of these options,
> often developers' first instinct is to consult the git docs to find valid
> config values.

Good observation.  "help --config" only gives names and no
description, so maintaining the description like this new file does
make sense.

Can you also add a referencing comment to fsck.h to tell that folks
who add a new error type that they need to update the -msgids.txt
file as well?

Does this list in the new file cover all existing messages, by the
way?

I also wonder if the order of the entries in this file should be
alphabetical, unlike the entries in fsck.h where more severe ones
come earlier than the less severe ones.  In a sense, matching the
order of two lists makes it easier to maintain, and grouping by
severity in the doc might or might not find it easier to scan and
find what they are, but one of the biggest reason why users will
come to this list is when they see a particular error message and
want to understand what that means, no?  At that point it would be
easier to look things up if 

 (1) the list contains EVERYTHING, even the ones that you are not
     supposed to configure away.

 (2) the list is sorted alphabetically, regardless of the severity.

The first point suggests that it may be a mistake to have the main
list appear in the "what configuration knobs are available for
squelching fsck error messages".  It is OK to refer from there to
the main list, but the main list should list everything, with
comments like "(this error cannot be configured away)", no?

In other words, I think it is better to have a master list of
everything, with their message ID and textual description, which
essentially is your fsck-msgids.txt but additionally mention which
ones are by default FATAL and cannot be configured away (even
better, we can mention what severity level each of them is by
default, by mirroring that is in fsck.h).

And that master list should NOT be made a part of fsck.<msg-type>
configuration item, like this patch did.  It should be a separate
section in "git fsck --help" output whose section title is "Fsck
errors" or something.  

Then the existing description of fsck.<msg-type> configuration can
refer to that "Fsck errors" section for what msg types exist.

Another thing that I noticed while thinking about the patch, but is
better left outside the scope of this patch, is that an attempt to
configure fatal ones away is prevented by fsck.c::fsck_set_msg_type() 
but "git help config | grep '^fsck\.'" lists them.  I think the
"help config" should be fixed.

I almost suggested to extend the FOREACH_FSCK_MSG_ID() definition in
fsck.h so that fsck-msgids.txt gets auto-generated (what is missing
in fsck.h that prevents us from doing so is the textual explanation
you added in the new file in your patch---they could come from
comments on the same line, for example, and can be extracted via a
Perl or sed script at build time).  I do not know if it is a good
idea, especially if we forsee a future in which we may be
translating the documentation, so decided against making such a
suggestion.

But at least, we could _lint_ the manually curated fsck-msgids.txt
using what is in fsck.h to see if a new MSG_ID added to fsck.h is
missing from the doc, or a MSG_ID whose default severity is updated
in fsck.h is stale in the doc, etc.  That can be left for the future
updates, but we should at least instruct developers to keep them in
sync in fsck.h by adding a comment.

Thanks.
Ævar Arnfjörð Bjarmason Oct. 25, 2022, 9:41 a.m. UTC | #2
On Mon, Oct 24 2022, Junio C Hamano wrote:

> "John Cai via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> I almost suggested to extend the FOREACH_FSCK_MSG_ID() definition in
> fsck.h so that fsck-msgids.txt gets auto-generated (what is missing
> in fsck.h that prevents us from doing so is the textual explanation
> you added in the new file in your patch---they could come from
> comments on the same line, for example, and can be extracted via a
> Perl or sed script at build time).

I think this is the best eventual approach, whether we want it now is
another matter...

> I do not know if it is a good
> idea, especially if we forsee a future in which we may be
> translating the documentation, so decided against making such a
> suggestion.

...I just wanted to point out that difficulty of translating such a
thing should not be a reason to hold that back, because it's not hard to
translate such an arrangement.

We'd just point the po-doc extraction at the generated *.txt, we'd need
to re-arrange the Makefile dependencies a bit, but it shouldn't be a
problem.

The *.pot-file generation is a step that only the translation
coordinator *needs* to run, so even if it's a manual procedure, or
requires first building the sources...

> But at least, we could _lint_ the manually curated fsck-msgids.txt
> using what is in fsck.h to see if a new MSG_ID added to fsck.h is
> missing from the doc, or a MSG_ID whose default severity is updated
> in fsck.h is stale in the doc, etc.  That can be left for the future
> updates, but we should at least instruct developers to keep them in
> sync in fsck.h by adding a comment.

Yeah, that's a good step.
Junio C Hamano Oct. 25, 2022, 4:07 p.m. UTC | #3
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> We'd just point the po-doc extraction at the generated *.txt, we'd need
> to re-arrange the Makefile dependencies a bit, but it shouldn't be a
> problem.
>
> The *.pot-file generation is a step that only the translation
> coordinator *needs* to run, so even if it's a manual procedure, or
> requires first building the sources...

I thought you made it more distributed to remove the coordinator's
involvement in .pot generation recently, which resulted in us not
tracking the .pot at all, no?  Now all the l10n folks need to do the
extraction?
diff mbox series

Patch

diff --git a/Documentation/config/fsck.txt b/Documentation/config/fsck.txt
index 450e8c38e34..b0632075f22 100644
--- a/Documentation/config/fsck.txt
+++ b/Documentation/config/fsck.txt
@@ -35,6 +35,11 @@  allow new instances of the same breakages go unnoticed.
 Setting an unknown `fsck.<msg-id>` value will cause fsck to die, but
 doing the same for `receive.fsck.<msg-id>` and `fetch.fsck.<msg-id>`
 will only cause git to warn.
++
+The following `<msg-id>` are supported:
++
+
+include::../fsck-msgids.txt[]
 
 fsck.skipList::
 	The path to a list of object names (i.e. one unabbreviated SHA-1 per
diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
new file mode 100644
index 00000000000..888fa3308b7
--- /dev/null
+++ b/Documentation/fsck-msgids.txt
@@ -0,0 +1,133 @@ 
+--
+`badDate`;;
+	Invalid date format in an author/committer line.
+
+`badEmail`;;
+	Invalid email format in an author/committer line.
+
+`badFilemode`;;
+	A tree contains a bad filemode entry.
+
+`badName`;;
+	An author/committer name is empty.
+
+`badObjectSha1`;;
+	An object has a bad sha1.
+
+`badParentSha1`;;
+	A commit object has a bad parent sha1.
+
+`badTagName`;;
+	A tag has an invalid format.
+
+`badTimezone`;;
+	Found an invalid time zone in an author/committer line.
+
+`badTree`;;
+	A tree cannot be parsed.
+
+`badTreeSha1`;;
+	A tree has an invalid format.
+
+`badType`;;
+	Found an invalid object type.
+
+`duplicateEntries`;;
+	A tree contains duplicate file entries.
+
+`emptyName`;;
+	A path contains an empty name.
+
+`fullPathName`;;
+	A path contains the full path starting with "/".
+
+`gitAttributesSymlink`;;
+	`.gitattributes` is a symlink.
+
+`gitignoreSymlink`;;
+	`.gitignore` is a symlink.
+
+`gitmodulesBlob`;;
+	A non-blob found at `.gitmodules`.
+
+`gitmodulesMissing`;;
+	Unable to read `.gitmodules` blob.
+
+`gitmodulesName`;;
+	A submodule name is invalid.
+
+`gitmodulesParse`;;
+	Could not parse `.gitmodules` blob.
+
+`gitmodulesLarge`;
+	`.gitmodules` blob is too large to parse.
+
+`gitmodulesPath`;;
+	`.gitmodules` path is invalid.
+
+`gitmodulesSymlink`;;
+	`.gitmodules` is a symlink.
+
+`gitmodulesUpdate`;;
+	Found an invalid submodule update setting.
+
+`gitmodulesUrl`;;
+	Found an invalid submodule url.
+
+`hasDot`;;
+	A tree contains an entry named `.`.
+
+`hasDotdot`;;
+	A tree contains an entry named `..`.
+
+`hasDotgit`;;
+	A tree contains an entry named `.git`.
+
+`mailmapSymlink`;;
+	`.mailmap` is a symlink.
+
+`missingAuthor`;;
+	Author is missing.
+
+`missingCommitter`;;
+	Committer is missing.
+
+`missingEmail`;;
+	Email is missing in an author/committer line.
+
+`missingNameBeforeEmail`;;
+	Missing space before an email in an author/committer line.
+
+`missingObject`;;
+	Missing `object` line in tag object.
+
+`missingSpaceBeforeDate`;;
+	Missing space before date in an author/committer line.
+
+`missingSpaceBeforeEmail`;;
+	Missing space before the email in author/committer line.
+
+`missingTag`;;
+	Unexpected end after `type` line in a tag object.
+
+`missingTypeEntry`;;
+	Missing `type` line in a tag object.
+
+`multipleAuthors`;;`
+	Multiple author lines found in a commit.
+
+`nulInCommit`;;
+	Found a NUL byte in the commit object body.
+
+`treeNotSorted`;;
+	A tree is not properly sorted.
+
+`unknownType`;;
+	Found an unknown object type.
+
+`zeroPaddingDate`;;
+	Found a zero padded date in an author/commiter line.
+
+`zeroPaddedFilemode`;;
+	Found a zero padded filemode in a tree.
+--