diff mbox series

[GSoC,v10,06/10] builtin/refs: add verify subcommand and verbose_refs for "fsck_options"

Message ID Zo6fFS7xzFwWKrEW@ArchLinux (mailing list archive)
State Superseded
Headers show
Series ref consistency check infra setup | expand

Commit Message

shejialuo July 10, 2024, 2:47 p.m. UTC
Introduce a new subcommand "verify" in git-refs(1) to allow the user to
check the reference database consistency and also this subcommand will
be used as the entry point of checking refs for "git-fsck(1)". Last, add
"verbose_refs" field into "fsck_options" to indicate whether we should
print verbose messages when checking refs consistency.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 Documentation/git-refs.txt | 13 +++++++++++
 builtin/refs.c             | 44 ++++++++++++++++++++++++++++++++++++++
 fsck.h                     |  1 +
 3 files changed, 58 insertions(+)

Comments

Junio C Hamano July 10, 2024, 9:31 p.m. UTC | #1
shejialuo <shejialuo@gmail.com> writes:

> Subject: Re: [GSoC][PATCH v10 06/10] builtin/refs: add verify subcommand and verbose_refs for "fsck_options"

Just saying

	git refs: add verify subcommand

would be clearer.  If you really want to talk about two modes, you
could say

	git refs: add "verify [--strict|--verbose]" subcommand

but that may be too much.

> Introduce a new subcommand "verify" in git-refs(1) to allow the user to
> check the reference database consistency and also this subcommand will
> be used as the entry point of checking refs for "git-fsck(1)". Last, add
> "verbose_refs" field into "fsck_options" to indicate whether we should
> print verbose messages when checking refs consistency.

Is there a reason why this has to be verbose_refs and not a simple
verbose bit?  When people see how it is useful to ask for the
verbose output while checking refs, wouldn't people wish to add the
same "--verbose" support while checking objects, and at that point,
wouldn't it be awkward to add verbose_objs member to the struct and
having to flip both bits on?

> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  Documentation/git-refs.txt | 13 +++++++++++
>  builtin/refs.c             | 44 ++++++++++++++++++++++++++++++++++++++
>  fsck.h                     |  1 +
>  3 files changed, 58 insertions(+)
>
> diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
> index 5b99e04385..1244a85b64 100644
> --- a/Documentation/git-refs.txt
> +++ b/Documentation/git-refs.txt
> @@ -10,6 +10,7 @@ SYNOPSIS
>  --------
>  [verse]
>  'git refs migrate' --ref-format=<format> [--dry-run]
> +'git refs verify' [--strict] [--verbose]
>  
>  DESCRIPTION
>  -----------
> @@ -22,6 +23,9 @@ COMMANDS
>  migrate::
>  	Migrate ref store between different formats.
>  
> +verify::
> +	Verify reference database consistency.
> +

The error reporting function for refs consistency check was still
about reporting a problem for a single ref.  I am wondering how
consistency violations that are not about a single ref should be
handled.  For example, if refs/packed-backend.c:packed_fsck() finds
that the file is not sorted properly or has some unparseable garbage
in it, it is not something you can report as "refs/heads/main is
broken", but those who are interested in seeing the "reference
database consistency" verified, it is very much what they want the
tool to notice.  How would detection of such a breakage that is not
attributed to a single ref fit in this "ref consistency check
infrastructure" that was introduced by [05/10]?

> +	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
> +	if (argc)
> +		usage(_("too many arguments"));

I do not think we want to change this line in this topic, but
because I noticed that the issue is widespread, let me make a note
here that we may want to clean up all the commands that give this
message as a #leftoverbit item:

    $ git cmd foo baz
    usage: too many arguments

is very unfriendly in that it is not immediately obvious to users
which arguments are excess.  Should they have given "git cmd<RET>"?
Is it "git cmd foo" that does not take any argument?

If you said something like

    $ git refs verify baz
    error: 'git refs verify' takes no arguments

or even

    $ git refs verify baz
    error: unknown argument 'baz' given to 'git refs verify'

it would be much more helpful.


Thanks.
shejialuo July 11, 2024, 12:39 p.m. UTC | #2
On Wed, Jul 10, 2024 at 02:31:03PM -0700, Junio C Hamano wrote:
> shejialuo <shejialuo@gmail.com> writes:
> 
> > Subject: Re: [GSoC][PATCH v10 06/10] builtin/refs: add verify subcommand and verbose_refs for "fsck_options"
> 
> Just saying
> 
> 	git refs: add verify subcommand
> 
> would be clearer.  If you really want to talk about two modes, you
> could say
> 
> 	git refs: add "verify [--strict|--verbose]" subcommand
> 
> but that may be too much.
> 

Thanks, I will change in the next version.

> > Introduce a new subcommand "verify" in git-refs(1) to allow the user to
> > check the reference database consistency and also this subcommand will
> > be used as the entry point of checking refs for "git-fsck(1)". Last, add
> > "verbose_refs" field into "fsck_options" to indicate whether we should
> > print verbose messages when checking refs consistency.
> 
> Is there a reason why this has to be verbose_refs and not a simple
> verbose bit?  When people see how it is useful to ask for the
> verbose output while checking refs, wouldn't people wish to add the
> same "--verbose" support while checking objects, and at that point,
> wouldn't it be awkward to add verbose_objs member to the struct and
> having to flip both bits on?
> 

Actually, this is really what I thought. I just want to provide more
find-grained control here. However, when I implemented the code, I also
felt awkward about this. I can't find the balance here.

I will improve this in the next version.

> > Mentored-by: Patrick Steinhardt <ps@pks.im>
> > Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> > Signed-off-by: shejialuo <shejialuo@gmail.com>
> > ---
> >  Documentation/git-refs.txt | 13 +++++++++++
> >  builtin/refs.c             | 44 ++++++++++++++++++++++++++++++++++++++
> >  fsck.h                     |  1 +
> >  3 files changed, 58 insertions(+)
> >
> > diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
> > index 5b99e04385..1244a85b64 100644
> > --- a/Documentation/git-refs.txt
> > +++ b/Documentation/git-refs.txt
> > @@ -10,6 +10,7 @@ SYNOPSIS
> >  --------
> >  [verse]
> >  'git refs migrate' --ref-format=<format> [--dry-run]
> > +'git refs verify' [--strict] [--verbose]
> >  
> >  DESCRIPTION
> >  -----------
> > @@ -22,6 +23,9 @@ COMMANDS
> >  migrate::
> >  	Migrate ref store between different formats.
> >  
> > +verify::
> > +	Verify reference database consistency.
> > +
> 
> The error reporting function for refs consistency check was still
> about reporting a problem for a single ref.  I am wondering how
> consistency violations that are not about a single ref should be
> handled.  For example, if refs/packed-backend.c:packed_fsck() finds
> that the file is not sorted properly or has some unparseable garbage
> in it, it is not something you can report as "refs/heads/main is
> broken", but those who are interested in seeing the "reference
> database consistency" verified, it is very much what they want the
> tool to notice.  How would detection of such a breakage that is not
> attributed to a single ref fit in this "ref consistency check
> infrastructure" that was introduced by [05/10]?
> 

Yes, I didn't consider other cases. Although I have said in the subject
that this series is to set up the infrastructure of fscking refs. It's
a little hard for me to set up a perfect "fsck_refs_report" at the
moment.

As you said, I think currently I should consider about the packed-refs
in this series. I will find a way to achieve this in the next version.
Well, I could say I intentionally ignored this problem. But we should
face the problem directly.

Really thanks.

> > +	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
> > +	if (argc)
> > +		usage(_("too many arguments"));
> 
> I do not think we want to change this line in this topic, but
> because I noticed that the issue is widespread, let me make a note
> here that we may want to clean up all the commands that give this
> message as a #leftoverbit item:
> 
>     $ git cmd foo baz
>     usage: too many arguments
> 
> is very unfriendly in that it is not immediately obvious to users
> which arguments are excess.  Should they have given "git cmd<RET>"?
> Is it "git cmd foo" that does not take any argument?
> 
> If you said something like
> 
>     $ git refs verify baz
>     error: 'git refs verify' takes no arguments
> 
> or even
> 
>     $ git refs verify baz
>     error: unknown argument 'baz' given to 'git refs verify'
> 
> it would be much more helpful.
> 

I will improve this in the next version.

> 
> Thanks.
diff mbox series

Patch

diff --git a/Documentation/git-refs.txt b/Documentation/git-refs.txt
index 5b99e04385..1244a85b64 100644
--- a/Documentation/git-refs.txt
+++ b/Documentation/git-refs.txt
@@ -10,6 +10,7 @@  SYNOPSIS
 --------
 [verse]
 'git refs migrate' --ref-format=<format> [--dry-run]
+'git refs verify' [--strict] [--verbose]
 
 DESCRIPTION
 -----------
@@ -22,6 +23,9 @@  COMMANDS
 migrate::
 	Migrate ref store between different formats.
 
+verify::
+	Verify reference database consistency.
+
 OPTIONS
 -------
 
@@ -39,6 +43,15 @@  include::ref-storage-format.txt[]
 	can be used to double check that the migration works as expected before
 	performing the actual migration.
 
+The following options are specific to 'git refs verify':
+
+--strict::
+	Enable more strict checking, every WARN severity for the `Fsck Messages`
+	be seen as ERROR. See linkgit:git-fsck[1].
+
+--verbose::
+	When verifying the reference database consistency, be chatty.
+
 KNOWN LIMITATIONS
 -----------------
 
diff --git a/builtin/refs.c b/builtin/refs.c
index 46dcd150d4..599b526786 100644
--- a/builtin/refs.c
+++ b/builtin/refs.c
@@ -1,4 +1,6 @@ 
 #include "builtin.h"
+#include "config.h"
+#include "fsck.h"
 #include "parse-options.h"
 #include "refs.h"
 #include "repository.h"
@@ -7,6 +9,9 @@ 
 #define REFS_MIGRATE_USAGE \
 	N_("git refs migrate --ref-format=<format> [--dry-run]")
 
+#define REFS_VERIFY_USAGE \
+	N_("git refs verify [--strict] [--verbose]")
+
 static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
 {
 	const char * const migrate_usage[] = {
@@ -58,15 +63,54 @@  static int cmd_refs_migrate(int argc, const char **argv, const char *prefix)
 	return err;
 }
 
+static int cmd_refs_verify(int argc, const char **argv, const char *prefix)
+{
+	struct fsck_options fsck_refs_options = FSCK_REFS_OPTIONS_DEFAULT;
+	const char * const verify_usage[] = {
+		REFS_VERIFY_USAGE,
+		NULL,
+	};
+	unsigned int verbose = 0, strict = 0;
+	struct option options[] = {
+		OPT__VERBOSE(&verbose, N_("be verbose")),
+		OPT_BOOL(0, "strict", &strict, N_("enable strict checking")),
+		OPT_END(),
+	};
+	int ret;
+
+	argc = parse_options(argc, argv, prefix, options, verify_usage, 0);
+	if (argc)
+		usage(_("too many arguments"));
+
+	if (verbose)
+		fsck_refs_options.verbose_refs = 1;
+	if (strict)
+		fsck_refs_options.strict = 1;
+
+	git_config(git_fsck_config, &fsck_refs_options);
+	prepare_repo_settings(the_repository);
+
+	ret = refs_fsck(get_main_ref_store(the_repository), &fsck_refs_options);
+
+	/*
+	 * Explicitly free the allocated array and "skip_oids" set
+	 */
+	free(fsck_refs_options.msg_type);
+	oidset_clear(&fsck_refs_options.skip_oids);
+	return ret;
+}
+
 int cmd_refs(int argc, const char **argv, const char *prefix)
 {
 	const char * const refs_usage[] = {
 		REFS_MIGRATE_USAGE,
+		REFS_VERIFY_USAGE,
 		NULL,
 	};
 	parse_opt_subcommand_fn *fn = NULL;
 	struct option opts[] = {
 		OPT_SUBCOMMAND("migrate", &fn, cmd_refs_migrate),
+		OPT_SUBCOMMAND("verify", &fn, cmd_refs_verify),
 		OPT_END(),
 	};
 
diff --git a/fsck.h b/fsck.h
index fe5d4d2ad9..ef1f1ed15e 100644
--- a/fsck.h
+++ b/fsck.h
@@ -147,6 +147,7 @@  struct fsck_options {
 	fsck_walk_func walk;
 	fsck_error error_func;
 	unsigned strict:1;
+	unsigned verbose_refs:1;
 	enum fsck_msg_type *msg_type;
 	struct oidset skip_oids;
 	struct oidset gitmodules_found;