diff mbox series

[3/3] rev-list: add --allow-missing-tips to be used with --missing=...

Message ID 20240201115809.1177064-4-christian.couder@gmail.com (mailing list archive)
State Superseded
Headers show
Series rev-list: allow missing tips with --missing | expand

Commit Message

Christian Couder Feb. 1, 2024, 11:58 a.m. UTC
In 9830926c7d (rev-list: add commit object support in `--missing`
option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
so that it now works with commits too.

Unfortunately, such a command would still fail with a "fatal: bad
object <oid>" if it is passed a missing commit, blob or tree as an
argument.

When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects, it would be
better if the command would instead consider such missing objects,
especially commits, in the same way as other missing objects.

If, for example `--missing=print` is used, it would be nice for some
use cases if the missing tips passed as arguments were reported in
the same way as other missing objects instead of the command just
failing.

Let's introduce a new `--allow-missing-tips` option to make it work
like this.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 builtin/rev-list.c          | 24 ++++++++++++++++-
 revision.c                  |  9 ++++---
 revision.h                  |  8 ++++++
 t/t6022-rev-list-missing.sh | 51 +++++++++++++++++++++++++++++++++++++
 4 files changed, 88 insertions(+), 4 deletions(-)

Comments

Junio C Hamano Feb. 1, 2024, 8:20 p.m. UTC | #1
Christian Couder <christian.couder@gmail.com> writes:

> In 9830926c7d (rev-list: add commit object support in `--missing`
> option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
> so that it now works with commits too.
>
> Unfortunately, such a command would still fail with a "fatal: bad
> object <oid>" if it is passed a missing commit, blob or tree as an
> argument.
>
> When such a command is used to find the dependencies of some objects,
> for example the dependencies of quarantined objects, it would be
> better if the command would instead consider such missing objects,
> especially commits, in the same way as other missing objects.
>
> If, for example `--missing=print` is used, it would be nice for some
> use cases if the missing tips passed as arguments were reported in
> the same way as other missing objects instead of the command just
> failing.
>
> Let's introduce a new `--allow-missing-tips` option to make it work
> like this.

OK.  Unlike a missing object referenced by a tree, a commit, or a
tag, where the expected type of the missing object is known to Git,
I would expect that nobody knowsn what type these missing objects at
the tip have.  So do we now require "object X missing" instead of
"commit X missing" in the output?  If we are not giving any type
information even when we know what the type we expect is, then we do
not have to worry about this change introducing a new output logic,
I guess.

> diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> index b3f4783858..ae7bb15478 100644
> --- a/builtin/rev-list.c
> +++ b/builtin/rev-list.c
> @@ -562,6 +562,16 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  				break;
>  		}
>  	}
> +	for (i = 1; i < argc; i++) {
> +		const char *arg = argv[i];
> +		if (!strcmp(arg, "--allow-missing-tips")) {
> +			if (arg_missing_action == MA_ERROR)
> +				die(_("option '%s' only makes sense with '%s' set to '%s' or '%s'"),
> +				      "--allow-missing-tips", "--missing=", "allow-*", "print");
> +			revs.do_not_die_on_missing_tips = 1;
> +			break;
> +		}
> +	}

It is unfortunate that we need to add yet another dumb loop that
does not even try to understand there may be an option whose value
happens to be the string strcmp() looks for (we already have two
such loops above this hunk).  I have to wonder if we can do better.

An idle piece of idea.  Perhaps we can instruct setup_revisions()
not to die immediately upon seeing a problematic argument, marking
the revs as "broken" instead, and keep going and interpreting as
much as it could, so that it may record the presence of "--missing",
"--exclude-promisor-objects", and "--allow-missing-tips".  Then upon
seeing setup_revisions() return with such an error, we can redo the
call with these bits already on.

Anyway, I digress.

> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> index 527aa94f07..283e8fc2c2 100755
> --- a/t/t6022-rev-list-missing.sh
> +++ b/t/t6022-rev-list-missing.sh
> @@ -77,4 +77,55 @@ do
>  	done
>  done
>  
> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +do
> +	for tip in "" "HEAD"
> +	do
> +		for action in "allow-any" "print"
> +		do
> +			test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '
> +				oid="$(git rev-parse $obj)" &&
> +				path=".git/objects/$(test_oid_to_path $oid)" &&
> +
> +				# Before the object is made missing, we use rev-list to
> +				# get the expected oids.
> +				if [ "$tip" = "HEAD" ]; then

Style:

                                if test "$tip" = "HEAD"
                                then

> +					git rev-list --objects --no-object-names \
> +						HEAD ^$obj >expect.raw
> +				else
> +					>expect.raw
> +				fi &&
> +
> +				# Blobs are shared by all commits, so even though a commit/tree
> +				# might be skipped, its blob must be accounted for.
> +				if [ "$tip" = "HEAD" ] && [ $obj != "HEAD:1.t" ]; then

Ditto.

> +					echo $(git rev-parse HEAD:1.t) >>expect.raw &&
> +					echo $(git rev-parse HEAD:2.t) >>expect.raw
> +				fi &&
> +
> +				mv "$path" "$path.hidden" &&
> +				test_when_finished "mv $path.hidden $path" &&
> +
> +				git rev-list --missing=$action --allow-missing-tips \
> +				     --objects --no-object-names $oid $tip >actual.raw &&
> +
> +				# When the action is to print, we should also add the missing
> +				# oid to the expect list.
> +				case $action in
> +				allow-any)
> +					;;
> +				print)
> +					grep ?$oid actual.raw &&
> +					echo ?$oid >>expect.raw
> +					;;

OK.  We do not say anything more than the object name (and the fact
that it is missing with a single byte '?'), so my earlier worry was
unfounded.  Good.

> +				esac &&
> +
> +				sort actual.raw >actual &&
> +				sort expect.raw >expect &&
> +				test_cmp expect actual
> +			'
> +		done
> +	done
> +done
> +
>  test_done

THanks.
Junio C Hamano Feb. 1, 2024, 9:27 p.m. UTC | #2
Christian Couder <christian.couder@gmail.com> writes:

> When such a command is used to find the dependencies of some objects,
> for example the dependencies of quarantined objects, it would be
> better if the command would instead consider such missing objects,
> especially commits, in the same way as other missing objects.
>
> If, for example `--missing=print` is used, it would be nice for some
> use cases if the missing tips passed as arguments were reported in
> the same way as other missing objects instead of the command just
> failing.
>
> Let's introduce a new `--allow-missing-tips` option to make it work
> like this.

An obvious question is if this even needs to be a new option.  What
are expected use cases where --missing=print without this option is
beneficial?  If there is no plausible use case, perhaps we can treat
it as a mere bugfix to the existing --missing mechanism, especially
given that support of commits in "--missing" itself is relatively
a new thing.

If we can do this as a bugfix that is always on when --missing is
used, then we do not have to worry about adding another tasteless
loop outside the main command line parser, which is a huge upside
;-).
Christian Couder Feb. 2, 2024, 11:29 a.m. UTC | #3
On Thu, Feb 1, 2024 at 10:27 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > When such a command is used to find the dependencies of some objects,
> > for example the dependencies of quarantined objects, it would be
> > better if the command would instead consider such missing objects,
> > especially commits, in the same way as other missing objects.
> >
> > If, for example `--missing=print` is used, it would be nice for some
> > use cases if the missing tips passed as arguments were reported in
> > the same way as other missing objects instead of the command just
> > failing.
> >
> > Let's introduce a new `--allow-missing-tips` option to make it work
> > like this.
>
> An obvious question is if this even needs to be a new option.  What
> are expected use cases where --missing=print without this option is
> beneficial?

I am not sure if such a case is really beneficial but some
people/script/forges might rely on an error from `git rev-list
--missing=print` to propagate back an error to some user interface.

> If there is no plausible use case, perhaps we can treat
> it as a mere bugfix to the existing --missing mechanism, especially
> given that support of commits in "--missing" itself is relatively
> a new thing.

`--missing=...` detecting missing commits is new, but it might have
been used to find missing blobs or trees for some time as it exists
since:

caf3827e2f (rev-list: add list-objects filtering support, 2017-11-21)

> If we can do this as a bugfix that is always on when --missing is
> used, then we do not have to worry about adding another tasteless
> loop outside the main command line parser, which is a huge upside
> ;-).

Even if I very much dislike the tasteless loops, I'd rather not risk
user regressions by changing how `git rev-list --missing=...` handles
errors in the commits it is passed as arguments. See my other message
responding to your previous comments about how I'd prefer we handle
this.
Christian Couder Feb. 2, 2024, 11:29 a.m. UTC | #4
On Thu, Feb 1, 2024 at 9:20 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > In 9830926c7d (rev-list: add commit object support in `--missing`
> > option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
> > so that it now works with commits too.
> >
> > Unfortunately, such a command would still fail with a "fatal: bad
> > object <oid>" if it is passed a missing commit, blob or tree as an
> > argument.
> >
> > When such a command is used to find the dependencies of some objects,
> > for example the dependencies of quarantined objects, it would be
> > better if the command would instead consider such missing objects,
> > especially commits, in the same way as other missing objects.
> >
> > If, for example `--missing=print` is used, it would be nice for some
> > use cases if the missing tips passed as arguments were reported in
> > the same way as other missing objects instead of the command just
> > failing.
> >
> > Let's introduce a new `--allow-missing-tips` option to make it work
> > like this.
>
> OK.  Unlike a missing object referenced by a tree, a commit, or a
> tag, where the expected type of the missing object is known to Git,
> I would expect that nobody knowsn what type these missing objects at
> the tip have.  So do we now require "object X missing" instead of
> "commit X missing" in the output?  If we are not giving any type
> information even when we know what the type we expect is, then we do
> not have to worry about this change introducing a new output logic,
> I guess.

Yeah, we are not giving type information in the output, only a `?`
before the oid.

> > diff --git a/builtin/rev-list.c b/builtin/rev-list.c
> > index b3f4783858..ae7bb15478 100644
> > --- a/builtin/rev-list.c
> > +++ b/builtin/rev-list.c
> > @@ -562,6 +562,16 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >                               break;
> >               }
> >       }
> > +     for (i = 1; i < argc; i++) {
> > +             const char *arg = argv[i];
> > +             if (!strcmp(arg, "--allow-missing-tips")) {
> > +                     if (arg_missing_action == MA_ERROR)
> > +                             die(_("option '%s' only makes sense with '%s' set to '%s' or '%s'"),
> > +                                   "--allow-missing-tips", "--missing=", "allow-*", "print");
> > +                     revs.do_not_die_on_missing_tips = 1;
> > +                     break;
> > +             }
> > +     }
>
> It is unfortunate that we need to add yet another dumb loop that
> does not even try to understand there may be an option whose value
> happens to be the string strcmp() looks for (we already have two
> such loops above this hunk).  I have to wonder if we can do better.

I am also not happy with adding yet another dump loop like this. I did
it because I couldn't think of a simple solution to avoid that.

> An idle piece of idea.  Perhaps we can instruct setup_revisions()
> not to die immediately upon seeing a problematic argument, marking
> the revs as "broken" instead, and keep going and interpreting as
> much as it could, so that it may record the presence of "--missing",
> "--exclude-promisor-objects", and "--allow-missing-tips".  Then upon
> seeing setup_revisions() return with such an error, we can redo the
> call with these bits already on.

When we discussed rejecting some rev walking options before calling
setup_revisions() in the context of `git replay`, one thing people
seemed to agree on was that there should be a mechanism in
setup_revisions() that allows us to tell setup_revisions() which rev
walking options it should allow or not. I think that such a mechanism
might need an early parsing of options which could be reused for
options like `--exclude-promisor-objects`, `--missing=...`  and
`--allow-missing-tips`. But I think adding such a mechanism does not
belong to this patch series. It's some cleanup and refactoring that we
might have needed for a long time already.

In my current version, I have added the following NEEDSWORK comment
before the three dump loops to make sure we remember about this:

    /*
     * NEEDSWORK: These dump loops to look for some options early
     * are ugly. We really need setup_revisions() to have a
     * mechanism to allow and disallow some sets of options for
     * different commands (like rev-list, replay, etc). Such
     * mechanism should do an early parsing of option and be able
     * to manage the `--exclude-promisor-objects`, `--missing=...`
     * and `--allow-missing-tips` options below.
     */

I have also added the following to the commit message:

"We introduce another dump loop to look for that options early, but
addressing the dump loops should likely be part of adding a new
mechanism to setup_revisions() which is a different topic. Anyway
let's add a big NEEDSWORK comment to remember that we really need to
do this."

> Anyway, I digress.
>
> > diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> > index 527aa94f07..283e8fc2c2 100755
> > --- a/t/t6022-rev-list-missing.sh
> > +++ b/t/t6022-rev-list-missing.sh
> > @@ -77,4 +77,55 @@ do
> >       done
> >  done
> >
> > +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> > +do
> > +     for tip in "" "HEAD"
> > +     do
> > +             for action in "allow-any" "print"
> > +             do
> > +                     test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '
> > +                             oid="$(git rev-parse $obj)" &&
> > +                             path=".git/objects/$(test_oid_to_path $oid)" &&
> > +
> > +                             # Before the object is made missing, we use rev-list to
> > +                             # get the expected oids.
> > +                             if [ "$tip" = "HEAD" ]; then
>
> Style:
>
>                                 if test "$tip" = "HEAD"
>                                 then

Fixed in my current version. In it, I also fixed in patch 2/3 a
similar style typo in the tests before these ones.

> > +                                     git rev-list --objects --no-object-names \
> > +                                             HEAD ^$obj >expect.raw
> > +                             else
> > +                                     >expect.raw
> > +                             fi &&
> > +
> > +                             # Blobs are shared by all commits, so even though a commit/tree
> > +                             # might be skipped, its blob must be accounted for.
> > +                             if [ "$tip" = "HEAD" ] && [ $obj != "HEAD:1.t" ]; then
>
> Ditto.

Fixed too in my current version.

> > +                                     echo $(git rev-parse HEAD:1.t) >>expect.raw &&
> > +                                     echo $(git rev-parse HEAD:2.t) >>expect.raw
> > +                             fi &&
> > +
> > +                             mv "$path" "$path.hidden" &&
> > +                             test_when_finished "mv $path.hidden $path" &&
> > +
> > +                             git rev-list --missing=$action --allow-missing-tips \
> > +                                  --objects --no-object-names $oid $tip >actual.raw &&
> > +
> > +                             # When the action is to print, we should also add the missing
> > +                             # oid to the expect list.
> > +                             case $action in
> > +                             allow-any)
> > +                                     ;;
> > +                             print)
> > +                                     grep ?$oid actual.raw &&
> > +                                     echo ?$oid >>expect.raw
> > +                                     ;;
>
> OK.  We do not say anything more than the object name (and the fact
> that it is missing with a single byte '?'), so my earlier worry was
> unfounded.  Good.
>
> > +                             esac &&
> > +
> > +                             sort actual.raw >actual &&
> > +                             sort expect.raw >expect &&
> > +                             test_cmp expect actual
> > +                     '
> > +             done
> > +     done
> > +done
> > +
> >  test_done
>
> THanks.

Thanks for the review!
Junio C Hamano Feb. 2, 2024, 4:47 p.m. UTC | #5
Christian Couder <christian.couder@gmail.com> writes:

> I am also not happy with adding yet another dump loop like this. I did
> it because I couldn't think of a simple solution to avoid that.

FWIW, that word is "dumB" loop, if you are including the word in
your updated version.

But I am still wondering if this can be unconditionally turned on
any time --missing=... is used.  Have you considered it?  If we can
do that, it would eliminate the need for that extra broken loop that
will fail with "git rev-list --grep --allow-missing-tips -20 HEAD".
Junio C Hamano Feb. 2, 2024, 4:54 p.m. UTC | #6
Christian Couder <christian.couder@gmail.com> writes:

>> An obvious question is if this even needs to be a new option.  What
>> are expected use cases where --missing=print without this option is
>> beneficial?
>
> I am not sure if such a case is really beneficial but some
> people/script/forges might rely on an error from `git rev-list
> --missing=print` to propagate back an error to some user interface.

Yes, and they are missing errors for starting point objects right
now, which is a bug, just like they were missing output for commit
objects before we fixed it recently?

>> If there is no plausible use case, perhaps we can treat
>> it as a mere bugfix to the existing --missing mechanism, especially
>> given that support of commits in "--missing" itself is relatively
>> a new thing.
>
> `--missing=...` detecting missing commits is new, but it might have
> been used to find missing blobs or trees for some time as it exists
> since:
>
> caf3827e2f (rev-list: add list-objects filtering support, 2017-11-21)

So it fixes a long-standing bug.  So what?

Especially given that the change in behaviour is to error out only
when the user gave an object name for a missing object, and we know
the user wants to be notified on all missing objects "found" during
the traversal, it smells like (1) a rare enough case that may not
matter in the real world and (2) the existing scripts that may feed
potentially missing objects to the command would be doing an extra
check to ensure what they feed the command are not missing, in order
to avoid triggering that "starting point objects induce an error
instead of --missing=print report" bug, and the bugfix will make
such a workaround unnecessary.
Linus Arver Feb. 7, 2024, 9:40 a.m. UTC | #7
Christian Couder <christian.couder@gmail.com> writes:

> In 9830926c7d (rev-list: add commit object support in `--missing`
> option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
> so that it now works with commits too.
>
> Unfortunately, such a command would still fail with a "fatal: bad
> object <oid>" if it is passed a missing commit, blob or tree as an
> argument.

Hmm, but according to the manpage for rev-list, it only accepts commits
as arguments. Conflict...?

Also it took me a second to realize that you are talking about
missing objects passed in at the command line to `git rev-list` and not
the objects that this command encounters during the normal rev walking
process. It would have been a touch nicer to say something more
explicit, like

    In 9830926c7d (rev-list: add commit object support in `--missing`
    option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
    so that it works with missing commits, not just blobs/trees.
    
    Unfortunately, it would still fail with a "fatal: bad
    object <oid>" if it is passed a missing commit as an
    argument (before the rev walking even begins).

> When such a command is used to find the dependencies of some objects,
> for example the dependencies of quarantined objects, it would be
> better if the command would instead consider such missing objects,
> especially commits, in the same way as other missing objects.

Could you define what quarantined objects are (it's not in the manpage
for rev-list)? But also, I'm not sure how much additional value this
paragraph is adding on top of what you already said in the first two
paragraphs.

> If, for example `--missing=print` is used, it would be nice for some
> use cases if the missing tips passed as arguments were reported in
> the same way as other missing objects instead of the command just
> failing.
>
> Let's introduce a new `--allow-missing-tips` option to make it work
> like this.

I assume "tips" means "the starting commits which are passed into this
command from which it begins the rev walk". Might be worth including in
the commit message.

But also, it's curious that you chose the word "allow" when we already
have "--ignore-missing". I assume this is because you still want the
missing tips to still be displayed in the output, and not ignored
(omitted from processing altogether). Perhaps "--report-missing-tips"
is more explicit?

> @@ -753,9 +765,19 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
>  
>  	if (arg_print_omitted)
>  		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
> -	if (arg_missing_action == MA_PRINT)
> +	if (arg_missing_action == MA_PRINT) {
> +		struct oidset_iter iter;
> +		struct object_id *oid;
> +
>  		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
>  
> +		/* Already add missing commits */

Did you mean "Add already-missing commits"? Perhaps even more explicit
would be "Add missing tips"?

> +		oidset_iter_init(&revs.missing_commits, &iter);
> +		while ((oid = oidset_iter_next(&iter)))
> +			oidset_insert(&missing_objects, oid);
> +		oidset_clear(&revs.missing_commits);
> +	}

Aside: I am surprised that there is no helper function already that
simply copies things in one oidset into another oidset.

>  	traverse_commit_list_filtered(
>  		&revs, show_commit, show_object, &info,
>  		(arg_print_omitted ? &omitted_objects : NULL));
> diff --git a/revision.c b/revision.c
> index 4c5cd7c3ce..9f25faa249 100644
> --- a/revision.c
> +++ b/revision.c
> 
> [...]
> 
> @@ -2184,7 +2189,7 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
>  		verify_non_filename(revs->prefix, arg);
>  	object = get_reference(revs, arg, &oid, flags ^ local_flags);
>  	if (!object)
> -		return revs->ignore_missing ? 0 : -1;
> +		return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
>  	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>  	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>  	free(oc.path);

With a few more context lines, the diff looks like

--8<---------------cut here---------------start------------->8---
        if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
                return revs->ignore_missing ? 0 : -1;
        if (!cant_be_filename)
                verify_non_filename(revs->prefix, arg);
        object = get_reference(revs, arg, &oid, flags ^ local_flags);
        if (!object)
-               return revs->ignore_missing ? 0 : -1;
+               return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
        add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
        add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
        free(oc.path);
        return 0;
--8<---------------cut here---------------end--------------->8---

and it's not obvious to me why you only touched the "if (!object)" case
but not the "if (get_oid_with_context(...))" case. Are there any
subtleties here that would not be obvious to reviewers?

> @@ -3830,8 +3835,6 @@ int prepare_revision_walk(struct rev_info *revs)
>  				       FOR_EACH_OBJECT_PROMISOR_ONLY);
>  	}
>  
> -	oidset_init(&revs->missing_commits, 0);
> -
>  	if (!revs->reflog_info)
>  		prepare_to_use_bloom_filter(revs);
>  	if (!revs->unsorted_input)
> diff --git a/revision.h b/revision.h
> index 94c43138bc..67435a5d8a 100644
> --- a/revision.h
> +++ b/revision.h
> @@ -227,6 +227,14 @@ struct rev_info {
>  			 */
>  			do_not_die_on_missing_objects:1,
>  
> +			/*
> +			 * When the do_not_die_on_missing_objects flag above is set,
> +			 * a rev walk could still die with "fatal: bad object <oid>"
> +			 * if one of the tips it is passed is missing. With this flag
> +			 * such a tip will be reported as missing too.
> +			 */
> +			 do_not_die_on_missing_tips:1,
> +
>  			/* for internal use only */
>  			exclude_promisor_objects:1;

IIUC, we won't get to even do the rev walk though if all tips are
missing. So perhaps a more precise comment would be

    /*
     * When the do_not_die_on_missing_objects flag above is set,
     * we could prematurely die with "fatal: bad object <oid>"
     * if one of the tips it is passed is missing before even getting to
     * the rev walk. With this flag such a tip will be reported as
     * missing in the output after the rev walk completes.
     */
  
> diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> index 527aa94f07..283e8fc2c2 100755
> --- a/t/t6022-rev-list-missing.sh
> +++ b/t/t6022-rev-list-missing.sh
> @@ -77,4 +77,55 @@ do
>  	done
>  done
>  
> +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> +do
> +	for tip in "" "HEAD"

So far from the patch diff it was not obvious that you wanted to check
for the empty string as a "tip".

> +	do
> +		for action in "allow-any" "print"
> +		do
> +			test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '

If I run this test without the new --allow-missing-tips flag, I get 

    fatal: bad object 82335b23aa7234320d6f8055243c852e4b6a5ca3
    not ok 11 - --missing=allow-any --allow-missing-tips with tip 'HEAD~1' missing and tip ''

and I think the last "tip ''" part is misleading because IIUC it's not
actually passed in as a tip (see my comment below about shell quoting).

> +				oid="$(git rev-parse $obj)" &&
> +				path=".git/objects/$(test_oid_to_path $oid)" &&
> +
> +				# Before the object is made missing, we use rev-list to
> +				# get the expected oids.
> +				if [ "$tip" = "HEAD" ]; then
> +					git rev-list --objects --no-object-names \
> +						HEAD ^$obj >expect.raw
> +				else
> +					>expect.raw
> +				fi &&

OK so you are using this empty string to clear the expect.raw file. If
that's true then what stops us from doing this at the start of every
iteration?

> +				# Blobs are shared by all commits, so even though a commit/tree
> +				# might be skipped, its blob must be accounted for.
> +				if [ "$tip" = "HEAD" ] && [ $obj != "HEAD:1.t" ]; then
> +					echo $(git rev-parse HEAD:1.t) >>expect.raw &&
> +					echo $(git rev-parse HEAD:2.t) >>expect.raw
> +				fi &&
> +
> +				mv "$path" "$path.hidden" &&
> +				test_when_finished "mv $path.hidden $path" &&
> +
> +				git rev-list --missing=$action --allow-missing-tips \
> +				     --objects --no-object-names $oid $tip >actual.raw &&

The fact that $tip is not quoted here means that this is equivalent to

    --objects --no-object-names $oid >actual.raw &&

and not

    --objects --no-object-names $oid "" >actual.raw &&

for the case where tip is the empty string. I wonder if we could avoid
being nuanced here with subtle shell quoting rules, especially because
these are tests (where making everything as explicit as possible is
desirable).

> +				# When the action is to print, we should also add the missing
> +				# oid to the expect list.
> +				case $action in
> +				allow-any)
> +					;;
> +				print)
> +					grep ?$oid actual.raw &&
> +					echo ?$oid >>expect.raw
> +					;;
> +				esac &&
> +
> +				sort actual.raw >actual &&
> +				sort expect.raw >expect &&
> +				test_cmp expect actual
> +			'
> +		done
> +	done
> +done
> +

Wait, but where are we actually making the $tip commit's object
_missing_? Hmm...

Ah, actually the tips are just the $obj variables. So it seems like you
could have done

    for tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"

in the beginning to make it more obvious.

Also, given how most of this is identical from the loop already in t6022
just above it, it would help to add a comment at the top of this one
explaining how it's different than the one above it.

Thanks.

>  test_done
> -- 
> 2.43.0.496.gd667eb0d7d.dirty
Linus Arver Feb. 7, 2024, 9:57 a.m. UTC | #8
Christian Couder <christian.couder@gmail.com> writes:

> On Thu, Feb 1, 2024 at 10:27 PM Junio C Hamano <gitster@pobox.com> wrote:
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>> > When such a command is used to find the dependencies of some objects,
>> > for example the dependencies of quarantined objects, it would be
>> > better if the command would instead consider such missing objects,
>> > especially commits, in the same way as other missing objects.
>> >
>> > If, for example `--missing=print` is used, it would be nice for some
>> > use cases if the missing tips passed as arguments were reported in
>> > the same way as other missing objects instead of the command just
>> > failing.
>> >
>> > Let's introduce a new `--allow-missing-tips` option to make it work
>> > like this.
>>
>> An obvious question is if this even needs to be a new option.  What
>> are expected use cases where --missing=print without this option is
>> beneficial?
>
> I am not sure if such a case is really beneficial but some
> people/script/forges might rely on an error from `git rev-list
> --missing=print` to propagate back an error to some user interface.

I currently learn toward just making the new flag's behavior be absorved
into the existing "--missing=..." flag. Nevertheless, you raise an
interesting concern.

Perhaps a compromise would be to make "--missing=..." learn the new
behavior of this patch as Junio suggested, but to introduce a new flag,
something like "--fail-on-missing-tips" to fail early if any of the tip
commits' objects are missing? That way we could keep the current
"strict" behavior of complaining if we feed rev-list any tips whose
objects are missing. And for the vast majority of cases the
"--missing=..." flag could (intuitively) gracefully handle tips with
missing objects and you wouldn't have to pass in the additional flag.

IOW, make the minority (certainly not majority, I think?) of users who
really need the error propagation use the (new) extra flag, while the
rest of us (including the version of you who was surprised by the
limited behavior of "--missing=...", enough to write this series) don't
have to.

Thanks.
Christian Couder Feb. 7, 2024, 4:11 p.m. UTC | #9
On Wed, Feb 7, 2024 at 10:40 AM Linus Arver <linusa@google.com> wrote:
>
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > In 9830926c7d (rev-list: add commit object support in `--missing`
> > option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
> > so that it now works with commits too.
> >
> > Unfortunately, such a command would still fail with a "fatal: bad
> > object <oid>" if it is passed a missing commit, blob or tree as an
> > argument.
>
> Hmm, but according to the manpage for rev-list, it only accepts commits
> as arguments. Conflict...?

It actually kind of "works" when you pass blobs or trees. It looks
like the command just doesn't use them (except for checking that they
actually exist) unless options like --objects, --missing=print and
perhaps others are used. So yeah, the doc might need an update, but I
think it could be a separate issue, as it's not new and doesn't depend
on this small series.

> Also it took me a second to realize that you are talking about
> missing objects passed in at the command line to `git rev-list` and not
> the objects that this command encounters during the normal rev walking
> process. It would have been a touch nicer to say something more
> explicit, like
>
>     In 9830926c7d (rev-list: add commit object support in `--missing`
>     option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
>     so that it works with missing commits, not just blobs/trees.
>
>     Unfortunately, it would still fail with a "fatal: bad
>     object <oid>" if it is passed a missing commit as an
>     argument (before the rev walking even begins).

Thanks, I have used your suggestions in my current version, except
that I still use "passed a missing commit, blob or tree" instead of
"passed a missing commit" as the command also accepts blobs and trees.

> > When such a command is used to find the dependencies of some objects,
> > for example the dependencies of quarantined objects, it would be
> > better if the command would instead consider such missing objects,
> > especially commits, in the same way as other missing objects.
>
> Could you define what quarantined objects are (it's not in the manpage
> for rev-list)?

"quarantined objects" refers to the following doc:

https://www.git-scm.com/docs/git-receive-pack#_quarantine_environment

So to clarify things, the above paragraph looks like the following now:

"When such a command is used to find the dependencies of some objects,
for example the dependencies of quarantined objects (see the
"QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
documentation), it would be better if the command would instead
consider such missing objects, especially commits, in the same way as
other missing objects."

> But also, I'm not sure how much additional value this
> paragraph is adding on top of what you already said in the first two
> paragraphs.

It's a more specific example, and it's how we detected the issue we
would like to address, so I think it has some value. I am Ok with
removing it, if others don't see some value in it though.

> > If, for example `--missing=print` is used, it would be nice for some
> > use cases if the missing tips passed as arguments were reported in
> > the same way as other missing objects instead of the command just
> > failing.
> >
> > Let's introduce a new `--allow-missing-tips` option to make it work
> > like this.
>
> I assume "tips" means "the starting commits which are passed into this
> command from which it begins the rev walk". Might be worth including in
> the commit message.

I think "tips" is better than "commits" because blobs and trees are
allowed (see above).

> But also, it's curious that you chose the word "allow" when we already
> have "--ignore-missing". I assume this is because you still want the
> missing tips to still be displayed in the output, and not ignored
> (omitted from processing altogether). Perhaps "--report-missing-tips"
> is more explicit?

I think "allow" is better as it makes it clearer that the command will
not fail while it should fail without the option.

Anyway I think I will implement Junio's suggestion to just avoid
adding a new option and just change the behavior of the command when
--missing=[print|allow*] is used.

> > @@ -753,9 +765,19 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
> >
> >       if (arg_print_omitted)
> >               oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
> > -     if (arg_missing_action == MA_PRINT)
> > +     if (arg_missing_action == MA_PRINT) {
> > +             struct oidset_iter iter;
> > +             struct object_id *oid;
> > +
> >               oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
> >
> > +             /* Already add missing commits */
>
> Did you mean "Add already-missing commits"? Perhaps even more explicit
> would be "Add missing tips"?

Yeah, right. I have changed it in my current version.

> > +             oidset_iter_init(&revs.missing_commits, &iter);
> > +             while ((oid = oidset_iter_next(&iter)))
> > +                     oidset_insert(&missing_objects, oid);
> > +             oidset_clear(&revs.missing_commits);
> > +     }
>
> Aside: I am surprised that there is no helper function already that
> simply copies things in one oidset into another oidset.

Yeah, I was surprised too. I only found the following similar code in
list-objects-filter.c:

    oidset_iter_init(src, &iter);
    while ((src_oid = oidset_iter_next(&iter)) != NULL)
        oidset_insert(dest, src_oid);

So yeah perhaps we could create such a helper function.

> >       traverse_commit_list_filtered(
> >               &revs, show_commit, show_object, &info,
> >               (arg_print_omitted ? &omitted_objects : NULL));
> > diff --git a/revision.c b/revision.c
> > index 4c5cd7c3ce..9f25faa249 100644
> > --- a/revision.c
> > +++ b/revision.c
> >
> > [...]
> >
> > @@ -2184,7 +2189,7 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
> >               verify_non_filename(revs->prefix, arg);
> >       object = get_reference(revs, arg, &oid, flags ^ local_flags);
> >       if (!object)
> > -             return revs->ignore_missing ? 0 : -1;
> > +             return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
> >       add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
> >       add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
> >       free(oc.path);
>
> With a few more context lines, the diff looks like
>
> --8<---------------cut here---------------start------------->8---
>         if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
>                 return revs->ignore_missing ? 0 : -1;
>         if (!cant_be_filename)
>                 verify_non_filename(revs->prefix, arg);
>         object = get_reference(revs, arg, &oid, flags ^ local_flags);
>         if (!object)
> -               return revs->ignore_missing ? 0 : -1;
> +               return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
>         add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>         add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>         free(oc.path);
>         return 0;
> --8<---------------cut here---------------end--------------->8---
>
> and it's not obvious to me why you only touched the "if (!object)" case
> but not the "if (get_oid_with_context(...))" case. Are there any
> subtleties here that would not be obvious to reviewers?

I thought that if we can't get an oid, we cannot put anything in the
'missing_commit' oidset anyway, so it might be better to error out.

> > @@ -3830,8 +3835,6 @@ int prepare_revision_walk(struct rev_info *revs)
> >                                      FOR_EACH_OBJECT_PROMISOR_ONLY);
> >       }
> >
> > -     oidset_init(&revs->missing_commits, 0);
> > -
> >       if (!revs->reflog_info)
> >               prepare_to_use_bloom_filter(revs);
> >       if (!revs->unsorted_input)
> > diff --git a/revision.h b/revision.h
> > index 94c43138bc..67435a5d8a 100644
> > --- a/revision.h
> > +++ b/revision.h
> > @@ -227,6 +227,14 @@ struct rev_info {
> >                        */
> >                       do_not_die_on_missing_objects:1,
> >
> > +                     /*
> > +                      * When the do_not_die_on_missing_objects flag above is set,
> > +                      * a rev walk could still die with "fatal: bad object <oid>"
> > +                      * if one of the tips it is passed is missing. With this flag
> > +                      * such a tip will be reported as missing too.
> > +                      */
> > +                      do_not_die_on_missing_tips:1,
> > +
> >                       /* for internal use only */
> >                       exclude_promisor_objects:1;
>
> IIUC, we won't get to even do the rev walk though if all tips are
> missing. So perhaps a more precise comment would be
>
>     /*
>      * When the do_not_die_on_missing_objects flag above is set,
>      * we could prematurely die with "fatal: bad object <oid>"
>      * if one of the tips it is passed is missing before even getting to
>      * the rev walk. With this flag such a tip will be reported as
>      * missing in the output after the rev walk completes.
>      */

Thanks for the suggestion, I have now:

            /*
             * When the do_not_die_on_missing_objects flag above is set,
             * we could prematurely die with "fatal: bad object <oid>"
             * if one of the tips passed to a rev walk is missing, before
             * the rev walk even starts. With this flag such a tip will
             * be reported as missing in the output after the rev walk
             * completes.
             */
             do_not_die_on_missing_tips:1,


> > diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
> > index 527aa94f07..283e8fc2c2 100755
> > --- a/t/t6022-rev-list-missing.sh
> > +++ b/t/t6022-rev-list-missing.sh
> > @@ -77,4 +77,55 @@ do
> >       done
> >  done
> >
> > +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
> > +do
> > +     for tip in "" "HEAD"
>
> So far from the patch diff it was not obvious that you wanted to check
> for the empty string as a "tip".

We want to check that things work as expected both:

  - when we pass only one missing tip, and:
  - when we pass one missing tip and a tip that is not missing.

We are setting 'tip' to "" in the former case so that when we will use
"$oid $tip" in the `git rev-list --missing=$action
--allow-missing-tips ...` command below, it will be like we passed
only "$oid" to the command.

> > +     do
> > +             for action in "allow-any" "print"
> > +             do
> > +                     test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '
>
> If I run this test without the new --allow-missing-tips flag, I get
>
>     fatal: bad object 82335b23aa7234320d6f8055243c852e4b6a5ca3
>     not ok 11 - --missing=allow-any --allow-missing-tips with tip 'HEAD~1' missing and tip ''
>
> and I think the last "tip ''" part is misleading because IIUC it's not
> actually passed in as a tip (see my comment below about shell quoting).
>
> > +                             oid="$(git rev-parse $obj)" &&
> > +                             path=".git/objects/$(test_oid_to_path $oid)" &&
> > +
> > +                             # Before the object is made missing, we use rev-list to
> > +                             # get the expected oids.
> > +                             if [ "$tip" = "HEAD" ]; then
> > +                                     git rev-list --objects --no-object-names \
> > +                                             HEAD ^$obj >expect.raw
> > +                             else
> > +                                     >expect.raw
> > +                             fi &&
>
> OK so you are using this empty string to clear the expect.raw file. If
> that's true then what stops us from doing this at the start of every
> iteration?

I am not sure what you mean. Both:

    git rev-list --objects --no-object-names HEAD ^$obj >expect.raw

and:

    >expect.raw #2

are clearing "expect.raw" as ">expect.raw" is used in both cases.

If we did the latter at the start of every iteration, then when we do
the former afterwards, we would clear "expect.raw" raw again, while
there is no point in clearing it twice.

> > +                             # Blobs are shared by all commits, so even though a commit/tree
> > +                             # might be skipped, its blob must be accounted for.
> > +                             if [ "$tip" = "HEAD" ] && [ $obj != "HEAD:1.t" ]; then
> > +                                     echo $(git rev-parse HEAD:1.t) >>expect.raw &&
> > +                                     echo $(git rev-parse HEAD:2.t) >>expect.raw
> > +                             fi &&
> > +
> > +                             mv "$path" "$path.hidden" &&
> > +                             test_when_finished "mv $path.hidden $path" &&
> > +
> > +                             git rev-list --missing=$action --allow-missing-tips \
> > +                                  --objects --no-object-names $oid $tip >actual.raw &&
>
> The fact that $tip is not quoted here means that this is equivalent to
>
>     --objects --no-object-names $oid >actual.raw &&
>
> and not
>
>     --objects --no-object-names $oid "" >actual.raw &&

Yeah, right, and that's what we want.

> for the case where tip is the empty string. I wonder if we could avoid
> being nuanced here with subtle shell quoting rules, especially because
> these are tests (where making everything as explicit as possible is
> desirable).

I think it's not very subtle, but perhaps a comment would help. So I
added the following to my current version before the loop:

    # We want to check that things work when both
    #   - all the tips passed are missing (case tip = ""), and
    #   - there is one missing tip and one existing tip (case tip = "HEAD")

> > +                             # When the action is to print, we should also add the missing
> > +                             # oid to the expect list.
> > +                             case $action in
> > +                             allow-any)
> > +                                     ;;
> > +                             print)
> > +                                     grep ?$oid actual.raw &&
> > +                                     echo ?$oid >>expect.raw
> > +                                     ;;
> > +                             esac &&
> > +
> > +                             sort actual.raw >actual &&
> > +                             sort expect.raw >expect &&
> > +                             test_cmp expect actual
> > +                     '
> > +             done
> > +     done
> > +done
> > +
>
> Wait, but where are we actually making the $tip commit's object
> _missing_? Hmm...
>
> Ah, actually the tips are just the $obj variables. So it seems like you
> could have done
>
>     for tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>
> in the beginning to make it more obvious.

The previous tests in this test script used "oid" as the variable name
for possibly missing objects, so I did the same in the tests I added.
I could have used "tip" and "extra_tip" instead of "oid" and "tip" or
perhaps "oid" and "extra_oid", but I don't think it makes a big
difference in understanding these tests.

> Also, given how most of this is identical from the loop already in t6022
> just above it, it would help to add a comment at the top of this one
> explaining how it's different than the one above it.

The one I added has an extra `for tip in "" "HEAD"` loop. Anyway I
think I will just modify the existing test in the next version I will
send, as I plan to implement Junio's suggestion and there will be no
new option.

Thanks for the review!
Junio C Hamano Feb. 7, 2024, 4:34 p.m. UTC | #10
Linus Arver <linusa@google.com> writes:

> IOW, make the minority (certainly not majority, I think?) of users who
> really need the error propagation use the (new) extra flag, while the
> rest of us (including the version of you who was surprised by the
> limited behavior of "--missing=...", enough to write this series) don't
> have to.

I am skeptical that we even want that.  Those who want to use the
"--missing" and care about special casing missing starting points
can easily check with things like "cat-file -e" before even running
the "rev-list --missing".
Christian Couder Feb. 7, 2024, 4:38 p.m. UTC | #11
On Wed, Feb 7, 2024 at 10:57 AM Linus Arver <linusa@google.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > On Thu, Feb 1, 2024 at 10:27 PM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Christian Couder <christian.couder@gmail.com> writes:
> >>
> >> > When such a command is used to find the dependencies of some objects,
> >> > for example the dependencies of quarantined objects, it would be
> >> > better if the command would instead consider such missing objects,
> >> > especially commits, in the same way as other missing objects.
> >> >
> >> > If, for example `--missing=print` is used, it would be nice for some
> >> > use cases if the missing tips passed as arguments were reported in
> >> > the same way as other missing objects instead of the command just
> >> > failing.
> >> >
> >> > Let's introduce a new `--allow-missing-tips` option to make it work
> >> > like this.
> >>
> >> An obvious question is if this even needs to be a new option.  What
> >> are expected use cases where --missing=print without this option is
> >> beneficial?
> >
> > I am not sure if such a case is really beneficial but some
> > people/script/forges might rely on an error from `git rev-list
> > --missing=print` to propagate back an error to some user interface.
>
> I currently learn toward just making the new flag's behavior be absorved
> into the existing "--missing=..." flag.

Ok, then I am going to implement that in the next version I will send.

> Nevertheless, you raise an
> interesting concern.
>
> Perhaps a compromise would be to make "--missing=..." learn the new
> behavior of this patch as Junio suggested, but to introduce a new flag,
> something like "--fail-on-missing-tips" to fail early if any of the tip
> commits' objects are missing? That way we could keep the current
> "strict" behavior of complaining if we feed rev-list any tips whose
> objects are missing. And for the vast majority of cases the
> "--missing=..." flag could (intuitively) gracefully handle tips with
> missing objects and you wouldn't have to pass in the additional flag.
>
> IOW, make the minority (certainly not majority, I think?) of users who
> really need the error propagation use the (new) extra flag, while the
> rest of us (including the version of you who was surprised by the
> limited behavior of "--missing=...", enough to write this series) don't
> have to.

I agree that the majority of users would prefer `git rev-list` with
"--missing=<arg>" (when <arg> is not "error") not to error out when
one of the tips is missing. And yeah, indeed at GitLab, we are among
this majority of users. I was worried about a small minority relying
on the fact that it would error out in such case. But maybe we don't
need to care unless it appears that this minority actually exists.
Linus Arver Feb. 7, 2024, 8:48 p.m. UTC | #12
Christian Couder <christian.couder@gmail.com> writes:

> On Wed, Feb 7, 2024 at 10:40 AM Linus Arver <linusa@google.com> wrote:
>>
>>
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>> > In 9830926c7d (rev-list: add commit object support in `--missing`
>> > option, 2023-10-27) we fixed the `--missing` option in `git rev-list`
>> > so that it now works with commits too.
>> >
>> > Unfortunately, such a command would still fail with a "fatal: bad
>> > object <oid>" if it is passed a missing commit, blob or tree as an
>> > argument.
>>
>> Hmm, but according to the manpage for rev-list, it only accepts commits
>> as arguments. Conflict...?
>
> It actually kind of "works" when you pass blobs or trees. It looks
> like the command just doesn't use them (except for checking that they
> actually exist) unless options like --objects, --missing=print and
> perhaps others are used. So yeah, the doc might need an update, but I
> think it could be a separate issue, as it's not new and doesn't depend
> on this small series.

SG. But also, if there's a way to put invisible (developer-facing, not
user facing) comments inside the relevant doc file it might be easy
enough to add a to this series.

>> > When such a command is used to find the dependencies of some objects,
>> > for example the dependencies of quarantined objects, it would be
>> > better if the command would instead consider such missing objects,
>> > especially commits, in the same way as other missing objects.
>>
>> Could you define what quarantined objects are (it's not in the manpage
>> for rev-list)?
>
> "quarantined objects" refers to the following doc:
>
> https://www.git-scm.com/docs/git-receive-pack#_quarantine_environment
>
> So to clarify things, the above paragraph looks like the following now:
>
> "When such a command is used to find the dependencies of some objects,
> for example the dependencies of quarantined objects (see the
> "QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
> documentation), it would be better if the command would instead
> consider such missing objects, especially commits, in the same way as
> other missing objects."

This reads much better, and is a good motivation to keep in the log
message.

>> But also, I'm not sure how much additional value this
>> paragraph is adding on top of what you already said in the first two
>> paragraphs.
>
> It's a more specific example, and it's how we detected the issue we
> would like to address, so I think it has some value.

Agreed.

>> > If, for example `--missing=print` is used, it would be nice for some
>> > use cases if the missing tips passed as arguments were reported in
>> > the same way as other missing objects instead of the command just
>> > failing.
>> >
>> > Let's introduce a new `--allow-missing-tips` option to make it work
>> > like this.
>>
>> I assume "tips" means "the starting commits which are passed into this
>> command from which it begins the rev walk". Might be worth including in
>> the commit message.
>
> I think "tips" is better than "commits" because blobs and trees are
> allowed (see above).

Makes sense.

>> > +             oidset_iter_init(&revs.missing_commits, &iter);
>> > +             while ((oid = oidset_iter_next(&iter)))
>> > +                     oidset_insert(&missing_objects, oid);
>> > +             oidset_clear(&revs.missing_commits);
>> > +     }
>>
>> Aside: I am surprised that there is no helper function already that
>> simply copies things in one oidset into another oidset.
>
> Yeah, I was surprised too. I only found the following similar code in
> list-objects-filter.c:
>
>     oidset_iter_init(src, &iter);
>     while ((src_oid = oidset_iter_next(&iter)) != NULL)
>         oidset_insert(dest, src_oid);
>
> So yeah perhaps we could create such a helper function.

Perhaps a NEEDSWORK comment is appropriate?

>> >       traverse_commit_list_filtered(
>> >               &revs, show_commit, show_object, &info,
>> >               (arg_print_omitted ? &omitted_objects : NULL));
>> > diff --git a/revision.c b/revision.c
>> > index 4c5cd7c3ce..9f25faa249 100644
>> > --- a/revision.c
>> > +++ b/revision.c
>> >
>> > [...]
>> >
>> > @@ -2184,7 +2189,7 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
>> >               verify_non_filename(revs->prefix, arg);
>> >       object = get_reference(revs, arg, &oid, flags ^ local_flags);
>> >       if (!object)
>> > -             return revs->ignore_missing ? 0 : -1;
>> > +             return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
>> >       add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>> >       add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>> >       free(oc.path);
>>
>> With a few more context lines, the diff looks like
>>
>> --8<---------------cut here---------------start------------->8---
>>         if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
>>                 return revs->ignore_missing ? 0 : -1;
>>         if (!cant_be_filename)
>>                 verify_non_filename(revs->prefix, arg);
>>         object = get_reference(revs, arg, &oid, flags ^ local_flags);
>>         if (!object)
>> -               return revs->ignore_missing ? 0 : -1;
>> +               return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
>>         add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
>>         add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
>>         free(oc.path);
>>         return 0;
>> --8<---------------cut here---------------end--------------->8---
>>
>> and it's not obvious to me why you only touched the "if (!object)" case
>> but not the "if (get_oid_with_context(...))" case. Are there any
>> subtleties here that would not be obvious to reviewers?
>
> I thought that if we can't get an oid, we cannot put anything in the
> 'missing_commit' oidset anyway, so it might be better to error out.

Ah, makes sense. This is a subtle detail, and perhaps worth keeping
either as a comment (just above the "if (get_oid_with_context(...))"
case) or in the log message.

>> > diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
>> > index 527aa94f07..283e8fc2c2 100755
>> > --- a/t/t6022-rev-list-missing.sh
>> > +++ b/t/t6022-rev-list-missing.sh
>> > @@ -77,4 +77,55 @@ do
>> >       done
>> >  done
>> >
>> > +for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>> > +do
>> > +     for tip in "" "HEAD"
>>
>> So far from the patch diff it was not obvious that you wanted to check
>> for the empty string as a "tip".
>
> We want to check that things work as expected both:
>
>   - when we pass only one missing tip, and:
>   - when we pass one missing tip and a tip that is not missing.

Ah, I see. I think you could add a comment like

    We want to check that things work as expected both:
    
      - when we pass only one missing tip (when $tip is ""), and:
      - when we pass one missing tip and a tip that is not missing (when
        $tip is "HEAD").

at the top of the test case, and probably rename $obj to $missing_tip,
and rename $tip to $existing_tip.

Aside: it's a bit unfortunate that the meaning of "missing" becomes
overloaded slightly here because one could say '$tip == ""' is a
"missing" tip becauuse the name is not provided. Subtle. Plus there's
some interplay with the "if (get_oid_with_context(...))" case above
because we will (?) hit that path if we do pass in "" (empty string arg)
as a tip to rev-list. It might be worth saying in the comments in the
implementation, something like

    The term "missing" here means the case where we could not find the object
    given the object_id. For example, given HEAD~1, its object id (oid)
    could be 82335b23aa7234320d6f8055243c852e4b6a5ca3. If no real object
    with this oid exists, it is considered missing. Providing an empty
    string to rev-list does not touch the "missing tip" codepath.

I wrote the above hastily so it may need further edits to make it
succinct. But the point is that this definition isn't spelled out in the
test case, which makes the "" argument for $tip that much more subtle.

>> > +     do
>> > +             for action in "allow-any" "print"
>> > +             do
>> > +                     test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '
>>
>> If I run this test without the new --allow-missing-tips flag, I get
>>
>>     fatal: bad object 82335b23aa7234320d6f8055243c852e4b6a5ca3
>>     not ok 11 - --missing=allow-any --allow-missing-tips with tip 'HEAD~1' missing and tip ''
>>
>> and I think the last "tip ''" part is misleading because IIUC it's not
>> actually passed in as a tip (see my comment below about shell quoting).
>>
>> > +                             oid="$(git rev-parse $obj)" &&
>> > +                             path=".git/objects/$(test_oid_to_path $oid)" &&
>> > +
>> > +                             # Before the object is made missing, we use rev-list to
>> > +                             # get the expected oids.
>> > +                             if [ "$tip" = "HEAD" ]; then
>> > +                                     git rev-list --objects --no-object-names \
>> > +                                             HEAD ^$obj >expect.raw
>> > +                             else
>> > +                                     >expect.raw
>> > +                             fi &&
>>
>> OK so you are using this empty string to clear the expect.raw file. If
>> that's true then what stops us from doing this at the start of every
>> iteration?
>
> I am not sure what you mean. Both:
>
>     git rev-list --objects --no-object-names HEAD ^$obj >expect.raw
>
> and:
>
>     >expect.raw #2
>
> are clearing "expect.raw" as ">expect.raw" is used in both cases.
>
> If we did the latter at the start of every iteration, then when we do
> the former afterwards, we would clear "expect.raw" raw again, while
> there is no point in clearing it twice.

Yes but doing it that way would separate "set up a clean environment for
this test case" from "find the oid of the non-missing tip" from each
other in the same if/else stanza, which I believe helps readability.

>> for the case where tip is the empty string. I wonder if we could avoid
>> being nuanced here with subtle shell quoting rules, especially because
>> these are tests (where making everything as explicit as possible is
>> desirable).
>
> I think it's not very subtle, but perhaps a comment would help. So I
> added the following to my current version before the loop:
>
>     # We want to check that things work when both
>     #   - all the tips passed are missing (case tip = ""), and
>     #   - there is one missing tip and one existing tip (case tip = "HEAD")

Great, thanks.

>> > +                             # When the action is to print, we should also add the missing
>> > +                             # oid to the expect list.
>> > +                             case $action in
>> > +                             allow-any)
>> > +                                     ;;
>> > +                             print)
>> > +                                     grep ?$oid actual.raw &&
>> > +                                     echo ?$oid >>expect.raw
>> > +                                     ;;
>> > +                             esac &&
>> > +
>> > +                             sort actual.raw >actual &&
>> > +                             sort expect.raw >expect &&
>> > +                             test_cmp expect actual
>> > +                     '
>> > +             done
>> > +     done
>> > +done
>> > +
>>
>> Wait, but where are we actually making the $tip commit's object
>> _missing_? Hmm...
>>
>> Ah, actually the tips are just the $obj variables. So it seems like you
>> could have done
>>
>>     for tip in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
>>
>> in the beginning to make it more obvious.
>
> The previous tests in this test script used "oid" as the variable name
> for possibly missing objects, so I did the same in the tests I added.

Didn't the previous tests _always_ make those objects missing?

> I could have used "tip" and "extra_tip" instead of "oid" and "tip" or
> perhaps "oid" and "extra_oid", but I don't think it makes a big
> difference in understanding these tests.

I agree that it doesn't make a big difference now (to me at least), but
I worry for the other future Git developers who'll need to do that small
amount of mental effort to gain the same understanding as us in this
review process. I would still prefer the names "missing_tip" (or
"possibly_missing_tip" if my immediate comment above is not correct) and
"existing_tip" as suggested in my other comment.

>> Also, given how most of this is identical from the loop already in t6022
>> just above it, it would help to add a comment at the top of this one
>> explaining how it's different than the one above it.
>
> The one I added has an extra `for tip in "" "HEAD"` loop. Anyway I
> think I will just modify the existing test in the next version I will
> send, as I plan to implement Junio's suggestion and there will be no
> new option.

SG.

> Thanks for the review!

You're welcome.

Now that I have your attention (was this my plan all along? ;) ), I will
take this opportunity to ping you directly for a review of my own patch
series for the trailers subsystem:
https://lore.kernel.org/git/pull.1632.v4.git.1707196348.gitgitgadget@gmail.com/
which is in its 4th iteration after many helpful comments from Junio.
Please don't let the patch count (28) raise any alarms --- they used to
be 10 but I broke them down into smaller steps to ease the review
process.

Thanks.
Christian Couder Feb. 8, 2024, 3:03 p.m. UTC | #13
On Wed, Feb 7, 2024 at 9:48 PM Linus Arver <linusa@google.com> wrote:
> Christian Couder <christian.couder@gmail.com> writes:

> > It actually kind of "works" when you pass blobs or trees. It looks
> > like the command just doesn't use them (except for checking that they
> > actually exist) unless options like --objects, --missing=print and
> > perhaps others are used. So yeah, the doc might need an update, but I
> > think it could be a separate issue, as it's not new and doesn't depend
> > on this small series.
>
> SG. But also, if there's a way to put invisible (developer-facing, not
> user facing) comments inside the relevant doc file it might be easy
> enough to add a to this series.

It might seem easy to add/improve some doc, but there is sometimes
bikeshedding. So I don't think making this series dependent on such a
dco update is a good thing for both that doc update and this series. I
will try to work on such a doc update soon though.

> > "quarantined objects" refers to the following doc:
> >
> > https://www.git-scm.com/docs/git-receive-pack#_quarantine_environment
> >
> > So to clarify things, the above paragraph looks like the following now:
> >
> > "When such a command is used to find the dependencies of some objects,
> > for example the dependencies of quarantined objects (see the
> > "QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
> > documentation), it would be better if the command would instead
> > consider such missing objects, especially commits, in the same way as
> > other missing objects."
>
> This reads much better, and is a good motivation to keep in the log
> message.

Ok, I have kept it in the V2 I just sent:

https://lore.kernel.org/git/20240208135055.2705260-1-christian.couder@gmail.com/

By the way, sorry for forgetting to use the --in-reply-to=... option
when I sent it.

> > Yeah, I was surprised too. I only found the following similar code in
> > list-objects-filter.c:
> >
> >     oidset_iter_init(src, &iter);
> >     while ((src_oid = oidset_iter_next(&iter)) != NULL)
> >         oidset_insert(dest, src_oid);
> >
> > So yeah perhaps we could create such a helper function.
>
> Perhaps a NEEDSWORK comment is appropriate?

I actually added another small patch in the V2 that refactors the
existing code into a function in oidset.{c,h}.


> >> With a few more context lines, the diff looks like
> >>
> >> --8<---------------cut here---------------start------------->8---
> >>         if (get_oid_with_context(revs->repo, arg, get_sha1_flags, &oid, &oc))
> >>                 return revs->ignore_missing ? 0 : -1;
> >>         if (!cant_be_filename)
> >>                 verify_non_filename(revs->prefix, arg);
> >>         object = get_reference(revs, arg, &oid, flags ^ local_flags);
> >>         if (!object)
> >> -               return revs->ignore_missing ? 0 : -1;
> >> +               return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
> >>         add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
> >>         add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
> >>         free(oc.path);
> >>         return 0;
> >> --8<---------------cut here---------------end--------------->8---
> >>
> >> and it's not obvious to me why you only touched the "if (!object)" case
> >> but not the "if (get_oid_with_context(...))" case. Are there any
> >> subtleties here that would not be obvious to reviewers?
> >
> > I thought that if we can't get an oid, we cannot put anything in the
> > 'missing_commit' oidset anyway, so it might be better to error out.
>
> Ah, makes sense. This is a subtle detail, and perhaps worth keeping
> either as a comment (just above the "if (get_oid_with_context(...))"
> case) or in the log message.

I added a code comment just above the "if (get_oid_with_context(...))"
case in the V2.


> Ah, I see. I think you could add a comment like
>
>     We want to check that things work as expected both:
>
>       - when we pass only one missing tip (when $tip is ""), and:
>       - when we pass one missing tip and a tip that is not missing (when
>         $tip is "HEAD").
>
> at the top of the test case, and probably rename $obj to $missing_tip,
> and rename $tip to $existing_tip.

I have renamed the variables like you suggest and added a code comment.

> Aside: it's a bit unfortunate that the meaning of "missing" becomes
> overloaded slightly here because one could say '$tip == ""' is a
> "missing" tip becauuse the name is not provided. Subtle. Plus there's
> some interplay with the "if (get_oid_with_context(...))" case above
> because we will (?) hit that path if we do pass in "" (empty string arg)
> as a tip to rev-list. It might be worth saying in the comments in the
> implementation, something like
>
>     The term "missing" here means the case where we could not find the object
>     given the object_id. For example, given HEAD~1, its object id (oid)
>     could be 82335b23aa7234320d6f8055243c852e4b6a5ca3. If no real object
>     with this oid exists, it is considered missing. Providing an empty
>     string to rev-list does not touch the "missing tip" codepath.
>
> I wrote the above hastily so it may need further edits to make it
> succinct. But the point is that this definition isn't spelled out in the
> test case, which makes the "" argument for $tip that much more subtle.

I think this is related to the --missing option in general (which has
been with us for around 6 years already), not specifically to this
series, so I think it can be done separately.

> >> OK so you are using this empty string to clear the expect.raw file. If
> >> that's true then what stops us from doing this at the start of every
> >> iteration?
> >
> > I am not sure what you mean. Both:
> >
> >     git rev-list --objects --no-object-names HEAD ^$obj >expect.raw
> >
> > and:
> >
> >     >expect.raw #2
> >
> > are clearing "expect.raw" as ">expect.raw" is used in both cases.
> >
> > If we did the latter at the start of every iteration, then when we do
> > the former afterwards, we would clear "expect.raw" raw again, while
> > there is no point in clearing it twice.
>
> Yes but doing it that way would separate "set up a clean environment for
> this test case" from "find the oid of the non-missing tip" from each
> other in the same if/else stanza, which I believe helps readability.

On one hand it can help readability, but on other hand some people
interested in test performance might see it as some waste. So I prefer
not to do it for now.

> >> Also, given how most of this is identical from the loop already in t6022
> >> just above it, it would help to add a comment at the top of this one
> >> explaining how it's different than the one above it.
> >
> > The one I added has an extra `for tip in "" "HEAD"` loop. Anyway I
> > think I will just modify the existing test in the next version I will
> > send, as I plan to implement Junio's suggestion and there will be no
> > new option.

Actually I changed my mind and didn't modify the existing test, but I
renamed the variables as you suggested.

> Now that I have your attention (was this my plan all along? ;) ), I will
> take this opportunity to ping you directly for a review of my own patch
> series for the trailers subsystem:
> https://lore.kernel.org/git/pull.1632.v4.git.1707196348.gitgitgadget@gmail.com/
> which is in its 4th iteration after many helpful comments from Junio.
> Please don't let the patch count (28) raise any alarms --- they used to
> be 10 but I broke them down into smaller steps to ease the review
> process.

I will try to take a look soon. Thanks for working on improving the
trailers subsystem!
Linus Arver Feb. 8, 2024, 8:42 p.m. UTC | #14
Christian Couder <christian.couder@gmail.com> writes:

> On Wed, Feb 7, 2024 at 9:48 PM Linus Arver <linusa@google.com> wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>
>> > It actually kind of "works" when you pass blobs or trees. It looks
>> > like the command just doesn't use them (except for checking that they
>> > actually exist) unless options like --objects, --missing=print and
>> > perhaps others are used. So yeah, the doc might need an update, but I
>> > think it could be a separate issue, as it's not new and doesn't depend
>> > on this small series.
>>
>> SG. But also, if there's a way to put invisible (developer-facing, not
>> user facing) comments inside the relevant doc file it might be easy
>> enough to add a to this series.
>
> It might seem easy to add/improve some doc, but there is sometimes
> bikeshedding. So I don't think making this series dependent on such a
> dco update is a good thing for both that doc update and this series. I
> will try to work on such a doc update soon though.

Oh, I did not mean to say you should add doc updates in this series at
all. I was just asking if you could add a comment with "NEEDSWORK" or
similar that only developers will see. Looks like asciidoc does support
such comments that are left out of the published text:
https://docs.asciidoctor.org/asciidoc/latest/comments/. I just tested
with "//" style comments (asciidoc has other styles also, not sure which
one is preferred in our codebase) and running "make" to generate the
docs confirm that we can indeed put in such developer-facing comments.

Back to this topic, you could add something like "// NEEDSWORK: rev-list
can take blob and tree objects also as arguments, not just commits".
This is why I thought it would be easy.

>> > "quarantined objects" refers to the following doc:
>> >
>> > https://www.git-scm.com/docs/git-receive-pack#_quarantine_environment
>> >
>> > So to clarify things, the above paragraph looks like the following now:
>> >
>> > "When such a command is used to find the dependencies of some objects,
>> > for example the dependencies of quarantined objects (see the
>> > "QUARANTINE ENVIRONMENT" section in the git-receive-pack(1)
>> > documentation), it would be better if the command would instead
>> > consider such missing objects, especially commits, in the same way as
>> > other missing objects."
>>
>> This reads much better, and is a good motivation to keep in the log
>> message.
>
> Ok, I have kept it in the V2 I just sent:
>
> https://lore.kernel.org/git/20240208135055.2705260-1-christian.couder@gmail.com/
>
> By the way, sorry for forgetting to use the --in-reply-to=... option
> when I sent it.

NP, thanks for the heads up. See you in the new thread.

>> Aside: it's a bit unfortunate that the meaning of "missing" becomes
>> overloaded slightly here because one could say '$tip == ""' is a
>> "missing" tip becauuse the name is not provided. Subtle. Plus there's
>> some interplay with the "if (get_oid_with_context(...))" case above
>> because we will (?) hit that path if we do pass in "" (empty string arg)
>> as a tip to rev-list. It might be worth saying in the comments in the
>> implementation, something like
>>
>>     The term "missing" here means the case where we could not find the object
>>     given the object_id. For example, given HEAD~1, its object id (oid)
>>     could be 82335b23aa7234320d6f8055243c852e4b6a5ca3. If no real object
>>     with this oid exists, it is considered missing. Providing an empty
>>     string to rev-list does not touch the "missing tip" codepath.
>>
>> I wrote the above hastily so it may need further edits to make it
>> succinct. But the point is that this definition isn't spelled out in the
>> test case, which makes the "" argument for $tip that much more subtle.
>
> I think this is related to the --missing option in general (which has
> been with us for around 6 years already), not specifically to this
> series, so I think it can be done separately.

Ah, thanks for the added context. I agree.

>> >> OK so you are using this empty string to clear the expect.raw file. If
>> >> that's true then what stops us from doing this at the start of every
>> >> iteration?
>> >
>> > I am not sure what you mean. Both:
>> >
>> >     git rev-list --objects --no-object-names HEAD ^$obj >expect.raw
>> >
>> > and:
>> >
>> >     >expect.raw #2
>> >
>> > are clearing "expect.raw" as ">expect.raw" is used in both cases.
>> >
>> > If we did the latter at the start of every iteration, then when we do
>> > the former afterwards, we would clear "expect.raw" raw again, while
>> > there is no point in clearing it twice.
>>
>> Yes but doing it that way would separate "set up a clean environment for
>> this test case" from "find the oid of the non-missing tip" from each
>> other in the same if/else stanza, which I believe helps readability.
>
> On one hand it can help readability, but on other hand some people
> interested in test performance might see it as some waste. So I prefer
> not to do it for now.

In most situations I would agree with "let's hurt test performance by
1%, in exchange for better readability". I think this is one such
situation. And also, it's not even clear if we can measure the
performance hit from an extra ">expect.raw" call.

But as this is a small nit, I won't pursue this suggestion further.

>> Now that I have your attention (was this my plan all along? ;) ), I will
>> take this opportunity to ping you directly for a review of my own patch
>> series for the trailers subsystem:
>> https://lore.kernel.org/git/pull.1632.v4.git.1707196348.gitgitgadget@gmail.com/
>> which is in its 4th iteration after many helpful comments from Junio.
>> Please don't let the patch count (28) raise any alarms --- they used to
>> be 10 but I broke them down into smaller steps to ease the review
>> process.
>
> I will try to take a look soon. Thanks for working on improving the
> trailers subsystem!

Thanks!
diff mbox series

Patch

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index b3f4783858..ae7bb15478 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -562,6 +562,16 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 				break;
 		}
 	}
+	for (i = 1; i < argc; i++) {
+		const char *arg = argv[i];
+		if (!strcmp(arg, "--allow-missing-tips")) {
+			if (arg_missing_action == MA_ERROR)
+				die(_("option '%s' only makes sense with '%s' set to '%s' or '%s'"),
+				      "--allow-missing-tips", "--missing=", "allow-*", "print");
+			revs.do_not_die_on_missing_tips = 1;
+			break;
+		}
+	}
 
 	if (arg_missing_action)
 		revs.do_not_die_on_missing_objects = 1;
@@ -627,6 +637,8 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			continue; /* already handled above */
 		if (skip_prefix(arg, "--missing=", &arg))
 			continue; /* already handled above */
+		if (!strcmp(arg, "--allow-missing-tips"))
+			continue; /* already handled above */
 
 		if (!strcmp(arg, ("--no-object-names"))) {
 			arg_show_object_names = 0;
@@ -753,9 +765,19 @@  int cmd_rev_list(int argc, const char **argv, const char *prefix)
 
 	if (arg_print_omitted)
 		oidset_init(&omitted_objects, DEFAULT_OIDSET_SIZE);
-	if (arg_missing_action == MA_PRINT)
+	if (arg_missing_action == MA_PRINT) {
+		struct oidset_iter iter;
+		struct object_id *oid;
+
 		oidset_init(&missing_objects, DEFAULT_OIDSET_SIZE);
 
+		/* Already add missing commits */
+		oidset_iter_init(&revs.missing_commits, &iter);
+		while ((oid = oidset_iter_next(&iter)))
+			oidset_insert(&missing_objects, oid);
+		oidset_clear(&revs.missing_commits);
+	}
+
 	traverse_commit_list_filtered(
 		&revs, show_commit, show_object, &info,
 		(arg_print_omitted ? &omitted_objects : NULL));
diff --git a/revision.c b/revision.c
index 4c5cd7c3ce..9f25faa249 100644
--- a/revision.c
+++ b/revision.c
@@ -388,6 +388,10 @@  static struct object *get_reference(struct rev_info *revs, const char *name,
 			return NULL;
 		if (revs->exclude_promisor_objects && is_promisor_object(oid))
 			return NULL;
+		if (revs->do_not_die_on_missing_tips) {
+			oidset_insert(&revs->missing_commits, oid);
+			return NULL;
+		}
 		die("bad object %s", name);
 	}
 	object->flags |= flags;
@@ -1947,6 +1951,7 @@  void repo_init_revisions(struct repository *r,
 	init_display_notes(&revs->notes_opt);
 	list_objects_filter_init(&revs->filter);
 	init_ref_exclusions(&revs->ref_excludes);
+	oidset_init(&revs->missing_commits, 0);
 }
 
 static void add_pending_commit_list(struct rev_info *revs,
@@ -2184,7 +2189,7 @@  static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl
 		verify_non_filename(revs->prefix, arg);
 	object = get_reference(revs, arg, &oid, flags ^ local_flags);
 	if (!object)
-		return revs->ignore_missing ? 0 : -1;
+		return (revs->ignore_missing || revs->do_not_die_on_missing_tips) ? 0 : -1;
 	add_rev_cmdline(revs, object, arg_, REV_CMD_REV, flags ^ local_flags);
 	add_pending_object_with_path(revs, object, arg, oc.mode, oc.path);
 	free(oc.path);
@@ -3830,8 +3835,6 @@  int prepare_revision_walk(struct rev_info *revs)
 				       FOR_EACH_OBJECT_PROMISOR_ONLY);
 	}
 
-	oidset_init(&revs->missing_commits, 0);
-
 	if (!revs->reflog_info)
 		prepare_to_use_bloom_filter(revs);
 	if (!revs->unsorted_input)
diff --git a/revision.h b/revision.h
index 94c43138bc..67435a5d8a 100644
--- a/revision.h
+++ b/revision.h
@@ -227,6 +227,14 @@  struct rev_info {
 			 */
 			do_not_die_on_missing_objects:1,
 
+			/*
+			 * When the do_not_die_on_missing_objects flag above is set,
+			 * a rev walk could still die with "fatal: bad object <oid>"
+			 * if one of the tips it is passed is missing. With this flag
+			 * such a tip will be reported as missing too.
+			 */
+			 do_not_die_on_missing_tips:1,
+
 			/* for internal use only */
 			exclude_promisor_objects:1;
 
diff --git a/t/t6022-rev-list-missing.sh b/t/t6022-rev-list-missing.sh
index 527aa94f07..283e8fc2c2 100755
--- a/t/t6022-rev-list-missing.sh
+++ b/t/t6022-rev-list-missing.sh
@@ -77,4 +77,55 @@  do
 	done
 done
 
+for obj in "HEAD~1" "HEAD~1^{tree}" "HEAD:1.t"
+do
+	for tip in "" "HEAD"
+	do
+		for action in "allow-any" "print"
+		do
+			test_expect_success "--missing=$action --allow-missing-tips with tip '$obj' missing and tip '$tip'" '
+				oid="$(git rev-parse $obj)" &&
+				path=".git/objects/$(test_oid_to_path $oid)" &&
+
+				# Before the object is made missing, we use rev-list to
+				# get the expected oids.
+				if [ "$tip" = "HEAD" ]; then
+					git rev-list --objects --no-object-names \
+						HEAD ^$obj >expect.raw
+				else
+					>expect.raw
+				fi &&
+
+				# Blobs are shared by all commits, so even though a commit/tree
+				# might be skipped, its blob must be accounted for.
+				if [ "$tip" = "HEAD" ] && [ $obj != "HEAD:1.t" ]; then
+					echo $(git rev-parse HEAD:1.t) >>expect.raw &&
+					echo $(git rev-parse HEAD:2.t) >>expect.raw
+				fi &&
+
+				mv "$path" "$path.hidden" &&
+				test_when_finished "mv $path.hidden $path" &&
+
+				git rev-list --missing=$action --allow-missing-tips \
+				     --objects --no-object-names $oid $tip >actual.raw &&
+
+				# When the action is to print, we should also add the missing
+				# oid to the expect list.
+				case $action in
+				allow-any)
+					;;
+				print)
+					grep ?$oid actual.raw &&
+					echo ?$oid >>expect.raw
+					;;
+				esac &&
+
+				sort actual.raw >actual &&
+				sort expect.raw >expect &&
+				test_cmp expect actual
+			'
+		done
+	done
+done
+
 test_done