diff mbox series

revision.c: drop missing objects from cmdline

Message ID 20181023215745.245333-1-matvore@google.com (mailing list archive)
State New, archived
Headers show
Series revision.c: drop missing objects from cmdline | expand

Commit Message

Matthew DeVore Oct. 23, 2018, 9:57 p.m. UTC
No code which reads cmdline in struct rev_info can handle NULL objects
in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.
Objects in cmdline are NULL when the given object is promisor and
--exclude-promisor-objects is enabled.

This new behavior avoids a segmentation fault in the added test case in
t0410.

We could simply die if add_rev_cmdline is called with a NULL item,
(rather than warn if --exclude-promisor-objects is set) but because the
amended test case already expects the command to finish successfully,
difference and show a warning. Note that this command:

	git rev-list --objects --missing=print $missing_hash

Already fails with a "fatal: bad object HASH" message and this patch
does not change that.

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 revision.c               | 12 ++++++++++++
 t/t0410-partial-clone.sh | 11 ++++++++++-
 2 files changed, 22 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Oct. 24, 2018, 4:54 a.m. UTC | #1
Matthew DeVore <matvore@google.com> writes:

> No code which reads cmdline in struct rev_info can handle NULL objects
> in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.

"The code is not prepared to have cmdline.rev[].item that is NULL"
is something everybody would understand and agree with, but that
does not automatically lead to "so ignoring or rejecting and dying
is OK", though.  The cmdline thing is used for the commands to learn
the end-user intent that cannot be learned by the resulting objects
in the object array (e.g. the user may have said 'master' but the
pending[] (and later revs.commits) would only have the object names,
and some callers would want to know if it was a branch name, a
refname refs/heads/master, or the hexadecimal object name), so
unless absolutely needed, I'm hesitant to take a change that loses
information (e.g. the user named this object that is not locally
available, we cannot afford to add it to the pending[] and add it to
revs.commits to traverse from there, but we still want to know what
object was given by the user).

> Objects in cmdline are NULL when the given object is promisor and
> --exclude-promisor-objects is enabled.

A "promisor" is a remote repository.  It promises certain objects
that you do not have are later retrievable from it.  The way you can
see if the promisor promised to later give you an object is to see
if that missing object is reachable from an object in a packfile the
promisor gave you earlier.  

"The given object" is never a "promisor", so I am not sure what the
above wants to say.  Is 

    When an object is given on the command line and if it is missing
    from the local repository, add_rev_cmdline() receives NULL in
    its "item" parameter.

what you meant?  Is that the _only_ case in which "item" could be
NULL, or is it also true for any missing object due to repository
corruption?
Matthew DeVore Oct. 25, 2018, 11:13 p.m. UTC | #2
On Tue, Oct 23, 2018 at 9:54 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Matthew DeVore <matvore@google.com> writes:
>
> > No code which reads cmdline in struct rev_info can handle NULL objects
> > in cmdline.rev[i].item, so stop adding them to the cmdline.rev array.
>
> "The code is not prepared to have cmdline.rev[].item that is NULL"
> is something everybody would understand and agree with, but that
> does not automatically lead to "so ignoring or rejecting and dying
> is OK", though.  The cmdline thing is used for the commands to learn
> the end-user intent that cannot be learned by the resulting objects
> in the object array (e.g. the user may have said 'master' but the
> pending[] (and later revs.commits) would only have the object names,
> and some callers would want to know if it was a branch name, a
> refname refs/heads/master, or the hexadecimal object name), so
> unless absolutely needed, I'm hesitant to take a change that loses
> information (e.g. the user named this object that is not locally
> available, we cannot afford to add it to the pending[] and add it to
> revs.commits to traverse from there, but we still want to know what
> object was given by the user).
Hmm, when you explain the purpose of cmdline, it's obvious now that it
doesn't make sense to mechanically drop items from it. I'm sending
another version of this patch which uses a more focused approach and
is a bit simpler.

>
> > Objects in cmdline are NULL when the given object is promisor and
> > --exclude-promisor-objects is enabled.
>
> A "promisor" is a remote repository.  It promises certain objects
> that you do not have are later retrievable from it.  The way you can
> see if the promisor promised to later give you an object is to see
> if that missing object is reachable from an object in a packfile the
> promisor gave you earlier.
>
> "The given object" is never a "promisor", so I am not sure what the
> above wants to say.  Is
>
>     When an object is given on the command line and if it is missing
>     from the local repository, add_rev_cmdline() receives NULL in
>     its "item" parameter.
>
> what you meant?  Is that the _only_ case in which "item" could be
> NULL, or is it also true for any missing object due to repository
> corruption?

Yes, that is what I meant. I believe for corruption there is an actual
error shown with die() or the like, though I am not certain.
diff mbox series

Patch

diff --git a/revision.c b/revision.c
index a1ddb9e11c..8724dca2e2 100644
--- a/revision.c
+++ b/revision.c
@@ -1148,6 +1148,18 @@  static void add_rev_cmdline(struct rev_info *revs,
 			    int whence,
 			    unsigned flags)
 {
+	if (!item) {
+		/*
+		 * item is likely a promisor object returned from get_reference.
+		 */
+		if (revs->exclude_promisor_objects) {
+			warning(_("ignoring missing object %s"), name);
+			return;
+		} else {
+			die(_("missing object %s"), name);
+		}
+	}
+
 	struct rev_cmdline_info *info = &revs->cmdline;
 	unsigned int nr = info->nr;
 
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178..e02cd3f818 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -366,7 +366,16 @@  test_expect_success 'rev-list accepts missing and promised objects on command li
 
 	git -C repo config core.repositoryformatversion 1 &&
 	git -C repo config extensions.partialclone "arbitrary string" &&
-	git -C repo rev-list --exclude-promisor-objects --objects "$COMMIT" "$TREE" "$BLOB"
+
+	printf "warning: ignoring missing object %s\n" \
+	       "$COMMIT" "$TREE" "$BLOB" >expect &&
+	git -C repo rev-list --objects \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" 2>actual &&
+	test_cmp expect actual &&
+
+	git -C repo rev-list --objects-edge-aggressive \
+		--exclude-promisor-objects "$COMMIT" "$TREE" "$BLOB" 2>actual &&
+	test_cmp expect actual
 '
 
 test_expect_success 'gc repacks promisor objects separately from non-promisor objects' '