diff mbox series

Segmentation fault on non-commit objects.

Message ID 20191029092735.GA84120@carpenter.lan (mailing list archive)
State New, archived
Headers show
Series Segmentation fault on non-commit objects. | expand

Commit Message

Davide Berardi Oct. 29, 2019, 9:27 a.m. UTC
Fixed segmentation fault that can be triggered using
$ git clone --branch $object $repository
with object pointing to a non-commit (e.g. a blob).

Signed-off-by: Davide Berardi <berardi.dav@gmail.com>
---
 builtin/clone.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Johannes Schindelin Oct. 29, 2019, 1:11 p.m. UTC | #1
Hi Davide,

[please remove the trailing period character in the commit subject]

On Tue, 29 Oct 2019, Davide Berardi wrote:

> Fixed segmentation fault that can be triggered using
> $ git clone --branch $object $repository
> with object pointing to a non-commit (e.g. a blob).
>
> Signed-off-by: Davide Berardi <berardi.dav@gmail.com>
> ---
> builtin/clone.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index f665b28ccc..6ad2d8fe77 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -720,6 +720,9 @@ static void update_head(const struct ref *our, const
> struct ref *remote,
> 	} else if (our) {
> 		struct commit *c = lookup_commit_reference(the_repository,
> 							   &our->old_oid);
> +		/* Check if --branch specifies a non-commit. */
> +		if (c == NULL)
> +			die(_("unable to update HEAD (cannot find commit)"));

Could the error message maybe repeat whatever the user specified for
`$object`? That would probably be more helpful. Maybe even say "not a
commit"?

Ciao,
Johannes

> 		/* --branch specifies a non-branch (i.e. tags), detach HEAD */
> 		update_ref(msg, "HEAD", &c->object.oid, NULL, REF_NO_DEREF,
> 			   UPDATE_REFS_DIE_ON_ERR);
> --
> 2.23.0
>
>
>
Jeff King Oct. 29, 2019, 2:06 p.m. UTC | #2
On Tue, Oct 29, 2019 at 10:27:35AM +0100, Davide Berardi wrote:

> Fixed segmentation fault that can be triggered using
> $ git clone --branch $object $repository
> with object pointing to a non-commit (e.g. a blob).

One subtle thing here is that $object still needs to be at the tip of a
ref (an easy real-world case is "-b junio-gpg-pub" against git.git).

It might be nice to cover this with a test.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index f665b28ccc..6ad2d8fe77 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -720,6 +720,9 @@ static void update_head(const struct ref *our, const struct ref *remote,
> 	} else if (our) {
> 		struct commit *c = lookup_commit_reference(the_repository,
> 							   &our->old_oid);
> +		/* Check if --branch specifies a non-commit. */
> +		if (c == NULL)
> +			die(_("unable to update HEAD (cannot find commit)"));

This is definitely a strict improvement over the current behavior
(though I agree with Dscho's comments on the error message). A few
further thoughts:

  - we'll have successfully completed the rest of the clone at this
    point. Should we leave the objects and refs in place to allow the
    user to fix it up, as we do when "git checkout" fails?

    We'd have to leave _something_ in HEAD for it to be a valid repo. I
    guess just "refs/heads/master" would be fine, or perhaps we could
    fall back to whatever the other side had in their HEAD (i.e.,
    pretending that "-b" wasn't specified).

  - there's a related case just above the lines you touched: what
    happens if the other side feeds us a non-commit in their
    refs/heads/? That shouldn't happen (i.e., their repo is broken), but
    should we be protecting ourselves on the receiving side more?

    Likewise in "else" below just above your lines.

    I think in either case we wouldn't segfault (because we don't try to
    dereference to a commit ourselves), but we'd produce a bogus on-disk
    state.

-Peff
Junio C Hamano Oct. 30, 2019, 2:44 a.m. UTC | #3
Jeff King <peff@peff.net> writes:

> This is definitely a strict improvement over the current behavior
> (though I agree with Dscho's comments on the error message). A few
> further thoughts:
>
>   - we'll have successfully completed the rest of the clone at this
>     point. Should we leave the objects and refs in place to allow the
>     user to fix it up, as we do when "git checkout" fails?
>
>     We'd have to leave _something_ in HEAD for it to be a valid repo. I
>     guess just "refs/heads/master" would be fine, or perhaps we could
>     fall back to whatever the other side had in their HEAD (i.e.,
>     pretending that "-b" wasn't specified).

Do we know for sure that the object at HEAD on the other side is a
commit, or do we need to prepare for a case where it is not?  I
suspect it is the latter.  HEAD needs to exist and point at a ref
that is in refs/heads/ hierarchy, and the ref can even be unborn, so
falling back on 'master' sounds like a good position.

>   - there's a related case just above the lines you touched: what
>     happens if the other side feeds us a non-commit in their
>     refs/heads/? That shouldn't happen (i.e., their repo is broken), but
>     should we be protecting ourselves on the receiving side more?
>
>     Likewise in "else" below just above your lines.

Both are good points.

Thanks, all.
Jeff King Oct. 31, 2019, 5:37 a.m. UTC | #4
On Wed, Oct 30, 2019 at 11:44:23AM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > This is definitely a strict improvement over the current behavior
> > (though I agree with Dscho's comments on the error message). A few
> > further thoughts:
> >
> >   - we'll have successfully completed the rest of the clone at this
> >     point. Should we leave the objects and refs in place to allow the
> >     user to fix it up, as we do when "git checkout" fails?
> >
> >     We'd have to leave _something_ in HEAD for it to be a valid repo. I
> >     guess just "refs/heads/master" would be fine, or perhaps we could
> >     fall back to whatever the other side had in their HEAD (i.e.,
> >     pretending that "-b" wasn't specified).
> 
> Do we know for sure that the object at HEAD on the other side is a
> commit, or do we need to prepare for a case where it is not?  I
> suspect it is the latter.  HEAD needs to exist and point at a ref
> that is in refs/heads/ hierarchy, and the ref can even be unborn, so
> falling back on 'master' sounds like a good position.

Yeah, I don't think that we do. This is the same as the case I mentioned
later, and it should be handled in all three arms of the conditional.

Davide, do you have an interest in trying to make these code paths a bit
more robust?

-Peff
Davide Berardi Oct. 31, 2019, 10:43 a.m. UTC | #5
>Davide, do you have an interest in trying to make these code paths a bit
>more robust?
>
Absolutely, I am writing a patch taking in consideration your comments.
I only need to write a test routine then I will commit my new patch.

>-Peff

Thanks and sorry for the silence but I am learning from your comments.
:)
D.
Davide Berardi Nov. 1, 2019, 12:26 a.m. UTC | #6
>Ciao,
>Johannes
Hi Johannes,

I've implemented your comments on a new patch, you can find it at[1]

Thanks.
Ciao, D.

[1] https://public-inbox.org/git/20191101002432.GA49846@carpenter.lan/T/#u
Davide Berardi Nov. 1, 2019, 12:29 a.m. UTC | #7
>Davide, do you have an interest in trying to make these code paths a bit
>more robust?
>
I've tried to implement your comments in a new patch, you can find it
at[1].  Sorry if this is not the standard procedure to submit a new
patch and let me know if you have new comments.

Thanks to all.
Ciao,
D.

[1] https://public-inbox.org/git/20191101002432.GA49846@carpenter.lan/T/#u
diff mbox series

Patch

diff --git a/builtin/clone.c b/builtin/clone.c
index f665b28ccc..6ad2d8fe77 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -720,6 +720,9 @@  static void update_head(const struct ref *our, const struct ref *remote,
 	} else if (our) {
 		struct commit *c = lookup_commit_reference(the_repository,
 							   &our->old_oid);
+		/* Check if --branch specifies a non-commit. */
+		if (c == NULL)
+			die(_("unable to update HEAD (cannot find commit)"));
 		/* --branch specifies a non-branch (i.e. tags), detach HEAD */
 		update_ref(msg, "HEAD", &c->object.oid, NULL, REF_NO_DEREF,
 			   UPDATE_REFS_DIE_ON_ERR);