[v2] clone: Don't segfault on -b specifing a non-commit
diff mbox series

Message ID 20191103180716.GA72007@carpenter.lan
State New
Headers show
Series
  • [v2] clone: Don't segfault on -b specifing a non-commit
Related show

Commit Message

Davide Berardi Nov. 3, 2019, 6:07 p.m. UTC
The code in "git clone -b B" to decide what branch to check out
assumed that B points at a commit object without checking,
leading to dereferencing a NULL pointer and causing a segmentation
fault.

Just aborting the operation when it happens is not a very
attractive option because we would be left with a directory
without .git/HEAD that cannot be used as a valid repository the
user can attempt to recover from the situation by checking out
something.

Fall back to use the 'master' branch, which is what we use when
the command without the "-b" option cannot figure out what
branch the remote side points with its HEAD.

Signed-off-by: Davide Berardi <berardi.dav@gmail.com>
---
 builtin/clone.c         | 49 ++++++++++++++++++++++++++++++++++++-----
 commit.c                | 16 +++++++++++---
 commit.h                |  4 ++++
 t/t5609-clone-branch.sh | 16 +++++++++++++-
 4 files changed, 75 insertions(+), 10 deletions(-)

Comments

Jeff King Nov. 5, 2019, 4:37 a.m. UTC | #1
On Sun, Nov 03, 2019 at 06:07:18PM +0000, Davide Berardi wrote:

> -static void update_head(const struct ref *our, const struct ref *remote,
> +static struct commit *lookup_commit_helper(const struct ref *our,
> +					   const struct ref *remote,
> +					   const char *msg, int *err)
> +{
> +	const struct object_id *tip = NULL;
> +	struct commit *tip_commit = NULL;
> +
> +	if (our)
> +		tip = &our->old_oid;
> +	else if (remote)
> +		tip = &remote->old_oid;
> +
> +	if (!tip)
> +		return NULL;
> +
> +	tip_commit = lookup_commit_reference_gently_err(the_repository, tip, 1, err);
> +	if (!tip_commit) {
> +		/*
> +		 * The given non-commit cannot be checked out,
> +		 * so have a 'master' branch and leave it unborn.
> +		 */
> +		warning(_("non-commit cannot be checked out"));
> +		create_symref("HEAD", "refs/heads/master", msg);
> +		return NULL;
> +	}
> +
> +	return tip_commit;
> +}

I like the logic flow in this function, which is IMHO clearer than the
existing code. But the "err" thing puzzled me for a moment. I think you
are trying to tell the difference between the case that both "our" and
"remote" are NULL, and the case that we saw a non-commit. In either case
we return NULL, but only one is an error.

But:

  - I don't think that logic needs to extend down into
    lookup_commit_reference_gently(); a NULL return from it would always
    be an error, wouldn't it?

  - we could probably simplify this by just inlining it into
    update_head(). Something like:

      if (our)
              tip = &our->old_oid;
      else if (remote)
              tip = &remote->old_oid;

      if (!tip) {
	      /*
	       * We have no local branch requested with "-b", and the
	       * remote HEAD is unborn. There's nothing to update HEAD
	       * to, but this state is not an error.
	       */
              return 0;
      }

      tip_commit = lookup_commit_reference_gently(...);
      if (!tip_commit) {
              /*
	       * The given non-commit cannot be checked out, etc...
	       */
	      warning(...);
	      create_symref(...);
	      return -1;
      }

      ...and then existing code to use tip_commit...

      /*
       * we'd always return 0 here, because our update_ref calls die on
       * error
       */
      return 0;

> @@ -1268,8 +1303,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	junk_mode = JUNK_LEAVE_REPO;
> -	fetch_if_missing = 1;
> -	err = checkout(submodule_progress);
> +	if (!err) {
> +		fetch_if_missing = 1;
> +		err = checkout(submodule_progress);
> +	}

This part makes sense. We might want an explanatory comment along the
lines of:

  /*
   * Only try to checkout if we successfully updated HEAD; otherwise
   * HEAD isn't pointing to the thing the user wanted.
   /
   if (!err) {
       ...
   

> -struct commit *lookup_commit_reference_gently(struct repository *r,
> -		const struct object_id *oid, int quiet)
> +struct commit *lookup_commit_reference_gently_err(struct repository *r,
> +		const struct object_id *oid, int quiet, int *err)

And this part I think could just go away, if you take my suggestion
above.

> diff --git a/t/t5609-clone-branch.sh b/t/t5609-clone-branch.sh
> index 6e7a7be052..d57f750eeb 100755
> --- a/t/t5609-clone-branch.sh
> +++ b/t/t5609-clone-branch.sh
> @@ -20,7 +20,10 @@ test_expect_success 'setup' '
>  	 echo one >file && git add file && git commit -m one &&
>  	 git checkout -b two &&
>  	 echo two >file && git add file && git commit -m two &&
> -	 git checkout master) &&
> +	 git checkout master &&
> +	 blob=$(git rev-parse HEAD:file)  &&
> +	 echo $blob > .git/refs/heads/broken-tag &&
> +	 echo $blob > .git/refs/heads/broken-head) &&

Minor style nit, but we usually avoid the space after ">".

> +test_expect_success 'clone -b with a non-commit tag must fallback' '
> +	test_must_fail git clone -b broken-tag parent clone-broken-tag &&
> +	(cd clone-broken-tag &&
> +	 check_HEAD master)
> +'
> +test_expect_success 'clone -b with a non-commit head must fallback' '
> +	test_must_fail git clone -b broken-head parent clone-broken-head &&
> +	(cd clone-broken-head &&
> +	 check_HEAD master)
> +'

OK, this second one covers the first conditional from update_head():

  if (our && skip_prefix(our->name, "refs/heads/", &head)) {

and the first one covers the second conditional:

  } else if (our) {

Should we cover the third conditional, too?

I think it would be the case that the remote HEAD is pointing to a
non-commit (since that's a corrupt repository, it might make sense
create a separate sub-repository).

-Peff
Junio C Hamano Nov. 6, 2019, 1:36 a.m. UTC | #2
Jeff King <peff@peff.net> writes:

> I like the logic flow in this function, which is IMHO clearer than the
> existing code. But the "err" thing puzzled me for a moment. I think you
> are trying to tell the difference between the case that both "our" and
> "remote" are NULL, and the case that we saw a non-commit. In either case
> we return NULL, but only one is an error.

Yup.

> But:
>
>   - I don't think that logic needs to extend down into
>     lookup_commit_reference_gently(); a NULL return from it would always
>     be an error, wouldn't it?

Yes.

>   - we could probably simplify this by just inlining it into
>     update_head(). Something like:
>
>       if (our)
>               tip = &our->old_oid;
>       else if (remote)
>               tip = &remote->old_oid;
>
>       if (!tip) {
> 	      /*
> 	       * We have no local branch requested with "-b", and the
> 	       * remote HEAD is unborn. There's nothing to update HEAD
> 	       * to, but this state is not an error.
> 	       */
>               return 0;
>       }

I somehow had an impression that Davide was protecting against the
case where tip->old_oid is null_oid (cloning from an empty repo?);
NULL return from lookup_commit_reference_gently(null_oid) would not
deserve a warning from this codepath, and should work just like the
way it has worked before these changes.

>       tip_commit = lookup_commit_reference_gently(...);
>       if (!tip_commit) {
>               /*
>> -	fetch_if_missing = 1;
>> -	err = checkout(submodule_progress);
>> +	if (!err) {
>> +		fetch_if_missing = 1;
>> +		err = checkout(submodule_progress);
>> +	}
>
> This part makes sense. We might want an explanatory comment along the
> lines of:
>
>   /*
>    * Only try to checkout if we successfully updated HEAD; otherwise
>    * HEAD isn't pointing to the thing the user wanted.
>    /
>    if (!err) {
>        ...

Yup.

>> diff --git a/t/t5609-clone-branch.sh b/t/t5609-clone-branch.sh
>
> I think it would be the case that the remote HEAD is pointing to a
> non-commit (since that's a corrupt repository, it might make sense
> create a separate sub-repository).

All good comments. 

Thanks
Jeff King Nov. 6, 2019, 4:05 a.m. UTC | #3
On Wed, Nov 06, 2019 at 10:36:18AM +0900, Junio C Hamano wrote:

> >   - we could probably simplify this by just inlining it into
> >     update_head(). Something like:
> >
> >       if (our)
> >               tip = &our->old_oid;
> >       else if (remote)
> >               tip = &remote->old_oid;
> >
> >       if (!tip) {
> > 	      /*
> > 	       * We have no local branch requested with "-b", and the
> > 	       * remote HEAD is unborn. There's nothing to update HEAD
> > 	       * to, but this state is not an error.
> > 	       */
> >               return 0;
> >       }
> 
> I somehow had an impression that Davide was protecting against the
> case where tip->old_oid is null_oid (cloning from an empty repo?);
> NULL return from lookup_commit_reference_gently(null_oid) would not
> deserve a warning from this codepath, and should work just like the
> way it has worked before these changes.

I'm not sure if we need to worry about that case. Right now it would
just segfault, I think, which means it's not something that comes up
regularly (though whether the _best_ thing might be to consider a
non-error, I don't know without understanding why we'd see a null_oid in
the first place).

If we did want to handle it, though, I think it would be easy with the
setup I described; update_head() could check is_null_oid() itself.

-Peff
Junio C Hamano Nov. 6, 2019, 9:53 a.m. UTC | #4
Jeff King <peff@peff.net> writes:

> If we did want to handle it, though, I think it would be easy with the
> setup I described; update_head() could check is_null_oid() itself.

Yup, that was what I had in my review on the previous round ;-)

Patch
diff mbox series

diff --git a/builtin/clone.c b/builtin/clone.c
index f665b28ccc..f185b7f193 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -704,10 +704,46 @@  static void update_remote_refs(const struct ref *refs,
 	}
 }
 
-static void update_head(const struct ref *our, const struct ref *remote,
+static struct commit *lookup_commit_helper(const struct ref *our,
+					   const struct ref *remote,
+					   const char *msg, int *err)
+{
+	const struct object_id *tip = NULL;
+	struct commit *tip_commit = NULL;
+
+	if (our)
+		tip = &our->old_oid;
+	else if (remote)
+		tip = &remote->old_oid;
+
+	if (!tip)
+		return NULL;
+
+	tip_commit = lookup_commit_reference_gently_err(the_repository, tip, 1, err);
+	if (!tip_commit) {
+		/*
+		 * The given non-commit cannot be checked out,
+		 * so have a 'master' branch and leave it unborn.
+		 */
+		warning(_("non-commit cannot be checked out"));
+		create_symref("HEAD", "refs/heads/master", msg);
+		return NULL;
+	}
+
+	return tip_commit;
+}
+
+static int update_head(const struct ref *our, const struct ref *remote,
 			const char *msg)
 {
+	int err = 0;
 	const char *head;
+	struct commit *c = NULL;
+
+	c = lookup_commit_helper(our, remote, msg, &err);
+	if (err)
+		return -1;
+
 	if (our && skip_prefix(our->name, "refs/heads/", &head)) {
 		/* Local default branch link */
 		if (create_symref("HEAD", our->name, NULL) < 0)
@@ -718,8 +754,6 @@  static void update_head(const struct ref *our, const struct ref *remote,
 			install_branch_config(0, head, option_origin, our->name);
 		}
 	} else if (our) {
-		struct commit *c = lookup_commit_reference(the_repository,
-							   &our->old_oid);
 		/* --branch specifies a non-branch (i.e. tags), detach HEAD */
 		update_ref(msg, "HEAD", &c->object.oid, NULL, REF_NO_DEREF,
 			   UPDATE_REFS_DIE_ON_ERR);
@@ -732,6 +766,7 @@  static void update_head(const struct ref *our, const struct ref *remote,
 		update_ref(msg, "HEAD", &remote->old_oid, NULL, REF_NO_DEREF,
 			   UPDATE_REFS_DIE_ON_ERR);
 	}
+	return err;
 }
 
 static int checkout(int submodule_progress)
@@ -1249,7 +1284,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 			   branch_top.buf, reflog_msg.buf, transport,
 			   !is_local, filter_options.choice);
 
-	update_head(our_head_points_at, remote_head, reflog_msg.buf);
+	err = update_head(our_head_points_at, remote_head, reflog_msg.buf) < 0;
 
 	/*
 	 * We want to show progress for recursive submodule clones iff
@@ -1268,8 +1303,10 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	junk_mode = JUNK_LEAVE_REPO;
-	fetch_if_missing = 1;
-	err = checkout(submodule_progress);
+	if (!err) {
+		fetch_if_missing = 1;
+		err = checkout(submodule_progress);
+	}
 
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
diff --git a/commit.c b/commit.c
index a98de16e3d..ffb5230f0f 100644
--- a/commit.c
+++ b/commit.c
@@ -26,16 +26,26 @@  int save_commit_buffer = 1;
 
 const char *commit_type = "commit";
 
-struct commit *lookup_commit_reference_gently(struct repository *r,
-		const struct object_id *oid, int quiet)
+struct commit *lookup_commit_reference_gently_err(struct repository *r,
+		const struct object_id *oid, int quiet, int *err)
 {
+	struct commit *retval;
 	struct object *obj = deref_tag(r,
 				       parse_object(r, oid),
 				       NULL, 0);
 
 	if (!obj)
 		return NULL;
-	return object_as_type(r, obj, OBJ_COMMIT, quiet);
+	retval = object_as_type(r, obj, OBJ_COMMIT, quiet);
+	if (!retval && err)
+		*err = 1;
+	return retval;
+}
+
+struct commit *lookup_commit_reference_gently(struct repository *r,
+		const struct object_id *oid, int quiet)
+{
+	return lookup_commit_reference_gently_err(r, oid, quiet, NULL);
 }
 
 struct commit *lookup_commit_reference(struct repository *r, const struct object_id *oid)
diff --git a/commit.h b/commit.h
index f5295ca7f3..157c5dc526 100644
--- a/commit.h
+++ b/commit.h
@@ -70,6 +70,10 @@  struct commit *lookup_commit_reference(struct repository *r,
 struct commit *lookup_commit_reference_gently(struct repository *r,
 					      const struct object_id *oid,
 					      int quiet);
+struct commit *lookup_commit_reference_gently_err(struct repository *r,
+						  const struct object_id *oid,
+						  int quiet,
+						  int *err);
 struct commit *lookup_commit_reference_by_name(const char *name);
 
 /*
diff --git a/t/t5609-clone-branch.sh b/t/t5609-clone-branch.sh
index 6e7a7be052..d57f750eeb 100755
--- a/t/t5609-clone-branch.sh
+++ b/t/t5609-clone-branch.sh
@@ -20,7 +20,10 @@  test_expect_success 'setup' '
 	 echo one >file && git add file && git commit -m one &&
 	 git checkout -b two &&
 	 echo two >file && git add file && git commit -m two &&
-	 git checkout master) &&
+	 git checkout master &&
+	 blob=$(git rev-parse HEAD:file)  &&
+	 echo $blob > .git/refs/heads/broken-tag &&
+	 echo $blob > .git/refs/heads/broken-head) &&
 	mkdir empty &&
 	(cd empty && git init)
 '
@@ -67,4 +70,15 @@  test_expect_success 'clone -b not allowed with empty repos' '
 	test_must_fail git clone -b branch empty clone-branch-empty
 '
 
+test_expect_success 'clone -b with a non-commit tag must fallback' '
+	test_must_fail git clone -b broken-tag parent clone-broken-tag &&
+	(cd clone-broken-tag &&
+	 check_HEAD master)
+'
+test_expect_success 'clone -b with a non-commit head must fallback' '
+	test_must_fail git clone -b broken-head parent clone-broken-head &&
+	(cd clone-broken-head &&
+	 check_HEAD master)
+'
+
 test_done