diff mbox series

[v3,1/2] commit-reach: properly peel tags

Message ID 0a1e661271660b1fab317aac3997589a94b7c98f.1537542323.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Properly peel tags in can_all_from_reach_with_flags() | expand

Commit Message

Derrick Stolee via GitGitGadget Sept. 21, 2018, 3:05 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The can_all_from_reach_with_flag() algorithm was refactored in 4fbcca4e
"commit-reach: make can_all_from_reach... linear" but incorrectly
assumed that all objects provided were commits. During a fetch
negotiation, ok_to_give_up() in upload-pack.c may provide unpeeled tags
to the 'from' array. The current code creates a segfault.

Add a direct call to can_all_from_reach_with_flag() in 'test-tool reach'
and add a test in t6600-test-reach.sh that demonstrates this segfault.

Correct the issue by peeling tags when investigating the initial list
of objects in the 'from' array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-reach.c        | 36 +++++++++++++++++++++++++++++-------
 t/helper/test-reach.c | 22 +++++++++++++++++-----
 t/t6600-test-reach.sh | 30 ++++++++++++++++++++++++++++--
 3 files changed, 74 insertions(+), 14 deletions(-)

Comments

Jeff King Sept. 21, 2018, 11:56 p.m. UTC | #1
On Fri, Sep 21, 2018 at 08:05:26AM -0700, Derrick Stolee via GitGitGadget wrote:

> diff --git a/commit-reach.c b/commit-reach.c
> index 86715c103c..e748414d04 100644
> --- a/commit-reach.c
> +++ b/commit-reach.c
> @@ -544,20 +544,42 @@ int can_all_from_reach_with_flag(struct object_array *from,
>  {
>  	struct commit **list = NULL;
>  	int i;
> +	int nr_commits;
>  	int result = 1;
>  
>  	ALLOC_ARRAY(list, from->nr);
> +	nr_commits = 0;
>  	for (i = 0; i < from->nr; i++) {
> -		list[i] = (struct commit *)from->objects[i].item;
> +		struct object *from_one = from->objects[i].item;
>  
> -		if (parse_commit(list[i]) ||
> -		    list[i]->generation < min_generation)
> -			return 0;
> +		if (!from_one || from_one->flags & assign_flag)
> +			continue;
> +
> +		from_one = deref_tag(the_repository, from_one,
> +				     "a from object", 0);
> +		if (!from_one || from_one->type != OBJ_COMMIT) {
> +			/* no way to tell if this is reachable by
> +			 * looking at the ancestry chain alone, so
> +			 * leave a note to ourselves not to worry about
> +			 * this object anymore.
> +			 */

A minor nit, but the original comment you restored here has a style
violation. Might be worth fixing up (but certainly not worth a re-roll
on its own).

> +			from->objects[i].item->flags |= assign_flag;

OK, so here we mark the original tag with a flag...

> +
> +		list[nr_commits] = (struct commit *)from_one;
> +		if (parse_commit(list[nr_commits]) ||
> +		    list[nr_commits]->generation < min_generation) {
> +			result = 0;
> +			goto cleanup;
> +		}

And we jump to the cleanup here, but...

> @@ -600,7 +622,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
>  	}
>  
>  cleanup:
> -	for (i = 0; i < from->nr; i++) {
> +	for (i = 0; i < nr_commits; i++) {
>  		clear_commit_marks(list[i], RESULT);
>  		clear_commit_marks(list[i], assign_flag);

This walks over the items in the list array, which don't include the tag
we marked. Did we decide that's OK? I think it's actually how the
original code behaved, too, but it seems funny. At the least we might
want to call it out in the commit message.

Should we also be walking from->objects and clearing the flags from
there (maybe not RESULT, but assign_flag)? It's not clear to me if it's
unintentional for those to be marked after the function leaves or not.

Also, a minor aside, but I think it would be slightly more efficient for
those final two lines to do:

  clear_commit_marks(list[i], RESULT | assign_flag);

Of course, that's totally orthogonal to this patch, but it may make you
feel better to offset the other round of clearing I'm suggesting. ;)

> diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
> index eb21103998..08d2ea68e8 100644
> --- a/t/helper/test-reach.c
> +++ b/t/helper/test-reach.c

These bits all looked good to me.

-Peff
Derrick Stolee Sept. 24, 2018, 11:48 a.m. UTC | #2
On 9/21/2018 7:56 PM, Jeff King wrote:
> On Fri, Sep 21, 2018 at 08:05:26AM -0700, Derrick Stolee via GitGitGadget wrote:
> +		if (!from_one || from_one->type != OBJ_COMMIT) {
> +			/* no way to tell if this is reachable by
> +			 * looking at the ancestry chain alone, so
> +			 * leave a note to ourselves not to worry about
> +			 * this object anymore.
> +			 */
> A minor nit, but the original comment you restored here has a style
> violation. Might be worth fixing up (but certainly not worth a re-roll
> on its own).
>
>> @@ -600,7 +622,7 @@ int can_all_from_reach_with_flag(struct object_array *from,
>>   	}
>>   
>>   cleanup:
>> -	for (i = 0; i < from->nr; i++) {
>> +	for (i = 0; i < nr_commits; i++) {
>>   		clear_commit_marks(list[i], RESULT);
>>   		clear_commit_marks(list[i], assign_flag);
> Also, a minor aside, but I think it would be slightly more efficient for
> those final two lines to do:
>
>    clear_commit_marks(list[i], RESULT | assign_flag);
>
> Of course, that's totally orthogonal to this patch, but it may make you
> feel better to offset the other round of clearing I'm suggesting. ;)

This is definitely a better thing to do, and I'll make the change in v4, 
along with the comment style cleanup above.

Thanks,

-Stolee
diff mbox series

Patch

diff --git a/commit-reach.c b/commit-reach.c
index 86715c103c..e748414d04 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -544,20 +544,42 @@  int can_all_from_reach_with_flag(struct object_array *from,
 {
 	struct commit **list = NULL;
 	int i;
+	int nr_commits;
 	int result = 1;
 
 	ALLOC_ARRAY(list, from->nr);
+	nr_commits = 0;
 	for (i = 0; i < from->nr; i++) {
-		list[i] = (struct commit *)from->objects[i].item;
+		struct object *from_one = from->objects[i].item;
 
-		if (parse_commit(list[i]) ||
-		    list[i]->generation < min_generation)
-			return 0;
+		if (!from_one || from_one->flags & assign_flag)
+			continue;
+
+		from_one = deref_tag(the_repository, from_one,
+				     "a from object", 0);
+		if (!from_one || from_one->type != OBJ_COMMIT) {
+			/* no way to tell if this is reachable by
+			 * looking at the ancestry chain alone, so
+			 * leave a note to ourselves not to worry about
+			 * this object anymore.
+			 */
+			from->objects[i].item->flags |= assign_flag;
+			continue;
+		}
+
+		list[nr_commits] = (struct commit *)from_one;
+		if (parse_commit(list[nr_commits]) ||
+		    list[nr_commits]->generation < min_generation) {
+			result = 0;
+			goto cleanup;
+		}
+
+		nr_commits++;
 	}
 
-	QSORT(list, from->nr, compare_commits_by_gen);
+	QSORT(list, nr_commits, compare_commits_by_gen);
 
-	for (i = 0; i < from->nr; i++) {
+	for (i = 0; i < nr_commits; i++) {
 		/* DFS from list[i] */
 		struct commit_list *stack = NULL;
 
@@ -600,7 +622,7 @@  int can_all_from_reach_with_flag(struct object_array *from,
 	}
 
 cleanup:
-	for (i = 0; i < from->nr; i++) {
+	for (i = 0; i < nr_commits; i++) {
 		clear_commit_marks(list[i], RESULT);
 		clear_commit_marks(list[i], assign_flag);
 	}
diff --git a/t/helper/test-reach.c b/t/helper/test-reach.c
index eb21103998..08d2ea68e8 100644
--- a/t/helper/test-reach.c
+++ b/t/helper/test-reach.c
@@ -31,6 +31,7 @@  int cmd__reach(int ac, const char **av)
 	struct object_id oid_A, oid_B;
 	struct commit *A, *B;
 	struct commit_list *X, *Y;
+	struct object_array X_obj = OBJECT_ARRAY_INIT;
 	struct commit **X_array;
 	int X_nr, X_alloc;
 	struct strbuf buf = STRBUF_INIT;
@@ -49,7 +50,8 @@  int cmd__reach(int ac, const char **av)
 
 	while (strbuf_getline(&buf, stdin) != EOF) {
 		struct object_id oid;
-		struct object *o;
+		struct object *orig;
+		struct object *peeled;
 		struct commit *c;
 		if (buf.len < 3)
 			continue;
@@ -57,14 +59,14 @@  int cmd__reach(int ac, const char **av)
 		if (get_oid_committish(buf.buf + 2, &oid))
 			die("failed to resolve %s", buf.buf + 2);
 
-		o = parse_object(r, &oid);
-		o = deref_tag_noverify(o);
+		orig = parse_object(r, &oid);
+		peeled = deref_tag_noverify(orig);
 
-		if (!o)
+		if (!peeled)
 			die("failed to load commit for input %s resulting in oid %s\n",
 			    buf.buf, oid_to_hex(&oid));
 
-		c = object_as_type(r, o, OBJ_COMMIT, 0);
+		c = object_as_type(r, peeled, OBJ_COMMIT, 0);
 
 		if (!c)
 			die("failed to load commit for input %s resulting in oid %s\n",
@@ -85,6 +87,7 @@  int cmd__reach(int ac, const char **av)
 				commit_list_insert(c, &X);
 				ALLOC_GROW(X_array, X_nr + 1, X_alloc);
 				X_array[X_nr++] = c;
+				add_object_array(orig, NULL, &X_obj);
 				break;
 
 			case 'Y':
@@ -113,6 +116,15 @@  int cmd__reach(int ac, const char **av)
 		print_sorted_commit_ids(list);
 	} else if (!strcmp(av[1], "can_all_from_reach")) {
 		printf("%s(X,Y):%d\n", av[1], can_all_from_reach(X, Y, 1));
+	} else if (!strcmp(av[1], "can_all_from_reach_with_flag")) {
+		struct commit_list *iter = Y;
+
+		while (iter) {
+			iter->item->object.flags |= 2;
+			iter = iter->next;
+		}
+
+		printf("%s(X,_,_,0,0):%d\n", av[1], can_all_from_reach_with_flag(&X_obj, 2, 4, 0, 0));
 	} else if (!strcmp(av[1], "commit_contains")) {
 		struct ref_filter filter;
 		struct contains_cache cache;
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index d139a00d1d..ae94b27f70 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -31,7 +31,8 @@  test_expect_success 'setup' '
 	for i in $(test_seq 1 10)
 	do
 		test_commit "1-$i" &&
-		git branch -f commit-1-$i
+		git branch -f commit-1-$i &&
+		git tag -a -m "1-$i" tag-1-$i commit-1-$i
 	done &&
 	for j in $(test_seq 1 9)
 	do
@@ -39,11 +40,13 @@  test_expect_success 'setup' '
 		x=$(($j + 1)) &&
 		test_commit "$x-1" &&
 		git branch -f commit-$x-1 &&
+		git tag -a -m "$x-1" tag-$x-1 commit-$x-1 &&
 
 		for i in $(test_seq 2 10)
 		do
 			git merge commit-$j-$i -m "$x-$i" &&
-			git branch -f commit-$x-$i
+			git branch -f commit-$x-$i &&
+			git tag -a -m "$x-$i" tag-$x-$i commit-$x-$i
 		done
 	done &&
 	git commit-graph write --reachable &&
@@ -205,6 +208,29 @@  test_expect_success 'can_all_from_reach:miss' '
 	test_three_modes can_all_from_reach
 '
 
+test_expect_success 'can_all_from_reach_with_flag: tags case' '
+	cat >input <<-\EOF &&
+	X:tag-2-10
+	X:tag-3-9
+	X:tag-4-8
+	X:commit-5-7
+	X:commit-6-6
+	X:commit-7-5
+	X:commit-8-4
+	X:commit-9-3
+	Y:tag-1-9
+	Y:tag-2-8
+	Y:tag-3-7
+	Y:commit-4-6
+	Y:commit-5-5
+	Y:commit-6-4
+	Y:commit-7-3
+	Y:commit-8-1
+	EOF
+	echo "can_all_from_reach_with_flag(X,_,_,0,0):1" >expect &&
+	test_three_modes can_all_from_reach_with_flag
+'
+
 test_expect_success 'commit_contains:hit' '
 	cat >input <<-\EOF &&
 	A:commit-7-7