[v2,2/2] tree:<depth>: skip some trees even when collecting omits
diff mbox series

Message ID 20181210234030.176178-3-matvore@google.com
State New
Headers show
Series
  • support for filtering trees and blobs based on depth
Related show

Commit Message

Matthew DeVore Dec. 10, 2018, 11:40 p.m. UTC
If a tree has already been recorded as omitted, we don't need to
traverse it again just to collect its omits. Stop traversing trees a
second time when collecting omits.

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 list-objects-filter.c               | 17 +++++++++++------
 t/t6112-rev-list-filters-objects.sh | 11 ++++++++++-
 2 files changed, 21 insertions(+), 7 deletions(-)

Comments

Jonathan Tan Jan. 8, 2019, 2 a.m. UTC | #1
> -static void filter_trees_update_omits(
> +static int filter_trees_update_omits(
>  	struct object *obj,
>  	struct filter_trees_depth_data *filter_data,
>  	int include_it)
>  {
>  	if (!filter_data->omits)
> -		return;
> +		return 1;
>  
>  	if (include_it)
> -		oidset_remove(filter_data->omits, &obj->oid);
> +		return oidset_remove(filter_data->omits, &obj->oid);
>  	else
> -		oidset_insert(filter_data->omits, &obj->oid);
> +		return oidset_insert(filter_data->omits, &obj->oid);
>  }

I think this function is getting too magical - if filter_data->omits is
not set, we pretend that we have omitted the tree, because we want the
same behavior when not needing omits and when the tree is omitted. Could
this be done another way?
Jonathan Tan Jan. 8, 2019, 11:22 p.m. UTC | #2
> > -static void filter_trees_update_omits(
> > +static int filter_trees_update_omits(
> >  	struct object *obj,
> >  	struct filter_trees_depth_data *filter_data,
> >  	int include_it)
> >  {
> >  	if (!filter_data->omits)
> > -		return;
> > +		return 1;
> >  
> >  	if (include_it)
> > -		oidset_remove(filter_data->omits, &obj->oid);
> > +		return oidset_remove(filter_data->omits, &obj->oid);
> >  	else
> > -		oidset_insert(filter_data->omits, &obj->oid);
> > +		return oidset_insert(filter_data->omits, &obj->oid);
> >  }
> 
> I think this function is getting too magical - if filter_data->omits is
> not set, we pretend that we have omitted the tree, because we want the
> same behavior when not needing omits and when the tree is omitted. Could
> this be done another way?

Giving some more thought to this, since this is a static function, maybe
documenting it as "Returns 1 if the objects that this object references need to
be traversed for "omits" updates, and 0 otherwise" (with the appropriate code
updates) would suffice.
Matthew DeVore Jan. 9, 2019, 12:29 a.m. UTC | #3
Thank you for the review :) See below.

On 2019/01/07 18:00, Jonathan Tan wrote:
>> -static void filter_trees_update_omits(
>> +static int filter_trees_update_omits(
>>   	struct object *obj,
>>   	struct filter_trees_depth_data *filter_data,
>>   	int include_it)
>>   {
>>   	if (!filter_data->omits)
>> -		return;
>> +		return 1;
>>   
>>   	if (include_it)
>> -		oidset_remove(filter_data->omits, &obj->oid);
>> +		return oidset_remove(filter_data->omits, &obj->oid);
>>   	else
>> -		oidset_insert(filter_data->omits, &obj->oid);
>> +		return oidset_insert(filter_data->omits, &obj->oid);
>>   }
> I think this function is getting too magical - if filter_data->omits is
> not set, we pretend that we have omitted the tree, because we want the
> same behavior when not needing omits and when the tree is omitted. Could
> this be done another way?

Yes, returning a manipulative lie when omits is NULL is rather 
confusing. So I changed it to this (interdiff):

+/* Returns 1 if the oid was in the omits set before it was invoked. */
  static int filter_trees_update_omits(
      struct object *obj,
      struct filter_trees_depth_data *filter_data,
      int include_it)
  {
      if (!filter_data->omits)
-        return 1;
+        return 0;

      if (include_it)
          return oidset_remove(filter_data->omits, &obj->oid);
@@ -177,7 +178,7 @@ static enum list_objects_filter_result 
filter_trees_depth(

              if (include_it)
                  filter_res = LOFR_DO_SHOW;
-            else if (!been_omitted)
+            else if (filter_data->omits && !been_omitted)
                  /*
                   * Must update omit information of children
                   * recursively; they have not been omitted yet.
Matthew DeVore Jan. 9, 2019, 2:47 a.m. UTC | #4
> On January 8, 2019 at 3:22 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> 
> 
> > > -static void filter_trees_update_omits(
> > > +static int filter_trees_update_omits(
> > >  	struct object *obj,
> > >  	struct filter_trees_depth_data *filter_data,
> > >  	int include_it)
> > >  {
> > >  	if (!filter_data->omits)
> > > -		return;
> > > +		return 1;
> > >  
> > >  	if (include_it)
> > > -		oidset_remove(filter_data->omits, &obj->oid);
> > > +		return oidset_remove(filter_data->omits, &obj->oid);
> > >  	else
> > > -		oidset_insert(filter_data->omits, &obj->oid);
> > > +		return oidset_insert(filter_data->omits, &obj->oid);
> > >  }
> > 
> > I think this function is getting too magical - if filter_data->omits is
> > not set, we pretend that we have omitted the tree, because we want the
> > same behavior when not needing omits and when the tree is omitted. Could
> > this be done another way?
> 
> Giving some more thought to this, since this is a static function, maybe
> documenting it as "Returns 1 if the objects that this object references need to
> be traversed for "omits" updates, and 0 otherwise" (with the appropriate code
> updates) would suffice.

That's not bad. But I sent a correction which is more like "/* Returns 1 if the oid was in the omits set before it was invoked. */" and returns 0 if omits was NULL. I thought it clearer when the function returns a value in terms of its own arguments and logic, rather than what the caller needs to do. The code I save going with your suggestion (vs. the one I just sent) is offset by the necessity of more detailed comments.

Patch
diff mbox series

diff --git a/list-objects-filter.c b/list-objects-filter.c
index ca99f0dd02..3e9802c676 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -107,18 +107,18 @@  struct seen_map_entry {
 	size_t depth;
 };
 
-static void filter_trees_update_omits(
+static int filter_trees_update_omits(
 	struct object *obj,
 	struct filter_trees_depth_data *filter_data,
 	int include_it)
 {
 	if (!filter_data->omits)
-		return;
+		return 1;
 
 	if (include_it)
-		oidset_remove(filter_data->omits, &obj->oid);
+		return oidset_remove(filter_data->omits, &obj->oid);
 	else
-		oidset_insert(filter_data->omits, &obj->oid);
+		return oidset_insert(filter_data->omits, &obj->oid);
 }
 
 static enum list_objects_filter_result filter_trees_depth(
@@ -170,12 +170,17 @@  static enum list_objects_filter_result filter_trees_depth(
 		if (already_seen)
 			filter_res = LOFR_SKIP_TREE;
 		else {
+			int been_omitted = filter_trees_update_omits(
+				obj, filter_data, include_it);
 			seen_info->depth = filter_data->current_depth;
-			filter_trees_update_omits(obj, filter_data, include_it);
 
 			if (include_it)
 				filter_res = LOFR_DO_SHOW;
-			else if (filter_data->omits)
+			else if (!been_omitted)
+				/*
+				 * Must update omit information of children
+				 * recursively; they have not been omitted yet.
+				 */
 				filter_res = LOFR_ZERO;
 			else
 				filter_res = LOFR_SKIP_TREE;
diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
index 54e7096d40..18b0b14d5a 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -283,7 +283,7 @@  test_expect_success 'verify tree:0 includes trees in "filtered" output' '
 
 # Make sure tree:0 does not iterate through any trees.
 
-test_expect_success 'filter a GIANT tree through tree:0' '
+test_expect_success 'verify skipping tree iteration when not collecting omits' '
 	GIT_TRACE=1 git -C r3 rev-list \
 		--objects --filter=tree:0 HEAD 2>filter_trace &&
 	grep "Skipping contents of tree [.][.][.]" filter_trace >actual &&
@@ -377,6 +377,15 @@  test_expect_success 'test tree:# filter provisional omit for blob and tree' '
 	expect_has_with_different_name r4 filt/subdir
 '
 
+test_expect_success 'verify skipping tree iteration when collecting omits' '
+	GIT_TRACE=1 git -C r4 rev-list --filter-print-omitted \
+		--objects --filter=tree:0 HEAD 2>filter_trace &&
+	grep "^Skipping contents of tree " filter_trace >actual &&
+
+	echo "Skipping contents of tree subdir/..." >expect &&
+	test_cmp expect actual
+'
+
 # Test tree:<depth> where a tree is iterated to twice - once where a subentry is
 # too deep to be included, and again where the blob inside it is shallow enough
 # to be included. This makes sure we don't use LOFR_MARK_SEEN incorrectly (we