diff mbox series

[v2,1/2] list-objects-filter: teach tree:# how to handle >0

Message ID 20181210234030.176178-2-matvore@google.com (mailing list archive)
State New, archived
Headers show
Series support for filtering trees and blobs based on depth | expand

Commit Message

Matthew DeVore Dec. 10, 2018, 11:40 p.m. UTC
Implement positive values for <depth> in the tree:<depth> filter. The
exact semantics are described in Documentation/rev-list-options.txt.

The long-term goal at the end of this is to allow a partial clone to
eagerly fetch an entire directory of files by fetching a tree and
specifying <depth>=1. This, for instance, would make a build operation
fast and convenient. It is fast because the partial clone does not need
to fetch each file individually, and convenient because the user does
not need to supply a sparse-checkout specification.

Another way of considering this feature is as a way to reduce
round-trips, since the client can get any number of levels of
directories in a single request, rather than wait for each level of tree
objects to come back, whose entries are used to construct a new request.

Signed-off-by: Matthew DeVore <matvore@google.com>
---
 Documentation/rev-list-options.txt  |   9 ++-
 list-objects-filter-options.c       |   7 +-
 list-objects-filter-options.h       |   3 +-
 list-objects-filter.c               | 115 +++++++++++++++++++++++-----
 t/t6112-rev-list-filters-objects.sh | 104 +++++++++++++++++++++++++
 5 files changed, 210 insertions(+), 28 deletions(-)

Comments

Jonathan Tan Jan. 8, 2019, 1:56 a.m. UTC | #1
Thanks - as stated in your commit message, this adds quite a useful
piece of functionality.

> -	case LOFS_BEGIN_TREE:
> -	case LOFS_BLOB:
> -		if (filter_data->omits) {
> -			oidset_insert(filter_data->omits, &obj->oid);
> -			/* _MARK_SEEN but not _DO_SHOW (hard omit) */
> -			return LOFR_MARK_SEEN;
> -		} else {
> -			/*
> -			 * Not collecting omits so no need to to traverse tree.
> -			 */
> -			return LOFR_SKIP_TREE | LOFR_MARK_SEEN;
> -		}
> -
>  	case LOFS_END_TREE:
>  		assert(obj->type == OBJ_TREE);
> +		filter_data->current_depth--;
>  		return LOFR_ZERO;
>  
> +	case LOFS_BLOB:
> +		filter_trees_update_omits(obj, filter_data, include_it);
> +		return include_it ? LOFR_MARK_SEEN | LOFR_DO_SHOW : LOFR_ZERO;

Any reason for moving "case LOFS_BLOB" (and "case LOFS_BEGIN_TREE"
below) after LOFS_END_TREE?

This is drastically different from the previous case, but this makes
sense - previously, all blobs accessed through traversal were not shown,
but now they are sometimes shown. Here, filter_trees_update_omits() is
only ever used to remove a blob from the omits set, since once this blob
is encountered with include_it == true, it is marked as LOFR_MARK_SEEN
and will not be traversed again.

> +	case LOFS_BEGIN_TREE:
> +		seen_info = oidmap_get(
> +			&filter_data->seen_at_depth, &obj->oid);
> +		if (!seen_info) {
> +			seen_info = xcalloc(1, sizeof(struct seen_map_entry));

Use sizeof(*seen_info).

> +			seen_info->base.oid = obj->oid;

We have been using oidcpy, but come to think of it, I'm not sure why...

> +			seen_info->depth = filter_data->current_depth;
> +			oidmap_put(&filter_data->seen_at_depth, seen_info);
> +			already_seen = 0;
> +		} else
> +			already_seen =
> +				filter_data->current_depth >= seen_info->depth;

There has been recently some clarification that if one branch of an
if/else construct requires braces, braces should be put on all of them:
1797dc5176 ("CodingGuidelines: clarify multi-line brace style",
2017-01-17). Likewise below.

> +		if (already_seen)
> +			filter_res = LOFR_SKIP_TREE;
> +		else {
> +			seen_info->depth = filter_data->current_depth;
> +			filter_trees_update_omits(obj, filter_data, include_it);
> +
> +			if (include_it)
> +				filter_res = LOFR_DO_SHOW;
> +			else if (filter_data->omits)
> +				filter_res = LOFR_ZERO;
> +			else
> +				filter_res = LOFR_SKIP_TREE;

Looks straightforward. If we have already seen it at a shallower or
equal depth, we can skip it (since we have already done the appropriate
processing). Otherwise, we need to ensure that its "omit" is correctly
set, and:
 - show it if include_it
 - don't do anything special if not include_it and we need the omit set
 - skip the tree if not include_it and we don't need the omit set

> +static void filter_trees_free(void *filter_data) {
> +	struct filter_trees_depth_data* d = filter_data;
> +	oidmap_free(&d->seen_at_depth, 1);
> +	free(d);
> +}

Check for NULL-ness of filter_data too, to match the usual behavior of
free functions.

> diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
> index eb32505a6e..54e7096d40 100755
> --- a/t/t6112-rev-list-filters-objects.sh
> +++ b/t/t6112-rev-list-filters-objects.sh

[snip]

Thanks for the tests that cover quite a wide range of cases. Can you
also demonstrate the case where a blob would normally be omitted
(because it is too deep) but it is directly specified, so it is
included.

> +expect_has_with_different_name () {
> +	repo=$1 &&
> +	name=$2 &&
> +
> +	hash=$(git -C $repo rev-parse HEAD:$name) &&
> +	! grep "^$hash $name$" actual &&
> +	grep "^$hash " actual &&
> +	! grep "~$hash" actual
> +}

Should we also check that a "~" entry appears with $name?
Matthew DeVore Jan. 8, 2019, 7:22 p.m. UTC | #2
On 2019/01/07 17:56, Jonathan Tan wrote:
>>   	case LOFS_END_TREE:
>>   		assert(obj->type == OBJ_TREE);
>> +		filter_data->current_depth--;
>>   		return LOFR_ZERO;
>>   
>> +	case LOFS_BLOB:
>> +		filter_trees_update_omits(obj, filter_data, include_it);
>> +		return include_it ? LOFR_MARK_SEEN | LOFR_DO_SHOW : LOFR_ZERO;
> Any reason for moving "case LOFS_BLOB" (and "case LOFS_BEGIN_TREE"
> below) after LOFS_END_TREE?

I put LOFS_BLOB and after LOFS_END_TREE since that is the order in all 
the other filter logic functions. I put LOFS_BEGIN_TREE at the end 
(which is different from the other filter logic functions) because it's 
usually better to put simpler things before longer or more complex 
things. LOFS_BEGIN_TREE is much more complex and if it were not the last 
switch section, it would tend to hide the sections that come after it.

FWIW, I consider this the coding corollary of the end-weight problem in 
linguistics - see https://www.thoughtco.com/end-weight-grammar-1690594 - 
this is not my original idea, but something from the book Perl Best 
Practices, although that book only mentioned it in the context of 
ordering clauses in single statements rather than ordering entire blocks.

>
> This is drastically different from the previous case, but this makes
> sense - previously, all blobs accessed through traversal were not shown,
> but now they are sometimes shown.
Yes.
> Here, filter_trees_update_omits() is
> only ever used to remove a blob from the omits set, since once this blob
> is encountered with include_it == true, it is marked as LOFR_MARK_SEEN
> and will not be traversed again.
It is possible that include_it can be false and then in a later 
invocation it can be true. In that case, the blob will be added to the 
set and then removed from it.
>
>> +	case LOFS_BEGIN_TREE:
>> +		seen_info = oidmap_get(
>> +			&filter_data->seen_at_depth, &obj->oid);
>> +		if (!seen_info) {
>> +			seen_info = xcalloc(1, sizeof(struct seen_map_entry));
> Use sizeof(*seen_info).
Done.
>
>> +			seen_info->base.oid = obj->oid;
> We have been using oidcpy, but come to think of it, I'm not sure why...
Because the hash algorithm in use may not use the entire structure, 
apparently. Or there are future improvements planned to the function and 
they need to be picked up by all current hash-copying operations. Fixed.
>
>> +			seen_info->depth = filter_data->current_depth;
>> +			oidmap_put(&filter_data->seen_at_depth, seen_info);
>> +			already_seen = 0;
>> +		} else
>> +			already_seen =
>> +				filter_data->current_depth >= seen_info->depth;
> There has been recently some clarification that if one branch of an
> if/else construct requires braces, braces should be put on all of them:
> 1797dc5176 ("CodingGuidelines: clarify multi-line brace style",
> 2017-01-17). Likewise below.
Done, thank you - that's good to know.
>
>> +		if (already_seen)
>> +			filter_res = LOFR_SKIP_TREE;
>> +		else {
>> +			seen_info->depth = filter_data->current_depth;
>> +			filter_trees_update_omits(obj, filter_data, include_it);
>> +
>> +			if (include_it)
>> +				filter_res = LOFR_DO_SHOW;
>> +			else if (filter_data->omits)
>> +				filter_res = LOFR_ZERO;
>> +			else
>> +				filter_res = LOFR_SKIP_TREE;
> Looks straightforward. If we have already seen it at a shallower or
> equal depth, we can skip it (since we have already done the appropriate
> processing). Otherwise, we need to ensure that its "omit" is correctly
> set, and:
>   - show it if include_it
>   - don't do anything special if not include_it and we need the omit set
>   - skip the tree if not include_it and we don't need the omit set
Right.
>
>> +static void filter_trees_free(void *filter_data) {
>> +	struct filter_trees_depth_data* d = filter_data;
>> +	oidmap_free(&d->seen_at_depth, 1);
>> +	free(d);
>> +}
> Check for NULL-ness of filter_data too, to match the usual behavior of
> free functions.
>
Done.
>> diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
>> index eb32505a6e..54e7096d40 100755
>> --- a/t/t6112-rev-list-filters-objects.sh
>> +++ b/t/t6112-rev-list-filters-objects.sh
> [snip]
>
> Thanks for the tests that cover quite a wide range of cases. Can you
> also demonstrate the case where a blob would normally be omitted
> (because it is too deep) but it is directly specified, so it is
> included.

I didn't exactly use TDD, but I did try to cover every line of code as 
well as both branches of each ternary operator.

Added such a test.

>
>> +expect_has_with_different_name () {
>> +	repo=$1 &&
>> +	name=$2 &&
>> +
>> +	hash=$(git -C $repo rev-parse HEAD:$name) &&
>> +	! grep "^$hash $name$" actual &&
>> +	grep "^$hash " actual &&
>> +	! grep "~$hash" actual
>> +}
> Should we also check that a "~" entry appears with $name?

I don't believe there is a way to get the object names to appear next to 
~ entries (note that the names are not saved in the omits oidset).

For your reference, here is an interdiff for this particular patch after 
applying your comments:

--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -158,18 +158,19 @@ static enum list_objects_filter_result 
filter_trees_depth(
          seen_info = oidmap_get(
              &filter_data->seen_at_depth, &obj->oid);
          if (!seen_info) {
-            seen_info = xcalloc(1, sizeof(struct seen_map_entry));
-            seen_info->base.oid = obj->oid;
+            seen_info = xcalloc(1, sizeof(*seen_info));
+            oidcpy(&seen_info->base.oid, &obj->oid);
              seen_info->depth = filter_data->current_depth;
              oidmap_put(&filter_data->seen_at_depth, seen_info);
              already_seen = 0;
-        } else
+        } else {
              already_seen =
                  filter_data->current_depth >= seen_info->depth;
+        }

-        if (already_seen)
+        if (already_seen) {
              filter_res = LOFR_SKIP_TREE;
-        else {
+        } else {
              int been_omitted = filter_trees_update_omits(
                  obj, filter_data, include_it);
              seen_info->depth = filter_data->current_depth;
@@ -193,6 +194,8 @@ static enum list_objects_filter_result 
filter_trees_depth(

  static void filter_trees_free(void *filter_data) {
      struct filter_trees_depth_data* d = filter_data;
+    if (!d)
+        return;
      oidmap_free(&d->seen_at_depth, 1);
      free(d);
  }
diff --git a/t/'t6112-rev-list-filters-objects.sh 
b/t/'t6112-rev-list-filters-objects.sh
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/t/t6112-rev-list-filters-objects.sh 
b/t/t6112-rev-list-filters-objects.sh
index 18b0b14d5a..d6edad6a01 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -407,6 +407,13 @@ test_expect_success 'tree:<depth> where we iterate 
over tree at two levels' '
      expect_has_with_different_name r5 a/subdir/b/foo
  '

+test_expect_success 'tree:<depth> which filters out blob but given as 
arg' '
+    export blob_hash=$(git -C r4 rev-parse HEAD:subdir/bar) &&
+
+    git -C r4 rev-list --objects --filter=tree:1 HEAD $blob_hash >actual &&
+    grep ^$blob_hash actual
+'
+
  # Delete some loose objects and use rev-list, but WITHOUT any filtering.
  # This models previously omitted objects that we did not receive.
Jonathan Tan Jan. 8, 2019, 11:19 p.m. UTC | #3
> > Any reason for moving "case LOFS_BLOB" (and "case LOFS_BEGIN_TREE"
> > below) after LOFS_END_TREE?
> 
> I put LOFS_BLOB and after LOFS_END_TREE since that is the order in all 
> the other filter logic functions. I put LOFS_BEGIN_TREE at the end 
> (which is different from the other filter logic functions) because it's 
> usually better to put simpler things before longer or more complex 
> things. LOFS_BEGIN_TREE is much more complex and if it were not the last 
> switch section, it would tend to hide the sections that come after it.
> 
> FWIW, I consider this the coding corollary of the end-weight problem in 
> linguistics - see https://www.thoughtco.com/end-weight-grammar-1690594 - 
> this is not my original idea, but something from the book Perl Best 
> Practices, although that book only mentioned it in the context of 
> ordering clauses in single statements rather than ordering entire blocks.

OK - my thinking was that we should minimize the diff, but this
reasoning makes sense to me.

> > Here, filter_trees_update_omits() is
> > only ever used to remove a blob from the omits set, since once this blob
> > is encountered with include_it == true, it is marked as LOFR_MARK_SEEN
> > and will not be traversed again.
> It is possible that include_it can be false and then in a later 
> invocation it can be true. In that case, the blob will be added to the 
> set and then removed from it.

Ah...yes, you're right.

> For your reference, here is an interdiff for this particular patch after 
> applying your comments:

The interdiff looks good, thanks. All my issues are resolved.
Junio C Hamano Jan. 8, 2019, 11:36 p.m. UTC | #4
Jonathan Tan <jonathantanmy@google.com> writes:

>> For your reference, here is an interdiff for this particular patch after 
>> applying your comments:
>
> The interdiff looks good, thanks. All my issues are resolved.

Just to make sure.  That's not "v2 is good", but "v2 plus that
proposed update, when materializes, would be good", right?  I'll
mark the topic as "expecting a reroll" then.

Thanks.
Junio C Hamano Jan. 8, 2019, 11:39 p.m. UTC | #5
Jonathan Tan <jonathantanmy@google.com> writes:

>> +static void filter_trees_free(void *filter_data) {
>> +	struct filter_trees_depth_data* d = filter_data;
>> +	oidmap_free(&d->seen_at_depth, 1);
>> +	free(d);
>> +}
>
> Check for NULL-ness of filter_data too, to match the usual behavior of
> free functions.

Also, the asterisk sticks to the variable, not type, i.e.

	struct filter_trees_depth_data *d = filter_data;
Jonathan Tan Jan. 8, 2019, 11:41 p.m. UTC | #6
> Jonathan Tan <jonathantanmy@google.com> writes:
> 
> >> For your reference, here is an interdiff for this particular patch after 
> >> applying your comments:
> >
> > The interdiff looks good, thanks. All my issues are resolved.
> 
> Just to make sure.  That's not "v2 is good", but "v2 plus that
> proposed update, when materializes, would be good", right?  I'll
> mark the topic as "expecting a reroll" then.

Yes, that's right - there should be a reroll of this topic. Also,
besides the proposed update, I have a comment in patch 2 of this set.
Matthew DeVore Jan. 9, 2019, 2:43 a.m. UTC | #7
> On January 8, 2019 at 3:39 PM Junio C Hamano <gitster@pobox.com> wrote:
> 
> Also, the asterisk sticks to the variable, not type, i.e.
> 
> 	struct filter_trees_depth_data *d = filter_data;
>

Fixed. Thanks.
diff mbox series

Patch

diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index bab5f50b17..f8ab00f7c9 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -734,8 +734,13 @@  specification contained in <path>.
 +
 The form '--filter=tree:<depth>' omits all blobs and trees whose depth
 from the root tree is >= <depth> (minimum depth if an object is located
-at multiple depths in the commits traversed). Currently, only <depth>=0
-is supported, which omits all blobs and trees.
+at multiple depths in the commits traversed). <depth>=0 will not include
+any trees or blobs unless included explicitly in the command-line (or
+standard input when --stdin is used). <depth>=1 will include only the
+tree and blobs which are referenced directly by a commit reachable from
+<commit> or an explicitly-given object. <depth>=2 is like <depth>=1
+while also including trees and blobs one more level removed from an
+explicitly-given commit or tree.
 
 --no-filter::
 	Turn off any previous `--filter=` argument.
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index e8da2e8581..5285e7674d 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -50,16 +50,15 @@  static int gently_parse_list_objects_filter(
 		}
 
 	} else if (skip_prefix(arg, "tree:", &v0)) {
-		unsigned long depth;
-		if (!git_parse_ulong(v0, &depth) || depth != 0) {
+		if (!git_parse_ulong(v0, &filter_options->tree_exclude_depth)) {
 			if (errbuf) {
 				strbuf_addstr(
 					errbuf,
-					_("only 'tree:0' is supported"));
+					_("expected 'tree:<depth>'"));
 			}
 			return 1;
 		}
-		filter_options->choice = LOFC_TREE_NONE;
+		filter_options->choice = LOFC_TREE_DEPTH;
 		return 0;
 
 	} else if (skip_prefix(arg, "sparse:oid=", &v0)) {
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index af64e5c66f..477cd97029 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -10,7 +10,7 @@  enum list_objects_filter_choice {
 	LOFC_DISABLED = 0,
 	LOFC_BLOB_NONE,
 	LOFC_BLOB_LIMIT,
-	LOFC_TREE_NONE,
+	LOFC_TREE_DEPTH,
 	LOFC_SPARSE_OID,
 	LOFC_SPARSE_PATH,
 	LOFC__COUNT /* must be last */
@@ -44,6 +44,7 @@  struct list_objects_filter_options {
 	struct object_id *sparse_oid_value;
 	char *sparse_path_value;
 	unsigned long blob_limit_value;
+	unsigned long tree_exclude_depth;
 };
 
 /* Normalized command line arguments */
diff --git a/list-objects-filter.c b/list-objects-filter.c
index a62624a1ce..ca99f0dd02 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -10,6 +10,7 @@ 
 #include "list-objects.h"
 #include "list-objects-filter.h"
 #include "list-objects-filter-options.h"
+#include "oidmap.h"
 #include "oidset.h"
 #include "object-store.h"
 
@@ -84,11 +85,43 @@  static void *filter_blobs_none__init(
  * A filter for list-objects to omit ALL trees and blobs from the traversal.
  * Can OPTIONALLY collect a list of the omitted OIDs.
  */
-struct filter_trees_none_data {
+struct filter_trees_depth_data {
 	struct oidset *omits;
+
+	/*
+	 * Maps trees to the minimum depth at which they were seen. It is not
+	 * necessary to re-traverse a tree at deeper or equal depths than it has
+	 * already been traversed.
+	 *
+	 * We can't use LOFR_MARK_SEEN for tree objects since this will prevent
+	 * it from being traversed at shallower depths.
+	 */
+	struct oidmap seen_at_depth;
+
+	unsigned long exclude_depth;
+	unsigned long current_depth;
 };
 
-static enum list_objects_filter_result filter_trees_none(
+struct seen_map_entry {
+	struct oidmap_entry base;
+	size_t depth;
+};
+
+static void filter_trees_update_omits(
+	struct object *obj,
+	struct filter_trees_depth_data *filter_data,
+	int include_it)
+{
+	if (!filter_data->omits)
+		return;
+
+	if (include_it)
+		oidset_remove(filter_data->omits, &obj->oid);
+	else
+		oidset_insert(filter_data->omits, &obj->oid);
+}
+
+static enum list_objects_filter_result filter_trees_depth(
 	struct repository *r,
 	enum list_objects_filter_situation filter_situation,
 	struct object *obj,
@@ -96,43 +129,83 @@  static enum list_objects_filter_result filter_trees_none(
 	const char *filename,
 	void *filter_data_)
 {
-	struct filter_trees_none_data *filter_data = filter_data_;
+	struct filter_trees_depth_data *filter_data = filter_data_;
+	struct seen_map_entry *seen_info;
+	int include_it = filter_data->current_depth <
+		filter_data->exclude_depth;
+	int filter_res;
+	int already_seen;
+
+	/*
+	 * Note that we do not use _MARK_SEEN in order to allow re-traversal in
+	 * case we encounter a tree or blob again at a shallower depth.
+	 */
 
 	switch (filter_situation) {
 	default:
 		BUG("unknown filter_situation: %d", filter_situation);
 
-	case LOFS_BEGIN_TREE:
-	case LOFS_BLOB:
-		if (filter_data->omits) {
-			oidset_insert(filter_data->omits, &obj->oid);
-			/* _MARK_SEEN but not _DO_SHOW (hard omit) */
-			return LOFR_MARK_SEEN;
-		} else {
-			/*
-			 * Not collecting omits so no need to to traverse tree.
-			 */
-			return LOFR_SKIP_TREE | LOFR_MARK_SEEN;
-		}
-
 	case LOFS_END_TREE:
 		assert(obj->type == OBJ_TREE);
+		filter_data->current_depth--;
 		return LOFR_ZERO;
 
+	case LOFS_BLOB:
+		filter_trees_update_omits(obj, filter_data, include_it);
+		return include_it ? LOFR_MARK_SEEN | LOFR_DO_SHOW : LOFR_ZERO;
+
+	case LOFS_BEGIN_TREE:
+		seen_info = oidmap_get(
+			&filter_data->seen_at_depth, &obj->oid);
+		if (!seen_info) {
+			seen_info = xcalloc(1, sizeof(struct seen_map_entry));
+			seen_info->base.oid = obj->oid;
+			seen_info->depth = filter_data->current_depth;
+			oidmap_put(&filter_data->seen_at_depth, seen_info);
+			already_seen = 0;
+		} else
+			already_seen =
+				filter_data->current_depth >= seen_info->depth;
+
+		if (already_seen)
+			filter_res = LOFR_SKIP_TREE;
+		else {
+			seen_info->depth = filter_data->current_depth;
+			filter_trees_update_omits(obj, filter_data, include_it);
+
+			if (include_it)
+				filter_res = LOFR_DO_SHOW;
+			else if (filter_data->omits)
+				filter_res = LOFR_ZERO;
+			else
+				filter_res = LOFR_SKIP_TREE;
+		}
+
+		filter_data->current_depth++;
+		return filter_res;
 	}
 }
 
-static void* filter_trees_none__init(
+static void filter_trees_free(void *filter_data) {
+	struct filter_trees_depth_data* d = filter_data;
+	oidmap_free(&d->seen_at_depth, 1);
+	free(d);
+}
+
+static void* filter_trees_depth__init(
 	struct oidset *omitted,
 	struct list_objects_filter_options *filter_options,
 	filter_object_fn *filter_fn,
 	filter_free_fn *filter_free_fn)
 {
-	struct filter_trees_none_data *d = xcalloc(1, sizeof(*d));
+	struct filter_trees_depth_data *d = xcalloc(1, sizeof(*d));
 	d->omits = omitted;
+	oidmap_init(&d->seen_at_depth, 0);
+	d->exclude_depth = filter_options->tree_exclude_depth;
+	d->current_depth = 0;
 
-	*filter_fn = filter_trees_none;
-	*filter_free_fn = free;
+	*filter_fn = filter_trees_depth;
+	*filter_free_fn = filter_trees_free;
 	return d;
 }
 
@@ -430,7 +503,7 @@  static filter_init_fn s_filters[] = {
 	NULL,
 	filter_blobs_none__init,
 	filter_blobs_limit__init,
-	filter_trees_none__init,
+	filter_trees_depth__init,
 	filter_sparse_oid__init,
 	filter_sparse_path__init,
 };
diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
index eb32505a6e..54e7096d40 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -294,6 +294,110 @@  test_expect_success 'filter a GIANT tree through tree:0' '
 	! grep "Skipping contents of tree [^.]" filter_trace
 '
 
+# Test tree:# filters.
+
+expect_has () {
+	commit=$1 &&
+	name=$2 &&
+
+	hash=$(git -C r3 rev-parse $commit:$name) &&
+	grep "^$hash $name$" actual
+}
+
+test_expect_success 'verify tree:1 includes root trees' '
+	git -C r3 rev-list --objects --filter=tree:1 HEAD >actual &&
+
+	# We should get two root directories and two commits.
+	expect_has HEAD "" &&
+	expect_has HEAD~1 ""  &&
+	test_line_count = 4 actual
+'
+
+test_expect_success 'verify tree:2 includes root trees and immediate children' '
+	git -C r3 rev-list --objects --filter=tree:2 HEAD >actual &&
+
+	expect_has HEAD "" &&
+	expect_has HEAD~1 "" &&
+	expect_has HEAD dir1 &&
+	expect_has HEAD pattern &&
+	expect_has HEAD sparse1 &&
+	expect_has HEAD sparse2 &&
+
+	# There are also 2 commit objects
+	test_line_count = 8 actual
+'
+
+test_expect_success 'verify tree:3 includes everything expected' '
+	git -C r3 rev-list --objects --filter=tree:3 HEAD >actual &&
+
+	expect_has HEAD "" &&
+	expect_has HEAD~1 "" &&
+	expect_has HEAD dir1 &&
+	expect_has HEAD dir1/sparse1 &&
+	expect_has HEAD dir1/sparse2 &&
+	expect_has HEAD pattern &&
+	expect_has HEAD sparse1 &&
+	expect_has HEAD sparse2 &&
+
+	# There are also 2 commit objects
+	test_line_count = 10 actual
+'
+
+# Test provisional omit collection logic with a repo that has objects appearing
+# at multiple depths - first deeper than the filter's threshold, then shallow.
+
+test_expect_success 'setup r4' '
+	git init r4 &&
+
+	echo foo > r4/foo &&
+	mkdir r4/subdir &&
+	echo bar > r4/subdir/bar &&
+
+	mkdir r4/filt &&
+	cp -r r4/foo r4/subdir r4/filt &&
+
+	git -C r4 add foo subdir filt &&
+	git -C r4 commit -m "commit msg"
+'
+
+expect_has_with_different_name () {
+	repo=$1 &&
+	name=$2 &&
+
+	hash=$(git -C $repo rev-parse HEAD:$name) &&
+	! grep "^$hash $name$" actual &&
+	grep "^$hash " actual &&
+	! grep "~$hash" actual
+}
+
+test_expect_success 'test tree:# filter provisional omit for blob and tree' '
+	git -C r4 rev-list --objects --filter-print-omitted --filter=tree:2 \
+		HEAD >actual &&
+	expect_has_with_different_name r4 filt/foo &&
+	expect_has_with_different_name r4 filt/subdir
+'
+
+# Test tree:<depth> where a tree is iterated to twice - once where a subentry is
+# too deep to be included, and again where the blob inside it is shallow enough
+# to be included. This makes sure we don't use LOFR_MARK_SEEN incorrectly (we
+# can't use it because a tree can be iterated over again at a lower depth).
+
+test_expect_success 'tree:<depth> where we iterate over tree at two levels' '
+	git init r5 &&
+
+	mkdir -p r5/a/subdir/b &&
+	echo foo > r5/a/subdir/b/foo &&
+
+	mkdir -p r5/subdir/b &&
+	echo foo > r5/subdir/b/foo &&
+
+	git -C r5 add a subdir &&
+	git -C r5 commit -m "commit msg" &&
+
+	git -C r5 rev-list --objects --filter=tree:4 HEAD >actual &&
+	expect_has_with_different_name r5 a/subdir/b/foo
+'
+
 # Delete some loose objects and use rev-list, but WITHOUT any filtering.
 # This models previously omitted objects that we did not receive.