From patchwork Fri Dec 20 16:21:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916979 Received: from mail-wr1-f46.google.com (mail-wr1-f46.google.com [209.85.221.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0CE77219A69 for ; Fri, 20 Dec 2024 16:21:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711684; cv=none; b=PUX/bYX+6jFH+rqK5/TOt64NtZl59nD1hl4GiDUtDRYpR/DKkbbwXZgXrEDj8ECFB9VAEdJ4aspkiKXOBRJySLQXentUlGF4czAkr4fgDFb6tKncMW1stw8I6uaxpRrN/F4csBNS/VRPDtts4V4YbQSki61mylhcsCgH9G6rY64= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711684; c=relaxed/simple; bh=10kIavHsNQglQPjc1QLrR7lp8CzYceKQpzu9mepfkvs=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=keu8Br/X+FcNbjlqhXDxfzQhJtrOPoUCSfDioeCl4ldjY0Evqh0k9ONsZ9spaGAV08qfwI1afyXoOQ2HqITLE6QImG8uEXpp9/73t/elv5T5MDenfY4poHVT7cPrwwX9qAxC65Kbw8jNrQtjV6lyA2aOOFi8JH1gNzdDB9GGbRs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=BRRPbyQo; arc=none smtp.client-ip=209.85.221.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BRRPbyQo" Received: by mail-wr1-f46.google.com with SMTP id ffacd0b85a97d-3862d6d5765so1335961f8f.3 for ; Fri, 20 Dec 2024 08:21:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711680; x=1735316480; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=Z6HFrEcwLiwhmvIYztVgNR2/ErfabQR3p2XsMXcKbOo=; b=BRRPbyQoO+CiML8CHKhRZ1Kt4vQxOnyr+iQGD3zxNm/LAWjBFSWcacX+ZyO+lBIAUr H0Dd4ceRW8icSU7oiH7dDTnvh2SZ1nblD9kh1N9+WmIdj1DPad9aqFMffpiPMbHrWMkx uObHgpE1JyPKTd2RIEsQsCjzF41cmngISRi0mKaU8oJrIO9wpzEIDS8BdhXwk0Mf0JjV Iy7CWXuIF3n38OydcyjYctfEeP/fcpqke5UXZVMFy7+7fteFsnjS4jXBj22mnHCO8R6j /WIFzKzi2gO+wb0cyUVcx4CgGVNIEQwQSt8e8min6zdB/FNrXZMaMm/3ZkhmEzILERce ni6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711680; x=1735316480; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Z6HFrEcwLiwhmvIYztVgNR2/ErfabQR3p2XsMXcKbOo=; b=pbPD8CHtDUl2rgoYApPw4tMDvdAuZ3JPWWfSMUBmqjRyvyexu9VI4BC7nn3/M1h3s4 r+XFn2PWL40ERyeSxlkm7eVY7tCGmK9DWXfEd++hDTst2XeTSCSo28zTUOXdtxB+Pa4n 1AYy0STD+UQBtSOzA/ivTMSwjW7FHe0YishMaXemlxrAHDYYmiZChjZFh7RTvQJfDxSc atGRxjDv+h2cZOaKwXTCTGQ7lvreC1oivWG9QooM/ySGmhWBV0+4TPf8jW8/yrtJ+zp6 xaMNmYatsrmWD1MIxbP1FlTGBeoxkObG+ERf9O2gKAPcHB+cEDspx2DayJTOKEMbRu8Q 2eDw== X-Gm-Message-State: AOJu0YxDnhiKFPTc9XMgtGgN9aYHixKGYrea/HAR3PD2Es1lL9N/J5Kw nLyfGChE7VnPgVL6aaheMUAvCM1xHtR1uzHi5pk6wVS1O0plEgtTrhYSjw== X-Gm-Gg: ASbGncsMaSi+NDsCb9ELowb5me9aoOXPxosAaaVzli1bqG/R0P0+/Gugb9OtkMglH8d VMwUWMM/GDrOVTx4EY7bFSnf8BQMCUT5FMvN1/CLFcj+qQpTbcXfIXzXy7PeSBIPua1WEw1RVnD Ru6dLqAU8gfzCdu1j7mjLlsDjBGn1U4S2MxU4WaTeoApeDOvk20dohWA9NkQ+Buf6hOchoh1tFz dPDkUuow8Nfc5LKHjc/NQNN//dwu4h81GJGyw+S3+amO27PUZM4L4roqA== X-Google-Smtp-Source: AGHT+IG2/qGp3w+UKPNMNdrqqemDaebL3HgBFfmRJGTfni0X8bFry/F9A7HYPOi0RmMZ/Q8UBUdU0w== X-Received: by 2002:a05:6000:1846:b0:385:ef97:78c with SMTP id ffacd0b85a97d-38a221f3179mr2617517f8f.6.1734711679600; Fri, 20 Dec 2024 08:21:19 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38a1c6ad3e3sm4582018f8f.0.2024.12.20.08.21.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:19 -0800 (PST) Message-Id: <62f7aae477bc542dce0bb6f8b59438442a77d154.1734711676.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:09 +0000 Subject: [PATCH v4 1/7] path-walk: introduce an object walk by path Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee In anticipation of a few planned applications, introduce the most basic form of a path-walk API. It currently assumes that there are no UNINTERESTING objects, and does not include any complicated filters. It calls a function pointer on groups of tree and blob objects as grouped by path. This only includes objects the first time they are discovered, so an object that appears at multiple paths will not be included in two batches. These batches are collected in 'struct type_and_oid_list' objects, which store an object type and an oid_array of objects. The data structures are documented in 'struct path_walk_context', but in summary the most important are: * 'paths_to_lists' is a strmap that connects a path to a type_and_oid_list for that path. To avoid conflicts in path names, we make sure that tree paths end in "/" (except the root path with is an empty string) and blob paths do not end in "/". * 'path_stack' is a string list that is added to in an append-only way. This stores the stack of our depth-first search on the heap instead of using recursion. * 'path_stack_pushed' is a strmap that stores path names that were already added to 'path_stack', to avoid repeating paths in the stack. Mostly, this saves us from quadratic lookups from doing unsorted checks into the string_list. The coupling of 'path_stack' and 'path_stack_pushed' is protected by the push_to_stack() method. Call this instead of inserting into these structures directly. The walk_objects_by_path() method initializes these structures and starts walking commits from the given rev_info struct. The commits are used to find the list of root trees which populate the start of our depth-first search. The core of our depth-first search is in a while loop that continues while we have not indicated an early exit and our 'path_stack' still has entries in it. The loop body pops a path off of the stack and "visits" the path via the walk_path() method. The walk_path() method gets the list of OIDs from the 'path_to_lists' strmap and executes the callback method on that list with the given path and type. If the OIDs correspond to tree objects, then iterate over all trees in the list and run add_children() to add the child objects to their own lists, adding new entries to the stack if necessary. In testing, this depth-first search approach was the one that used the least memory while iterating over the object lists. There is still a chance that repositories with too-wide path patterns could cause memory pressure issues. Limiting the stack size could be done in the future by limiting how many objects are being considered in-progress, or by visiting blob paths earlier than trees. There are many future adaptations that could be made, but they are left for future updates when consumers are ready to take advantage of those features. Signed-off-by: Derrick Stolee --- Documentation/technical/api-path-walk.txt | 45 ++++ Makefile | 1 + path-walk.c | 279 ++++++++++++++++++++++ path-walk.h | 47 ++++ 4 files changed, 372 insertions(+) create mode 100644 Documentation/technical/api-path-walk.txt create mode 100644 path-walk.c create mode 100644 path-walk.h diff --git a/Documentation/technical/api-path-walk.txt b/Documentation/technical/api-path-walk.txt new file mode 100644 index 00000000000..c550c77ca30 --- /dev/null +++ b/Documentation/technical/api-path-walk.txt @@ -0,0 +1,45 @@ +Path-Walk API +============= + +The path-walk API is used to walk reachable objects, but to visit objects +in batches based on a common path they appear in, or by type. + +For example, all reachable commits are visited in a group. All tags are +visited in a group. Then, all root trees are visited. At some point, all +blobs reachable via a path `my/dir/to/A` are visited. When there are +multiple paths possible to reach the same object, then only one of those +paths is used to visit the object. + +Basics +------ + +To use the path-walk API, include `path-walk.h` and call +`walk_objects_by_path()` with a customized `path_walk_info` struct. The +struct is used to set all of the options for how the walk should proceed. +Let's dig into the different options and their use. + +`path_fn` and `path_fn_data`:: + The most important option is the `path_fn` option, which is a + function pointer to the callback that can execute logic on the + object IDs for objects grouped by type and path. This function + also receives a `data` value that corresponds to the + `path_fn_data` member, for providing custom data structures to + this callback function. + +`revs`:: + To configure the exact details of the reachable set of objects, + use the `revs` member and initialize it using the revision + machinery in `revision.h`. Initialize `revs` using calls such as + `setup_revisions()` or `parse_revision_opt()`. Do not call + `prepare_revision_walk()`, as that will be called within + `walk_objects_by_path()`. ++ +It is also important that you do not specify the `--objects` flag for the +`revs` struct. The revision walk should only be used to walk commits, and +the objects will be walked in a separate way based on those starting +commits. + +Examples +-------- + +See example usages in future changes. diff --git a/Makefile b/Makefile index 7344a7f7257..d0d8d6888e3 100644 --- a/Makefile +++ b/Makefile @@ -1094,6 +1094,7 @@ LIB_OBJS += parse-options.o LIB_OBJS += patch-delta.o LIB_OBJS += patch-ids.o LIB_OBJS += path.o +LIB_OBJS += path-walk.o LIB_OBJS += pathspec.o LIB_OBJS += pkt-line.o LIB_OBJS += preload-index.o diff --git a/path-walk.c b/path-walk.c new file mode 100644 index 00000000000..021840ab41b --- /dev/null +++ b/path-walk.c @@ -0,0 +1,279 @@ +/* + * path-walk.c: implementation for path-based walks of the object graph. + */ +#include "git-compat-util.h" +#include "path-walk.h" +#include "blob.h" +#include "commit.h" +#include "dir.h" +#include "hashmap.h" +#include "hex.h" +#include "object.h" +#include "oid-array.h" +#include "revision.h" +#include "string-list.h" +#include "strmap.h" +#include "trace2.h" +#include "tree.h" +#include "tree-walk.h" + +struct type_and_oid_list { + enum object_type type; + struct oid_array oids; +}; + +#define TYPE_AND_OID_LIST_INIT { \ + .type = OBJ_NONE, \ + .oids = OID_ARRAY_INIT \ +} + +struct path_walk_context { + /** + * Repeats of data in 'struct path_walk_info' for + * access with fewer characters. + */ + struct repository *repo; + struct rev_info *revs; + struct path_walk_info *info; + + /** + * Map a path to a 'struct type_and_oid_list' + * containing the objects discovered at that + * path. + */ + struct strmap paths_to_lists; + + /** + * Store the current list of paths in a stack, to + * facilitate depth-first-search without recursion. + * + * Use path_stack_pushed to indicate whether a path + * was previously added to path_stack. + */ + struct string_list path_stack; + struct strset path_stack_pushed; +}; + +static void push_to_stack(struct path_walk_context *ctx, + const char *path) +{ + if (strset_contains(&ctx->path_stack_pushed, path)) + return; + + strset_add(&ctx->path_stack_pushed, path); + string_list_append(&ctx->path_stack, path); +} + +static int add_tree_entries(struct path_walk_context *ctx, + const char *base_path, + struct object_id *oid) +{ + struct tree_desc desc; + struct name_entry entry; + struct strbuf path = STRBUF_INIT; + size_t base_len; + struct tree *tree = lookup_tree(ctx->repo, oid); + + if (!tree) { + error(_("failed to walk children of tree %s: not found"), + oid_to_hex(oid)); + return -1; + } else if (parse_tree_gently(tree, 1)) { + error("bad tree object %s", oid_to_hex(oid)); + return -1; + } + + strbuf_addstr(&path, base_path); + base_len = path.len; + + parse_tree(tree); + init_tree_desc(&desc, &tree->object.oid, tree->buffer, tree->size); + while (tree_entry(&desc, &entry)) { + struct type_and_oid_list *list; + struct object *o; + /* Not actually true, but we will ignore submodules later. */ + enum object_type type = S_ISDIR(entry.mode) ? OBJ_TREE : OBJ_BLOB; + + /* Skip submodules. */ + if (S_ISGITLINK(entry.mode)) + continue; + + if (type == OBJ_TREE) { + struct tree *child = lookup_tree(ctx->repo, &entry.oid); + o = child ? &child->object : NULL; + } else if (type == OBJ_BLOB) { + struct blob *child = lookup_blob(ctx->repo, &entry.oid); + o = child ? &child->object : NULL; + } else { + BUG("invalid type for tree entry: %d", type); + } + + if (!o) { + error(_("failed to find object %s"), + oid_to_hex(&o->oid)); + return -1; + } + + /* Skip this object if already seen. */ + if (o->flags & SEEN) + continue; + o->flags |= SEEN; + + strbuf_setlen(&path, base_len); + strbuf_add(&path, entry.path, entry.pathlen); + + /* + * Trees will end with "/" for concatenation and distinction + * from blobs at the same path. + */ + if (type == OBJ_TREE) + strbuf_addch(&path, '/'); + + if (!(list = strmap_get(&ctx->paths_to_lists, path.buf))) { + CALLOC_ARRAY(list, 1); + list->type = type; + strmap_put(&ctx->paths_to_lists, path.buf, list); + } + push_to_stack(ctx, path.buf); + oid_array_append(&list->oids, &entry.oid); + } + + free_tree_buffer(tree); + strbuf_release(&path); + return 0; +} + +/* + * For each path in paths_to_explore, walk the trees another level + * and add any found blobs to the batch (but only if they exist and + * haven't been added yet). + */ +static int walk_path(struct path_walk_context *ctx, + const char *path) +{ + struct type_and_oid_list *list; + int ret = 0; + + list = strmap_get(&ctx->paths_to_lists, path); + + if (!list->oids.nr) + return 0; + + /* Evaluate function pointer on this data. */ + ret = ctx->info->path_fn(path, &list->oids, list->type, + ctx->info->path_fn_data); + + /* Expand data for children. */ + if (list->type == OBJ_TREE) { + for (size_t i = 0; i < list->oids.nr; i++) { + ret |= add_tree_entries(ctx, + path, + &list->oids.oid[i]); + } + } + + oid_array_clear(&list->oids); + strmap_remove(&ctx->paths_to_lists, path, 1); + return ret; +} + +static void clear_paths_to_lists(struct strmap *map) +{ + struct hashmap_iter iter; + struct strmap_entry *e; + + hashmap_for_each_entry(&map->map, &iter, e, ent) { + struct type_and_oid_list *list = e->value; + oid_array_clear(&list->oids); + } + strmap_clear(map, 1); + strmap_init(map); +} + +/** + * Given the configuration of 'info', walk the commits based on 'info->revs' and + * call 'info->path_fn' on each discovered path. + * + * Returns nonzero on an error. + */ +int walk_objects_by_path(struct path_walk_info *info) +{ + const char *root_path = ""; + int ret = 0; + size_t commits_nr = 0, paths_nr = 0; + struct commit *c; + struct type_and_oid_list *root_tree_list; + struct path_walk_context ctx = { + .repo = info->revs->repo, + .revs = info->revs, + .info = info, + .path_stack = STRING_LIST_INIT_DUP, + .path_stack_pushed = STRSET_INIT, + .paths_to_lists = STRMAP_INIT + }; + + trace2_region_enter("path-walk", "commit-walk", info->revs->repo); + + /* Insert a single list for the root tree into the paths. */ + CALLOC_ARRAY(root_tree_list, 1); + root_tree_list->type = OBJ_TREE; + strmap_put(&ctx.paths_to_lists, root_path, root_tree_list); + push_to_stack(&ctx, root_path); + + if (prepare_revision_walk(info->revs)) + die(_("failed to setup revision walk")); + + while ((c = get_revision(info->revs))) { + struct object_id *oid = get_commit_tree_oid(c); + struct tree *t; + commits_nr++; + + oid = get_commit_tree_oid(c); + t = lookup_tree(info->revs->repo, oid); + + if (!t) { + error("could not find tree %s", oid_to_hex(oid)); + return -1; + } + + if (t->object.flags & SEEN) + continue; + t->object.flags |= SEEN; + oid_array_append(&root_tree_list->oids, oid); + } + + trace2_data_intmax("path-walk", ctx.repo, "commits", commits_nr); + trace2_region_leave("path-walk", "commit-walk", info->revs->repo); + + trace2_region_enter("path-walk", "path-walk", info->revs->repo); + while (!ret && ctx.path_stack.nr) { + char *path = ctx.path_stack.items[ctx.path_stack.nr - 1].string; + ctx.path_stack.nr--; + paths_nr++; + + ret = walk_path(&ctx, path); + + free(path); + } + trace2_data_intmax("path-walk", ctx.repo, "paths", paths_nr); + trace2_region_leave("path-walk", "path-walk", info->revs->repo); + + clear_paths_to_lists(&ctx.paths_to_lists); + strset_clear(&ctx.path_stack_pushed); + string_list_clear(&ctx.path_stack, 0); + return ret; +} + +void path_walk_info_init(struct path_walk_info *info) +{ + struct path_walk_info empty = PATH_WALK_INFO_INIT; + memcpy(info, &empty, sizeof(empty)); +} + +void path_walk_info_clear(struct path_walk_info *info UNUSED) +{ + /* + * This destructor is empty for now, as info->revs + * is not owned by 'struct path_walk_info'. + */ +} diff --git a/path-walk.h b/path-walk.h new file mode 100644 index 00000000000..7cb3538cd8f --- /dev/null +++ b/path-walk.h @@ -0,0 +1,47 @@ +/* + * path-walk.h : Methods and structures for walking the object graph in batches + * by the paths that can reach those objects. + */ +#include "object.h" /* Required for 'enum object_type'. */ + +struct rev_info; +struct oid_array; + +/** + * The type of a function pointer for the method that is called on a list of + * objects reachable at a given path. + */ +typedef int (*path_fn)(const char *path, + struct oid_array *oids, + enum object_type type, + void *data); + +struct path_walk_info { + /** + * revs provides the definitions for the commit walk, including + * which commits are UNINTERESTING or not. This structure is + * expected to be owned by the caller. + */ + struct rev_info *revs; + + /** + * The caller wishes to execute custom logic on objects reachable at a + * given path. Every reachable object will be visited exactly once, and + * the first path to see an object wins. This may not be a stable choice. + */ + path_fn path_fn; + void *path_fn_data; +}; + +#define PATH_WALK_INFO_INIT { 0 } + +void path_walk_info_init(struct path_walk_info *info); +void path_walk_info_clear(struct path_walk_info *info); + +/** + * Given the configuration of 'info', walk the commits based on 'info->revs' and + * call 'info->path_fn' on each discovered path. + * + * Returns nonzero on an error. + */ +int walk_objects_by_path(struct path_walk_info *info); From patchwork Fri Dec 20 16:21:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916980 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C9DC8218EA2 for ; Fri, 20 Dec 2024 16:21:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711684; cv=none; b=L+uRKi7NYI6f5fiCqN5QCpflZ2uHB9Gza8ISbDeT8Gi9vUKCYfdM/8bpAzHWN3qYLSk321WSkuUPBDQL1WXMvpEf215CZdhNEqd1R7wTy+Yvwzm8rXv39U6DU8eh7uQf3rnij5TJIm+AvZBrTdxfz+dsxC94EI/mUPU+ta/RuP8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711684; c=relaxed/simple; bh=PMdw0FTFVG2iUoZMyeWr7FZupnaMUztTqBtsfXjAVNQ=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=XlOb/t/sscWXc36zFjQJwT3LXyWvj0TaCAIDz26Hn1mG/mGgOr0pj93eIMfspsQZX+I7vfM3kNJLDGNIpoKVBh3/Es5hDh1Qf+cljuUO1VM58KobE9k0f7uDY2iQideb3v/N976EDsOmj5gzR4z0s1Vnga2Tr3OrIN1VJ68V65I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Ts4Lkab5; arc=none smtp.client-ip=209.85.128.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Ts4Lkab5" Received: by mail-wm1-f50.google.com with SMTP id 5b1f17b1804b1-436202dd7f6so22880835e9.0 for ; Fri, 20 Dec 2024 08:21:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711681; x=1735316481; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=/dK2I2TXW5iAu4adBn9RAI0BX8gzXE/7TFrMGsUREmg=; b=Ts4Lkab52VwYVy+dJDI7LG38Rkbvx2HcjAlU42ysayGBiVnGH5MjnHdvWy8oAev7qK Js6LBLDubQSy0T/S5aEi7vZMDRTapxvxj2qBUVL2dE8GwxKIvML9ib1ZI9S8j5E6iOgJ eFcJJiJ8Fg4jRXsnEEDEAcJhtrZQ5vpk69wYmuCZyd35iTnibZsWJTgThOccmqAuiYn0 FQS9IFw03FVsO1ap9VuzyuOOeFWBZ6tcsMUkLoMvewjTzSKMVqx0XdhV5Okh+fG3VG5R MXY+8a787JjnAbnkO/RWhOF6Wt7lPKOv/RIosXEPLYrR14ReDKPV9zZxM0Z5x0lx0pJ4 qdnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711681; x=1735316481; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/dK2I2TXW5iAu4adBn9RAI0BX8gzXE/7TFrMGsUREmg=; b=vWgs8DQPg4vxa1t48RDg3zLl7K+TzUylS+LlzjhyxmFZHfErs3X3SKhaYIvJyI7iNI RNmxHWib+JR88yaOZ4LzGos7N53VzEa/etPop4/gQWdL65HGmg5HG0uzX7OAt4H9Vf7q S1kapui311tUQlJIz3pkrCjnU0NudK5UTPeVncl2iL6E/DPbW8NUXm6LOR4Rn8e8wrMp 9pZ9VNukmQjh9gCJZDIatTRdj5TZD2YNfdnlSQ9hvg+cv65IKvpawSz852p8To74Utro g106j9jUV6K/2jIMnGwiKdsUnabaU5ZbzJpUnpkHGYEeGMmq8np0AtZpWes3k/UMF8gf joXQ== X-Gm-Message-State: AOJu0Yw72AiflZ2f/e5uMA6z2aTKzQ0f++fw0FHE267bHkZRB82Uxlp4 2gTT7O8qBt4/B1s4Cw5cxIMtQ75pMT1AY7fKSpyNz2VKUM6Of7b+JVZbmw== X-Gm-Gg: ASbGnctRsNxUnQ4MMzmImPfYyJM69Lp/21sCHgALLCn9C5As2H197cShNQDo+neI4Pr PvSFw3ZmUkDhJ4JJGImz6XwMcSQ5AcVwIHHjGOVRrkJb/fVbJs9SwkGwd2P1SkRZwbSEkiizWGE NtSoM04kMfsMIHL0zQjWhXsdnBZY/MJFutggMt4ASeuELvz5jNdb6dBpxD2UcNMeqgXeBF/m3ZM Uuxj2FskjDhVlz5IO0klfqEfPtxjA0T68yje6GSdzosi6DvZSE90oumpw== X-Google-Smtp-Source: AGHT+IEaJz97MINcCZbigBB0M98faj6sd7RgiUgkBbqdO7MJtLwaEMobsbDj4jQouTjI2yY+i9oz4g== X-Received: by 2002:a5d:59ab:0:b0:385:fd31:ca31 with SMTP id ffacd0b85a97d-38a223fd39fmr4249467f8f.53.1734711680480; Fri, 20 Dec 2024 08:21:20 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38a1c8a636asm4505583f8f.88.2024.12.20.08.21.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:19 -0800 (PST) Message-Id: <8ad2a5e79a25975394e5863f7a35293f1e2e5a44.1734711676.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:10 +0000 Subject: [PATCH v4 2/7] test-lib-functions: add test_cmp_sorted Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee This test helper will be helpful to reduce repeated logic in t6601-path-walk.sh, but may be helpful elsewhere, too. Signed-off-by: Derrick Stolee --- t/test-lib-functions.sh | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index fde9bf54fc3..16b70aebd60 100644 --- a/t/test-lib-functions.sh +++ b/t/test-lib-functions.sh @@ -1267,6 +1267,16 @@ test_cmp () { eval "$GIT_TEST_CMP" '"$@"' } +# test_cmp_sorted runs test_cmp on sorted versions of the two +# input files. Uses "$1.sorted" and "$2.sorted" as temp files. + +test_cmp_sorted () { + sort <"$1" >"$1.sorted" && + sort <"$2" >"$2.sorted" && + test_cmp "$1.sorted" "$2.sorted" && + rm "$1.sorted" "$2.sorted" +} + # Check that the given config key has the expected value. # # test_cmp_config [-C ] From patchwork Fri Dec 20 16:21:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916981 Received: from mail-wm1-f43.google.com (mail-wm1-f43.google.com [209.85.128.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29636219A78 for ; Fri, 20 Dec 2024 16:21:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711686; cv=none; b=tcIfTJSYeJmxjWk3cSSVfrRO+e3g2KkGCts/KA77yVRmIGIsdGviCivtWIYb9ls0GQiDd/2x9HgUKvFfC7CbtaXvggWm5aSd8rfGjpJ6ZlSaG1obcp8avKZrtT+sqDbAjFmbDzbF0WtLE5u6fFOyEna2Svbv07cTLQYHW3ZfnaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711686; c=relaxed/simple; bh=AxEmc2eTK5I4gxJZ8UnqpVwkogjpTcccKUe4bZ2aHyA=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=Fc+NnHU5FrlggVQpx6vloaeLR9/LiurKpOiMOZ32mVC1dhlwiMKat/6sLikCHrzd6wG4PMSeCVptZytfIgSXNrQK8qfJYYUMvev+Eq3RYlQOq1BiuNh4EBi1ThHYA4vuEAp/Oq91mxYNskn25deQW4TMFWYiA2Yk+YhCGcvyTjk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=YveHOE27; arc=none smtp.client-ip=209.85.128.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="YveHOE27" Received: by mail-wm1-f43.google.com with SMTP id 5b1f17b1804b1-4361b6f9faeso13618785e9.1 for ; Fri, 20 Dec 2024 08:21:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711682; x=1735316482; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=E/mlE7gohAENwH25tsyvZWFf+T9m6cwxzh9jcqpU9AI=; b=YveHOE27gvMCRBU/cdWpjCw0EsiPKMBGf4FH6meuW3C6UYkXjVoQdq5Qwlgjwofiu1 5t+UEz0+Jrq/Zyirm04fAlILYGVheh2VymWkDbU6t32iEqZVSNbjJXyDLWmmj00Kp1Mm H/cWM+21AlOhMpGWDwIAsg15b31nEVqWdfifW49XXyZN6eaMsHJhC/U11Lg/Zb0uXJdt x5oMc1A+rSVWY7IvT/SF0TbReBzPdsvXuAGZ3ENtQZQItGxFd7jM89EdsLNUnyKPaToU aLxO2upFmGgN1LVAjYMXwQJZDSjIeUVIaLewDkge6qCzYt1emC06XMOddoyJDBVJ5NLw hj7Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711682; x=1735316482; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=E/mlE7gohAENwH25tsyvZWFf+T9m6cwxzh9jcqpU9AI=; b=luy1Gz/QeIBoruhJYv2mTy3OG5YAKn5UkEKpK5xRVtCZNg/0IVmpdn2sofKdg6rZdD SjwJwx58rR0zX+X385bw3quyREIo2lDhl8kMAGm4UUFEy6UZM1xR7lLpWW1UwJJkBo5X Xq9fjrLJehJFtnU7URnnvKTg1b/W/l8eB8DamyBh++6LwbC7NGE1z3gTsFLHH+GEHQCm uO6yo9JFO3mvvyNsS9qlLsTSQNdWcGHF23wOJaHZbiENlETKaAahY1F+WludD/0dTtnK baEymvHjzLvx0yPMFY5rLcK5OkLrwyq+gF6vbO5qwSIF6fOFsWWtnDtNzbzUNn3fIQzU ok5g== X-Gm-Message-State: AOJu0YzylwTksGMJF1Nfzl0w5h0+5vGZFWiqNIDg/FfIBSIIcHezE+I6 NtQe9MQSsU1bM7QFAlzEjDZV/DILPiUgzyTRL1zjg+Z5NgMw4JizjiRGEQ== X-Gm-Gg: ASbGncvzYHPzC0UEh7q/wKXpIrieRNbuuOvRq20c1/DI0DjgSCJfFN0ZY6vXY9GgrmK 7sP/1roFHvrn6iaeR044zqELHh8KQXO28Spiq9/z0Xz/N4uZsFwb4I/zQmWuH/dk6U6Rc+FU8nX ydXQq2E+ljqwXG3P9E8y6D28ONfYpaafpIBY0a7qTITS3h967BamFI4EhXbg+DIzV0cKPYBiU2f +zz7kSujhuxCMIHIIhr9TLSXYV0045x1Qr+0t/OoJsOB5DFxJvlDKQ/7w== X-Google-Smtp-Source: AGHT+IHIlRvHCV8UNNeOySv2zD1HTP/T4v4hkZTXntw69SfpQT0j0h7Es3LnFpqQ/dT+NjxsRwQhuw== X-Received: by 2002:a05:600c:4302:b0:435:edb0:5d27 with SMTP id 5b1f17b1804b1-4365c775131mr68762605e9.9.1734711681864; Fri, 20 Dec 2024 08:21:21 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-43656b417afsm84300605e9.36.2024.12.20.08.21.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:21 -0800 (PST) Message-Id: <2bc0538bce9ae2f502cd88aedf3b82c66e87c467.1734711676.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:11 +0000 Subject: [PATCH v4 3/7] t6601: add helper for testing path-walk API Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee Add some tests based on the current behavior, doing interesting checks for different sets of branches, ranges, and the --boundary option. This sets a baseline for the behavior and we can extend it as new options are introduced. Store and output a 'batch_nr' value so we can demonstrate that the paths are grouped together in a batch and not following some other ordering. This allows us to test the depth-first behavior of the path-walk API. However, we purposefully do not test the order of the objects in the batch, so the output is compared to the expected output through a sort. It is important to mention that the behavior of the API will change soon as we start to handle UNINTERESTING objects differently, but these tests will demonstrate the change in behavior. Signed-off-by: Derrick Stolee --- Documentation/technical/api-path-walk.txt | 3 +- Makefile | 1 + t/helper/test-path-walk.c | 84 +++++++++++++++ t/helper/test-tool.c | 1 + t/helper/test-tool.h | 1 + t/t6601-path-walk.sh | 120 ++++++++++++++++++++++ 6 files changed, 209 insertions(+), 1 deletion(-) create mode 100644 t/helper/test-path-walk.c create mode 100755 t/t6601-path-walk.sh diff --git a/Documentation/technical/api-path-walk.txt b/Documentation/technical/api-path-walk.txt index c550c77ca30..662162ec70b 100644 --- a/Documentation/technical/api-path-walk.txt +++ b/Documentation/technical/api-path-walk.txt @@ -42,4 +42,5 @@ commits. Examples -------- -See example usages in future changes. +See example usages in: + `t/helper/test-path-walk.c` diff --git a/Makefile b/Makefile index d0d8d6888e3..50413d96492 100644 --- a/Makefile +++ b/Makefile @@ -818,6 +818,7 @@ TEST_BUILTINS_OBJS += test-parse-options.o TEST_BUILTINS_OBJS += test-parse-pathspec-file.o TEST_BUILTINS_OBJS += test-partial-clone.o TEST_BUILTINS_OBJS += test-path-utils.o +TEST_BUILTINS_OBJS += test-path-walk.o TEST_BUILTINS_OBJS += test-pcre2-config.o TEST_BUILTINS_OBJS += test-pkt-line.o TEST_BUILTINS_OBJS += test-proc-receive.o diff --git a/t/helper/test-path-walk.c b/t/helper/test-path-walk.c new file mode 100644 index 00000000000..def7c81ac4f --- /dev/null +++ b/t/helper/test-path-walk.c @@ -0,0 +1,84 @@ +#define USE_THE_REPOSITORY_VARIABLE + +#include "test-tool.h" +#include "environment.h" +#include "hex.h" +#include "object-name.h" +#include "object.h" +#include "pretty.h" +#include "revision.h" +#include "setup.h" +#include "parse-options.h" +#include "path-walk.h" +#include "oid-array.h" + +static const char * const path_walk_usage[] = { + N_("test-tool path-walk -- "), + NULL +}; + +struct path_walk_test_data { + uintmax_t batch_nr; + uintmax_t tree_nr; + uintmax_t blob_nr; +}; + +static int emit_block(const char *path, struct oid_array *oids, + enum object_type type, void *data) +{ + struct path_walk_test_data *tdata = data; + const char *typestr; + + if (type == OBJ_TREE) + tdata->tree_nr += oids->nr; + else if (type == OBJ_BLOB) + tdata->blob_nr += oids->nr; + else + BUG("we do not understand this type"); + + typestr = type_name(type); + + for (size_t i = 0; i < oids->nr; i++) + printf("%"PRIuMAX":%s:%s:%s\n", + tdata->batch_nr, typestr, path, + oid_to_hex(&oids->oid[i])); + + tdata->batch_nr++; + return 0; +} + +int cmd__path_walk(int argc, const char **argv) +{ + int res; + struct rev_info revs = REV_INFO_INIT; + struct path_walk_info info = PATH_WALK_INFO_INIT; + struct path_walk_test_data data = { 0 }; + struct option options[] = { + OPT_END(), + }; + + setup_git_directory(); + revs.repo = the_repository; + + argc = parse_options(argc, argv, NULL, + options, path_walk_usage, + PARSE_OPT_KEEP_UNKNOWN_OPT | PARSE_OPT_KEEP_ARGV0); + + if (argc > 1) + setup_revisions(argc, argv, &revs, NULL); + else + usage(path_walk_usage[0]); + + info.revs = &revs; + info.path_fn = emit_block; + info.path_fn_data = &data; + + res = walk_objects_by_path(&info); + + printf("trees:%" PRIuMAX "\n" + "blobs:%" PRIuMAX "\n", + data.tree_nr, data.blob_nr); + + release_revisions(&revs); + return res; +} diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c index 1ebb69a5dc4..43676e7b93a 100644 --- a/t/helper/test-tool.c +++ b/t/helper/test-tool.c @@ -52,6 +52,7 @@ static struct test_cmd cmds[] = { { "parse-subcommand", cmd__parse_subcommand }, { "partial-clone", cmd__partial_clone }, { "path-utils", cmd__path_utils }, + { "path-walk", cmd__path_walk }, { "pcre2-config", cmd__pcre2_config }, { "pkt-line", cmd__pkt_line }, { "proc-receive", cmd__proc_receive }, diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h index 21802ac27da..9cfc5da6e57 100644 --- a/t/helper/test-tool.h +++ b/t/helper/test-tool.h @@ -45,6 +45,7 @@ int cmd__parse_pathspec_file(int argc, const char** argv); int cmd__parse_subcommand(int argc, const char **argv); int cmd__partial_clone(int argc, const char **argv); int cmd__path_utils(int argc, const char **argv); +int cmd__path_walk(int argc, const char **argv); int cmd__pcre2_config(int argc, const char **argv); int cmd__pkt_line(int argc, const char **argv); int cmd__proc_receive(int argc, const char **argv); diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh new file mode 100755 index 00000000000..4e052c09309 --- /dev/null +++ b/t/t6601-path-walk.sh @@ -0,0 +1,120 @@ +#!/bin/sh + +TEST_PASSES_SANITIZE_LEAK=true + +test_description='direct path-walk API tests' + +. ./test-lib.sh + +test_expect_success 'setup test repository' ' + git checkout -b base && + + mkdir left && + mkdir right && + echo a >a && + echo b >left/b && + echo c >right/c && + git add . && + git commit -m "first" && + + echo d >right/d && + git add right && + git commit -m "second" && + + echo bb >left/b && + git commit -a -m "third" && + + git checkout -b topic HEAD~1 && + echo cc >right/c && + git commit -a -m "topic" +' + +test_expect_success 'all' ' + test-tool path-walk -- --all >out && + + cat >expect <<-EOF && + 0:tree::$(git rev-parse topic^{tree}) + 0:tree::$(git rev-parse base^{tree}) + 0:tree::$(git rev-parse base~1^{tree}) + 0:tree::$(git rev-parse base~2^{tree}) + 1:tree:right/:$(git rev-parse topic:right) + 1:tree:right/:$(git rev-parse base~1:right) + 1:tree:right/:$(git rev-parse base~2:right) + 2:blob:right/d:$(git rev-parse base~1:right/d) + 3:blob:right/c:$(git rev-parse base~2:right/c) + 3:blob:right/c:$(git rev-parse topic:right/c) + 4:tree:left/:$(git rev-parse base:left) + 4:tree:left/:$(git rev-parse base~2:left) + 5:blob:left/b:$(git rev-parse base~2:left/b) + 5:blob:left/b:$(git rev-parse base:left/b) + 6:blob:a:$(git rev-parse base~2:a) + blobs:6 + trees:9 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'topic only' ' + test-tool path-walk -- topic >out && + + cat >expect <<-EOF && + 0:tree::$(git rev-parse topic^{tree}) + 0:tree::$(git rev-parse base~1^{tree}) + 0:tree::$(git rev-parse base~2^{tree}) + 1:tree:right/:$(git rev-parse topic:right) + 1:tree:right/:$(git rev-parse base~1:right) + 1:tree:right/:$(git rev-parse base~2:right) + 2:blob:right/d:$(git rev-parse base~1:right/d) + 3:blob:right/c:$(git rev-parse base~2:right/c) + 3:blob:right/c:$(git rev-parse topic:right/c) + 4:tree:left/:$(git rev-parse base~2:left) + 5:blob:left/b:$(git rev-parse base~2:left/b) + 6:blob:a:$(git rev-parse base~2:a) + blobs:5 + trees:7 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'topic, not base' ' + test-tool path-walk -- topic --not base >out && + + cat >expect <<-EOF && + 0:tree::$(git rev-parse topic^{tree}) + 1:tree:right/:$(git rev-parse topic:right) + 2:blob:right/d:$(git rev-parse topic:right/d) + 3:blob:right/c:$(git rev-parse topic:right/c) + 4:tree:left/:$(git rev-parse topic:left) + 5:blob:left/b:$(git rev-parse topic:left/b) + 6:blob:a:$(git rev-parse topic:a) + blobs:4 + trees:3 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'topic, not base, boundary' ' + test-tool path-walk -- --boundary topic --not base >out && + + cat >expect <<-EOF && + 0:tree::$(git rev-parse topic^{tree}) + 0:tree::$(git rev-parse base~1^{tree}) + 1:tree:right/:$(git rev-parse topic:right) + 1:tree:right/:$(git rev-parse base~1:right) + 2:blob:right/d:$(git rev-parse base~1:right/d) + 3:blob:right/c:$(git rev-parse base~1:right/c) + 3:blob:right/c:$(git rev-parse topic:right/c) + 4:tree:left/:$(git rev-parse base~1:left) + 5:blob:left/b:$(git rev-parse base~1:left/b) + 6:blob:a:$(git rev-parse base~1:a) + blobs:5 + trees:5 + EOF + + test_cmp_sorted expect out +' + +test_done From patchwork Fri Dec 20 16:21:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916982 Received: from mail-wr1-f50.google.com (mail-wr1-f50.google.com [209.85.221.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B247C21A437 for ; Fri, 20 Dec 2024 16:21:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711687; cv=none; b=aC2pH3yjzpiWmrH6P7MPDHunSQlkG92xlamMvUjGHmX76bhm0jtxqPNO1APCtoPErfYyKSq+lB1TrJ6xcKuHVn/YQUYroTId4qEQdF88akOhX6ULMtYxNdnSey4/OR86Hlx/nGRHMKO9QXYHSOWrllmHf4GDq36VqSVuAYnZxbs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711687; c=relaxed/simple; bh=TX2eg2bC6r2fyjjLFZ1R/cCQCY7L57g7zo4gRUrs84w=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=R5k5Sd3+ogE6mCi1c7bcbeGfOotjf2ScAx22QcODRPsSnY2gBSgaCzJMXMI3ujJeHZY+Fm0PQr8p+S1xHkZrz+7W6J31Yfa9MN8iBaV7o2qFwtnF88eHPWkX8lPBWvRvFeZ6kElwWpnrYKxIqYYfQO/uWCfzQBT4wqU98cRUkZA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=H79uTpS1; arc=none smtp.client-ip=209.85.221.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="H79uTpS1" Received: by mail-wr1-f50.google.com with SMTP id ffacd0b85a97d-3862d161947so1093062f8f.3 for ; Fri, 20 Dec 2024 08:21:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711684; x=1735316484; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=te9A5Z4cilQv6ticNcx81z2SLbSgkLxZ+XAWRjW0RDU=; b=H79uTpS1XYbhLznx2i5abIrRronk4RiN3mu3JKzZwBnyMOpYEmw9S9CUU6iwNrNan7 Jp8HCuxzgz+QIlDf7FOjGI5ngvlWL6RcAn/6kAikxL8IQE6TVzd9jboLuWkJJq5RT1k3 kjlmvSnqpFdw0OeIahhQk6yan2tb1TZv+OipDgwbELIYEHcFL+DpYzWoy/ZrQSzXVudJ 2jPVekNYQhP1Uqkjevhsg+HoI2T2cW3aiN94HMaYr5K6ntrp1LN3cswXsz/9hTGYjgvs zurD1+oWVfCQwlyGLPt2USbRRW2DBWs4N+3OUslSKjxavM1g3WI7XnY6JHBbmDupukqT pk+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711684; x=1735316484; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=te9A5Z4cilQv6ticNcx81z2SLbSgkLxZ+XAWRjW0RDU=; b=d7Bjnwcscyh30GXJD4Z1LTSM6/ETZXhcB36QZEpHowCnLBLsp0jBM50+3I8i96Rq+9 NtLO/R2kI5rv9F0kkwIP+zItWgx3wQX/f0SMbFPohbfdKfAoR30NlMmajOZWeFMWqa3N Cc+dJOyMaYAC5GjrjfwPsDhJsBmT6TvVf3WZCHlugczd/fiy+C62mj28SLecwtyWr2E8 hI7Gb2q2flMcTm7NFI6x5w1ALJMKU6wJOuvyzVrEUrRe2UtraZu75bX+VaIKP40pn5ar oNqOtM8n/LFum2SR1ze8LnV/c+7KOsdLz2ZB8uBMJV6QbkQcrmOb7Y5mjRoKbOm8tp7s oU9Q== X-Gm-Message-State: AOJu0YzSChrW2tUOY60SDl8dBEhGu0iPfI4toyqd3UO0jhAZAIY37yPM YLEBGhB5Ng+JgOfoxyWFMgRZXdtA347vyXS9/rlj+6O/rXEWDrshJ7W21w== X-Gm-Gg: ASbGncssj6EjIycTToS3VF38DZC9f31BWrA2r7UWJzprJYIcYl3l57JAK6mEJLnuQNo Na9mnYYJEi0oRpchofhsno8tg3pmOtm07QY2mzK1xAxnNFyFbwOFTb8opAq27wRYOD9Xol5nUfl bG+SfJg0GCOaqEhx0q9ijT4p6aIxFPF44RVdxLD59uHenvBxiQmvHL781fTWNFUcG4SJ4W7nVlR Q2Ge99C76xTDoninBWqFO0ZFeLvIT37TDhyB1FVjETxJDSXzdeR5DFm6g== X-Google-Smtp-Source: AGHT+IEYvp9kZZObb1dxU7VXsMTt7wJNGe0v93/odgA/BxQEWqCKLFskPFaa2gCCTqpbO5ZDB0Z18Q== X-Received: by 2002:a05:6000:18a5:b0:382:5aae:87c7 with SMTP id ffacd0b85a97d-38a221f9d65mr3466231f8f.31.1734711683242; Fri, 20 Dec 2024 08:21:23 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-43661219611sm50250265e9.23.2024.12.20.08.21.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:22 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:12 +0000 Subject: [PATCH v4 4/7] path-walk: allow consumer to specify object types Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee We add the ability to filter the object types in the path-walk API so the callback function is called fewer times. This adds the ability to ask for the commits in a list, as well. We re-use the empty string for this set of objects because these are passed directly to the callback function instead of being part of the 'path_stack'. Future changes will add the ability to visit annotated tags. Signed-off-by: Derrick Stolee --- Documentation/technical/api-path-walk.txt | 9 ++ path-walk.c | 33 ++++- path-walk.h | 14 +- t/helper/test-path-walk.c | 15 ++- t/t6601-path-walk.sh | 149 +++++++++++++++------- 5 files changed, 170 insertions(+), 50 deletions(-) diff --git a/Documentation/technical/api-path-walk.txt b/Documentation/technical/api-path-walk.txt index 662162ec70b..dce553b6114 100644 --- a/Documentation/technical/api-path-walk.txt +++ b/Documentation/technical/api-path-walk.txt @@ -39,6 +39,15 @@ It is also important that you do not specify the `--objects` flag for the the objects will be walked in a separate way based on those starting commits. +`commits`, `blobs`, `trees`:: + By default, these members are enabled and signal that the path-walk + API should call the `path_fn` on objects of these types. Specialized + applications could disable some options to make it simpler to walk + the objects or to have fewer calls to `path_fn`. ++ +While it is possible to walk only commits in this way, consumers would be +better off using the revision walk API instead. + Examples -------- diff --git a/path-walk.c b/path-walk.c index 021840ab41b..05ca7b2442a 100644 --- a/path-walk.c +++ b/path-walk.c @@ -98,6 +98,10 @@ static int add_tree_entries(struct path_walk_context *ctx, if (S_ISGITLINK(entry.mode)) continue; + /* If the caller doesn't want blobs, then don't bother. */ + if (!ctx->info->blobs && type == OBJ_BLOB) + continue; + if (type == OBJ_TREE) { struct tree *child = lookup_tree(ctx->repo, &entry.oid); o = child ? &child->object : NULL; @@ -159,9 +163,11 @@ static int walk_path(struct path_walk_context *ctx, if (!list->oids.nr) return 0; - /* Evaluate function pointer on this data. */ - ret = ctx->info->path_fn(path, &list->oids, list->type, - ctx->info->path_fn_data); + /* Evaluate function pointer on this data, if requested. */ + if ((list->type == OBJ_TREE && ctx->info->trees) || + (list->type == OBJ_BLOB && ctx->info->blobs)) + ret = ctx->info->path_fn(path, &list->oids, list->type, + ctx->info->path_fn_data); /* Expand data for children. */ if (list->type == OBJ_TREE) { @@ -203,6 +209,7 @@ int walk_objects_by_path(struct path_walk_info *info) size_t commits_nr = 0, paths_nr = 0; struct commit *c; struct type_and_oid_list *root_tree_list; + struct type_and_oid_list *commit_list; struct path_walk_context ctx = { .repo = info->revs->repo, .revs = info->revs, @@ -214,6 +221,9 @@ int walk_objects_by_path(struct path_walk_info *info) trace2_region_enter("path-walk", "commit-walk", info->revs->repo); + CALLOC_ARRAY(commit_list, 1); + commit_list->type = OBJ_COMMIT; + /* Insert a single list for the root tree into the paths. */ CALLOC_ARRAY(root_tree_list, 1); root_tree_list->type = OBJ_TREE; @@ -224,10 +234,18 @@ int walk_objects_by_path(struct path_walk_info *info) die(_("failed to setup revision walk")); while ((c = get_revision(info->revs))) { - struct object_id *oid = get_commit_tree_oid(c); + struct object_id *oid; struct tree *t; commits_nr++; + if (info->commits) + oid_array_append(&commit_list->oids, + &c->object.oid); + + /* If we only care about commits, then skip trees. */ + if (!info->trees && !info->blobs) + continue; + oid = get_commit_tree_oid(c); t = lookup_tree(info->revs->repo, oid); @@ -245,6 +263,13 @@ int walk_objects_by_path(struct path_walk_info *info) trace2_data_intmax("path-walk", ctx.repo, "commits", commits_nr); trace2_region_leave("path-walk", "commit-walk", info->revs->repo); + /* Track all commits. */ + if (info->commits && commit_list->oids.nr) + ret = info->path_fn("", &commit_list->oids, OBJ_COMMIT, + info->path_fn_data); + oid_array_clear(&commit_list->oids); + free(commit_list); + trace2_region_enter("path-walk", "path-walk", info->revs->repo); while (!ret && ctx.path_stack.nr) { char *path = ctx.path_stack.items[ctx.path_stack.nr - 1].string; diff --git a/path-walk.h b/path-walk.h index 7cb3538cd8f..2cafc71e153 100644 --- a/path-walk.h +++ b/path-walk.h @@ -31,9 +31,21 @@ struct path_walk_info { */ path_fn path_fn; void *path_fn_data; + + /** + * Initialize which object types the path_fn should be called on. This + * could also limit the walk to skip blobs if not set. + */ + int commits; + int trees; + int blobs; }; -#define PATH_WALK_INFO_INIT { 0 } +#define PATH_WALK_INFO_INIT { \ + .blobs = 1, \ + .trees = 1, \ + .commits = 1, \ +} void path_walk_info_init(struct path_walk_info *info); void path_walk_info_clear(struct path_walk_info *info); diff --git a/t/helper/test-path-walk.c b/t/helper/test-path-walk.c index def7c81ac4f..a57a05a6391 100644 --- a/t/helper/test-path-walk.c +++ b/t/helper/test-path-walk.c @@ -19,6 +19,8 @@ static const char * const path_walk_usage[] = { struct path_walk_test_data { uintmax_t batch_nr; + + uintmax_t commit_nr; uintmax_t tree_nr; uintmax_t blob_nr; }; @@ -33,6 +35,8 @@ static int emit_block(const char *path, struct oid_array *oids, tdata->tree_nr += oids->nr; else if (type == OBJ_BLOB) tdata->blob_nr += oids->nr; + else if (type == OBJ_COMMIT) + tdata->commit_nr += oids->nr; else BUG("we do not understand this type"); @@ -54,6 +58,12 @@ int cmd__path_walk(int argc, const char **argv) struct path_walk_info info = PATH_WALK_INFO_INIT; struct path_walk_test_data data = { 0 }; struct option options[] = { + OPT_BOOL(0, "blobs", &info.blobs, + N_("toggle inclusion of blob objects")), + OPT_BOOL(0, "commits", &info.commits, + N_("toggle inclusion of commit objects")), + OPT_BOOL(0, "trees", &info.trees, + N_("toggle inclusion of tree objects")), OPT_END(), }; @@ -75,9 +85,10 @@ int cmd__path_walk(int argc, const char **argv) res = walk_objects_by_path(&info); - printf("trees:%" PRIuMAX "\n" + printf("commits:%" PRIuMAX "\n" + "trees:%" PRIuMAX "\n" "blobs:%" PRIuMAX "\n", - data.tree_nr, data.blob_nr); + data.commit_nr, data.tree_nr, data.blob_nr); release_revisions(&revs); return res; diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh index 4e052c09309..4a4939a1b02 100755 --- a/t/t6601-path-walk.sh +++ b/t/t6601-path-walk.sh @@ -33,22 +33,27 @@ test_expect_success 'all' ' test-tool path-walk -- --all >out && cat >expect <<-EOF && - 0:tree::$(git rev-parse topic^{tree}) - 0:tree::$(git rev-parse base^{tree}) - 0:tree::$(git rev-parse base~1^{tree}) - 0:tree::$(git rev-parse base~2^{tree}) - 1:tree:right/:$(git rev-parse topic:right) - 1:tree:right/:$(git rev-parse base~1:right) - 1:tree:right/:$(git rev-parse base~2:right) - 2:blob:right/d:$(git rev-parse base~1:right/d) - 3:blob:right/c:$(git rev-parse base~2:right/c) - 3:blob:right/c:$(git rev-parse topic:right/c) - 4:tree:left/:$(git rev-parse base:left) - 4:tree:left/:$(git rev-parse base~2:left) - 5:blob:left/b:$(git rev-parse base~2:left/b) - 5:blob:left/b:$(git rev-parse base:left/b) - 6:blob:a:$(git rev-parse base~2:a) + 0:commit::$(git rev-parse topic) + 0:commit::$(git rev-parse base) + 0:commit::$(git rev-parse base~1) + 0:commit::$(git rev-parse base~2) + 1:tree::$(git rev-parse topic^{tree}) + 1:tree::$(git rev-parse base^{tree}) + 1:tree::$(git rev-parse base~1^{tree}) + 1:tree::$(git rev-parse base~2^{tree}) + 2:tree:right/:$(git rev-parse topic:right) + 2:tree:right/:$(git rev-parse base~1:right) + 2:tree:right/:$(git rev-parse base~2:right) + 3:blob:right/d:$(git rev-parse base~1:right/d) + 4:blob:right/c:$(git rev-parse base~2:right/c) + 4:blob:right/c:$(git rev-parse topic:right/c) + 5:tree:left/:$(git rev-parse base:left) + 5:tree:left/:$(git rev-parse base~2:left) + 6:blob:left/b:$(git rev-parse base~2:left/b) + 6:blob:left/b:$(git rev-parse base:left/b) + 7:blob:a:$(git rev-parse base~2:a) blobs:6 + commits:4 trees:9 EOF @@ -59,19 +64,23 @@ test_expect_success 'topic only' ' test-tool path-walk -- topic >out && cat >expect <<-EOF && - 0:tree::$(git rev-parse topic^{tree}) - 0:tree::$(git rev-parse base~1^{tree}) - 0:tree::$(git rev-parse base~2^{tree}) - 1:tree:right/:$(git rev-parse topic:right) - 1:tree:right/:$(git rev-parse base~1:right) - 1:tree:right/:$(git rev-parse base~2:right) - 2:blob:right/d:$(git rev-parse base~1:right/d) - 3:blob:right/c:$(git rev-parse base~2:right/c) - 3:blob:right/c:$(git rev-parse topic:right/c) - 4:tree:left/:$(git rev-parse base~2:left) - 5:blob:left/b:$(git rev-parse base~2:left/b) - 6:blob:a:$(git rev-parse base~2:a) + 0:commit::$(git rev-parse topic) + 0:commit::$(git rev-parse base~1) + 0:commit::$(git rev-parse base~2) + 1:tree::$(git rev-parse topic^{tree}) + 1:tree::$(git rev-parse base~1^{tree}) + 1:tree::$(git rev-parse base~2^{tree}) + 2:tree:right/:$(git rev-parse topic:right) + 2:tree:right/:$(git rev-parse base~1:right) + 2:tree:right/:$(git rev-parse base~2:right) + 3:blob:right/d:$(git rev-parse base~1:right/d) + 4:blob:right/c:$(git rev-parse base~2:right/c) + 4:blob:right/c:$(git rev-parse topic:right/c) + 5:tree:left/:$(git rev-parse base~2:left) + 6:blob:left/b:$(git rev-parse base~2:left/b) + 7:blob:a:$(git rev-parse base~2:a) blobs:5 + commits:3 trees:7 EOF @@ -82,15 +91,66 @@ test_expect_success 'topic, not base' ' test-tool path-walk -- topic --not base >out && cat >expect <<-EOF && + 0:commit::$(git rev-parse topic) + 1:tree::$(git rev-parse topic^{tree}) + 2:tree:right/:$(git rev-parse topic:right) + 3:blob:right/d:$(git rev-parse topic:right/d) + 4:blob:right/c:$(git rev-parse topic:right/c) + 5:tree:left/:$(git rev-parse topic:left) + 6:blob:left/b:$(git rev-parse topic:left/b) + 7:blob:a:$(git rev-parse topic:a) + blobs:4 + commits:1 + trees:3 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'topic, not base, only blobs' ' + test-tool path-walk --no-trees --no-commits \ + -- topic --not base >out && + + cat >expect <<-EOF && + commits:0 + trees:0 + 0:blob:right/d:$(git rev-parse topic:right/d) + 1:blob:right/c:$(git rev-parse topic:right/c) + 2:blob:left/b:$(git rev-parse topic:left/b) + 3:blob:a:$(git rev-parse topic:a) + blobs:4 + EOF + + test_cmp_sorted expect out +' + +# No, this doesn't make a lot of sense for the path-walk API, +# but it is possible to do. +test_expect_success 'topic, not base, only commits' ' + test-tool path-walk --no-blobs --no-trees \ + -- topic --not base >out && + + cat >expect <<-EOF && + 0:commit::$(git rev-parse topic) + commits:1 + trees:0 + blobs:0 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'topic, not base, only trees' ' + test-tool path-walk --no-blobs --no-commits \ + -- topic --not base >out && + + cat >expect <<-EOF && + commits:0 0:tree::$(git rev-parse topic^{tree}) 1:tree:right/:$(git rev-parse topic:right) - 2:blob:right/d:$(git rev-parse topic:right/d) - 3:blob:right/c:$(git rev-parse topic:right/c) - 4:tree:left/:$(git rev-parse topic:left) - 5:blob:left/b:$(git rev-parse topic:left/b) - 6:blob:a:$(git rev-parse topic:a) - blobs:4 + 2:tree:left/:$(git rev-parse topic:left) trees:3 + blobs:0 EOF test_cmp_sorted expect out @@ -100,17 +160,20 @@ test_expect_success 'topic, not base, boundary' ' test-tool path-walk -- --boundary topic --not base >out && cat >expect <<-EOF && - 0:tree::$(git rev-parse topic^{tree}) - 0:tree::$(git rev-parse base~1^{tree}) - 1:tree:right/:$(git rev-parse topic:right) - 1:tree:right/:$(git rev-parse base~1:right) - 2:blob:right/d:$(git rev-parse base~1:right/d) - 3:blob:right/c:$(git rev-parse base~1:right/c) - 3:blob:right/c:$(git rev-parse topic:right/c) - 4:tree:left/:$(git rev-parse base~1:left) - 5:blob:left/b:$(git rev-parse base~1:left/b) - 6:blob:a:$(git rev-parse base~1:a) + 0:commit::$(git rev-parse topic) + 0:commit::$(git rev-parse base~1) + 1:tree::$(git rev-parse topic^{tree}) + 1:tree::$(git rev-parse base~1^{tree}) + 2:tree:right/:$(git rev-parse topic:right) + 2:tree:right/:$(git rev-parse base~1:right) + 3:blob:right/d:$(git rev-parse base~1:right/d) + 4:blob:right/c:$(git rev-parse base~1:right/c) + 4:blob:right/c:$(git rev-parse topic:right/c) + 5:tree:left/:$(git rev-parse base~1:left) + 6:blob:left/b:$(git rev-parse base~1:left/b) + 7:blob:a:$(git rev-parse base~1:a) blobs:5 + commits:2 trees:5 EOF From patchwork Fri Dec 20 16:21:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916983 Received: from mail-wm1-f54.google.com (mail-wm1-f54.google.com [209.85.128.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 622AF21A44B for ; Fri, 20 Dec 2024 16:21:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711689; cv=none; b=VvjlU0mtT0qlqGJKch/TSlqV1wEJU9lncPazak0SRe9ThvevsOTag/32TCHW/gEOdHcd89bZn0KA3gHQ/8xNLryEgUiZ2WKo1y/cDvn7rqEtf5nBiJouN0YbW3kHd31jEXTRwLtbGRHSzTosP4jgEiZifyZ59I9WBknh4YqV2FU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711689; c=relaxed/simple; bh=U+ZkCGmJgnQV4fF1NW3vObut6tFdYZZB9AY6pOPgb4k=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=WwWzLtnOZQ1g+BtPbpcTimwQT6JZsHgDZVspJEmzjnHNckmB9l53iw2lOCANrLKGWkX5W6Vj8PIz0fZOBsR9R4Aox/eiMEjzLwYxs4fvIMCbeJAa5jM2x3BxW94JLdeogn3BzlZDZuSHxDhwDbBomsVDQTeAmTA6i94CMRKux6E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=OqjDDeFi; arc=none smtp.client-ip=209.85.128.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OqjDDeFi" Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-436637e8c8dso16550745e9.1 for ; Fri, 20 Dec 2024 08:21:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711685; x=1735316485; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=Gfo/Ofpw1zxe3+3DXFtxPQuC5DDyLG3rX7mSZguqM/w=; b=OqjDDeFiKM+hpcmLnPY5B/Z2TdOgpa8UhJ0NPvQe+73/CgvXexEvoWs4/ksTZ9AwK+ 0Z5KhKGfTmoWVIa/Le+xr80x2ndm/RKDwZfirEsfhEeTvA2WvVOOCJDj12hYzKKRaxU+ enS4isAAqkE0dG0IAirqEfHqiBusmGVi+teVMjejP/nrAppu7V1nsNpptnbPAxl2h/jE idisGeCFngcCzmEyDaelYwXBMvtS0zbRCIeHknGOX4qQqnsoToGxhXPMYvpQQ77cFyCE Z6Yyt8QaaDNYJqQ/g1+5RN/Dst+2T3s5FMJ8/kG/qO3P9oo6OuJtyjH/fAxCQIKQT/4K D3MA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711685; x=1735316485; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Gfo/Ofpw1zxe3+3DXFtxPQuC5DDyLG3rX7mSZguqM/w=; b=kASbZTaolB4nSAr90uasbFHO6IgNOFgJYj3wZhuubDe670FzGfpt4MVmvdhd/HMCrz j+Q46CUnmiDeYr0eWT0gNnAZLRkbrd2QWrFlhVP49+SlvgRmciRUY5wNkHt767zqTu6c B4oO6QUvEJBBut+jyzVwIQgiW77JO+kYdeCAtyrZpFPmshm6r7ZWyjT6cSSPGQT6073c hd66rzxBQHD+jiYMJJpm4RSGB+E1FJhZN/hsausaAvz2Ash7IYmQMlQp79tvsB2XkerV +p7R9g4u4C2f6nRdtWZW7m6MML5Cw1K0TGh66CPsuKlVuVR8fEescD8ArvGSMOD5jd37 zXdg== X-Gm-Message-State: AOJu0YzdJ3agkjibMGcJvnQqDMZNPi08V6wJDozcJDH2lv1kj7nTn2HL 1QGwH7CIgpyeiPoBsOf5wiFh5XqaYr+wAT9htq7afwl50TQGW/UXN+kpBw== X-Gm-Gg: ASbGncun++vi9T0fG/GU5muZLs+oOuazXH7jG4Qs5k88hXDtGWOaXGWqoN73Xi00yaV 997z+mvVKnQIU/hWnteOD+Sc36PWUEUtbtMOT9mL1KekmZYKgBtq093dw9Bz0PPP4t7H9j8qdbE gPIZ0rQ1+NU7NIKZQiGofvfSmkG6xgPmIpVOphB28/GVlTztD5vy8fd9rkYS5o4m4CuYfYB1crI k7rtVImwMBgnR8C69CknLqIEUjAtL5wGo/MGIMFTikdOX2eEpLKYOyeCQ== X-Google-Smtp-Source: AGHT+IHGLeu+a5MzAHxiRnhyLiH38xmCys3D+Ri+2Ug1qy8sS4UwwnjfQQ9KR9EVJ27YqQmyi8z73w== X-Received: by 2002:a05:600c:1986:b0:434:edcf:7461 with SMTP id 5b1f17b1804b1-43668b78823mr34194635e9.30.1734711684724; Fri, 20 Dec 2024 08:21:24 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-436604e9c2csm51411595e9.43.2024.12.20.08.21.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:23 -0800 (PST) Message-Id: <6df56f465d7bdb2c2d360752f6530e29b4d3dc3c.1734711676.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:13 +0000 Subject: [PATCH v4 5/7] path-walk: visit tags and cached objects Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee The rev_info that is specified for a path-walk traversal may specify visiting tag refs (both lightweight and annotated) and also may specify indexed objects (blobs and trees). Update the path-walk API to walk these objects as well. When walking tags, we need to peel the annotated objects until reaching a non-tag object. If we reach a commit, then we can add it to the pending objects to make sure we visit in the commit walk portion. If we reach a tree, then we will assume that it is a root tree. If we reach a blob, then we have no good path name and so add it to a new list of "tagged blobs". When the rev_info includes the "--indexed-objects" flag, then the pending set includes blobs and trees found in the cache entries and cache-tree. The cache entries are usually blobs, though they could be trees in the case of a sparse index. The cache-tree stores previously-hashed tree objects but these are cleared out when staging objects below those paths. We add tests that demonstrate this. The indexed objects come with a non-NULL 'path' value in the pending item. This allows us to prepopulate the 'path_to_lists' strmap with lists for these paths. The tricky thing about this walk is that we will want to combine the indexed objects walk with the commit walk, especially in the future case of walking objects during a command like 'git repack'. Whenever possible, we want the objects from the index to be grouped with similar objects in history. We don't want to miss any paths that appear only in the index and not in the commit history. Thus, we need to be careful to let the path stack be populated initially with only the root tree path (and possibly tags and tagged blobs) and go through the normal depth-first search. Afterwards, if there are other paths that are remaining in the paths_to_lists strmap, we should then iterate through the stack and visit those objects recursively. Signed-off-by: Derrick Stolee --- Documentation/technical/api-path-walk.txt | 2 +- path-walk.c | 184 ++++++++++++++++++++- path-walk.h | 2 + t/helper/test-path-walk.c | 15 +- t/t6601-path-walk.sh | 186 +++++++++++++++++++--- 5 files changed, 362 insertions(+), 27 deletions(-) diff --git a/Documentation/technical/api-path-walk.txt b/Documentation/technical/api-path-walk.txt index dce553b6114..6022c381b7c 100644 --- a/Documentation/technical/api-path-walk.txt +++ b/Documentation/technical/api-path-walk.txt @@ -39,7 +39,7 @@ It is also important that you do not specify the `--objects` flag for the the objects will be walked in a separate way based on those starting commits. -`commits`, `blobs`, `trees`:: +`commits`, `blobs`, `trees`, `tags`:: By default, these members are enabled and signal that the path-walk API should call the `path_fn` on objects of these types. Specialized applications could disable some options to make it simpler to walk diff --git a/path-walk.c b/path-walk.c index 05ca7b2442a..f34dbf61de0 100644 --- a/path-walk.c +++ b/path-walk.c @@ -13,10 +13,13 @@ #include "revision.h" #include "string-list.h" #include "strmap.h" +#include "tag.h" #include "trace2.h" #include "tree.h" #include "tree-walk.h" +static const char *root_path = ""; + struct type_and_oid_list { enum object_type type; struct oid_array oids; @@ -160,12 +163,16 @@ static int walk_path(struct path_walk_context *ctx, list = strmap_get(&ctx->paths_to_lists, path); + if (!list) + BUG("provided path '%s' that had no associated list", path); + if (!list->oids.nr) return 0; /* Evaluate function pointer on this data, if requested. */ if ((list->type == OBJ_TREE && ctx->info->trees) || - (list->type == OBJ_BLOB && ctx->info->blobs)) + (list->type == OBJ_BLOB && ctx->info->blobs) || + (list->type == OBJ_TAG && ctx->info->tags)) ret = ctx->info->path_fn(path, &list->oids, list->type, ctx->info->path_fn_data); @@ -196,6 +203,139 @@ static void clear_paths_to_lists(struct strmap *map) strmap_init(map); } +static int setup_pending_objects(struct path_walk_info *info, + struct path_walk_context *ctx) +{ + struct type_and_oid_list *tags = NULL; + struct type_and_oid_list *tagged_blobs = NULL; + struct type_and_oid_list *root_tree_list = NULL; + + if (info->tags) + CALLOC_ARRAY(tags, 1); + if (info->blobs) + CALLOC_ARRAY(tagged_blobs, 1); + if (info->trees) + root_tree_list = strmap_get(&ctx->paths_to_lists, root_path); + + /* + * Pending objects include: + * * Commits at branch tips. + * * Annotated tags at tag tips. + * * Any kind of object at lightweight tag tips. + * * Trees and blobs in the index (with an associated path). + */ + for (size_t i = 0; i < info->revs->pending.nr; i++) { + struct object_array_entry *pending = info->revs->pending.objects + i; + struct object *obj = pending->item; + + /* Commits will be picked up by revision walk. */ + if (obj->type == OBJ_COMMIT) + continue; + + /* Navigate annotated tag object chains. */ + while (obj->type == OBJ_TAG) { + struct tag *tag = lookup_tag(info->revs->repo, &obj->oid); + if (!tag) { + error(_("failed to find tag %s"), + oid_to_hex(&obj->oid)); + return -1; + } + if (tag->object.flags & SEEN) + break; + tag->object.flags |= SEEN; + + if (tags) + oid_array_append(&tags->oids, &obj->oid); + obj = tag->tagged; + } + + if (obj->type == OBJ_TAG) + continue; + + /* We are now at a non-tag object. */ + if (obj->flags & SEEN) + continue; + obj->flags |= SEEN; + + switch (obj->type) { + case OBJ_TREE: + if (!info->trees) + continue; + if (pending->path) { + struct type_and_oid_list *list; + char *path = *pending->path ? xstrfmt("%s/", pending->path) + : xstrdup(""); + if (!(list = strmap_get(&ctx->paths_to_lists, path))) { + CALLOC_ARRAY(list, 1); + list->type = OBJ_TREE; + strmap_put(&ctx->paths_to_lists, path, list); + } + oid_array_append(&list->oids, &obj->oid); + free(path); + } else { + /* assume a root tree, such as a lightweight tag. */ + oid_array_append(&root_tree_list->oids, &obj->oid); + } + break; + + case OBJ_BLOB: + if (!info->blobs) + continue; + if (pending->path) { + struct type_and_oid_list *list; + char *path = pending->path; + if (!(list = strmap_get(&ctx->paths_to_lists, path))) { + CALLOC_ARRAY(list, 1); + list->type = OBJ_BLOB; + strmap_put(&ctx->paths_to_lists, path, list); + } + oid_array_append(&list->oids, &obj->oid); + } else { + /* assume a root tree, such as a lightweight tag. */ + oid_array_append(&tagged_blobs->oids, &obj->oid); + } + break; + + case OBJ_COMMIT: + /* Make sure it is in the object walk */ + if (obj != pending->item) + add_pending_object(info->revs, obj, ""); + break; + + default: + BUG("should not see any other type here"); + } + } + + /* + * Add tag objects and tagged blobs if they exist. + */ + if (tagged_blobs) { + if (tagged_blobs->oids.nr) { + const char *tagged_blob_path = "/tagged-blobs"; + tagged_blobs->type = OBJ_BLOB; + push_to_stack(ctx, tagged_blob_path); + strmap_put(&ctx->paths_to_lists, tagged_blob_path, tagged_blobs); + } else { + oid_array_clear(&tagged_blobs->oids); + free(tagged_blobs); + } + } + if (tags) { + if (tags->oids.nr) { + const char *tag_path = "/tags"; + tags->type = OBJ_TAG; + push_to_stack(ctx, tag_path); + strmap_put(&ctx->paths_to_lists, tag_path, tags); + } else { + oid_array_clear(&tags->oids); + free(tags); + } + } + + return 0; +} + /** * Given the configuration of 'info', walk the commits based on 'info->revs' and * call 'info->path_fn' on each discovered path. @@ -204,8 +344,7 @@ static void clear_paths_to_lists(struct strmap *map) */ int walk_objects_by_path(struct path_walk_info *info) { - const char *root_path = ""; - int ret = 0; + int ret; size_t commits_nr = 0, paths_nr = 0; struct commit *c; struct type_and_oid_list *root_tree_list; @@ -224,15 +363,34 @@ int walk_objects_by_path(struct path_walk_info *info) CALLOC_ARRAY(commit_list, 1); commit_list->type = OBJ_COMMIT; + if (info->tags) + info->revs->tag_objects = 1; + /* Insert a single list for the root tree into the paths. */ CALLOC_ARRAY(root_tree_list, 1); root_tree_list->type = OBJ_TREE; strmap_put(&ctx.paths_to_lists, root_path, root_tree_list); push_to_stack(&ctx, root_path); + /* + * Set these values before preparing the walk to catch + * lightweight tags pointing to non-commits and indexed objects. + */ + info->revs->blob_objects = info->blobs; + info->revs->tree_objects = info->trees; + if (prepare_revision_walk(info->revs)) die(_("failed to setup revision walk")); + info->revs->blob_objects = info->revs->tree_objects = 0; + + trace2_region_enter("path-walk", "pending-walk", info->revs->repo); + ret = setup_pending_objects(info, &ctx); + trace2_region_leave("path-walk", "pending-walk", info->revs->repo); + + if (ret) + return ret; + while ((c = get_revision(info->revs))) { struct object_id *oid; struct tree *t; @@ -280,6 +438,26 @@ int walk_objects_by_path(struct path_walk_info *info) free(path); } + + /* Are there paths remaining? Likely they are from indexed objects. */ + if (!strmap_empty(&ctx.paths_to_lists)) { + struct hashmap_iter iter; + struct strmap_entry *entry; + + strmap_for_each_entry(&ctx.paths_to_lists, &iter, entry) + push_to_stack(&ctx, entry->key); + + while (!ret && ctx.path_stack.nr) { + char *path = ctx.path_stack.items[ctx.path_stack.nr - 1].string; + ctx.path_stack.nr--; + paths_nr++; + + ret = walk_path(&ctx, path); + + free(path); + } + } + trace2_data_intmax("path-walk", ctx.repo, "paths", paths_nr); trace2_region_leave("path-walk", "path-walk", info->revs->repo); diff --git a/path-walk.h b/path-walk.h index 2cafc71e153..3679fa7a859 100644 --- a/path-walk.h +++ b/path-walk.h @@ -39,12 +39,14 @@ struct path_walk_info { int commits; int trees; int blobs; + int tags; }; #define PATH_WALK_INFO_INIT { \ .blobs = 1, \ .trees = 1, \ .commits = 1, \ + .tags = 1, \ } void path_walk_info_init(struct path_walk_info *info); diff --git a/t/helper/test-path-walk.c b/t/helper/test-path-walk.c index a57a05a6391..56289859e69 100644 --- a/t/helper/test-path-walk.c +++ b/t/helper/test-path-walk.c @@ -23,6 +23,7 @@ struct path_walk_test_data { uintmax_t commit_nr; uintmax_t tree_nr; uintmax_t blob_nr; + uintmax_t tag_nr; }; static int emit_block(const char *path, struct oid_array *oids, @@ -37,11 +38,18 @@ static int emit_block(const char *path, struct oid_array *oids, tdata->blob_nr += oids->nr; else if (type == OBJ_COMMIT) tdata->commit_nr += oids->nr; + else if (type == OBJ_TAG) + tdata->tag_nr += oids->nr; else BUG("we do not understand this type"); typestr = type_name(type); + /* This should never be output during tests. */ + if (!oids->nr) + printf("%"PRIuMAX":%s:%s:EMPTY\n", + tdata->batch_nr, typestr, path); + for (size_t i = 0; i < oids->nr; i++) printf("%"PRIuMAX":%s:%s:%s\n", tdata->batch_nr, typestr, path, @@ -62,6 +70,8 @@ int cmd__path_walk(int argc, const char **argv) N_("toggle inclusion of blob objects")), OPT_BOOL(0, "commits", &info.commits, N_("toggle inclusion of commit objects")), + OPT_BOOL(0, "tags", &info.tags, + N_("toggle inclusion of tag objects")), OPT_BOOL(0, "trees", &info.trees, N_("toggle inclusion of tree objects")), OPT_END(), @@ -87,8 +97,9 @@ int cmd__path_walk(int argc, const char **argv) printf("commits:%" PRIuMAX "\n" "trees:%" PRIuMAX "\n" - "blobs:%" PRIuMAX "\n", - data.commit_nr, data.tree_nr, data.blob_nr); + "blobs:%" PRIuMAX "\n" + "tags:%" PRIuMAX "\n", + data.commit_nr, data.tree_nr, data.blob_nr, data.tag_nr); release_revisions(&revs); return res; diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh index 4a4939a1b02..1f3d2e0cb76 100755 --- a/t/t6601-path-walk.sh +++ b/t/t6601-path-walk.sh @@ -9,29 +9,142 @@ test_description='direct path-walk API tests' test_expect_success 'setup test repository' ' git checkout -b base && + # Make some objects that will only be reachable + # via non-commit tags. + mkdir child && + echo file >child/file && + git add child && + git commit -m "will abandon" && + git tag -a -m "tree" tree-tag HEAD^{tree} && + echo file2 >file2 && + git add file2 && + git commit --amend -m "will abandon" && + git tag tree-tag2 HEAD^{tree} && + + echo blob >file && + blob_oid=$(git hash-object -t blob -w --stdin file2 && + blob2_oid=$(git hash-object -t blob -w --stdin a && echo b >left/b && echo c >right/c && git add . && - git commit -m "first" && + git commit --amend -m "first" && + git tag -m "first" first HEAD && echo d >right/d && git add right && git commit -m "second" && + git tag -a -m "second (under)" second.1 HEAD && + git tag -a -m "second (top)" second.2 second.1 && + # Set up file/dir collision in history. + rm a && + mkdir a && + echo a >a/a && echo bb >left/b && - git commit -a -m "third" && + git add a left && + git commit -m "third" && + git tag -a -m "third" third && git checkout -b topic HEAD~1 && echo cc >right/c && - git commit -a -m "topic" + git commit -a -m "topic" && + git tag -a -m "fourth" fourth ' test_expect_success 'all' ' test-tool path-walk -- --all >out && + cat >expect <<-EOF && + 0:commit::$(git rev-parse topic) + 0:commit::$(git rev-parse base) + 0:commit::$(git rev-parse base~1) + 0:commit::$(git rev-parse base~2) + 1:tag:/tags:$(git rev-parse refs/tags/first) + 1:tag:/tags:$(git rev-parse refs/tags/second.1) + 1:tag:/tags:$(git rev-parse refs/tags/second.2) + 1:tag:/tags:$(git rev-parse refs/tags/third) + 1:tag:/tags:$(git rev-parse refs/tags/fourth) + 1:tag:/tags:$(git rev-parse refs/tags/tree-tag) + 1:tag:/tags:$(git rev-parse refs/tags/blob-tag) + 2:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag^{}) + 2:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag2^{}) + 3:tree::$(git rev-parse topic^{tree}) + 3:tree::$(git rev-parse base^{tree}) + 3:tree::$(git rev-parse base~1^{tree}) + 3:tree::$(git rev-parse base~2^{tree}) + 3:tree::$(git rev-parse refs/tags/tree-tag^{}) + 3:tree::$(git rev-parse refs/tags/tree-tag2^{}) + 4:blob:a:$(git rev-parse base~2:a) + 5:tree:right/:$(git rev-parse topic:right) + 5:tree:right/:$(git rev-parse base~1:right) + 5:tree:right/:$(git rev-parse base~2:right) + 6:blob:right/d:$(git rev-parse base~1:right/d) + 7:blob:right/c:$(git rev-parse base~2:right/c) + 7:blob:right/c:$(git rev-parse topic:right/c) + 8:tree:left/:$(git rev-parse base:left) + 8:tree:left/:$(git rev-parse base~2:left) + 9:blob:left/b:$(git rev-parse base~2:left/b) + 9:blob:left/b:$(git rev-parse base:left/b) + 10:tree:a/:$(git rev-parse base:a) + 11:blob:file2:$(git rev-parse refs/tags/tree-tag2^{}:file2) + 12:tree:child/:$(git rev-parse refs/tags/tree-tag:child) + 13:blob:child/file:$(git rev-parse refs/tags/tree-tag:child/file) + blobs:10 + commits:4 + tags:7 + trees:13 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'indexed objects' ' + test_when_finished git reset --hard && + + # stage change into index, adding a blob but + # also invalidating the cache-tree for the root + # and the "left" directory. + echo bogus >left/c && + git add left && + + test-tool path-walk -- --indexed-objects >out && + + cat >expect <<-EOF && + 0:blob:a:$(git rev-parse HEAD:a) + 1:blob:left/b:$(git rev-parse HEAD:left/b) + 2:blob:left/c:$(git rev-parse :left/c) + 3:blob:right/c:$(git rev-parse HEAD:right/c) + 4:blob:right/d:$(git rev-parse HEAD:right/d) + 5:tree:right/:$(git rev-parse topic:right) + blobs:5 + commits:0 + tags:0 + trees:1 + EOF + + test_cmp_sorted expect out +' + +test_expect_success 'branches and indexed objects mix well' ' + test_when_finished git reset --hard && + + # stage change into index, adding a blob but + # also invalidating the cache-tree for the root + # and the "right" directory. + echo fake >right/d && + git add right && + + test-tool path-walk -- --indexed-objects --branches >out && + cat >expect <<-EOF && 0:commit::$(git rev-parse topic) 0:commit::$(git rev-parse base) @@ -41,20 +154,23 @@ test_expect_success 'all' ' 1:tree::$(git rev-parse base^{tree}) 1:tree::$(git rev-parse base~1^{tree}) 1:tree::$(git rev-parse base~2^{tree}) - 2:tree:right/:$(git rev-parse topic:right) - 2:tree:right/:$(git rev-parse base~1:right) - 2:tree:right/:$(git rev-parse base~2:right) - 3:blob:right/d:$(git rev-parse base~1:right/d) - 4:blob:right/c:$(git rev-parse base~2:right/c) - 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse base:left) - 5:tree:left/:$(git rev-parse base~2:left) - 6:blob:left/b:$(git rev-parse base~2:left/b) - 6:blob:left/b:$(git rev-parse base:left/b) - 7:blob:a:$(git rev-parse base~2:a) - blobs:6 + 2:blob:a:$(git rev-parse base~2:a) + 3:tree:right/:$(git rev-parse topic:right) + 3:tree:right/:$(git rev-parse base~1:right) + 3:tree:right/:$(git rev-parse base~2:right) + 4:blob:right/d:$(git rev-parse base~1:right/d) + 4:blob:right/d:$(git rev-parse :right/d) + 5:blob:right/c:$(git rev-parse base~2:right/c) + 5:blob:right/c:$(git rev-parse topic:right/c) + 6:tree:left/:$(git rev-parse base:left) + 6:tree:left/:$(git rev-parse base~2:left) + 7:blob:left/b:$(git rev-parse base:left/b) + 7:blob:left/b:$(git rev-parse base~2:left/b) + 8:tree:a/:$(git rev-parse refs/tags/third:a) + blobs:7 commits:4 - trees:9 + tags:0 + trees:10 EOF test_cmp_sorted expect out @@ -81,6 +197,7 @@ test_expect_success 'topic only' ' 7:blob:a:$(git rev-parse base~2:a) blobs:5 commits:3 + tags:0 trees:7 EOF @@ -101,6 +218,7 @@ test_expect_success 'topic, not base' ' 7:blob:a:$(git rev-parse topic:a) blobs:4 commits:1 + tags:0 trees:3 EOF @@ -112,13 +230,14 @@ test_expect_success 'topic, not base, only blobs' ' -- topic --not base >out && cat >expect <<-EOF && - commits:0 - trees:0 0:blob:right/d:$(git rev-parse topic:right/d) 1:blob:right/c:$(git rev-parse topic:right/c) 2:blob:left/b:$(git rev-parse topic:left/b) 3:blob:a:$(git rev-parse topic:a) blobs:4 + commits:0 + tags:0 + trees:0 EOF test_cmp_sorted expect out @@ -133,8 +252,9 @@ test_expect_success 'topic, not base, only commits' ' cat >expect <<-EOF && 0:commit::$(git rev-parse topic) commits:1 - trees:0 blobs:0 + tags:0 + trees:0 EOF test_cmp_sorted expect out @@ -145,12 +265,13 @@ test_expect_success 'topic, not base, only trees' ' -- topic --not base >out && cat >expect <<-EOF && - commits:0 0:tree::$(git rev-parse topic^{tree}) 1:tree:right/:$(git rev-parse topic:right) 2:tree:left/:$(git rev-parse topic:left) - trees:3 + commits:0 blobs:0 + tags:0 + trees:3 EOF test_cmp_sorted expect out @@ -174,10 +295,33 @@ test_expect_success 'topic, not base, boundary' ' 7:blob:a:$(git rev-parse base~1:a) blobs:5 commits:2 + tags:0 trees:5 EOF test_cmp_sorted expect out ' +test_expect_success 'trees are reported exactly once' ' + test_when_finished "rm -rf unique-trees" && + test_create_repo unique-trees && + ( + cd unique-trees && + mkdir initial && + test_commit initial/file && + + git switch -c move-to-top && + git mv initial/file.t ./ && + test_tick && + git commit -m moved && + + git update-ref refs/heads/other HEAD + ) && + + test-tool -C unique-trees path-walk -- --all >out && + tree=$(git -C unique-trees rev-parse HEAD:) && + grep "$tree" out >out-filtered && + test_line_count = 1 out-filtered +' + test_done From patchwork Fri Dec 20 16:21:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916984 Received: from mail-wm1-f44.google.com (mail-wm1-f44.google.com [209.85.128.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1EAE721A452 for ; Fri, 20 Dec 2024 16:21:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711690; cv=none; b=PYTW3IJsAeG6vhmLuql54IT554tvYN6EJYjQBHjNL4bH/XG9ROT4ag2JspIlwj0YAoMbrMHbsZYxYFschCKHoI98xq7xEG6si3zMR8uS5ZpZbTmZWpQx1Sva8kX3EeOzf4WTnArB6+/7BXNXB47arhA9lPGGuBQR98qcehAj9+k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711690; c=relaxed/simple; bh=GjsMXug+vIKde5hYP3Aia2ahlM3KoOml5oSaOXSnZPU=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=FUF64FG3SHT8pRyJmlitg82oSSk6fFvN24ksjyoEHtafUNYcRH99FAlEGVQunRk1QsRa5PQJ97YgQVrNwdX8q5ETalZvwM1qE/7nAEaD19HOvmOW+uMbgBZT/Dem4SkvKpVJ3DdbxvvnHynqOo3xbUfE6UgswtfghUoelkaFy7g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=k8u53kqB; arc=none smtp.client-ip=209.85.128.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="k8u53kqB" Received: by mail-wm1-f44.google.com with SMTP id 5b1f17b1804b1-43626213fffso20188705e9.1 for ; Fri, 20 Dec 2024 08:21:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711686; x=1735316486; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=xTBSN1N3JtKb/Zgwqcq+umgISBS8Bt5EJwykvrKBjQM=; b=k8u53kqBSLPmIpoFEFSOGPQzGGfwrZpGbPc1XN9CRQ9oL54wha3vvLhJDTF7HajH5V MU/1CioC9xq1fpYnsfQB5jMQNca9Pp6NQmqvH4aahY2GHy8yXhYxIoF+rwVns7aBuxgS SJWf2v5kvylVNX799/mvchSNNGxVFLClg05kSUsIMB2WYiv+2TmEcTH4e6fodCWWgSf9 FSjeU2mg4nhpLxr7XOahdNFHhphZhba8MSPPAJ7az45YFRdEWITCNcOUd0xalwmquQix /zJi+0USZTentMlWcLyL/q29QD6x1OMzNIVXEJtq2acybVRqmacoVzUA4ezXfvF7+WUF Qpmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711686; x=1735316486; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xTBSN1N3JtKb/Zgwqcq+umgISBS8Bt5EJwykvrKBjQM=; b=bdH+DVNJHZfiU8oS8j2fgr6DuSCLvwcUO1JkKoLXi6iuugL/ROBTCauNCa9S63eikK nVmlyrM4XHJgXyftNgjVQP/YY/rI+/N9H+tz8yWR1NuvgIVvF7xxAP0o/Jj2oIfn8uOe l50Q6QDbF//RTwtzkKzRgZV7wPBEO6GMpD2MepYJ3gyWhecyTRn0S0uMwTQWWCOFfGpr Rzkbe7hQk1aaPj8t43dhr1USLXoK8TXhzhmW2p2HjlcI+VnhpG2l/d9Yc4/kBPl63DRU Turu8viQpy3kcJJiayTif9GA07bq2vLvSgMGFgRsGdh3PtK5B07Jewz0G+FxBONDWNy1 l57A== X-Gm-Message-State: AOJu0YxfxioWNwsI/xnA1B02LCiprakx80+s2Xzjo4SLG6ivOf8T5uui Tb9rkX2kF+7z4aQ83hXvXex130No2yenj9WTeE+o+dg2j9vYj+GyIrR2yQ== X-Gm-Gg: ASbGncve+96U1nCgPdMdeYnzx1bq6AzjXWqnZPgS3KHgWpqSBfyAonlY68n5klL02uT my3ySTu13msxZHDqNjN2FCTE42rk9GVND7bavdZcYu14L8nJVN8SaQwv8kbJsdtrf+PrRxk3EcO dWtFJDJQ3XujUo2DdzITSiOh1FkfJWsGQAVeAvIzWZeSr2dqIaECeeS7VTIBcpp1m3DZqhvTTgr tlZhbvZLRhlYE66DVGRhEodwmXJl322RQ8Qfx5mvfwwGP6alkfMlJoocw== X-Google-Smtp-Source: AGHT+IGbLLli8O9WhKB2+cZGt2SZBC+UtPHoiur2tY/iZKu1QAIT3QM+1qxnEV7AiG4PmwGfvWM+xw== X-Received: by 2002:a5d:64cc:0:b0:386:42b1:d7e4 with SMTP id ffacd0b85a97d-38a1a264ae2mr7494202f8f.19.1734711685653; Fri, 20 Dec 2024 08:21:25 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-38a2432e587sm1485637f8f.95.2024.12.20.08.21.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:25 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:14 +0000 Subject: [PATCH v4 6/7] path-walk: mark trees and blobs as UNINTERESTING Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee When the input rev_info has UNINTERESTING starting points, we want to be sure that the UNINTERESTING flag is passed appropriately through the objects. To match how this is done in places such as 'git pack-objects', we use the mark_edges_uninteresting() method. This method has an option for using the "sparse" walk, which is similar in spirit to the path-walk API's walk. To be sure to keep it independent, add a new 'prune_all_uninteresting' option to the path_walk_info struct. To check how the UNINTERSTING flag is spread through our objects, extend the 'test-tool path-walk' command to output whether or not an object has that flag. This changes our tests significantly, including the removal of some objects that were previously visited due to the incomplete implementation. Signed-off-by: Derrick Stolee --- Documentation/technical/api-path-walk.txt | 8 +++ path-walk.c | 74 +++++++++++++++++++++ path-walk.h | 8 +++ t/helper/test-path-walk.c | 12 +++- t/t6601-path-walk.sh | 79 +++++++++++++++++------ 5 files changed, 159 insertions(+), 22 deletions(-) diff --git a/Documentation/technical/api-path-walk.txt b/Documentation/technical/api-path-walk.txt index 6022c381b7c..7075d0d5ab5 100644 --- a/Documentation/technical/api-path-walk.txt +++ b/Documentation/technical/api-path-walk.txt @@ -48,6 +48,14 @@ commits. While it is possible to walk only commits in this way, consumers would be better off using the revision walk API instead. +`prune_all_uninteresting`:: + By default, all reachable paths are emitted by the path-walk API. + This option allows consumers to declare that they are not + interested in paths where all included objects are marked with the + `UNINTERESTING` flag. This requires using the `boundary` option in + the revision walk so that the walk emits commits marked with the + `UNINTERESTING` flag. + Examples -------- diff --git a/path-walk.c b/path-walk.c index f34dbf61de0..4013569e9e4 100644 --- a/path-walk.c +++ b/path-walk.c @@ -8,6 +8,7 @@ #include "dir.h" #include "hashmap.h" #include "hex.h" +#include "list-objects.h" #include "object.h" #include "oid-array.h" #include "revision.h" @@ -23,6 +24,7 @@ static const char *root_path = ""; struct type_and_oid_list { enum object_type type; struct oid_array oids; + int maybe_interesting; }; #define TYPE_AND_OID_LIST_INIT { \ @@ -142,6 +144,10 @@ static int add_tree_entries(struct path_walk_context *ctx, strmap_put(&ctx->paths_to_lists, path.buf, list); } push_to_stack(ctx, path.buf); + + if (!(o->flags & UNINTERESTING)) + list->maybe_interesting = 1; + oid_array_append(&list->oids, &entry.oid); } @@ -169,6 +175,43 @@ static int walk_path(struct path_walk_context *ctx, if (!list->oids.nr) return 0; + if (ctx->info->prune_all_uninteresting) { + /* + * This is true if all objects were UNINTERESTING + * when added to the list. + */ + if (!list->maybe_interesting) + return 0; + + /* + * But it's still possible that the objects were set + * as UNINTERESTING after being added. Do a quick check. + */ + list->maybe_interesting = 0; + for (size_t i = 0; + !list->maybe_interesting && i < list->oids.nr; + i++) { + if (list->type == OBJ_TREE) { + struct tree *t = lookup_tree(ctx->repo, + &list->oids.oid[i]); + if (t && !(t->object.flags & UNINTERESTING)) + list->maybe_interesting = 1; + } else if (list->type == OBJ_BLOB) { + struct blob *b = lookup_blob(ctx->repo, + &list->oids.oid[i]); + if (b && !(b->object.flags & UNINTERESTING)) + list->maybe_interesting = 1; + } else { + /* Tags are always interesting if visited. */ + list->maybe_interesting = 1; + } + } + + /* We have confirmed that all objects are UNINTERESTING. */ + if (!list->maybe_interesting) + return 0; + } + /* Evaluate function pointer on this data, if requested. */ if ((list->type == OBJ_TREE && ctx->info->trees) || (list->type == OBJ_BLOB && ctx->info->blobs) || @@ -203,6 +246,26 @@ static void clear_paths_to_lists(struct strmap *map) strmap_init(map); } +static struct repository *edge_repo; +static struct type_and_oid_list *edge_tree_list; + +static void show_edge(struct commit *commit) +{ + struct tree *t = repo_get_commit_tree(edge_repo, commit); + + if (!t) + return; + + if (commit->object.flags & UNINTERESTING) + t->object.flags |= UNINTERESTING; + + if (t->object.flags & SEEN) + return; + t->object.flags |= SEEN; + + oid_array_append(&edge_tree_list->oids, &t->object.oid); +} + static int setup_pending_objects(struct path_walk_info *info, struct path_walk_context *ctx) { @@ -314,6 +377,7 @@ static int setup_pending_objects(struct path_walk_info *info, if (tagged_blobs->oids.nr) { const char *tagged_blob_path = "/tagged-blobs"; tagged_blobs->type = OBJ_BLOB; + tagged_blobs->maybe_interesting = 1; push_to_stack(ctx, tagged_blob_path); strmap_put(&ctx->paths_to_lists, tagged_blob_path, tagged_blobs); } else { @@ -325,6 +389,7 @@ static int setup_pending_objects(struct path_walk_info *info, if (tags->oids.nr) { const char *tag_path = "/tags"; tags->type = OBJ_TAG; + tags->maybe_interesting = 1; push_to_stack(ctx, tag_path); strmap_put(&ctx->paths_to_lists, tag_path, tags); } else { @@ -369,6 +434,7 @@ int walk_objects_by_path(struct path_walk_info *info) /* Insert a single list for the root tree into the paths. */ CALLOC_ARRAY(root_tree_list, 1); root_tree_list->type = OBJ_TREE; + root_tree_list->maybe_interesting = 1; strmap_put(&ctx.paths_to_lists, root_path, root_tree_list); push_to_stack(&ctx, root_path); @@ -382,6 +448,14 @@ int walk_objects_by_path(struct path_walk_info *info) if (prepare_revision_walk(info->revs)) die(_("failed to setup revision walk")); + /* Walk trees to mark them as UNINTERESTING. */ + edge_repo = info->revs->repo; + edge_tree_list = root_tree_list; + mark_edges_uninteresting(info->revs, show_edge, + info->prune_all_uninteresting); + edge_repo = NULL; + edge_tree_list = NULL; + info->revs->blob_objects = info->revs->tree_objects = 0; trace2_region_enter("path-walk", "pending-walk", info->revs->repo); diff --git a/path-walk.h b/path-walk.h index 3679fa7a859..414d6db23c2 100644 --- a/path-walk.h +++ b/path-walk.h @@ -40,6 +40,14 @@ struct path_walk_info { int trees; int blobs; int tags; + + /** + * When 'prune_all_uninteresting' is set and a path has all objects + * marked as UNINTERESTING, then the path-walk will not visit those + * objects. It will not call path_fn on those objects and will not + * walk the children of such trees. + */ + int prune_all_uninteresting; }; #define PATH_WALK_INFO_INIT { \ diff --git a/t/helper/test-path-walk.c b/t/helper/test-path-walk.c index 56289859e69..7f2d409c5bc 100644 --- a/t/helper/test-path-walk.c +++ b/t/helper/test-path-walk.c @@ -50,10 +50,14 @@ static int emit_block(const char *path, struct oid_array *oids, printf("%"PRIuMAX":%s:%s:EMPTY\n", tdata->batch_nr, typestr, path); - for (size_t i = 0; i < oids->nr; i++) - printf("%"PRIuMAX":%s:%s:%s\n", + for (size_t i = 0; i < oids->nr; i++) { + struct object *o = lookup_unknown_object(the_repository, + &oids->oid[i]); + printf("%"PRIuMAX":%s:%s:%s%s\n", tdata->batch_nr, typestr, path, - oid_to_hex(&oids->oid[i])); + oid_to_hex(&oids->oid[i]), + o->flags & UNINTERESTING ? ":UNINTERESTING" : ""); + } tdata->batch_nr++; return 0; @@ -74,6 +78,8 @@ int cmd__path_walk(int argc, const char **argv) N_("toggle inclusion of tag objects")), OPT_BOOL(0, "trees", &info.trees, N_("toggle inclusion of tree objects")), + OPT_BOOL(0, "prune", &info.prune_all_uninteresting, + N_("toggle pruning of uninteresting paths")), OPT_END(), }; diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh index 1f3d2e0cb76..a317cdf289e 100755 --- a/t/t6601-path-walk.sh +++ b/t/t6601-path-walk.sh @@ -211,11 +211,11 @@ test_expect_success 'topic, not base' ' 0:commit::$(git rev-parse topic) 1:tree::$(git rev-parse topic^{tree}) 2:tree:right/:$(git rev-parse topic:right) - 3:blob:right/d:$(git rev-parse topic:right/d) + 3:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse topic:left) - 6:blob:left/b:$(git rev-parse topic:left/b) - 7:blob:a:$(git rev-parse topic:a) + 5:tree:left/:$(git rev-parse topic:left):UNINTERESTING + 6:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING + 7:blob:a:$(git rev-parse topic:a):UNINTERESTING blobs:4 commits:1 tags:0 @@ -225,15 +225,38 @@ test_expect_success 'topic, not base' ' test_cmp_sorted expect out ' +test_expect_success 'fourth, blob-tag2, not base' ' + test-tool path-walk -- fourth blob-tag2 --not base >out && + + cat >expect <<-EOF && + 0:commit::$(git rev-parse topic) + 1:tag:/tags:$(git rev-parse fourth) + 2:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag2^{}) + 3:tree::$(git rev-parse topic^{tree}) + 4:tree:right/:$(git rev-parse topic:right) + 5:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING + 6:blob:right/c:$(git rev-parse topic:right/c) + 7:tree:left/:$(git rev-parse base~1:left):UNINTERESTING + 8:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING + 9:blob:a:$(git rev-parse base~1:a):UNINTERESTING + blobs:5 + commits:1 + tags:1 + trees:3 + EOF + + test_cmp_sorted expect out +' + test_expect_success 'topic, not base, only blobs' ' test-tool path-walk --no-trees --no-commits \ -- topic --not base >out && cat >expect <<-EOF && - 0:blob:right/d:$(git rev-parse topic:right/d) + 0:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING 1:blob:right/c:$(git rev-parse topic:right/c) - 2:blob:left/b:$(git rev-parse topic:left/b) - 3:blob:a:$(git rev-parse topic:a) + 2:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING + 3:blob:a:$(git rev-parse topic:a):UNINTERESTING blobs:4 commits:0 tags:0 @@ -267,7 +290,7 @@ test_expect_success 'topic, not base, only trees' ' cat >expect <<-EOF && 0:tree::$(git rev-parse topic^{tree}) 1:tree:right/:$(git rev-parse topic:right) - 2:tree:left/:$(git rev-parse topic:left) + 2:tree:left/:$(git rev-parse topic:left):UNINTERESTING commits:0 blobs:0 tags:0 @@ -282,17 +305,17 @@ test_expect_success 'topic, not base, boundary' ' cat >expect <<-EOF && 0:commit::$(git rev-parse topic) - 0:commit::$(git rev-parse base~1) + 0:commit::$(git rev-parse base~1):UNINTERESTING 1:tree::$(git rev-parse topic^{tree}) - 1:tree::$(git rev-parse base~1^{tree}) + 1:tree::$(git rev-parse base~1^{tree}):UNINTERESTING 2:tree:right/:$(git rev-parse topic:right) - 2:tree:right/:$(git rev-parse base~1:right) - 3:blob:right/d:$(git rev-parse base~1:right/d) - 4:blob:right/c:$(git rev-parse base~1:right/c) + 2:tree:right/:$(git rev-parse base~1:right):UNINTERESTING + 3:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING + 4:blob:right/c:$(git rev-parse base~1:right/c):UNINTERESTING 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse base~1:left) - 6:blob:left/b:$(git rev-parse base~1:left/b) - 7:blob:a:$(git rev-parse base~1:a) + 5:tree:left/:$(git rev-parse base~1:left):UNINTERESTING + 6:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING + 7:blob:a:$(git rev-parse base~1:a):UNINTERESTING blobs:5 commits:2 tags:0 @@ -302,6 +325,27 @@ test_expect_success 'topic, not base, boundary' ' test_cmp_sorted expect out ' +test_expect_success 'topic, not base, boundary with pruning' ' + test-tool path-walk --prune -- --boundary topic --not base >out && + + cat >expect <<-EOF && + 0:commit::$(git rev-parse topic) + 0:commit::$(git rev-parse base~1):UNINTERESTING + 1:tree::$(git rev-parse topic^{tree}) + 1:tree::$(git rev-parse base~1^{tree}):UNINTERESTING + 2:tree:right/:$(git rev-parse topic:right) + 2:tree:right/:$(git rev-parse base~1:right):UNINTERESTING + 3:blob:right/c:$(git rev-parse base~1:right/c):UNINTERESTING + 3:blob:right/c:$(git rev-parse topic:right/c) + blobs:2 + commits:2 + tags:0 + trees:4 + EOF + + test_cmp_sorted expect out +' + test_expect_success 'trees are reported exactly once' ' test_when_finished "rm -rf unique-trees" && test_create_repo unique-trees && @@ -309,15 +353,12 @@ test_expect_success 'trees are reported exactly once' ' cd unique-trees && mkdir initial && test_commit initial/file && - git switch -c move-to-top && git mv initial/file.t ./ && test_tick && git commit -m moved && - git update-ref refs/heads/other HEAD ) && - test-tool -C unique-trees path-walk -- --all >out && tree=$(git -C unique-trees rev-parse HEAD:) && grep "$tree" out >out-filtered && From patchwork Fri Dec 20 16:21:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 13916985 Received: from mail-wm1-f47.google.com (mail-wm1-f47.google.com [209.85.128.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B55CD21A458 for ; Fri, 20 Dec 2024 16:21:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711690; cv=none; b=urpI9DYoVPHy9dNu0N7moqgCiataPcHjy6blPwRanH2LWek7rNkTXnMlM8YzeiNLd+N6uAOKtHs3Ay8zR/IDOb2o0O8hTT07V8PyL0uoRJNwX8FCR1aLLoDckPsLEIT5h3CAvdJhhms2ALsrmgvdD9nfHE4insqsBsSnMtq4Hkc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734711690; c=relaxed/simple; bh=iPj9apbad7G9Dkvr80M6Q8f1APNGa3ViPfHIy7aZGSU=; h=Message-Id:In-Reply-To:References:From:Date:Subject:Content-Type: MIME-Version:To:Cc; b=knpIdZc0QmK1517l9Yh8Bwwa8QYw3Ln6Kl+6WMjDO0oDHcnfWFWm8hrfoiMkA/aCbnJ04rWe56uONyzdMQKz3EinEBHrTndD2o1IjXVuDocpob3REm9zHqmO5C6+cjlfwbxkl00oJWcgu4DtTaJPnXEd4fxV55Q9LRX+1Q3sUuw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=fJq/ykx5; arc=none smtp.client-ip=209.85.128.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="fJq/ykx5" Received: by mail-wm1-f47.google.com with SMTP id 5b1f17b1804b1-4361b0ec57aso20482985e9.0 for ; Fri, 20 Dec 2024 08:21:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1734711687; x=1735316487; darn=vger.kernel.org; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:from:to:cc:subject:date :message-id:reply-to; bh=jS6IRldV0kXxDpzTrwX4vJ4Stv77WPjfZ/AiohQUSoE=; b=fJq/ykx5znX4do6z93Z0ItRuycgACduGJuNsaeDcmLX5m7BwcbjwTzqO4WcWWfSYKd Q4uWbHY4ejeqzcxwba21+2TpFGoQaImFdtU0Q5WDll8HUCZF1jtVLO4R+B/vHfzvvJec +VWQ8iusslTuxsY+43+7sHDHHdF0YWMy7PtjAKBk09de5gnric9qzuytNfuCMv+rSVy/ EjKx8y1xUzpPERCjqpRARQzaUAw9fcO344vrnnVX8dIacJPkjqgwKLNHL8Ag4C1JO7GN rdaqBGvJwlo5BthezV9mY4342SzY8+F1sWKavKq4MaVl7lF0+RLDlNl/A4WZXaxZbjbj 7wUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734711687; x=1735316487; h=cc:to:mime-version:content-transfer-encoding:fcc:subject:date:from :references:in-reply-to:message-id:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jS6IRldV0kXxDpzTrwX4vJ4Stv77WPjfZ/AiohQUSoE=; b=lykq+aZxIsn6Ba5H0t4X5NpuLiHaAfbOybizh8GTlo1/ih7hy8HnNcuEjLOzhIaofz dc+Web9iv4RkaOVgNzZ6SBgVlddvRcp1KvJxCbA5Y6OaKx/KSTK3iIy0VCdJGam+IlIq jWh5MQQtdyKaf043huPKK3ymkTcuyMDyTU0/HNVoP9aGDIlFCG1zFIEiO7oZKpFObcif OC2vnI9EpMkkxTVy+wyimcubceefpqxUyA4Kwzt3fqoexGoPiJ1ZbyMcUqYuLE/f0SE6 2NT/bEJVoDQRtzDnUNcz4nU26kj2l+diNPE0e9MpCNZ+Z240Te4z4Lqbyy8HNSvzzrRb WT8Q== X-Gm-Message-State: AOJu0YxlMSwjir/r08Y/Wa5DNBzmUXrLl40VZU5eq/1w3tR4Ilx3leJc VxcKlqMf+o3yF9U7e76e3WYC4EDn/3FlF5NHeQkTN0ntvNGlbSj8CkOvBg== X-Gm-Gg: ASbGnctHP3+aHN/0h/Bu406xYnj9wztYhJAK4RDJzJngb93QYnHvNOa64dlo1PHs9dm NdNAwX1neRd5ilgVePD+23yZH/4PNhnvnXrCiB11ZvvAXlSLayKaA7EZmreNEPwHpuU6XcTP3LM QIkQo/hpLL3MjK1gpQgX1QS4O6YGJ+p/urqdrRN8y7ouYOartP2NwObzWm2c9prIZ9FkVxh7MHI PFcri3sLZ11g4a4+7POLKqZIO0Svm64gpiCoqTjgNtU0S0S1LpSWE4BDA== X-Google-Smtp-Source: AGHT+IGMcsHfca3jvVKdSrZQD1kRuvyaC7AMhu5oDitjqs9rlDQnflKi5Hi9XOHf1brfOyRnKetUFg== X-Received: by 2002:a5d:6da1:0:b0:385:e5d8:2bea with SMTP id ffacd0b85a97d-38a221fa065mr3587711f8f.20.1734711686432; Fri, 20 Dec 2024 08:21:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-43656b013ecsm83873635e9.16.2024.12.20.08.21.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 20 Dec 2024 08:21:25 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 20 Dec 2024 16:21:15 +0000 Subject: [PATCH v4 7/7] path-walk: reorder object visits Fcc: Sent Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, peff@peff.net, ps@pks.im, me@ttaylorr.com, johncai86@gmail.com, newren@gmail.com, christian.couder@gmail.com, kristofferhaugsbakk@fastmail.com, jonathantanmy@google.com, karthik nayak , Derrick Stolee , Derrick Stolee From: Derrick Stolee From: Derrick Stolee The path-walk API currently uses a stack-based approach to recursing through the list of paths within the repository. This guarantees that after a tree path is explored, all paths contained within that tree path will be explored before continuing to explore siblings of that tree path. The initial motivation of this depth-first approach was to minimize memory pressure while exploring the repository. A breadth-first approach would have too many "active" paths being stored in the paths_to_lists map. We can take this approach one step further by making sure that blob paths are visited before tree paths. This allows the API to free the memory for these blob objects before continuing to perform the depth-first search. This modifies the order in which we visit siblings, but does not change the fact that we are performing depth-first search. To achieve this goal, use a priority queue with a custom sorting method. The sort needs to handle tags, blobs, and trees (commits are handled slightly differently). When objects share a type, we can sort by path name. This will keep children of the latest path to leave the stack be preferred over the rest of the paths in the stack, since they agree in prefix up to and including a directory separator. When the types are different, we can prefer tags over other types and blobs over trees. This causes significant adjustments to t6601-path-walk.sh to rearrange the order of the visited paths. Signed-off-by: Derrick Stolee --- path-walk.c | 60 ++++++++++++++++----- t/t6601-path-walk.sh | 124 +++++++++++++++++++++---------------------- 2 files changed, 110 insertions(+), 74 deletions(-) diff --git a/path-walk.c b/path-walk.c index 4013569e9e4..136ec08fb0e 100644 --- a/path-walk.c +++ b/path-walk.c @@ -11,6 +11,7 @@ #include "list-objects.h" #include "object.h" #include "oid-array.h" +#include "prio-queue.h" #include "revision.h" #include "string-list.h" #include "strmap.h" @@ -49,16 +50,50 @@ struct path_walk_context { struct strmap paths_to_lists; /** - * Store the current list of paths in a stack, to - * facilitate depth-first-search without recursion. + * Store the current list of paths in a priority queue, + * using object type as a sorting mechanism, mostly to + * make sure blobs are popped off the stack first. No + * other sort is made, so within each object type it acts + * like a stack and performs a DFS within the trees. * * Use path_stack_pushed to indicate whether a path * was previously added to path_stack. */ - struct string_list path_stack; + struct prio_queue path_stack; struct strset path_stack_pushed; }; +static int compare_by_type(const void *one, const void *two, void *cb_data) +{ + struct type_and_oid_list *list1, *list2; + const char *str1 = one; + const char *str2 = two; + struct path_walk_context *ctx = cb_data; + + list1 = strmap_get(&ctx->paths_to_lists, str1); + list2 = strmap_get(&ctx->paths_to_lists, str2); + + /* + * If object types are equal, then use path comparison. + */ + if (!list1 || !list2 || list1->type == list2->type) + return strcmp(str1, str2); + + /* Prefer tags to be popped off first. */ + if (list1->type == OBJ_TAG) + return -1; + if (list2->type == OBJ_TAG) + return 1; + + /* Prefer blobs to be popped off second. */ + if (list1->type == OBJ_BLOB) + return -1; + if (list2->type == OBJ_BLOB) + return 1; + + return 0; +} + static void push_to_stack(struct path_walk_context *ctx, const char *path) { @@ -66,7 +101,7 @@ static void push_to_stack(struct path_walk_context *ctx, return; strset_add(&ctx->path_stack_pushed, path); - string_list_append(&ctx->path_stack, path); + prio_queue_put(&ctx->path_stack, xstrdup(path)); } static int add_tree_entries(struct path_walk_context *ctx, @@ -378,8 +413,8 @@ static int setup_pending_objects(struct path_walk_info *info, const char *tagged_blob_path = "/tagged-blobs"; tagged_blobs->type = OBJ_BLOB; tagged_blobs->maybe_interesting = 1; - push_to_stack(ctx, tagged_blob_path); strmap_put(&ctx->paths_to_lists, tagged_blob_path, tagged_blobs); + push_to_stack(ctx, tagged_blob_path); } else { oid_array_clear(&tagged_blobs->oids); free(tagged_blobs); @@ -390,8 +425,8 @@ static int setup_pending_objects(struct path_walk_info *info, const char *tag_path = "/tags"; tags->type = OBJ_TAG; tags->maybe_interesting = 1; - push_to_stack(ctx, tag_path); strmap_put(&ctx->paths_to_lists, tag_path, tags); + push_to_stack(ctx, tag_path); } else { oid_array_clear(&tags->oids); free(tags); @@ -418,7 +453,10 @@ int walk_objects_by_path(struct path_walk_info *info) .repo = info->revs->repo, .revs = info->revs, .info = info, - .path_stack = STRING_LIST_INIT_DUP, + .path_stack = { + .compare = compare_by_type, + .cb_data = &ctx + }, .path_stack_pushed = STRSET_INIT, .paths_to_lists = STRMAP_INIT }; @@ -504,8 +542,7 @@ int walk_objects_by_path(struct path_walk_info *info) trace2_region_enter("path-walk", "path-walk", info->revs->repo); while (!ret && ctx.path_stack.nr) { - char *path = ctx.path_stack.items[ctx.path_stack.nr - 1].string; - ctx.path_stack.nr--; + char *path = prio_queue_get(&ctx.path_stack); paths_nr++; ret = walk_path(&ctx, path); @@ -522,8 +559,7 @@ int walk_objects_by_path(struct path_walk_info *info) push_to_stack(&ctx, entry->key); while (!ret && ctx.path_stack.nr) { - char *path = ctx.path_stack.items[ctx.path_stack.nr - 1].string; - ctx.path_stack.nr--; + char *path = prio_queue_get(&ctx.path_stack); paths_nr++; ret = walk_path(&ctx, path); @@ -537,7 +573,7 @@ int walk_objects_by_path(struct path_walk_info *info) clear_paths_to_lists(&ctx.paths_to_lists); strset_clear(&ctx.path_stack_pushed); - string_list_clear(&ctx.path_stack, 0); + clear_prio_queue(&ctx.path_stack); return ret; } diff --git a/t/t6601-path-walk.sh b/t/t6601-path-walk.sh index a317cdf289e..5f04acb8a2f 100755 --- a/t/t6601-path-walk.sh +++ b/t/t6601-path-walk.sh @@ -84,20 +84,20 @@ test_expect_success 'all' ' 3:tree::$(git rev-parse refs/tags/tree-tag^{}) 3:tree::$(git rev-parse refs/tags/tree-tag2^{}) 4:blob:a:$(git rev-parse base~2:a) - 5:tree:right/:$(git rev-parse topic:right) - 5:tree:right/:$(git rev-parse base~1:right) - 5:tree:right/:$(git rev-parse base~2:right) - 6:blob:right/d:$(git rev-parse base~1:right/d) - 7:blob:right/c:$(git rev-parse base~2:right/c) - 7:blob:right/c:$(git rev-parse topic:right/c) - 8:tree:left/:$(git rev-parse base:left) - 8:tree:left/:$(git rev-parse base~2:left) - 9:blob:left/b:$(git rev-parse base~2:left/b) - 9:blob:left/b:$(git rev-parse base:left/b) - 10:tree:a/:$(git rev-parse base:a) - 11:blob:file2:$(git rev-parse refs/tags/tree-tag2^{}:file2) - 12:tree:child/:$(git rev-parse refs/tags/tree-tag:child) - 13:blob:child/file:$(git rev-parse refs/tags/tree-tag:child/file) + 5:blob:file2:$(git rev-parse refs/tags/tree-tag2^{}:file2) + 6:tree:a/:$(git rev-parse base:a) + 7:tree:child/:$(git rev-parse refs/tags/tree-tag:child) + 8:blob:child/file:$(git rev-parse refs/tags/tree-tag:child/file) + 9:tree:left/:$(git rev-parse base:left) + 9:tree:left/:$(git rev-parse base~2:left) + 10:blob:left/b:$(git rev-parse base~2:left/b) + 10:blob:left/b:$(git rev-parse base:left/b) + 11:tree:right/:$(git rev-parse topic:right) + 11:tree:right/:$(git rev-parse base~1:right) + 11:tree:right/:$(git rev-parse base~2:right) + 12:blob:right/c:$(git rev-parse base~2:right/c) + 12:blob:right/c:$(git rev-parse topic:right/c) + 13:blob:right/d:$(git rev-parse base~1:right/d) blobs:10 commits:4 tags:7 @@ -154,19 +154,19 @@ test_expect_success 'branches and indexed objects mix well' ' 1:tree::$(git rev-parse base^{tree}) 1:tree::$(git rev-parse base~1^{tree}) 1:tree::$(git rev-parse base~2^{tree}) - 2:blob:a:$(git rev-parse base~2:a) - 3:tree:right/:$(git rev-parse topic:right) - 3:tree:right/:$(git rev-parse base~1:right) - 3:tree:right/:$(git rev-parse base~2:right) - 4:blob:right/d:$(git rev-parse base~1:right/d) - 4:blob:right/d:$(git rev-parse :right/d) - 5:blob:right/c:$(git rev-parse base~2:right/c) - 5:blob:right/c:$(git rev-parse topic:right/c) - 6:tree:left/:$(git rev-parse base:left) - 6:tree:left/:$(git rev-parse base~2:left) - 7:blob:left/b:$(git rev-parse base:left/b) - 7:blob:left/b:$(git rev-parse base~2:left/b) - 8:tree:a/:$(git rev-parse refs/tags/third:a) + 2:tree:a/:$(git rev-parse refs/tags/third:a) + 3:tree:left/:$(git rev-parse base:left) + 3:tree:left/:$(git rev-parse base~2:left) + 4:blob:left/b:$(git rev-parse base:left/b) + 4:blob:left/b:$(git rev-parse base~2:left/b) + 5:tree:right/:$(git rev-parse topic:right) + 5:tree:right/:$(git rev-parse base~1:right) + 5:tree:right/:$(git rev-parse base~2:right) + 6:blob:right/c:$(git rev-parse base~2:right/c) + 6:blob:right/c:$(git rev-parse topic:right/c) + 7:blob:right/d:$(git rev-parse base~1:right/d) + 7:blob:right/d:$(git rev-parse :right/d) + 8:blob:a:$(git rev-parse base~2:a) blobs:7 commits:4 tags:0 @@ -186,15 +186,15 @@ test_expect_success 'topic only' ' 1:tree::$(git rev-parse topic^{tree}) 1:tree::$(git rev-parse base~1^{tree}) 1:tree::$(git rev-parse base~2^{tree}) - 2:tree:right/:$(git rev-parse topic:right) - 2:tree:right/:$(git rev-parse base~1:right) - 2:tree:right/:$(git rev-parse base~2:right) - 3:blob:right/d:$(git rev-parse base~1:right/d) - 4:blob:right/c:$(git rev-parse base~2:right/c) - 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse base~2:left) - 6:blob:left/b:$(git rev-parse base~2:left/b) - 7:blob:a:$(git rev-parse base~2:a) + 2:blob:a:$(git rev-parse base~2:a) + 3:tree:left/:$(git rev-parse base~2:left) + 4:blob:left/b:$(git rev-parse base~2:left/b) + 5:tree:right/:$(git rev-parse topic:right) + 5:tree:right/:$(git rev-parse base~1:right) + 5:tree:right/:$(git rev-parse base~2:right) + 6:blob:right/c:$(git rev-parse base~2:right/c) + 6:blob:right/c:$(git rev-parse topic:right/c) + 7:blob:right/d:$(git rev-parse base~1:right/d) blobs:5 commits:3 tags:0 @@ -210,12 +210,12 @@ test_expect_success 'topic, not base' ' cat >expect <<-EOF && 0:commit::$(git rev-parse topic) 1:tree::$(git rev-parse topic^{tree}) - 2:tree:right/:$(git rev-parse topic:right) - 3:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING - 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse topic:left):UNINTERESTING - 6:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING - 7:blob:a:$(git rev-parse topic:a):UNINTERESTING + 2:blob:a:$(git rev-parse topic:a):UNINTERESTING + 3:tree:left/:$(git rev-parse topic:left):UNINTERESTING + 4:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING + 5:tree:right/:$(git rev-parse topic:right) + 6:blob:right/c:$(git rev-parse topic:right/c) + 7:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING blobs:4 commits:1 tags:0 @@ -233,12 +233,12 @@ test_expect_success 'fourth, blob-tag2, not base' ' 1:tag:/tags:$(git rev-parse fourth) 2:blob:/tagged-blobs:$(git rev-parse refs/tags/blob-tag2^{}) 3:tree::$(git rev-parse topic^{tree}) - 4:tree:right/:$(git rev-parse topic:right) - 5:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING - 6:blob:right/c:$(git rev-parse topic:right/c) - 7:tree:left/:$(git rev-parse base~1:left):UNINTERESTING - 8:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING - 9:blob:a:$(git rev-parse base~1:a):UNINTERESTING + 4:blob:a:$(git rev-parse base~1:a):UNINTERESTING + 5:tree:left/:$(git rev-parse base~1:left):UNINTERESTING + 6:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING + 7:tree:right/:$(git rev-parse topic:right) + 8:blob:right/c:$(git rev-parse topic:right/c) + 9:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING blobs:5 commits:1 tags:1 @@ -253,10 +253,10 @@ test_expect_success 'topic, not base, only blobs' ' -- topic --not base >out && cat >expect <<-EOF && - 0:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING - 1:blob:right/c:$(git rev-parse topic:right/c) - 2:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING - 3:blob:a:$(git rev-parse topic:a):UNINTERESTING + 0:blob:a:$(git rev-parse topic:a):UNINTERESTING + 1:blob:left/b:$(git rev-parse topic:left/b):UNINTERESTING + 2:blob:right/c:$(git rev-parse topic:right/c) + 3:blob:right/d:$(git rev-parse topic:right/d):UNINTERESTING blobs:4 commits:0 tags:0 @@ -289,8 +289,8 @@ test_expect_success 'topic, not base, only trees' ' cat >expect <<-EOF && 0:tree::$(git rev-parse topic^{tree}) - 1:tree:right/:$(git rev-parse topic:right) - 2:tree:left/:$(git rev-parse topic:left):UNINTERESTING + 1:tree:left/:$(git rev-parse topic:left):UNINTERESTING + 2:tree:right/:$(git rev-parse topic:right) commits:0 blobs:0 tags:0 @@ -308,14 +308,14 @@ test_expect_success 'topic, not base, boundary' ' 0:commit::$(git rev-parse base~1):UNINTERESTING 1:tree::$(git rev-parse topic^{tree}) 1:tree::$(git rev-parse base~1^{tree}):UNINTERESTING - 2:tree:right/:$(git rev-parse topic:right) - 2:tree:right/:$(git rev-parse base~1:right):UNINTERESTING - 3:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING - 4:blob:right/c:$(git rev-parse base~1:right/c):UNINTERESTING - 4:blob:right/c:$(git rev-parse topic:right/c) - 5:tree:left/:$(git rev-parse base~1:left):UNINTERESTING - 6:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING - 7:blob:a:$(git rev-parse base~1:a):UNINTERESTING + 2:blob:a:$(git rev-parse base~1:a):UNINTERESTING + 3:tree:left/:$(git rev-parse base~1:left):UNINTERESTING + 4:blob:left/b:$(git rev-parse base~1:left/b):UNINTERESTING + 5:tree:right/:$(git rev-parse topic:right) + 5:tree:right/:$(git rev-parse base~1:right):UNINTERESTING + 6:blob:right/c:$(git rev-parse base~1:right/c):UNINTERESTING + 6:blob:right/c:$(git rev-parse topic:right/c) + 7:blob:right/d:$(git rev-parse base~1:right/d):UNINTERESTING blobs:5 commits:2 tags:0