From patchwork Thu May 16 18:56:49 2019
X-Patchwork-Submitter: Matthew DeVore
X-Patchwork-Id: 10946993
Date: Thu, 16 May 2019 11:56:49 -0700
Message-Id: <341bc55d4a3f5438b1523525cf683f96d75e8c3e.1558030802.git.matvore@google.com>
References: <20190514001610.GA136746@google.com>
Subject: [RFC PATCH 1/3] list-objects-filter: refactor into a context struct
From: Matthew DeVore
To: jonathantanmy@google.com, jrn@google.com, git@vger.kernel.org, dstolee@microsoft.com, jeffhost@microsoft.com, jrnieder@gmail.com
Cc: Matthew DeVore, matvore@comcast.net

The next patch will create and manage filters in a new way, which means
that this bundle of data will have to be managed at a new callsite.
Make this bundle of data more manageable by putting it in a struct and
making it part of the list-objects-filter module's API.

Signed-off-by: Matthew DeVore
---
 list-objects-filter.c | 156 +++++++++++++++---------------------------
 list-objects-filter.h |  31 ++++++---
 list-objects.c        |  45 ++++++------
 3 files changed, 96 insertions(+), 136 deletions(-)

diff --git a/list-objects-filter.c b/list-objects-filter.c
index ee449de3f7..8e8616b9b8 100644
--- a/list-objects-filter.c
+++ b/list-objects-filter.c
@@ -19,82 +19,60 @@
  * FILTER_SHOWN_BUT_REVISIT -- we set this bit on tree objects
  * that have been shown, but should be revisited if they appear
  * in the traversal (until we mark it SEEN). This is a way to
  * let us silently de-dup calls to show() in the caller. This
  * is subtly different from the "revision.h:SHOWN" and the
  * "sha1-name.c:ONELINE_SEEN" bits. And also different from
  * the non-de-dup usage in pack-bitmap.c
  */
 #define FILTER_SHOWN_BUT_REVISIT (1<<21)
 
-/*
- * A filter for list-objects to omit ALL blobs from the traversal.
- * And to OPTIONALLY collect a list of the omitted OIDs.
- */
-struct filter_blobs_none_data {
-	struct oidset *omits;
-};
-
 static enum list_objects_filter_result filter_blobs_none(
 	struct repository *r,
 	enum list_objects_filter_situation filter_situation,
 	struct object *obj,
 	const char *pathname,
 	const char *filename,
-	void *filter_data_)
+	struct filter_context *ctx)
 {
-	struct filter_blobs_none_data *filter_data = filter_data_;
-
 	switch (filter_situation) {
 	default:
 		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
 		/* always include all tree objects */
 		return LOFR_MARK_SEEN | LOFR_DO_SHOW;
 
 	case LOFS_END_TREE:
 		assert(obj->type == OBJ_TREE);
 		return LOFR_ZERO;
 
 	case LOFS_BLOB:
 		assert(obj->type == OBJ_BLOB);
 		assert((obj->flags & SEEN) == 0);
 
-		if (filter_data->omits)
-			oidset_insert(filter_data->omits, &obj->oid);
+		if (ctx->omits)
+			oidset_insert(ctx->omits, &obj->oid);
 		return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */
 	}
 }
 
-static void *filter_blobs_none__init(
-	struct oidset *omitted,
+static void filter_blobs_none__init(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
-	struct filter_blobs_none_data *d = xcalloc(1, sizeof(*d));
-	d->omits = omitted;
-
-	*filter_fn = filter_blobs_none;
-	*filter_free_fn = free;
-	return d;
+	ctx->filter_fn = filter_blobs_none;
 }
 
-/*
- * A filter for list-objects to omit ALL trees and blobs from the traversal.
- * Can OPTIONALLY collect a list of the omitted OIDs.
- */
+/* A filter for list-objects to omit ALL trees and blobs from the traversal. */
 struct filter_trees_depth_data {
-	struct oidset *omits;
-
 	/*
 	 * Maps trees to the minimum depth at which they were seen. It is not
 	 * necessary to re-traverse a tree at deeper or equal depths than it has
 	 * already been traversed.
 	 *
 	 * We can't use LOFR_MARK_SEEN for tree objects since this will prevent
 	 * it from being traversed at shallower depths.
 	 */
 	struct oidmap seen_at_depth;
 
@@ -103,41 +81,41 @@ struct filter_trees_depth_data {
 };
 
 struct seen_map_entry {
 	struct oidmap_entry base;
 	size_t depth;
 };
 
 /* Returns 1 if the oid was in the omits set before it was invoked. */
 static int filter_trees_update_omits(
 	struct object *obj,
-	struct filter_trees_depth_data *filter_data,
+	struct filter_context *ctx,
 	int include_it)
 {
-	if (!filter_data->omits)
+	if (!ctx->omits)
 		return 0;
 
 	if (include_it)
-		return oidset_remove(filter_data->omits, &obj->oid);
+		return oidset_remove(ctx->omits, &obj->oid);
 	else
-		return oidset_insert(filter_data->omits, &obj->oid);
+		return oidset_insert(ctx->omits, &obj->oid);
 }
 
 static enum list_objects_filter_result filter_trees_depth(
 	struct repository *r,
 	enum list_objects_filter_situation filter_situation,
 	struct object *obj,
 	const char *pathname,
 	const char *filename,
-	void *filter_data_)
+	struct filter_context *ctx)
 {
-	struct filter_trees_depth_data *filter_data = filter_data_;
+	struct filter_trees_depth_data *filter_data = ctx->data;
 	struct seen_map_entry *seen_info;
 	int include_it = filter_data->current_depth <
 		filter_data->exclude_depth;
 	int filter_res;
 	int already_seen;
 
 	/*
 	 * Note that we do not use _MARK_SEEN in order to allow re-traversal in
 	 * case we encounter a tree or blob again at a shallower depth.
 	 */
@@ -145,47 +123,47 @@ static enum list_objects_filter_result filter_trees_depth(
 	switch (filter_situation) {
 	default:
 		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_END_TREE:
 		assert(obj->type == OBJ_TREE);
 		filter_data->current_depth--;
 		return LOFR_ZERO;
 
 	case LOFS_BLOB:
-		filter_trees_update_omits(obj, filter_data, include_it);
+		filter_trees_update_omits(obj, ctx, include_it);
 		return include_it ? LOFR_MARK_SEEN | LOFR_DO_SHOW : LOFR_ZERO;
 
 	case LOFS_BEGIN_TREE:
 		seen_info = oidmap_get(
 			&filter_data->seen_at_depth, &obj->oid);
 		if (!seen_info) {
 			seen_info = xcalloc(1, sizeof(*seen_info));
 			oidcpy(&seen_info->base.oid, &obj->oid);
 			seen_info->depth = filter_data->current_depth;
 			oidmap_put(&filter_data->seen_at_depth, seen_info);
 			already_seen = 0;
 		} else {
 			already_seen =
 				filter_data->current_depth >= seen_info->depth;
 		}
 
 		if (already_seen) {
 			filter_res = LOFR_SKIP_TREE;
 		} else {
 			int been_omitted = filter_trees_update_omits(
-				obj, filter_data, include_it);
+				obj, ctx, include_it);
 			seen_info->depth = filter_data->current_depth;
 
 			if (include_it)
 				filter_res = LOFR_DO_SHOW;
-			else if (filter_data->omits && !been_omitted)
+			else if (ctx->omits && !been_omitted)
 				/*
 				 * Must update omit information of children
 				 * recursively; they have not been omitted yet.
 				 */
 				filter_res = LOFR_ZERO;
 			else
 				filter_res = LOFR_SKIP_TREE;
 		}
 
 		filter_data->current_depth++;
@@ -194,55 +172,48 @@ static enum list_objects_filter_result filter_trees_depth(
 }
 
 static void filter_trees_free(void *filter_data) {
 	struct filter_trees_depth_data *d = filter_data;
 	if (!d)
 		return;
 	oidmap_free(&d->seen_at_depth, 1);
 	free(d);
 }
 
-static void *filter_trees_depth__init(
-	struct oidset *omitted,
+static void filter_trees_depth__init(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
 	struct filter_trees_depth_data *d = xcalloc(1, sizeof(*d));
-	d->omits = omitted;
 	oidmap_init(&d->seen_at_depth, 0);
 	d->exclude_depth = filter_options->tree_exclude_depth;
 	d->current_depth = 0;
 
-	*filter_fn = filter_trees_depth;
-	*filter_free_fn = filter_trees_free;
-	return d;
+	ctx->filter_fn = filter_trees_depth;
+	ctx->free_fn = filter_trees_free;
+	ctx->data = d;
 }
 
-/*
- * A filter for list-objects to omit large blobs.
- * And to OPTIONALLY collect a list of the omitted OIDs.
- */
+/* A filter for list-objects to omit large blobs. */
 struct filter_blobs_limit_data {
-	struct oidset *omits;
 	unsigned long max_bytes;
 };
 
 static enum list_objects_filter_result filter_blobs_limit(
 	struct repository *r,
 	enum list_objects_filter_situation filter_situation,
 	struct object *obj,
 	const char *pathname,
 	const char *filename,
-	void *filter_data_)
+	struct filter_context *ctx)
 {
-	struct filter_blobs_limit_data *filter_data = filter_data_;
+	struct filter_blobs_limit_data *filter_data = ctx->data;
 	unsigned long object_length;
 	enum object_type t;
 
 	switch (filter_situation) {
 	default:
 		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
 		/* always include all tree objects */
@@ -263,44 +234,41 @@ static enum list_objects_filter_result filter_blobs_limit(
 			 * apply the size filter criteria. Be conservative
 			 * and force show it (and let the caller deal with
 			 * the ambiguity).
 			 */
 			goto include_it;
 		}
 
 		if (object_length < filter_data->max_bytes)
 			goto include_it;
 
-		if (filter_data->omits)
-			oidset_insert(filter_data->omits, &obj->oid);
+		if (ctx->omits)
+			oidset_insert(ctx->omits, &obj->oid);
 		return LOFR_MARK_SEEN; /* but not LOFR_DO_SHOW (hard omit) */
 	}
 
 include_it:
-	if (filter_data->omits)
-		oidset_remove(filter_data->omits, &obj->oid);
+	if (ctx->omits)
+		oidset_remove(ctx->omits, &obj->oid);
 	return LOFR_MARK_SEEN | LOFR_DO_SHOW;
 }
 
-static void *filter_blobs_limit__init(
-	struct oidset *omitted,
+static void filter_blobs_limit__init(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
 	struct filter_blobs_limit_data *d = xcalloc(1, sizeof(*d));
-	d->omits = omitted;
 	d->max_bytes = filter_options->blob_limit_value;
 
-	*filter_fn = filter_blobs_limit;
-	*filter_free_fn = free;
-	return d;
+	ctx->filter_fn = filter_blobs_limit;
+	ctx->free_fn = free;
+	ctx->data = d;
 }
 
 /*
  * A filter driven by a sparse-checkout specification to only
  * include blobs that a sparse checkout would populate.
  *
  * The sparse-checkout spec can be loaded from a blob with the
  * given OID or from a local pathname. We allow an OID because
  * the repo may be bare or we may be doing the filtering on the
  * server.
@@ -319,36 +287,35 @@ struct frame {
 	 * omitted objects.
 	 *
 	 * 0 if everything (recursively) contained in this directory
 	 * has been explicitly included (SHOWN) in the result and
 	 * the directory may be short-cut later in the traversal.
 	 */
 	unsigned child_prov_omit : 1;
 };
 
 struct filter_sparse_data {
-	struct oidset *omits;
 	struct exclude_list el;
 
 	size_t nr, alloc;
 	struct frame *array_frame;
 };
 
 static enum list_objects_filter_result filter_sparse(
 	struct repository *r,
 	enum list_objects_filter_situation filter_situation,
 	struct object *obj,
 	const char *pathname,
 	const char *filename,
-	void *filter_data_)
+	struct filter_context *ctx)
 {
-	struct filter_sparse_data *filter_data = filter_data_;
+	struct filter_sparse_data *filter_data = ctx->data;
 	int val, dtype;
 	struct frame *frame;
 
 	switch (filter_situation) {
 	default:
 		BUG("unknown filter_situation: %d", filter_situation);
 
 	case LOFS_BEGIN_TREE:
 		assert(obj->type == OBJ_TREE);
 		dtype = DT_DIR;
@@ -414,128 +381,117 @@ static enum list_objects_filter_result filter_sparse(
 
 		frame = &filter_data->array_frame[filter_data->nr];
 
 		dtype = DT_REG;
 		val = is_excluded_from_list(pathname, strlen(pathname),
 					    filename, &dtype, &filter_data->el,
 					    r->index);
 		if (val < 0)
 			val = frame->defval;
 		if (val > 0) {
-			if (filter_data->omits)
-				oidset_remove(filter_data->omits, &obj->oid);
+			if (ctx->omits)
+				oidset_remove(ctx->omits, &obj->oid);
 			return LOFR_MARK_SEEN | LOFR_DO_SHOW;
 		}
 
 		/*
 		 * Provisionally omit it. We've already established that
 		 * this pathname is not in the sparse-checkout specification
 		 * with the CURRENT pathname, so we *WANT* to omit this blob.
 		 *
 		 * However, a pathname elsewhere in the tree may also
 		 * reference this same blob, so we cannot reject it yet.
 		 * Leave the LOFR_ bits unset so that if the blob appears
 		 * again in the traversal, we will be asked again.
 		 */
-		if (filter_data->omits)
-			oidset_insert(filter_data->omits, &obj->oid);
+		if (ctx->omits)
+			oidset_insert(ctx->omits, &obj->oid);
 
 		/*
 		 * Remember that at least 1 blob in this tree was
 		 * provisionally omitted. This prevents us from short
 		 * cutting the tree in future iterations.
 		 */
 		frame->child_prov_omit = 1;
 		return LOFR_ZERO;
 	}
 }
 
 static void filter_sparse_free(void *filter_data)
 {
 	struct filter_sparse_data *d = filter_data;
 	/* TODO free contents of 'd' */
 	free(d);
 }
 
-static void *filter_sparse_oid__init(
-	struct oidset *omitted,
+static void filter_sparse_oid__init(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
 	struct filter_sparse_data *d = xcalloc(1, sizeof(*d));
-	d->omits = omitted;
 	if (add_excludes_from_blob_to_list(filter_options->sparse_oid_value,
 					   NULL, 0, &d->el) < 0)
 		die("could not load filter specification");
 
 	ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
 	d->array_frame[d->nr].defval = 0; /* default to include */
 	d->array_frame[d->nr].child_prov_omit = 0;
 
-	*filter_fn = filter_sparse;
-	*filter_free_fn = filter_sparse_free;
-	return d;
+	ctx->filter_fn = filter_sparse;
+	ctx->free_fn = filter_sparse_free;
+	ctx->data = d;
 }
 
-static void *filter_sparse_path__init(
-	struct oidset *omitted,
+static void filter_sparse_path__init(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
 	struct filter_sparse_data *d = xcalloc(1, sizeof(*d));
-	d->omits = omitted;
 	if (add_excludes_from_file_to_list(filter_options->sparse_path_value,
 					   NULL, 0, &d->el, NULL) < 0)
 		die("could not load filter specification");
 
 	ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc);
 	d->array_frame[d->nr].defval = 0; /* default to include */
 	d->array_frame[d->nr].child_prov_omit = 0;
 
-	*filter_fn = filter_sparse;
-	*filter_free_fn = filter_sparse_free;
-	return d;
+	ctx->filter_fn = filter_sparse;
+	ctx->free_fn = filter_sparse_free;
+	ctx->data = d;
 }
 
-typedef void *(*filter_init_fn)(
-	struct oidset *omitted,
+typedef void (*filter_init_fn)(
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn);
+	struct filter_context *ctx);
 
 /*
  * Must match "enum list_objects_filter_choice".
  */
 static filter_init_fn s_filters[] = {
 	NULL,
 	filter_blobs_none__init,
 	filter_blobs_limit__init,
 	filter_trees_depth__init,
 	filter_sparse_oid__init,
 	filter_sparse_path__init,
 };
 
-void *list_objects_filter__init(
+void list_objects_filter__init(
 	struct oidset *omitted,
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn)
+	struct filter_context *ctx)
 {
 	filter_init_fn init_fn;
 
 	assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT);
 
 	if (filter_options->choice >= LOFC__COUNT)
 		BUG("invalid list-objects filter choice: %d",
 		    filter_options->choice);
 
+	memset(ctx, 0, sizeof(*ctx));
+	ctx->omits = omitted;
 	init_fn = s_filters[filter_options->choice];
 	if (init_fn)
-		return init_fn(omitted, filter_options,
-			       filter_fn, filter_free_fn);
-	*filter_fn = NULL;
-	*filter_free_fn = NULL;
-	return NULL;
+		init_fn(filter_options, ctx);
 }
diff --git a/list-objects-filter.h b/list-objects-filter.h
index 1d45a4ad57..ee807f5d9b 100644
--- a/list-objects-filter.h
+++ b/list-objects-filter.h
@@ -53,37 +53,46 @@ enum list_objects_filter_result {
 	LOFR_DO_SHOW = 1<<1,
 	LOFR_SKIP_TREE = 1<<2,
 };
 
 enum list_objects_filter_situation {
 	LOFS_BEGIN_TREE,
 	LOFS_END_TREE,
 	LOFS_BLOB
 };
 
-typedef enum list_objects_filter_result (*filter_object_fn)(
-	struct repository *r,
-	enum list_objects_filter_situation filter_situation,
-	struct object *obj,
-	const char *pathname,
-	const char *filename,
-	void *filter_data);
+struct filter_context {
+	enum list_objects_filter_result (*filter_fn)(
+		struct repository *r,
+		enum list_objects_filter_situation filter_situation,
+		struct object *obj,
+		const char *pathname,
+		const char *filename,
+		struct filter_context *ctx);
+	void (*free_fn)(void *filter_data);
 
-typedef void (*filter_free_fn)(void *filter_data);
+	struct oidset *omits;
+	void *data;
+};
 
 /*
  * Constructor for the set of defined list-objects filters.
  * Returns a generic "void *filter_data".
  *
  * The returned "filter_fn" will be used by traverse_commit_list()
  * to filter the results.
  *
  * The returned "filter_free_fn" is a destructor for the
  * filter_data.
  */
-void *list_objects_filter__init(
+void list_objects_filter__init(
 	struct oidset *omitted,
 	struct list_objects_filter_options *filter_options,
-	filter_object_fn *filter_fn,
-	filter_free_fn *filter_free_fn);
+	struct filter_context *ctx);
+
+static inline void list_objects_filter__release(struct filter_context *ctx) {
+	if (ctx->data && ctx->free_fn)
+		ctx->free_fn(ctx->data);
+	memset(ctx, 0, sizeof(*ctx));
+}
 
 #endif /* LIST_OBJECTS_FILTER_H */
diff --git a/list-objects.c b/list-objects.c
index b5651ddd5b..7a73f7deee 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -11,22 +11,21 @@
 #include "list-objects-filter-options.h"
 #include "packfile.h"
 #include "object-store.h"
 #include "trace.h"
 
 struct traversal_context {
 	struct rev_info *revs;
 	show_object_fn show_object;
 	show_commit_fn show_commit;
 	void *show_data;
-	filter_object_fn filter_fn;
-	void *filter_data;
+	struct filter_context filter_ctx;
 };
 
 static void process_blob(struct traversal_context *ctx,
 			 struct blob *blob,
 			 struct strbuf *path,
 			 const char *name)
 {
 	struct object *obj = &blob->object;
 	size_t pathlen;
 	enum list_objects_filter_result r = LOFR_MARK_SEEN | LOFR_DO_SHOW;
@@ -47,25 +46,25 @@ static void process_blob(struct traversal_context *ctx,
 	 * may cause the actual filter to report an incomplete list
 	 * of missing objects.
 	 */
 	if (ctx->revs->exclude_promisor_objects &&
 	    !has_object_file(&obj->oid) &&
 	    is_promisor_object(&obj->oid))
 		return;
 
 	pathlen = path->len;
 	strbuf_addstr(path, name);
-	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn)
-		r = ctx->filter_fn(ctx->revs->repo,
-				   LOFS_BLOB, obj,
-				   path->buf, &path->buf[pathlen],
-				   ctx->filter_data);
+	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_ctx.filter_fn)
+		r = ctx->filter_ctx.filter_fn(ctx->revs->repo,
+					      LOFS_BLOB, obj,
+					      path->buf, &path->buf[pathlen],
+					      &ctx->filter_ctx);
 	if (r & LOFR_MARK_SEEN)
 		obj->flags |= SEEN;
 	if (r & LOFR_DO_SHOW)
 		ctx->show_object(obj, path->buf, ctx->show_data);
 	strbuf_setlen(path, pathlen);
 }
 
 /*
  * Processing a gitlink entry currently does nothing, since
  * we do not recurse into the subproject.
@@ -179,42 +178,42 @@ static void process_tree(struct traversal_context *ctx,
 		 */
 		if (revs->exclude_promisor_objects &&
 		    is_promisor_object(&obj->oid))
 			return;
 
 		if (!revs->do_not_die_on_missing_tree)
 			die("bad tree object %s", oid_to_hex(&obj->oid));
 	}
 
 	strbuf_addstr(base, name);
-	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn)
-		r = ctx->filter_fn(ctx->revs->repo,
-				   LOFS_BEGIN_TREE, obj,
-				   base->buf, &base->buf[baselen],
-				   ctx->filter_data);
+	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_ctx.filter_fn)
+		r = ctx->filter_ctx.filter_fn(ctx->revs->repo,
+					      LOFS_BEGIN_TREE, obj,
+					      base->buf, &base->buf[baselen],
+					      &ctx->filter_ctx);
 	if (r & LOFR_MARK_SEEN)
 		obj->flags |= SEEN;
 	if (r & LOFR_DO_SHOW)
 		ctx->show_object(obj, base->buf, ctx->show_data);
 	if (base->len)
 		strbuf_addch(base, '/');
 
 	if (r & LOFR_SKIP_TREE)
 		trace_printf("Skipping contents of tree %s...\n", base->buf);
 	else if (!failed_parse)
 		process_tree_contents(ctx, tree, base);
 
-	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_fn) {
-		r = ctx->filter_fn(ctx->revs->repo,
-				   LOFS_END_TREE, obj,
-				   base->buf, &base->buf[baselen],
-				   ctx->filter_data);
+	if ((obj->flags & NOT_USER_GIVEN) && ctx->filter_ctx.filter_fn) {
+		r = ctx->filter_ctx.filter_fn(ctx->revs->repo,
+					      LOFS_END_TREE, obj,
+					      base->buf, &base->buf[baselen],
+					      &ctx->filter_ctx);
 		if (r & LOFR_MARK_SEEN)
 			obj->flags |= SEEN;
 		if (r & LOFR_DO_SHOW)
 			ctx->show_object(obj, base->buf, ctx->show_data);
 	}
 
 	strbuf_setlen(base, baselen);
 	free_tree_buffer(tree);
 }
 
@@ -395,38 +394,34 @@ static void do_traverse(struct traversal_context *ctx)
 void traverse_commit_list(struct rev_info *revs,
 			  show_commit_fn show_commit,
 			  show_object_fn show_object,
 			  void *show_data)
 {
 	struct traversal_context ctx;
 	ctx.revs = revs;
 	ctx.show_commit = show_commit;
 	ctx.show_object = show_object;
 	ctx.show_data = show_data;
-	ctx.filter_fn = NULL;
-	ctx.filter_data = NULL;
+	memset(&ctx.filter_ctx, 0, sizeof(ctx.filter_ctx));
 	do_traverse(&ctx);
 }
 
 void traverse_commit_list_filtered(
 	struct list_objects_filter_options *filter_options,
 	struct rev_info *revs,
 	show_commit_fn show_commit,
 	show_object_fn show_object,
 	void *show_data,
 	struct oidset *omitted)
 {
 	struct traversal_context ctx;
-	filter_free_fn filter_free_fn = NULL;
+	memset(&ctx, 0, sizeof(ctx));
 
 	ctx.revs = revs;
 	ctx.show_object = show_object;
 	ctx.show_commit = show_commit;
 	ctx.show_data = show_data;
-	ctx.filter_fn = NULL;
 
-	ctx.filter_data = list_objects_filter__init(omitted, filter_options,
-						    &ctx.filter_fn, &filter_free_fn);
+	list_objects_filter__init(omitted, filter_options, &ctx.filter_ctx);
 	do_traverse(&ctx);
-	if (ctx.filter_data && filter_free_fn)
-		filter_free_fn(ctx.filter_data);
+	list_objects_filter__release(&ctx.filter_ctx);
 }

From patchwork Thu May 16 18:56:50 2019
X-Patchwork-Id: 10946995
Date: Thu, 16 May 2019 11:56:50 -0700
Message-Id: <6f4da02d494323e3ca946b4b20bf78d9dee419e4.1558030802.git.matvore@google.com>
References: <20190514001610.GA136746@google.com>
Subject: [RFC PATCH 2/3] list-objects-filter-options: error is localizeable
From: Matthew DeVore

The "invalid filter-spec" message is user-facing and not a BUG, so make
it localizable.

Signed-off-by: Matthew DeVore
---
 list-objects-filter-options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index c0036f7378..e46ea467bc 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -81,21 +81,21 @@ static int gently_parse_list_objects_filter(
 		filter_options->choice = LOFC_SPARSE_PATH;
 		filter_options->sparse_path_value = strdup(v0);
 		return 0;
 	}
 	/*
 	 * Please update _git_fetch() in git-completion.bash when you
 	 * add new filters
 	 */
 
 	if (errbuf)
-		strbuf_addf(errbuf, "invalid filter-spec '%s'", arg);
+		strbuf_addf(errbuf, _("invalid filter-spec '%s'"), arg);
 
 	memset(filter_options, 0, sizeof(*filter_options));
 	return 1;
 }
 
 int parse_list_objects_filter(struct list_objects_filter_options *filter_options,
 			      const char *arg)
 {
 	struct strbuf buf = STRBUF_INIT;
 	if (gently_parse_list_objects_filter(filter_options, arg, &buf))

From patchwork Thu May 16 18:56:51 2019
X-Patchwork-Id: 10946997
Date: Thu, 16 May 2019 11:56:51 -0700
Message-Id: <02a8c9b017d8df056d7e90aff907d6e0b5506467.1558030802.git.matvore@google.com>
References: <20190514001610.GA136746@google.com>
Subject: [RFC PATCH 3/3] list-objects-filter: implement composite filters
From: Matthew DeVore

Allow combining filters such that only objects accepted by all filters
are shown. The motivation for this is to allow getting directory
listings without also fetching blobs. This can be done by combining
blob:none with tree:<depth>. There are massive repositories that have
larger-than-expected trees, even if you include only a single commit.

The current usage requires passing the filter to rev-list, or sending
it over the wire, as:

	combine:<filter1>+<filter2>

(e.g.: git rev-list --filter=combine:tree:2+blob:limit=32k). This is
potentially awkward because individual filters must be URL-encoded if
they contain + or %. This can potentially be improved by supporting a
repeated flag syntax, e.g.:

	$ git rev-list --filter=tree:2 --filter=blob:limit=32k

Such usage is currently an error, so giving it a meaning is
backwards-compatible.
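
As a minimal sketch of the encoding rule described above, and not the
parser this series adds, the following C helpers percent-encode the '%'
and '+' characters of each sub-filter and join two sub-filters with '+'.
The names encode_sub_filter() and combine_two_filters() are hypothetical
and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of this series): percent-encode only
 * the two reserved characters, '%' and '+'. Returns a malloc'd string
 * that the caller must free.
 */
static char *encode_sub_filter(const char *spec)
{
	/* Worst case: every input byte expands to "%XX". */
	char *out = malloc(3 * strlen(spec) + 1);
	char *p = out;

	assert(out);
	for (; *spec; spec++) {
		if (*spec == '%' || *spec == '+')
			p += sprintf(p, "%%%02X", (unsigned char)*spec);
		else
			*p++ = *spec;
	}
	*p = '\0';
	return out;
}

/*
 * Hypothetical helper: build "combine:<a>+<b>" from two raw
 * sub-filters. Returns a malloc'd string that the caller must free.
 */
static char *combine_two_filters(const char *a, const char *b)
{
	char *ea = encode_sub_filter(a);
	char *eb = encode_sub_filter(b);
	size_t n = strlen("combine:") + strlen(ea) + 1 + strlen(eb) + 1;
	char *out = malloc(n);

	assert(out);
	snprintf(out, n, "combine:%s+%s", ea, eb);
	free(ea);
	free(eb);
	return out;
}
```

With these helpers, combine_two_filters("tree:2", "blob:limit=32k")
produces the combine:tree:2+blob:limit=32k spec quoted above, because
neither sub-filter contains a reserved character. A sub-filter such as
sparse:path=a+b would have its '+' rewritten to %2B before joining.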
Signed-off-by: Matthew DeVore --- Documentation/rev-list-options.txt | 8 ++ contrib/completion/git-completion.bash | 2 +- list-objects-filter-options.c | 130 +++++++++++++++++++++++++ list-objects-filter-options.h | 14 ++- list-objects-filter.c | 116 ++++++++++++++++++++++ t/t6112-rev-list-filters-objects.sh | 108 +++++++++++++++++++- 6 files changed, 373 insertions(+), 5 deletions(-) diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index ddbc1de43f..a2e14d3753 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -730,20 +730,28 @@ specification contained in <path>. + The form '--filter=tree:<depth>' omits all blobs and trees whose depth from the root tree is >= <depth> (minimum depth if an object is located at multiple depths in the commits traversed). <depth>=0 will not include any trees or blobs unless included explicitly in the command-line (or standard input when --stdin is used). <depth>=1 will include only the tree and blobs which are referenced directly by a commit reachable from <commit> or an explicitly-given object. <depth>=2 is like <depth>=1 while also including trees and blobs one more level removed from an explicitly-given commit or tree. ++ +The form '--filter=combine:<filter1>+<filter2>+...<filterN>' combines +several filters. Only objects which are accepted by every filter are +included. Filters are joined by "+" and individual filters are %-encoded +(i.e. URL-encoded). Only the "%" and "+" characters must be encoded. For +instance, 'combine:tree:3+blob:none' and +'combine:tree%3A3+blob%3Anone' are equivalent, and result in only tree +objects whose depth from the root is < 3 being included. --no-filter:: Turn off any previous `--filter=` argument. --filter-print-omitted:: Only useful with `--filter=`; prints a list of the objects omitted by the filter. Object IDs are prefixed with a ``~'' character. --missing=<missing-action>:: A debug option to help with future "partial clone" development.
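The encoding rule in the documentation hunk above (join sub-filters with "+", and %-encode any "%" or "+" inside a sub-filter) can be sketched in shell roughly as follows. This is only an illustration under stated assumptions: `encode_subfilter` and `combine_filters` are hypothetical helper names, not part of this patch or of git, and the sketch only builds the spec string without calling git.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither exists in git or this patch.

# Percent-encode the two characters that are special inside "combine:".
# "%" is handled first so the "%" in "%2B" is not encoded a second time.
encode_subfilter () {
	printf '%s' "$1" | sed -e 's/%/%25/g' -e 's/+/%2B/g'
}

# Join the encoded sub-filters with "+" behind the "combine:" prefix.
combine_filters () {
	spec="combine:"
	sep=""
	for sub in "$@"
	do
		spec="$spec$sep$(encode_subfilter "$sub")"
		sep="+"
	done
	printf '%s\n' "$spec"
}

# Prints: combine:tree:3+blob:limit=40+sparse:path=../pattern1%2Brenamed%25
combine_filters tree:3 blob:limit=40 "sparse:path=../pattern1+renamed%"
```

Assuming a git built with this series, the result could presumably be passed along as `git rev-list --objects --filter="$(combine_filters tree:2 blob:none)" HEAD`. The parser in this patch accepts hex digits in either case, so `%2B` and `%2b` should decode identically.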
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 3eefbabdb1..0fd0a10d0c 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1529,21 +1529,21 @@ _git_difftool () __git_fetch_recurse_submodules="yes on-demand no" _git_fetch () { case "$cur" in --recurse-submodules=*) __gitcomp "$__git_fetch_recurse_submodules" "" "${cur##--recurse-submodules=}" return ;; --filter=*) - __gitcomp "blob:none blob:limit= sparse:oid= sparse:path=" "" "${cur##--filter=}" + __gitcomp "blob:none blob:limit= sparse:oid= sparse:path= combine: tree:" "" "${cur##--filter=}" return ;; --*) __gitcomp_builtin fetch return ;; esac __git_complete_remote_or_refspec } diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index e46ea467bc..e9d0de17e0 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -1,19 +1,24 @@ #include "cache.h" #include "commit.h" #include "config.h" #include "revision.h" #include "argv-array.h" #include "list-objects.h" #include "list-objects-filter.h" #include "list-objects-filter-options.h" +static int parse_combine_filter( + struct list_objects_filter_options *filter_options, + const char *arg, + struct strbuf *errbuf); + /* * Parse value of the argument to the "filter" keyword. * On the command line this looks like: * --filter= * and in the pack protocol as: * "filter" SP * * The filter keyword will be used by many commands. * See Documentation/rev-list-options.txt for allowed values for . 
* @@ -74,33 +79,152 @@ static int gently_parse_list_objects_filter( if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB, &sparse_oid, &oc)) filter_options->sparse_oid_value = oiddup(&sparse_oid); filter_options->choice = LOFC_SPARSE_OID; return 0; } else if (skip_prefix(arg, "sparse:path=", &v0)) { filter_options->choice = LOFC_SPARSE_PATH; filter_options->sparse_path_value = strdup(v0); return 0; + + } else if (skip_prefix(arg, "combine:", &v0)) { + int sub_parse_res = parse_combine_filter( + filter_options, v0, errbuf); + if (sub_parse_res) + return sub_parse_res; + return 0; + } /* * Please update _git_fetch() in git-completion.bash when you * add new filters */ if (errbuf) strbuf_addf(errbuf, _("invalid filter-spec '%s'"), arg); memset(filter_options, 0, sizeof(*filter_options)); return 1; } +static int digit_value(int c, struct strbuf *errbuf) { + if (c >= '0' && c <= '9') + return c - '0'; + if (c >= 'a' && c <= 'f') + return c - 'a' + 10; + if (c >= 'A' && c <= 'F') + return c - 'A' + 10; + + if (!errbuf) + return -1; + + strbuf_addf(errbuf, _("error in filter-spec - ")); + if (c) + strbuf_addf( + errbuf, + _("expect two hex digits after %%, but got: '%c'"), + c); + else + strbuf_addf( + errbuf, + _("not enough hex digits after %%; expected two")); + + return -1; +} + +static int url_decode(struct strbuf *s, struct strbuf *errbuf) { + char *dest = s->buf; + char *src = s->buf; + while (*src) { + int digit_value_0, digit_value_1; + + if (src[0] != '%') { + *dest++ = *src++; + continue; + } + src++; + + digit_value_0 = digit_value(*src++, errbuf); + if (digit_value_0 < 0) + return 1; + digit_value_1 = digit_value(*src++, errbuf); + if (digit_value_1 < 0) + return 1; + *dest++ = digit_value_0 * 16 + digit_value_1; + } + size_t new_len = dest - s->buf; + strbuf_remove(s, new_len, s->len - new_len); + + return 0; +} + +static int parse_combine_filter( + struct list_objects_filter_options *filter_options, + const char *arg, + struct strbuf *errbuf) +{ + 
struct strbuf **sub_specs = strbuf_split_str(arg, '+', 2); + int result; + + if (!sub_specs[0]) { + if (errbuf) + strbuf_addf(errbuf, + _("expected something after combine:")); + result = 1; + goto cleanup; + } + + /* + * Only decode the first sub-filter, since the rest will be decoded on + * the recursive call. + */ + result = url_decode(sub_specs[0], errbuf); + if (result) + goto cleanup; + + if (!sub_specs[1]) { + /* + * There is only one sub-filter, so we don't need the + * combine: - just parse it as a non-composite filter. + */ + result = gently_parse_list_objects_filter( + filter_options, sub_specs[0]->buf, errbuf); + goto cleanup; + } + + /* Remove trailing "+" so we can parse it. */ + assert(sub_specs[0]->buf[sub_specs[0]->len - 1] == '+'); + strbuf_remove(sub_specs[0], sub_specs[0]->len - 1, 1); + + filter_options->choice = LOFC_COMBINE; + filter_options->lhs = xcalloc(1, sizeof(*filter_options->lhs)); + result = gently_parse_list_objects_filter(filter_options->lhs, + sub_specs[0]->buf, + errbuf); + if (result) + goto cleanup; + + filter_options->rhs = xcalloc(1, sizeof(*filter_options->rhs)); + result = parse_combine_filter(filter_options->rhs, + sub_specs[1]->buf, + errbuf); + +cleanup: + strbuf_list_free(sub_specs); + if (result) { + list_objects_filter_release(filter_options); + memset(filter_options, 0, sizeof(*filter_options)); + } + return result; +} + int parse_list_objects_filter(struct list_objects_filter_options *filter_options, const char *arg) { struct strbuf buf = STRBUF_INIT; if (gently_parse_list_objects_filter(filter_options, arg, &buf)) die("%s", buf.buf); return 0; } int opt_parse_list_objects_filter(const struct option *opt, @@ -127,23 +251,29 @@ void expand_list_objects_filter_spec( else if (filter->choice == LOFC_TREE_DEPTH) strbuf_addf(expanded_spec, "tree:%lu", filter->tree_exclude_depth); else strbuf_addstr(expanded_spec, filter->filter_spec); } void list_objects_filter_release( struct list_objects_filter_options *filter_options) 
{ + if (!filter_options) + return; free(filter_options->filter_spec); free(filter_options->sparse_oid_value); free(filter_options->sparse_path_value); + list_objects_filter_release(filter_options->lhs); + free(filter_options->lhs); + list_objects_filter_release(filter_options->rhs); + free(filter_options->rhs); memset(filter_options, 0, sizeof(*filter_options)); } void partial_clone_register( const char *remote, const struct list_objects_filter_options *filter_options) { /* * Record the name of the partial clone remote in the * config and in the global variable -- the latter is diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index e3adc78ebf..6c0f0ecd08 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -7,20 +7,21 @@ /* * The list of defined filters for list-objects. */ enum list_objects_filter_choice { LOFC_DISABLED = 0, LOFC_BLOB_NONE, LOFC_BLOB_LIMIT, LOFC_TREE_DEPTH, LOFC_SPARSE_OID, LOFC_SPARSE_PATH, + LOFC_COMBINE, LOFC__COUNT /* must be last */ }; struct list_objects_filter_options { /* * 'filter_spec' is the raw argument value given on the command line * or protocol request. (The part after the "--keyword=".) For * commands that launch filtering sub-processes, or for communication * over the network, don't use this value; use the result of * expand_list_objects_filter_spec() instead. @@ -32,28 +33,35 @@ struct list_objects_filter_options { * the filtering algorithm to use. */ enum list_objects_filter_choice choice; /* * Choice is LOFC_DISABLED because "--no-filter" was requested. */ unsigned int no_filter : 1; /* - * Parsed values (fields) from within the filter-spec. These are - * choice-specific; not all values will be defined for any given - * choice. + * BEGIN choice-specific parsed values from within the filter-spec. Only + * some values will be defined for any given choice. 
*/ + struct object_id *sparse_oid_value; char *sparse_path_value; unsigned long blob_limit_value; unsigned long tree_exclude_depth; + + /* LOFC_COMBINE values */ + struct list_objects_filter_options *lhs, *rhs; + + /* + * END choice-specific parsed values. + */ }; /* Normalized command line arguments */ #define CL_ARG__FILTER "filter" int parse_list_objects_filter( struct list_objects_filter_options *filter_options, const char *arg); int opt_parse_list_objects_filter(const struct option *opt, diff --git a/list-objects-filter.c b/list-objects-filter.c index 8e8616b9b8..28496f31f7 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -453,34 +453,150 @@ static void filter_sparse_path__init( ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc); d->array_frame[d->nr].defval = 0; /* default to include */ d->array_frame[d->nr].child_prov_omit = 0; ctx->filter_fn = filter_sparse; ctx->free_fn = filter_sparse_free; ctx->data = d; } +struct filter_combine_data { + /* sub[0] corresponds to lhs, sub[1] to rhs. 
*/ + struct { + struct filter_context ctx; + struct oidset seen; + struct object_id skip_tree; + unsigned is_skipping_tree : 1; + } sub[2]; +}; + +static void filter_combine_free(void *filter_data) +{ + struct filter_combine_data *d = filter_data; + int i; + for (i = 0; i < 2; i++) { + list_objects_filter__release(&d->sub[i].ctx); + oidset_clear(&d->sub[i].seen); + } + free(d); +} + +static int should_delegate(enum list_objects_filter_situation filter_situation, + struct object *obj, + struct filter_combine_data *d, + int side) +{ + if (!d->sub[side].is_skipping_tree) + return 1; + if (filter_situation == LOFS_END_TREE && + oideq(&obj->oid, &d->sub[side].skip_tree)) { + d->sub[side].is_skipping_tree = 0; + return 1; + } + return 0; +} + +static enum list_objects_filter_result filter_combine( + struct repository *r, + enum list_objects_filter_situation filter_situation, + struct object *obj, + const char *pathname, + const char *filename, + struct filter_context *ctx) +{ + struct filter_combine_data *d = ctx->data; + enum list_objects_filter_result lhs_result = LOFR_ZERO; + enum list_objects_filter_result rhs_result = LOFR_ZERO; + int lhs_already_seen = oidset_contains(&d->sub[0].seen, &obj->oid); + int rhs_already_seen = oidset_contains(&d->sub[1].seen, &obj->oid); + int delegate_lhs = !lhs_already_seen && + should_delegate(filter_situation, obj, d, 0); + int delegate_rhs = !rhs_already_seen && + should_delegate(filter_situation, obj, d, 1); + enum list_objects_filter_result combined_result = LOFR_ZERO; + + if (lhs_already_seen && rhs_already_seen) + return LOFR_ZERO; + + if (delegate_lhs) + lhs_result = d->sub[0].ctx.filter_fn( + r, filter_situation, obj, pathname, filename, + &d->sub[0].ctx); + if (delegate_rhs) + rhs_result = d->sub[1].ctx.filter_fn( + r, filter_situation, obj, pathname, filename, + &d->sub[1].ctx); + + if (lhs_result & LOFR_MARK_SEEN) + oidset_insert(&d->sub[0].seen, &obj->oid); + + if (rhs_result & LOFR_MARK_SEEN) + 
oidset_insert(&d->sub[1].seen, &obj->oid); + + if (lhs_result & LOFR_SKIP_TREE) { + d->sub[0].is_skipping_tree = 1; + d->sub[0].skip_tree = obj->oid; + } + if (rhs_result & LOFR_SKIP_TREE) { + d->sub[1].is_skipping_tree = 1; + d->sub[1].skip_tree = obj->oid; + } + + if ((lhs_result & LOFR_DO_SHOW) && (rhs_result & LOFR_DO_SHOW)) + combined_result |= LOFR_DO_SHOW; + if (d->sub[0].is_skipping_tree && d->sub[1].is_skipping_tree) + combined_result |= LOFR_SKIP_TREE; + + return combined_result; +} + +static void filter_combine__init( + struct list_objects_filter_options *filter_options, + struct filter_context *ctx) +{ + struct filter_combine_data *d = xcalloc(1, sizeof(*d)); + struct oidset *lhs_omits = NULL; + struct oidset *rhs_omits = NULL; + + if (ctx->omits) { + lhs_omits = xcalloc(1, sizeof(*lhs_omits)); + oidset_init(lhs_omits, 16); + rhs_omits = xcalloc(1, sizeof(*rhs_omits)); + oidset_init(rhs_omits, 16); + } + + list_objects_filter__init(lhs_omits, filter_options->lhs, + &d->sub[0].ctx); + list_objects_filter__init(rhs_omits, filter_options->rhs, + &d->sub[1].ctx); + + ctx->filter_fn = filter_combine; + ctx->free_fn = filter_combine_free; + ctx->data = d; +} + typedef void (*filter_init_fn)( struct list_objects_filter_options *filter_options, struct filter_context *ctx); /* * Must match "enum list_objects_filter_choice". 
*/ static filter_init_fn s_filters[] = { NULL, filter_blobs_none__init, filter_blobs_limit__init, filter_trees_depth__init, filter_sparse_oid__init, filter_sparse_path__init, + filter_combine__init, }; void list_objects_filter__init( struct oidset *omitted, struct list_objects_filter_options *filter_options, struct filter_context *ctx) { filter_init_fn init_fn; assert((sizeof(s_filters) / sizeof(s_filters[0])) == LOFC__COUNT); diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh index 9c11427719..63c1524f6f 100755 --- a/t/t6112-rev-list-filters-objects.sh +++ b/t/t6112-rev-list-filters-objects.sh @@ -284,21 +284,33 @@ test_expect_success 'verify tree:0 includes trees in "filtered" output' ' # Make sure tree:0 does not iterate through any trees. test_expect_success 'verify skipping tree iteration when not collecting omits' ' GIT_TRACE=1 git -C r3 rev-list \ --objects --filter=tree:0 HEAD 2>filter_trace && grep "Skipping contents of tree [.][.][.]" filter_trace >actual && # One line for each commit traversed. test_line_count = 2 actual && # Make sure no other trees were considered besides the root. - ! grep "Skipping contents of tree [^.]" filter_trace + ! grep "Skipping contents of tree [^.]" filter_trace && + + # Try this again with "combine:". If both sub-filters are skipping + # trees, the composite filter should also skip trees. This is not + # important unless the user does combine:tree:X+tree:Y or another filter + # besides "tree:" is implemented in the future which can skip trees. + GIT_TRACE=1 git -C r3 rev-list \ + --objects --filter=combine:tree:1+tree:3 HEAD 2>filter_trace && + + # Only skip the dir1/ tree, which is shared between the two commits. + grep "Skipping contents of tree " filter_trace >actual && + test_write_lines "Skipping contents of tree dir1/..." >expected && + test_cmp expected actual ' # Test tree:# filters. 
expect_has () { commit=$1 && name=$2 && hash=$(git -C r3 rev-parse $commit:$name) && grep "^$hash $name$" actual @@ -336,20 +348,114 @@ test_expect_success 'verify tree:3 includes everything expected' ' expect_has HEAD dir1/sparse1 && expect_has HEAD dir1/sparse2 && expect_has HEAD pattern && expect_has HEAD sparse1 && expect_has HEAD sparse2 && # There are also 2 commit objects test_line_count = 10 actual ' +test_expect_success 'combine:... for a simple combination' ' + git -C r3 rev-list --objects --filter=combine:tree:2+blob:none HEAD \ + >actual && + + expect_has HEAD "" && + expect_has HEAD~1 "" && + expect_has HEAD dir1 && + + # There are also 2 commit objects + test_line_count = 5 actual +' + +test_expect_success 'combine:... with URL encoding' ' + git -C r3 rev-list --objects \ + --filter=combine:tree%3a2+blob:%6Eon%65 HEAD >actual && + + expect_has HEAD "" && + expect_has HEAD~1 "" && + expect_has HEAD dir1 && + + # There are also 2 commit objects + test_line_count = 5 actual +' + +expect_invalid_filter_spec () { + spec="$1" && + err="$2" && + + test_must_fail git -C r3 rev-list --objects --filter="$spec" HEAD \ + >actual 2>actual_stderr && + test_must_be_empty actual && + test_i18ngrep "$err" actual_stderr && cat actual_stderr >/dev/stderr +} + +test_expect_success 'combine:... while URL-encoding things that should not be' ' + expect_invalid_filter_spec combine%3Atree:2+blob:none \ + "invalid filter-spec" +' + +test_expect_success 'combine:... with invalid URL-encoded sequences' ' + expect_invalid_filter_spec combine:tree:2+blob:non%a \ + "error in filter-spec - not enough hex digits after %" && + # Edge cases for non-hex chars: "Gg/:@`" + expect_invalid_filter_spec combine:tree:2+blob%G5none \ + "error in filter-spec - expect two hex digits .*: .G." && + expect_invalid_filter_spec combine:tree:2+blob%5/none \ + "error in filter-spec - expect two hex digits .*: ./." 
&& + expect_invalid_filter_spec combine:%:5tree:2+blob:none \ + "error in filter-spec - expect two hex digits .*: .:." && + expect_invalid_filter_spec combine:tree:2+blob:none%f@ \ + "error in filter-spec - expect two hex digits .*: .@." && + expect_invalid_filter_spec combine:tree:2+blob:none%f\` \ + "error in filter-spec - expect two hex digits .*: .\`." +' + +test_expect_success 'combine:... with edge-case hex digits: Ff Aa 0 9' ' + git -C r3 rev-list --objects --filter="combine:tree:2+bl%6Fb:n%6fne" \ + HEAD >actual && + test_line_count = 5 actual && + git -C r3 rev-list --objects --filter="combine:tree%3A2+blob%3anone" \ + HEAD >actual && + test_line_count = 5 actual && + git -C r3 rev-list --objects --filter="combine:tree:%30" HEAD >actual && + test_line_count = 2 actual && + git -C r3 rev-list --objects --filter="combine:tree:%39+blob:none" \ + HEAD >actual && + test_line_count = 5 actual +' + +test_expect_success 'combine:... with more than two sub-filters' ' + git -C r3 rev-list --objects \ + --filter=combine:tree:3+blob:limit=40+sparse:path=../pattern1 \ + HEAD >actual && + + expect_has HEAD "" && + expect_has HEAD~1 "" && + expect_has HEAD dir1 && + expect_has HEAD dir1/sparse1 && + expect_has HEAD dir1/sparse2 && + + # Should also have 2 commits + test_line_count = 7 actual && + + # Try again, this time making sure the last sub-filter is only + # URL-decoded once. + cp pattern1 pattern1+renamed% && + cp actual expect && + + git -C r3 rev-list --objects \ + --filter=combine:tree:3+blob:limit=40+sparse:path=../pattern1%2brenamed%25 \ + HEAD >actual && + test_cmp expect actual +' + # Test provisional omit collection logic with a repo that has objects appearing # at multiple depths - first deeper than the filter's threshold, then shallow. test_expect_success 'setup r4' ' git init r4 && echo foo > r4/foo && mkdir r4/subdir && echo bar > r4/subdir/bar &&