From patchwork Sun Sep 15 01:11:16 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11145809 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F07814E5 for ; Sun, 15 Sep 2019 01:11:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ED6B620692 for ; Sun, 15 Sep 2019 01:11:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727158AbfIOBLR (ORCPT ); Sat, 14 Sep 2019 21:11:17 -0400 Received: from cloud.peff.net ([104.130.231.41]:50538 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726791AbfIOBLR (ORCPT ); Sat, 14 Sep 2019 21:11:17 -0400 Received: (qmail 23232 invoked by uid 109); 15 Sep 2019 01:11:17 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Sun, 15 Sep 2019 01:11:17 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 28888 invoked by uid 111); 15 Sep 2019 01:13:21 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sat, 14 Sep 2019 21:13:21 -0400 Authentication-Results: peff.net; auth=none Date: Sat, 14 Sep 2019 21:11:16 -0400 From: Jeff King To: Jeff Hostetler Cc: Junio C Hamano , Jeff Hostetler , Jon Simons , git@vger.kernel.org, me@ttaylorr.com, sunshine@sunshineco.com, stolee@gmail.com Subject: [PATCH 1/3] t5616: test cloning/fetching with sparse:oid= filter Message-ID: <20190915011115.GA11208@sigill.intra.peff.net> References: <20190915010942.GA19787@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190915010942.GA19787@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jon Simons We test in t5317 that "sparse:oid" filters work with rev-list, but there's no coverage at all confirming that they work with a fetch or clone (and in fact, there are several bugs). Let's do a basic test that a clone fetches the correct objects. [jk: extracted from Jon's earlier fix patches. I also simplified the setup down to a single sparse file, and I added checks that we got the right blobs] Signed-off-by: Jon Simons Signed-off-by: Jeff King --- t/t5616-partial-clone.sh | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 565254558f..0bdbc819f1 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -241,6 +241,42 @@ test_expect_success 'fetch what is specified on CLI even if already promised' ' ! grep "?$(cat blob)" missing_after ' +test_expect_success 'setup src repo for sparse filter' ' + git init sparse-src && + git -C sparse-src config --local uploadpack.allowfilter 1 && + git -C sparse-src config --local uploadpack.allowanysha1inwant 1 && + test_commit -C sparse-src one && + test_commit -C sparse-src two && + echo /one.t >sparse-src/only-one && + git -C sparse-src add . && + git -C sparse-src commit -m "add sparse checkout files" +' + +test_expect_failure 'partial clone with sparse filter succeeds' ' + rm -rf dst.git && + git clone --no-local --bare \ + --filter=sparse:oid=master:only-one \ + sparse-src dst.git && + ( + cd dst.git && + git rev-list --objects --missing=print HEAD >out && + grep "^$(git rev-parse HEAD:one.t)" out && + grep "^?$(git rev-parse HEAD:two.t)" out + ) +' + +test_expect_failure 'partial clone with unresolvable sparse filter fails cleanly' ' + rm -rf dst.git && + test_must_fail git clone --no-local --bare \ + --filter=sparse:oid=master:no-such-name \ + sparse-src dst.git 2>err && + test_i18ngrep "unable to access sparse blob in .master:no-such-name" err && + test_must_fail git clone --no-local --bare \ + --filter=sparse:oid=master \ + sparse-src dst.git 2>err && + test_i18ngrep "could not load filter specification" err +' + . "$TEST_DIRECTORY"/lib-httpd.sh start_httpd From patchwork Sun Sep 15 01:13:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11145811 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7BAC414E5 for ; Sun, 15 Sep 2019 01:13:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5C28D20692 for ; Sun, 15 Sep 2019 01:13:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727163AbfIOBNW (ORCPT ); Sat, 14 Sep 2019 21:13:22 -0400 Received: from cloud.peff.net ([104.130.231.41]:50562 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1725805AbfIOBNV (ORCPT ); Sat, 14 Sep 2019 21:13:21 -0400 Received: (qmail 23292 invoked by uid 109); 15 Sep 2019 01:13:21 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Sun, 15 Sep 2019 01:13:21 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 28913 invoked by uid 111); 15 Sep 2019 01:15:25 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sat, 14 Sep 2019 21:15:25 -0400 Authentication-Results: peff.net; auth=none Date: Sat, 14 Sep 2019 21:13:20 -0400 From: Jeff King To: Jeff Hostetler Cc: Junio C Hamano , Jeff Hostetler , Jon Simons , git@vger.kernel.org, me@ttaylorr.com, sunshine@sunshineco.com, stolee@gmail.com Subject: [PATCH 2/3] list-objects-filter: delay parsing of sparse oid Message-ID: <20190915011319.GB11208@sigill.intra.peff.net> References: <20190915010942.GA19787@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190915010942.GA19787@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The list-objects-filter code has two steps to its initialization: 1. parse_list_objects_filter() makes sure the spec is a filter we know about and is syntactically correct. This step is done by "rev-list" or "upload-pack" that is going to apply a filter, but also by "git clone" or "git fetch" before they send the spec across the wire. 2. list_objects_filter__init() runs the type-specific initialization (using function pointers established in step 1). This happens at the start of traverse_commit_list_filtered(), when we're about to actually use the filter. It's a good idea to parse as much as we can in step 1, in order to catch problems early (e.g., a blob size limit that isn't a number). But one thing we _shouldn't_ do is resolve any oids at that step (e.g., for sparse-file contents specified by oid). In the case of a fetch, the oid has to be resolved on the remote side. The current code does resolve the oid during the parse phase, but ignores any error (which we must do, because we might just be sending the spec across the wire). This leads to two bugs: - if we're not in a repository (e.g., because it's git-clone parsing the spec), then we trigger a BUG() trying to resolve the name - if we did hit the error case, we still have to notice that later and bail. The code path in rev-list handles this, but the one in upload-pack does not, leading to a segfault. We can fix both by moving the oid resolution into the sparse-oid init function. At that point we know we have a repository (because we're about to traverse), and handling the error there fixes the segfault. As a bonus, we can drop the NULL sparse_oid_value check in rev-list, since this is now handled in the sparse-oid-filter init function. Signed-off-by: Jeff King Signed-off-by: Jeff King --- builtin/rev-list.c | 4 ---- list-objects-filter-options.c | 14 ++------------ list-objects-filter-options.h | 2 +- list-objects-filter.c | 11 +++++++++-- t/t5616-partial-clone.sh | 4 ++-- 5 files changed, 14 insertions(+), 21 deletions(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 301ccb970b..74dbfad73d 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -471,10 +471,6 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) parse_list_objects_filter(&filter_options, arg); if (filter_options.choice && !revs.blob_objects) die(_("object filtering requires --objects")); - if (filter_options.choice == LOFC_SPARSE_OID && - !filter_options.sparse_oid_value) - die(_("invalid sparse value '%s'"), - filter_options.filter_spec); continue; } if (!strcmp(arg, ("--no-" CL_ARG__FILTER))) { diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c index 1cb20c659c..43f41f446c 100644 --- a/list-objects-filter-options.c +++ b/list-objects-filter-options.c @@ -63,17 +63,8 @@ static int gently_parse_list_objects_filter( return 0; } else if (skip_prefix(arg, "sparse:oid=", &v0)) { - struct object_context oc; - struct object_id sparse_oid; - - /* - * Try to parse into an OID for the current - * command, but DO NOT complain if we don't have the blob or - * ref locally. - */ - if (!get_oid_with_context(the_repository, v0, GET_OID_BLOB, - &sparse_oid, &oc)) - filter_options->sparse_oid_value = oiddup(&sparse_oid); + /* v0 points into filter_options->filter_spec; no allocation needed */ + filter_options->sparse_oid_name = v0; filter_options->choice = LOFC_SPARSE_OID; return 0; @@ -138,7 +129,6 @@ void list_objects_filter_release( struct list_objects_filter_options *filter_options) { free(filter_options->filter_spec); - free(filter_options->sparse_oid_value); memset(filter_options, 0, sizeof(*filter_options)); } diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h index c54f0000fb..a819e42ffb 100644 --- a/list-objects-filter-options.h +++ b/list-objects-filter-options.h @@ -42,7 +42,7 @@ struct list_objects_filter_options { * choice-specific; not all values will be defined for any given * choice. */ - struct object_id *sparse_oid_value; + const char *sparse_oid_name; unsigned long blob_limit_value; unsigned long tree_exclude_depth; }; diff --git a/list-objects-filter.c b/list-objects-filter.c index 36e1f774bc..d2cdc03a73 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -463,9 +463,16 @@ static void *filter_sparse_oid__init( filter_free_fn *filter_free_fn) { struct filter_sparse_data *d = xcalloc(1, sizeof(*d)); + struct object_context oc; + struct object_id sparse_oid; + + if (get_oid_with_context(the_repository, + filter_options->sparse_oid_name, + GET_OID_BLOB, &sparse_oid, &oc)) + die("unable to access sparse blob in '%s'", + filter_options->sparse_oid_name); d->omits = omitted; - if (add_excludes_from_blob_to_list(filter_options->sparse_oid_value, - NULL, 0, &d->el) < 0) + if (add_excludes_from_blob_to_list(&sparse_oid, NULL, 0, &d->el) < 0) die("could not load filter specification"); ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc); diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 0bdbc819f1..84365b802a 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -252,7 +252,7 @@ test_expect_success 'setup src repo for sparse filter' ' git -C sparse-src commit -m "add sparse checkout files" ' -test_expect_failure 'partial clone with sparse filter succeeds' ' +test_expect_success 'partial clone with sparse filter succeeds' ' rm -rf dst.git && git clone --no-local --bare \ --filter=sparse:oid=master:only-one \ @@ -265,7 +265,7 @@ test_expect_failure 'partial clone with sparse filter succeeds' ' ) ' -test_expect_failure 'partial clone with unresolvable sparse filter fails cleanly' ' +test_expect_success 'partial clone with unresolvable sparse filter fails cleanly' ' rm -rf dst.git && test_must_fail git clone --no-local --bare \ --filter=sparse:oid=master:no-such-name \ From patchwork Sun Sep 15 01:13:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11145813 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9AA3414E5 for ; Sun, 15 Sep 2019 01:13:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8572720692 for ; Sun, 15 Sep 2019 01:13:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727182AbfIOBNs (ORCPT ); Sat, 14 Sep 2019 21:13:48 -0400 Received: from cloud.peff.net ([104.130.231.41]:50582 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1725805AbfIOBNs (ORCPT ); Sat, 14 Sep 2019 21:13:48 -0400 Received: (qmail 23320 invoked by uid 109); 15 Sep 2019 01:13:48 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Sun, 15 Sep 2019 01:13:48 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 28938 invoked by uid 111); 15 Sep 2019 01:15:52 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sat, 14 Sep 2019 21:15:52 -0400 Authentication-Results: peff.net; auth=none Date: Sat, 14 Sep 2019 21:13:47 -0400 From: Jeff King To: Jeff Hostetler Cc: Junio C Hamano , Jeff Hostetler , Jon Simons , git@vger.kernel.org, me@ttaylorr.com, sunshine@sunshineco.com, stolee@gmail.com Subject: [PATCH 3/3] list-objects-filter: give a more specific error sparse parsing error Message-ID: <20190915011347.GC11208@sigill.intra.peff.net> References: <20190915010942.GA19787@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190915010942.GA19787@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Jon Simons The sparse:oid filter has two error modes: we might fail to resolve the name to an OID, or we might fail to parse the contents of that OID. In the latter case, let's give a less generic error message, and mention the OID we did find. While we're here, let's also mark both messages as translatable. Signed-off-by: Jeff King --- list-objects-filter.c | 5 +++-- t/t5616-partial-clone.sh | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/list-objects-filter.c b/list-objects-filter.c index d2cdc03a73..50f0c6d07b 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -469,11 +469,12 @@ static void *filter_sparse_oid__init( if (get_oid_with_context(the_repository, filter_options->sparse_oid_name, GET_OID_BLOB, &sparse_oid, &oc)) - die("unable to access sparse blob in '%s'", + die(_("unable to access sparse blob in '%s'"), filter_options->sparse_oid_name); d->omits = omitted; if (add_excludes_from_blob_to_list(&sparse_oid, NULL, 0, &d->el) < 0) - die("could not load filter specification"); + die(_("unable to parse sparse filter data in %s"), + oid_to_hex(&sparse_oid)); ALLOC_GROW(d->array_frame, d->nr + 1, d->alloc); d->array_frame[d->nr].defval = 0; /* default to include */ diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh index 84365b802a..b85d3f5e4c 100755 --- a/t/t5616-partial-clone.sh +++ b/t/t5616-partial-clone.sh @@ -274,7 +274,7 @@ test_expect_success 'partial clone with unresolvable sparse filter fails cleanly test_must_fail git clone --no-local --bare \ --filter=sparse:oid=master \ sparse-src dst.git 2>err && - test_i18ngrep "could not load filter specification" err + test_i18ngrep "unable to parse sparse filter data in" err ' . "$TEST_DIRECTORY"/lib-httpd.sh From patchwork Sun Sep 15 16:51:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 11146005 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D7FB714DB for ; Sun, 15 Sep 2019 16:52:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C0293214D8 for ; Sun, 15 Sep 2019 16:52:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726039AbfIOQv6 (ORCPT ); Sun, 15 Sep 2019 12:51:58 -0400 Received: from cloud.peff.net ([104.130.231.41]:50780 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1725904AbfIOQv6 (ORCPT ); Sun, 15 Sep 2019 12:51:58 -0400 Received: (qmail 402 invoked by uid 109); 15 Sep 2019 16:51:58 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Sun, 15 Sep 2019 16:51:57 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 32088 invoked by uid 111); 15 Sep 2019 16:54:03 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 15 Sep 2019 12:54:03 -0400 Authentication-Results: peff.net; auth=none Date: Sun, 15 Sep 2019 12:51:56 -0400 From: Jeff King To: Jeff Hostetler Cc: Junio C Hamano , Jeff Hostetler , Jon Simons , git@vger.kernel.org, me@ttaylorr.com, sunshine@sunshineco.com, stolee@gmail.com Subject: [PATCH 4/3] list-objects-filter: use empty string instead of NULL for sparse "base" Message-ID: <20190915165156.GA28436@sigill.intra.peff.net> References: <20190829231925.15223-1-jon@jonsimons.org> <20190829231925.15223-2-jon@jonsimons.org> <20190904045424.GA6488@sigill.intra.peff.net> <20190909170823.GA30470@sigill.intra.peff.net> <20190915010942.GA19787@sigill.intra.peff.net> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20190915010942.GA19787@sigill.intra.peff.net> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Sat, Sep 14, 2019 at 09:09:42PM -0400, Jeff King wrote: > On Mon, Sep 09, 2019 at 01:08:24PM -0400, Jeff King wrote: > > > I'll work up what I sent earlier into a real patch, and include some of > > this discussion. > > Here it is. I pulled Jon's tests out into their own patch (mostly > because it makes it easier to give credit). Then patch 2 is my fix, and > patch 3 is the message fixups he had done. > > This replaces what's queued in js/partial-clone-sparse-blob. > > [1/3]: t5616: test cloning/fetching with sparse:oid= filter > [2/3]: list-objects-filter: delay parsing of sparse oid > [3/3]: list-objects-filter: give a more specific error sparse parsing error And here's a bonus patch that I found while running under ASan/UBSan (since I wanted to double-check the memory handling of patch 2 when merged with 'next'). -- >8 -- Subject: list-objects-filter: use empty string instead of NULL for sparse "base" We use add_excludes_from_blob_to_list() to parse a sparse blob. Since we don't have a base path, we pass NULL and 0 for the base and baselen, respectively. But the rest of the exclude code passes a literal empty string instead of NULL for this case. And indeed, we eventually end up with match_pathname() calling fspathncmp(), which then calls the system strncmp(path, base, baselen). This works on many platforms, which notice that baselen is 0 and do not look at the bytes of "base" at all. But it does violate the C standard, and building with SANITIZE=undefined will complain. You can also see it by instrumenting fspathncmp like this: diff --git a/dir.c b/dir.c index d021c908e5..4bb3d3ec96 100644 --- a/dir.c +++ b/dir.c @@ -71,6 +71,8 @@ int fspathcmp(const char *a, const char *b) int fspathncmp(const char *a, const char *b, size_t count) { + if (!a || !b) + BUG("null fspathncmp arguments"); return ignore_case ? strncasecmp(a, b, count) : strncmp(a, b, count); } We could perhaps be more defensive in match_pathname(), but even if we did so, it makes sense for this code to match the rest of the exclude callers. Signed-off-by: Jeff King --- list-objects-filter.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/list-objects-filter.c b/list-objects-filter.c index 50f0c6d07b..83c788e8b5 100644 --- a/list-objects-filter.c +++ b/list-objects-filter.c @@ -472,7 +472,7 @@ static void *filter_sparse_oid__init( die(_("unable to access sparse blob in '%s'"), filter_options->sparse_oid_name); d->omits = omitted; - if (add_excludes_from_blob_to_list(&sparse_oid, NULL, 0, &d->el) < 0) + if (add_excludes_from_blob_to_list(&sparse_oid, "", 0, &d->el) < 0) die(_("unable to parse sparse filter data in %s"), oid_to_hex(&sparse_oid));