From patchwork Wed Apr 26 20:53:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "brian m. carlson" X-Patchwork-Id: 13224929 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34E9BC7618E for ; Wed, 26 Apr 2023 20:53:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233709AbjDZUxh (ORCPT ); Wed, 26 Apr 2023 16:53:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231958AbjDZUxg (ORCPT ); Wed, 26 Apr 2023 16:53:36 -0400 Received: from ring.crustytoothpaste.net (ring.crustytoothpaste.net [IPv6:2600:3c04::f03c:92ff:fe9e:c6d8]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F83B1A7 for ; Wed, 26 Apr 2023 13:53:34 -0700 (PDT) Received: from tapette.crustytoothpaste.net (unknown [IPv6:2001:470:b056:101:5e4a:89fa:93b9:2058]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (3072 bits) server-digest SHA256) (No client certificate requested) by ring.crustytoothpaste.net (Postfix) with ESMTPSA id 9F4DD5B3A1; Wed, 26 Apr 2023 20:53:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=crustytoothpaste.net; s=default; t=1682542413; bh=wa1p133rgzZGEnAaWul+iKsjqHm1lMRUMXivY/U5upU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From:Reply-To: Subject:Date:To:CC:Resent-Date:Resent-From:Resent-To:Resent-Cc: In-Reply-To:References:Content-Type:Content-Disposition; b=eUt7ZJcySj4N3D5hojwmrKWtY6oeMX6Lft/HZfMnw5dnLC5/Yu03q5sSwugXLeE3b QC9ex/d+I+ZWgSrFxTHPYNdZk6z/M/jl0CYBBH9YDMe3w0534oaqsxf5TqbOIXZcnZ EFu+3B+R9LHz1/JZpIaR8uN/RycDyxegkzLyWxbyxe8HtJahPYImtlf3Izpzd7aGkM 1fyMfCJOQnnj0vK7GWv62rM2+KvDTaphiGaZuUMVC+3psB464hQ729QRLtoyFHZEy6 oLO1jq4MZjaUcu9OrT3dvmusHTtskbnKfxeqgnDXTPyYwb3XI+UaPTa1t0g9tBCKxx pbhh5YPyqg67z8bADEBpONDWY4dpqONj+pmgPxwFasfcRDc2/LzMwR+tqnZym3qCDV oz9PGCzZqU1gaffCZGi5EM19pvYXVYwx4/pG2fe+peYkYx0CoGGnG7UT3eO09qpAjz FeQyqmFZVnTpcq5fi9K1ntUU2WC7sGihvi5ssKLhVzrOFk7sdVi From: "brian m. carlson" To: Cc: Jeff King , Junio C Hamano , Adam Majer Subject: [PATCH 1/2] http: advertise capabilities when cloning empty repos Date: Wed, 26 Apr 2023 20:53:23 +0000 Message-Id: <20230426205324.326501-2-sandals@crustytoothpaste.net> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230426205324.326501-1-sandals@crustytoothpaste.net> References: <20230426205324.326501-1-sandals@crustytoothpaste.net> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: "brian m. carlson" When cloning an empty repository, the HTTP protocol version 0 currently offers nothing but the header and flush packets for the /info/refs endpoint. This means that no capabilities are provided, so the client side doesn't know what capabilities are present. However, this does pose a problem when working with SHA-256 repositories, since we use the capabilities to know the remote side's object format (hash algorithm). It used to be possible to set the correct algorithm with `GIT_DEFAULT_HASH` (which is what the Git LFS testsuite did), but this no longer works as of 8b214c2e9d ("clone: propagate object-format when cloning from void", 2023-04-05), since there we always read the hash algorithm from the remote. If there is no hash algorithm provided, we default to SHA-1 for backwards compatibility. Fortunately, the push version of the protocol already indicates a clue for how to solve this. When the /info/refs endpoint is accessed for a push and the remote is empty, we include a dummy "capabilities^{}" ref pointing to the all-zeros object ID. The protocol documentation already indicates this should _always_ be sent, even for fetches and clones, so let's just do that, which means we'll properly announce the hash algorithm as part of the capabilities. This just works with the existing code because we share the same ref code for fetches and clones, and libgit2 does as well. Signed-off-by: brian m. carlson --- t/t5551-http-fetch-smart.sh | 27 +++++++++++++++++++++++++++ t/t5700-protocol-v1.sh | 21 +++++++++++++++++++-- upload-pack.c | 4 ++++ 3 files changed, 50 insertions(+), 2 deletions(-) diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh index 0908534f25..21b7767cbd 100755 --- a/t/t5551-http-fetch-smart.sh +++ b/t/t5551-http-fetch-smart.sh @@ -611,6 +611,33 @@ test_expect_success 'client falls back from v2 to v0 to match server' ' grep symref=HEAD:refs/heads/ trace ' +test_expect_success 'create empty http-accessible SHA-256 repository' ' + mkdir "$HTTPD_DOCUMENT_ROOT_PATH/sha256.git" && + (cd "$HTTPD_DOCUMENT_ROOT_PATH/sha256.git" && + git --bare init --object-format=sha256 + ) +' + +test_expect_success 'clone empty SHA-256 repository with protocol v2' ' + rm -fr sha256 && + echo sha256 >expected && + git -c protocol.version=2 clone "$HTTPD_URL/smart/sha256.git" && + git -C sha256 rev-parse --show-object-format >actual && + test_cmp actual expected && + git ls-remote "$HTTPD_URL/smart/sha256.git" >actual && + test_must_be_empty actual +' + +test_expect_success 'clone empty SHA-256 repository with protocol v0' ' + rm -fr sha256 && + echo sha256 >expected && + GIT_TRACE=1 GIT_TRACE_PACKET=1 git -c protocol.version=0 clone "$HTTPD_URL/smart/sha256.git" && + git -C sha256 rev-parse --show-object-format >actual && + test_cmp actual expected && + git ls-remote "$HTTPD_URL/smart/sha256.git" >actual && + test_must_be_empty actual +' + test_expect_success 'passing hostname resolution information works' ' BOGUS_HOST=gitbogusexamplehost.invalid && BOGUS_HTTPD_URL=$HTTPD_PROTO://$BOGUS_HOST:$LIB_HTTPD_PORT && diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh index 6c8d4c6cf1..3cd9db9012 100755 --- a/t/t5700-protocol-v1.sh +++ b/t/t5700-protocol-v1.sh @@ -249,10 +249,12 @@ test_expect_success 'push with ssh:// using protocol v1' ' . "$TEST_DIRECTORY"/lib-httpd.sh start_httpd -test_expect_success 'create repo to be served by http:// transport' ' +test_expect_success 'create repos to be served by http:// transport' ' git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" && git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true && - test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one + test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one && + git init --object-format=sha256 "$HTTPD_DOCUMENT_ROOT_PATH/sha256" && + git -C "$HTTPD_DOCUMENT_ROOT_PATH/sha256" config http.receivepack true ' test_expect_success 'clone with http:// using protocol v1' ' @@ -269,6 +271,21 @@ test_expect_success 'clone with http:// using protocol v1' ' grep "git< version 1" log ' +test_expect_success 'clone with http:// using protocol v1 with empty SHA-256 repo' ' + GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \ + clone "$HTTPD_URL/smart/sha256" sha256 2>log && + + cat log && + echo sha256 >expect && + git -C sha256 rev-parse --show-object-format >actual && + test_cmp expect actual && + + # Client requested to use protocol v1 + grep "Git-Protocol: version=1" log && + # Server responded using protocol v1 + grep "git< version 1" log +' + test_expect_success 'fetch with http:// using protocol v1' ' test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two && diff --git a/upload-pack.c b/upload-pack.c index 08633dc121..5ef9b162b6 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -120,6 +120,7 @@ struct upload_pack_data { unsigned allow_ref_in_want : 1; /* v2 only */ unsigned allow_sideband_all : 1; /* v2 only */ unsigned advertise_sid : 1; + unsigned sent_capabilities : 1; }; static void upload_pack_data_init(struct upload_pack_data *data) @@ -1240,6 +1241,7 @@ static int send_ref(const char *refname, const struct object_id *oid, git_user_agent_sanitized()); strbuf_release(&symref_info); strbuf_release(&session_id); + data->sent_capabilities = 1; } else { packet_fwrite_fmt(stdout, "%s %s\n", oid_to_hex(oid), refname_nons); } @@ -1379,6 +1381,8 @@ void upload_pack(const int advertise_refs, const int stateless_rpc, data.no_done = 1; head_ref_namespaced(send_ref, &data); for_each_namespaced_ref(send_ref, &data); + if (!data.sent_capabilities && advertise_refs) + send_ref("capabilities^{}", null_oid(), 0, &data); /* * fflush stdout before calling advertise_shallow_grafts because send_ref * uses stdio. From patchwork Wed Apr 26 20:53:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "brian m. carlson" X-Patchwork-Id: 13224931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E577AC77B7C for ; Wed, 26 Apr 2023 20:53:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239410AbjDZUxj (ORCPT ); Wed, 26 Apr 2023 16:53:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234643AbjDZUxh (ORCPT ); Wed, 26 Apr 2023 16:53:37 -0400 Received: from ring.crustytoothpaste.net (ring.crustytoothpaste.net [172.105.110.227]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77CEB1FC9 for ; Wed, 26 Apr 2023 13:53:34 -0700 (PDT) Received: from tapette.crustytoothpaste.net (unknown [IPv6:2001:470:b056:101:5e4a:89fa:93b9:2058]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (3072 bits) server-digest SHA256) (No client certificate requested) by ring.crustytoothpaste.net (Postfix) with ESMTPSA id B15E95B3A2; Wed, 26 Apr 2023 20:53:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=crustytoothpaste.net; s=default; t=1682542413; bh=+JPb588FGQIc7yz9rKsLTCpbAO7tzpMvwA8SpSqRvos=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From:Reply-To: Subject:Date:To:CC:Resent-Date:Resent-From:Resent-To:Resent-Cc: In-Reply-To:References:Content-Type:Content-Disposition; b=ALq2vrVs0RTrukE4iPp8v3FeEtqLWBys8aeCLJ/LUYFoUZbdOybylnerP3VhlPSTz nThEQDmg1VvafBemRGvSAdrjAqb6LSX7WSwBDGaxYVLem4UW0kocka/GGop0OnVbuv eoqCoBnO23aaDUfzvNvrmJD2ZZVkNblzUhKEzdylBnohFbEzmxFSOvs4TxANUdgWrr mnBm5kljn6K42gEQ6HynGErcoib6KoDKWudoHYEqdYX8DAsp3DVMP2XYYjG3TAMWU8 w4rGyPefWKC+UyITKxYffyUN1VUjVFMaWSXuKb8IBty8+rHbEBKhW4++KYr1aIIQII qJBpz8qEjy4K0DO8yOYO5wszMk59qaSqGhFYb1hTwV/ZyE6EjUjAp5AT9xkRf5xGzG gIrAK1c8ZOoM8bOvzAI4r49FwvmbYItJiltAgh8/KJWI5lYl4OeA5hYLef2fHuvD6R Z/tKbS++O1pQHZJJv1an9GOaIwMGJ3wrNtClz3+xcVYOxaLxDbV From: "brian m. carlson" To: Cc: Jeff King , Junio C Hamano , Adam Majer Subject: [PATCH 2/2] Honor GIT_DEFAULT_HASH for empty clones without remote algo Date: Wed, 26 Apr 2023 20:53:24 +0000 Message-Id: <20230426205324.326501-3-sandals@crustytoothpaste.net> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230426205324.326501-1-sandals@crustytoothpaste.net> References: <20230426205324.326501-1-sandals@crustytoothpaste.net> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: "brian m. carlson" The previous commit introduced a change that allows HTTP v0 and v1 operations to determine the hash of an empty remote repository by sending capabilities. However, there are still some cases, such as when cloning locally, where the capabilities are not sent. This is because for local operations, we don't strip out the fake "capabilities^{}" ref, and thus "git ls-remote" would produce incorrect values if we did. However, up until 8b214c2e9d ("clone: propagate object-format when cloning from void", 2023-04-05), we honored GIT_DEFAULT_HASH in this case, so let's continue to do that. Check whether the hash algorithm was explicitly set, and if so, continue to use that value. If not, use the default value for GIT_DEFAULT_HASH to ensure that we can at least properly configure an empty clone whose hash algorithm we know. Note that without this patch, git clone cannot create a SHA-256 repository from an empty remote without protocol v2 (except over HTTP, as in the previous patch). --- Documentation/git.txt | 10 +++++++--- builtin/clone.c | 8 +++++--- connect.c | 5 ++++- pkt-line.h | 2 ++ t/t5700-protocol-v1.sh | 11 +++++++++++ transport-helper.c | 1 + transport.c | 14 ++++++++++++++ transport.h | 14 ++++++++++++++ 8 files changed, 58 insertions(+), 7 deletions(-) diff --git a/Documentation/git.txt b/Documentation/git.txt index 74973d3cc4..48eda9f883 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -547,9 +547,13 @@ double-quotes and respecting backslash escapes. E.g., the value `GIT_DEFAULT_HASH`:: If this variable is set, the default hash algorithm for new repositories will be set to this value. This value is currently - ignored when cloning; the setting of the remote repository - is used instead. The default is "sha1". THIS VARIABLE IS - EXPERIMENTAL! See `--object-format` in linkgit:git-init[1]. + ignored when cloning if the remote value can be definitively + determined; the setting of the remote repository is used + instead. The value is honored if the remote repository's + algorithm cannot be determined, such as some cases when + the remote repository is empty. The default is "sha1". + THIS VARIABLE IS EXPERIMENTAL! See `--object-format` + in linkgit:git-init[1]. Git Commits ~~~~~~~~~~~ diff --git a/builtin/clone.c b/builtin/clone.c index 186845ef0b..c207798de9 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -1316,13 +1316,15 @@ int cmd_clone(int argc, const char **argv, const char *prefix) } } + if (transport_get_hash_algo_explicit(transport)) { /* * Now that we know what algorithm the remote side is using, * let's set ours to the same thing. */ - hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport)); - initialize_repository_version(hash_algo, 1); - repo_set_hash_algo(the_repository, hash_algo); + hash_algo = hash_algo_by_ptr(transport_get_hash_algo(transport)); + initialize_repository_version(hash_algo, 1); + repo_set_hash_algo(the_repository, hash_algo); + } if (mapped_refs) { /* diff --git a/connect.c b/connect.c index 3a0186280c..40cb9bf261 100644 --- a/connect.c +++ b/connect.c @@ -243,8 +243,10 @@ static void process_capabilities(struct packet_reader *reader, int *linelen) if (feat_val) { char *hash_name = xstrndup(feat_val, feat_len); int hash_algo = hash_algo_by_name(hash_name); - if (hash_algo != GIT_HASH_UNKNOWN) + if (hash_algo != GIT_HASH_UNKNOWN) { reader->hash_algo = &hash_algos[hash_algo]; + reader->hash_algo_explicit = 1; + } free(hash_name); } else { reader->hash_algo = &hash_algos[GIT_HASH_SHA1]; @@ -493,6 +495,7 @@ static void send_capabilities(int fd_out, struct packet_reader *reader) if (hash_algo == GIT_HASH_UNKNOWN) die(_("unknown object format '%s' specified by server"), hash_name); reader->hash_algo = &hash_algos[hash_algo]; + reader->hash_algo_explicit = 1; packet_write_fmt(fd_out, "object-format=%s", reader->hash_algo->name); } else { reader->hash_algo = &hash_algos[GIT_HASH_SHA1]; diff --git a/pkt-line.h b/pkt-line.h index 8e9846f315..10700a9d8c 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -190,6 +190,8 @@ struct packet_reader { int line_peeked; unsigned use_sideband : 1; + /* indicates if we saw an explicit capability */ + unsigned hash_algo_explicit : 1; const char *me; /* hash algorithm in use */ diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh index 3cd9db9012..ad24c7fe64 100755 --- a/t/t5700-protocol-v1.sh +++ b/t/t5700-protocol-v1.sh @@ -244,6 +244,17 @@ test_expect_success 'push with ssh:// using protocol v1' ' grep "push< version 1" log ' +test_expect_success 'clone propagates object-format from empty repo' ' + test_when_finished "rm -fr src256 dst256" && + + echo sha256 >expect && + git init --object-format=sha256 src256 && + GIT_DEFAULT_HASH=sha256 git -c protocol.version=1 clone --no-local src256 dst256 && + git -C dst256 rev-parse --show-object-format >actual && + + test_cmp expect actual +' + # Test protocol v1 with 'http://' transport # . "$TEST_DIRECTORY"/lib-httpd.sh diff --git a/transport-helper.c b/transport-helper.c index 6b816940dc..c65cf7c620 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -1236,6 +1236,7 @@ static struct ref *get_refs_list_using_list(struct transport *transport, die(_("unsupported object format '%s'"), value); transport->hash_algo = &hash_algos[algo]; + transport->hash_algo_explicit = 1; } continue; } diff --git a/transport.c b/transport.c index 67afdae57c..7774487e8d 100644 --- a/transport.c +++ b/transport.c @@ -147,6 +147,12 @@ static void get_refs_from_bundle_inner(struct transport *transport) die(_("could not read bundle '%s'"), transport->url); transport->hash_algo = data->header.hash_algo; + /* + * This is always set, even if we didn't get an explicit object-format + * capability, since we know that a missing capability or a v2 bundle + * definitively indicates SHA-1. + */ + transport->hash_algo_explicit = 1; } static struct ref *get_refs_from_bundle(struct transport *transport, @@ -190,6 +196,7 @@ static int fetch_refs_from_bundle(struct transport *transport, ret = unbundle(the_repository, &data->header, data->fd, &extra_index_pack_args, 0); transport->hash_algo = data->header.hash_algo; + transport->hash_algo_explicit = 1; return ret; } @@ -360,6 +367,7 @@ static struct ref *handshake(struct transport *transport, int for_push, } data->finished_handshake = 1; transport->hash_algo = reader.hash_algo; + transport->hash_algo_explicit = reader.hash_algo_explicit; if (reader.line_peeked) BUG("buffer must be empty at the end of handshake()"); @@ -1190,6 +1198,7 @@ struct transport *transport_get(struct remote *remote, const char *url) } ret->hash_algo = &hash_algos[GIT_HASH_SHA1]; + ret->hash_algo_explicit = 0; return ret; } @@ -1199,6 +1208,11 @@ const struct git_hash_algo *transport_get_hash_algo(struct transport *transport) return transport->hash_algo; } +int transport_get_hash_algo_explicit(struct transport *transport) +{ + return transport->hash_algo_explicit; +} + int transport_set_option(struct transport *transport, const char *name, const char *value) { diff --git a/transport.h b/transport.h index 6393cd9823..ce67eefc58 100644 --- a/transport.h +++ b/transport.h @@ -128,6 +128,11 @@ struct transport { * in transport_set_verbosity(). **/ unsigned progress : 1; + /* + * Indicates whether the hash algorithm was initialized explicitly as + * opposed to using a fallback. + */ + unsigned hash_algo_explicit : 1; /* * If transport is at least potentially smart, this points to * git_transport_options structure to use in case transport @@ -305,6 +310,15 @@ int transport_get_remote_bundle_uri(struct transport *transport); * This can only be called after fetching the remote refs. */ const struct git_hash_algo *transport_get_hash_algo(struct transport *transport); +/* + * Fetch whether the hash algorithm provided was explicitly set. + * + * If this value is false, "transport_get_hash_algo" will always return a value + * of SHA-1, which is the default algorithm if none is specified. + * + * This can only be called after fetching the remote refs. + */ +int transport_get_hash_algo_explicit(struct transport *transport); int transport_fetch_refs(struct transport *transport, struct ref *refs); /*