diff mbox series

[v3,3/5] clone: add --bundle-uri option

Message ID 00debaf6e77852efe1dcad4bfda5ebd5bf590ac4.1660050704.git.gitgitgadget@gmail.com (mailing list archive)
State Accepted
Commit 55568919616429fbc209880cf189a3adaceb6093
Headers show
Series Bundle URIs II: git clone --bundle-uri | expand

Commit Message

Derrick Stolee Aug. 9, 2022, 1:11 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

Cloning a remote repository is one of the most expensive operations in
Git. The server can spend a lot of CPU time generating a pack-file for
the client's request. The amount of data can clog the network for a long
time, and the Git protocol is not resumable. For users with poor network
connections or are located far away from the origin server, this can be
especially painful.

Add a new '--bundle-uri' option to 'git clone' to bootstrap a clone from
a bundle. If the user is aware of a bundle server, then they can tell
Git to bootstrap the new repository with these bundles before fetching
the remaining objects from the origin server.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 Documentation/git-clone.txt |  6 ++++++
 builtin/clone.c             | 15 +++++++++++++++
 t/t5558-clone-bundle-uri.sh | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)
 create mode 100755 t/t5558-clone-bundle-uri.sh

Comments

Junio C Hamano Aug. 22, 2022, 9:24 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +	/*
> +	 * Before fetching from the remote, download and install bundle
> +	 * data from the --bundle-uri option.
> +	 */
> +	if (bundle_uri) {
> +		/* At this point, we need the_repository to match the cloned repo. */
> +		repo_init(the_repository, git_dir, work_tree);
> +		if (fetch_bundle_uri(the_repository, bundle_uri))
> +			warning(_("failed to fetch objects from bundle URI '%s'"),
> +				bundle_uri);
> +	}

I do not offhand know why I suddenly started seeing the issue for
this relatively old commit I have had in my tree for at least 10
days, but I am getting this

builtin/clone.c: In function 'cmd_clone':
builtin/clone.c:1248:17: error: ignoring return value of 'repo_init' declared with attribute 'warn_unused_result' [-Werror=unused-result]
 1248 |                 repo_init(the_repository, git_dir, work_tree);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


with the commit merged in 'seen'.

It seems that the updated ab/submodule-helper-prep is doing it.  

Why can't that (or any of Ævar's) topic focus on what it needs to
do, without churning the codebase and inflict damages to other
topics like this?  Quite frustrating.

I'll redo today's integration without the offending topic to give
other topics test coverage.
Derrick Stolee Aug. 23, 2022, 2:05 p.m. UTC | #2
On 8/22/2022 5:24 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> +	/*
>> +	 * Before fetching from the remote, download and install bundle
>> +	 * data from the --bundle-uri option.
>> +	 */
>> +	if (bundle_uri) {
>> +		/* At this point, we need the_repository to match the cloned repo. */
>> +		repo_init(the_repository, git_dir, work_tree);
>> +		if (fetch_bundle_uri(the_repository, bundle_uri))
>> +			warning(_("failed to fetch objects from bundle URI '%s'"),
>> +				bundle_uri);
>> +	}
> 
> I do not offhand know why I suddenly started seeing the issue for
> this relatively old commit I have had in my tree for at least 10
> days, but I am getting this
> 
> builtin/clone.c: In function 'cmd_clone':
> builtin/clone.c:1248:17: error: ignoring return value of 'repo_init' declared with attribute 'warn_unused_result' [-Werror=unused-result]
>  1248 |                 repo_init(the_repository, git_dir, work_tree);
>       |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 
> with the commit merged in 'seen'.

Thank you for pointing this out. Here is a patch that fixes
the issue from this patch:

--- >8 ---

From 6df8bc6d7ffdc1b115d85ef9550bab5dcf1811f8 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <derrickstolee@github.com>
Date: Tue, 23 Aug 2022 09:53:47 -0400
Subject: [PATCH] clone: warn on failure to repo_init()

The --bundle-uri option was added in 55568919616 (clone: add
--bundle-uri option, 2022-08-09), but this also introduced a call to
repo_init() whose return value was ignored. Fix that ignored value by
warning that the bundle URI process could not continue if it failed.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 builtin/clone.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 4463789680b..e21d42dfee5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1245,8 +1245,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	 */
 	if (bundle_uri) {
 		/* At this point, we need the_repository to match the cloned repo. */
-		repo_init(the_repository, git_dir, work_tree);
-		if (fetch_bundle_uri(the_repository, bundle_uri))
+		if (repo_init(the_repository, git_dir, work_tree))
+			warning(_("failed to initialize the repo, skipping bundle URI"));
+		else if (fetch_bundle_uri(the_repository, bundle_uri))
 			warning(_("failed to fetch objects from bundle URI '%s'"),
 				bundle_uri);
 	}
Junio C Hamano Aug. 24, 2022, 3:46 p.m. UTC | #3
Derrick Stolee <derrickstolee@github.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
> Date: Tue, 23 Aug 2022 09:53:47 -0400
> Subject: [PATCH] clone: warn on failure to repo_init()
>
> The --bundle-uri option was added in 55568919616 (clone: add
> --bundle-uri option, 2022-08-09), but this also introduced a call to
> repo_init() whose return value was ignored. Fix that ignored value by
> warning that the bundle URI process could not continue if it failed.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---

OK.  It looks like a sensible thing to do.

>  builtin/clone.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 4463789680b..e21d42dfee5 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1245,8 +1245,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	 */
>  	if (bundle_uri) {
>  		/* At this point, we need the_repository to match the cloned repo. */
> -		repo_init(the_repository, git_dir, work_tree);
> -		if (fetch_bundle_uri(the_repository, bundle_uri))
> +		if (repo_init(the_repository, git_dir, work_tree))
> +			warning(_("failed to initialize the repo, skipping bundle URI"));
> +		else if (fetch_bundle_uri(the_repository, bundle_uri))
>  			warning(_("failed to fetch objects from bundle URI '%s'"),
>  				bundle_uri);
>  	}
diff mbox series

Patch

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 632bd1348ea..60fedf7eb5e 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -323,6 +323,12 @@  or `--mirror` is given)
 	for `host.xz:foo/.git`).  Cloning into an existing directory
 	is only allowed if the directory is empty.
 
+--bundle-uri=<uri>::
+	Before fetching from the remote, fetch a bundle from the given
+	`<uri>` and unbundle the data into the local repository. The refs
+	in the bundle will be stored under the hidden `refs/bundle/*`
+	namespace.
+
 :git-clone: 1
 include::urls.txt[]
 
diff --git a/builtin/clone.c b/builtin/clone.c
index c4ff4643ecd..4224d562758 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -34,6 +34,7 @@ 
 #include "list-objects-filter-options.h"
 #include "hook.h"
 #include "bundle.h"
+#include "bundle-uri.h"
 
 /*
  * Overall FIXMEs:
@@ -77,6 +78,7 @@  static int option_filter_submodules = -1;    /* unspecified */
 static int config_filter_submodules = -1;    /* unspecified */
 static struct string_list server_options = STRING_LIST_INIT_NODUP;
 static int option_remote_submodules;
+static const char *bundle_uri;
 
 static int recurse_submodules_cb(const struct option *opt,
 				 const char *arg, int unset)
@@ -160,6 +162,8 @@  static struct option builtin_clone_options[] = {
 		    N_("any cloned submodules will use their remote-tracking branch")),
 	OPT_BOOL(0, "sparse", &option_sparse_checkout,
 		    N_("initialize sparse-checkout file to include only files at root")),
+	OPT_STRING(0, "bundle-uri", &bundle_uri,
+		   N_("uri"), N_("a URI for downloading bundles before fetching from origin remote")),
 	OPT_END()
 };
 
@@ -1232,6 +1236,17 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen && !filter_options.choice)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
+	/*
+	 * Before fetching from the remote, download and install bundle
+	 * data from the --bundle-uri option.
+	 */
+	if (bundle_uri) {
+		/* At this point, we need the_repository to match the cloned repo. */
+		repo_init(the_repository, git_dir, work_tree);
+		if (fetch_bundle_uri(the_repository, bundle_uri))
+			warning(_("failed to fetch objects from bundle URI '%s'"),
+				bundle_uri);
+	}
 
 	strvec_push(&transport_ls_refs_options.ref_prefixes, "HEAD");
 	refspec_ref_prefixes(&remote->fetch,
diff --git a/t/t5558-clone-bundle-uri.sh b/t/t5558-clone-bundle-uri.sh
new file mode 100755
index 00000000000..f709bcb729c
--- /dev/null
+++ b/t/t5558-clone-bundle-uri.sh
@@ -0,0 +1,36 @@ 
+#!/bin/sh
+
+test_description='test fetching bundles with --bundle-uri'
+
+. ./test-lib.sh
+
+test_expect_success 'fail to clone from non-existent file' '
+	test_when_finished rm -rf test &&
+	git clone --bundle-uri="$(pwd)/does-not-exist" . test 2>err &&
+	grep "failed to download bundle from URI" err
+'
+
+test_expect_success 'fail to clone from non-bundle file' '
+	test_when_finished rm -rf test &&
+	echo bogus >bogus &&
+	git clone --bundle-uri="$(pwd)/bogus" . test 2>err &&
+	grep "is not a bundle" err
+'
+
+test_expect_success 'create bundle' '
+	git init clone-from &&
+	git -C clone-from checkout -b topic &&
+	test_commit -C clone-from A &&
+	test_commit -C clone-from B &&
+	git -C clone-from bundle create B.bundle topic
+'
+
+test_expect_success 'clone with path bundle' '
+	git clone --bundle-uri="clone-from/B.bundle" \
+		clone-from clone-path &&
+	git -C clone-path rev-parse refs/bundles/topic >actual &&
+	git -C clone-from rev-parse topic >expect &&
+	test_cmp expect actual
+'
+
+test_done