diff mbox series

fetch: allow adding a filter after initial clone.

Message ID 20200513200040.68968-1-delphij@google.com (mailing list archive)
State New, archived
Headers show
Series fetch: allow adding a filter after initial clone. | expand

Commit Message

Xin Li May 13, 2020, 8 p.m. UTC
Signed-off-by: Xin Li <delphij@google.com>
---
 builtin/fetch.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Junio C Hamano May 13, 2020, 8:43 p.m. UTC | #1
Xin Li <delphij@google.com> writes:

(nothnig).

Can you help readers by describing what this change is about?

This space is reserved for the patch author to describe why this
change is a good idea (if this patch is adding a new feature), what
is already broken without this patch (if this patch is a bugfix),
and why this change is a safe thing to do (if this patch lifts a
limitation we had before that has been protecting us from getting
into a bad state).


> Signed-off-by: Xin Li <delphij@google.com>
> ---
>  builtin/fetch.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 3ae52c015d..e5faa17ecd 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1790,8 +1790,16 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  	if (depth || deepen_since || deepen_not.nr)
>  		deepen = 1;
>  
> -	if (filter_options.choice && !has_promisor_remote())
> -		die("--filter can only be used when extensions.partialClone is set");
> +	if (filter_options.choice && !has_promisor_remote()) {
> +		char repo_version_string[10];
> +
> +		xsnprintf(repo_version_string, sizeof(repo_version_string),
> +			  "%d", (int)GIT_REPO_VERSION);
> +		git_config_set("core.repositoryformatversion",
> +			repo_version_string);
> +		git_config_set("extensions.partialclone", "origin");
> +		promisor_remote_reinit();
> +	}
>  
>  	if (all) {
>  		if (argc == 1)
Xin Li May 13, 2020, 9:41 p.m. UTC | #2
The proposed change would allow converting an existing git clone to
use partial clones.  For example:

$ git clone --depth=1 https://android.googlesource.com/platform/bionic .
$ git fetch --unshallow --filter=blob:none origin

Previously, to allow this one would have to do the following manually
first; the existing code would handle the rest gracefully:

$ git config core.repositoryFormatVersion 1
$ git config extensions.partialClone origin

And the proposed change would have git do it for the user
automatically, the existing workflow remains otherwise unchanged.
Junio C Hamano May 13, 2020, 10:07 p.m. UTC | #3
Xin Li <delphij@google.com> writes:

> The proposed change would allow converting an existing git clone to
> use partial clones.  For example:
>
> $ git clone --depth=1 https://android.googlesource.com/platform/bionic .
> $ git fetch --unshallow --filter=blob:none origin
>
> Previously, to allow this one would have to do the following manually
> first; the existing code would handle the rest gracefully:
>
> $ git config core.repositoryFormatVersion 1
> $ git config extensions.partialClone origin
>
> And the proposed change would have git do it for the user
> automatically, the existing workflow remains otherwise unchanged.

Please do not explain these things to the list readers.  Instead,
remember to write them in the proposed log message when you are
ready to send an updated patch, which you would probably want to
wait before giving other people to comment on the patch.

Thanks.
Junio C Hamano May 13, 2020, 10:18 p.m. UTC | #4
Xin Li <delphij@google.com> writes:

> The proposed change would allow converting an existing git clone to
> use partial clones.  For example:
>
> $ git clone --depth=1 https://android.googlesource.com/platform/bionic .
> $ git fetch --unshallow --filter=blob:none origin
>
> Previously, to allow this one would have to do the following manually
> first; the existing code would handle the rest gracefully:
>
> $ git config core.repositoryFormatVersion 1
> $ git config extensions.partialClone origin
>
> And the proposed change would have git do it for the user
> automatically, the existing workflow remains otherwise unchanged.

Actually, do we even know, when we "upgrade" the repository with the
patch, that the 'origin' supports the required protocol extensions
like on-demand fetching of objects?

At "git clone" time, "git clone --filter=..."  against a server that
is not prepared to serve as a lazy clone's backup remote would fail,
so we will never write extensions.partialClone=origin when origin
cannot serve as such. 

But writing these config you suggest to "do manually" is already not
a safe thing to do, without making sure it is supported, no?  It
does not sound like a good idea to do an unsafe thing for the user
automatically.

Added a few folks on Cc: list, who know and care more about partial
clones than I do for input.

Thanks.
brian m. carlson May 13, 2020, 11:44 p.m. UTC | #5
On 2020-05-13 at 20:00:40, Xin Li wrote:
> Signed-off-by: Xin Li <delphij@google.com>
> ---
>  builtin/fetch.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 3ae52c015d..e5faa17ecd 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1790,8 +1790,16 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>  	if (depth || deepen_since || deepen_not.nr)
>  		deepen = 1;
>  
> -	if (filter_options.choice && !has_promisor_remote())
> -		die("--filter can only be used when extensions.partialClone is set");
> +	if (filter_options.choice && !has_promisor_remote()) {
> +		char repo_version_string[10];
> +
> +		xsnprintf(repo_version_string, sizeof(repo_version_string),
> +			  "%d", (int)GIT_REPO_VERSION);
> +		git_config_set("core.repositoryformatversion",
> +			repo_version_string);
> +		git_config_set("extensions.partialclone", "origin");

Some things stood out to me here.  One, is this setting up the
repository if it's not already created?  If so, we'd probably want to
use one of the appropriate functions in setup.c.  Even if we're just
changing it, we should probably use a helper function.

Two, it isn't necessarily safe to automatically change the repository
version.  Keys that start with "extensions." are not special in format
version 0, but they are in format version 1.  I can technically have an
"extensions.tomatoSalad" in version 0 without any special meaning or
negative consequences, but as soon as we change to version 1, Git will
refuse to read it, since it doesn't know about the tomatoSalad extension
and in v1 unknown extensions are fatal.

My example may sound silly, but since extensions can be set in the
global config, users could well rely on v0 repositories ignoring them
and having them automatically turned on for their v1 repositories.  (I'm
thinking of the future extensions.objectFormat as something somebody
might try to do here, as dangerous as it may be.)

These aren't insurmountable problems, but they are things we'd need to
check for before just changing the repository version, so we'd want to
stuff some code in setup.c to handle this case.

Third, I'm not sure that "origin" is always the value we want to use
here.  At a previous employer, the upstream remote was called
"upstream", and your personal fork was called "origin", so I'd have
wanted upstream here..  We'd probably want to use whatever remote the
user is already using in this case, and gracefully handle the URL case
if that isn't allowed here (which it may be; I'm not that familiar with
partial clone).

I also agree with Junio's assessment that you'd probably want to explain
more about this feature in the commit message.  For example, I'd want to
know what this patch does and have some idea of how I might invoke this
feature, why it's safe to change the repository version, how one sets
the remote for fetch, and pretty much answers to all the other things I
asked here.  Even if I understand these things now, that doesn't mean a
future developer will in six months' time, and mentioning these things
in the commit message helps leave a note to them that you considered (or
did not consider, as the case may be) certain issues and helps them
understand the code as you saw and wrote it.
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 3ae52c015d..e5faa17ecd 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1790,8 +1790,16 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (depth || deepen_since || deepen_not.nr)
 		deepen = 1;
 
-	if (filter_options.choice && !has_promisor_remote())
-		die("--filter can only be used when extensions.partialClone is set");
+	if (filter_options.choice && !has_promisor_remote()) {
+		char repo_version_string[10];
+
+		xsnprintf(repo_version_string, sizeof(repo_version_string),
+			  "%d", (int)GIT_REPO_VERSION);
+		git_config_set("core.repositoryformatversion",
+			repo_version_string);
+		git_config_set("extensions.partialclone", "origin");
+		promisor_remote_reinit();
+	}
 
 	if (all) {
 		if (argc == 1)