[v3,0/5] Support server option from configuration

Message ID pull.1776.v3.git.git.1728358699.gitgitgadget@gmail.com (mailing list archive)

Message

blanet via GitGitGadget Oct. 8, 2024, 3:38 a.m. UTC
We manage some internal repositories with numerous CI tasks, each requiring
code preparation through git-clone or git-fetch. These tasks, triggered by
post-receive hooks, often fetch the same copy of code concurrently using
--depth=1, causing extremely high load spikes on our Git servers.

To reduce performance impacts caused by these tasks, we plan to deploy a
specially designed pack-objects-hook [1]. This hook would allow the packs
generated by git-pack-objects (during git-clone or git-fetch) to be reused.
Since not all clone/fetch operations will benefit from this caching (e.g.,
pulls from developer environments), clients need to pass a special
identifier to indicate whether caching should be enabled. Using server
options [2] is suitable for this purpose.

However, server options can currently be specified only on the command line
(via --server-option or -o), which is inconvenient and requires modifying
CI scripts. A configuration-based approach is preferable, as it can be
propagated through global configuration (e.g. ~/.gitconfig) and avoids
compatibility issues with older Git versions that don't support
--server-option.

This patch series introduces a new multi-valued configuration,
remote.<name>.serverOption, similar to push.pushOption, to specify default
server options for the corresponding remote.
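
As a minimal sketch of how the proposed configuration might be populated,
the commands below set two values in a throwaway repository. The remote
name "origin", the URL, and the option values "ci-cache=on" and
"ci-job=nightly" are hypothetical placeholders chosen for illustration;
only the key name remote.<name>.serverOption comes from this series.

```shell
# Work in a throwaway repository so no real configuration is touched.
repo=$(mktemp -d)
git init --quiet "$repo"
cd "$repo"

# Hypothetical remote and option values, for illustration only.
git config remote.origin.url https://git.example.com/project.git
git config --add remote.origin.serverOption ci-cache=on
git config --add remote.origin.serverOption ci-job=nightly

# The variable is multi-valued: each --add appends one more entry.
git config --get-all remote.origin.serverOption
```

With this series applied, a plain "git fetch origin" over protocol v2 would
be expected to send both values as server options, and any
--server-option= given on the command line would override them. On a Git
without this series the keys are stored but have no effect.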

 * Patches 1-3 contain the main changes for introducing the new
   configuration.
 * Patch 4 fixes an issue where git-fetch did not send server options when
   fetching from multiple remotes.
 * Patch 5 is a minor fix for a server options-related memory leak.

 1. https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackpackObjectsHook
 2. https://git-scm.com/docs/gitprotocol-v2#_server_option

Xing Xin (5):
  transport: introduce parse_transport_option() method
  remote: introduce remote.<name>.serverOption configuration
  transport.c::handshake: make use of server options from remote
  fetch: respect --server-option when fetching multiple remotes
  ls-remote: leakfix for not clearing server_options

 Documentation/config/remote.txt |  10 +++
 Documentation/fetch-options.txt |   3 +
 Documentation/git-clone.txt     |   3 +
 Documentation/git-ls-remote.txt |   3 +
 builtin/fetch.c                 |   2 +
 builtin/ls-remote.c             |   1 +
 builtin/push.c                  |   9 +--
 remote.c                        |   6 ++
 remote.h                        |   3 +
 t/t5702-protocol-v2.sh          | 133 ++++++++++++++++++++++++++++++++
 transport.c                     |  15 ++++
 transport.h                     |   4 +
 12 files changed, 184 insertions(+), 8 deletions(-)


base-commit: 6258f68c3c1092c901337895c864073dcdea9213
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1776%2Fblanet%2Fxx%2Fadd-server-option-from-config-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1776/blanet/xx/add-server-option-from-config-v3
Pull-Request: https://github.com/git/git/pull/1776

Range-diff vs v2:

 1:  c95ed5e0dd5 = 1:  b44face42e1 transport: introduce parse_transport_option() method
 2:  2474b4c69d6 ! 2:  3c6b129d368 remote: introduce remote.<name>.serverOption configuration
     @@ Documentation/config/remote.txt: remote.<name>.partialclonefilter::
       	database, use the `--refetch` option of linkgit:git-fetch[1].
      +
      +remote.<name>.serverOption::
     -+	When no `--server-option=<option>` argument is given from the command
     -+	line, git will use the values from this configuration as a default list of
     -+	server options for this remote.
     ++	The default set of server options used when fetching from this remote.
     ++	These server options can be overridden by the `--server-option=` command
     ++	line arguments.
      ++
      +This is a multi-valued variable, and an empty value can be used in a higher
      +priority configuration file (e.g. `.git/config` in a repository) to clear
     @@ remote.c
       
       enum map_direction { FROM_SRC, FROM_DST };
       
     -@@ remote.c: static struct remote *make_remote(struct remote_state *remote_state,
     - 	struct remote *ret;
     - 	struct remotes_hash_key lookup;
     - 	struct hashmap_entry lookup_entry, *e;
     -+	struct string_list server_options = STRING_LIST_INIT_DUP;
     - 
     - 	if (!len)
     - 		len = strlen(name);
      @@ remote.c: static struct remote *make_remote(struct remote_state *remote_state,
       	ret->name = xstrndup(name, len);
       	refspec_init(&ret->push, REFSPEC_PUSH);
       	refspec_init(&ret->fetch, REFSPEC_FETCH);
     -+	ret->server_options = server_options;
     ++	string_list_init_dup(&ret->server_options);
       
       	ALLOC_GROW(remote_state->remotes, remote_state->remotes_nr + 1,
       		   remote_state->remotes_alloc);
     @@ remote.c: static void remote_clear(struct remote *remote)
       
       static void add_merge(struct branch *branch, const char *name)
      @@ remote.c: static int handle_config(const char *key, const char *value,
     - 					 key, value);
       	} else if (!strcmp(subkey, "vcs")) {
     + 		FREE_AND_NULL(remote->foreign_vcs);
       		return git_config_string(&remote->foreign_vcs, key, value);
      +	} else if (!strcmp(subkey, "serveroption")) {
      +		return parse_transport_option(key, value,
 3:  a7f3e458501 = 3:  f0835259b06 transport.c::handshake: make use of server options from remote
 4:  39ee8dbef78 = 4:  420b15d9f37 fetch: respect --server-option when fetching multiple remotes
 5:  39c07a6c8ee ! 5:  2528d929c7e ls-remote: leakfix for not clearing server_options
     @@ Commit message
          Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
      
       ## builtin/ls-remote.c ##
     -@@ builtin/ls-remote.c: int cmd_ls_remote(int argc, const char **argv, const char *prefix)
     +@@ builtin/ls-remote.c: int cmd_ls_remote(int argc,
       	transport_ls_refs_options_release(&transport_options);
       
       	strvec_clear(&pattern);

Comments

Patrick Steinhardt Oct. 8, 2024, 4 a.m. UTC | #1
On Tue, Oct 08, 2024 at 03:38:14AM +0000, blanet via GitGitGadget wrote:
> We manage some internal repositories with numerous CI tasks, each requiring
> code preparation through git-clone or git-fetch. These tasks, triggered by
> post-receive hooks, often fetch the same copy of code concurrently using
> --depth=1, causing extremely high load spikes on our Git servers.
> 
> To reduce performance impacts caused by these tasks, we plan to deploy a
> specially designed pack-objects-hook [1]. This hook would allow the packs
> generated by git-pack-objects (during git-clone or git-fetch) to be reused.
> Since not all clone/fetch operations will benefit from this caching (e.g.,
> pulls from developer environments), clients need to pass a special
> identifier to indicate whether caching should be enabled. Using server
> options [2] is suitable for this purpose.
> 
> However, server options can only be specified via the command line option
> (via --server-option or -o), which is inconvenient and requires
> modifications to CI scripts. A configuration-based approach is preferable,
> as it can be propagated through global configuration (e.g. ~/.gitconfig) and
> avoids compatibility issues with older Git versions that don't support
> --server-option.
> 
> This patch series introduces a new multi-valued configuration,
> remote.<name>.serverOption, similar to push.pushOption, to specify default
> server options for the corresponding remote.
> 
>  * Patches 1~3 contain the main changes for introducing the new
>    configuration.
>  * Patch 4 fixes an issue for git-fetch not sending server-options when
>    fetching from multiple remotes.
>  * Patch 5 is a minor fix for a server options-related memory leak.
> 
>  1. https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackpackObjectsHook
>  2. https://git-scm.com/docs/gitprotocol-v2#_server_option

The range-diff looks as expected to me, so this should be ready
to go from my point of view. Thanks!

Patrick

Junio C Hamano Oct. 8, 2024, 5:23 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> On Tue, Oct 08, 2024 at 03:38:14AM +0000, blanet via GitGitGadget wrote:
>> We manage some internal repositories with numerous CI tasks, each requiring
>> code preparation through git-clone or git-fetch. These tasks, triggered by
>> post-receive hooks, often fetch the same copy of code concurrently using
>> --depth=1, causing extremely high load spikes on our Git servers.
>> ...
>> This patch series introduces a new multi-valued configuration,
>> remote.<name>.serverOption, similar to push.pushOption, to specify default
>> server options for the corresponding remote.
>> 
>>  * Patches 1~3 contain the main changes for introducing the new
>>    configuration.
>>  * Patch 4 fixes an issue for git-fetch not sending server-options when
>>    fetching from multiple remotes.
>>  * Patch 5 is a minor fix for a server options-related memory leak.
>> 
>>  1. https://git-scm.com/docs/git-config#Documentation/git-config.txt-uploadpackpackObjectsHook
>>  2. https://git-scm.com/docs/gitprotocol-v2#_server_option
>
> The range-diff looks as expected to me, so this should be ready
> to go from my point of view. Thanks!

Thanks, both of you.  Queued.