diff mbox series

[09/15] job-runner: load repos from config by default

Message ID 9104ae46371a08967805f6ce27586ccade37972a.1585946894.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Maintenance jobs and job runner | expand

Commit Message

John Passaro via GitGitGadget April 3, 2020, 8:48 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The 'git job-runner' command was introduced with a '--repo=<path>'
option so a user could start a process with an explicit list of
repos. However, it is more likely that we will want 'git
job-runner' to start without any arguments and discover the set of
repos from another source.

A natural source is the Git config. The 'git job-runner' does not
need to run in a Git repository, but the config could be located in
the global or system config.

Create the job.repo config setting as a multi-valued config option.

Read all values for job.repo whenever triggering an iteration of
the job loop. This allows a user to add or remove repos dynamically
without restarting the job-runner.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/config/job.txt |  7 +++++++
 builtin/job-runner.c         | 20 ++++++++++++++++++++
 2 files changed, 27 insertions(+)

Comments

Phillip Wood April 5, 2020, 3:18 p.m. UTC | #1
Hi Stolee

On 03/04/2020 21:48, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The 'git job-runner' command was introduced with a '--repo=<path>'
> option so a user could start a process with an explicit list of
> repos. However, it is more likely that we will want 'git
> job-runner' to start without any arguments and discover the set of
> repos from another source.
> 
> A natural source is the Git config. The 'git job-runner' does not
> need to run in a Git repository, but the config could be located in
> the global or system config.
> 
> Create the job.repo config setting as a multi-valued config option.
> 
> Read all values for job.repo whenever triggering an iteration of
> the job loop. This allows a user to add or remove repos dynamically
> without restarting the job-runner.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>   Documentation/config/job.txt |  7 +++++++
>   builtin/job-runner.c         | 20 ++++++++++++++++++++
>   2 files changed, 27 insertions(+)
> 
> diff --git a/Documentation/config/job.txt b/Documentation/config/job.txt
> index efdb76afad3..2dfed3935fa 100644
> --- a/Documentation/config/job.txt
> +++ b/Documentation/config/job.txt
> @@ -4,3 +4,10 @@ job.pack-files.batchSize::
>   	part of `git run-job pack-files`. If not specified, then a
>   	dynamic size calculation is run. See linkgit:git-run-job[1]
>   	for more details.
> +
> +job.repo::
> +	Store an absolute path to a repository that wants background
> +	jobs run by `git job-runner`. This is a multi-valued config
> +	option, but it must be stored in a config file seen by the
> +	background runner. This may be the global or system config
> +	depending on how `git job-runner` is launched on your system.
> diff --git a/builtin/job-runner.c b/builtin/job-runner.c
> index 135288bcaae..bed401f94bf 100644
> --- a/builtin/job-runner.c
> +++ b/builtin/job-runner.c
> @@ -20,6 +20,9 @@ static int arg_repos_append(const struct option *opt,
>   
>   static int load_active_repos(struct string_list *repos)
>   {
> +	struct string_list_item *item;
> +	const struct string_list *config_repos;
> +
>   	if (arg_repos.nr) {
>   		struct string_list_item *item;
>   		for (item = arg_repos.items;
> @@ -29,6 +32,23 @@ static int load_active_repos(struct string_list *repos)
>   		return 0;
>   	}
>   
> +	config_repos = git_config_get_value_multi("job.repo");

Does this really re-read the config files as the commit message suggests 
or just return the cached values? Perhaps the runner could exec itself 
with a --run-jobs option to actually run the jobs so that it sees any 
updated config values.

Best Wishes

Phillip

> +
> +	for (item = config_repos->items;
> +	     item && item < config_repos->items + config_repos->nr;
> +	     item++) {
> +		DIR *dir = opendir(item->string);
> +
> +		if (!dir)
> +			continue;
> +
> +		closedir(dir);
> +		string_list_append(repos, item->string);
> +	}
> +
> +	string_list_sort(repos);
> +	string_list_remove_duplicates(repos, 0);
> +
>   	return 0;
>   }
>   
>
Phillip Wood April 5, 2020, 3:41 p.m. UTC | #2
Hi Stolee

On 03/04/2020 21:48, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The 'git job-runner' command was introduced with a '--repo=<path>'
> option so a user could start a process with an explicit list of
> repos. However, it is more likely that we will want 'git
> job-runner' to start without any arguments and discover the set of
> repos from another source.

One thought that occurred to me was that it might be convenient to put 
'git job-runner $HOME &' in my .bashrc to have it start on login and 
find all the repositories under $HOME without me having to list each one 
(and remember to update the list each time a clone a new repository). 
There are a couple of issues with this
  1 - We only want to run the jobs once per repo, not for each worktree. 
That should be easy enough to implement by checking if we're in the main 
worktree.
  2 - If I start several shells I only want one instance of 'git 
job-runner' to start. One way to handle that would be for the runner to 
hold a lock file in the directory it's given and quit if the lock is 
already taken. It should probably walk up the directory hierarchy to 
check for lock files as well in case I try to start another job-runner 
instance in a subdirectory.

Best Wishes

Phillip


> A natural source is the Git config. The 'git job-runner' does not
> need to run in a Git repository, but the config could be located in
> the global or system config.
> 
> Create the job.repo config setting as a multi-valued config option.
> 
> Read all values for job.repo whenever triggering an iteration of
> the job loop. This allows a user to add or remove repos dynamically
> without restarting the job-runner.
> 
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>   Documentation/config/job.txt |  7 +++++++
>   builtin/job-runner.c         | 20 ++++++++++++++++++++
>   2 files changed, 27 insertions(+)
> 
> diff --git a/Documentation/config/job.txt b/Documentation/config/job.txt
> index efdb76afad3..2dfed3935fa 100644
> --- a/Documentation/config/job.txt
> +++ b/Documentation/config/job.txt
> @@ -4,3 +4,10 @@ job.pack-files.batchSize::
>   	part of `git run-job pack-files`. If not specified, then a
>   	dynamic size calculation is run. See linkgit:git-run-job[1]
>   	for more details.
> +
> +job.repo::
> +	Store an absolute path to a repository that wants background
> +	jobs run by `git job-runner`. This is a multi-valued config
> +	option, but it must be stored in a config file seen by the
> +	background runner. This may be the global or system config
> +	depending on how `git job-runner` is launched on your system.
> diff --git a/builtin/job-runner.c b/builtin/job-runner.c
> index 135288bcaae..bed401f94bf 100644
> --- a/builtin/job-runner.c
> +++ b/builtin/job-runner.c
> @@ -20,6 +20,9 @@ static int arg_repos_append(const struct option *opt,
>   
>   static int load_active_repos(struct string_list *repos)
>   {
> +	struct string_list_item *item;
> +	const struct string_list *config_repos;
> +
>   	if (arg_repos.nr) {
>   		struct string_list_item *item;
>   		for (item = arg_repos.items;
> @@ -29,6 +32,23 @@ static int load_active_repos(struct string_list *repos)
>   		return 0;
>   	}
>   
> +	config_repos = git_config_get_value_multi("job.repo");
> +
> +	for (item = config_repos->items;
> +	     item && item < config_repos->items + config_repos->nr;
> +	     item++) {
> +		DIR *dir = opendir(item->string);
> +
> +		if (!dir)
> +			continue;
> +
> +		closedir(dir);
> +		string_list_append(repos, item->string);
> +	}
> +
> +	string_list_sort(repos);
> +	string_list_remove_duplicates(repos, 0);
> +
>   	return 0;
>   }
>   
>
Derrick Stolee April 6, 2020, 12:49 p.m. UTC | #3
On 4/5/2020 11:18 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 03/04/2020 21:48, Derrick Stolee via GitGitGadget wrote:
>>   +    config_repos = git_config_get_value_multi("job.repo");
> 
> Does this really re-read the config files as the commit message suggests or just return the cached values? Perhaps the runner could exec itself with a --run-jobs option to actually run the jobs so that it sees any updated config values.

You're right, I need to double-check that. I'm guessing that you are
right and it uses the cached values. This should use the run_command()
pattern to ensure the config is as new as possible.

Thanks,
-Stolee
Derrick Stolee April 6, 2020, 12:57 p.m. UTC | #4
On 4/5/2020 11:41 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 03/04/2020 21:48, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> The 'git job-runner' command was introduced with a '--repo=<path>'
>> option so a user could start a process with an explicit list of
>> repos. However, it is more likely that we will want 'git
>> job-runner' to start without any arguments and discover the set of
>> repos from another source.
> 
> One thought that occurred to me was that it might be convenient to put
> 'git job-runner $HOME &' in my .bashrc to have it start on login and
> find all the repositories under $HOME without me having to list each
> one (and remember to update the list each time a clone a new repository).
> There are a couple of issues with this
> 1 - We only want to run the jobs once per repo, not for each worktree.
> That should be easy enough to implement by checking if we're in the
> main worktree.

This idea of "please check all (immediate) directories under <dir> for
repositories to run maintenance" is an interesting one. It certainly
could be added later.

Your concern about worktrees is actually not a big deal. Since we check
the config for the "last run" of a job on a repo, we will not run the
same job immediately on a worktree after running it on the original
(unless the interval times are incredibly short).

> 2 - If I start several shells I only want one instance of 'git
> job-runner' to start. One way to handle that would be for the runner
> to hold a lock file in the directory it's given and quit if the lock
> is already taken. It should probably walk up the directory hierarchy
> to check for lock files as well in case I try to start another
> job-runner instance in a subdirectory.

This is an interesting idea. My gut reaction is that we don't want
to prevent users from running multiple processes if they want to. But
if they run the process twice in the same directory then they are
likely to be running on the same set of repos (barring "-c jobs.repo"
or equivalent).

Again, my hope is that we would solve this problem by having a cross-
platform "job service" that makes the user setup of editing .bashrc
irrelevant. Not only is that idea getting push back, we should allow
expert users the ability to customize these steps to their delight.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/Documentation/config/job.txt b/Documentation/config/job.txt
index efdb76afad3..2dfed3935fa 100644
--- a/Documentation/config/job.txt
+++ b/Documentation/config/job.txt
@@ -4,3 +4,10 @@  job.pack-files.batchSize::
 	part of `git run-job pack-files`. If not specified, then a
 	dynamic size calculation is run. See linkgit:git-run-job[1]
 	for more details.
+
+job.repo::
+	Store an absolute path to a repository that wants background
+	jobs run by `git job-runner`. This is a multi-valued config
+	option, but it must be stored in a config file seen by the
+	background runner. This may be the global or system config
+	depending on how `git job-runner` is launched on your system.
diff --git a/builtin/job-runner.c b/builtin/job-runner.c
index 135288bcaae..bed401f94bf 100644
--- a/builtin/job-runner.c
+++ b/builtin/job-runner.c
@@ -20,6 +20,9 @@  static int arg_repos_append(const struct option *opt,
 
 static int load_active_repos(struct string_list *repos)
 {
+	struct string_list_item *item;
+	const struct string_list *config_repos;
+
 	if (arg_repos.nr) {
 		struct string_list_item *item;
 		for (item = arg_repos.items;
@@ -29,6 +32,23 @@  static int load_active_repos(struct string_list *repos)
 		return 0;
 	}
 
+	config_repos = git_config_get_value_multi("job.repo");
+
+	for (item = config_repos->items;
+	     item && item < config_repos->items + config_repos->nr;
+	     item++) {
+		DIR *dir = opendir(item->string);
+
+		if (!dir)
+			continue;
+
+		closedir(dir);
+		string_list_append(repos, item->string);
+	}
+
+	string_list_sort(repos);
+	string_list_remove_duplicates(repos, 0);
+
 	return 0;
 }