diff mbox series

[17/17] fetch: add fetch.writeCommitGraph config setting

Message ID 3c52385e5696887c40cab4a6b9b7923d60a0567c.1557330827.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Commit-graph: Write incremental files | expand

Commit Message

Linus Arver via GitGitGadget May 8, 2019, 3:54 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/fetch.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Ævar Arnfjörð Bjarmason May 9, 2019, 8:07 a.m. UTC | #1
On Wed, May 08 2019, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/fetch.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index b620fd54b4..cf0944bad5 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -23,6 +23,7 @@
>  #include "packfile.h"
>  #include "list-objects-filter-options.h"
>  #include "commit-reach.h"
> +#include "commit-graph.h"
>
>  static const char * const builtin_fetch_usage[] = {
>  	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
> @@ -62,6 +63,7 @@ static const char *submodule_prefix = "";
>  static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
>  static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
>  static int shown_url = 0;
> +static int fetch_write_commit_graph = 0;
>  static struct refspec refmap = REFSPEC_INIT_FETCH;
>  static struct list_objects_filter_options filter_options;
>  static struct string_list server_options = STRING_LIST_INIT_DUP;
> @@ -79,6 +81,11 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
>  		return 0;
>  	}
>
> +	if (!strcmp(k, "fetch.writecommitgraph")) {
> +		fetch_write_commit_graph = 1;
> +		return 0;
> +	}
> +
>  	if (!strcmp(k, "submodule.recurse")) {
>  		int r = git_config_bool(k, v) ?
>  			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
> @@ -1670,6 +1677,16 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
>
>  	string_list_clear(&list, 0);
>
> +	if (fetch_write_commit_graph) {
> +		int commit_graph_flags = COMMIT_GRAPH_SPLIT;
> +
> +		if (progress)
> +			commit_graph_flags |= COMMIT_GRAPH_PROGRESS;
> +
> +		write_commit_graph_reachable(get_object_directory(),
> +					     commit_graph_flags);
> +	}
> +
>  	close_all_packs(the_repository->objects);
>
>  	argv_array_pushl(&argv_gc_auto, "gc", "--auto", NULL);

I'm keen in general to refactor "git gc --auto" a bit so it moves away
from being an all-or-nothing to something where we can do an
"incremental" gc.

I'm happy to do that work, the main thing it's been blocked on is not
having some fast easy-to-lookup heuristic for one of those "incremental"
things.

The two obvious candidates are the commit-graph (I mainly wanted this on
"gc --auto" after clone), and pack-refs (but doing that is more
expensive).

So rather than have this patch I'd like to as noted in 00/17 get the
refactoring bits of the commit-graph in first.

Then some version of my WIP patch in
https://public-inbox.org/git/87lfzprkfc.fsf@evledraar.gmail.com/ where
we'd note the number of objects we had when we did the last commit-graph
in the graph itself.

Then "gc --auto" would look at that, then approximate_object_count(),
and have some percentage threshhold for doing a "do some of the gc"
task, which would just be a small change to need_to_gc() to make it
return/populate a "what needs to be done" rather than "yes/no".

That would give you what you want here, but also be a more general
solution. E.g. we'd write the graph on "clone" once "gc --auto" was
called there, as well as on "fetch".
Derrick Stolee May 9, 2019, 2:21 p.m. UTC | #2
On 5/9/2019 4:07 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> So rather than have this patch I'd like to as noted in 00/17 get the
> refactoring bits of the commit-graph in first.

Refactor-only series coming soon.

> Then some version of my WIP patch in
> https://public-inbox.org/git/87lfzprkfc.fsf@evledraar.gmail.com/ where
> we'd note the number of objects we had when we did the last commit-graph
> in the graph itself.
> 
> Then "gc --auto" would look at that, then approximate_object_count(),
> and have some percentage threshhold for doing a "do some of the gc"
> task, which would just be a small change to need_to_gc() to make it
> return/populate a "what needs to be done" rather than "yes/no".
> 
> That would give you what you want here, but also be a more general
> solution. E.g. we'd write the graph on "clone" once "gc --auto" was
> called there, as well as on "fetch".

I think this "gc --auto" idea is solid, but I also think there is value
in writing a commit-graph after _every_ fetch, not just one big enough
to trigger this new gc behavior. Perhaps your new metadata in the
commit-graph file could store multiple values for "number of objects since
X was cleaned up" where X is in { packs, reflog, commit-graph, etc. } and 
GC could consider each maintenance task independently. _That_ would make
this patch unnecessary.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/builtin/fetch.c b/builtin/fetch.c
index b620fd54b4..cf0944bad5 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -23,6 +23,7 @@ 
 #include "packfile.h"
 #include "list-objects-filter-options.h"
 #include "commit-reach.h"
+#include "commit-graph.h"
 
 static const char * const builtin_fetch_usage[] = {
 	N_("git fetch [<options>] [<repository> [<refspec>...]]"),
@@ -62,6 +63,7 @@  static const char *submodule_prefix = "";
 static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT;
 static int recurse_submodules_default = RECURSE_SUBMODULES_ON_DEMAND;
 static int shown_url = 0;
+static int fetch_write_commit_graph = 0;
 static struct refspec refmap = REFSPEC_INIT_FETCH;
 static struct list_objects_filter_options filter_options;
 static struct string_list server_options = STRING_LIST_INIT_DUP;
@@ -79,6 +81,11 @@  static int git_fetch_config(const char *k, const char *v, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(k, "fetch.writecommitgraph")) {
+		fetch_write_commit_graph = 1;
+		return 0;
+	}
+
 	if (!strcmp(k, "submodule.recurse")) {
 		int r = git_config_bool(k, v) ?
 			RECURSE_SUBMODULES_ON : RECURSE_SUBMODULES_OFF;
@@ -1670,6 +1677,16 @@  int cmd_fetch(int argc, const char **argv, const char *prefix)
 
 	string_list_clear(&list, 0);
 
+	if (fetch_write_commit_graph) {
+		int commit_graph_flags = COMMIT_GRAPH_SPLIT;
+
+		if (progress)
+			commit_graph_flags |= COMMIT_GRAPH_PROGRESS;
+
+		write_commit_graph_reachable(get_object_directory(),
+					     commit_graph_flags);
+	}
+
 	close_all_packs(the_repository->objects);
 
 	argv_array_pushl(&argv_gc_auto, "gc", "--auto", NULL);