diff mbox series

[GSoC,RFC] show-branch: use commit-slab for flag storage

Message ID 20250217055049.9217-1-meetsoni3017@gmail.com (mailing list archive)
State New
Headers show
Series [GSoC,RFC] show-branch: use commit-slab for flag storage | expand

Commit Message

Meet Soni Feb. 17, 2025, 5:50 a.m. UTC
Replace direct accesses to commit->object.flags with the commit-slab
mechanism. Introduce `get_commit_flags()` and `set_commit_flags()` to
retrieve and update flags, respectively, and include `revision.h` so that
the canonical UNINTERESTING definition is used.

Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
---
I'm not entirely sure what the TODO comment meant by storing a pointer to
the "ref name" directly, so I've assumed that the intent was to store
flags (of type int) directly in the commit-slab instead of commit->object.

I've tested these changes using:
 - test suite -- they passed.
 - github ci result -- https://github.com/inosmeet/git/actions/runs/13355488433

The review from Junio that led to this TODO comment [1]:
> Another place we could use commit-slab in this program, which I
> think is a more interesting application, is to use it to store a
> bitmask with runtime-computed width to replace those object->flags
> bits, which will allow us to lift the MAX_REVS limitation.

Ultimately, I'm interested in implementing this change and would appreciate
some guidance. Specifically, does this mean I should define the commit-slab
using a struct containing both an int and a size, instead of just an int?

[1]: https://lore.kernel.org/git/xmqq36yud9bp.fsf@gitster-ct.c.googlers.com/

 builtin/show-branch.c | 59 +++++++++++++++++++++++++------------------
 1 file changed, 34 insertions(+), 25 deletions(-)


base-commit: e2067b49ecaef9b7f51a17ce251f9207f72ef52d

Comments

Junio C Hamano Feb. 18, 2025, 6:40 p.m. UTC | #1
Meet Soni <meetsoni3017@gmail.com> writes:

> Replace direct accesses to commit->object.flags with the commit-slab
> mechanism. Introduce `get_commit_flags()` and `set_commit_flags()` to
> retrieve and update flags, respectively, and include `revision.h` so that
> the canonical UNINTERESTING definition is used.
>
> Signed-off-by: Meet Soni <meetsoni3017@gmail.com>

Ohhhh.  I thought people somehow have "refactored" the commit
traversal code here to share more with the machinery used by the
"log" family of commands, but the change in this patch being
contained within the single "show-branch" file indicates that it is
not the case, which is good ;-)

And the MAX_REVS limitation has been with us from the very beginning
of the "show-branch" command.  Lifting it is very good ;-) ;-).

> ---
> I'm not entirely sure what the TODO comment meant by storing a pointer to
> the "ref name" directly, so I've assumed that the intent was to store
> flags (of type int) directly in the commit-slab instead of commit->object.

It has been forever since I looked at the code around here the last
time, but I suspect that it meant the final mapping the code makes
at the output phase from the bit position in the flags bits to which
reference the bit (i.e. "I am reachable from that ref") could be
omitted if we make the slab entry a set of (interned) refnames.

But I think using a slab whose element is still a bag of bits that
is wider than object.flags word is is the most straight-forward way
to lift MAX_REVS limitation.  If we can leave everything else
unchanged, that would be great.

> I've tested these changes using:
>  - test suite -- they passed.
>  - github ci result -- https://github.com/inosmeet/git/actions/runs/13355488433

New tests that traverse from more than historical MAX_REVS would be
the most interesting.

It is not exactly surprising if the existing tests do not exercise
such a case (even though it probably should have been checking that
the command refused to accept more than MAX_REVS refs to prevent it
from producing garbage results).

If there were such a test to illustrate what happens when too many
refs are given, we can demonstrate how the behaviour changes, for
the better, with this change ;-)

However ...

> +static int get_commit_flags(struct commit *commit)
> +{
> +	int *result = commit_flags_peek(&commit_flags, commit);
> +	return result ? *result : 0;
> +}
> +
> +static void set_commit_flags(struct commit *commit, int flags)
> +{
> +	*commit_flags_at(&commit_flags, commit) = flags;
> +}

... it does not make much sense to use the slab mechanism if we are
still limited to bitsizeof(type(flags)).  If your "int" is 32 bit,
and the command line fed 100 revs, we'd want a flags parameter that
can house at least 100 bits passed into the "set" function, and
yields the result that can hold at least 100 bits from the "get"
function.  I'd expect that the interface would be more like

    static int get_commit_flags(struct commit *commit, unsigned flags[]);
    static int set_commit_flags(struct commit *commit, unsigned flags[]);

with a file-scope static variable signaling how many bits we are
dealing with (allowing these functions to infer how many words
flags[] array has), and the return values from these helpers to
indicate success (with 0) and failure (with a negative value).  On a
32-bit platform, you'd need at least 4 words in flags[] array to
deal with 100 revs.  On a 64-bit box, 2 words would be sufficient.

I would imagine that we would need two more helpers

    static int set_flag_bit(unsigned flags[], unsigned n);
    static int get_flag_bit(unsigned flags[], unsigned n);

to query the n-th bit in a set of bits stored in flags[] array.

Thanks.
diff mbox series

Patch

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index fce6b404e9..909a22990d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -9,6 +9,7 @@ 
 #include "hex.h"
 #include "pretty.h"
 #include "refs.h"
+#include "revision.h"
 #include "color.h"
 #include "strvec.h"
 #include "object-name.h"
@@ -33,18 +34,25 @@  static int showbranch_use_color = -1;
 
 static struct strvec default_args = STRVEC_INIT;
 
-/*
- * TODO: convert this use of commit->object.flags to commit-slab
- * instead to store a pointer to ref name directly. Then use the same
- * UNINTERESTING definition from revision.h here.
- */
-#define UNINTERESTING	01
-
 #define REV_SHIFT	 2
 #define MAX_REVS	(FLAG_BITS - REV_SHIFT) /* should not exceed bits_per_int - REV_SHIFT */
 
 #define DEFAULT_REFLOG	4
 
+define_commit_slab(commit_flags, int);
+static struct commit_flags commit_flags;
+
+static int get_commit_flags(struct commit *commit)
+{
+	int *result = commit_flags_peek(&commit_flags, commit);
+	return result ? *result : 0;
+}
+
+static void set_commit_flags(struct commit *commit, int flags)
+{
+	*commit_flags_at(&commit_flags, commit) = flags;
+}
+
 static const char *get_color_code(int idx)
 {
 	if (want_color(showbranch_use_color))
@@ -64,7 +72,7 @@  static struct commit *interesting(struct commit_list *list)
 	while (list) {
 		struct commit *commit = list->item;
 		list = list->next;
-		if (commit->object.flags & UNINTERESTING)
+		if (get_commit_flags(commit) & UNINTERESTING)
 			continue;
 		return commit;
 	}
@@ -215,7 +223,7 @@  static void name_commits(struct commit_list *list,
 
 static int mark_seen(struct commit *commit, struct commit_list **seen_p)
 {
-	if (!commit->object.flags) {
+	if (!get_commit_flags(commit)) {
 		commit_list_insert(commit, seen_p);
 		return 1;
 	}
@@ -233,7 +241,7 @@  static void join_revs(struct commit_list **list_p,
 		struct commit_list *parents;
 		int still_interesting = !!interesting(*list_p);
 		struct commit *commit = pop_commit(list_p);
-		int flags = commit->object.flags & all_mask;
+		int flags = get_commit_flags(commit) & all_mask;
 
 		if (!still_interesting && extra <= 0)
 			break;
@@ -245,14 +253,14 @@  static void join_revs(struct commit_list **list_p,
 
 		while (parents) {
 			struct commit *p = parents->item;
-			int this_flag = p->object.flags;
+			int this_flag = get_commit_flags(p);
 			parents = parents->next;
 			if ((this_flag & flags) == flags)
 				continue;
 			repo_parse_commit(the_repository, p);
 			if (mark_seen(p, seen_p) && !still_interesting)
 				extra--;
-			p->object.flags |= flags;
+			set_commit_flags(p, get_commit_flags(p) | flags);
 			commit_list_insert_by_date(p, list_p);
 		}
 	}
@@ -271,8 +279,8 @@  static void join_revs(struct commit_list **list_p,
 			struct commit *c = s->item;
 			struct commit_list *parents;
 
-			if (((c->object.flags & all_revs) != all_revs) &&
-			    !(c->object.flags & UNINTERESTING))
+			if (((get_commit_flags(c) & all_revs) != all_revs) &&
+			    !(get_commit_flags(c) & UNINTERESTING))
 				continue;
 
 			/* The current commit is either a merge base or
@@ -285,8 +293,8 @@  static void join_revs(struct commit_list **list_p,
 			while (parents) {
 				struct commit *p = parents->item;
 				parents = parents->next;
-				if (!(p->object.flags & UNINTERESTING)) {
-					p->object.flags |= UNINTERESTING;
+				if (!(get_commit_flags(p) & UNINTERESTING)) {
+					set_commit_flags(p, get_commit_flags(p) | UNINTERESTING);
 					changed = 1;
 				}
 			}
@@ -513,12 +521,12 @@  static int show_merge_base(const struct commit_list *seen, int num_rev)
 
 	for (const struct commit_list *s = seen; s; s = s->next) {
 		struct commit *commit = s->item;
-		int flags = commit->object.flags & all_mask;
+		int flags = get_commit_flags(commit) & all_mask;
 		if (!(flags & UNINTERESTING) &&
 		    ((flags & all_revs) == all_revs)) {
 			puts(oid_to_hex(&commit->object.oid));
 			exit_status = 0;
-			commit->object.flags |= UNINTERESTING;
+			set_commit_flags(commit, get_commit_flags(commit) | UNINTERESTING);
 		}
 	}
 	return exit_status;
@@ -534,9 +542,9 @@  static int show_independent(struct commit **rev,
 		struct commit *commit = rev[i];
 		unsigned int flag = rev_mask[i];
 
-		if (commit->object.flags == flag)
+		if (get_commit_flags(commit) == flag)
 			puts(oid_to_hex(&commit->object.oid));
-		commit->object.flags |= UNINTERESTING;
+		set_commit_flags(commit, get_commit_flags(commit) | UNINTERESTING);
 	}
 	return 0;
 }
@@ -603,7 +611,7 @@  static int omit_in_dense(struct commit *commit, struct commit **rev, int n)
 	for (i = 0; i < n; i++)
 		if (rev[i] == commit)
 			return 0;
-	flag = commit->object.flags;
+	flag = get_commit_flags(commit);
 	for (i = count = 0; i < n; i++) {
 		if (flag & (1u << (i + REV_SHIFT)))
 			count++;
@@ -702,6 +710,7 @@  int cmd_show_branch(int ac,
 	int ret;
 
 	init_commit_name_slab(&name_slab);
+	init_commit_flags(&commit_flags);
 
 	git_config(git_show_branch_config, NULL);
 
@@ -877,13 +886,13 @@  int cmd_show_branch(int ac,
 		 * and so on.  REV_SHIFT bits from bit 0 are used for
 		 * internal bookkeeping.
 		 */
-		commit->object.flags |= flag;
-		if (commit->object.flags == flag)
+		set_commit_flags(commit, get_commit_flags(commit) | flag);
+		if (get_commit_flags(commit) == flag)
 			commit_list_insert_by_date(commit, &list);
 		rev[num_rev] = commit;
 	}
 	for (i = 0; i < num_rev; i++)
-		rev_mask[i] = rev[i]->object.flags;
+		rev_mask[i] = get_commit_flags(rev[i]);
 
 	if (0 <= extra)
 		join_revs(&list, &seen, num_rev, extra);
@@ -951,7 +960,7 @@  int cmd_show_branch(int ac,
 
 	for (struct commit_list *l = seen; l; l = l->next) {
 		struct commit *commit = l->item;
-		int this_flag = commit->object.flags;
+		int this_flag = get_commit_flags(commit);
 		int is_merge_point = ((this_flag & all_revs) == all_revs);
 
 		shown_merge_point |= is_merge_point;