diff mbox series

revision: Don't queue uninteresting commits

Message ID 20231011123534.119994-1-oystwa@gmail.com (mailing list archive)
State New, archived
Headers show
Series revision: Don't queue uninteresting commits | expand

Commit Message

Øystein Walle Oct. 11, 2023, 12:35 p.m. UTC
Currently all given commits are added to the topo_queue during
init_topo_walk(). Later on in get_revision_1() the uninteresting ones
are skipped because simplify_commit() tells it to.

Let's not add them to the topo_queue in the first place.

Signed-off-by: Øystein Walle <oystwa@gmail.com>
---

I noticed this while trying to understand the generation based algorithm
introduced in b45424181e (revision.c: generation-based topo-order
algorithm, 2018-11-01) in an attempt to write a similar one for
gitoxide. Comparing my solution to git's output I fixed a mismatch by
essentially doing this, and it turns out it works in git too. I am not
extremely confident, but all the tests pass...

For fun I also tried removing the UNINTERESTING check from
get_commit_action() altogether but then a lot of tests fail. I expected
that because both the flag and function predate the new algorithm.

 revision.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Oct. 11, 2023, 4:40 p.m. UTC | #1
Øystein Walle <oystwa@gmail.com> writes:

> Currently all given commits are added to the topo_queue during
> init_topo_walk(). Later on in get_revision_1() the uninteresting ones
> are skipped because simplify_commit() tells it to.
>
> Let's not add them to the topo_queue in the first place.

What is not described here is what benefit we are expecting to gain
by making this change.  Is anything leaking?  Are we showing wrong
output?  Is the effect something we can demonstrate, and more
importantly we can protect from future breakages, with a test or
two?
Øystein Walle Nov. 6, 2023, 11:28 a.m. UTC | #2
Hi, Junio, and sorry for the late response.

On Wed, 11 Oct 2023 at 18:40, Junio C Hamano <gitster@pobox.com> wrote:

> What is not described here is what benefit we are expecting to gain
> by making this change.  Is anything leaking?  Are we showing wrong
> output?  Is the effect something we can demonstrate, and more
> importantly we can protect from future breakages, with a test or
> two?

As far as I know there is no significant benefit to this change. The
only one I can think of is a case such as this:

    git rev-list some-rev ^a ^very ^large ^amount ^of ^negative ^revs ^here

but even then I would assume the work done by the algorithm in total is
so large that the work saved by this change is insignificant.

I was just a bit happy after grokking a piece of this code and let the
excitement get the best of me :-) I suggest we just drop it.

Øsse
diff mbox series

Patch

diff --git a/revision.c b/revision.c
index 2f4c53ea20..deeab813c7 100644
--- a/revision.c
+++ b/revision.c
@@ -3681,7 +3681,8 @@  static void init_topo_walk(struct rev_info *revs)
 	for (list = revs->commits; list; list = list->next) {
 		struct commit *c = list->item;
 
-		if (*(indegree_slab_at(&info->indegree, c)) == 1)
+		if (*(indegree_slab_at(&info->indegree, c)) == 1 &&
+		    !(c->object.flags & UNINTERESTING))
 			prio_queue_put(&info->topo_queue, c);
 	}