
[v2,2/3] reftable/stack: use geometric table compaction

Message ID def7008452303f71c1fa469609bc199c629a19ec.1711060820.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series reftable/stack: use geometric table compaction | expand

Commit Message

Justin Tobler March 21, 2024, 10:40 p.m. UTC
From: Justin Tobler <jltobler@gmail.com>

To reduce the number of on-disk reftables, compaction is performed.
Contiguous tables whose sizes have the same binary log value are grouped
into segments. The segment that has both the lowest binary log value and
contains more than one table is used as the starting point when
identifying the compaction segment.

Since segments containing a single table are not initially considered
for compaction, if the table appended to the list does not match the
log value of the previous table, no compaction occurs for the new table.
The table list can therefore grow without bound. This can be
demonstrated by repeating the following sequence:

git branch -f foo
git branch -d foo

Each operation results in a new table being written, with no compaction
occurring until a separate operation produces a table matching the log
value of the previous table.

Instead, to avoid unbounded growth of the table list, the compaction
strategy is updated to ensure that tables follow a geometric sequence
after each operation. This is done by walking the table list in reverse
index order to identify the start and end of the compaction segment.
The compaction segment end is found by identifying the first table whose
preceding table is smaller than twice its size. Next, the compaction
segment start is found by iterating through the remaining tables in the
list, checking whether the previous table size is less than twice the
cumulative size of the tables from the segment end. This ensures the
correct segment start is found and that the newly compacted table does
not violate the geometric sequence.

When creating 10,000 references, the new strategy has no performance
impact:

Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
  Time (mean ± σ):     26.516 s ±  0.047 s    [User: 17.864 s, System: 8.491 s]
  Range (min … max):   26.447 s … 26.569 s    10 runs

Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
  Time (mean ± σ):     26.417 s ±  0.028 s    [User: 17.738 s, System: 8.500 s]
  Range (min … max):   26.366 s … 26.444 s    10 runs

Summary
  update-ref: create refs sequentially (revision = HEAD) ran
    1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)

Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
tables and are therefore updated to specify the correct new table count.
Since compaction is more aggressive in ensuring tables maintain a
geometric sequence, the expected table count is reduced in these tests.
In `reftable/stack_test.c` tests related to `sizes_to_segments()` are
removed because the function is no longer needed. Also, the
`test_suggest_compaction_segment()` test is updated to better showcase
and reflect the new geometric compaction behavior.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 reftable/stack.c           | 109 ++++++++++++++++---------------------
 reftable/stack.h           |   3 -
 reftable/stack_test.c      |  66 +++++-----------------
 t/t0610-reftable-basics.sh |  43 ++++++++++-----
 4 files changed, 91 insertions(+), 130 deletions(-)

Comments

Patrick Steinhardt March 22, 2024, 1:25 a.m. UTC | #1
On Thu, Mar 21, 2024 at 10:40:18PM +0000, Justin Tobler via GitGitGadget wrote:
> From: Justin Tobler <jltobler@gmail.com>
> 
> To reduce the number of on-disk reftables, compaction is performed.
> Contiguous tables with the same binary log value of size are grouped
> into segments. The segment that has both the lowest binary log value and
> contains more than one table is set as the starting point when
> identifying the compaction segment.
> 
> Since segments containing a single table are not initially considered
> for compaction, if the table appended to the list does not match the
> previous table log value, no compaction occurs for the new table. It is
> therefore possible for unbounded growth of the table list. This can be
> demonstrated by repeating the following sequence:
> 
> git branch -f foo
> git branch -d foo
> 
> Each operation results in a new table being written with no compaction
> occurring until a separate operation produces a table matching the
> previous table log value.
> 
> Instead, to avoid unbounded growth of the table list, the compaction
> strategy is updated to ensure tables follow a geometric sequence after
> each operation. This is done by walking the table list in reverse index
> order to identify the compaction segment start and end. The compaction
> segment end is found by identifying the first table which has a
> preceding table size less than twice the current table. Next, the
> compaction segment start is found iterating through the remaining tables
> in the list checking if the previous table size is less than twice the
> cumulative of tables from the segment end. This ensures the correct
> segment start is found and that the newly compacted table does not
> violate the geometric sequence.

I don't think we need to go into so much detail how exactly the
algorithm is working -- these kind of comments should ideally exist in
the code. What would be more interesting to explain here is _why_ we
chose the new algorithm over the old one instead of just trying to fix
the issue.

Other than that this patch LGTM.

Patrick

> When creating 10 thousand references, the new strategy has no
> performance impact:
> 
> Benchmark 1: update-ref: create refs sequentially (revision = HEAD~)
>   Time (mean ± σ):     26.516 s ±  0.047 s    [User: 17.864 s, System: 8.491 s]
>   Range (min … max):   26.447 s … 26.569 s    10 runs
> 
> Benchmark 2: update-ref: create refs sequentially (revision = HEAD)
>   Time (mean ± σ):     26.417 s ±  0.028 s    [User: 17.738 s, System: 8.500 s]
>   Range (min … max):   26.366 s … 26.444 s    10 runs
> 
> Summary
>   update-ref: create refs sequentially (revision = HEAD) ran
>     1.00 ± 0.00 times faster than update-ref: create refs sequentially (revision = HEAD~)
> 
> Some tests in `t0610-reftable-basics.sh` assert the on-disk state of
> tables and are therefore updated to specify the correct new table count.
> Since compaction is more aggressive in ensuring tables maintain a
> geometric sequence, the expected table count is reduced in these tests.
> In `reftable/stack_test.c` tests related to `sizes_to_segments()` are
> removed because the function is no longer needed. Also, the
> `test_suggest_compaction_segment()` test is updated to better showcase
> and reflect the new geometric compaction behavior.
> 
> Signed-off-by: Justin Tobler <jltobler@gmail.com>
> ---
>  reftable/stack.c           | 109 ++++++++++++++++---------------------
>  reftable/stack.h           |   3 -
>  reftable/stack_test.c      |  66 +++++-----------------
>  t/t0610-reftable-basics.sh |  43 ++++++++++-----
>  4 files changed, 91 insertions(+), 130 deletions(-)
> 
> diff --git a/reftable/stack.c b/reftable/stack.c
> index 2370d93d13b..ef55dc75cde 100644
> --- a/reftable/stack.c
> +++ b/reftable/stack.c
> @@ -1214,75 +1214,62 @@ static int segment_size(struct segment *s)
>  	return s->end - s->start;
>  }
>  
> -int fastlog2(uint64_t sz)
> -{
> -	int l = 0;
> -	if (sz == 0)
> -		return 0;
> -	for (; sz; sz /= 2) {
> -		l++;
> -	}
> -	return l - 1;
> -}
> -
> -struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n)
> -{
> -	struct segment *segs = reftable_calloc(n, sizeof(*segs));
> -	struct segment cur = { 0 };
> -	size_t next = 0, i;
> -
> -	if (n == 0) {
> -		*seglen = 0;
> -		return segs;
> -	}
> -	for (i = 0; i < n; i++) {
> -		int log = fastlog2(sizes[i]);
> -		if (cur.log != log && cur.bytes > 0) {
> -			struct segment fresh = {
> -				.start = i,
> -			};
> -
> -			segs[next++] = cur;
> -			cur = fresh;
> -		}
> -
> -		cur.log = log;
> -		cur.end = i + 1;
> -		cur.bytes += sizes[i];
> -	}
> -	segs[next++] = cur;
> -	*seglen = next;
> -	return segs;
> -}
> -
>  struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
>  {
> -	struct segment min_seg = {
> -		.log = 64,
> -	};
> -	struct segment *segs;
> -	size_t seglen = 0, i;
> -
> -	segs = sizes_to_segments(&seglen, sizes, n);
> -	for (i = 0; i < seglen; i++) {
> -		if (segment_size(&segs[i]) == 1)
> -			continue;
> +	struct segment seg = { 0 };
> +	uint64_t bytes;
> +	size_t i;
>  
> -		if (segs[i].log < min_seg.log)
> -			min_seg = segs[i];
> -	}
> +	/*
> +	 * If there are no tables or only a single one then we don't have to
> +	 * compact anything. The sequence is geometric by definition already.
> +	 */
> +	if (n <= 1)
> +		return seg;
>  
> -	while (min_seg.start > 0) {
> -		size_t prev = min_seg.start - 1;
> -		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev]))
> +	/*
> +	 * Find the ending table of the compaction segment needed to restore the
> +	 * geometric sequence.
> +	 *
> +	 * To do so, we iterate backwards starting from the most recent table
> +	 * until a valid segment end is found. If the preceding table is smaller
> +	 * than the current table multiplied by the geometric factor (2), the
> +	 * current table is set as the compaction segment end.
> +	 *
> +	 * Tables after the ending point are not added to the byte count because
> +	 * they are already valid members of the geometric sequence. Due to the
> +	 * properties of a geometric sequence, it is not possible for the sum of
> +	 * these tables to exceed the value of the ending point table.
> +	 */
> +	for (i = n - 1; i > 0; i--) {
> +		if (sizes[i - 1] < sizes[i] * 2) {
> +			seg.end = i + 1;
> +			bytes = sizes[i];
>  			break;
> +		}
> +	}
> +
> +	/*
> +	 * Find the starting table of the compaction segment by iterating
> +	 * through the remaining tables and keeping track of the accumulated
> +	 * size of all tables seen from the segment end table.
> +	 *
> +	 * Note that we keep iterating even after we have found the first
> +	 * starting point. This is because there may be tables in the stack
> +	 * preceding that first starting point which violate the geometric
> +	 * sequence.
> +	 */
> +	for (; i > 0; i--) {
> +		uint64_t curr = bytes;
> +		bytes += sizes[i - 1];
>  
> -		min_seg.start = prev;
> -		min_seg.bytes += sizes[prev];
> +		if (sizes[i - 1] < curr * 2) {
> +			seg.start = i - 1;
> +			seg.bytes = bytes;
> +		}
>  	}
>  
> -	reftable_free(segs);
> -	return min_seg;
> +	return seg;
>  }
>  
>  static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
> diff --git a/reftable/stack.h b/reftable/stack.h
> index d919455669e..656f896cc28 100644
> --- a/reftable/stack.h
> +++ b/reftable/stack.h
> @@ -33,12 +33,9 @@ int read_lines(const char *filename, char ***lines);
>  
>  struct segment {
>  	size_t start, end;
> -	int log;
>  	uint64_t bytes;
>  };
>  
> -int fastlog2(uint64_t sz);
> -struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n);
>  struct segment suggest_compaction_segment(uint64_t *sizes, size_t n);
>  
>  #endif
> diff --git a/reftable/stack_test.c b/reftable/stack_test.c
> index 509f4866236..e5f6ff5c9e4 100644
> --- a/reftable/stack_test.c
> +++ b/reftable/stack_test.c
> @@ -720,59 +720,14 @@ static void test_reftable_stack_hash_id(void)
>  	clear_dir(dir);
>  }
>  
> -static void test_log2(void)
> -{
> -	EXPECT(1 == fastlog2(3));
> -	EXPECT(2 == fastlog2(4));
> -	EXPECT(2 == fastlog2(5));
> -}
> -
> -static void test_sizes_to_segments(void)
> -{
> -	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
> -	/* .................0  1  2  3  4  5 */
> -
> -	size_t seglen = 0;
> -	struct segment *segs =
> -		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
> -	EXPECT(segs[2].log == 3);
> -	EXPECT(segs[2].start == 5);
> -	EXPECT(segs[2].end == 6);
> -
> -	EXPECT(segs[1].log == 2);
> -	EXPECT(segs[1].start == 2);
> -	EXPECT(segs[1].end == 5);
> -	reftable_free(segs);
> -}
> -
> -static void test_sizes_to_segments_empty(void)
> -{
> -	size_t seglen = 0;
> -	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
> -	EXPECT(seglen == 0);
> -	reftable_free(segs);
> -}
> -
> -static void test_sizes_to_segments_all_equal(void)
> -{
> -	uint64_t sizes[] = { 5, 5 };
> -	size_t seglen = 0;
> -	struct segment *segs =
> -		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
> -	EXPECT(seglen == 1);
> -	EXPECT(segs[0].start == 0);
> -	EXPECT(segs[0].end == 2);
> -	reftable_free(segs);
> -}
> -
>  static void test_suggest_compaction_segment(void)
>  {
> -	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
> +	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
>  	/* .................0    1    2  3   4  5  6 */
>  	struct segment min =
>  		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
> -	EXPECT(min.start == 2);
> -	EXPECT(min.end == 7);
> +	EXPECT(min.start == 1);
> +	EXPECT(min.end == 10);
>  }
>  
>  static void test_suggest_compaction_segment_nothing(void)
> @@ -884,6 +839,17 @@ static void test_empty_add(void)
>  	reftable_stack_destroy(st2);
>  }
>  
> +static int fastlog2(uint64_t sz)
> +{
> +	int l = 0;
> +	if (sz == 0)
> +		return 0;
> +	for (; sz; sz /= 2) {
> +		l++;
> +	}
> +	return l - 1;
> +}
> +
>  static void test_reftable_stack_auto_compaction(void)
>  {
>  	struct reftable_write_options cfg = { 0 };
> @@ -1072,7 +1038,6 @@ static void test_reftable_stack_compaction_concurrent_clean(void)
>  int stack_test_main(int argc, const char *argv[])
>  {
>  	RUN_TEST(test_empty_add);
> -	RUN_TEST(test_log2);
>  	RUN_TEST(test_names_equal);
>  	RUN_TEST(test_parse_names);
>  	RUN_TEST(test_read_file);
> @@ -1092,9 +1057,6 @@ int stack_test_main(int argc, const char *argv[])
>  	RUN_TEST(test_reftable_stack_update_index_check);
>  	RUN_TEST(test_reftable_stack_uptodate);
>  	RUN_TEST(test_reftable_stack_validate_refname);
> -	RUN_TEST(test_sizes_to_segments);
> -	RUN_TEST(test_sizes_to_segments_all_equal);
> -	RUN_TEST(test_sizes_to_segments_empty);
>  	RUN_TEST(test_suggest_compaction_segment);
>  	RUN_TEST(test_suggest_compaction_segment_nothing);
>  	return 0;
> diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
> index 686781192eb..e6c3f94d874 100755
> --- a/t/t0610-reftable-basics.sh
> +++ b/t/t0610-reftable-basics.sh
> @@ -293,12 +293,24 @@ test_expect_success 'ref transaction: writes cause auto-compaction' '
>  	test_line_count = 1 repo/.git/reftable/tables.list &&
>  
>  	test_commit -C repo --no-tag A &&
> -	test_line_count = 2 repo/.git/reftable/tables.list &&
> +	test_line_count = 1 repo/.git/reftable/tables.list &&
>  
>  	test_commit -C repo --no-tag B &&
>  	test_line_count = 1 repo/.git/reftable/tables.list
>  '
>  
> +test_expect_success 'ref transaction: alternating table sizes are compacted' '
> +	test_when_finished "rm -rf repo" &&
> +	git init repo &&
> +	test_commit -C repo A &&
> +	for i in $(test_seq 20)
> +	do
> +		git -C repo branch -f foo &&
> +		git -C repo branch -d foo || return 1
> +	done &&
> +	test_line_count = 2 repo/.git/reftable/tables.list
> +'
> +
>  check_fsync_events () {
>  	local trace="$1" &&
>  	shift &&
> @@ -324,7 +336,7 @@ test_expect_success 'ref transaction: writes are synced' '
>  		git -C repo -c core.fsync=reference \
>  		-c core.fsyncMethod=fsync update-ref refs/heads/branch HEAD &&
>  	check_fsync_events trace2.txt <<-EOF
> -	"name":"hardware-flush","count":2
> +	"name":"hardware-flush","count":4
>  	EOF
>  '
>  
> @@ -346,8 +358,8 @@ test_expect_success 'pack-refs: compacts tables' '
>  
>  	test_commit -C repo A &&
>  	ls -1 repo/.git/reftable >table-files &&
> -	test_line_count = 4 table-files &&
> -	test_line_count = 3 repo/.git/reftable/tables.list &&
> +	test_line_count = 3 table-files &&
> +	test_line_count = 2 repo/.git/reftable/tables.list &&
>  
>  	git -C repo pack-refs &&
>  	ls -1 repo/.git/reftable >table-files &&
> @@ -379,7 +391,7 @@ do
>  			umask $umask &&
>  			git init --shared=true repo &&
>  			test_commit -C repo A &&
> -			test_line_count = 3 repo/.git/reftable/tables.list
> +			test_line_count = 2 repo/.git/reftable/tables.list
>  		) &&
>  		git -C repo pack-refs &&
>  		test_expect_perms "-rw-rw-r--" repo/.git/reftable/tables.list &&
> @@ -747,12 +759,13 @@ test_expect_success 'worktree: pack-refs in main repo packs main refs' '
>  	test_when_finished "rm -rf repo worktree" &&
>  	git init repo &&
>  	test_commit -C repo A &&
> -	git -C repo worktree add ../worktree &&
> +	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
> +	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
>  
> -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
> -	test_line_count = 4 repo/.git/reftable/tables.list &&
> +	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
> +	test_line_count = 3 repo/.git/reftable/tables.list &&
>  	git -C repo pack-refs &&
> -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
> +	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
>  	test_line_count = 1 repo/.git/reftable/tables.list
>  '
>  
> @@ -760,19 +773,21 @@ test_expect_success 'worktree: pack-refs in worktree packs worktree refs' '
>  	test_when_finished "rm -rf repo worktree" &&
>  	git init repo &&
>  	test_commit -C repo A &&
> -	git -C repo worktree add ../worktree &&
> +	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
> +	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
>  
> -	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
> -	test_line_count = 4 repo/.git/reftable/tables.list &&
> +	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
> +	test_line_count = 3 repo/.git/reftable/tables.list &&
>  	git -C worktree pack-refs &&
>  	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
> -	test_line_count = 4 repo/.git/reftable/tables.list
> +	test_line_count = 3 repo/.git/reftable/tables.list
>  '
>  
>  test_expect_success 'worktree: creating shared ref updates main stack' '
>  	test_when_finished "rm -rf repo worktree" &&
>  	git init repo &&
>  	test_commit -C repo A &&
> +	test_commit -C repo B &&
>  
>  	git -C repo worktree add ../worktree &&
>  	git -C repo pack-refs &&
> @@ -780,7 +795,7 @@ test_expect_success 'worktree: creating shared ref updates main stack' '
>  	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
>  	test_line_count = 1 repo/.git/reftable/tables.list &&
>  
> -	git -C worktree update-ref refs/heads/shared HEAD &&
> +	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/heads/shared HEAD &&
>  	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
>  	test_line_count = 2 repo/.git/reftable/tables.list
>  '
> -- 
> gitgitgadget
>
Karthik Nayak March 27, 2024, 1:24 p.m. UTC | #2
"Justin Tobler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Justin Tobler <jltobler@gmail.com>
>
> To reduce the number of on-disk reftables, compaction is performed.
> Contiguous tables with the same binary log value of size are grouped
> into segments. The segment that has both the lowest binary log value and
> contains more than one table is set as the starting point when
> identifying the compaction segment.
>
> Since segments containing a single table are not initially considered
> for compaction, if the table appended to the list does not match the
> previous table log value, no compaction occurs for the new table. It is
> therefore possible for unbounded growth of the table list. This can be
> demonstrated by repeating the following sequence:
>

Nit: A numerical example would really help make this simpler to understand.

> +	/*
> +	 * Find the ending table of the compaction segment needed to restore the
> +	 * geometric sequence.
> +	 *
> +	 * To do so, we iterate backwards starting from the most recent table
> +	 * until a valid segment end is found. If the preceding table is smaller
> +	 * than the current table multiplied by the geometric factor (2), the
> +	 * current table is set as the compaction segment end.
> +	 *
> +	 * Tables after the ending point are not added to the byte count because
> +	 * they are already valid members of the geometric sequence. Due to the
> +	 * properties of a geometric sequence, it is not possible for the sum of
> +	 * these tables to exceed the value of the ending point table.
> +	 */
> +	for (i = n - 1; i > 0; i--) {
> +		if (sizes[i - 1] < sizes[i] * 2) {
> +			seg.end = i + 1;
> +			bytes = sizes[i];
>  			break;
> +		}
> +	}
> +
> +	/*
> +	 * Find the starting table of the compaction segment by iterating
> +	 * through the remaining tables and keeping track of the accumulated
> +	 * size of all tables seen from the segment end table.
> +	 *

Nit: we need the accumulated sum because the tables from the end of the
segment will be recursively merged backwards. This might be worthwhile
to add here.


>  static void test_suggest_compaction_segment(void)
>  {
> -	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
> +	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
>  	/* .................0    1    2  3   4  5  6 */

Nit: since we're here, maybe worthwhile cleaning up this comment. Not
sure what it actually is for.

Patch

diff --git a/reftable/stack.c b/reftable/stack.c
index 2370d93d13b..ef55dc75cde 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -1214,75 +1214,62 @@  static int segment_size(struct segment *s)
 	return s->end - s->start;
 }
 
-int fastlog2(uint64_t sz)
-{
-	int l = 0;
-	if (sz == 0)
-		return 0;
-	for (; sz; sz /= 2) {
-		l++;
-	}
-	return l - 1;
-}
-
-struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n)
-{
-	struct segment *segs = reftable_calloc(n, sizeof(*segs));
-	struct segment cur = { 0 };
-	size_t next = 0, i;
-
-	if (n == 0) {
-		*seglen = 0;
-		return segs;
-	}
-	for (i = 0; i < n; i++) {
-		int log = fastlog2(sizes[i]);
-		if (cur.log != log && cur.bytes > 0) {
-			struct segment fresh = {
-				.start = i,
-			};
-
-			segs[next++] = cur;
-			cur = fresh;
-		}
-
-		cur.log = log;
-		cur.end = i + 1;
-		cur.bytes += sizes[i];
-	}
-	segs[next++] = cur;
-	*seglen = next;
-	return segs;
-}
-
 struct segment suggest_compaction_segment(uint64_t *sizes, size_t n)
 {
-	struct segment min_seg = {
-		.log = 64,
-	};
-	struct segment *segs;
-	size_t seglen = 0, i;
-
-	segs = sizes_to_segments(&seglen, sizes, n);
-	for (i = 0; i < seglen; i++) {
-		if (segment_size(&segs[i]) == 1)
-			continue;
+	struct segment seg = { 0 };
+	uint64_t bytes;
+	size_t i;
 
-		if (segs[i].log < min_seg.log)
-			min_seg = segs[i];
-	}
+	/*
+	 * If there are no tables or only a single one then we don't have to
+	 * compact anything. The sequence is geometric by definition already.
+	 */
+	if (n <= 1)
+		return seg;
 
-	while (min_seg.start > 0) {
-		size_t prev = min_seg.start - 1;
-		if (fastlog2(min_seg.bytes) < fastlog2(sizes[prev]))
+	/*
+	 * Find the ending table of the compaction segment needed to restore the
+	 * geometric sequence.
+	 *
+	 * To do so, we iterate backwards starting from the most recent table
+	 * until a valid segment end is found. If the preceding table is smaller
+	 * than the current table multiplied by the geometric factor (2), the
+	 * current table is set as the compaction segment end.
+	 *
+	 * Tables after the ending point are not added to the byte count because
+	 * they are already valid members of the geometric sequence. Due to the
+	 * properties of a geometric sequence, it is not possible for the sum of
+	 * these tables to exceed the value of the ending point table.
+	 */
+	for (i = n - 1; i > 0; i--) {
+		if (sizes[i - 1] < sizes[i] * 2) {
+			seg.end = i + 1;
+			bytes = sizes[i];
 			break;
+		}
+	}
+
+	/*
+	 * Find the starting table of the compaction segment by iterating
+	 * through the remaining tables and keeping track of the accumulated
+	 * size of all tables seen from the segment end table.
+	 *
+	 * Note that we keep iterating even after we have found the first
+	 * starting point. This is because there may be tables in the stack
+	 * preceding that first starting point which violate the geometric
+	 * sequence.
+	 */
+	for (; i > 0; i--) {
+		uint64_t curr = bytes;
+		bytes += sizes[i - 1];
 
-		min_seg.start = prev;
-		min_seg.bytes += sizes[prev];
+		if (sizes[i - 1] < curr * 2) {
+			seg.start = i - 1;
+			seg.bytes = bytes;
+		}
 	}
 
-	reftable_free(segs);
-	return min_seg;
+	return seg;
 }
 
 static uint64_t *stack_table_sizes_for_compaction(struct reftable_stack *st)
diff --git a/reftable/stack.h b/reftable/stack.h
index d919455669e..656f896cc28 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -33,12 +33,9 @@  int read_lines(const char *filename, char ***lines);
 
 struct segment {
 	size_t start, end;
-	int log;
 	uint64_t bytes;
 };
 
-int fastlog2(uint64_t sz);
-struct segment *sizes_to_segments(size_t *seglen, uint64_t *sizes, size_t n);
 struct segment suggest_compaction_segment(uint64_t *sizes, size_t n);
 
 #endif
diff --git a/reftable/stack_test.c b/reftable/stack_test.c
index 509f4866236..e5f6ff5c9e4 100644
--- a/reftable/stack_test.c
+++ b/reftable/stack_test.c
@@ -720,59 +720,14 @@  static void test_reftable_stack_hash_id(void)
 	clear_dir(dir);
 }
 
-static void test_log2(void)
-{
-	EXPECT(1 == fastlog2(3));
-	EXPECT(2 == fastlog2(4));
-	EXPECT(2 == fastlog2(5));
-}
-
-static void test_sizes_to_segments(void)
-{
-	uint64_t sizes[] = { 2, 3, 4, 5, 7, 9 };
-	/* .................0  1  2  3  4  5 */
-
-	size_t seglen = 0;
-	struct segment *segs =
-		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
-	EXPECT(segs[2].log == 3);
-	EXPECT(segs[2].start == 5);
-	EXPECT(segs[2].end == 6);
-
-	EXPECT(segs[1].log == 2);
-	EXPECT(segs[1].start == 2);
-	EXPECT(segs[1].end == 5);
-	reftable_free(segs);
-}
-
-static void test_sizes_to_segments_empty(void)
-{
-	size_t seglen = 0;
-	struct segment *segs = sizes_to_segments(&seglen, NULL, 0);
-	EXPECT(seglen == 0);
-	reftable_free(segs);
-}
-
-static void test_sizes_to_segments_all_equal(void)
-{
-	uint64_t sizes[] = { 5, 5 };
-	size_t seglen = 0;
-	struct segment *segs =
-		sizes_to_segments(&seglen, sizes, ARRAY_SIZE(sizes));
-	EXPECT(seglen == 1);
-	EXPECT(segs[0].start == 0);
-	EXPECT(segs[0].end == 2);
-	reftable_free(segs);
-}
-
 static void test_suggest_compaction_segment(void)
 {
-	uint64_t sizes[] = { 128, 64, 17, 16, 9, 9, 9, 16, 16 };
+	uint64_t sizes[] = { 512, 64, 17, 16, 9, 9, 9, 16, 2, 16 };
 	/* .................0    1    2  3   4  5  6 */
 	struct segment min =
 		suggest_compaction_segment(sizes, ARRAY_SIZE(sizes));
-	EXPECT(min.start == 2);
-	EXPECT(min.end == 7);
+	EXPECT(min.start == 1);
+	EXPECT(min.end == 10);
 }
 
 static void test_suggest_compaction_segment_nothing(void)
@@ -884,6 +839,17 @@  static void test_empty_add(void)
 	reftable_stack_destroy(st2);
 }
 
+static int fastlog2(uint64_t sz)
+{
+	int l = 0;
+	if (sz == 0)
+		return 0;
+	for (; sz; sz /= 2) {
+		l++;
+	}
+	return l - 1;
+}
+
 static void test_reftable_stack_auto_compaction(void)
 {
 	struct reftable_write_options cfg = { 0 };
@@ -1072,7 +1038,6 @@  static void test_reftable_stack_compaction_concurrent_clean(void)
 int stack_test_main(int argc, const char *argv[])
 {
 	RUN_TEST(test_empty_add);
-	RUN_TEST(test_log2);
 	RUN_TEST(test_names_equal);
 	RUN_TEST(test_parse_names);
 	RUN_TEST(test_read_file);
@@ -1092,9 +1057,6 @@  int stack_test_main(int argc, const char *argv[])
 	RUN_TEST(test_reftable_stack_update_index_check);
 	RUN_TEST(test_reftable_stack_uptodate);
 	RUN_TEST(test_reftable_stack_validate_refname);
-	RUN_TEST(test_sizes_to_segments);
-	RUN_TEST(test_sizes_to_segments_all_equal);
-	RUN_TEST(test_sizes_to_segments_empty);
 	RUN_TEST(test_suggest_compaction_segment);
 	RUN_TEST(test_suggest_compaction_segment_nothing);
 	return 0;
diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
index 686781192eb..e6c3f94d874 100755
--- a/t/t0610-reftable-basics.sh
+++ b/t/t0610-reftable-basics.sh
@@ -293,12 +293,24 @@  test_expect_success 'ref transaction: writes cause auto-compaction' '
 	test_line_count = 1 repo/.git/reftable/tables.list &&
 
 	test_commit -C repo --no-tag A &&
-	test_line_count = 2 repo/.git/reftable/tables.list &&
+	test_line_count = 1 repo/.git/reftable/tables.list &&
 
 	test_commit -C repo --no-tag B &&
 	test_line_count = 1 repo/.git/reftable/tables.list
 '
 
+test_expect_success 'ref transaction: alternating table sizes are compacted' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	test_commit -C repo A &&
+	for i in $(test_seq 20)
+	do
+		git -C repo branch -f foo &&
+		git -C repo branch -d foo || return 1
+	done &&
+	test_line_count = 2 repo/.git/reftable/tables.list
+'
+
 check_fsync_events () {
 	local trace="$1" &&
 	shift &&
@@ -324,7 +336,7 @@  test_expect_success 'ref transaction: writes are synced' '
 		git -C repo -c core.fsync=reference \
 		-c core.fsyncMethod=fsync update-ref refs/heads/branch HEAD &&
 	check_fsync_events trace2.txt <<-EOF
-	"name":"hardware-flush","count":2
+	"name":"hardware-flush","count":4
 	EOF
 '
 
@@ -346,8 +358,8 @@  test_expect_success 'pack-refs: compacts tables' '
 
 	test_commit -C repo A &&
 	ls -1 repo/.git/reftable >table-files &&
-	test_line_count = 4 table-files &&
-	test_line_count = 3 repo/.git/reftable/tables.list &&
+	test_line_count = 3 table-files &&
+	test_line_count = 2 repo/.git/reftable/tables.list &&
 
 	git -C repo pack-refs &&
 	ls -1 repo/.git/reftable >table-files &&
@@ -379,7 +391,7 @@  do
 			umask $umask &&
 			git init --shared=true repo &&
 			test_commit -C repo A &&
-			test_line_count = 3 repo/.git/reftable/tables.list
+			test_line_count = 2 repo/.git/reftable/tables.list
 		) &&
 		git -C repo pack-refs &&
 		test_expect_perms "-rw-rw-r--" repo/.git/reftable/tables.list &&
@@ -747,12 +759,13 @@  test_expect_success 'worktree: pack-refs in main repo packs main refs' '
 	test_when_finished "rm -rf repo worktree" &&
 	git init repo &&
 	test_commit -C repo A &&
-	git -C repo worktree add ../worktree &&
+	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
+	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
 
-	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
-	test_line_count = 4 repo/.git/reftable/tables.list &&
+	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
+	test_line_count = 3 repo/.git/reftable/tables.list &&
 	git -C repo pack-refs &&
-	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
+	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
 	test_line_count = 1 repo/.git/reftable/tables.list
 '
 
@@ -760,19 +773,21 @@  test_expect_success 'worktree: pack-refs in worktree packs worktree refs' '
 	test_when_finished "rm -rf repo worktree" &&
 	git init repo &&
 	test_commit -C repo A &&
-	git -C repo worktree add ../worktree &&
+	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C repo worktree add ../worktree &&
+	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/worktree/per-worktree HEAD &&
 
-	test_line_count = 3 repo/.git/worktrees/worktree/reftable/tables.list &&
-	test_line_count = 4 repo/.git/reftable/tables.list &&
+	test_line_count = 4 repo/.git/worktrees/worktree/reftable/tables.list &&
+	test_line_count = 3 repo/.git/reftable/tables.list &&
 	git -C worktree pack-refs &&
 	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
-	test_line_count = 4 repo/.git/reftable/tables.list
+	test_line_count = 3 repo/.git/reftable/tables.list
 '
 
 test_expect_success 'worktree: creating shared ref updates main stack' '
 	test_when_finished "rm -rf repo worktree" &&
 	git init repo &&
 	test_commit -C repo A &&
+	test_commit -C repo B &&
 
 	git -C repo worktree add ../worktree &&
 	git -C repo pack-refs &&
@@ -780,7 +795,7 @@  test_expect_success 'worktree: creating shared ref updates main stack' '
 	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
 	test_line_count = 1 repo/.git/reftable/tables.list &&
 
-	git -C worktree update-ref refs/heads/shared HEAD &&
+	GIT_TEST_REFTABLE_NO_AUTOCOMPACTION=true git -C worktree update-ref refs/heads/shared HEAD &&
 	test_line_count = 1 repo/.git/worktrees/worktree/reftable/tables.list &&
 	test_line_count = 2 repo/.git/reftable/tables.list
 '