mbox series

[v2,0/9] reftable: improvements and fixes for compaction

Message ID cover.1722862822.git.ps@pks.im (mailing list archive)
Headers show
Series reftable: improvements and fixes for compaction | expand

Message

Patrick Steinhardt Aug. 5, 2024, 1:07 p.m. UTC
Hi,

this is the second version of my patch series that aims to improve the
way reftable stack perform compaction.

Changes compared to v1:

  - Extract a test helper function that sets up a stack with N tables
    containing refs.

  - Reuse file descriptor that we have already stored in a local
    variable instead of calling `lock_file_fd()` a second time.

  - Remove a no-op change in the last patch.

  - Add a comment explaining why we have to allocate N+1 many table
    names.

  - Some typo fixes.

Thanks!

Patrick

Patrick Steinhardt (9):
  reftable/stack: refactor function to gather table sizes
  reftable/stack: extract function to setup stack with N tables
  reftable/stack: test compaction with already-locked tables
  reftable/stack: update stats on failed full compaction
  reftable/stack: simplify tracking of table locks
  reftable/stack: do not die when fsyncing lock file files
  reftable/stack: use lock_file when adding table to "tables.list"
  reftable/stack: fix corruption on concurrent compaction
  reftable/stack: handle locked tables during auto-compaction

 reftable/stack.c           | 231 +++++++++++++++++++++++++++++--------
 reftable/stack_test.c      | 138 +++++++++++++++++-----
 t/t0610-reftable-basics.sh |  21 ++--
 3 files changed, 306 insertions(+), 84 deletions(-)

Range-diff against v1:
 1:  5d99191f5c =  1:  6d2b54ba8e reftable/stack: refactor function to gather table sizes
 -:  ---------- >  2:  ff17306cc0 reftable/stack: extract function to setup stack with N tables
 2:  123fb9d80e !  3:  63e64c8d82 reftable/stack: test compaction with already-locked tables
    @@ reftable/stack_test.c: static void test_reftable_stack_auto_compaction(void)
     +	err = reftable_new_stack(&st, dir, &opts);
     +	EXPECT_ERR(err);
     +
    -+	for (size_t i = 0; i < 5; i++) {
    -+		struct reftable_ref_record ref = {
    -+			.update_index = reftable_stack_next_update_index(st),
    -+			.value_type = REFTABLE_REF_VAL1,
    -+			.value.val1 = { i },
    -+		};
    -+
    -+		strbuf_reset(&buf);
    -+		strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i);
    -+		ref.refname = buf.buf;
    -+
    -+		err = reftable_stack_add(st, &write_test_ref, &ref);
    -+		EXPECT_ERR(err);
    -+	}
    ++	write_n_ref_tables(st, &opts, 5);
     +	EXPECT(st->merged->stack_len == 5);
     +
     +	/*
    @@ reftable/stack_test.c: static void test_reftable_stack_add_performs_auto_compact
     +	err = reftable_new_stack(&st, dir, &opts);
     +	EXPECT_ERR(err);
     +
    -+	for (size_t i = 0; i < 3; i++) {
    -+		struct reftable_ref_record ref = {
    -+			.update_index = reftable_stack_next_update_index(st),
    -+			.value_type = REFTABLE_REF_VAL1,
    -+			.value.val1 = { i },
    -+		};
    -+
    -+		strbuf_reset(&buf);
    -+		strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i);
    -+		ref.refname = buf.buf;
    -+
    -+		err = reftable_stack_add(st, &write_test_ref, &ref);
    -+		EXPECT_ERR(err);
    -+	}
    ++	write_n_ref_tables(st, &opts, 3);
     +	EXPECT(st->merged->stack_len == 3);
     +
     +	/* Lock one of the tables that we're about to compact. */
 3:  1fa7acbddf =  4:  1989dafcb4 reftable/stack: update stats on failed full compaction
 4:  40d9f75cf2 =  5:  73e5d104eb reftable/stack: simplify tracking of table locks
 5:  9233c36894 =  6:  e411e14904 reftable/stack: do not die when fsyncing lock file files
 6:  9703246156 !  7:  b868a518d6 reftable/stack: use lock_file when adding table to "tables.list"
    @@ Commit message
         compacting the stack, we manually handle the lock when adding a new
         table to it. While not wrong, it is at least inconsistent.
     
    -    Refactor the code to consistenly lock "tables.list" via the `lock_file`
    +    Refactor the code to consistently lock "tables.list" via the `lock_file`
         subsytem.
     
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
    @@ reftable/stack.c: int reftable_addition_commit(struct reftable_addition *add)
      		goto done;
      	}
      
    --	err = fsync_component(FSYNC_COMPONENT_REFERENCE, lock_file_fd);
    -+	err = fsync_component(FSYNC_COMPONENT_REFERENCE,
    -+			      get_lock_file_fd(&add->tables_list_lock));
    - 	if (err < 0) {
    - 		err = REFTABLE_IO_ERROR;
    - 		goto done;
    - 	}
    - 
     -	err = rename_tempfile(&add->lock_file, add->stack->list_file);
     +	err = commit_lock_file(&add->tables_list_lock);
      	if (err < 0) {
 7:  b746565bf0 !  8:  ff17414d26 reftable/stack: fix corruption on concurrent compaction
    @@ Commit message
         Letting concurrent processes modify the "tables.list" file while we are
         doing the compaction is very much part of the design and thus expected.
         After all, it may take some time to compact tables in the case where we
    -    are compacting a lot or very large tables.
    +    are compacting a lot of very large tables.
     
         But there is a bug in the code. Suppose we have two processes which are
         compacting two slices of the table. Given that we lock each of the
    @@ Commit message
         process will always impact what the other process needs to write to the
         "tables.list" file.
     
    -    Right now , we do not check whether the "tables.list" has been
    -    changed after we have locked it for the second time in (5). This has the
    +    Right now, we do not check whether the "tables.list" has been changed
    +    after we have locked it for the second time in (5). This has the
         consequence that we will always commit the old, cached in-core tables to
         disk without paying to respect what the other process has written. This
         scenario would then lead to data loss and corruption.
    @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
     +		last_to_replace = last + (new_offset - first);
     +		first_to_replace = new_offset;
     +	} else {
    ++		/*
    ++		 * `fd_read_lines()` uses a `NULL` sentinel to indicate that
    ++		 * the array is at its end. As we use `free_names()` to free
    ++		 * the array, we need to include this sentinel value here and
    ++		 * thus have to allocate `stack_len + 1` many entries.
    ++		 */
     +		REFTABLE_CALLOC_ARRAY(names, st->merged->stack_len + 1);
     +		for (size_t i = 0; i < st->merged->stack_len; i++)
     +			names[i] = xstrdup(st->readers[i]->name);
 8:  dc22178307 !  9:  1ef560feb1 reftable/stack: handle locked tables during auto-compaction
    @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
      		}
      
      		/*
    -@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
    - 	 * delete the files after we closed them on Windows, so this needs to
    - 	 * happen first.
    - 	 */
    --	err = reftable_stack_reload_maybe_reuse(st, first < last);
    -+	err = reftable_stack_reload_maybe_reuse(st, first_to_replace < last_to_replace);
    - 	if (err < 0)
    - 		goto done;
    - 
     @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st,
      
      static int stack_compact_range_stats(struct reftable_stack *st,

base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4

Comments

karthik nayak Aug. 8, 2024, 12:26 p.m. UTC | #1
Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> this is the second version of my patch series that aims to improve the
> way reftable stack perform compaction.
>
> Changes compared to v1:
>
>   - Extract a test helper function that sets up a stack with N tables
>     containing refs.
>
>   - Reuse file descriptor that we have already stored in a local
>     variable instead of calling `lock_file_fd()` a second time.
>
>   - Remove a no-op change in the last patch.
>
>   - Add a comment explaining why we have to allocate N+1 many table
>     names.
>
>   - Some typo fixes.
>
> Thanks!

I haven't reviewed v1, but did skim through it. I left some comments,
(maybe they were already asked/answered) overall the series looks great
to me.

[snip]