diff mbox series

[v2,1/3] refs/reftable: introduce "reftable.lockTimeout"

Message ID 700a35df125eca6639d08d30720332dbc1ed15a1.1726633812.git.ps@pks.im (mailing list archive)
State Superseded
Headers show
Series reftable: graceful concurrent writes | expand

Commit Message

Patrick Steinhardt Sept. 18, 2024, 4:32 a.m. UTC
When multiple concurrent processes try to update references in a
repository they may try to lock the same lockfiles. This can happen even
when the updates are non-conflicting and can both be applied, so it
doesn't always make sense to abort the transaction immediately. Both the
"loose" and "packed" backends thus have a grace period that they wait
for the lock to be released that can be controlled via the config values
"core.filesRefLockTimeout" and "core.packedRefsTimeout", respectively.

The reftable backend doesn't have such a setting yet and instead fails
immediately when it sees such a lock. But the exact same concepts apply
here as they do apply to the other backends.

Introduce a new "reftable.lockTimeout" config that controls how long we
may wait for a "tables.list" lock to be released. The default value of
this config is 100ms, which is the same default as we have it for the
"loose" backend.

Note that even though we also lock individual tables, this config really
only applies to the "tables.list" file. This is because individual
tables are only ever locked when we already hold the "tables.list" lock
during compaction. When we observe such a lock we in fact do not want to
compact the table at all because it is already in the process of being
compacted by a concurrent process. So applying the same timeout here
would not make any sense and only delay progress.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 Documentation/config/reftable.txt |  7 +++++++
 refs/reftable-backend.c           |  4 ++++
 reftable/reftable-writer.h        |  8 ++++++++
 reftable/stack.c                  | 18 ++++++++++++------
 t/t0610-reftable-basics.sh        | 27 +++++++++++++++++++++++++++
 5 files changed, 58 insertions(+), 6 deletions(-)

Comments

James Liu Sept. 18, 2024, 9:22 a.m. UTC | #1
On Wed Sep 18, 2024 at 2:32 PM AEST, Patrick Steinhardt wrote:
> diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> index 1c4b19e737f..e90ddfb98dd 100644
> --- a/refs/reftable-backend.c
> +++ b/refs/reftable-backend.c
> @@ -256,6 +256,9 @@ static int reftable_be_config(const char *var, const char *value,
>  		if (factor > UINT8_MAX)
>  			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
>  		opts->auto_compaction_factor = factor;
> +	} else if (!strcmp(var, "reftable.locktimeout")) {
> +		unsigned long lock_timeout = git_config_ulong(var, value, ctx->kvi);
> +		opts->lock_timeout_ms = lock_timeout;
>  	}

Do we need to support the `0` and `-1` values that are possible for
the "core.filesRefLockTimeout" and "core.packedRefsTimeout" timeouts
here as well?
Patrick Steinhardt Sept. 18, 2024, 9:39 a.m. UTC | #2
On Wed, Sep 18, 2024 at 07:22:28PM +1000, James Liu wrote:
> On Wed Sep 18, 2024 at 2:32 PM AEST, Patrick Steinhardt wrote:
> > diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
> > index 1c4b19e737f..e90ddfb98dd 100644
> > --- a/refs/reftable-backend.c
> > +++ b/refs/reftable-backend.c
> > @@ -256,6 +256,9 @@ static int reftable_be_config(const char *var, const char *value,
> >  		if (factor > UINT8_MAX)
> >  			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
> >  		opts->auto_compaction_factor = factor;
> > +	} else if (!strcmp(var, "reftable.locktimeout")) {
> > +		unsigned long lock_timeout = git_config_ulong(var, value, ctx->kvi);
> > +		opts->lock_timeout_ms = lock_timeout;
> >  	}
> 
> Do we need to support the `0` and `-1` values that are possible for
> the "core.filesRefLockTimeout" and "core.packedRefsTimeout" timeouts
> here as well?

We already handle `0`, which is provided by the underlying lockfile
interface. But we don't handle `-1` yet. I guess wiring it up does make
sense indeed, even if it is just to be consistent with the other
timeouts.

Patrick
diff mbox series

Patch

diff --git a/Documentation/config/reftable.txt b/Documentation/config/reftable.txt
index 05157279778..abb793bcbb0 100644
--- a/Documentation/config/reftable.txt
+++ b/Documentation/config/reftable.txt
@@ -46,3 +46,10 @@  reftable.geometricFactor::
 By default, the geometric sequence uses a factor of 2, meaning that for any
 table, the next-biggest table must at least be twice as big. A maximum factor
 of 256 is supported.
+
+reftable.lockTimeout::
+	Whenever the reftable backend appends a new table to the stack, it has
+	to lock the central "tables.list" file before updating it. This config
+	controls how long the process will wait to acquire the lock in case
+	another process has already acquired it. Default is 100 (i.e., retry
+	for 100ms).
diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c
index 1c4b19e737f..e90ddfb98dd 100644
--- a/refs/reftable-backend.c
+++ b/refs/reftable-backend.c
@@ -256,6 +256,9 @@  static int reftable_be_config(const char *var, const char *value,
 		if (factor > UINT8_MAX)
 			die("reftable geometric factor cannot exceed %u", (unsigned)UINT8_MAX);
 		opts->auto_compaction_factor = factor;
+	} else if (!strcmp(var, "reftable.locktimeout")) {
+		unsigned long lock_timeout = git_config_ulong(var, value, ctx->kvi);
+		opts->lock_timeout_ms = lock_timeout;
 	}
 
 	return 0;
@@ -281,6 +284,7 @@  static struct ref_store *reftable_be_init(struct repository *repo,
 	refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask);
 	refs->write_options.disable_auto_compact =
 		!git_env_bool("GIT_TEST_REFTABLE_AUTOCOMPACTION", 1);
+	refs->write_options.lock_timeout_ms = 100;
 
 	git_config(reftable_be_config, &refs->write_options);
 
diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h
index 189b1f4144f..d5f2446220b 100644
--- a/reftable/reftable-writer.h
+++ b/reftable/reftable-writer.h
@@ -51,6 +51,14 @@  struct reftable_write_options {
 	 * tables to compact. Defaults to 2 if unset.
 	 */
 	uint8_t auto_compaction_factor;
+
+	/*
+	 * The number of milliseconds to wait when trying to lock "tables.list".
+	 * Note that this does not apply to locking individual tables, as these
+	 * should only ever be locked when already holding the "tables.list"
+	 * lock.
+	 */
+	unsigned lock_timeout_ms;
 };
 
 /* reftable_block_stats holds statistics for a single block type */
diff --git a/reftable/stack.c b/reftable/stack.c
index ce0a35216ba..5ccad2cff31 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -603,8 +603,10 @@  static int reftable_stack_init_addition(struct reftable_addition *add,
 
 	add->stack = st;
 
-	err = hold_lock_file_for_update(&add->tables_list_lock, st->list_file,
-					LOCK_NO_DEREF);
+	err = hold_lock_file_for_update_timeout(&add->tables_list_lock,
+						st->list_file,
+						LOCK_NO_DEREF,
+						st->opts.lock_timeout_ms);
 	if (err < 0) {
 		if (errno == EEXIST) {
 			err = REFTABLE_LOCK_ERROR;
@@ -1056,8 +1058,10 @@  static int stack_compact_range(struct reftable_stack *st,
 	 * Hold the lock so that we can read "tables.list" and lock all tables
 	 * which are part of the user-specified range.
 	 */
-	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
-					LOCK_NO_DEREF);
+	err = hold_lock_file_for_update_timeout(&tables_list_lock,
+						st->list_file,
+						LOCK_NO_DEREF,
+						st->opts.lock_timeout_ms);
 	if (err < 0) {
 		if (errno == EEXIST)
 			err = REFTABLE_LOCK_ERROR;
@@ -1156,8 +1160,10 @@  static int stack_compact_range(struct reftable_stack *st,
 	 * "tables.list". We'll then replace the compacted range of tables with
 	 * the new table.
 	 */
-	err = hold_lock_file_for_update(&tables_list_lock, st->list_file,
-					LOCK_NO_DEREF);
+	err = hold_lock_file_for_update_timeout(&tables_list_lock,
+						st->list_file,
+						LOCK_NO_DEREF,
+						st->opts.lock_timeout_ms);
 	if (err < 0) {
 		if (errno == EEXIST)
 			err = REFTABLE_LOCK_ERROR;
diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh
index 37510c2b2a5..62da3e37823 100755
--- a/t/t0610-reftable-basics.sh
+++ b/t/t0610-reftable-basics.sh
@@ -423,6 +423,33 @@  test_expect_success 'ref transaction: fails gracefully when auto compaction fail
 	)
 '
 
+test_expect_success 'ref transaction: timeout acquiring tables.list lock' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		>.git/reftable/tables.list.lock &&
+		test_must_fail git update-ref refs/heads/branch HEAD 2>err &&
+		test_grep "cannot lock references" err
+	)
+'
+
+test_expect_success 'ref transaction: retry acquiring tables.list lock' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	(
+		cd repo &&
+		test_commit initial &&
+		LOCK=.git/reftable/tables.list.lock &&
+		>$LOCK &&
+		{
+			( sleep 1 && rm -f $LOCK ) &
+		} &&
+		git -c reftable.lockTimeout=5000 update-ref refs/heads/branch HEAD
+	)
+'
+
 test_expect_success 'pack-refs: compacts tables' '
 	test_when_finished "rm -rf repo" &&
 	git init repo &&