
[v5,0/8] refs: introduce support for batched reference updates

Message ID 20250327-245-partially-atomic-ref-updates-v5-0-4db2a3e34404@gmail.com

Karthik Nayak March 27, 2025, 11:13 a.m. UTC
Git supports making reference updates with or without transactions.
Updates with transactions are generally better optimized, but
transactions are all or nothing. This means that if a user wants to
batch updates to take advantage of the optimizations without the hard
requirement that all updates must succeed, there is currently no way to
do so. This matters particularly with the reftable backend, where
batching multiple reference updates is more efficient than performing
them sequentially.

This series introduces support for batched reference updates without
transactions, allowing individual reference updates to fail while
letting others proceed. This capability is exposed through
git-update-ref(1)'s `--batch-updates` flag, which can be used in
`--stdin` mode to batch updates and handle failures gracefully. Under
the hood, these batched updates still use the transaction
infrastructure, with sections modified to allow partial failures.
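
As a rough illustration of the difference between the two modes (not the
code in this series), the sketch below validates a batch of updates
first and then either aborts on the first failure or marks the failing
update as rejected and carries on. All names ('sketch_ref',
'sketch_update', 'sketch_apply_updates', the 'allow_failure' parameter)
are hypothetical, and refs are modelled as plain strings in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are not Git's real structures. */
struct sketch_ref {
	const char *refname;
	const char *value;
};

struct sketch_update {
	const char *refname;
	const char *expected_old; /* NULL means "do not check the old value" */
	const char *new_value;
	int rejected;             /* set to 1 when the update is skipped */
};

static struct sketch_ref *sketch_find_ref(struct sketch_ref *refs, size_t nr,
					  const char *refname)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(refs[i].refname, refname))
			return &refs[i];
	return NULL;
}

/*
 * Returns 0 on success and -1 if the whole batch was aborted. With
 * allow_failure set, a failing update is only marked as rejected and
 * the remaining updates are still applied.
 */
static int sketch_apply_updates(struct sketch_ref *refs, size_t nr_refs,
				struct sketch_update *updates, size_t nr_updates,
				int allow_failure)
{
	size_t i;

	/* Prepare: validate every update before touching any ref. */
	for (i = 0; i < nr_updates; i++) {
		struct sketch_ref *ref = sketch_find_ref(refs, nr_refs,
							 updates[i].refname);
		int ok = ref && (!updates[i].expected_old ||
				 !strcmp(ref->value, updates[i].expected_old));

		updates[i].rejected = !ok;
		if (!ok && !allow_failure)
			return -1; /* all or nothing: abort before writing */
	}

	/* Commit: write only the updates that passed validation. */
	for (i = 0; i < nr_updates; i++) {
		if (updates[i].rejected)
			continue;
		sketch_find_ref(refs, nr_refs, updates[i].refname)->value =
			updates[i].new_value;
	}
	return 0;
}
```

Both modes share the same prepare and commit phases in this sketch; only
the reaction to a failed check differs, which is the reason the batched
mode can reuse the transaction infrastructure.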

The changes are structured to carefully build up this functionality:

First, we clean up and consolidate the reference update checking logic.
This includes removing duplicate checks in the files backend and moving
refname tracking to the generic layer, which simplifies the codebase and
prepares it for the new feature.

We then restructure the reftable backend's transaction preparation code,
extracting the update validation logic into a dedicated function. This
not only improves code organization but sets the stage for implementing
partial transaction support.

To ensure we only skip errors which are user-oriented, we introduce
typed errors for transactions with 'enum ref_transaction_error'. We
extend the existing errors to include other scenarios and use these new
errors throughout the refs code.
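
To make the distinction concrete, here is a minimal sketch of how typed
errors let a caller decide which failures may be skipped. The enum
values are the ones introduced by this series; the helper
'sketch_is_user_error()' is hypothetical, and treating the generic code
as the non-skippable (system failure) case is an assumption made for
this example.

```c
/* Error codes as introduced by this series. */
enum ref_transaction_error {
	/* Default error code */
	REF_TRANSACTION_ERROR_GENERIC = -1,
	/* Ref name conflict like A vs A/B */
	REF_TRANSACTION_ERROR_NAME_CONFLICT = -2,
	/* Ref to be created already exists */
	REF_TRANSACTION_ERROR_CREATE_EXISTS = -3,
	/* Ref expected but doesn't exist */
	REF_TRANSACTION_ERROR_NONEXISTENT_REF = -4,
	/* Provided old_oid or old_target of reference doesn't match actual */
	REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE = -5,
	/* Provided new_oid or new_target is invalid */
	REF_TRANSACTION_ERROR_INVALID_NEW_VALUE = -6,
	/* Expected ref to be symref, but is a regular ref */
	REF_TRANSACTION_ERROR_EXPECTED_SYMREF = -7,
};

/*
 * Hypothetical helper: returns 1 for errors caused by the request
 * itself, which a batched update could skip, and 0 for everything else
 * (assumed here to cover system failures such as I/O problems).
 */
static int sketch_is_user_error(enum ref_transaction_error err)
{
	switch (err) {
	case REF_TRANSACTION_ERROR_NAME_CONFLICT:
	case REF_TRANSACTION_ERROR_CREATE_EXISTS:
	case REF_TRANSACTION_ERROR_NONEXISTENT_REF:
	case REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE:
	case REF_TRANSACTION_ERROR_INVALID_NEW_VALUE:
	case REF_TRANSACTION_ERROR_EXPECTED_SYMREF:
		return 1;
	case REF_TRANSACTION_ERROR_GENERIC:
	default:
		return 0;
	}
}
```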

With this groundwork in place, we implement the core batch update
support in the refs subsystem. This adds the necessary infrastructure to
track and report rejected updates while allowing transactions to
proceed. All reference backends are modified to support this behavior
when enabled.
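
The bookkeeping can be pictured roughly as follows. This is a simplified
sketch under stated assumptions, not the implementation in the series:
the 'sketch_*' structures and fields are hypothetical, and only the
general idea (record the index of each rejected update once, so that
reporting can iterate over rejections only) is taken from the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list of indices of rejected updates. */
struct sketch_rejections {
	size_t *update_indices;
	size_t nr, alloc;
};

/* Hypothetical, stripped-down transaction. */
struct sketch_transaction {
	int allow_failure;   /* stands in for a per-transaction flag */
	size_t nr_updates;
	int *rejection_err;  /* one slot per update, 0 = not rejected */
	struct sketch_rejections rejections;
};

/*
 * Returns 1 if the update was recorded as rejected, in which case the
 * caller may clear its error and continue with the next update.
 * Returns 0 if the caller must treat the error as fatal.
 */
static int sketch_maybe_set_rejected(struct sketch_transaction *t,
				     size_t update_idx, int err)
{
	/* An out-of-range index is a programming error. */
	assert(update_idx < t->nr_updates);

	if (!t->allow_failure)
		return 0;

	/* Record each index only once, even if it is rejected again. */
	if (!t->rejection_err[update_idx]) {
		if (t->rejections.nr == t->rejections.alloc) {
			size_t new_alloc = t->rejections.alloc ?
					   2 * t->rejections.alloc : 4;
			size_t *grown = realloc(t->rejections.update_indices,
						new_alloc * sizeof(*grown));
			if (!grown)
				return 0; /* out of memory: keep it fatal */
			t->rejections.update_indices = grown;
			t->rejections.alloc = new_alloc;
		}
		t->rejections.update_indices[t->rejections.nr++] = update_idx;
	}
	t->rejection_err[update_idx] = err;
	return 1;
}
```

A backend's prepare loop would call such a function when a per-update
check fails and, on a return value of 1, reset its error buffer and
move on to the next update.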

Finally, we expose this functionality to users through
git-update-ref(1)'s `--batch-updates` flag, complete with test coverage
and documentation. The flag is specifically limited to `--stdin` mode
where batching multiple updates is most relevant.

This enhancement improves Git's flexibility in handling reference
updates while maintaining the safety of atomic transactions by default.
It's particularly valuable for tools and workflows that need to handle
reference update failures gracefully without abandoning the entire batch
of updates.

This series is based on top of 683c54c999 (Git 2.49, 2025-03-14) with
Patrick's series 'refs: batch refname availability checks' [1] merged
in.

[1]: https://lore.kernel.org/all/20250217-pks-update-ref-optimization-v1-0-a2b6d87a24af@pks.im/

---
Changes in v5:
- Inline the comments around the 'ref_transaction_error'.
- Use 'strbuf_reset()' wherever possible instead of 'strbuf_setlen(err, 0)'.
- Use an extra 'conflicting_dirnames' strset in 'refs_verify_refnames_available()' to track
  dirnames which were found to be conflicting; this avoids re-reading those dirnames.
- Fix curly braces style mismatch in if..else block.
- Link to v4: https://lore.kernel.org/r/20250320-245-partially-atomic-ref-updates-v4-0-3dcc1b311dc9@gmail.com

Changes in v4:
- Rebased on top of 2.49 since a long time has passed since the
  previous iteration and there is a new release.
- Changed the naming to say 'batched' updates instead of 'partial
  transactions'. While we still use the transaction infrastructure
  underneath, the new naming causes less ambiguity.
- Clean up some of the commit messages.
- Raise BUG for invalid update index while setting rejections.
- Fix an incorrect early return.
- Link to v3: https://lore.kernel.org/r/20250305-245-partially-atomic-ref-updates-v3-0-0c64e3052354@gmail.com

Changes in v3:
- Changed 'transaction_error' to 'ref_transaction_error' along with the
  error names. Removed 'TRANSACTION_OK' since it can easily be missed;
  we simply 'return 0' instead.
- Rename 'ref_transaction_set_rejected' to
  'ref_transaction_maybe_set_rejected' and move logic around error
  checks to within this function.
- Add a new struct 'ref_transaction_rejections' to track the rejections
  within a transaction. This allows us to only iterate over rejected
  updates.
- Add a new commit to also support partial transactions within the
  batched F/D checks.
- Remove NUL delimited outputs in 'git-update-ref(1)'.
- Remove translations for plumbing outputs.
- Other small cleanups in the commit message and code.

Changes in v2:
- Introduce and use structured errors. This consolidates the errors
  and their handling between the ref backends.
- In the previous version, we skipped over all failures. This included
  system failures such as low memory or IO problems. Instead, only
  skip user-oriented failures, such as an invalid old OID and so on.
- Change the rejection function name to `ref_transaction_set_rejected()`.
- Modify the commit messages and documentation to be a little more
  verbose.
- Link to v1: https://lore.kernel.org/r/20250207-245-partially-atomic-ref-updates-v1-0-e6a3690ff23a@gmail.com

---
 Documentation/git-update-ref.adoc |  14 +-
 builtin/fetch.c                   |   2 +-
 builtin/update-ref.c              |  66 ++++-
 refs.c                            | 171 +++++++++++--
 refs.h                            |  70 ++++--
 refs/files-backend.c              | 314 +++++++++++-------------
 refs/packed-backend.c             |  69 +++---
 refs/refs-internal.h              |  51 +++-
 refs/reftable-backend.c           | 502 +++++++++++++++++++-------------------
 t/t1400-update-ref.sh             | 233 ++++++++++++++++++
 10 files changed, 969 insertions(+), 523 deletions(-)

Karthik Nayak (8):
      refs/files: remove redundant check in split_symref_update()
      refs: move duplicate refname update check to generic layer
      refs/files: remove duplicate duplicates check
      refs/reftable: extract code from the transaction preparation
      refs: introduce enum-based transaction error types
      refs: implement batch reference update support
      refs: support rejection in batch updates during F/D checks
      update-ref: add --batch-updates flag for stdin mode
---

Range-diff versus v4:

1:  c682fce9d0 = 1:  cae24142a1 refs/files: remove redundant check in split_symref_update()
2:  7483120888 = 2:  239aecdb0f refs: move duplicate refname update check to generic layer
3:  e54c9042b5 = 3:  06404dd350 refs/files: remove duplicate duplicates check
4:  4f905880af = 4:  a3e645aa37 refs/reftable: extract code from the transaction preparation
5:  7846fc43f5 ! 5:  2615bfe78e refs: introduce enum-based transaction error types
    @@ refs.h: struct worktree;
      enum ref_storage_format ref_storage_format_by_name(const char *name);
      const char *ref_storage_format_to_name(enum ref_storage_format ref_storage_format);
      
    -+/*
    -+ * enum ref_transaction_error represents the following return codes:
    -+ * REF_TRANSACTION_ERROR_GENERIC error_code: default error code.
    -+ * REF_TRANSACTION_ERROR_NAME_CONFLICT error_code: ref name conflict like A vs A/B.
    -+ * REF_TRANSACTION_ERROR_CREATE_EXISTS error_code: ref to be created already exists.
    -+ * REF_TRANSACTION_ERROR_NONEXISTENT_REF error_code: ref expected but doesn't exist.
    -+ * REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE error_code: provided old_oid or old_target of
    -+ * reference doesn't match actual.
    -+ * REF_TRANSACTION_ERROR_INVALID_NEW_VALUE error_code: provided new_oid or new_target is
    -+ * invalid.
    -+ * REF_TRANSACTION_ERROR_EXPECTED_SYMREF error_code: expected ref to be symref, but is a
    -+ * regular ref.
    -+ */
     +enum ref_transaction_error {
    ++	/* Default error code */
     +	REF_TRANSACTION_ERROR_GENERIC = -1,
    ++	/* Ref name conflict like A vs A/B */
     +	REF_TRANSACTION_ERROR_NAME_CONFLICT = -2,
    ++	/* Ref to be created already exists */
     +	REF_TRANSACTION_ERROR_CREATE_EXISTS = -3,
    ++	/* ref expected but doesn't exist */
     +	REF_TRANSACTION_ERROR_NONEXISTENT_REF = -4,
    ++	/* Provided old_oid or old_target of reference doesn't match actual */
     +	REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE = -5,
    ++	/* Provided new_oid or new_target is invalid */
     +	REF_TRANSACTION_ERROR_INVALID_NEW_VALUE = -6,
    ++	/* Expected ref to be symref, but is a regular ref */
     +	REF_TRANSACTION_ERROR_EXPECTED_SYMREF = -7,
     +};
     +
6:  398b93689a ! 6:  d5c1c77b0d refs: implement batch reference update support
    @@ refs/files-backend.c: static int files_transaction_prepare(struct ref_store *ref
     -		if (ret)
     +		if (ret) {
     +			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    -+				strbuf_setlen(err, 0);
    ++				strbuf_reset(err);
     +				ret = 0;
     +
     +				continue;
    @@ refs/packed-backend.c: static enum ref_transaction_error write_with_updates(stru
      					ret = REF_TRANSACTION_ERROR_CREATE_EXISTS;
     +
     +					if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    -+						strbuf_setlen(err, 0);
    ++						strbuf_reset(err);
     +						ret = 0;
     +						continue;
     +					}
    @@ refs/packed-backend.c: static enum ref_transaction_error write_with_updates(stru
      					ret = REF_TRANSACTION_ERROR_INCORRECT_OLD_VALUE;
     +
     +					if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    -+						strbuf_setlen(err, 0);
    ++						strbuf_reset(err);
     +						ret = 0;
     +						continue;
     +					}
    @@ refs/packed-backend.c: static enum ref_transaction_error write_with_updates(stru
      				ret = REF_TRANSACTION_ERROR_NONEXISTENT_REF;
     +
     +				if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    -+					strbuf_setlen(err, 0);
    ++					strbuf_reset(err);
     +					ret = 0;
     +					continue;
     +				}
    @@ refs/reftable-backend.c: static int reftable_be_transaction_prepare(struct ref_s
     -		if (ret)
     +		if (ret) {
     +			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
    -+				strbuf_setlen(err, 0);
    ++				strbuf_reset(err);
     +				ret = 0;
     +
     +				continue;
7:  965cd76097 ! 7:  4bb4902631 refs: support rejection in batch updates during F/D checks
    @@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_sto
      					  struct strbuf *err)
      {
     @@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
    + 	struct strbuf referent = STRBUF_INIT;
    + 	struct string_list_item *item;
    + 	struct ref_iterator *iter = NULL;
    ++	struct strset conflicting_dirnames;
    + 	struct strset dirnames;
    + 	int ret = REF_TRANSACTION_ERROR_NAME_CONFLICT;
    + 
    +@@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
    + 
    + 	assert(err);
    + 
    ++	strset_init(&conflicting_dirnames);
      	strset_init(&dirnames);
      
      	for_each_string_list_item(item, refnames) {
    @@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_sto
      		const char *extra_refname;
      		struct object_id oid;
     @@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
    + 				continue;
    + 
      			if (!initial_transaction &&
    - 			    !refs_read_raw_ref(refs, dirname.buf, &oid, &referent,
    - 					       &type, &ignore_errno)) {
    +-			    !refs_read_raw_ref(refs, dirname.buf, &oid, &referent,
    +-					       &type, &ignore_errno)) {
    ++			    (strset_contains(&conflicting_dirnames, dirname.buf) ||
    ++			     !refs_read_raw_ref(refs, dirname.buf, &oid, &referent,
    ++						       &type, &ignore_errno))) {
     +				if (transaction && ref_transaction_maybe_set_rejected(
     +					    transaction, *update_idx,
     +					    REF_TRANSACTION_ERROR_NAME_CONFLICT)) {
     +					strset_remove(&dirnames, dirname.buf);
    ++					strset_add(&conflicting_dirnames, dirname.buf);
     +					continue;
     +				}
     +
    @@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_sto
      			strbuf_addf(err, _("cannot process '%s' and '%s' at the same time"),
      				    refname, extra_refname);
      			goto cleanup;
    +@@ refs.c: enum ref_transaction_error refs_verify_refnames_available(struct ref_store *refs
    + cleanup:
    + 	strbuf_release(&referent);
    + 	strbuf_release(&dirname);
    ++	strset_clear(&conflicting_dirnames);
    + 	strset_clear(&dirnames);
    + 	ref_iterator_free(iter);
    + 	return ret;
     @@ refs.c: enum ref_transaction_error refs_verify_refname_available(
      	};
      
8:  ed58c67cd7 ! 8:  674630f77c update-ref: add --batch-updates flag for stdin mode
    @@ builtin/update-ref.c: int cmd_update_ref(int argc,
     -		update_refs_stdin();
     +		update_refs_stdin(flags);
      		return 0;
    --	}
    -+	} else if (flags & REF_TRANSACTION_ALLOW_FAILURE)
    ++	} else if (flags & REF_TRANSACTION_ALLOW_FAILURE) {
     +		die("--batch-updates can only be used with --stdin");
    + 	}
      
      	if (end_null)
    - 		usage_with_options(git_update_ref_usage, options);
     
      ## t/t1400-update-ref.sh ##
     @@ t/t1400-update-ref.sh: do


---

base-commit: 679c868f5fffadd1f7e8e49d4d87d745ee36ffb7
change-id: 20241206-245-partially-atomic-ref-updates-9fe8b080345c

Thanks
- Karthik

Comments

Patrick Steinhardt March 28, 2025, 9:24 a.m. UTC | #1
On Thu, Mar 27, 2025 at 12:13:24PM +0100, Karthik Nayak wrote:
> Changes in v5:
> - Inline the comments around the 'ref_transaction_error'.
> - Use 'strbuf_reset()' wherever possible instead of 'strbuf_setlen(err, 0)'.
> - Use an extra 'conflicting_dirnames' strset in 'refs_verify_refnames_available()' to track
>   dirnames which were found to be conflicting, this is to avoid re-reading those dirnames.
> - Add curly braces style mismatch in if..else block.
> - Link to v4: https://lore.kernel.org/r/20250320-245-partially-atomic-ref-updates-v4-0-3dcc1b311dc9@gmail.com

Thanks, the series looks good to me judging by the range diff.

Patrick