mbox series

[v5,00/12] refs: ref storage migrations

Message ID cover.1717649802.git.ps@pks.im (mailing list archive)
Headers show
Series refs: ref storage migrations | expand

Message

Patrick Steinhardt June 6, 2024, 5:28 a.m. UTC
Hi,

the ref storage migration was merged to `next`, but got reverted due to
some additional findings by Peff and/or Coverity.

Changes compared to v4:

  - Adapt comment of `ref_store_init()` to the new parameter.

  - Fix use of an uninitialized return value in `for_each_root_ref()`.

  - Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.

  - Adapt an error message to more clearly point out that deletion of
    "refs/" directory failed in `reftable_be_remove_on_disk()`.

  - Fix a leak when `mkdtemp()` fails.

Thanks!

Patrick

Patrick Steinhardt (12):
  setup: unset ref storage when reinitializing repository version
  refs: convert ref storage format to an enum
  refs: pass storage format to `ref_store_init()` explicitly
  refs: allow to skip creation of reflog entries
  refs/files: refactor `add_pseudoref_and_head_entries()`
  refs/files: extract function to iterate through root refs
  refs/files: fix NULL pointer deref when releasing ref store
  reftable: inline `merged_table_release()`
  worktree: don't store main worktree twice
  refs: implement removal of ref storages
  refs: implement logic to migrate between ref storage formats
  builtin/refs: new command to migrate ref storage formats

 .gitignore                 |   1 +
 Documentation/git-refs.txt |  61 +++++++
 Makefile                   |   1 +
 builtin.h                  |   1 +
 builtin/clone.c            |   2 +-
 builtin/init-db.c          |   2 +-
 builtin/refs.c             |  75 ++++++++
 command-list.txt           |   1 +
 git.c                      |   1 +
 refs.c                     | 345 +++++++++++++++++++++++++++++++++++--
 refs.h                     |  41 ++++-
 refs/files-backend.c       | 124 +++++++++++--
 refs/packed-backend.c      |  15 ++
 refs/ref-cache.c           |   2 +
 refs/refs-internal.h       |   7 +
 refs/reftable-backend.c    |  55 +++++-
 reftable/merged.c          |  12 +-
 reftable/merged.h          |   2 -
 reftable/stack.c           |   8 +-
 repository.c               |   3 +-
 repository.h               |  10 +-
 setup.c                    |  10 +-
 setup.h                    |   9 +-
 t/helper/test-ref-store.c  |   1 +
 t/t1460-refs-migrate.sh    | 243 ++++++++++++++++++++++++++
 worktree.c                 |  29 ++--
 26 files changed, 979 insertions(+), 82 deletions(-)
 create mode 100644 Documentation/git-refs.txt
 create mode 100644 builtin/refs.c
 create mode 100755 t/t1460-refs-migrate.sh

Range-diff against v4:
 1:  afb705f6a0 =  1:  afb705f6a0 setup: unset ref storage when reinitializing repository version
 2:  7989e82dcd =  2:  7989e82dcd refs: convert ref storage format to an enum
 3:  7d1a86292c !  3:  26005abb28 refs: pass storage format to `ref_store_init()` explicitly
    @@ Commit message
     
      ## refs.c ##
     @@ refs.c: static struct ref_store *lookup_ref_store_map(struct strmap *map,
    -  * gitdir.
    + 
    + /*
    +  * Create, record, and return a ref_store instance for the specified
    +- * gitdir.
    ++ * gitdir using the given ref storage format.
       */
      static struct ref_store *ref_store_init(struct repository *repo,
     +					enum ref_storage_format format,
 4:  d0539b7456 =  4:  053f1be657 refs: allow to skip creation of reflog entries
 5:  7f9ce5af2e =  5:  29147da2b9 refs/files: refactor `add_pseudoref_and_head_entries()`
 6:  f7577a0ab3 !  6:  86cf0c84b1 refs/files: extract function to iterate through root refs
    @@ refs/files-backend.c: static void add_root_refs(struct files_ref_store *refs,
      		strbuf_setlen(&refname, dirnamelen);
      	}
     +
    ++	ret = 0;
    ++
     +done:
      	strbuf_release(&refname);
      	strbuf_release(&path);
 7:  56baa798fb =  7:  6b0aaf2ac8 refs/files: fix NULL pointer deref when releasing ref store
 8:  c7e8ab40b5 =  8:  0690d5eae9 reftable: inline `merged_table_release()`
 9:  7a89aae515 =  9:  89699a641d worktree: don't store main worktree twice
10:  f9d9420cf9 ! 10:  7b5fee2185 refs: implement removal of ref storages
    @@ refs/files-backend.c: static int files_ref_store_create_on_disk(struct ref_store
     +	}
     +	strbuf_reset(&sb);
     +
    -+	ret = for_each_root_ref(refs, remove_one_root_ref, &data);
    -+	if (ret < 0)
    ++	if (for_each_root_ref(refs, remove_one_root_ref, &data) < 0)
     +		ret = -1;
     +
     +	if (ref_store_remove_on_disk(refs->packed_ref_store, err) < 0)
    @@ refs/reftable-backend.c: static int reftable_be_create_on_disk(struct ref_store
     +
     +	strbuf_addf(&sb, "%s/refs", refs->base.gitdir);
     +	if (rmdir(sb.buf) < 0) {
    -+		strbuf_addf(err, "could not delete stub heads: %s",
    ++		strbuf_addf(err, "could not delete refs directory: %s",
     +			    strerror(errno));
     +		ret = -1;
     +	}
11:  1f26051eff ! 11:  893d99e98e refs: implement logic to migrate between ref storage formats
    @@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
     +{
     +	struct ref_store *old_refs = NULL, *new_refs = NULL;
     +	struct ref_transaction *transaction = NULL;
    -+	struct strbuf buf = STRBUF_INIT;
    ++	struct strbuf new_gitdir = STRBUF_INIT;
     +	struct migration_data data;
     +	size_t reflog_count = 0;
    -+	char *new_gitdir = NULL;
     +	int did_migrate_refs = 0;
     +	int ret;
     +
    ++	if (repo->ref_storage_format == format) {
    ++		strbuf_addstr(errbuf, "current and new ref storage format are equal");
    ++		ret = -1;
    ++		goto done;
    ++	}
    ++
     +	old_refs = get_main_ref_store(repo);
     +
     +	/*
    @@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
     +	 *
     +	 *   6. Change the repository format to the new ref format.
     +	 */
    -+	strbuf_addf(&buf, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
    -+	new_gitdir = mkdtemp(xstrdup(buf.buf));
    -+	if (!new_gitdir) {
    ++	strbuf_addf(&new_gitdir, "%s/%s", old_refs->gitdir, "ref_migration.XXXXXX");
    ++	if (!mkdtemp(new_gitdir.buf)) {
     +		strbuf_addf(errbuf, "cannot create migration directory: %s",
     +			    strerror(errno));
     +		ret = -1;
     +		goto done;
     +	}
     +
    -+	new_refs = ref_store_init(repo, format, new_gitdir,
    ++	new_refs = ref_store_init(repo, format, new_gitdir.buf,
     +				  REF_STORE_ALL_CAPS);
     +	ret = ref_store_create_on_disk(new_refs, 0, errbuf);
     +	if (ret < 0)
    @@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
     +
     +	if (flags & REPO_MIGRATE_REF_STORAGE_FORMAT_DRYRUN) {
     +		printf(_("Finished dry-run migration of refs, "
    -+			 "the result can be found at '%s'\n"), new_gitdir);
    ++			 "the result can be found at '%s'\n"), new_gitdir.buf);
     +		ret = 0;
     +		goto done;
     +	}
    @@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
     +	if (ret < 0)
     +		goto done;
     +
    -+	ret = move_files(new_gitdir, old_refs->gitdir, errbuf);
    ++	ret = move_files(new_gitdir.buf, old_refs->gitdir, errbuf);
     +	if (ret < 0)
     +		goto done;
     +
    -+	if (rmdir(new_gitdir) < 0)
    ++	if (rmdir(new_gitdir.buf) < 0)
     +		warning_errno(_("could not remove temporary migration directory '%s'"),
    -+			      new_gitdir);
    ++			      new_gitdir.buf);
     +
     +	/*
     +	 * We have migrated the repository, so we now need to adjust the
    @@ refs.c: int ref_update_check_old_target(const char *referent, struct ref_update
     +	if (ret && did_migrate_refs) {
     +		strbuf_complete(errbuf, '\n');
     +		strbuf_addf(errbuf, _("migrated refs can be found at '%s'"),
    -+			    new_gitdir);
    ++			    new_gitdir.buf);
     +	}
     +
     +	if (ret && new_refs)
     +		ref_store_release(new_refs);
     +	ref_transaction_free(transaction);
    -+	strbuf_release(&buf);
    -+	free(new_gitdir);
    ++	strbuf_release(&new_gitdir);
     +	return ret;
     +}
     
12:  83cb3f8c96 = 12:  ec0c6d3cf1 builtin/refs: new command to migrate ref storage formats

Comments

Jeff King June 6, 2024, 7:06 a.m. UTC | #1
On Thu, Jun 06, 2024 at 07:28:52AM +0200, Patrick Steinhardt wrote:

> the ref storage migration was merged to `next`, but got reverted due to
> some additional findings by Peff and/or Coverity.
> 
> Changes compared to v4:
> 
>   - Adapt comment of `ref_store_init()` to the new parameter.
> 
>   - Fix use of an uninitialized return value in `for_each_root_ref()`.
> 
>   - Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.
> 
>   - Adapt an error message to more clearly point out that deletion of
>     "refs/" directory failed in `reftable_be_remove_on_disk()`.
> 
>   - Fix a leak when `mkdtemp()` fails.

These all looked good to me (though I did not carefully read the
original topic, so was just looking at the parts I mentioned earlier).

>  6:  f7577a0ab3 !  6:  86cf0c84b1 refs/files: extract function to iterate through root refs
>     @@ refs/files-backend.c: static void add_root_refs(struct files_ref_store *refs,
>       		strbuf_setlen(&refname, dirnamelen);
>       	}
>      +
>     ++	ret = 0;
>     ++
>      +done:
>       	strbuf_release(&refname);
>       	strbuf_release(&path);

Since the context doesn't show much, I wondered whether there was any
case where we'd overwrite an earlier "ret" here. But nope, we always
jump to "done" after finding "ret" contains a non-zero value. So setting
it to zero here is the right thing.

-Peff
Junio C Hamano June 6, 2024, 4:18 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> Hi,
>
> the ref storage migration was merged to `next`, but got reverted due to
> some additional findings by Peff and/or Coverity.
>
> Changes compared to v4:
>
>   - Adapt comment of `ref_store_init()` to the new parameter.
>
>   - Fix use of an uninitialized return value in `for_each_root_ref()`.
>
>   - Fix overwrite of ret code in `files_ref_store_remove_on_disk()`.
>
>   - Adapt an error message to more clearly point out that deletion of
>     "refs/" directory failed in `reftable_be_remove_on_disk()`.
>
>   - Fix a leak when `mkdtemp()` fails.
>
> Thanks!
>
> Patrick

Looking good.  The use of strbuf for mkdtemp() template and relying
on the fact that mkdtemp() makes an in-place modification of the
template string made the resulting code easier to follow, I would
think.

Let's mark the topic ready for 'next'.  Thanks.