[00/30,RFC] extensions.refFormat and packed-refs v2 file format

Message ID: pull.1408.git.1667846164.gitgitgadget@gmail.com

Message

John Passaro via GitGitGadget Nov. 7, 2022, 6:35 p.m. UTC
Introduction
============

I became interested in our packed-ref format based on the asymmetry between
ref updates and ref deletions: if we delete a packed ref, then the
packed-refs file needs to be rewritten. Compared to writing a loose ref,
this is an O(N) cost instead of O(1).

With that in mind, I set out with the following goals:

 * (Primary) Make packed ref deletions be nearly as fast as loose ref
   updates.
 * (Secondary) Allow using a packed ref format for all refs, dropping loose
   refs and creating a clear way to snapshot all refs at a given point in
   time.

I also had one major non-goal to keep things focused:

 * (Non-goal) Update the reflog format.

After carefully considering several options, it seemed that there are two
solutions that can solve this effectively:

 1. Wait for reftable to be integrated into Git.
 2. Update the packed-refs backend to have a stacked version.

The reftable work seems currently dormant. The format is pretty complicated
and I have a difficult time seeing a way forward for it to be fully
integrated into Git. Personally, I'd prefer a more incremental approach with
formats that are built for a basic filesystem. During the process, we can
create APIs within Git that can benefit other file formats within Git.

Further, there is a simpler model that satisfies my primary goal without the
complication required for the secondary goal. Suppose we create a stacked
packed-refs file but only have two layers: the first (base) layer is created
when git pack-refs collapses the full stack and adds the loose ref updates
to the packed-refs file; the second (top) layer contains only ref deletions
(allowing null OIDs to indicate a deleted ref). Then, ref deletions would
only need to rewrite that top layer, making ref deletions take O(deletions)
time instead of O(all refs) time. With a reasonable schedule to squash the
packed-refs stack, this would be a dramatic improvement. (A prototype
implementation showed that updating a layer of 1,000 deletions takes only
twice the time as writing a single loose ref.)
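
To make the deletion layer concrete, here is an illustrative sketch (not a
finalized format) of what a deletion-only top layer could contain if it
reused the current text-based style, with a null OID marking each deleted
ref; the ref names are invented for the example:

  0000000000000000000000000000000000000000 refs/heads/old-topic
  0000000000000000000000000000000000000000 refs/tags/v0.9-rc1

Deleting another packed ref would then only rewrite this small top layer
rather than the full packed-refs file.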

If we want to satisfy the secondary goal of passing all ref updates through
the packed storage, then more complicated layering would be necessary. The
point of bringing this up is that we have incremental goals along the way to
that final state that give us good stopping points to test the benefits of
each step.

Stacking the packed-refs format introduces several interesting strategy
points that are complicated to resolve. Before we can do that, we first need
to establish a way to modify the ref format of a Git repository. Hence, we
need a new extension for the ref formats.

To simplify the first update to the ref formats, it seemed better to add a
new file format version to the existing packed-refs file format. This format
has the exact lock/write/rename mechanics of the current packed-refs format,
but uses a file format that structures the information in a more compact
way. It uses the chunk-format API, with some tweaks. This format update is
useful to the final goal of a stacked packed-refs API, since each layer will
have faster reads and writes. The main reason to do this first is that it is
much simpler to understand the value-add (smaller files means faster
performance).


RFC Organization
================

This RFC is quite long, but the length seemed necessary to actually provide
an end-to-end implementation that demonstrates the packed-refs v2 format
along with test coverage (via the new GIT_TEST_PACKED_REFS_VERSION
variable).
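
For instance, the whole test suite can be exercised with the v2 format by
setting that variable in the environment, following the usual GIT_TEST_*
pattern (shown here only as an example invocation):

  GIT_TEST_PACKED_REFS_VERSION=2 make test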

For convenience, I've broken the full RFC into parts, which resemble how I
intend to submit the pieces for full review. These parts are
available as pull requests in my fork, but here is a breakdown:


Part I: Optionally hash the index
=================================

[1] https://github.com/derrickstolee/git/pull/23 Packed-refs v2 Part I:
Optionally hash the index (Patches 1-2)

The chunk-format API uses the hashfile API as a buffered write, and all
existing formats that use the chunk-format API have a trailing hash as part
of the format. Since the packed-refs file has a critical path involving
its write speed (deleting a packed ref), it seemed important to allow
apples-to-apples comparison between the v1 and v2 format by skipping the
hashing. This is later toggled by a config option.

In this part, the focus is on allowing the hashfile API to ignore updating
the hash during the buffered writes. We've been using this in microsoft/git
to optionally speed up index writes, which patch 2 introduces here. The file
format instead writes a null OID, which would look like a corrupt file to an
older 'git fsck'. Before submitting a full version, I would update 'git
fsck' to ignore a null OID in all of our file formats that include a
trailing hash. Since the index is more short-lived than other formats (such
as pack-files), this trailing hash is less useful. The write time is also
critical, as the performance tests demonstrate.
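
As a rough illustration of the idea (a sketch only, not the actual
csum-file.c change; all names here are hypothetical), the buffered write
path only feeds the hash state when a skip flag is unset, and the finalize
step emits a null OID in place of the trailing hash:

  #include <stdio.h>
  #include <string.h>

  #define RAWSZ 20   /* size of the trailing hash (SHA-1 for illustration) */

  struct buffered_hashfile {
          FILE *fp;
          int skip_hash;               /* when set, bypass hashing */
          unsigned char state[RAWSZ];  /* stand-in for the real hash context */
  };

  static void sketch_write(struct buffered_hashfile *f,
                           const void *buf, size_t len)
  {
          if (!f->skip_hash) {
                  /* the real code would update the hash context here */
                  const unsigned char *p = buf;
                  for (size_t i = 0; i < len; i++)
                          f->state[i % RAWSZ] ^= p[i];
          }
          fwrite(buf, 1, len, f->fp);  /* the buffered write always happens */
  }

  static void sketch_finalize(struct buffered_hashfile *f)
  {
          static const unsigned char null_oid[RAWSZ]; /* all zeros */

          /* a null OID trailer marks "no hash computed" */
          fwrite(f->skip_hash ? null_oid : f->state, 1, RAWSZ, f->fp);
          fclose(f->fp);
  }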


Part II: Create extensions.refFormat
====================================

[2] https://github.com/derrickstolee/git/pull/24 Packed-refs v2 Part II:
create extensions.refFormat (Patches 3-7)

This part introduces a critical concept that has yet to be defined in the Git
codebase. We have no way to incrementally modify the ref format. Since refs
are so critical, we cannot add an optionally-understood layer on top (like
we did with the multi-pack-index and commit-graph files). The reftable draft
[6] proposes the same extension name (extensions.refFormat) but focuses
instead on only a single value. This means that the reftable must be defined
at git init or git clone time and cannot be upgraded from the files backend.

In this RFC, I propose a different model that allows for more customization
and incremental updates. The extensions.refFormat config key is multi-valued
and defaults to the list of files and packed. In the context of this RFC,
the intention is to be able to add packed-v2 so the list of all three values
would allow Git to write and read either file format version (v1 or v2). In
the larger scheme, the extension could allow restricting to only loose refs
(just files) or only packed-refs (just packed). Later, when reftable is
complete, files and reftable could mean that loose refs are the primary ref
storage while the reftable format serves as a drop-in replacement for the
packed-refs file. Not all combinations need to be understood by Git, but
having them available as an option could be useful for flexibility,
especially when trying to upgrade existing repositories to new formats.
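
For example, a repository opting into all three values named in this RFC
(so Git could read and write loose refs plus both packed-refs versions)
might carry config like the following; the extension values are the ones
proposed here, while the surrounding syntax is ordinary git config:

  [core]
          repositoryFormatVersion = 1
  [extensions]
          refFormat = files
          refFormat = packed
          refFormat = packed-v2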

In the future, beyond the scope of this RFC, it would be good to add a
stacked value that allows a stack of files in packed-refs format (whose
version is specified by the packed or packed-v2 values) so we can further
speed up writes to the packed layer. Depending on how well that works, we
could focus on speeding up ref deletions or sending all ref writes straight
to the packed-refs layer. With the option to keep the loose refs storage, we
have flexibility to explore that space incrementally when we have time to
get to it.


Part III: Allow a trailing table-of-contents in the chunk-format API
====================================================================

[3] https://github.com/derrickstolee/git/pull/25 Packed-refs v2 Part III:
trailing table of contents in chunk-format (Patches 8-17)

In order to optimize the write speed of the packed-refs v2 file format, we
want to write immediately to the file as we stream existing refs from the
current refs. The current chunk-format API requires computing the chunk
lengths in advance, which can slow down the write and take more memory than
necessary. Using a trailing table of contents solves this problem, and was
recommended earlier [7]. We just didn't have enough evidence to justify the
work to update the existing chunk formats. Here, we update the API in
advance of using it in the packed-refs v2 format.
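
Schematically, the change only moves where the table of contents sits
relative to the chunk payloads (a sketch of the idea, not an exact byte
layout):

  leading TOC (today):      | header | TOC | chunk 1 | chunk 2 | ... | hash |
  trailing TOC (this RFC):  | header | chunk 1 | chunk 2 | ... | TOC | hash |

With a trailing table of contents, the writer records each chunk's starting
offset as it streams the data and emits the (id, offset) pairs at the end,
so chunk sizes never need to be computed up front.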

We could consider updating the commit-graph and multi-pack-index formats to
use a trailing table of contents, but doing so requires a version bump. That might
be worth it in the case of the commit-graph where computing the size of the
changed-path Bloom filters chunk requires a lot of memory at the moment.
After this chunk-format API update is reviewed and merged, we can pursue
those directions more closely. We would want to investigate the formats more
carefully to see if we want to update the chunks themselves as well as some
header information.


Part IV: Abstract some parts of the v1 file format
==================================================

[4] https://github.com/derrickstolee/git/pull/26 Packed-refs v2 Part IV:
abstract some parts of the v1 file format (Patches 18-21)

These patches move the parts of the refs/packed-backend.c file that deal with
the specifics of the packed-refs v1 file format into a new file:
refs/packed-format-v1.c. This also creates an abstraction layer that will
allow inserting the v2 format more easily.

One thing that doesn't exist currently is a documentation file describing
the packed-refs file format. I would add that file in this part before
submitting it for full review. (I also haven't written the file format doc
for the packed-refs v2 format, either.)


Part V: Implement the v2 file format
====================================

[5] https://github.com/derrickstolee/git/pull/27 Packed-refs v2 Part V: the
v2 file format (Patches 22-35)

This is the real meat of the work. Perhaps there are ways to split it
further, but for now this is what I have ready. The very last patch does a
complete performance comparison for a repo with many refs.

The format is not yet documented, but is broken up into these pieces:

 1. The refs data chunk stores the same data as the packed-refs file, but
    each ref is broken down as follows: the ref name (with trailing zero),
    the OID for the ref in its raw bytes, and (if necessary) the peeled OID
    for the ref in its raw bytes. The refs are sorted lexicographically.

 2. The ref offsets chunk is a single column of 64-bit offsets into the refs
    chunk indicating where each ref starts. The most-significant bit of that
    value indicates whether or not there is a peeled OID.

 3. The prefix data chunk lists a set of ref prefixes (currently writes only
    allow depth-2 prefixes, such as refs/heads/ and refs/tags/). When
    present, these prefixes are written in this chunk and not in the refs
    data chunk. The prefixes are sorted lexicographically.

 4. The prefix offset chunk has two 32-bit integer columns. The first column
    stores the offset within the prefix data chunk to the start of the
    prefix string. The second column points to the row position for the
    first ref that has a name greater than this prefix (the 0th prefix is
    assumed to start at row 0, so we can read prefix i's range as running
    from row[i-1] to row[i]).

Between using raw OIDs and storing the depth-2 prefixes only once, this
format compresses the file to ~60% of its v1 size. (The format allows not
writing the prefix chunks, and the prefix chunks are implemented after the
basics of the ref chunks are complete.)
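
Putting the pieces together, a v2 file would look roughly like the following
(the chunk order is illustrative, the prefix chunks are optional as noted
above, and the trailing hash may be a null OID when hashing is skipped):

  | header | refs data | ref offsets | prefix data | prefix offsets | TOC | trailing hash |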

The write times are reduced by a fraction similar to the size difference.
Reads are sped up somewhat, and we have the potential to do a ref count by
prefix much faster: binary search for the start and end of the prefix, then
subtract the row positions, instead of scanning the refs in between to
count them.
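
As an illustration of that last point, a count of the refs under a stored
prefix reduces to two lookups in the prefix offset chunk. A rough sketch,
assuming the chunks have already been parsed into arrays (all names here
are invented for the example):

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  /*
   * prefixes[] holds the sorted prefix strings; row_end[i] is the row
   * position just past the last ref with prefix i (the row before the
   * 0th prefix is taken to be 0).
   */
  static size_t count_refs_with_prefix(const char **prefixes,
                                       const uint32_t *row_end,
                                       size_t nr_prefixes,
                                       const char *prefix)
  {
          size_t lo = 0, hi = nr_prefixes;

          /* binary search for the first prefix >= the query */
          while (lo < hi) {
                  size_t mid = lo + (hi - lo) / 2;
                  if (strcmp(prefixes[mid], prefix) < 0)
                          lo = mid + 1;
                  else
                          hi = mid;
          }
          if (lo == nr_prefixes || strcmp(prefixes[lo], prefix))
                  return 0;       /* prefix not present in the file */

          /* refs for prefix lo occupy rows [row_end[lo-1], row_end[lo]) */
          return row_end[lo] - (lo ? row_end[lo - 1] : 0);
  }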


Relationship to Reftable
========================

I mentioned earlier that I had considered using reftable as a way to achieve
the stated goals. With the current state of that work, I'm not confident
that it is the right approach here.

My main worry is that reftable is more complicated than we need for a
typical Git repository on a typical filesystem. This makes testing the
format critical, and we do not seem to be near that point. The v2 format
here is very similar to existing Git file formats
since it uses the chunk-format API. This means that the amount of code
custom to just the v2 format is quite small.

As mentioned, the current extension plan [6] only allows reftable or files
and does not allow for a mix of both. This RFC introduces the possibility
that both could co-exist. Using that multi-valued approach means that I'm
able to test the v2 packed-refs file format almost as well as the v1 file
format within this RFC. (More tests need to be added that are specific to
this format, but I'm waiting for confirmation that this is an acceptable
direction.) At the very least, this multi-valued approach could be used as a
way to allow using the reftable format as a drop-in replacement for the
packed-refs file, as well as upgrading an existing repo to use reftable.
That might even help the integration process to allow the reftable format to
be tested at least by some subset of tests instead of waiting for a full
test suite update.

I'm interested to hear from people more involved in the reftable work to see
the status of that project and how it matches or differs from my
perspective.

The one thing I can say is that if the reftable work had not already begun,
then this RFC is how I would have approached a new ref format.

I look forward to your feedback!

Thanks,

 * Stolee

[6]
https://github.com/git/git/pull/1215/files#diff-a30f88b458b1f01e7a67e72576584b5b77ddb0362e40da6f7bf4a9ddf79db7b8R41-R48
The draft version of extensions.refFormat for reftable.

[7] https://lore.kernel.org/git/4696bd93-9406-0abd-25ec-a739665a24d5@web.de/
Re: [PATCH 00/15] Refactor chunk-format into an API (where René recommends a
trailing table of contents)

Derrick Stolee (30):
  hashfile: allow skipping the hash function
  read-cache: add index.computeHash config option
  extensions: add refFormat extension
  config: fix multi-level bulleted list
  repository: wire ref extensions to ref backends
  refs: allow loose files without packed-refs
  chunk-format: number of chunks is optional
  chunk-format: document trailing table of contents
  chunk-format: store chunk offset during write
  chunk-format: allow trailing table of contents
  chunk-format: parse trailing table of contents
  refs: extract packfile format to new file
  packed-backend: extract add_write_error()
  packed-backend: extract iterator/updates merge
  packed-backend: create abstraction for writing refs
  config: add config values for packed-refs v2
  packed-backend: create shell of v2 writes
  packed-refs: write file format version 2
  packed-refs: read file format v2
  packed-refs: read optional prefix chunks
  packed-refs: write prefix chunks
  packed-backend: create GIT_TEST_PACKED_REFS_VERSION
  t1409: test with packed-refs v2
  t5312: allow packed-refs v2 format
  t5502: add PACKED_REFS_V1 prerequisite
  t3210: require packed-refs v1 for some tests
  t*: skip packed-refs v2 over http tests
  ci: run GIT_TEST_PACKED_REFS_VERSION=2 in some builds
  p1401: create performance test for ref operations
  refs: skip hashing when writing packed-refs v2

 Documentation/config.txt            |   2 +
 Documentation/config/extensions.txt |  76 ++-
 Documentation/config/index.txt      |   8 +
 Documentation/config/refs.txt       |  13 +
 Documentation/gitformat-chunk.txt   |  26 +-
 Makefile                            |   2 +
 cache.h                             |   2 +
 chunk-format.c                      | 109 +++-
 chunk-format.h                      |  18 +-
 ci/run-build-and-tests.sh           |   1 +
 commit-graph.c                      |   2 +-
 csum-file.c                         |  14 +-
 csum-file.h                         |   7 +
 midx.c                              |   2 +-
 read-cache.c                        |  22 +-
 refs.c                              |  24 +-
 refs/files-backend.c                |   8 +-
 refs/packed-backend.c               | 880 +++++++---------------------
 refs/packed-backend.h               | 281 +++++++++
 refs/packed-format-v1.c             | 456 ++++++++++++++
 refs/packed-format-v2.c             | 624 ++++++++++++++++++++
 refs/refs-internal.h                |   9 +
 repository.c                        |   2 +
 repository.h                        |   7 +
 setup.c                             |  26 +
 t/perf/p1401-ref-operations.sh      |  52 ++
 t/t1409-avoid-packing-refs.sh       |  22 +-
 t/t1600-index.sh                    |   8 +
 t/t3210-pack-refs.sh                |   8 +-
 t/t3212-ref-formats.sh              | 100 ++++
 t/t5502-quickfetch.sh               |   2 +-
 t/t5539-fetch-http-shallow.sh       |   7 +
 t/t5541-http-push-smart.sh          |   7 +
 t/t5542-push-http-shallow.sh        |   7 +
 t/t5551-http-fetch-smart.sh         |   7 +
 t/t5558-clone-bundle-uri.sh         |   7 +
 t/test-lib.sh                       |   4 +
 37 files changed, 2157 insertions(+), 695 deletions(-)
 create mode 100644 Documentation/config/refs.txt
 create mode 100644 refs/packed-format-v1.c
 create mode 100644 refs/packed-format-v2.c
 create mode 100755 t/perf/p1401-ref-operations.sh
 create mode 100755 t/t3212-ref-formats.sh


base-commit: c03801e19cb8ab36e9c0d17ff3d5e0c3b0f24193
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1408%2Fderrickstolee%2Frefs%2Frfc-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1408/derrickstolee/refs/rfc-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1408

Comments

Derrick Stolee Nov. 9, 2022, 3:15 p.m. UTC | #1
On 11/7/2022 1:35 PM, Derrick Stolee via GitGitGadget wrote:
> This RFC is quite long, but the length seemed necessary to actually provide
> and end-to-end implementation that demonstrates the packed-refs v2 format
> along with test coverage (via the new GIT_TEST_PACKED_REFS_VERSION
> variable).

Apologies. I had intended to CC a long list of people, but I messed up
the "cc:" lines on my GGG PR. Appropriate people are CC'd. Please add
anyone who I might have missed.

Thanks,
-Stolee
Elijah Newren Nov. 11, 2022, 11:28 p.m. UTC | #2
On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Introduction
> ============
>
> I became interested in our packed-ref format based on the asymmetry between
> ref updates and ref deletions: if we delete a packed ref, then the
> packed-refs file needs to be rewritten. Compared to writing a loose ref,
> this is an O(N) cost instead of O(1).
>
> In this way, I set out with some goals:
>
>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
>    updates.

Performance is always nice.  :-)

>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
>    refs and creating a clear way to snapshot all refs at a given point in
>    time.

Is this secondary goal the actual goal you have, or just the
implementation by which you get the real underlying goal?

To me, it appears that such a capability would solve both (a) D/F
conflict problems (i.e. the ability to simultaneously have a
refs/heads/feature and refs/heads/feature/shiny ref), and (b) case
sensitivity issues in refnames (i.e. inability of some users to work
with both a refs/heads/feature and a refs/heads/FeAtUrE, due to
constraints of their filesystem and the loose storage mechanism).  Are
either of those the goal you are trying to achieve (I think both would
be really nice, more so than the performance goal you have), or is
there another?

> I also had one major non-goal to keep things focused:
>
>  * (Non-goal) Update the reflog format.
>
> After carefully considering several options, it seemed that there are two
> solutions that can solve this effectively:
>
>  1. Wait for reftable to be integrated into Git.
>  2. Update the packed-refs backend to have a stacked version.
>
> The reftable work seems currently dormant. The format is pretty complicated
> and I have a difficult time seeing a way forward for it to be fully
> integrated into Git. Personally, I'd prefer a more incremental approach with
> formats that are built for a basic filesystem. During the process, we can
> create APIs within Git that can benefit other file formats within Git.
>
> Further, there is a simpler model that satisfies my primary goal without the
> complication required for the secondary goal. Suppose we create a stacked
> packed-refs file but only have two layers: the first (base) layer is created
> when git pack-refs collapses the full stack and adds the loose ref updates
> to the packed-refs file; the second (top) layer contains only ref deletions
> (allowing null OIDs to indicate a deleted ref). Then, ref deletions would
> only need to rewrite that top layer, making ref deletions take O(deletions)
> time instead of O(all refs) time. With a reasonable schedule to squash the
> packed-refs stack, this would be a dramatic improvement. (A prototype
> implementation showed that updating a layer of 1,000 deletions takes only
> twice the time as writing a single loose ref.)

Makes sense.  If a ref is re-introduced after deletion, then do you
remove it from the deletion layer and then write the single loose ref?

> If we want to satisfy the secondary goal of passing all ref updates through
> the packed storage, then more complicated layering would be necessary. The
> point of bringing this up is that we have incremental goals along the way to
> that final state that give us good stopping points to test the benefits of
> each step.

I like the incremental plan.  Your primary goal perhaps benefits
hosting providers the most, while the second appears to me to be an
interesting usability improvement (some of my users might argue it's
even a bugfix) that would affect users with far fewer refs as well.
So, lots of benefits and we get some along the way to the final plan.

> Stacking the packed-refs format introduces several interesting strategy
> points that are complicated to resolve. Before we can do that, we first need
> to establish a way to modify the ref format of a Git repository. Hence, we
> need a new extension for the ref formats.
>
> To simplify the first update to the ref formats, it seemed better to add a
> new file format version to the existing packed-refs file format. This format
> has the exact lock/write/rename mechanics of the current packed-refs format,
> but uses a file format that structures the information in a more compact
> way. It uses the chunk-format API, with some tweaks. This format update is
> useful to the final goal of a stacked packed-refs API, since each layer will
> have faster reads and writes. The main reason to do this first is that it is
> much simpler to understand the value-add (smaller files means faster
> performance).
>
>
> RFC Organization
> ================
>
> This RFC is quite long, but the length seemed necessary to actually provide
> and end-to-end implementation that demonstrates the packed-refs v2 format
> along with test coverage (via the new GIT_TEST_PACKED_REFS_VERSION
> variable).
>
> For convenience, I've broken each section of the full RFC into parts, which
> resembles how I intend to submit the pieces for full review. These parts are
> available as pull requests in my fork, but here is a breakdown:


> Part I: Optionally hash the index
> =================================
>
> [1] https://github.com/derrickstolee/git/pull/23 Packed-refs v2 Part I:
> Optionally hash the index (Patches 1-2)
>
> The chunk-format API uses the hashfile API as a buffered write, but also all
> existing formats that use the chunk-format API also have a trailing hash as
> part of the format. Since the packed-refs file has a critical path involving
> its write speed (deleting a packed ref), it seemed important to allow
> apples-to-apples comparison between the v1 and v2 format by skipping the
> hashing. This is later toggled by a config option.
>
> In this part, the focus is on allowing the hashfile API to ignore updating
> the hash during the buffered writes. We've been using this in microsoft/git
> to optionally speed up index writes, which patch 2 introduces here. The file
> format instead writes a null OID which would look like a corrupt file to an
> older 'git fsck'. Before submitting a full version, I would update 'git
> fsck' to ignore a null OID in all of our file formats that include a
> trailing hash. Since the index is more short-lived than other formats (such
> as pack-files) this trailing hash is less useful. The write time is also
> critical as the performance tests demonstrate.

This feels like a diversion from your goals.  Should it really be up-front?

Reading through the patches, the first patch does appear to be
necessary but isn't well motivated from the cover letter.  The second
patch seems orthogonal to your series, though it is really nice to see
index writes dropping almost to half the time.

> Part II: Create extensions.refFormat
> ====================================
>
> [2] https://github.com/derrickstolee/git/pull/24 Packed-refs v2 Part II:
> create extensions.refFormat (Patches 3-7)
>
> This part is a critical concept that has yet to be defined in the Git
> codebase. We have no way to incrementally modify the ref format. Since refs
> are so critical, we cannot add an optionally-understood layer on top (like
> we did with the multi-pack-index and commit-graph files). The reftable draft
> [6] proposes the same extension name (extensions.refFormat) but focuses
> instead on only a single value. This means that the reftable must be defined
> at git init or git clone time and cannot be upgraded from the files backend.
>
> In this RFC, I propose a different model that allows for more customization
> and incremental updates. The extensions.refFormat config key is multi-valued
> and defaults to the list of files and packed.

This last sentence doesn't parse that well for me.  Perhaps "...and
defaults to a combination of 'files' and 'packed', meaning supporting
both loose refs and packed refs "?

> In the context of this RFC,
> the intention is to be able to add packed-v2 so the list of all three values
> would allow Git to write and read either file format version (v1 or v2). In
> the larger scheme, the extension could allow restricting to only loose refs
> (just files) or only packed-refs (just packed) or even later when reftable
> is complete, files and reftable could mean that loose refs are the primary
> ref storage, but the reftable format serves as a drop-in replacement for the
> packed-refs file. Not all combinations need to be understood by Git, but
> having them available as an option could be useful for flexibility,
> especially when trying to upgrade existing repositories to new formats.
>
> In the future, beyond the scope of this RFC, it would be good to add a
> stacked value that allows a stack of files in packed-refs format (whose
> version is specified by the packed or packed-v2 values) so we can further
> speed up writes to the packed layer. Depending on how well that works, we
> could focus on speeding up ref deletions or sending all ref writes straight
> to the packed-refs layer. With the option to keep the loose refs storage, we
> have flexibility to explore that space incrementally when we have time to
> get to it.
>
>
> Part III: Allow a trailing table-of-contents in the chunk-format API
> ====================================================================
>
> [3] https://github.com/derrickstolee/git/pull/25 Packed-refs v2 Part III:
> trailing table of contents in chunk-format (Patches 8-17)
>
> In order to optimize the write speed of the packed-refs v2 file format, we
> want to write immediately to the file as we stream existing refs from the
> current refs. The current chunk-format API requires computing the chunk
> lengths in advance, which can slow down the write and take more memory than
> necessary. Using a trailing table of contents solves this problem, and was
> recommended earlier [7]. We just didn't have enough evidence to justify the
> work to update the existing chunk formats. Here, we update the API in
> advance of using in the packed-refs v2 format.
>
> We could consider updating the commit-graph and multi-pack-index formats to
> use trailing table of contents, but it requires a version bump. That might
> be worth it in the case of the commit-graph where computing the size of the
> changed-path Bloom filters chunk requires a lot of memory at the moment.
> After this chunk-format API update is reviewed and merged, we can pursue
> those directions more closely. We would want to investigate the formats more
> carefully to see if we want to update the chunks themselves as well as some
> header information.

I like how you point out additional benefits the series could provide,
but leave them out.  Perhaps do the same with the optional index
hashing in patch 2?

> Part IV: Abstract some parts of the v1 file format
> ==================================================
>
> [4] https://github.com/derrickstolee/git/pull/26 Packed-refs v2 Part IV:
> abstract some parts of the v1 file format (Patches 18-21)
>
> These patches move the part of the refs/packed-backend.c file that deal with
> the specifics of the packed-refs v1 file format into a new file:
> refs/packed-format-v1.c. This also creates an abstraction layer that will
> allow inserting the v2 format more easily.
>
> One thing that doesn't exist currently is a documentation file describing
> the packed-refs file format. I would add that file in this part before
> submitting it for full review. (I also haven't written the file format doc
> for the packed-refs v2 format, either.)

Sounds like another win-win opportunity to get someone from the
community to contribute.  :-)   ..Or maybe that's not the best
strategy, since recent empirical evidence suggests that trick doesn't
work.  Oh well, it was worth a shot.

> Part V: Implement the v2 file format
> ====================================
>
> [5] https://github.com/derrickstolee/git/pull/27 Packed-refs v2 Part V: the
> v2 file format (Patches 22-35)
>
> This is the real meat of the work. Perhaps there are ways to split it
> further, but for now this is what I have ready. The very last patch does a
> complete performance comparison for a repo with many refs.
>
> The format is not yet documented, but is broken up into these pieces:
>
>  1. The refs data chunk stores the same data as the packed-refs file, but
>     each ref is broken down as follows: the ref name (with trailing zero),
>     the OID for the ref in its raw bytes, and (if necessary) the peeled OID
>     for the ref in its raw bytes. The refs are sorted lexicographically.
>
>  2. The ref offsets chunk is a single column of 64-bit offsets into the refs
>     chunk indicating where each ref starts. The most-significant bit of that
>     value indicates whether or not there is a peeled OID.
>
>  3. The prefix data chunk lists a set of ref prefixes (currently writes only
>     allow depth-2 prefixes, such as refs/heads/ and refs/tags/). When
>     present, these prefixes are written in this chunk and not in the refs
>     data chunk. The prefixes are sorted lexicographically.
>
>  4. The prefix offset chunk has two 32-bit integer columns. The first column
>     stores the offset within the prefix data chunk to the start of the
>     prefix string. The second column points to the row position for the
>     first ref that has name greater than this prefix (the 0th prefix is
>     assumed to start at row 0, so we can interpret the prefix range from
>     row[i-1] and row[i]).
>
> Between using raw OIDs and storing the depth-2 prefixes only once, this
> format compresses the file to ~60% of its v1 size. (The format allows not
> writing the prefix chunks, and the prefix chunks are implemented after the
> basics of the ref chunks are complete.)
>
> The write times are reduced in a similar fraction to the size difference.
> Reads are sped up somewhat, and we have the potential to do a ref count by
> prefix much faster by doing a binary search for the start and end of the
> prefix and then subtracting the row positions instead of scanning the file
> between to count refs.
>
>
> Relationship to Reftable
> ========================
>
> I mentioned earlier that I had considered using reftable as a way to achieve
> the stated goals. With the current state of that work, I'm not confident
> that it is the right approach here.
>
> My main worry is that the reftable is more complicated than we need for a
> typical Git repository that is based on a typical filesystem. This makes
> testing the format very critical, and we seem to not be near reaching that
> approach. The v2 format here is very similar to existing Git file formats
> since it uses the chunk-format API. This means that the amount of code
> custom to just the v2 format is quite small.
>
> As mentioned, the current extension plan [6] only allows reftable or files
> and does not allow for a mix of both. This RFC introduces the possibility
> that both could co-exist. Using that multi-valued approach means that I'm
> able to test the v2 packed-refs file format almost as well as the v1 file
> format within this RFC. (More tests need to be added that are specific to
> this format, but I'm waiting for confirmation that this is an acceptable
> direction.) At the very least, this multi-valued approach could be used as a
> way to allow using the reftable format as a drop-in replacement for the
> packed-refs file, as well as upgrading an existing repo to use reftable.
> That might even help the integration process to allow the reftable format to
> be tested at least by some subset of tests instead of waiting for a full
> test suite update.

Thanks for providing this background; I also like how it potentially
makes it easier to adopt reftable in the future.

> I'm interested to hear from people more involved in the reftable work to see
> the status of that project and how it matches or differs from my
> perspective.

That wouldn't be me, but I appreciate the comparisons to help me
orient where things are.
Derrick Stolee Nov. 14, 2022, 12:07 a.m. UTC | #3
On 11/11/22 6:28 PM, Elijah Newren wrote:
> On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> Introduction
>> ============
>>
>> I became interested in our packed-ref format based on the asymmetry between
>> ref updates and ref deletions: if we delete a packed ref, then the
>> packed-refs file needs to be rewritten. Compared to writing a loose ref,
>> this is an O(N) cost instead of O(1).
>>
>> In this way, I set out with some goals:
>>
>>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
>>    updates.
> 
> Performance is always nice.  :-)
> 
>>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
>>    refs and creating a clear way to snapshot all refs at a given point in
>>    time.
> 
> Is this secondary goal the actual goal you have, or just the
> implementation by which you get the real underlying goal?

To me, the primary goal takes precedence. It turns out that the best
way to solve for that goal happens to also make it possible to store
all refs in a packed form, because we can update the packed form
much faster than our current setup. There are alternatives that I
considered (and prototyped) that were more specific to the deletions
case, but they were not actually as fast as the stacked method. Those
alternatives also would never help reach the secondary goal, but I
probably would have considered them anyway if they were faster, if
only for their simplicity.

> To me, it appears that such a capability would solve both (a) D/F
> conflict problems (i.e. the ability to simultaneously have a
> refs/heads/feature and refs/heads/feature/shiny ref), and (b) case
> sensitivity issues in refnames (i.e. inability of some users to work
> with both a refs/heads/feature and a refs/heads/FeAtUrE, due to
> constraints of their filesystem and the loose storage mechanism).  Are
> either of those the goal you are trying to achieve (I think both would
> be really nice, more so than the performance goal you have), or is
> there another?

For a Git host provider, these D/F conflict and case-sensitivity
situations probably would need to stay as restrictions on the
server side for quite some time because we don't want users on
older Git clients to be unable to fetch a repository just because
we updated our ref storage to allow for such possibilities.

The biggest benefit on the server side is actually for consistency
checks. Using a stacked packed-refs (especially with a tip file
that describes all of the layers) allows an atomic way to take a
snapshot of the refs and run a checksum operation on their values.
With loose refs, concurrent updates can modify the checksum during
its computation. This is a super niche reason for this, but it's
nice that the performance-only focus also ends up with a design
that satisfies this goal.

...

>> Further, there is a simpler model that satisfies my primary goal without the
>> complication required for the secondary goal. Suppose we create a stacked
>> packed-refs file but only have two layers: the first (base) layer is created
>> when git pack-refs collapses the full stack and adds the loose ref updates
>> to the packed-refs file; the second (top) layer contains only ref deletions
>> (allowing null OIDs to indicate a deleted ref). Then, ref deletions would
>> only need to rewrite that top layer, making ref deletions take O(deletions)
>> time instead of O(all refs) time. With a reasonable schedule to squash the
>> packed-refs stack, this would be a dramatic improvement. (A prototype
>> implementation showed that updating a layer of 1,000 deletions takes only
>> twice the time as writing a single loose ref.)
> 
> Makes sense.  If a ref is re-introduced after deletion, then do you
> remove it from the deletion layer and then write the single loose ref?

Loose refs always take precedence over the packed layer, so if the loose
ref exists we ignore its status in the packed layer. That allows us to not
update the packed layer unless it is a ref deletion or ref maintenance.

>> If we want to satisfy the secondary goal of passing all ref updates through
>> the packed storage, then more complicated layering would be necessary. The
>> point of bringing this up is that we have incremental goals along the way to
>> that final state that give us good stopping points to test the benefits of
>> each step.
> 
> I like the incremental plan.  Your primary goal perhaps benefits
> hosting providers the most, while the second appears to me to be an
> interesting usability improvement (some of my users might argue it's
> even a bugfix) that would affect users with far fewer refs as well.
> So, lots of benefits and we get some along the way to the final plan.

As I mentioned earlier in this reply, we have a ways to go before we
can realize the usability issue, but we can get started somewhere and
see how things progress.

>> In this part, the focus is on allowing the hashfile API to ignore updating
>> the hash during the buffered writes. We've been using this in microsoft/git
>> to optionally speed up index writes, which patch 2 introduces here. The file
>> format instead writes a null OID which would look like a corrupt file to an
>> older 'git fsck'. Before submitting a full version, I would update 'git
>> fsck' to ignore a null OID in all of our file formats that include a
>> trailing hash. Since the index is more short-lived than other formats (such
>> as pack-files) this trailing hash is less useful. The write time is also
>> critical as the performance tests demonstrate.
> 
> This feels like a diversion from your goals.  Should it really be up-front?
> 
> Reading through the patches, the first patch does appear to be
> necessary but isn't well motivated from the cover letter.  The second
> patch seems orthogonal to your series, though it is really nice to see
> index writes dropping almost to half the time.

This one I put up front only because it is a good candidate for
submitting soon, in parallel to any discussion about the rest of the
RFC.

The last part uses the chunk-format API for the packed-refs v2 format,
but the write speed is critical and hence we need this ability to skip
the hashing. Patch 1 mentions the application in the refs space as a
potential future, but the immediate benefit to index updates can help
lots of users right now.

Even if the community said "this packed-refs v2 is a bad idea" I would
still want to submit these two patches. That's the main reason they are
up front.

(Also: one thing I didn't mention in this cover letter or in the later
patches is that we could enable the hashing on the packed-refs v2 via
a config value, eventually. That would allow users who really care
about validating their file hashes an option to do so. While this
would make packed-refs v2 writes slower than v1 writes, the v1 format
does not have a checksum available, so that might be a valuable option
to those users.)

>> Part II: Create extensions.refFormat
>> ====================================
>>
>> [2] https://github.com/derrickstolee/git/pull/24 Packed-refs v2 Part II:
>> create extensions.refFormat (Patches 3-7)
>>
>> This part is a critical concept that has yet to be defined in the Git
>> codebase. We have no way to incrementally modify the ref format. Since refs
>> are so critical, we cannot add an optionally-understood layer on top (like
>> we did with the multi-pack-index and commit-graph files). The reftable draft
>> [6] proposes the same extension name (extensions.refFormat) but focuses
>> instead on only a single value. This means that the reftable must be defined
>> at git init or git clone time and cannot be upgraded from the files backend.
>>
>> In this RFC, I propose a different model that allows for more customization
>> and incremental updates. The extensions.refFormat config key is multi-valued
>> and defaults to the list of files and packed.
> 
> This last sentence doesn't parse that well for me.  Perhaps "...and
> defaults to a combination of 'files' and 'packed', meaning supporting
> both loose refs and packed refs "?

Sounds good to me. Thanks.

>> Part III: Allow a trailing table-of-contents in the chunk-format API
>> ====================================================================
>>
>> [3] https://github.com/derrickstolee/git/pull/25 Packed-refs v2 Part III:
>> trailing table of contents in chunk-format (Patches 8-17)
>>
>> In order to optimize the write speed of the packed-refs v2 file format, we
>> want to write immediately to the file as we stream existing refs from the
>> current refs. The current chunk-format API requires computing the chunk
>> lengths in advance, which can slow down the write and take more memory than
>> necessary. Using a trailing table of contents solves this problem, and was
>> recommended earlier [7]. We just didn't have enough evidence to justify the
>> work to update the existing chunk formats. Here, we update the API in
>> advance of using in the packed-refs v2 format.
>>
>> We could consider updating the commit-graph and multi-pack-index formats to
>> use trailing table of contents, but it requires a version bump. That might
>> be worth it in the case of the commit-graph where computing the size of the
>> changed-path Bloom filters chunk requires a lot of memory at the moment.
>> After this chunk-format API update is reviewed and merged, we can pursue
>> those directions more closely. We would want to investigate the formats more
>> carefully to see if we want to update the chunks themselves as well as some
>> header information.
> 
> I like how you point out additional benefits the series could provide,
> but leave them out.  Perhaps do the same with the optional index
> hashing in patch 2?

The index hashing is just _so easy_ that I couldn't bring myself to
leave it out. I didn't include the skip_hash option for these other
formats since the write times are not as critical for these files as
write time is for the index.

Updating these formats to a v2 that uses a trailing format (and
likely other small deviations based on what we've learned since they
were first created) would be an interesting direction to pursue with
care. Absolutely not something to do while blocking the refs work.

>> Part IV: Abstract some parts of the v1 file format
>> ==================================================
>>
>> [4] https://github.com/derrickstolee/git/pull/26 Packed-refs v2 Part IV:
>> abstract some parts of the v1 file format (Patches 18-21)
>>
>> These patches move the part of the refs/packed-backend.c file that deal with
>> the specifics of the packed-refs v1 file format into a new file:
>> refs/packed-format-v1.c. This also creates an abstraction layer that will
>> allow inserting the v2 format more easily.
>>
>> One thing that doesn't exist currently is a documentation file describing
>> the packed-refs file format. I would add that file in this part before
>> submitting it for full review. (I also haven't written the file format doc
>> for the packed-refs v2 format, either.)
> 
> Sounds like another win-win opportunity to get someone from the
> community to contribute.  :-)   ..Or maybe that's not the best
> strategy, since recent empirical evidence suggests that trick doesn't
> work.  Oh well, it was worth a shot.

Hey, if someone beats me to it, I won't complain. They should expect
me to CC them on reviews for the packed-refs v2 format ;)

Thanks,
-Stolee
Elijah Newren Nov. 15, 2022, 2:47 a.m. UTC | #4
On Sun, Nov 13, 2022 at 4:07 PM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 11/11/22 6:28 PM, Elijah Newren wrote:
> > On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> Introduction
> >> ============
> >>
> >> I became interested in our packed-ref format based on the asymmetry between
> >> ref updates and ref deletions: if we delete a packed ref, then the
> >> packed-refs file needs to be rewritten. Compared to writing a loose ref,
> >> this is an O(N) cost instead of O(1).
> >>
> >> In this way, I set out with some goals:
> >>
> >>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
> >>    updates.
> >
> > Performance is always nice.  :-)
> >
> >>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
> >>    refs and creating a clear way to snapshot all refs at a given point in
> >>    time.
> >
> > Is this secondary goal the actual goal you have, or just the
> > implementation by which you get the real underlying goal?
>
> To me, the primary goal takes precedence. It turns out that the best
> way to solve for that goal happens to also make it possible to store
> all refs in a packed form, because we can update the packed form
> much faster than our current setup. There are alternatives that I
> considered (and prototyped) that were more specific to the deletions
> case, but they were not actually as fast as the stacked method. Those
> alternatives also would never help reach the secondary goal, but I
> probably would have considered them anyway if they were faster, if
> only for their simplicity.

That's orthogonal to my question, though.  For your primary goal, you
stated it in a form where it was obvious what benefit it would provide
to end users.  Your secondary goal, as stated, didn't list any benefit
to end users that I could see (update: reading the rest of your
response it appears I just didn't understand it), so I was trying to
guess at why your secondary goal might be a goal, i.e. what the real
secondary goal was.

> > To me, it appears that such a capability would solve both (a) D/F
> > conflict problems (i.e. the ability to simultaneously have a
> > refs/heads/feature and refs/heads/feature/shiny ref), and (b) case
> > sensitivity issues in refnames (i.e. inability of some users to work
> > with both a refs/heads/feature and a refs/heads/FeAtUrE, due to
> > constraints of their filesystem and the loose storage mechanism).  Are
> > either of those the goal you are trying to achieve (I think both would
> > be really nice, more so than the performance goal you have), or is
> > there another?
>
> For a Git host provider, these D/F conflict and case-sensitivity
> situations probably would need to stay as restrictions on the
> server side for quite some time because we don't want users on
> older Git clients to be unable to fetch a repository just because
> we updated our ref storage to allow for such possibilities.

Okay, but even if not used on the server side, this capability could
still be used on the client side and provide a big benefit to end
users.

But I think there's a minor issue with what you stated; as far as I
can tell, there is no case-sensitivity restriction on the server side
for GitHub currently, and users do currently have problems cloning and
using repositories with branches that differ in case only.  See e.g.
https://github.com/newren/git-filter-repo/issues/48 and the multiple
duplicates which reference that issue.  We've also had issues at
$DAYJOB, though for GHE we added some hooks to deny creating branches
that differ only in case from another branch to avoid the problem.

Also, D/F restrictions on the server do not stop users from having D/F
problems when fetching.  If users forget to use `--prune`, then when a
refs/heads/foo that has already been fetched is deleted and replaced by a
refs/heads/foo/bar, the user gets errors.  This issue actually
caused a bit of a fire-drill for us just recently.

So both kinds of problems already exist, for users with any git client
version (although the former only for users with unfortunate file
systems).  And both problems cause pain.  Both issues are caused by
loose refs, so limiting git storage to packed refs would fix both
issues.

> The biggest benefit on the server side is actually for consistency
> checks. Using a stacked packed-refs (especially with a tip file
> that describes all of the layers) allows an atomic way to take a
> snapshot of the refs and run a checksum operation on their values.
> With loose refs, concurrent updates can modify the checksum during
> its computation. This is a super niche reason for this, but it's
> nice that the performance-only focus also ends up with a design
> that satisfies this goal.

Ah...so this is the reason for your secondary goal?  Re-reading it
looks like you did state this, I just missed it without the longer
explanation.

Anyway, it might be worth calling out in your cover letter that there
are (at least) three benefits to this secondary goal of yours -- the
one you list here, plus the two I list above.
Derrick Stolee Nov. 16, 2022, 2:45 p.m. UTC | #5
On 11/14/22 9:47 PM, Elijah Newren wrote:
> On Sun, Nov 13, 2022 at 4:07 PM Derrick Stolee <derrickstolee@github.com> wrote:
>>
>> On 11/11/22 6:28 PM, Elijah Newren wrote:
>>> On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
>>> <gitgitgadget@gmail.com> wrote:
>>>>
>>>> Introduction
>>>> ============
>>>>
>>>> I became interested in our packed-ref format based on the asymmetry between
>>>> ref updates and ref deletions: if we delete a packed ref, then the
>>>> packed-refs file needs to be rewritten. Compared to writing a loose ref,
>>>> this is an O(N) cost instead of O(1).
>>>>
>>>> In this way, I set out with some goals:
>>>>
>>>>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
>>>>    updates.
>>>
>>> Performance is always nice.  :-)
>>>
>>>>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
>>>>    refs and creating a clear way to snapshot all refs at a given point in
>>>>    time.
>>>
>>> Is this secondary goal the actual goal you have, or just the
>>> implementation by which you get the real underlying goal?
>>
>> To me, the primary goal takes precedence. It turns out that the best
>> way to solve for that goal happens to also make it possible to store
>> all refs in a packed form, because we can update the packed form
>> much faster than our current setup. There are alternatives that I
>> considered (and prototyped) that were more specific to the deletions
>> case, but they were not actually as fast as the stacked method. Those
>> alternatives also would never help reach the secondary goal, but I
>> probably would have considered them anyway if they were faster, if
>> only for their simplicity.
> 
> That's orthogonal to my question, though.  For your primary goal, you
> stated it in a form where it was obvious what benefit it would provide
> to end users.  Your secondary goal, as stated, didn't list any benefit
> to end users that I could see (update: reading the rest of your
> response it appears I just didn't understand it), so I was trying to
> guess at why your secondary goal might be a goal, i.e. what the real
> secondary goal was.

The reason is in the goal "creating a clear way to snapshot all refs
at a given point in time". This is a server-side gain with no immediately
visible benefit to users.

The D/F conflicts and case-sensitive parts that could fall from that
are not included in my goals. Part of that is because we would need a
new reflog format to complete that part. Let's take things one step
at a time and handle reflogs after we have ref update performance
handled.

>>> To me, it appears that such a capability would solve both (a) D/F
>>> conflict problems (i.e. the ability to simultaneously have a
>>> refs/heads/feature and refs/heads/feature/shiny ref), and (b) case
>>> sensitivity issues in refnames (i.e. inability of some users to work
>>> with both a refs/heads/feature and a refs/heads/FeAtUrE, due to
>>> constraints of their filesystem and the loose storage mechanism).  Are
>>> either of those the goal you are trying to achieve (I think both would
>>> be really nice, more so than the performance goal you have), or is
>>> there another?
>>
>> For a Git host provider, these D/F conflict and case-sensitivity
>> situations probably would need to stay as restrictions on the
>> server side for quite some time because we don't want users on
>> older Git clients to be unable to fetch a repository just because
>> we updated our ref storage to allow for such possibilities.
> 
> Okay, but even if not used on the server side, this capability could
> still be used on the client side and provide a big benefit to end
> users.
> 
> But I think there's a minor issue with what you stated; as far as I
> can tell, there is no case-sensitivity restriction on the server side
> for GitHub currently, and users do currently have problems cloning and
> using repositories with branches that differ in case only.  See e.g.
> https://github.com/newren/git-filter-repo/issues/48 and the multiple
> duplicates which reference that issue.  We've also had issues at
> $DAYJOB, though for GHE we added some hooks to deny creating branches
> that differ only in case from another branch to avoid the problem.

Yes, you're right here. We could do better in rejecting case-sensitive
matches upon request.

> Also, D/F restrictions on the server do not stop users from having D/F
> problems when fetching.  If users forget to use `--prune`, then when a
> refs/heads/foo has already been fetched is deleted and replaced by a
> refs/heads/foo/bar, then the user gets errors.  This issue actually
> caused a bit of a fire-drill for us just recently.

And similar to the case-sensitive situation, I'm not sure if we have
checks to avoid D/F conflicts if they happen across the loose/packed
boundary. We might just be using the filesystem as a constraint. I'll
need to dig in more here.

(This is all the more reason why this space is already complicated and
will take some time to unwind.)

> So both kinds of problems already exist, for users with any git client
> version (although the former only for users with unfortunate file
> systems).  And both problems cause pain.  Both issues are caused by
> loose refs, so limiting git storage to packed refs would fix both
> issues.
> 
>> The biggest benefit on the server side is actually for consistency
>> checks. Using a stacked packed-refs (especially with a tip file
>> that describes all of the layers) allows an atomic way to take a
>> snapshot of the refs and run a checksum operation on their values.
>> With loose refs, concurrent updates can modify the checksum during
>> its computation. This is a super niche reason for this, but it's
>> nice that the performance-only focus also ends up with a design
>> that satisfies this goal.
> 
> Ah...so this is the reason for your secondary goal?  Re-reading it
> looks like you did state this, I just missed it without the longer
> explanation.
> 
> Anyway, it might be worth calling out in your cover letter that there
> are (at least) three benefits to this secondary goal of yours -- the
> one you list here, plus the two I list above.

I suppose I assumed that the D/F and case conflicts were a "known"
benefit and a huge motivation of the reftable work. Instead of trying
to solve all of the ref problems at once, I wanted to focus on the
subset that I knew could be solved with a simpler solution, leaving
the full solution to later steps. It would help to be explicit about
how this direction helps solve this problem while also being clear
about how it does not solve it completely.

Thanks,
-Stolee
Elijah Newren Nov. 17, 2022, 4:28 a.m. UTC | #6
On Wed, Nov 16, 2022 at 6:45 AM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 11/14/22 9:47 PM, Elijah Newren wrote:
> > On Sun, Nov 13, 2022 at 4:07 PM Derrick Stolee <derrickstolee@github.com> wrote:
> >>
> >> On 11/11/22 6:28 PM, Elijah Newren wrote:
> >>> On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
> >>> <gitgitgadget@gmail.com> wrote:
[...]
> >>>>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
> >>>>    refs and creating a clear way to snapshot all refs at a given point in
> >>>>    time.
[...]
>
> The reason is in the goal "creating a clear way to snapshot all refs
> at a given point in time". This is a server-side benefit with no
> visible benefit to users, immediately.

Yes, sorry, I just missed it.  I didn't understand it and wrongly
assumed it was continuing to talk about the implementation details
rather than the benefit details.  My bad.

Thanks for patiently correcting me.

> The D/F conflict and case-sensitivity improvements that could fall
> out of that are not included in my goals. Part of the reason is that
> we would need a new reflog format to complete that work. Let's take
> things one step at a time and handle reflogs after we have ref update
> performance handled.

Ah, right, I can see how reflog would affect both of those problems
now that you highlight it, but it hadn't occurred to me before.

> >> The biggest benefit on the server side is actually for consistency
> >> checks. Using a stacked packed-refs (especially with a tip file
> >> that describes all of the layers) allows an atomic way to take a
> >> snapshot of the refs and run a checksum operation on their values.
> >> With loose refs, concurrent updates can modify the checksum during
> >> its computation. This is a super niche reason for this, but it's
> >> nice that the performance-only focus also ends up with a design
> >> that satisfies this goal.
> >
> > Ah...so this is the reason for your secondary goal?  Re-reading it
> > looks like you did state this, I just missed it without the longer
> > explanation.
> >
> > Anyway, it might be worth calling out in your cover letter that there
> > are (at least) three benefits to this secondary goal of yours -- the
> > one you list here, plus the two I list above.
>
> I suppose I assumed that the D/F and case conflicts were a "known"
> benefit and a huge motivation of the reftable work.

Yes, and I thought you had just found a simpler solution to those
problems that might not provide all the benefits of reftable (e.g.
performance with huge numbers of refs) but did solve those particular
problems.  I've only looked at reftable from the surface from a
distance, and I was unaware previously that reflog also affected these
two problems (though it seems obvious in hindsight).  And I do
remember you calling out that you weren't changing the reflog format
in your cover letter, but I didn't understand the ramifications of
that statement at the time.

> Instead of trying
> to solve all of the ref problems at once, I wanted to focus on the
> subset that I knew could be solved with a simpler solution, leaving
> the full solution to later steps. It would help to be explicit about
> how this direction helps solve this problem while also being clear
> about how it does not solve it completely.

It certainly would have helped me.  :-)

Thanks for explaining all these details.
Junio C Hamano Nov. 18, 2022, 11:31 p.m. UTC | #7
Derrick Stolee <derrickstolee@github.com> writes:

> On 11/11/22 6:28 PM, Elijah Newren wrote:
>> On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
>> <gitgitgadget@gmail.com> wrote:
>>>
>>> Introduction
>>> ============
>>>
>>> I became interested in our packed-ref format based on the asymmetry between
>>> ref updates and ref deletions: if we delete a packed ref, then the
>>> packed-refs file needs to be rewritten. Compared to writing a loose ref,
>>> this is an O(N) cost instead of O(1).
>>>
>>> In this way, I set out with some goals:
>>>
>>>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
>>>    updates.
>> 
>> Performance is always nice.  :-)
>> 
>>>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
>>>    refs and creating a clear way to snapshot all refs at a given point in
>>>    time.
>> 
>> Is this secondary goal the actual goal you have, or just the
>> implementation by which you get the real underlying goal?
>
> To me, the primary goal takes precedence. It turns out that the best
> way to solve for that goal happens to also make it possible to store
> all refs in a packed form, because we can update the packed form
> much faster than our current setup. There are alternatives that I
> considered (and prototyped) that were more specific to the deletions
> case, but they were not actually as fast as the stacked method. Those
> alternatives also would never help reach the secondary goal, but I
> probably would have considered them anyway if they were faster, if
> only for their simplicity.

I have been and am still offline and haven't examined this proposal
in detail, but would it be a better longer-term approach to improve
reftable backend, instead of piling more effort on loose+packed
filesystem based backend?
Elijah Newren Nov. 19, 2022, 12:41 a.m. UTC | #8
On Fri, Nov 18, 2022 at 3:31 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <derrickstolee@github.com> writes:
>
> > On 11/11/22 6:28 PM, Elijah Newren wrote:
> >> On Mon, Nov 7, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
> >> <gitgitgadget@gmail.com> wrote:

> I have been and am still offline and haven't examined this proposal
> in detail, but would it be a better longer-term approach to improve
> reftable backend, instead of piling more effort on loose+packed
> filesystem based backend?

Well, Stolee explicitly brought this up multiple times in his cover
letter with various arguments about why he thinks this approach is a
better way to move us on the path towards improved ref handling, and
doesn't see it as excluding the reftable option but just opening us up
to more incremental (and incrementally testable) improvements.  This
question came up early and often in the cover letter; he even ends
with a "Relationship to reftable" section.

But he is clearly open to feedback about whether others agree or
disagree with his thesis.

(I haven't looked much at reftable, so I can't opine on that question,
but Stolee's approach did seem eminently easier to review.  I did have
some questions about his proposal(s) because I didn't quite understand
them, in part due to being unfamiliar with the area.)
Taylor Blau Nov. 19, 2022, 3 a.m. UTC | #9
On Fri, Nov 18, 2022 at 04:41:25PM -0800, Elijah Newren wrote:
> (I haven't looked much at reftable, so I can't opine on that question,
> but Stolee's approach did seem eminently easier to review.  I did have
> some questions about his proposal(s) because I didn't quite understand
> them, in part due to being unfamiliar with the area.)

For what it's worth, I'm in the same boat as you are.

That being said, I do find it somewhat sad that we have reftable bits in
Junio's tree that don't appear to be progressing all that much. So as
much as I would like to see us have fewer reference backends, I'd rather
see whatever "next-gen" backend we end up with have good support and
momentum, even if that means carrying more code.

Thanks,
Taylor
Han-Wen Nienhuys Nov. 28, 2022, 6:56 p.m. UTC | #10
On Mon, Nov 7, 2022 at 7:36 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
>
> Introduction
> ============
>
> I became interested in our packed-ref format based on the asymmetry between
> ref updates and ref deletions: if we delete a packed ref, then the
> packed-refs file needs to be rewritten. Compared to writing a loose ref,
> this is an O(N) cost instead of O(1).
>
> In this way, I set out with some goals:
>
>  * (Primary) Make packed ref deletions be nearly as fast as loose ref
>    updates.
>  * (Secondary) Allow using a packed ref format for all refs, dropping loose
>    refs and creating a clear way to snapshot all refs at a given point in
>    time.
>
> I also had one major non-goal to keep things focused:
>
>  * (Non-goal) Update the reflog format.
>
> After carefully considering several options, it seemed that there are two
> solutions that can solve this effectively:
>
>  1. Wait for reftable to be integrated into Git.
>  2. Update the packed-refs backend to have a stacked version.
>
> The reftable work seems currently dormant. The format is pretty complicated
> and I have a difficult time seeing a way forward for it to be fully
> integrated into Git.

The format is somewhat complicated, and I think it would have been
possible to design a block-oriented sorted-table approach that is
simpler, but the JGit implementation has set it in stone. But, to put
this in perspective, the amount of work for getting the format to
read/write correctly has been completely dwarfed by the effort needed
to make the refs API in git represent a true abstraction boundary.
Also, if you're introducing a new format, one might as well try to
optimize it a bit.

Here are some of the hard problems that I encountered

* Worktrees and the main repository have a separate view of the ref
namespace. This is not explicit in the ref backend API, and there is a
technical limitation that the packed-refs file cannot be in a
worktree. This means that worktrees will always continue to use
loose-ref storage if you only extend the packed-refs backend.

* Symrefs are refs too, but for some reason the packed-refs file
doesn't support them. Does packed-refs v2 support symrefs too?  If you
want to snapshot the state of refs, do you want to snapshot the value
of HEAD too?

* By not changing reflogs, you are making things simpler. (if a
transaction updates the branch that HEAD points to, the reflog for
HEAD has to be updated too. Because reftable updates the reflog
transactionally, this was some extra work)
Then again, I feel the current way that reflogs work is a bit messy,
because directory/file conflicts force reflogs to be deleted at times
that don't make sense from a user perspective.

* There are a lot of commands that store SHA1s in files under .git/,
and access them as if they are a ref (for example: rebase-apply/ ,
CHERRY_PICK_HEAD etc.).
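
For example (assuming a cherry-pick that stops on a conflict):

    $ cat .git/CHERRY_PICK_HEAD        # a bare object ID in a plain file
    $ git rev-parse CHERRY_PICK_HEAD   # yet it resolves like any ref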

> In this RFC, I propose a different model that allows for more customization
> and incremental updates. The extensions.refFormat config key is multi-valued
> and defaults to the list of files and packed. In the context of this RFC,
> the intention is to be able to add packed-v2 so the list of all three values
> would allow Git to write and read either file format version (v1 or v2). In
> the larger scheme, the extension could allow restricting to only loose refs
> (just files) or only packed-refs (just packed) or even later when reftable
> is complete, files and reftable could mean that loose refs are the primary
> ref storage, but the reftable format serves as a drop-in replacement for the
> packed-refs file. Not all combinations need to be understood by Git, but

I'm not sure how feasible this is. reftable also holds reflog data. A
setting {files,reftable} would either not work, or necessitate hairy
merging of data to get the reflogs working correctly.

> In order to optimize the write speed of the packed-refs v2 file format, we
> want to write immediately to the file as we stream existing refs from the
> current refs. The current chunk-format API requires computing the chunk
> lengths in advance, which can slow down the write and take more memory than

yes, this sounds sensible. reftable has the secondary indexes trailing the data.

> Between using raw OIDs and storing the depth-2 prefixes only once, this
> format compresses the file to ~60% of its v1 size. (The format allows not
> writing the prefix chunks, and the prefix chunks are implemented after the
> basics of the ref chunks are complete.)
>
> The write times are reduced in a similar fraction to the size difference.
> Reads are sped up somewhat, and we have the potential to do a ref count by

Do you mean 'enumerate refs' ? Why would you want to count refs by prefix?

> I mentioned earlier that I had considered using reftable as a way to achieve
> the stated goals. With the current state of that work, I'm not confident
> that it is the right approach here.
>
> My main worry is that the reftable is more complicated than we need for a
> typical Git repository that is based on a typical filesystem. This makes
> testing the format very critical, and we seem to not be near reaching that
> approach.

I think the base code of reading and writing the reftable format is
exercised quite exhaustively tested in unit tests. You say 'seem', but
do you have anything concrete to say?

> As mentioned, the current extension plan [6] only allows reftable or files
> and does not allow for a mix of both. This RFC introduces the possibility
> that both could co-exist. Using that multi-valued approach means that I'm
> able to test the v2 packed-refs file format almost as well as the v1 file
> format within this RFC. (More tests need to be added that are specific to
> this format, but I'm waiting for confirmation that this is an acceptable
> direction.) At the very least, this multi-valued approach could be used as a
> way to allow using the reftable format as a drop-in replacement for the
> packed-refs file, as well as upgrading an existing repo to use reftable.

The multi-value approach creates more combinations of how different
pieces of code can interact, so I think it actually makes things more
error-prone.
Also,

> That might even help the integration process to allow the reftable format to
> be tested at least by some subset of tests instead of waiting for a full
> test suite update.

I don't understand this comment. In the current state,
https://github.com/git/git/pull/1215 already passes 922 of the 968
test files if you set GIT_TEST_REFTABLE=1.

See https://github.com/git/git/pull/1215#issuecomment-1329579459 for
details. As you can see, for most test files, it's just a few
individual test cases that fail.

> I'm interested to hear from people more involved in the reftable work to see
> the status of that project and how it matches or differs from my
> perspective.

Overall, I found the loose/packed ref code hard to understand and
full of arbitrary limitations (dir/file conflicts, deleting reflogs
when branches are deleted, locking across loose/packed refs etc.).
The way reftable stacks are set up (with both reflog and ref data
including symrefs in the same file) makes it much easier to verify that
it behaves transactionally.

For deleting refs quickly, it seems that you only need to support
$ZEROID in packed-refs and then implement a ref database as a stack of
packed-ref files? If you're going for minimal effort and minimal
disruption wouldn't that be the place to start?
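
Roughly (the second file name and the layering are only a sketch of the
idea, not an existing format; an all-zero OID would mark a deletion):

    # packed-refs (base layer, today's v1 format)
    # pack-refs with: peeled fully-peeled sorted
    <oid-of-main> refs/heads/main
    <oid-of-topic> refs/heads/topic

    # packed-refs.del (hypothetical top layer)
    0000000000000000000000000000000000000000 refs/heads/topic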

You're concerned about the reftable file format (and maybe rightly
so), but if you're changing the file format anyway and you're not
picking reftable, why not create a block-based, indexed format that
can support storing reflog entries at some point in the future too,
rather than build on (the limitations) of packed-refs? Or is
packed-refs v2 backward compatible with v1 (could an old git client
read v2 files? I think not, right?).

The reftable project has gotten into a slump because my work
responsibilities have increased over the last 1.5 years, squeezing down
how much time I have for 'fun' projects. I chatted with John Cai, who
was trying to staff this project out of Gitlab resources. I don't know
where that stands, though.

> The one thing I can say is that if the reftable work had not already begun,
> then this RFC is how I would have approached a new ref format.
>
> I look forward to your feedback!

Hope this helps.
Derrick Stolee Nov. 30, 2022, 3:16 p.m. UTC | #11
On 11/28/2022 1:56 PM, Han-Wen Nienhuys wrote:

Han-Wen,

Thanks for taking the time to reply. I was specifically hoping for your
perspective on the ideas here.

> On Mon, Nov 7, 2022 at 7:36 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> After carefully considering several options, it seemed that there are two
>> solutions that can solve this effectively:
>>
>>  1. Wait for reftable to be integrated into Git.
>>  2. Update the packed-refs backend to have a stacked version.
>>
>> The reftable work seems currently dormant. The format is pretty complicated
>> and I have a difficult time seeing a way forward for it to be fully
>> integrated into Git.
>
> The format is somewhat complicated, and I think it would have been
> possible to design a block-oriented sorted-table approach that is
> simpler, but the JGit implementation has set it in stone.

I agree that if we pursue reftable, that we should use the format as
agreed upon and implemented in JGit. I do want to say that while I admire
JGit's dedication to being compatible with repositories created by Git, I
don't think the reverse is a goal of the Git project.

> But, to put
> this in perspective, the amount of work for getting the format to
> read/write correctly has been completely dwarfed by the effort needed
> to make the refs API in git represent a true abstraction boundary.
> Also, if you're introducing a new format, one might as well try to
> optimize it a bit.

That's another reason why I was able to make an incremental improvement so
quickly in this RFC: I worked within the existing API, reducing the
overall impact of the change. It's easier to evaluate the performance
difference of packed-refs v2 versus packed-refs v1 because the change is
isolated.

That work to make the Git refs API work with the reftable library is
further ahead than I thought (in your draft PR), but it is also completely
missing from the current Git tree, so that work still needs to be arranged
into a reviewable series before it is available to us. That does seem like
a substantial amount of work, but I might have been overestimating how
much work it will be compared to these changes I am advocating for.

> Here are some of the hard problems that I encountered

Thanks for including these.

> * Worktrees and the main repository have a separate view of the ref
> namespace. This is not explicit in the ref backend API, and there is a
> technical limitation that the packed-refs file cannot be in a
> worktree. This means that worktrees will always continue to use
> loose-ref storage if you only extend the packed-refs backend.

If I'm understanding it correctly [1], only the special refs (like HEAD or
REBASE_HEAD) are worktree-specific, and all refs under "refs/*" are
repository-scoped. I don't actually think of those special refs as "loose"
refs and thus they should still work under the "only packed-refs" value
for extensions.refFormat. I should definitely cover this in the
documentation, though. Also, [1] probably needs updating because it calls
HEAD a pseudo ref even though it explicitly is not [2].

[1] https://git-scm.com/docs/git-worktree#_refs
[2] https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpseudorefapseudoref

> * Symrefs are refs too, but for some reason the packed-refs file
> doesn't support them. Does packed-refs v2 support symrefs too?  If you
> want to snapshot the state of refs, do you want to snapshot the value
> of HEAD too?

I forgot that loose refs under .git/refs/ can be symrefs. This definitely
is a limitation that I should mention. Again, pseudorefs like HEAD are not
included and are stored separately, but symrefs within refs/* are not
available in packed-refs (v1 or v2). That should be explicitly called out
in the extensions.refFormat docs.

I imagine that such symrefs are uncommon, and users can make their own
evaluation of whether that use is worth keeping loose refs or not. We can
still have the {files, packed[-v2]} extension value while having a
writing strategy that writes as much as possible into the packed layer.
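
For instance (assuming a 'main' branch and a throwaway symref just for
illustration), pack-refs leaves the symref behind as a loose file:

    $ git symbolic-ref refs/heads/alias refs/heads/main
    $ git pack-refs --all
    $ ls .git/refs/heads/    # only 'alias' remains; the symref was not packed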

> * By not changing reflogs, you are making things simpler. (if a
> transaction updates the branch that HEAD points to, the reflog for
> HEAD has to be updated too. Because reftable updates the reflog
> transactionally, this was some extra work)
> Then again, I feel the current way that reflogs work is a bit messy,
> because directory/file conflicts force reflogs to be deleted at times
> that don't make sense from a user perspective.

I agree that reflogs are messy. I also think that reflogs have different
needs than the ref storage, so separating their needs is valuable.

> * There are a lot of commands that store SHA1s in files under .git/,
> and access them as if they are a ref (for example: rebase-apply/ ,
> CHERRY_PICK_HEAD etc.).

Yes, I think these pseudorefs are stored differently from usual refs, and
hence the {packed[-v2]} extension value would still work, but I'll confirm
this with more testing.

>> In this RFC, I propose a different model that allows for more customization
>> and incremental updates. The extensions.refFormat config key is multi-valued
>> and defaults to the list of files and packed. In the context of this RFC,
>> the intention is to be able to add packed-v2 so the list of all three values
>> would allow Git to write and read either file format version (v1 or v2). In
>> the larger scheme, the extension could allow restricting to only loose refs
>> (just files) or only packed-refs (just packed) or even later when reftable
>> is complete, files and reftable could mean that loose refs are the primary
>> ref storage, but the reftable format serves as a drop-in replacement for the
>> packed-refs file. Not all combinations need to be understood by Git, but
>
> I'm not sure how feasible this is. reftable also holds reflog data. A
> setting {files,reftable} would either not work, or necessitate hairy
> merging of data to get the reflogs working correctly.

In this setup, would it be possible to continue using the "loose reflog"
format while using reftable as the packed layer? I personally think this
combination of formats to be critical to upgrading existing repositories
to reftable.

(Note: there is a strategy that doesn't need this approach, but it's a bit
complicated. It would involve rotating all replicas to new repositories
that are configured to use reftable upon creation, getting the refs from
other replicas via fetches. In my opinion, this is prohibitively
expensive.)

>> Between using raw OIDs and storing the depth-2 prefixes only once, this
>> format compresses the file to ~60% of its v1 size. (The format allows not
>> writing the prefix chunks, and the prefix chunks are implemented after the
>> basics of the ref chunks are complete.)
>>
>> The write times are reduced in a similar fraction to the size difference.
>> Reads are sped up somewhat, and we have the potential to do a ref count by
>
> Do you mean 'enumerate refs' ? Why would you want to count refs by prefix?

Generally, I mean these kind of operations:

* 'git for-each-ref' enumerates all refs within a prefix.

* Serving the ref advertisement enumerates all refs.

* There was a GitHub feature that counted refs and tags, but wanted to
  ignore internal ref prefixes (outside of refs/heads/* or refs/tags/*).
  It turns out that we didn't actually need the full count but an
  existence indicator, but it would be helpful to quickly identify how
  many branches or tags are in a repository at a glance. Packed-refs v1
  requires scanning the whole file while packed-refs v2 does a fixed
  number of binary searches followed by a subtraction of row indexes.
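
The operations above map to commands like these (nothing new here, just
the shapes of the queries we care about):

    $ git for-each-ref --format='%(refname)' refs/heads/   # enumerate one prefix
    $ git for-each-ref refs/tags/ | wc -l                  # count within a prefix
    $ git for-each-ref --count=1 refs/heads/               # cheap emptiness check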

>> I mentioned earlier that I had considered using reftable as a way to achieve
>> the stated goals. With the current state of that work, I'm not confident
>> that it is the right approach here.
>>
>> My main worry is that the reftable is more complicated than we need for a
>> typical Git repository that is based on a typical filesystem. This makes
>> testing the format very critical, and we seem to not be near reaching that
>> approach.
>
> I think the base code of reading and writing the reftable format is
> exercised quite exhaustively in unit tests. You say 'seem', but
> do you have anything concrete to say?

Our test suite is focused on integration tests at the command level. While
unit tests are helpful, I'm not sure if all of the corner cases would be
covered by tests that check Git commands only.

>> As mentioned, the current extension plan [6] only allows reftable or files
>> and does not allow for a mix of both. This RFC introduces the possibility
>> that both could co-exist. Using that multi-valued approach means that I'm
>> able to test the v2 packed-refs file format almost as well as the v1 file
>> format within this RFC. (More tests need to be added that are specific to
>> this format, but I'm waiting for confirmation that this is an acceptable
>> direction.) At the very least, this multi-valued approach could be used as a
>> way to allow using the reftable format as a drop-in replacement for the
>> packed-refs file, as well as upgrading an existing repo to use reftable.
>
> The multi-value approach creates more combinations of how different
> pieces of code can interact, so I think it actually makes things more
> error-prone.

As multiple values are added, it will be important to indicate which
values are not compatible with each other. However, the plan for the
packed-refs improvements does add values that are orthogonal to each other.
It does make testing all combinations more difficult.

Of course, if reftable is truly incompatible with loose refs, then Git can
say that {reftable} is the only set of values that can use reftable, and
make {files, reftable} an incompatible set (which could be understood by
a later version of Git, if those barriers are overcome). However, if we do
not specify the extension as multi-valued from the start, then we cannot
later add this multi-valued option without changing the extension name.
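
To make the multi-valued shape concrete, a repository opting into the
v2 format might carry something like this (value names as proposed in
this RFC; the exact spellings are of course still up for discussion):

    [extensions]
        refFormat = files
        refFormat = packed
        refFormat = packed-v2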

>> That might even help the integration process to allow the reftable format to
>> be tested at least by some subset of tests instead of waiting for a full
>> test suite update.
>
> I don't understand this comment. In the current state,
> https://github.com/git/git/pull/1215 already passes 922 of the 968
> test files if you set GIT_TEST_REFTABLE=1.
>
> See https://github.com/git/git/pull/1215#issuecomment-1329579459 for
> details. As you can see, for most test files, it's just a few
> individual test cases that fail.

My point is that to get those remaining tests passing requires a
significant update to the test suite. I imagined that the complexity of
that update was the blocker to completing the reftable work.

It seems that my estimation of that complexity was overly high compared to
what you appear to be describing.

>> I'm interested to hear from people more involved in the reftable work to see
>> the status of that project and how it matches or differs from my
>> perspective.
>
> Overall, I found the loose/packed ref code hard to understand and
> full of arbitrary limitations (dir/file conflicts, deleting reflogs
> when branches are deleted, locking across loose/packed refs etc.).
> The way reftable stacks are set up (with both reflog and ref data
> including symrefs in the same file) makes it much easier to verify that
> it behaves transactionally.

I believe you that starting with a new data model makes many of these
things easier to reason with.

> For deleting refs quickly, it seems that you only need to support
> $ZEROID in packed-refs and then implement a ref database as a stack of
> packed-ref files? If you're going for minimal effort and minimal
> disruption wouldn't that be the place to start?

I disagree that jumping straight to stacked packed-refs is minimal effort
or minimal disruption.

Creating the stack approach does require changing the semantics of the
packed-refs format to include $ZEROID, which will modify some meanings in
the iteration code. The use of a stack, as well as how layers are combined
during a ref write or during maintenance, adds non-trivial complications
to the locking semantics.

By contrast, the v2 format is isolated to the on-disk format. None of the
writing or reading semantics are changed in terms of which files to look
at or write in which order. Instead, it's relatively simple to see from
the format exactly how it reduces the file size but otherwise has exactly
the same read/write behavior. In fact, since the refs and OIDs are all
located in the same chunk in a similar order to the v1 file, we can even
deduce that page cache semantics will only improve in the new format.

The reason to start with this step is that the benefits and risks are
clearly understood, which can motivate us to establish the mechanism for
changing the ref format by defining the extension.

> You're concerned about the reftable file format (and maybe rightly
> so), but if you're changing the file format anyway and you're not
> picking reftable, why not create a block-based, indexed format that
> can support storing reflog entries at some point in the future too,
> rather than build on (the limitations) of packed-refs?

My personal feeling is that storing ref tips and storing the history of a
ref are sufficiently different problems that should have their own data
structures. Even if they could be combined by a common format, I don't
think it is safe to transition every part of every ref operation to a new
format all at once.

Looking at reftable from the perspective of a hosting provider, I'm very
hesitant to recommend transitioning to it because of how it is an "all or
nothing" switch. It does not fit with my expectations for safe deployment
practices.

Yes, packed-refs have some limitations, but those limitations are known
and we are working within them right now. I'd rather make a change to
write smaller versions of the file with the same semantics as a first
step.

> Or is
> packed-refs v2 backward compatible with v1 (could an old git client
> read v2 files? I think not, right?).

No, it is not backward compatible. That's why the extension is needed.

> The reftable project has gotten into a slump because my work
> responsibilities have increased over the last 1.5 years, squeezing down
> how much time I have for 'fun' projects. I chatted with John Cai, who
> was trying to staff this project out of Gitlab resources. I don't know
> where that stands, though.

I'll have my EM reach out to John to see where that stands and how we
can coordinate in this space.

>> The one thing I can say is that if the reftable work had not already begun,
>> then this RFC is how I would have approached a new ref format.
>>
>> I look forward to your feedback!
>
> Hope this helps.

It does help clarify where the reftable project currently stands as well
as some key limitations of the packed-refs format. You've given me a lot
to think about so I'll do some poking around in your branch (and do some
performance tests) to see what I can make of it.

Let me attempt to summarize my understanding, now that you've added
clarity:

* The reftable work needs its refs backend implemented, but your draft PR
  has a prototype of this and some basic test suite integration. There are
  54 test files that have one or more failing tests, and likely these just
  need to be adjusted to not care about loose references.

* The reftable is currently fundamentally different enough that it could
  not be used as a replacement for the packed-refs file underneath loose
  refs (primarily due to its integration with the reflog). Doing so would
  require significant work on top of your prototype.

* This further indicates that moving to reftable is an "all or nothing"
  transition, and even requires starting a repository from scratch with
  reftable enabled. This is a bit of a blocker for a hosting provider to
  transition to the format, and will likely make it difficult for clients
  to adopt the feature.

* The plan established by this RFC does _not_ block reftable progress, but
  generally we prefer not having competing formats in Git, so it would be
  better to have only one, unless there is enough of a justification to
  have different formats for different use cases.

I'm going to take the following actions on my end to better understand the
situation:

1. I'll take your draft PR branch and do some performance evaluations on
   the speed of ref updates compared to loose refs and my prototype of a
   two-stack packed-ref where the second layer of the stack is only for
   deleted refs.

2. I'll consult with my peers to determine how expensive it would be to
   roll out reftable via a complete replacement of our hosted
   repositories. I'll also try to discover ways to roll out the feature to
   subsets of the fleet to create a safe deployment strategy.

3. My EM and I will reach out to John Cai to learn about plans to push
   reftable over the finish line.

4. I will split out the "skip_hash" part of this RFC into its own series,
   after adding the necessary details to fsck to understand a null
   trailing hash.

Please let me know if I'm missing anything I should be investigating here.

Thanks,
-Stolee
Derrick Stolee Nov. 30, 2022, 3:31 p.m. UTC | #12
On 11/18/2022 6:31 PM, Junio C Hamano wrote:

> I have been and am still offline and haven't examined this proposal
> in detail, but would it be a better longer-term approach to improve
> reftable backend, instead of piling more effort on loose+packed
> filesystem based backend?

If reftable was complete and stable, then I would have carefully examined
it to check that it solves the problems at hand. I interpreted the lack of
progress in the area to be due to significant work required or hard
problems blocking its completion. That appears to be a wrong assumption,
so we are exploring what it will take to get it complete.

I am still wary of it requiring a clean slate and not having any way to
upgrade from an existing repository to one with reftable. I'm going to
reevaluate this to see how expensive it would be to upgrade to reftable
and how we can deploy that change safely.

These upgrade concerns may require us to eventually consider a world where
we can upgrade a repository by replacing the packed-refs file with a
reftable file, then later removing the ability to read or write loose
refs. To do so might benefit from the multi-valued extensions.refFormat
that is proposed in this RFC, even if packed-v2 does not become a
recognized value.

My personal opinion is that if reftable was not already implemented in
JGit and was not already partially contributed to Git, then we would not
choose that format or that "all or nothing" upgrade path. Instead,
incremental improvements on the existing ref formats are easier to
understand and test in parts.

But my opinion is not the most important one. I'll defer to the
community in this. I thought it worthwhile to present an alternative.

Thanks,
-Stolee
Phillip Wood Nov. 30, 2022, 3:38 p.m. UTC | #13
Hi Stolee

On 30/11/2022 15:16, Derrick Stolee wrote:
> On 11/28/2022 1:56 PM, Han-Wen Nienhuys wrote:
>> * Worktrees and the main repository have a separate view of the ref
>> namespace. This is not explicit in the ref backend API, and there is a
>> technical limitation that the packed-refs file cannot be in a
>> worktree. This means that worktrees will always continue to use
>> loose-ref storage if you only extend the packed-refs backend.
> 
> If I'm understanding it correctly [1], only the special refs (like HEAD or
> REBASE_HEAD) are worktree-specific, and all refs under "refs/*" are
> repository-scoped. I don't actually think of those special refs as "loose"
> refs and thus they should still work under the "only packed-refs" value
> for extensions.refFormat. I should definitely cover this in the
> documentation, though. Also, [1] probably needs updating because it calls
> HEAD a pseudo ref even though it explicitly is not [2].
>
 > [1] https://git-scm.com/docs/git-worktree#_refs
 > [2] https://git-scm.com/docs/gitglossary#Documentation/gitglossary.txt-aiddefpseudorefapseudoref

Unfortunately I think it is a little messier than that (see 
refs.c:is_per_worktree_ref()). refs/bisect/*, refs/rewritten/* and 
refs/worktree/* are all worktree specific.
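
A quick way to see the split in a linked worktree is to compare the two
directories git reports (paths here are only illustrative):

    $ git rev-parse --git-dir          # e.g. /repo/.git/worktrees/wt1 (per-worktree refs)
    $ git rev-parse --git-common-dir   # e.g. /repo/.git (shared refs, incl. packed-refs)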

Best Wishes

Phillip
Taylor Blau Nov. 30, 2022, 4:37 p.m. UTC | #14
On Wed, Nov 30, 2022 at 10:16:52AM -0500, Derrick Stolee wrote:
> > Do you mean 'enumerate refs' ? Why would you want to count refs by prefix?
>
> * There was a GitHub feature that counted refs and tags, but wanted to
>   ignore internal ref prefixes (outside of refs/heads/* or refs/tags/*).
>   It turns out that we didn't actually need the full count but an
>   existence indicator, but it would be helpful to quickly identify how
>   many branches or tags are in a repository at a glance. Packed-refs v1
>   requires scanning the whole file while packed-refs v2 does a fixed
>   number of binary searches followed by a subtraction of row indexes.

True. On the surface, it seemed odd to use a function which returns
something like:

    { "refs/heads/*" => NNNN, "refs/tags/*" => MMMM }

only to check whether or not NNNN and MMMM are zero or non-zero.

But there's a little more to the story. That emptiness check does occur
at the beginning of many page loads. But when it responds "non-empty",
we then care about how many branches and tags there actually are.

So calling count_refs() (the name of the internal RPC that powers all of
this) was an optimization written under the assumption that we actually
are going to ask about the exact number of branches/tags very shortly
after querying for emptiness.

It turns out that empirically it's faster to do something like:

    $ git show-ref --[heads|tags] | head -n 1

to check if there are any branches and tags at all[^1], and then a
follow up 'git show-ref --heads | wc -l' to check how many there are.

But it would be nice to do both operations quickly without having to
actually scan all of the entries in each prefix.

Thanks,
Taylor

[^1]: Some may remember my series in
  https://lore.kernel.org/git/cover.1654552560.git.me@ttaylorr.com/
  which replaced '| head -n 1' with a '--count=1' option. This matches
  what GitHub runs in production where piping one command to another
  from Ruby is unfortunately quite complicated.
Han-Wen Nienhuys Nov. 30, 2022, 6:30 p.m. UTC | #15
On Wed, Nov 30, 2022 at 4:16 PM Derrick Stolee <derrickstolee@github.com> wrote:
> > * Symrefs are refs too, but for some reason the packed-refs file
> > doesn't support them. Does packed-refs v2 support symrefs too?  If you
> > want to snapshot the state of refs, do you want to snapshot the value
> > of HEAD too?
>
> I forgot that loose refs under .git/refs/ can be symrefs. This definitely
> is a limitation that I should mention. Again, pseudorefs like HEAD are not
> included and are stored separately, but symrefs within refs/* are not
> available in packed-refs (v1 or v2). That should be explicitly called out
> in the extensions.refFormat docs.
>
> I imagine that such symrefs are uncommon, and users can make their own
> evaluation of whether that use is worth keeping loose refs or not. We can
> still have the {files, packed[-v2]} extension value while having a
> writing strategy that writes as much as possible into the packed layer.

To be honest, I don't understand why symrefs are such a generic
concept; I've only ever seen them used for HEAD.

> > * By not changing reflogs, you are making things simpler. (if a
> > transaction updates the branch that HEAD points to, the reflog for
> > HEAD has to be updated too. Because reftable updates the reflog
> > transactionally, this was some extra work)
> > Then again, I feel the current way that reflogs work is a bit messy,
> > because directory/file conflicts force reflogs to be deleted at times
> > that don't make sense from a user perspective.
>
> I agree that reflogs are messy. I also think that reflogs have different
> needs than the ref storage, so separating their needs is valuable.

If the reflog records the history of the ref database, then ideally,
an update of a ref should be transactional across the ref database and
the reflog. I think you can never make this work unless you tie the
storage of both together.

I can't judge how many hosting providers really care about this. At
Google, we really care, but we keep the ref database and the reflog
in a global Spanner database. Reftable is only used for per-datacenter
serving. (I discovered some bugs in the JGit reflog code when I ported
it to local filesystem repos, because it was never exercised at
Google)

> > * There are a lot of commands that store SHA1s in files under .git/,
> > and access them as if they are a ref (for example: rebase-apply/ ,
> > CHERRY_PICK_HEAD etc.).
>
> Yes, I think these pseudorefs are stored differently from usual refs, and
> hence the {packed[-v2]} extension value would still work, but I'll confirm
> this with more testing.

They will work as long as you keep support for loose refs, because
there is no distinction between "an entry in the ref database" and "any
file randomly written into .git/".

> >> In this RFC, I propose a different model that allows for more customization
> >> and incremental updates. The extensions.refFormat config key is multi-valued
> >> and defaults to the list of files and packed. In the context of this RFC,
> >> the intention is to be able to add packed-v2 so the list of all three values
> >> would allow Git to write and read either file format version (v1 or v2). In
> >> the larger scheme, the extension could allow restricting to only loose refs
> >> (just files) or only packed-refs (just packed) or even later when reftable
> >> is complete, files and reftable could mean that loose refs are the primary
> >> ref storage, but the reftable format serves as a drop-in replacement for the
> >> packed-refs file. Not all combinations need to be understood by Git, but
> >
> > I'm not sure how feasible this is. reftable also holds reflog data. A
> > setting {files,reftable} would either not work, or necessitate hairy
> > merging of data to get the reflogs working correctly.
>
> In this setup, would it be possible to continue using the "loose reflog"
> format while using reftable as the packed layer? I personally think this
> combination of formats to be critical to upgrading existing repositories
> to reftable.

I suppose so? If you only store refs and tags (and don't handle
reflogs, symrefs or use the inverse object mapping) then the reftable
file format is just a highly souped-up version of packed-refs.

> (Note: there is a strategy that doesn't need this approach, but it's a bit
> complicated. It would involve rotating all replicas to new repositories
> that are configured to use reftable upon creation, getting the refs from
> other replicas via fetches. In my opinion, this is prohibitively
> expensive.)

I'm not sure I understand the problem. Any deletion of a ref (that is
in packed-refs) today already requires rewriting the entire
packed-refs file ("all or nothing" operation). Whether you write a
packed-refs or reftable is roughly equally expensive.

Are you looking for a way to upgrade a repo, while concurrent git
processes may write updates into the repository during the update? That
may be hard to pull off, because you probably need to rename more than
one file atomically. If you accept momentarily failed writes, you
could do

* rename refs/ to refs.old/ (loose ref writes will fail now)
* collect loose refs under refs.old/ , put into packed-refs
* populate the reftable/ dir
* set refFormat extension.
* rename refs.old/ to refs/, with refs/heads as a file (as described in
the reftable spec.)

See also https://gerrit.googlesource.com/jgit/+/ca166a0c62af2ea87fdedf2728ac19cb59a12601/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/FileRepository.java#734

> >> I mentioned earlier that I had considered using reftable as a way to achieve
> >> the stated goals. With the current state of that work, I'm not confident
> >> that it is the right approach here.
> >>
> >> My main worry is that the reftable is more complicated than we need for a
> >> typical Git repository that is based on a typical filesystem. This makes
> >> testing the format very critical, and we seem to not be near reaching that
> >> approach.
> >
> > I think the base code of reading and writing the reftable format is
> > exercised quite exhaustively in unit tests. You say 'seem', but
> > do you have anything concrete to say?
>
> Our test suite is focused on integration tests at the command level. While
> unit tests are helpful, I'm not sure if all of the corner cases would be
> covered by tests that check Git commands only.

It's actually easier to test all of the nooks of the format through
unit tests, because you can tweak parameters (e.g. block size) that
aren't normally available on the command line.

> >> That might even help the integration process to allow the reftable format to
> >> be tested at least by some subset of tests instead of waiting for a full
> >> test suite update.
> >
> > I don't understand this comment. In the current state,
> > https://github.com/git/git/pull/1215 already passes 922 of the 968
> > test files if you set GIT_TEST_REFTABLE=1.
> >
> > See https://github.com/git/git/pull/1215#issuecomment-1329579459 for
> > details. As you can see, for most test files, it's just a few
> > individual test cases that fail.
>
> My point is that to get those remaining tests passing requires a
> significant update to the test suite. I imagined that the complexity of
> that update was the blocker to completing the reftable work.
>
> It seems that my estimation of that complexity was overly high compared to
> what you appear to be describing.

To be honest, I'm not quite sure how significant the work is: for
things like worktrees, it wasn't that obvious to me how things should
work in the first place. That makes it hard to make estimates. I
thought there might be a month of full-time work left, but these days
I can barely make a couple of hours of time per week to work on
reftable, if at all.

> > For deleting refs quickly, it seems that you only need to support
> > $ZEROID in packed-refs and then implement a ref database as a stack of
> > packed-ref files? If you're going for minimal effort and minimal
> > disruption wouldn't that be the place to start?
>
> I disagree that jumping straight to stacked packed-refs is minimal effort
> or minimal disruption.
>
> Creating the stack approach does require changing the semantics of the
> packed-refs format to include $ZEROID, which will modify some meanings in
> the iteration code. The use of a stack, as well as how layers are combined
> during a ref write or during maintenance, adds non-trivial complications
> to the locking semantics.
>
> By contrast, the v2 format is isolated to the on-disk format. None of the
> writing or reading semantics are changed in terms of which files to look
> at or write in which order. Instead, it's relatively simple to see from
> the format exactly how it reduces the file size but otherwise has exactly
> the same read/write behavior. In fact, since the refs and OIDs are all
> located in the same chunk in a similar order to the v1 file, we can even
> deduce that page cache semantics will only improve in the new format.
>
> The reason to start with this step is that the benefits and risks are
> clearly understood, which can motivate us to establish the mechanism for
> changing the ref format by defining the extension.

I believe that the v2 format is a safe change with performance
improvements, but it's a backward incompatible format change with only
modest payoff. I also don't understand how it will help you do a stack
of tables,
which you need for your primary goal (ie. transactions/deletions
writing only the delta, rather than rewriting the whole file?).

> > You're concerned about the reftable file format (and maybe rightly
> > so), but if you're changing the file format anyway and you're not
> > picking reftable, why not create a block-based, indexed format that
> > can support storing reflog entries at some point in the future too,
> > rather than build on (the limitations) of packed-refs?
>
> My personal feeling is that storing ref tips and storing the history of a
> ref are sufficiently different problems that should have their own data
> structures. Even if they could be combined by a common format, I don't
> think it is safe to transition every part of every ref operation to a new
> format all at once.
>
> Looking at reftable from the perspective of a hosting provider, I'm very
> hesitant to recommend transitioning to it because of how it is an "all or
> nothing" switch. It does not fit with my expectations for safe deployment
> practices.

You'd have to consult with your SRE team about how to do this best, but
here's my $.02. If you are a hosting provider, I assume you have 3 or
5 copies of each repo in different datacenters for
redundancy/availability. You could have one of the datacenters use the
new format for a while, and see if there are any errors or discrepancies
(both in terms of data consistency and latency metrics).

> * The reftable work needs its refs backend implemented, but your draft PR
>   has a prototype of this and some basic test suite integration. There are
>   54 test files that have one or more failing tests, and likely these just
>   need to be adjusted to not care about loose references.
>
> * The reftable is currently fundamentally different enough that it could
>   not be used as a replacement for the packed-refs file underneath loose
>   refs (primarily due to its integration with the reflog). Doing so would
>   require significant work on top of your prototype.

It could, but I don't see the point.

> I'm going to take the following actions on my end to better understand the
> situation:
>
> 1. I'll take your draft PR branch and do some performance evaluations on
>    the speed of ref updates compared to loose refs and my prototype of a
>    two-stack packed-ref where the second layer of the stack is only for
>    deleted refs.

(tangent) - wouldn't that design perform poorly once the number of
deletions gets large? You'd basically have to rewrite the
deleted-packed-refs file all the time.
Sean Allred Nov. 30, 2022, 6:37 p.m. UTC | #16
Han-Wen Nienhuys <hanwen@google.com> writes:
> To be honest, I don't understand why symrefs are such a generic
> concept; I've only ever seen them used for HEAD.

I've been only lurking in this thread (and loosely following along,
even!) but I do want to call out that I have recently considered perhaps
abusing symrefs to point to normal feature branches. In our workflow, we
have documentation records identified by a numeric ID -- the code
changes corresponding to that documentation (testing instructions, etc.)
use formulaic branch names like `feature/123456`.

It is sometimes beneficial for two or more of these documentation
records to perform their work on the same code branch. There are myriad
reasons for this, some better than others, but I want to avoid getting
mired in whether or not this is a good idea. It does happen and is
sometimes even the best way to do it.

In these scenarios, I've considered having `feature/2` be a symref to
`feature/1` so that both features can always 'know' what to call their
branch for operations like checkout. I've done this on a smaller scale
in the past to great effect.
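
Mechanically that would just be something like (assuming feature/1
already exists; only a sketch of the idea):

    $ git symbolic-ref refs/heads/feature/2 refs/heads/feature/1
    # both names now resolve to, and update, the same underlying branch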

Nothing is set in stone here for us, but I did want to call this out as
a potential real-world use case.

--
Sean Allred
Junio C Hamano Nov. 30, 2022, 10:55 p.m. UTC | #17
Derrick Stolee <derrickstolee@github.com> writes:

> I do want to say that while I admire
> JGit's dedication to being compatible with repositories created by Git, I
> don't think the reverse is a goal of the Git project.

The world works better if cross-pollination happens both ways,
though.

>> * Symrefs are refs too, but for some reason the packed-refs file
>> doesn't support them. Does packed-refs v2 support symrefs too?  If you
>> want to snapshot the state of refs, do you want to snapshot the value
>> of HEAD too?
>
> I forgot that loose refs under .git/refs/ can be symrefs. This definitely
> is a limitation that I should mention. Again, pseudorefs like HEAD are not
> included and are stored separately, but symrefs within refs/* are not
> available in packed-refs (v1 or v2). That should be explicitly called out
> in the extensions.refFormat docs.

I expect that, in a typical individual-contributor repository, there
are at least two symbolic refs, e.g.

    .git/HEAD
    .git/refs/remotes/origin/HEAD

Having to fall back on the loose ref hierarchy only to be able to
store the latter is a bit of a shame---as long as you are revamping
the format, the design should allow us to eventually migrate all
refs to the new format without having to do the "check there, and if
it isn't there then check this other place", which is what the current
loose + packed combination does, I would think.
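
E.g., in a fresh clone of a repository whose default branch is 'main':

    $ git symbolic-ref HEAD
    refs/heads/main
    $ git symbolic-ref refs/remotes/origin/HEAD
    refs/remotes/origin/main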
Derrick Stolee Dec. 1, 2022, 8:18 p.m. UTC | #18
On 11/30/2022 1:30 PM, Han-Wen Nienhuys wrote:
> On Wed, Nov 30, 2022 at 4:16 PM Derrick Stolee <derrickstolee@github.com> wrote:
>> (Note: there is a strategy that doesn't need this approach, but it's a bit
>> complicated. It would involve rotating all replicas to new repositories
>> that are configured to use reftable upon creation, getting the refs from
>> other replicas via fetches. In my opinion, this is prohibitively
>> expensive.)
> 
> I'm not sure I understand the problem. Any deletion of a ref (that is
> in packed-refs) today already requires rewriting the entire
> packed-refs file ("all or nothing" operation). Whether you write a
> packed-refs or reftable is roughly equally expensive.
> 
> Are you looking for a way to upgrade a repo, while concurrent git
> processes may write updates into the repository during the update? That
> may be hard to pull off, because you probably need to rename more than
> one file atomically. If you accept momentarily failed writes, you
> could do
> 
> * rename refs/ to refs.old/ (loose ref writes will fail now)
> * collect loose refs under refs.old/ , put into packed-refs
> * populate the reftable/ dir
> * set refFormat extension.
> * rename refs.old/ to refs/, with refs/heads as a file (as described in
> the reftable spec.)
>
> See also https://gerrit.googlesource.com/jgit/+/ca166a0c62af2ea87fdedf2728ac19cb59a12601/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/FileRepository.java#734

Yes, I would ideally like for the repository to "upgrade" its ref
storage mechanism during routine maintenance in a non-blocking way
while other writes and reads continue as normal.

After discussing it a bit internally, we _could_ avoid the "rotate
the replicas" solution if there was a "git upgrade-ref-format"
command that could switch from one to another, but it would still
involve pulling that replica out of the rotation and then having
it catch up to the other replicas after that is complete. If I'm
reading your draft correctly, that is not currently available in
your work, but we could add it after the fact.

Requiring pulling replicas out of rotation is still a bit heavy-
handed for my liking, but it's much less expensive than moving
all of the Git data.

>> The reason to start with this step is that the benefits and risks are
>> clearly understood, which can motivate us to establish the mechanism for
>> changing the ref format by defining the extension.
> 
> I believe that the v2 format is a safe change with performance
> improvements, but it's a backward incompatible format change with only
> modest payoff. I also don't understand how it will help you do a stack
> of tables,
> which you need for your primary goal (ie. transactions/deletions
> writing only the delta, rather than rewriting the whole file?).

The v2 format doesn't help me on its own, but it has other benefits
in terms of size and speed, as well as the "ref count" functionality.

The important thing is that the definition of extensions.refFormat
that I'm proposing in this RFC establishes a way to make incremental
progress on the ref format, allowing the stacked format to come in
later with less friction.
 
>> * The reftable is currently fundamentally different enough that it could
>>   not be used as a replacement for the packed-refs file underneath loose
>>   refs (primarily due to its integration with the reflog). Doing so would
>>   require significant work on top of your prototype.
> 
> It could, but I don't see the point.

My point is that we can upgrade repositories by replacing packed-refs
with reftable during routine maintenance instead of the heavier
approaches discussed earlier.

* Step 1: replace packed-refs with reftable.
* Step 2: stop writing loose refs, only update reftable (but still read loose refs).
* Step 3: collapse all loose refs into reftable, stop reading or writing loose refs.
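
As a rough sketch of the write policy in step 2 (invented helper
names; this glosses over loose-vs-packed read precedence during the
transition):

    /*
     * Sketch of step 2: every update goes through the packed
     * (reftable) store only, while loose files that have not been
     * migrated yet remain readable.  Names are hypothetical.
     */
    struct object_id;

    enum ref_write_mode {
            WRITE_LOOSE,            /* steps 0-1 */
            WRITE_PACKED_ONLY       /* step 2 and later */
    };

    int write_loose_ref(const char *refname,
                        const struct object_id *oid);
    int packed_store_update(const char *refname,
                            const struct object_id *oid);

    static int update_ref_sketch(enum ref_write_mode mode,
                                 const char *refname,
                                 const struct object_id *oid)
    {
            if (mode == WRITE_PACKED_ONLY)
                    return packed_store_update(refname, oid);
            return write_loose_ref(refname, oid);
    }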
 
>> I'm going to take the following actions on my end to better understand the
>> situation:
>>
>> 1. I'll take your draft PR branch and do some performance evaluations on
>>    the speed of ref updates compared to loose refs and my prototype of a
>>    two-stack packed-ref where the second layer of the stack is only for
>>    deleted refs.
> 
> (tangent) - wouldn't that design perform poorly once the number of
> deletions gets large? You'd basically have to rewrite the
> deleted-packed-refs file all the time.
 
We have regular maintenance, triggered by pushes, that rewrites the
packed-refs file frequently anyway. That maintenance is currently
dominated by the time spent repacking object data, so a large number
of ref updates can come in while it runs. (That maintenance step
would collapse the deleted-refs layer into the base layer.)

I've tested a simple version of this stack, and rewriting the
deletions layer with 1,000 entries is still within 2x the cost of
updating a loose ref, so it solves the immediate problem using a much
simpler stack model, at least in the most common case where ref
deletions are less frequent than other updates. Even if the layer
grew beyond that 2x cost, the deletions file is still going to be
much smaller than the base packed-refs file, which today is rewritten
for every deletion, so it is still an improvement.
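
To make the shape of that simple stack concrete, a read in the
two-layer model would look roughly like this (invented helper names;
the real series may structure this differently):

    /*
     * Sketch of the two-layer lookup: the small top layer holds only
     * tombstones (null OIDs) for deleted refs and is the only file
     * rewritten on a deletion; the large base layer is rewritten
     * only by maintenance, which folds the tombstones back in.
     * Loose refs are still checked before this, as today.
     */
    struct object_id;

    int deletion_layer_has(const char *refname);  /* tombstone? */
    int base_layer_read(const char *refname, struct object_id *oid);

    static int stacked_packed_read(const char *refname,
                                   struct object_id *oid)
    {
            if (deletion_layer_has(refname))
                    return -1;  /* deleted; hide the base-layer entry */
            return base_layer_read(refname, oid);
    }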

The more complicated stack model would be required to funnel all ref
updates into that structure and away from loose refs.

Thanks,
-Stolee
Han-Wen Nienhuys Dec. 2, 2022, 4:46 p.m. UTC | #19
On Thu, Dec 1, 2022 at 9:19 PM Derrick Stolee <derrickstolee@github.com> wrote:
> >> The reason to start with this step is that the benefits and risks are
> >> clearly understood, which can motivate us to establish the mechanism for
> >> changing the ref format by defining the extension.
> >
> > I believe that the v2 format is a safe change with performance
> > improvements, but it's a backward incompatible format change with only
> > modest payoff. I also don't understand how it will help you do a stack
> > of tables,
> > which you need for your primary goal (ie. transactions/deletions
> > writing only the delta, rather than rewriting the whole file?).
>
> The v2 format doesn't help me on its own, but it has other benefits
> in terms of size and speed, as well as the "ref count" functionality.
>
> The important thing is that the definition of extensions.refFormat
> that I'm proposing in this RFC establishes a way to make incremental
> progress on the ref format, allowing the stacked format to come in
> later with less friction.

I guess you want to move the read/write stack under the loose storage
(the packed backend), and introduce a transitional (read loose/packed
+ write packed only) mode?

Before you embark on this incremental route, I think it would be best
to think through carefully how an online upgrade would work in detail
(I think it's currently not specified?). If it ultimately turns out
not to be feasible to do incrementally, then the added complexity of
the incremental approach will have been for naught.

The incremental mode would only be of interest to hosting providers,
and it would only be used transitionally. It is inherently going to
be complex, because it has to consider both storage modes at the same
time, and because it is transitional, it will get less real-life
testing. At the same time, the ref database is comparatively small,
so the availability blip caused by converting the storage offline is
going to be small. So the incremental approach is rather expensive
for a comparatively small benefit.

I also thought a bit about how you could make the transition
seamless, but I can't see a good way: you have to coordinate between
tables.list (the list of active reftables, or whatever file signals
the presence of a stack) and the files under refs/heads/. I don't
know how to do transactions across multiple files without cooperative
locking.

If you assume you can use filesystem locks, then you could do
something simpler: if a git repository is marked 'transitional', git
processes take an FS read lock on .git/. The process that converts
the storage can take an exclusive (write) lock on .git/, so it knows
nobody will interfere. I think this only works if the repo is on local
disk rather than NFS, though.
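
Roughly, with flock(2), purely as a sketch of the idea (git does not
do this today):

    #include <sys/file.h>
    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Ordinary git processes would take a shared lock on the .git
     * directory for the duration of a ref operation; the one-shot
     * converter takes an exclusive lock, so it knows no other
     * process is mid-operation.  Advisory locks only, so this
     * assumes local disk, not NFS.
     */
    static int lock_gitdir(const char *gitdir, int exclusive)
    {
            int fd = open(gitdir, O_RDONLY);

            if (fd < 0)
                    return -1;
            if (flock(fd, exclusive ? LOCK_EX : LOCK_SH) < 0) {
                    close(fd);
                    return -1;
            }
            return fd;  /* closing the descriptor releases the lock */
    }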

> * Step 1: replace packed-refs with reftable.
> * Step 2: stop writing loose refs, only update reftable (but still read loose refs).

Does that work? A long-running process might not notice the switch in
step 2, so it could still write a ref as a loose file, while another
racing process writes a different value to the same ref through
reftable.

PS. I'll be away from work until Jan 9th.
Ævar Arnfjörð Bjarmason Dec. 2, 2022, 6:24 p.m. UTC | #20
On Fri, Dec 02 2022, Han-Wen Nienhuys wrote:

> On Thu, Dec 1, 2022 at 9:19 PM Derrick Stolee <derrickstolee@github.com> wrote:
>> >> The reason to start with this step is that the benefits and risks are
>> >> clearly understood, which can motivate us to establish the mechanism for
>> >> changing the ref format by defining the extension.
>> >
>> > I believe that the v2 format is a safe change with performance
>> > improvements, but it's a backward incompatible format change with only
>> > modest payoff. I also don't understand how it will help you do a stack
>> > of tables,
>> > which you need for your primary goal (ie. transactions/deletions
>> > writing only the delta, rather than rewriting the whole file?).
>>
>> The v2 format doesn't help me on its own, but it has other benefits
>> in terms of size and speed, as well as the "ref count" functionality.
>>
>> The important thing is that the definition of extensions.refFormat
>> that I'm proposing in this RFC establishes a way to make incremental
>> progress on the ref format, allowing the stacked format to come in
>> later with less friction.
>
> I guess you want to move the read/write stack under the loose storage
> (packed backend), and introduce (read loose/packed + write packed
> only) mode that is transitional?
>
> Before you embark on this incremental route, I think it would be best
> to think through carefully how an online upgrade would work in detail
> (I think it's currently not specified?) If ultimately it's not
> feasible to do incrementally, then the added complexity of the
> incremental approach will be for naught.
>
> The incremental mode would only be of interest to hosting providers.
> It will only be used transitionally. It is inherently going to be
> complex, because it has to consider both storage modes at the same
> time, and because it is transitional, it will get less real life
> testing. At the same time, the ref database is comparatively small, so
> the availability blip that converting the storage offline will impair
> is going to be small. So, the incremental approach is rather expensive
> for a comparatively small benefit.
>
> I also thought a bit about how you could make the transition seamless,
> but I can't see a good way: you have to coordinate between tables.list
> (the list of reftables active or whatever file signals the presence of
> a stack) and files under refs/heads/. I don't know how to do
> transactions across multiple files without cooperative locking.

A multi-backend transaction would be hard to do at the best of times,
but we'd also presumably run into the issue that not all ref operations
currently use the transaction mechanism (e.g. branch copying/moving). So
if one or the other fails, all bets are off as far as getting back
to a consistent state is concerned.

Perhaps a more doable & interesting approach would be to have a "slave"
backend that would follow along, i.e. we'd replay all operations from
"master" to "slave" (as with DB replication, just within a single
repository).

We might get out of sync, but as the "master" is always the source of
truth, presumably we could run some one-off re-export of the refspace
to get back up-to-date, and hopefully not get out of sync again.

Then once we're ready, we could flip the switch indicating what becomes
the canonical backend.
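
Something like the sketch below, only as an illustration (the backend
struct and the function names are invented, not the existing ref
store API):

    /*
     * Sketch of the "follow along" replication: every update is
     * applied to the canonical backend first, then replayed to the
     * follower on a best-effort basis; a periodic full re-export of
     * the refs repairs any drift before the switch is flipped.
     */
    struct object_id;

    struct ref_backend {
            int (*update)(const char *refname,
                          const struct object_id *oid);
    };

    static int replicated_update(struct ref_backend *canonical,
                                 struct ref_backend *follower,
                                 const char *refname,
                                 const struct object_id *oid)
    {
            int ret = canonical->update(refname, oid);

            if (ret)
                    return ret;  /* canonical is the source of truth */

            /* best effort; drift is repaired by the next re-export */
            (void) follower->update(refname, oid);
            return 0;
    }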

For reftable the FS layout under .git/* is incompatible, so we'd also
need to support writing to some alternate directory to make such a thing
work...