mbox series

[0/9] midx: implement a multi-pack reverse index

Message ID cover.1612998106.git.me@ttaylorr.com (mailing list archive)
Headers show
Series midx: implement a multi-pack reverse index | expand

Message

Taylor Blau Feb. 10, 2021, 11:02 p.m. UTC
This series describes and implements a reverse index for the multi-pack index,
based on a "pseudo-pack" which can be uniquely described by the multi-pack
index.

The details of the pseudo-pack, and multi-pack reverse index are laid out in
detail in the sixth patch.

This is in support of multi-pack reachability bitmaps, which contain objects
from the multi-pack index. Likewise, an object's bit position in a multi-pack
reachability bitmap is determined by its position with that multi-pack index's
pseudo pack.

In this series, there are no users of the multi-pack index, so this series is
mainly about laying the groundwork for implementing multi-pack bitmaps. This
series is the final prerequisite needed before we can implement multi-pack
bitmaps, which will come in the next series[1].

Since tb/pack-revindex-on-disk is queued to be merged to 'master', but hasn't
yet been merged, this series is based on that branch.

Thanks in advance for your review of this series, and all of the many other
series in support of multi-pack bitmaps.

[1]: If you're curious, you can find the patches in the tb/multi-pack-bitmaps
branch of my fork at https://github.com/ttaylorr/git.

Taylor Blau (9):
  t/helper/test-read-midx.c: add '--show-objects'
  midx: allow marking a pack as preferred
  midx: don't free midx_name early
  midx: keep track of the checksum
  midx: make some functions non-static
  Documentation/technical: describe multi-pack reverse indexes
  pack-revindex: read multi-pack reverse indexes
  pack-write.c: extract 'write_rev_file_order'
  pack-revindex: write multi-pack reverse indexes

 Documentation/git-multi-pack-index.txt       |  11 +-
 Documentation/technical/multi-pack-index.txt |   5 +-
 Documentation/technical/pack-format.txt      |  83 +++++++
 builtin/multi-pack-index.c                   |  10 +-
 builtin/repack.c                             |   2 +-
 midx.c                                       | 239 ++++++++++++++++++-
 midx.h                                       |  11 +-
 pack-revindex.c                              | 112 +++++++++
 pack-revindex.h                              |  46 ++++
 pack-write.c                                 |  39 ++-
 pack.h                                       |   1 +
 packfile.c                                   |   3 +
 t/helper/test-read-midx.c                    |  24 +-
 t/t5319-multi-pack-index.sh                  |  39 +++
 14 files changed, 591 insertions(+), 34 deletions(-)

Comments

Derrick Stolee Feb. 11, 2021, 2:58 a.m. UTC | #1
On 2/10/21 6:02 PM, Taylor Blau wrote:
> This series describes and implements a reverse index for the multi-pack index,
> based on a "pseudo-pack" which can be uniquely described by the multi-pack
> index.
> 
> The details of the pseudo-pack, and multi-pack reverse index are laid out in
> detail in the sixth patch.
> 
> This is in support of multi-pack reachability bitmaps, which contain objects
> from the multi-pack index. Likewise, an object's bit position in a multi-pack
> reachability bitmap is determined by its position with that multi-pack index's
> pseudo pack.

This has been a lot of work, but I'm impressed with the progress here.

This series is good prep, and my comments are very minor.

Since the need for these multi-pack-index-<hash>.rev files doesn't show
up until the reachability bitmaps can be paired with MIDX files, it
would make sense to hold this series in 'next' until that one also
stabilizes and they merge to 'master' together.

Thanks,
-Stolee
Taylor Blau Feb. 11, 2021, 3:06 a.m. UTC | #2
On Wed, Feb 10, 2021 at 09:58:18PM -0500, Derrick Stolee wrote:
> This series is good prep, and my comments are very minor.

Thanks for your review, especially on a series like this one that adds
lots of code without any users :).

> Since the need for these multi-pack-index-<hash>.rev files doesn't show
> up until the reachability bitmaps can be paired with MIDX files, it
> would make sense to hold this series in 'next' until that one also
> stabilizes and they merge to 'master' together.

That's fine with me. It would be OK to merge this down to 'master', too,
since this is all dead code. In fact, that may be easier to work with,
since the next topic can be based directly off 'master' instead of
having to keep this branch around forever.

Either is fine, though.

Thanks,
Taylor
Junio C Hamano Feb. 11, 2021, 8:13 a.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> Since tb/pack-revindex-on-disk is queued to be merged to 'master', but hasn't
> yet been merged, this series is based on that branch.

This seems to have a light conflict with Derrick's chunked file
format work in midx.c where pack_info is renamed and extended so the
new pack_order variable now needs to become a member in it.

I think I resolved it OK, but without any callers that actually
utilize the new code or tests, it is almost impossible to have any
confidence in the result of the conflict resolution X-<.

Could you two please look over to see if I made any silly mistakes,
when I pushe it out later?

Thanks.
Derrick Stolee Feb. 11, 2021, 6:37 p.m. UTC | #4
On 2/11/2021 3:13 AM, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
> 
>> Since tb/pack-revindex-on-disk is queued to be merged to 'master', but hasn't
>> yet been merged, this series is based on that branch.
> 
> This seems to have a light conflict with Derrick's chunked file
> format work in midx.c where pack_info is renamed and extended so the
> new pack_order variable now needs to become a member in it.
> 
> I think I resolved it OK, but without any callers that actually
> utilize the new code or tests, it is almost impossible to have any
> confidence in the result of the conflict resolution X-<.
> 
> Could you two please look over to see if I made any silly mistakes,
> when I pushe it out later?

I reproduced your merge and got very similar results. The differences
that happened between my result and yours are not meaningful in any
way.

Definitely a very subtle merge, so thank you for doing that!

-Stolee
Junio C Hamano Feb. 11, 2021, 6:55 p.m. UTC | #5
Derrick Stolee <stolee@gmail.com> writes:

>> Could you two please look over to see if I made any silly mistakes,
>> when I pushe it out later?
>
> I reproduced your merge and got very similar results. The differences
> that happened between my result and yours are not meaningful in any
> way.
>
> Definitely a very subtle merge, so thank you for doing that!

Resolving this kind of conflict is like a quick quiz in the
classroom to check my understanding of what is happening in all the
topics involved.  It is like solving a puzzle, and I do not dislike
doing it (otherwise I won't be working as a maintainer), but having
a second set of eyes, helping with independent verification, is
always appreciated.

Thanks.