[0/13] combining object filters and bitmaps
mbox series

Message ID 20200213021506.GA1124607@coredump.intra.peff.net
Headers show
  • combining object filters and bitmaps
Related show


Jeff King Feb. 13, 2020, 2:15 a.m. UTC
Partial clones are nice for saving bandwidth, but they can actually be
more expensive in CPU because many of the filters require the server to
do extra work to figure out which objects to send.

In particular, it's impossible for many filters to work with
reachability bitmaps, because the bitmap result doesn't provide enough
information. E.g., we can't evaluate a path-based sparse filter without
actually walking the trees to find out which objects are referred to at
which paths.

But in some cases the bitmaps _do_ give enough information, and it's
just a weakness of our implementation. E.g., it's easy to remove all of
the blobs from a bitmap result. So this patch series implements some of
the easier filters, while providing a framework for the bitmap code to
say "nope, that filter is too complex for me" so we can fallback to a
non-bitmap path.

One thing I'm not wild about is that this basically creates a parallel
universe of filter implementations, as nothing is shared with the
code in list-objects-filter.c (beyond the option parsing stage).
But if you look at the implementations in patches 11 and 12, I think
that's how it has to be in order to be efficient. The existing filters
are inherently about hooking into the traversal, and the whole point of
reachability bitmaps is to avoid that traversal.

So I think the best we can do is carefully write bitmap-aware versions
of specific filters, and then fall back to a regular traversal as
necessary. I'd like to also give server providers some tools for
limiting which filters they're willing to support (so they can protect
themselves from people running expensive filters), but I've left that as
future work for now.

Here's an outline of the series:

  [01/13]: pack-bitmap: factor out type iterator initialization
  [02/13]: pack-bitmap: fix leak of haves/wants object lists

    Some related fixes and cleanups in the bitmap area.

  [03/13]: rev-list: fallback to non-bitmap traversal when filtering
  [04/13]: rev-list: consolidate bitmap-disabling options
  [05/13]: rev-list: factor out bitmap-optimized routines

    It's much easier to exercise this code via rev-list, so this is some
    refactoring around the bitmap support there.

  [06/13]: rev-list: make --count work with --objects
  [07/13]: rev-list: allow bitmaps when counting objects

    These two aren't strictly necessary for the topic at hand, but have
    been bugging me for years. And they were made simple by the earlier
    refactoring, and provide a handy test mechanism.

  [08/13]: pack-bitmap: basic noop bitmap filter infrastructure
  [09/13]: rev-list: use bitmap filters for traversal
  [10/13]: bitmap: add bitmap_unset() function

    More refactoring in preparation for supporting filters.

  [11/13]: pack-bitmap: implement BLOB_NONE filtering
  [12/13]: pack-bitmap: implement BLOB_LIMIT filtering

    Some actual filters (tested with rev-list support added earlier).

  [13/13]: pack-objects: support filters with bitmaps

    And finally turning on support for pack generation. The final
    numeric result is (using linux.git with blob:none):

      Test                              HEAD^               HEAD
      5310.9: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%

    which isn't too shabby.

 builtin/pack-objects.c             |   3 +-
 builtin/rev-list.c                 | 124 +++++++++++---
 ewah/bitmap.c                      |   8 +
 ewah/ewok.h                        |   1 +
 object.c                           |   9 ++
 object.h                           |   2 +
 pack-bitmap.c                      | 252 +++++++++++++++++++++++++----
 pack-bitmap.h                      |   4 +-
 reachable.c                        |   2 +-
 t/perf/p5310-pack-bitmaps.sh       |  14 ++
 t/t5310-pack-bitmaps.sh            |  20 +++
 t/t6000-rev-list-misc.sh           |  12 ++
 t/t6113-rev-list-bitmap-filters.sh |  77 +++++++++
 13 files changed, 468 insertions(+), 60 deletions(-)
 create mode 100755 t/t6113-rev-list-bitmap-filters.sh