[00/12] more miscellaneous Bloom filter improvements, redux

Message ID: cover.1599664389.git.me@ttaylorr.com

Taylor Blau, Sept. 9, 2020, 3:22 p.m. UTC

Here is a rejiggered version of my series in [1], which accomplishes the
same without changing any of the on-disk commit-graph format.

As a reminder, the main goal of this series is to introduce a
'--max-new-filters' flag to 'git commit-graph write' to place a limit on
the number of new Bloom filters a writer is willing to compute from
scratch. The main difficulty is disambiguating between empty/too-large
filters and ones that haven't been computed yet. See "bloom: encode
out-of-bounds filters as non-empty" for the details.
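
To make the ambiguity above concrete, here is a minimal sketch in C. It is
not code from this series. The names 'stored_filter', 'filter_state',
'classify_filter()' and 'count_new_filters()' are hypothetical, and the
encoding (zero length for "not computed", a single 0x00 byte for "empty", a
single 0xff byte for "too large") is an assumption made for illustration;
the patch named above has the actual details.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for one commit's changed-path filter. */
enum filter_state {
	FILTER_NOT_COMPUTED, /* nothing stored yet */
	FILTER_EMPTY,        /* computed: the commit changed no paths */
	FILTER_TOO_LARGE,    /* computed: more changes than the limit */
	FILTER_NORMAL        /* computed: ordinary filter data */
};

/* Hypothetical view of the filter bytes stored for one commit. */
struct stored_filter {
	const unsigned char *data;
	size_t len;
};

/*
 * Assumed encoding (illustration only): zero length means "not computed",
 * so the empty and too-large cases must each occupy at least one byte.
 */
enum filter_state classify_filter(const struct stored_filter *f)
{
	assert(f != NULL);
	if (f->len == 0)
		return FILTER_NOT_COMPUTED;
	if (f->len == 1 && f->data[0] == 0x00)
		return FILTER_EMPTY;
	if (f->len == 1 && f->data[0] == 0xff)
		return FILTER_TOO_LARGE;
	return FILTER_NORMAL;
}

/*
 * Count how many filters a writer would compute from scratch.
 * A negative max_new means "no limit".
 */
size_t count_new_filters(const struct stored_filter *filters, size_t nr,
			 int max_new)
{
	size_t computed = 0;

	for (size_t i = 0; i < nr; i++) {
		if (classify_filter(&filters[i]) != FILTER_NOT_COMPUTED)
			continue; /* already stored: reuse it */
		if (max_new >= 0 && computed >= (size_t)max_new)
			continue; /* budget spent: leave it for a later write */
		computed++;
	}
	return computed;
}
```

If empty and too-large filters were also stored with zero length, the first
check in the loop could not tell them apart from filters that were never
computed, and each write would spend part of its budget recomputing them.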

The series is organized as follows:

  * Patches 1-4 are uninteresting preparatory steps.
  * Patch 5 introduces the 'commitGraph.readChangedPaths' configuration.
  * Patches 6-8 are more preparation.
  * Patch 9 is from Stolee and fixes a bug where computing Bloom filters
    from scratch wouldn't stop at the limit of 512.
  * Patches 10-12 prepare for and then introduce '--max-new-filters'.

The first nine patches are basically unchanged from [1] where they were
thoroughly reviewed. The tenth patch is new, and the final two patches
are only touched up and simplified to work with this new approach, but
they have otherwise been reviewed.

Since the old thread was getting long, and this is a substantially new
approach, I'm sending this as "v1" of a new series, which hopefully
nobody minds.

[1]: https://lore.kernel.org/git/cover.1596480582.git.me@ttaylorr.com/

Derrick Stolee (1):
  bloom/diff: properly short-circuit on max_changes

Taylor Blau (11):
  commit-graph: introduce 'get_bloom_filter_settings()'
  t4216: use an '&&'-chain
  commit-graph: pass a 'struct repository *' in more places
  t/helper/test-read-graph.c: prepare repo settings
  commit-graph: respect 'commitGraph.readChangedPaths'
  commit-graph.c: store maximum changed paths
  bloom: split 'get_bloom_filter()' in two
  bloom: use provided 'struct bloom_filter_settings'
  bloom: encode out-of-bounds filters as non-empty
  commit-graph: rename 'split_commit_graph_opts'
  builtin/commit-graph.c: introduce '--max-new-filters=<n>'

 Documentation/config.txt                      |   2 +
 Documentation/config/commitgraph.txt          |   8 +
 Documentation/git-commit-graph.txt            |   6 +
 .../technical/commit-graph-format.txt         |   2 +-
 blame.c                                       |   8 +-
 bloom.c                                       |  53 +++--
 bloom.h                                       |  29 ++-
 builtin/commit-graph.c                        |  61 ++++--
 commit-graph.c                                | 148 ++++++++++----
 commit-graph.h                                |  17 +-
 diff.h                                        |   2 -
 fuzz-commit-graph.c                           |   5 +-
 line-log.c                                    |   2 +-
 repo-settings.c                               |   3 +
 repository.h                                  |   1 +
 revision.c                                    |   7 +-
 t/helper/test-bloom.c                         |   4 +-
 t/helper/test-read-graph.c                    |   3 +-
 t/t0095-bloom.sh                              |   4 +-
 t/t4216-log-bloom.sh                          | 181 ++++++++++++++++--
 t/t5324-split-commit-graph.sh                 |  13 ++
 tree-diff.c                                   |   5 +-
 22 files changed, 442 insertions(+), 122 deletions(-)
 create mode 100644 Documentation/config/commitgraph.txt

--
2.28.0.462.g4ff11cec37