[v3,00/13] more miscellaneous Bloom filter improvements, redux

Message ID cover.1600397826.git.me@ttaylorr.com (mailing list archive)

Message

Taylor Blau Sept. 18, 2020, 2:58 a.m. UTC
Hi again,

Here are a few more changes to the "Bloom filter improvements" topic,
sent as one brand-new re-roll in order to simplify queuing. It
incorporates:

  - Junio's changes from applying to 'seen' (namely, dropping references
    to "too-small" commits in favor of the much clearer "empty"
    commits).

  - On top, I applied Gábor's data gathered in [1] to 10/13 (and added a
    little more detail on the absolute and relative size differences of
    the resulting commit-graph files built before/after that patch).

Thanks to everyone who has helped out along the way with this series
(Stolee, Gábor, Junio, Jakub, and I am sure that I am forgetting some...).

Sorry to have given anyone the impression that I was abandoning this
topic; I'm definitely not ;-).

[1]: https://lore.kernel.org/git/20200917221302.GC23146@szeder.dev/

Derrick Stolee (1):
  bloom/diff: properly short-circuit on max_changes

Taylor Blau (12):
  commit-graph: introduce 'get_bloom_filter_settings()'
  t4216: use an '&&'-chain
  commit-graph: pass a 'struct repository *' in more places
  t/helper/test-read-graph.c: prepare repo settings
  commit-graph: respect 'commitGraph.readChangedPaths'
  commit-graph.c: store maximum changed paths
  bloom: split 'get_bloom_filter()' in two
  bloom: use provided 'struct bloom_filter_settings'
  bloom: encode out-of-bounds filters as non-empty
  commit-graph: rename 'split_commit_graph_opts'
  builtin/commit-graph.c: introduce '--max-new-filters=<n>'
  commit-graph: introduce 'commitGraph.maxNewFilters'

 Documentation/config.txt                      |   2 +
 Documentation/config/commitgraph.txt          |   8 +
 Documentation/git-commit-graph.txt            |   6 +
 .../technical/commit-graph-format.txt         |   2 +-
 blame.c                                       |   8 +-
 bloom.c                                       |  59 +++--
 bloom.h                                       |  29 ++-
 builtin/commit-graph.c                        |  63 ++++-
 commit-graph.c                                | 141 +++++++---
 commit-graph.h                                |  17 +-
 diff.h                                        |   2 -
 fuzz-commit-graph.c                           |   5 +-
 line-log.c                                    |   2 +-
 repo-settings.c                               |   3 +
 repository.h                                  |   1 +
 revision.c                                    |   7 +-
 t/helper/test-bloom.c                         |   4 +-
 t/helper/test-read-graph.c                    |   3 +-
 t/t0095-bloom.sh                              |   8 +-
 t/t4216-log-bloom.sh                          | 242 ++++++++++++++++--
 t/t5324-split-commit-graph.sh                 |  13 +
 tree-diff.c                                   |   5 +-
 22 files changed, 507 insertions(+), 123 deletions(-)
 create mode 100644 Documentation/config/commitgraph.txt

--
2.28.0.510.g375ecf1f36

Comments

Taylor Blau Sept. 18, 2020, 1:31 p.m. UTC | #1
Junio,

Two replacements for this version, if it ends up being the one you
queue. Gábor suggested some helpful changes on 12/13, which in turn
cause a conflict when applying 13/13.

When queueing, please take:

  - The patch in [1] instead of v3's original 12/13, and
  - The patch in [2] as a suggested resolution when applying v3's
    original 13/13 on top.

[1]: https://lore.kernel.org/git/20200918132727.GB1600256@nand.local/
[2]: https://lore.kernel.org/git/20200918132937.GA1601745@nand.local/

Thanks.
Taylor Blau Sept. 18, 2020, 1:34 p.m. UTC | #2
On Fri, Sep 18, 2020 at 09:31:40AM -0400, Taylor Blau wrote:
> Junio,
>
> Two replacements for this version, if it ends up being the one you
> queue. Gábor suggested some helpful changes on 12/13, which in turn
> cause a conflict when applying 13/13.

I should mention that the changes only alter the new documentation
introduced by this series. Here's a range-diff:

12:  4549f0f747 ! 12:  1c3f6b5c96 builtin/commit-graph.c: introduce '--max-new-filters=<n>'
    @@ Documentation/git-commit-graph.txt: this option is given, future commit-graph wr
      +
     +With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
     +filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
    -+enforced. Commits whose filters are not calculated are stored as a
    -+length zero Bloom filter.
    ++enforced. Only commits present in the new layer count against this
    ++limit. To retroactively compute Bloom filters over earlier layers, it is
    ++advised to use `--split=replace`.
     ++
      With the `--split[=<strategy>]` option, write the commit-graph as a
      chain of multiple commit-graph files stored in
13:  375ecf1f36 ! 13:  a7330ee850 commit-graph: introduce 'commitGraph.maxNewFilters'
    @@ Documentation/config/commitgraph.txt
      	commit-graph file (if it exists, and they are present). Defaults to

      ## Documentation/git-commit-graph.txt ##
    -@@ Documentation/git-commit-graph.txt: data.
    - With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
    +@@ Documentation/git-commit-graph.txt: With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
      filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
    - enforced. Commits whose filters are not calculated are stored as a
    --length zero Bloom filter.
    -+length zero Bloom filter. Overrides the `commitGraph.maxNewFilters`
    -+configuration.
    + enforced. Only commits present in the new layer count against this
    + limit. To retroactively compute Bloom filters over earlier layers, it is
    +-advised to use `--split=replace`.
    ++advised to use `--split=replace`. Overrides the
    ++`commitGraph.maxNewFilters` configuration.
      +
      With the `--split[=<strategy>]` option, write the commit-graph as a
      chain of multiple commit-graph files stored in
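
For illustration only, here is a minimal sketch of how the option and the
configuration documented above might be used together. It assumes a Git
build that includes this series; the throwaway repository, the five dummy
commits, and the limit of 2 are made up for the example and are not taken
from the patches.

```shell
#!/bin/sh
# Assumption: a hypothetical scratch repository, created only for this demo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
for i in 1 2 3 4 5
do
	echo "$i" >"file$i"
	git add "file$i"
	git -c user.name=demo -c user.email=demo@example.com \
		commit -q -m "commit $i"
done

# Generate at most 2 new changed-path Bloom filters during this write.
git commit-graph write --reachable --changed-paths --max-new-filters=2

# Set a default limit in the configuration ...
git config commitGraph.maxNewFilters 2

# ... which the command-line option overrides (-1 means "no limit").
git commit-graph write --reachable --changed-paths --max-new-filters=-1

# Only commits in the new layer count against the limit, so use
# --split=replace to compute filters retroactively over earlier layers.
git commit-graph write --reachable --changed-paths \
	--split=replace --max-new-filters=-1
```

Running with a small limit repeatedly is one way to spread the cost of
computing filters over several writes, since each invocation only pays
for up to `n` new filters.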

Thanks,
Taylor