[00/20] guard object lookups against 32-bit overflow

Message ID cover.1689205042.git.me@ttaylorr.com (mailing list archive)

Taylor Blau July 12, 2023, 11:37 p.m. UTC
This is a series I wrote over the last week or so to address a handful
of spots where we may overflow a computation, and produce incorrect
results.

Most of these share a common theme, which is that many of our
data structures (packfiles, bitmaps, commit-graph, MIDX) tolerate
up to 2^32-1 objects. But when we, say, multiply the number of objects
by some constant or other value which isn't a 64-bit unsigned, the
multiplication is carried out in 32 bits and we'll overflow.

Oftentimes the overflows occur when looking up an object / bit
position / commit / etc. which is at position (2^32-1)/20 or greater,
causing us to exceed the maximum 32-bit unsigned value when we multiply
by 20 (or the_hash_algo->rawsz, if stored as an "unsigned int").
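
To make the arithmetic concrete, here is a minimal standalone sketch (not
code from this series). The function names and the HASH_RAWSZ constant are
hypothetical; 20 is the raw size of a SHA-1 hash. It assumes a platform
where "unsigned int" is 32 bits wide, so the uint32_t operands are not
promoted before the multiplication:

```c
#include <stdint.h>

/* Assumption for illustration: raw size of a SHA-1 object ID in bytes. */
#define HASH_RAWSZ 20u

/*
 * Both operands are 32-bit unsigned, so the product is computed
 * modulo 2^32 and only then widened to 64 bits.
 */
static uint64_t oid_offset_unchecked(uint32_t pos)
{
	uint32_t rawsz = HASH_RAWSZ;
	return pos * rawsz;
}

/* Widening one operand first makes the multiplication 64-bit. */
static uint64_t oid_offset_widened(uint32_t pos)
{
	return (uint64_t)pos * HASH_RAWSZ;
}
```

Position 214748364 is the last one whose byte offset still fits in 32 bits
(214748364 * 20 = 4294967280). At position 214748365 the true offset is
4294967300, but the unchecked version wraps around to 4.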

There are a handful of instances where the existing implementation won't
overflow (e.g. because one of the operands is already a size_t, or an
explicit cast is made, etc.). These instances are also replaced with
their corresponding checked functions (st_add(), st_mult(), etc.) to
more clearly express that these operations are supposed to be computed
with 64-bit values.
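
For readers unfamiliar with the checked helpers, the following is a rough
sketch of the idea behind them, not a copy of Git's implementation. The
names mult_overflows(), add_overflows(), checked_mult() and checked_add()
are hypothetical, and abort() stands in for whatever error reporting the
real helpers use. Operands are size_t, so on a 64-bit platform the
arithmetic is 64-bit:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns nonzero if a * b cannot be represented in a size_t. */
static int mult_overflows(size_t a, size_t b)
{
	return a != 0 && b > SIZE_MAX / a;
}

/* Returns nonzero if a + b cannot be represented in a size_t. */
static int add_overflows(size_t a, size_t b)
{
	return b > SIZE_MAX - a;
}

/* Multiply two sizes, refusing to continue if the result would wrap. */
static size_t checked_mult(size_t a, size_t b)
{
	if (mult_overflows(a, b)) {
		fprintf(stderr, "size_t overflow in multiplication\n");
		abort();
	}
	return a * b;
}

/* Add two sizes, refusing to continue if the result would wrap. */
static size_t checked_add(size_t a, size_t b)
{
	if (add_overflows(a, b)) {
		fprintf(stderr, "size_t overflow in addition\n");
		abort();
	}
	return a + b;
}
```

Because the parameters are size_t, a 32-bit argument is widened at the
call site before any arithmetic happens, which is why routing a lookup
through such a helper avoids the wraparound even when no overflow check
would ever fire.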

This series regrettably does not contain any tests. There was some
discussion internally about using a purpose-built test helper to
generate raw packs without going through fast-import. To see how we
handled testing (or not) when writing these kinds of patches in the
past, I looked for prior examples with the following script:

    $ git log --format='%H' -G 'st_(mult|add[234]?)\(' |
        xargs -I {} sh -c 'echo "==> {}" && git -P diff {}^ {} -- t'

Looking at each such commit which mentions st_mult(), st_add(), or
one of its variants, I could not find any examples of us constructing a
gigantic pack/MIDX/commit-graph/bitmap/etc. specifically to trigger an
overflow.

Taylor Blau (20):
  packfile.c: prevent overflow in `nth_packed_object_id()`
  packfile.c: prevent overflow in `load_idx()`
  packfile.c: use checked arithmetic in `nth_packed_object_offset()`
  midx.c: use `size_t`'s for fanout nr and alloc
  midx.c: prevent overflow in `nth_midxed_object_oid()`
  midx.c: prevent overflow in `nth_midxed_offset()`
  midx.c: store `nr`, `alloc` variables as `size_t`'s
  midx.c: prevent overflow in `write_midx_internal()`
  midx.c: prevent overflow in `fill_included_packs_batch()`
  pack-bitmap.c: ensure that eindex lookups don't overflow
  commit-graph.c: prevent overflow in `write_commit_graph_file()`
  commit-graph.c: prevent overflow in add_graph_to_chain()
  commit-graph.c: prevent overflow in `load_oid_from_graph()`
  commit-graph.c: prevent overflow in `fill_commit_graph_info()`
  commit-graph.c: prevent overflow in `fill_commit_in_graph()`
  commit-graph.c: prevent overflow in `load_tree_for_commit()`
  commit-graph.c: prevent overflow in `split_graph_merge_strategy()`
  commit-graph.c: prevent overflow in `merge_commit_graph()`
  commit-graph.c: prevent overflow in `write_commit_graph()`
  commit-graph.c: prevent overflow in `verify_commit_graph()`

 commit-graph.c | 63 ++++++++++++++++++++++++++++++++------------------
 midx.c         | 42 ++++++++++++++++++---------------
 pack-bitmap.c  | 12 ++++++----
 packfile.c     | 15 ++++++------
 4 files changed, 79 insertions(+), 53 deletions(-)