mbox series

[0/7] oid_array cleanups

Message ID 20200330140247.GA476088@coredump.intra.peff.net (mailing list archive)
Headers show
Series oid_array cleanups | expand

Message

Jeff King March 30, 2020, 2:02 p.m. UTC
I recently encountered a repo in the wild that had over 2^31 objects,
and found that cat-file barfed:

  $ git cat-file --batch-all-objects --batch-check
  fatal: size_t overflow: 32 * 18446744071562067968

The issue is that we use an "int" to store the count and the allocation.
Fortunately our st_mult() protection kicks in before we end up with an
undersized buffer, so this shouldn't be dangerous. And while I'd argue
that having that many objects is probably silly and likely to cause
other problems, I'd just as soon we kept our allocating code as robust
as possible.

The first patch is the actual fix, followed by some related type
cleanup. The rest of it is just leftovers from the
s/sha1_array/oid_array/ transition a few years back.

  [1/7]: oid_array: use size_t for count and allocation
  [2/7]: oid_array: use size_t for iteration
  [3/7]: oid_array: rename source file from sha1-array
  [4/7]: test-tool: rename sha1-array to oid-array
  [5/7]: bisect: stop referring to sha1_array
  [6/7]: ref-filter: stop referring to "sha1 array"
  [7/7]: oidset: stop referring to sha1-array

 Makefile                                         |  4 ++--
 bisect.c                                         |  8 ++++----
 builtin/cat-file.c                               |  2 +-
 builtin/diff.c                                   |  2 +-
 builtin/fetch-pack.c                             |  2 +-
 builtin/pack-objects.c                           |  2 +-
 builtin/pull.c                                   |  2 +-
 builtin/receive-pack.c                           |  2 +-
 builtin/send-pack.c                              |  2 +-
 builtin/tag.c                                    |  2 +-
 cache.h                                          |  2 +-
 combine-diff.c                                   |  2 +-
 connect.c                                        |  2 +-
 delta-islands.c                                  |  2 +-
 fetch-pack.c                                     |  2 +-
 object-store.h                                   |  2 +-
 sha1-array.c => oid-array.c                      | 10 +++++-----
 sha1-array.h => oid-array.h                      |  6 +++---
 oidset.h                                         |  2 +-
 parse-options-cb.c                               |  2 +-
 ref-filter.c                                     |  7 +++----
 ref-filter.h                                     |  2 +-
 remote-curl.c                                    |  2 +-
 send-pack.c                                      |  2 +-
 sha1-name.c                                      |  2 +-
 shallow.c                                        |  2 +-
 submodule.c                                      |  2 +-
 t/helper/{test-sha1-array.c => test-oid-array.c} |  8 ++++----
 t/helper/test-tool.c                             |  2 +-
 t/helper/test-tool.h                             |  2 +-
 t/t0064-sha1-array.sh                            | 16 ++++++++--------
 transport.c                                      |  2 +-
 32 files changed, 54 insertions(+), 55 deletions(-)
 rename sha1-array.c => oid-array.c (93%)
 rename sha1-array.h => oid-array.h (97%)
 rename t/helper/{test-sha1-array.c => test-oid-array.c} (83%)

Comments

Taylor Blau April 15, 2020, 12:35 a.m. UTC | #1
Hi Peff,

On Mon, Mar 30, 2020 at 10:02:47AM -0400, Jeff King wrote:
> I recently encountered a repo in the wild that had over 2^31 objects,
> and found that cat-file barfed:
>
>   $ git cat-file --batch-all-objects --batch-check
>   fatal: size_t overflow: 32 * 18446744071562067968
>
> The issue is that we use an "int" to store the count and the allocation.
> Fortunately our st_mult() protection kicks in before we end up with an
> undersized buffer, so this shouldn't be dangerous. And while I'd argue
> that having that many objects is probably silly and likely to cause
> other problems, I'd just as soon we kept our allocating code as robust
> as possible.
>
> The first patch is the actual fix, followed by some related type
> cleanup. The rest of it is just leftovers from the
> s/sha1_array/oid_array/ transition a few years back.
>
>   [1/7]: oid_array: use size_t for count and allocation
>   [2/7]: oid_array: use size_t for iteration
>   [3/7]: oid_array: rename source file from sha1-array
>   [4/7]: test-tool: rename sha1-array to oid-array
>   [5/7]: bisect: stop referring to sha1_array
>   [6/7]: ref-filter: stop referring to "sha1 array"
>   [7/7]: oidset: stop referring to sha1-array

Thanks for this. I reviewed the patch, which was a breeze thanks to how
you broke it out. I don't think that I said anything useful in my actual
review, which is to say that this looks good from my perspective.

Sorry that I let it sit in my inbox for a few days. Trying to get better
about that :).

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor