Message ID | 20181120094638.GA22742@sigill.intra.peff.net (mailing list archive) |
---|---|
State | New, archived |
Series | delta-island fixes |
On Tue, Nov 20, 2018 at 11:04 AM Jeff King <peff@peff.net> wrote:
>
> Commit 108f530385 (pack-objects: move tree_depth into 'struct
> packing_data', 2018-08-16) dynamically manages a tree_depth array in
> packing_data that maintains one of these invariants:
>
>   1. tree_depth is NULL (i.e., the requested options don't require us
>      to track tree depths)
>
>   2. tree_depth is non-NULL and has as many entries as the "objects"
>      array
>
> We maintain (2) by:
>
>   a. When the objects array grows, grow tree_depth to the same size
>      (unless it's NULL, in which case we can leave it).
>
>   b. When a caller asks to set a depth via oe_set_tree_depth(), if
>      tree_depth is NULL we allocate it.
>
> But in (b), we use the number of stored objects, _not_ the allocated
> size of the objects array. So we can run into a situation like this:
>
>   1. packlist_alloc() needs to store the Nth object, so it grows the
>      objects array to M, where M > N.
>
>   2. oe_set_tree_depth() wants to store a depth, so it allocates an
>      array of length N. Now we've violated our invariant.
>
>   3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
>      grow the objects array, since N <= M still holds. We try to assign
>      to tree_depth[N+1], which is out of bounds.

Do you think that splitting this data out into packing_data is too
fragile, and that we should just scrap the whole thing and move all the
data back into object_entry[]? We would use more memory, of course, but
higher memory usage is still better than more bugs (if these are likely
to show up again).
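[Editor's note: below is a minimal, self-contained sketch of the failure mode described above. The struct, growth factor, and helper bodies are simplified stand-ins (only the function names mirror the real pack-objects code), so treat it as an illustration rather than the actual implementation.]

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for 'struct packing_data'; not the real Git types. */
struct packing_data {
	size_t nr_objects;  /* objects actually stored (N) */
	size_t nr_alloc;    /* allocated size of the objects array (M) */
	int *objects;       /* stand-in for the object_entry array */
	int *tree_depth;    /* lazily allocated auxiliary array */
};

/* Appends one object, growing the array past what is strictly needed. */
static void packlist_alloc(struct packing_data *pack)
{
	if (pack->nr_objects >= pack->nr_alloc) {
		pack->nr_alloc = (pack->nr_alloc + 16) * 3 / 2;
		pack->objects = realloc(pack->objects,
					pack->nr_alloc * sizeof(*pack->objects));
		/* invariant (2a): keep tree_depth as large as objects */
		if (pack->tree_depth)
			pack->tree_depth = realloc(pack->tree_depth,
						   pack->nr_alloc * sizeof(*pack->tree_depth));
	}
	pack->objects[pack->nr_objects++] = 0;
}

static void oe_set_tree_depth(struct packing_data *pack, size_t i, int depth)
{
	if (!pack->tree_depth)
		/* the bug: sized by the stored count (N), not the capacity (M) */
		pack->tree_depth = calloc(pack->nr_objects,
					  sizeof(*pack->tree_depth));
	pack->tree_depth[i] = depth;
}

int main(void)
{
	struct packing_data pack = { 0 };
	size_t i;

	packlist_alloc(&pack);           /* nr_objects = 1, nr_alloc = 24 */
	oe_set_tree_depth(&pack, 0, 1);  /* tree_depth gets only 1 entry  */
	for (i = 1; i < 10; i++) {
		packlist_alloc(&pack);          /* no growth: 1..9 < 24 */
		oe_set_tree_depth(&pack, i, 1); /* out-of-bounds write  */
	}
	printf("nr_objects=%zu nr_alloc=%zu\n", pack.nr_objects, pack.nr_alloc);
	return 0;
}
```

Run under AddressSanitizer, the second oe_set_tree_depth() call in this sketch is reported as a heap-buffer-overflow, the same class of out-of-bounds write described in steps 1-3 above.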
Jeff King <peff@peff.net> writes:

> But in (b), we use the number of stored objects, _not_ the allocated
> size of the objects array. So we can run into a situation like this:
>
>   1. packlist_alloc() needs to store the Nth object, so it grows the
>      objects array to M, where M > N.
>
>   2. oe_set_tree_depth() wants to store a depth, so it allocates an
>      array of length N. Now we've violated our invariant.
>
>   3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
>      grow the objects array, since N <= M still holds. We try to assign
>      to tree_depth[N+1], which is out of bounds.

Ouch. I see that counting and allocating is hard (I think I spotted a bug
in another area that comes from the same "count while filtering and then
allocate" pattern during this cycle). Thanks for spotting it.
On Tue, Nov 20, 2018 at 05:37:18PM +0100, Duy Nguyen wrote:

> > But in (b), we use the number of stored objects, _not_ the allocated
> > size of the objects array. So we can run into a situation like this:
> >
> >   1. packlist_alloc() needs to store the Nth object, so it grows the
> >      objects array to M, where M > N.
> >
> >   2. oe_set_tree_depth() wants to store a depth, so it allocates an
> >      array of length N. Now we've violated our invariant.
> >
> >   3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
> >      grow the objects array, since N <= M still holds. We try to assign
> >      to tree_depth[N+1], which is out of bounds.
>
> Do you think that splitting this data out into packing_data is too
> fragile, and that we should just scrap the whole thing and move all the
> data back into object_entry[]? We would use more memory, of course, but
> higher memory usage is still better than more bugs (if these are likely
> to show up again).

Certainly that thought crossed my mind while working on these patches. :)
Especially given the difficulties it introduced into the recent
bitmap-reuse topic, and the size fixes we had to deal with in v2.19.

Overall, though, I dunno. This fix, while subtle, turned out not to be too
complicated. And the memory savings are real. I consider 100M objects to
be on the large side of feasible for stock Git these days, and I think we
are talking about on the order of 4GB of memory savings there. You need a
big machine to handle a repository of that size, but 4GB is still
appreciable.

So I guess at this point, with all (known) bugs fixed, we should stick
with it for now. If it becomes a problem for development of a future
feature, then we can re-evaluate then.

-Peff
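[Editor's note: a quick back-of-the-envelope for that figure; the bytes-per-entry number is just the quotient of the two values quoted above, not something stated separately in the thread.]

$$\frac{4\ \text{GB}}{100 \times 10^{6}\ \text{objects}} \approx 40\ \text{bytes per object entry}$$

In other words, the estimated savings correspond to roughly 40 bytes kept out of each in-memory entry by storing these fields in packing_data instead of object_entry.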
diff --git a/pack-objects.h b/pack-objects.h
index feb6a6a05e..f31ac1c81c 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -412,7 +412,7 @@ static inline void oe_set_tree_depth(struct packing_data *pack,
 				     unsigned int tree_depth)
 {
 	if (!pack->tree_depth)
-		ALLOC_ARRAY(pack->tree_depth, pack->nr_objects);
+		ALLOC_ARRAY(pack->tree_depth, pack->nr_alloc);
 	pack->tree_depth[e - pack->objects] = tree_depth;
 }
@@ -429,7 +429,7 @@ static inline void oe_set_layer(struct packing_data *pack,
 				unsigned char layer)
 {
 	if (!pack->layer)
-		ALLOC_ARRAY(pack->layer, pack->nr_objects);
+		ALLOC_ARRAY(pack->layer, pack->nr_alloc);
 	pack->layer[e - pack->objects] = layer;
 }
Commit 108f530385 (pack-objects: move tree_depth into 'struct
packing_data', 2018-08-16) dynamically manages a tree_depth array in
packing_data that maintains one of these invariants:

  1. tree_depth is NULL (i.e., the requested options don't require us to
     track tree depths)

  2. tree_depth is non-NULL and has as many entries as the "objects"
     array

We maintain (2) by:

  a. When the objects array grows, grow tree_depth to the same size
     (unless it's NULL, in which case we can leave it).

  b. When a caller asks to set a depth via oe_set_tree_depth(), if
     tree_depth is NULL we allocate it.

But in (b), we use the number of stored objects, _not_ the allocated
size of the objects array. So we can run into a situation like this:

  1. packlist_alloc() needs to store the Nth object, so it grows the
     objects array to M, where M > N.

  2. oe_set_tree_depth() wants to store a depth, so it allocates an
     array of length N. Now we've violated our invariant.

  3. packlist_alloc() needs to store the N+1th object. But it _doesn't_
     grow the objects array, since N <= M still holds. We try to assign
     to tree_depth[N+1], which is out of bounds.

That doesn't happen in our test scripts, because the repositories they
use are so small, but it's easy to trigger by running:

  echo HEAD | git pack-objects --revs --delta-islands --stdout >/dev/null

in any reasonably-sized repo (like git.git).

We can fix it by always growing the array to match pack->nr_alloc, not
pack->nr_objects. Likewise for the "layer" array from fe0ac2fb7f
(pack-objects: move 'layer' into 'struct packing_data', 2018-08-16),
which has the same bug.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-objects.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
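[Editor's note: in terms of the simplified sketch shown earlier, the fix amounts to sizing the lazy allocation by the capacity rather than by the stored count. This is a paraphrase of the one-line change, reusing the toy struct from that sketch, not the verbatim pack-objects.h code shown in the diff above.]

```c
/* Fixed version of the toy helper: size by capacity (nr_alloc), so later
 * packlist_alloc() calls that append without growing "objects" can never
 * index past the end of tree_depth. */
static void oe_set_tree_depth(struct packing_data *pack, size_t i, int depth)
{
	if (!pack->tree_depth)
		pack->tree_depth = calloc(pack->nr_alloc,
					  sizeof(*pack->tree_depth));
	pack->tree_depth[i] = depth;
}
```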