Message ID | 61e490595b80b34c55fd640e093e021ff6fa9591.1676542973.git.ps@pks.im (mailing list archive) |
---|---|
State | Accepted |
Commit | 6eb095d78790d0d6cbad2186685dad4e7ba2e3de |
Headers | show |
Series | delta-islands: fix segfault when freeing island marks | expand |
Patrick Steinhardt <ps@pks.im> wrote: > In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03) > we have introduced logic to free `island_marks` in order to reduce heap > memory usage in git-pack-objects(1). This commit is causing segfaults in > the case where this Git command does not load delta islands at all, e.g. > when reading object IDs from standard input. One such crash can be hit > when using repacking multi-pack-indices with delta islands enabled. > > The root cause of this bug is that we unconditionally dereference the > `island_marks` variable even in the case where it is a `NULL` pointer, > which is fixed by making it conditional. Note that we still leave the > logic in place to set the pointer to `-1` to detect use-after-free bugs > even when there are no loaded island marks at all. Oops, my fault :x Thanks for this fix. I think eliminating global flags like use_delta_islands and just allocating `struct delta_islands_foo *' in builtin/pack-objects.c would make it harder for people unfamiliar with the code to avoid bugs like this. Tested-by: Eric Wong <e@80x24.org> > An easy way to reproduce the segfault is: > > $ git pack-objects .git/objects/pack/pack --delta-islands </dev/null > > I didn't add a test for git-pack-objects(1) directly though, mostly > because I didn't find any location to put it. I'm happy to do so though > in case we want that. *shrug* I'm often unsure where to add tests, too. Maybe just append it to the end of t/t5300-pack-object.sh ?
Patrick Steinhardt <ps@pks.im> writes: > In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03) > we have introduced logic to free `island_marks` in order to reduce heap > memory usage in git-pack-objects(1). This commit is causing segfaults in > the case where this Git command does not load delta islands at all, e.g. > when reading object IDs from standard input. One such crash can be hit > when using repacking multi-pack-indices with delta islands enabled. > > The root cause of this bug is that we unconditionally dereference the > `island_marks` variable even in the case where it is a `NULL` pointer, > which is fixed by making it conditional. Note that we still leave the > logic in place to set the pointer to `-1` to detect use-after-free bugs > even when there are no loaded island marks at all. > --- Missing sign-off?
Patrick Steinhardt <ps@pks.im> writes: > In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03) > we have introduced logic to free `island_marks` in order to reduce heap > memory usage in git-pack-objects(1). This commit is causing segfaults in > the case where this Git command does not load delta islands at all, e.g. > when reading object IDs from standard input. One such crash can be hit > when using repacking multi-pack-indices with delta islands enabled. > > The root cause of this bug is that we unconditionally dereference the > `island_marks` variable even in the case where it is a `NULL` pointer, > which is fixed by making it conditional. Note that we still leave the > logic in place to set the pointer to `-1` to detect use-after-free bugs > even when there are no loaded island marks at all. There still are unprotected uses of island_marks in delta-islands.c after this patch, but I think they are safe. * The callchain deduplicate_islands() -> mark_remote_island_1() -> create_or_get_island_marks() assume island_marks is not NULL, and the only caller of deduplicate_islands(), load_delta_islands(), initializes island_marks() before calling into it. * set_island_marks() assumes island_marks is not NULL. One of its two callers, resolve_tree_islands() ensures island_marks is not NULL before proceeding. The other one, propagate_island_marks(), also assumes island_marks is not NULL, and is called only from pack-objects.c::show_commit() when use_delta_islands is set. It is not apparent if island_marks has already populated when this happens, though. I think early in the pack-objects process, prepare_pack() calls resolve_tree_islands() but as we have seen, it just punts when island_marks is NULL, and not populates. But get_object_list() explicitly calls load_delta_islands() when use_delta_islands is set, and that happens way before prepare_pack() gets called, so we should be safe. Thanks.
On Thu, Feb 16, 2023 at 10:01:31AM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03) > > we have introduced logic to free `island_marks` in order to reduce heap > > memory usage in git-pack-objects(1). This commit is causing segfaults in > > the case where this Git command does not load delta islands at all, e.g. > > when reading object IDs from standard input. One such crash can be hit > > when using repacking multi-pack-indices with delta islands enabled. > > > > The root cause of this bug is that we unconditionally dereference the > > `island_marks` variable even in the case where it is a `NULL` pointer, > > which is fixed by making it conditional. Note that we still leave the > > logic in place to set the pointer to `-1` to detect use-after-free bugs > > even when there are no loaded island marks at all. > > There still are unprotected uses of island_marks in delta-islands.c > after this patch, but I think they are safe. > > * The callchain deduplicate_islands() -> mark_remote_island_1() -> > create_or_get_island_marks() assume island_marks is not NULL, and > the only caller of deduplicate_islands(), load_delta_islands(), > initializes island_marks() before calling into it. > > * set_island_marks() assumes island_marks is not NULL. One of its > two callers, resolve_tree_islands() ensures island_marks is not > NULL before proceeding. The other one, propagate_island_marks(), > also assumes island_marks is not NULL, and is called only from > pack-objects.c::show_commit() when use_delta_islands is set. It > is not apparent if island_marks has already populated when this > happens, though. > > I think early in the pack-objects process, prepare_pack() calls > resolve_tree_islands() but as we have seen, it just punts when > island_marks is NULL, and not populates. But get_object_list() > explicitly calls load_delta_islands() when use_delta_islands is > set, and that happens way before prepare_pack() gets called, so we > should be safe. > > Thanks. Agreed. I also had a look at the other usages of `island_marks` and couldn't find any obvious way to access it without it being initialized. Patrick
On Thu, Feb 16, 2023 at 09:36:20AM -0800, Junio C Hamano wrote: > Patrick Steinhardt <ps@pks.im> writes: > > > In 647982bb71 (delta-islands: free island_marks and bitmaps, 2023-02-03) > > we have introduced logic to free `island_marks` in order to reduce heap > > memory usage in git-pack-objects(1). This commit is causing segfaults in > > the case where this Git command does not load delta islands at all, e.g. > > when reading object IDs from standard input. One such crash can be hit > > when using repacking multi-pack-indices with delta islands enabled. > > > > The root cause of this bug is that we unconditionally dereference the > > `island_marks` variable even in the case where it is a `NULL` pointer, > > which is fixed by making it conditional. Note that we still leave the > > logic in place to set the pointer to `-1` to detect use-after-free bugs > > even when there are no loaded island marks at all. > > --- > > Missing sign-off? Oops, right. Please feel free to add it when pulling: Signed-off-by: Patrick Steinhardt <ps@pks.im> If you prefer I'd also be happy to send another iteration of this patch, potentially including a direct test for git-pack-objects(1). I'm still not sure whether it'd be worth it though given that it's implicitly covered by the higher-level test that checks for the segfault by using git-multi-pack-index(1). Patrick
Patrick Steinhardt <ps@pks.im> writes: > Oops, right. Please feel free to add it when pulling: > > Signed-off-by: Patrick Steinhardt <ps@pks.im> I'll add that to the copy I already have. > If you prefer I'd also be happy to send another iteration of this patch, > potentially including a direct test for git-pack-objects(1). I'm still > not sure whether it'd be worth it though given that it's implicitly > covered by the higher-level test that checks for the segfault by using > git-multi-pack-index(1). The existing test probably is sufficient. Thanks.
diff --git a/delta-islands.c b/delta-islands.c index 8b234cb85b..afdec0a878 100644 --- a/delta-islands.c +++ b/delta-islands.c @@ -517,11 +517,13 @@ void free_island_marks(void) { struct island_bitmap *bitmap; - kh_foreach_value(island_marks, bitmap, { - if (!--bitmap->refcount) - free(bitmap); - }); - kh_destroy_oid_map(island_marks); + if (island_marks) { + kh_foreach_value(island_marks, bitmap, { + if (!--bitmap->refcount) + free(bitmap); + }); + kh_destroy_oid_map(island_marks); + } /* detect use-after-free with a an address which is never valid: */ island_marks = (void *)-1; diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index b5f9b10922..499d5d4c78 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -1015,4 +1015,20 @@ test_expect_success 'complains when run outside of a repository' ' grep "not a git repository" err ' +test_expect_success 'repack with delta islands' ' + git init repo && + test_when_finished "rm -fr repo" && + ( + cd repo && + + test_commit first && + git repack && + test_commit second && + git repack && + + git multi-pack-index write && + git -c repack.useDeltaIslands=true multi-pack-index repack + ) +' + test_done