Message ID | 20241230-pks-meson-sha1-unsafe-v1-2-efb276e171f5@pks.im (mailing list archive)
---|---
State | Accepted
Commit | 106140a99fbdb7acf19723473621e0ccaa03c158
Series | Fix segfaults when using the unsafe SHA1 backend
On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> Same as with the preceding commit, git-fast-import(1) is using the safe
> variant to initialize a hashfile checkpoint. This leads to a segfault
> when passing the checkpoint into the hashfile subsystem because it would
> use the unsafe variants instead:
>
>     ++ git --git-dir=R/.git fast-import --big-file-threshold=1
>     AddressSanitizer:DEADLYSIGNAL
>     =================================================================
>     ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
>     ==577126==The signal is caused by a READ memory access.
>     ==577126==Hint: address points to the zero page.
>         #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
>         #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
>         #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
>         #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
>         #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
>         #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
>         #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
>         #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
>         #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
>         #9 0x555555b1f493 in run_builtin ../git.c:480:11
>         #10 0x555555b1bfef in handle_builtin ../git.c:740:9
>         #11 0x555555b1e6f4 in run_argv ../git.c:807:4
>         #12 0x555555b1b87a in cmd_main ../git.c:947:19
>         #13 0x5555561649e6 in main ../common-main.c:64:11
>         #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
>         #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
>         #16 0x555555772c84 in _start (git+0x21ec84)

Wow. This is a really good catch, and I'm embarrassed to have introduced
the bug in the first place.

If I can state back to you what I think is going on to make sure that
we're on the same page... In fast-import, we initialize a
hashfile_checkpoint struct using the regular the_hash_algo callbacks,
like the_hash_algo->init_fn(), the_hash_algo->update_fn(), etc. But we
also have a hashfile, which will use the unsafe variants of these
functions. If we're using a specialized unsafe variant, then calling
hashfile_checkpoint() will crash. In your example, the crash happens
when we try to call the clone_fn, but I think it could also happen with
hashflush() when it tries to call update_fn().

I thought that my series here:

  https://lore.kernel.org/git/cover.1732130001.git.me@ttaylorr.com/

would have prevented this, but I don't think that it would have in its
current state. A little more on that in a second...

> ---
>  builtin/fast-import.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> --- a/builtin/fast-import.c
> +++ b/builtin/fast-import.c
> @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
>      || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
>   cycle_packfile();
>
> - the_hash_algo->init_fn(&checkpoint.ctx);
> + the_hash_algo->unsafe_init_fn(&checkpoint.ctx);

This will obviously fix the issue at hand, but I don't think this is any
less brittle than before. The hash function implementation here needs to
agree with the one used in the hashfile API. This change makes that
happen, but only by using the side information that the hashfile API
uses the unsafe variants.

The series I mentioned above made the algop a property of the hashfile
itself, and introduced a specialized "unsafe" variant that did not
require using separate callbacks. So that on its own would not have
saved us here (or in the other spot from the previous patch). But
something on top like this would have:

--- 8< ---
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 76d5c20f14..89fc20d436 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1101,7 +1101,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
     || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
  cycle_packfile();

- the_hash_algo->init_fn(&checkpoint.ctx);
+ pack_file->algop->init_fn(&checkpoint.ctx);

  hashfile_checkpoint(pack_file, &checkpoint);
  offset = checkpoint.offset;
--- >8 ---

(and an analogous change to the other spot).

I think we should perhaps combine forces here. My ideal end-state is to
have the unsafe_hash_algo() stuff land from my earlier series, then have
these two fixes (adjusted to the new world order as above), and finally
the Meson fixes after that.

Does that seem like a plan to you? If so, I can put everything together
and send it out (if you're OK with me forging your s-o-b).

Thanks,
Taylor
On Mon, Dec 30, 2024 at 12:22:34PM -0500, Taylor Blau wrote:
> On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> > --- a/builtin/fast-import.c
> > +++ b/builtin/fast-import.c
> > @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
> >      || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
> >   cycle_packfile();
> >
> > - the_hash_algo->init_fn(&checkpoint.ctx);
> > + the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
>
> This will obviously fix the issue at hand, but I don't think this is any
> less brittle than before. The hash function implementation here needs to
> agree with that used in the hashfile API. This change makes that
> happen, but only using side information that the hashfile API uses the
> unsafe variants.

Yup, I only cared about fixing the segfault because we're close to the
v2.48 release. I agree that the overall state is still extremely brittle
right now.

[snip]
> I think we should perhaps combine forces here. My ideal end-state is to
> have the unsafe_hash_algo() stuff land from my earlier series, then have
> these two fixes (adjusted to the new world order as above), and finally
> the Meson fixes after that.
>
> Does that seem like a plan to you? If so, I can put everything together
> and send it out (if you're OK with me forging your s-o-b).

I think the ideal state would be if the hashing function used was stored
as part of `struct git_hash_ctx`. So the flow basically becomes for
example:

```
struct git_hash_ctx ctx;
struct object_id oid;

git_hash_sha1_init(&ctx);
git_hash_update(&ctx, data);
git_hash_final_oid(&oid, &ctx);
```

Note how the intermediate calls don't need to know which hash function
you used to initialize the `struct git_hash_ctx` -- the structure itself
should remember what it has been initialized with and do the right
thing.

Patrick
Patrick Steinhardt <ps@pks.im> writes:

> I think the ideal state would be if the hashing function used was stored
> as part of `struct git_hash_ctx`. So the flow basically becomes for
> example:
>
> ```
> struct git_hash_ctx ctx;
> struct object_id oid;
>
> git_hash_sha1_init(&ctx);
> git_hash_update(&ctx, data);
> git_hash_final_oid(&oid, &ctx);
> ```
>
> Note how the intermediate calls don't need to know which hash function
> you used to initialize the `struct git_hash_ctx` -- the structure itself
> should remember what it has been initialized with and do the right thing.

Yup, that sounds perfect.
On Fri, Jan 03, 2025 at 02:08:01PM +0100, Patrick Steinhardt wrote:
> On Mon, Dec 30, 2024 at 12:22:34PM -0500, Taylor Blau wrote:
> > On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> > > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > > index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> > > --- a/builtin/fast-import.c
> > > +++ b/builtin/fast-import.c
> > > @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
> > >      || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
> > >   cycle_packfile();
> > >
> > > - the_hash_algo->init_fn(&checkpoint.ctx);
> > > + the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
> >
> > This will obviously fix the issue at hand, but I don't think this is any
> > less brittle than before. The hash function implementation here needs to
> > agree with that used in the hashfile API. This change makes that
> > happen, but only using side information that the hashfile API uses the
> > unsafe variants.
>
> Yup, I only cared about fixing the segfault because we're close to the
> v2.48 release. I agree that the overall state is still extremely brittle
> right now.
>
> [snip]
> > I think we should perhaps combine forces here. My ideal end-state is to
> > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > these two fixes (adjusted to the new world order as above), and finally
> > the Meson fixes after that.
> >
> > Does that seem like a plan to you? If so, I can put everything together
> > and send it out (if you're OK with me forging your s-o-b).
>
> I think the ideal state would be if the hashing function used was stored
> as part of `struct git_hash_ctx`. So the flow basically becomes for
> example:
>
> ```
> struct git_hash_ctx ctx;
> struct object_id oid;
>
> git_hash_sha1_init(&ctx);
> git_hash_update(&ctx, data);
> git_hash_final_oid(&oid, &ctx);
> ```
>
> Note how the intermediate calls don't need to know which hash function
> you used to initialize the `struct git_hash_ctx` -- the structure itself
> should remember what it has been initialized with and do the right thing.

I'm not sure I'm following you here. In the stream_blob() function
within fast-import, the problem isn't that we're switching hash
functions mid-stream, but that we're initializing the
hashfile_checkpoint structure with the wrong hash function to begin
with.

You snipped it out of your reply, but I think that my suggestion to do:

    pack_file->algop->init_fn(&checkpoint.ctx);

would harden us against the broken behavior we're seeing here.

As a separate defense-in-depth measure, we could teach functions from
the hashfile API which deal with the hashfile_checkpoint structure to
ensure that the hashfile and its checkpoint both use the same algorithm
(by adding a hash_algo field to the hashfile_checkpoint structure).

Thanks,
Taylor
On Mon, Jan 06, 2025 at 02:17:58PM -0500, Taylor Blau wrote:
> On Fri, Jan 03, 2025 at 02:08:01PM +0100, Patrick Steinhardt wrote:
> > On Mon, Dec 30, 2024 at 12:22:34PM -0500, Taylor Blau wrote:
> > > On Mon, Dec 30, 2024 at 03:24:02PM +0100, Patrick Steinhardt wrote:
> > > > diff --git a/builtin/fast-import.c b/builtin/fast-import.c
> > > > index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
> > > > --- a/builtin/fast-import.c
> > > > +++ b/builtin/fast-import.c
> > > > @@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
> > > >      || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
> > > >   cycle_packfile();
> > > >
> > > > - the_hash_algo->init_fn(&checkpoint.ctx);
> > > > + the_hash_algo->unsafe_init_fn(&checkpoint.ctx);
> > >
> > > This will obviously fix the issue at hand, but I don't think this is any
> > > less brittle than before. The hash function implementation here needs to
> > > agree with that used in the hashfile API. This change makes that
> > > happen, but only using side information that the hashfile API uses the
> > > unsafe variants.
> >
> > Yup, I only cared about fixing the segfault because we're close to the
> > v2.48 release. I agree that the overall state is still extremely brittle
> > right now.
> >
> > [snip]
> > > I think we should perhaps combine forces here. My ideal end-state is to
> > > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > > these two fixes (adjusted to the new world order as above), and finally
> > > the Meson fixes after that.
> > >
> > > Does that seem like a plan to you? If so, I can put everything together
> > > and send it out (if you're OK with me forging your s-o-b).
> >
> > I think the ideal state would be if the hashing function used was stored
> > as part of `struct git_hash_ctx`. So the flow basically becomes for
> > example:
> >
> > ```
> > struct git_hash_ctx ctx;
> > struct object_id oid;
> >
> > git_hash_sha1_init(&ctx);
> > git_hash_update(&ctx, data);
> > git_hash_final_oid(&oid, &ctx);
> > ```
> >
> > Note how the intermediate calls don't need to know which hash function
> > you used to initialize the `struct git_hash_ctx` -- the structure itself
> > should remember what it has been initialized with and do the right thing.
>
> I'm not sure I'm following you here. In the stream_blob() function
> within fast-import, the problem isn't that we're switching hash
> functions mid-stream, but that we're initializing the
> hashfile_checkpoint structure with the wrong hash function to begin
> with.

True, but it would have been a non-issue if the hash context itself knew
which hash function to use for updates. Sure, we would've used the slow
variant of SHA1 instead of the fast-but-unsafe one. But that feels like
the lesser evil compared to crashing.

> You snipped it out of your reply, but I think that my suggestion to do:
>
>     pack_file->algop->init_fn(&checkpoint.ctx);
>
> would harden us against the broken behavior we're seeing here.
>
> As a separate defense-in-depth measure, we could teach functions from
> the hashfile API which deal with the hashfile_checkpoint structure to
> ensure that the hashfile and its checkpoint both use the same algorithm
> (by adding a hash_algo field to the hashfile_checkpoint structure).

I would think it would be even harder to abuse if it wasn't the hashfile
API, but the hash API that remembered the algorithm.

Patrick
On Tue, Jan 07, 2025 at 01:06:20PM +0100, Patrick Steinhardt wrote:
> > > > I think we should perhaps combine forces here. My ideal end-state is to
> > > > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > > > these two fixes (adjusted to the new world order as above), and finally
> > > > the Meson fixes after that.
> > > >
> > > > Does that seem like a plan to you? If so, I can put everything together
> > > > and send it out (if you're OK with me forging your s-o-b).
> > >
> > > I think the ideal state would be if the hashing function used was stored
> > > as part of `struct git_hash_ctx`. So the flow basically becomes for
> > > example:
> > >
> > > ```
> > > struct git_hash_ctx ctx;
> > > struct object_id oid;
> > >
> > > git_hash_sha1_init(&ctx);
> > > git_hash_update(&ctx, data);
> > > git_hash_final_oid(&oid, &ctx);
> > > ```
> > >
> > > Note how the intermediate calls don't need to know which hash function
> > > you used to initialize the `struct git_hash_ctx` -- the structure itself
> > > should remember what it has been initialized with and do the right thing.
> >
> > I'm not sure I'm following you here. In the stream_blob() function
> > within fast-import, the problem isn't that we're switching hash
> > functions mid-stream, but that we're initializing the
> > hashfile_checkpoint structure with the wrong hash function to begin
> > with.
>
> True, but it would have been a non-issue if the hash context itself knew
> which hash function to use for updates. Sure, we would've used the slow
> variant of SHA1 instead of the fast-but-unsafe one. But that feels like
> the lesser evil compared to crashing.

For posterity, Patrick and I used some of our monthly meeting this
morning to spend some time together pairing on this idea.

It ended up being a dead-end, since this approach only protects you
against changing the hash function mid-stream, and not against using
the incorrect context type from the union.

That was along the lines of what I was originally thinking, and so I
resurrected my series to introduce 'unsafe_hash_algo()' here:

  https://lore.kernel.org/git/cover.1736363652.git.me@ttaylorr.com/

I got the impression that Patrick and I are on the same page there as
that being a good path forward, but I'll let him chime in in case I
misunderstood anything.

Thanks,
Taylor
On Wed, Jan 08, 2025 at 02:21:47PM -0500, Taylor Blau wrote:
> On Tue, Jan 07, 2025 at 01:06:20PM +0100, Patrick Steinhardt wrote:
> > > > > I think we should perhaps combine forces here. My ideal end-state is to
> > > > > have the unsafe_hash_algo() stuff land from my earlier series, then have
> > > > > these two fixes (adjusted to the new world order as above), and finally
> > > > > the Meson fixes after that.
> > > > >
> > > > > Does that seem like a plan to you? If so, I can put everything together
> > > > > and send it out (if you're OK with me forging your s-o-b).
> > > >
> > > > I think the ideal state would be if the hashing function used was stored
> > > > as part of `struct git_hash_ctx`. So the flow basically becomes for
> > > > example:
> > > >
> > > > ```
> > > > struct git_hash_ctx ctx;
> > > > struct object_id oid;
> > > >
> > > > git_hash_sha1_init(&ctx);
> > > > git_hash_update(&ctx, data);
> > > > git_hash_final_oid(&oid, &ctx);
> > > > ```
> > > >
> > > > Note how the intermediate calls don't need to know which hash function
> > > > you used to initialize the `struct git_hash_ctx` -- the structure itself
> > > > should remember what it has been initialized with and do the right thing.
> > >
> > > I'm not sure I'm following you here. In the stream_blob() function
> > > within fast-import, the problem isn't that we're switching hash
> > > functions mid-stream, but that we're initializing the
> > > hashfile_checkpoint structure with the wrong hash function to begin
> > > with.
> >
> > True, but it would have been a non-issue if the hash context itself knew
> > which hash function to use for updates. Sure, we would've used the slow
> > variant of SHA1 instead of the fast-but-unsafe one. But that feels like
> > the lesser evil compared to crashing.
>
> For posterity, Patrick and I used some of our monthly meeting this
> morning to spend some time together pairing on this idea.
>
> It ended up being a dead-end, since this approach only protects you
> against changing the hash function mid-stream, and not against using
> the incorrect context type from the union.
>
> That was along the lines of what I was originally thinking, and so I
> resurrected my series to introduce 'unsafe_hash_algo()' here:
>
>   https://lore.kernel.org/git/cover.1736363652.git.me@ttaylorr.com/
>
> I got the impression that Patrick and I are on the same page there as
> that being a good path forward, but I'll let him chime in in case I
> misunderstood anything.

No misunderstanding, we're both on the same page. Thanks!

Patrick
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 1fa2929a01b7dfee52b653248bba802884f6be6a..0f86392761abbe6acb217fef7f4fe7c3ff5ac1fa 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -1106,7 +1106,7 @@ static void stream_blob(uintmax_t len, struct object_id *oidout, uintmax_t mark)
     || (pack_size + PACK_SIZE_THRESHOLD + len) < pack_size)
  cycle_packfile();

- the_hash_algo->init_fn(&checkpoint.ctx);
+ the_hash_algo->unsafe_init_fn(&checkpoint.ctx);

  hashfile_checkpoint(pack_file, &checkpoint);
  offset = checkpoint.offset;
Same as with the preceding commit, git-fast-import(1) is using the safe
variant to initialize a hashfile checkpoint. This leads to a segfault
when passing the checkpoint into the hashfile subsystem because it would
use the unsafe variants instead:

    ++ git --git-dir=R/.git fast-import --big-file-threshold=1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
    ==577126==The signal is caused by a READ memory access.
    ==577126==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
        #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
        #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
        #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
        #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
        #9 0x555555b1f493 in run_builtin ../git.c:480:11
        #10 0x555555b1bfef in handle_builtin ../git.c:740:9
        #11 0x555555b1e6f4 in run_argv ../git.c:807:4
        #12 0x555555b1b87a in cmd_main ../git.c:947:19
        #13 0x5555561649e6 in main ../common-main.c:64:11
        #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #16 0x555555772c84 in _start (git+0x21ec84)

    ==577126==Register values:
    rax = 0x0000511000000cc0  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x00005070000009c0  rbp = 0x00005070000009c0  rsp = 0x00007fffffff5b30
    r8  = 0x0000000000000000  r9  = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b60  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==577126==ABORTING
    ./test-lib.sh: line 1039: 577126 Aborted  git --git-dir=R/.git fast-import --big-file-threshold=1 < input
    error: last command exited with $?=134
    not ok 167 - R: blob bigger than threshold

The segfault is only exposed in case the unsafe and safe backends are
different from one another.

Fix the issue by initializing the context with the unsafe SHA1 variant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/fast-import.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)