Message ID | cover.1736363652.git.me@ttaylorr.com (mailing list archive) |
---|---|
Headers | show |
Series | hash: introduce unsafe_hash_algo(), drop unsafe_ variants | expand |
On Wed, Jan 08, 2025 at 02:14:29PM -0500, Taylor Blau wrote: > (This series is rebased on 'master', which is 14650065b7 > (RelNotes/2.48.0: fix typos etc., 2025-01-07) at the time of writing). > > The bulk of this series is unchanged since last time, but a new seventh > patch that further hardens the hashfile_checkpoint callers on top of > Patrick's recent series[1]. I think that new patch is a definite improvement, though I left some comments there. The changes in patch 1 look fine to me (I still think a generic "test-tool hash" would make more sense, but I'm OK punting on that for now). I didn't see any response to the review in round 1 about the pointer dangers in patch 3. What do you think of using a separate git_hash_algo_fns struct, with the one-time conversion I showed in the subthread of: https://lore.kernel.org/git/20241121093731.GD602681@coredump.intra.peff.net/ ? -Peff
On Fri, Jan 10, 2025 at 05:41:06AM -0500, Jeff King wrote: > On Wed, Jan 08, 2025 at 02:14:29PM -0500, Taylor Blau wrote: > > > (This series is rebased on 'master', which is 14650065b7 > > (RelNotes/2.48.0: fix typos etc., 2025-01-07) at the time of writing). > > > > The bulk of this series is unchanged since last time, but a new seventh > > patch that further hardens the hashfile_checkpoint callers on top of > > Patrick's recent series[1]. > > I think that new patch is a definite improvement, though I left some > comments there. > > The changes in patch 1 look fine to me (I still think a generic > "test-tool hash" would make more sense, but I'm OK punting on that for > now). > > I didn't see any response to the review in round 1 about the pointer > dangers in patch 3. What do you think of using a separate > git_hash_algo_fns struct, with the one-time conversion I showed in the > subthread of: > > https://lore.kernel.org/git/20241121093731.GD602681@coredump.intra.peff.net/ > > ? Oops. I must have missed those messages; and sure enough when focusing my inbox on this thread they are indeed unread :-). That said, I am not sure that that direction is one that I'd want to go in. Part of the goal of this series is to make it possible to mix safe and unsafe function calls on the same hash function. So doing something like: struct git_hash_algo *algop; algop->init_fn(&ctx); in one part of the code, and then (using the same algop) calling: algop->unsafe_final_fn(...); should be impossible to do to. The benefit of having only a single set of functions implemented on the git_hash_algo type is that it is impossible to mix the two: you'd have to use a different git_hash_algo altogether! So porting the above example to your and brian's git_hash_algo_fns struct, you'd still be able to do: algop->fn.init(&ctx); in one part of the code and algop->unsafe_fn.final(...) in another part, which doesn't appear to me to be safer than the current situation that this series aims to solve. But if I am misunderstanding something about your/brian's discussion earlier in this thread, please feel free to correct me. Thanks, Taylor
Taylor Blau <me@ttaylorr.com> writes: > (This series is rebased on 'master', which is 14650065b7 > (RelNotes/2.48.0: fix typos etc., 2025-01-07) at the time of writing). The previous round was based on <cover.1730833506.git.me@ttaylorr.com> which became 'tb/unsafe-hash-test', but this round is based on a recent 'master' that does not yet contain it? Does it mean that the 2-patch series the previous round of this series was based on is no longer needed? Thanks.
On Fri, Jan 10, 2025 at 04:29:38PM -0500, Taylor Blau wrote: > > I didn't see any response to the review in round 1 about the pointer > > dangers in patch 3. What do you think of using a separate > > git_hash_algo_fns struct, with the one-time conversion I showed in the > > subthread of: > > > > https://lore.kernel.org/git/20241121093731.GD602681@coredump.intra.peff.net/ > > > > ? > > Oops. I must have missed those messages; and sure enough when focusing > my inbox on this thread they are indeed unread :-). > > That said, I am not sure that that direction is one that I'd want to go > in. Part of the goal of this series is to make it possible to mix safe > and unsafe function calls on the same hash function. So doing something > like: > > struct git_hash_algo *algop; > > algop->init_fn(&ctx); > > in one part of the code, and then (using the same algop) calling: > > algop->unsafe_final_fn(...); > > should be impossible to do to. The benefit of having only a single set > of functions implemented on the git_hash_algo type is that it is > impossible to mix the two: you'd have to use a different git_hash_algo > altogether! > > So porting the above example to your and brian's git_hash_algo_fns > struct, you'd still be able to do: > > algop->fn.init(&ctx); > > in one part of the code and algop->unsafe_fn.final(...) in another part, > which doesn't appear to me to be safer than the current situation that > this series aims to solve. I think what that proposal is doing is orthogonal to the goal of your series. You'd still have an unsafe_hash_algo() function, but it would return a git_hash_algo_fns struct, and that's what struct hashfile would store. So your patches would still remain. The advantage is mostly that you can't confuse it with a regular git_hash_algo struct, so it avoids the pointer and hash-id issues. I do think there is one gotcha, though, which is that the hashfile code probably still needs the outer algop pointer for things like algop->raw_size. So you'd have to store both. It's _possible_ to still confuse the two, but the idea is that you'd have to explicitly call algop->fn, to get the wrong one there. If we wanted to make that harder to get wrong, we could start making it a habit to never use algo->fn directly, but to ask for the safe/unsafe git_hash_algo_fns struct. But that would be even more churn in the surrounding code. I think just doing it consistently within hashfile (which is the only unsafe user) would be sufficient. -Peff
On Fri, Jan 10, 2025 at 04:14:58PM -0800, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > (This series is rebased on 'master', which is 14650065b7 > > (RelNotes/2.48.0: fix typos etc., 2025-01-07) at the time of writing). > > The previous round was based on > <cover.1730833506.git.me@ttaylorr.com> which became > 'tb/unsafe-hash-test', but this round is based on a recent 'master' > that does not yet contain it? Does it mean that the 2-patch series > the previous round of this series was based on is no longer needed? Those two patches got squashed together and became the first patch of this series, so 'tb/unsafe-hash-test' is safe to discard. Thank you for shuffling the patches around as always :-). Thanks, Taylor