Message ID | 78932582fa556fd5fd6e8886e80e993f.paul@paul-moore.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Paul Moore |
Series | [GIT,PULL] selinux/selinux-pr-20231030 |
On Mon, 30 Oct 2023 at 16:16, Paul Moore <paul@paul-moore.com> wrote:
>
> * Use a better hashing function for the SELinux role transition hash
>   table.

Bah.

While the old hash function was garbage, the new one is quite expensive.

Maybe it's worth it.

But generally, if you find that "oh, just doing a modulus with a power
of two drops all high bits", the first thing to try is probably to
just do "hash_long(x, N)" to get N bits instead.

Assuming the input is somewhat ok in one word, it does a fairly good
job of mixing the bits with a simple multiply-and-shift.

Yes, yes, jhash is a fine hash, but it does quite a *lot* of (simple)
ALU ops. While "hash_long()" is often small enough to be inlined.

I also note that filenametr_hash() does the old "one byte at a time"
hash and partial_name_hash(). Is there any reason that code doesn't
use the "full_name_hash()" which does things a word at a time?

Probably doesn't matter, but since I looked at this to see what the
new hashing was, I noticed...

Linus
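
To illustrate the point about a power-of-two modulus versus multiply-and-shift, here is a small user-space sketch in C. It is not the kernel's hash_long() and not the SELinux code. The function names and the SKETCH_ prefix are made up for this example. The multiplier is the 32-bit golden-ratio value that the kernel's hash.h uses, and the key pattern (low 8 bits always zero) is an assumed worst case for a 256-bucket table.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: same 32-bit golden-ratio multiplier the kernel defines as GOLDEN_RATIO_32. */
#define SKETCH_GOLDEN_RATIO_32 0x61C88647u

/* Power-of-two modulus: only the low `bits` bits of the key survive (bits in 1..31). */
static inline uint32_t bucket_by_mask(uint32_t key, unsigned int bits)
{
	return key & ((1u << bits) - 1u);
}

/*
 * Multiply-and-shift: the multiply carries low input bits upward, and the
 * shift keeps the top `bits` bits of the 32-bit product (bits in 1..31).
 */
static inline uint32_t bucket_by_multiply_shift(uint32_t key, unsigned int bits)
{
	return (uint32_t)(key * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}

/*
 * Hypothetical helper: count how many of 256 buckets are hit by the keys
 * (i << 8) for i in [0, nkeys), i.e. keys whose low 8 bits are all zero.
 */
static size_t distinct_buckets(uint32_t (*bucket)(uint32_t, unsigned int),
			       size_t nkeys)
{
	unsigned char seen[256] = { 0 };
	size_t count = 0;

	for (size_t i = 0; i < nkeys; i++) {
		uint32_t b = bucket((uint32_t)i << 8, 8);

		if (!seen[b]) {
			seen[b] = 1;
			count++;
		}
	}
	return count;
}
```

With these keys, distinct_buckets(bucket_by_mask, 256) reports a single bucket, because every key has zero low bits, while bucket_by_multiply_shift spreads the same keys over more than one bucket. Whether that is cheaper overall than jhash for a given table is exactly the kind of thing that would need measuring.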
The pull request you sent on Mon, 30 Oct 2023 22:16:31 -0400:
> https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux.git tags/selinux-pr-20231030
has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f5fc9e4a117d4c118c95abb37e9d34d52b748c99
Thank you!
On Tue, Oct 31, 2023 at 2:13 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, 30 Oct 2023 at 16:16, Paul Moore <paul@paul-moore.com> wrote:
> >
> > * Use a better hashing function for the SELinux role transition hash
> >   table.
>
> Bah.
>
> While the old hash function was garbage, the new one is quite expensive.
>
> Maybe it's worth it.
>
> But generally, if you find that "oh, just doing a modulus with a power
> of two drops all high bits", the first thing to try is probably to
> just do "hash_long(x, N)" to get N bits instead.
>
> Assuming the input is somewhat ok in one word, it does a fairly good
> job of mixing the bits with a simple multiply-and-shift.
>
> Yes, yes, jhash is a fine hash, but it does quite a *lot* of (simple)
> ALU ops. While "hash_long()" is often small enough to be inlined.

We probably should do some performance measurements of the various
hash tables in the SELinux code and use that to drive some decisions
on what functions we use. There have been some in the past for
specific tables, but I don't think we've done anything comprehensive,
or recent. This latest change obviously focused more on ensuring a
better distribution, which can help, but if the digest calculation is
too slow it probably doesn't matter.

> I also note that filenametr_hash() does the old "one byte at a time"
> hash and partial_name_hash(). Is there any reason that code doesn't
> use the "full_name_hash()" which does things a word at a time?

Likely just a matter of no one looking at it and realizing it can be
improved. I'll toss this on the todo list, it should take all of five
minutes.

> Probably doesn't matter, but since I looked at this to see what the
> new hashing was, I noticed...

No harm in mentioning it, feedback is always welcome, but you know
what else is even more welcome? Patches ;)