diff mbox series

[v2,7/9] x86/shadow: reduce effort of hash calculation

Message ID 4785b34b-2672-e3a8-8096-df1365b6b7b8@suse.com (mailing list archive)
State Superseded
Headers show
Series x86/shadow: misc tidying | expand

Commit Message

Jan Beulich Jan. 11, 2023, 1:56 p.m. UTC
The "n" input is a GFN/MFN value and hence bounded by the physical
address bits in use on a system. The hash quality won't improve by also
including the upper always-zero bits in the calculation. To keep things
as compile-time-constant as they were before, use PADDR_BITS (not
paddr_bits) for loop bounding. This reduces loop iterations from 8 to 5.

While there also drop the unnecessary conversion to an array of unsigned
char, moving the value off the stack altogether (at least with
optimization enabled).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I was tempted to also change the type "i" (unsigned) right here, but
then thought this might be going too far ...
---
v2: Also eliminate the unsigned char * alias of "n".

Comments

Andrew Cooper Jan. 12, 2023, 12:02 a.m. UTC | #1
On 11/01/2023 1:56 pm, Jan Beulich wrote:
> The "n" input is a GFN/MFN value and hence bounded by the physical
> address bits in use on a system. The hash quality won't improve by also
> including the upper always-zero bits in the calculation. To keep things
> as compile-time-constant as they were before, use PADDR_BITS (not
> paddr_bits) for loop bounding. This reduces loop iterations from 8 to 5.
>
> While there also drop the unnecessary conversion to an array of unsigned
> char, moving the value off the stack altogether (at least with
> optimization enabled).

I'm not sure this final bit in brackets is relevant.  It wouldn't be on
the stack without optimisations either, because ABI-wise, it will be in
%rsi.

I'd just drop it.

>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich Jan. 12, 2023, 9:31 a.m. UTC | #2
On 12.01.2023 01:02, Andrew Cooper wrote:
> On 11/01/2023 1:56 pm, Jan Beulich wrote:
>> The "n" input is a GFN/MFN value and hence bounded by the physical
>> address bits in use on a system. The hash quality won't improve by also
>> including the upper always-zero bits in the calculation. To keep things
>> as compile-time-constant as they were before, use PADDR_BITS (not
>> paddr_bits) for loop bounding. This reduces loop iterations from 8 to 5.
>>
>> While there also drop the unnecessary conversion to an array of unsigned
>> char, moving the value off the stack altogether (at least with
>> optimization enabled).
> 
> I'm not sure this final bit in brackets is relevant.  It wouldn't be on
> the stack without optimisations either, because ABI-wise, it will be in
> %rsi.

Without optimization whether an inline function is actually inlined is
unclear. When it is, what register (or stack slot) an argument is in is
simply unknown. When it is made an out-of-line function, a compiler may
very well spill register arguments to the stack first thing, just to
make all arguments (whatsoever, i.e. not just in this function) be
consistently in memory.

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1397,10 +1397,13 @@  static unsigned int shadow_get_allocatio
 typedef u32 key_t;
 static inline key_t sh_hash(unsigned long n, unsigned int t)
 {
-    unsigned char *p = (unsigned char *)&n;
     key_t k = t;
     int i;
-    for ( i = 0; i < sizeof(n) ; i++ ) k = (u32)p[i] + (k<<6) + (k<<16) - k;
+
+    BUILD_BUG_ON(PADDR_BITS > BITS_PER_LONG + PAGE_SHIFT);
+    for ( i = 0; i < (PADDR_BITS - PAGE_SHIFT + 7) / 8; i++, n >>= 8 )
+        k = (uint8_t)n + (k << 6) + (k << 16) - k;
+
     return k % SHADOW_HASH_BUCKETS;
 }