diff mbox series

[12/22] x86/fpu: Only write PKRU if it is different from current

Message ID 20190221115020.12385-13-bigeasy@linutronix.de (mailing list archive)
State New, archived
Headers show
Series [01/22] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() | expand

Commit Message

Sebastian Andrzej Siewior Feb. 21, 2019, 11:50 a.m. UTC
Dave Hansen says that the `wrpkru' is more expensive than `rdpkru'. It
has a higher cycle cost and it's also practically a (light) speculation
barrier.

As an optimisation read the current PKRU value and only write the new
one if it is different.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/include/asm/special_insns.h | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Dave Hansen Feb. 25, 2019, 6:08 p.m. UTC | #1
On 2/21/19 3:50 AM, Sebastian Andrzej Siewior wrote:
> @@ -111,6 +111,12 @@ static inline void __write_pkru(u32 pkru)
>  {
>  	u32 ecx = 0, edx = 0;
>  
> +	/*
> +	 * WRPKRU is relatively expensive compared to RDPKRU.
> +	 * Avoid WRPKRU when it would not change the value.
> +	 */
> +	if (pkru == __read_pkru())
> +		return;
>  	/*
>  	 * "wrpkru" instruction.  Loads contents in EAX to PKRU,
>  	 * requires that ecx = edx = 0.

Could we do this in write_pkru() instead?  I tried to keep the __
versions of read and write as simple and symmetric as possible.
Sebastian Andrzej Siewior March 8, 2019, 5:24 p.m. UTC | #2
On 2019-02-25 10:08:10 [-0800], Dave Hansen wrote:
> On 2/21/19 3:50 AM, Sebastian Andrzej Siewior wrote:
> > @@ -111,6 +111,12 @@ static inline void __write_pkru(u32 pkru)
> >  {
> >  	u32 ecx = 0, edx = 0;
> >  
> > +	/*
> > +	 * WRPKRU is relatively expensive compared to RDPKRU.
> > +	 * Avoid WRPKRU when it would not change the value.
> > +	 */
> > +	if (pkru == __read_pkru())
> > +		return;
> >  	/*
> >  	 * "wrpkru" instruction.  Loads contents in EAX to PKRU,
> >  	 * requires that ecx = edx = 0.
> 
> Could we do this in write_pkru() instead?  I tried to keep the __
> versions of read and write as simple and symmetric as possible.

write_pkru() has:
| static inline void write_pkru(u32 pkru)
| {
|         struct pkru_state *pk;
| 
|         if (!boot_cpu_has(X86_FEATURE_OSPKE))
|                 return;
|         
|         pk = get_xsave_addr(&current->thread.fpu.state.xsave, XFEATURE_PKRU);
|         /*
|          * The PKRU value in xstate needs to be in sync with the value that is
|          * written to the CPU. The FPU restore on return to userland would
|          * otherwise load the previous value again.
|          */
|         fpregs_lock();
|         if (pk)
|                 pk->pkru = pkru;
|         __write_pkru(pkru);
|         fpregs_unlock();
| }

and is other callers besides write_pkru():
|arch/x86/include/asm/fpu/internal.h:    __write_pkru(pkru_val);
|arch/x86/kvm/vmx/vmx.c:         __write_pkru(vcpu->arch.pkru);
|arch/x86/kvm/vmx/vmx.c:                 __write_pkru(vmx->host_pkru)

Do want __write_pkru_isn() and __read_pkru_isn() so those are symmetric
and then we would have __write_pkru() which does the read check and
invokes __write_pkru_isn()?

Sebastian
diff mbox series

Patch

diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 43c029cdc3fe8..2d3adeb268e38 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -111,6 +111,12 @@  static inline void __write_pkru(u32 pkru)
 {
 	u32 ecx = 0, edx = 0;
 
+	/*
+	 * WRPKRU is relatively expensive compared to RDPKRU.
+	 * Avoid WRPKRU when it would not change the value.
+	 */
+	if (pkru == __read_pkru())
+		return;
 	/*
 	 * "wrpkru" instruction.  Loads contents in EAX to PKRU,
 	 * requires that ecx = edx = 0.