[2/2] KVM: VMX: Use LEAVE in vmx_do_interrupt_irqoff()

Message ID	20250414081131.97374-2-ubizjak@gmail.com (mailing list archive)
State	New
Series	[1/2] KVM: x86: Use asm_inline() instead of asm() in kvm_hypercall[0-4]()
	[2/2] KVM: VMX: Use LEAVE in vmx_do_interrupt_irqoff()

Headers

From: Uros Bizjak <ubizjak@gmail.com>
To: kvm@vger.kernel.org,
	x86@kernel.org,
	linux-kernel@vger.kernel.org
Cc: Uros Bizjak <ubizjak@gmail.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: [PATCH 2/2] KVM: VMX: Use LEAVE in vmx_do_interrupt_irqoff()
Date: Mon, 14 Apr 2025 10:10:51 +0200
Message-ID: <20250414081131.97374-2-ubizjak@gmail.com>
In-Reply-To: <20250414081131.97374-1-ubizjak@gmail.com>
References: <20250414081131.97374-1-ubizjak@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Commit Message

Uros Bizjak April 14, 2025, 8:10 a.m. UTC

Micro-optimize vmx_do_interrupt_irqoff() by replacing the
MOV %RBP,%RSP; POP %RBP instruction sequence with the equivalent
LEAVE instruction. GCC does this by default for generic tuning
and for all modern processors:

DEF_TUNE (X86_TUNE_USE_LEAVE, "use_leave",
	  m_386 | m_CORE_ALL | m_K6_GEODE | m_AMD_MULTIPLE | m_ZHAOXIN
	  | m_TREMONT | m_CORE_HYBRID | m_CORE_ATOM | m_GENERIC)

The new code also saves a couple of bytes, from:

  27:	48 89 ec             	mov    %rbp,%rsp
  2a:	5d                   	pop    %rbp

to:

  27:	c9                   	leave

No functional change intended.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/kvm/vmx/vmenter.S | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
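
The equivalence claimed in the commit message can be illustrated with a toy model. The sketch below is not kernel code: the struct, the helper names and the 16-word stack are all hypothetical, and the registers are modelled as array indices rather than addresses. It only shows that restoring %rsp from %rbp and then popping %rbp ends in the same state as LEAVE, no matter how far %rsp has moved below the frame pointer (for example because of alignment padding).

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16	/* toy stack size, chosen arbitrarily */

/* Toy machine state: rsp and rbp are indices into stack[], not addresses. */
struct cpu {
	uint64_t rsp;
	uint64_t rbp;
	uint64_t stack[STACK_WORDS];
};

/* push: the stack grows down, so decrement rsp and then store. */
static void push(struct cpu *c, uint64_t val)
{
	assert(c->rsp > 0);
	c->stack[--c->rsp] = val;
}

/* pop: load from the top of the stack and then increment rsp. */
static uint64_t pop(struct cpu *c)
{
	assert(c->rsp < STACK_WORDS);
	return c->stack[c->rsp++];
}

/* Old epilogue: mov %rbp,%rsp followed by pop %rbp. */
static void epilogue_mov_pop(struct cpu *c)
{
	c->rsp = c->rbp;
	c->rbp = pop(c);
}

/*
 * New epilogue: leave, modelled as a single step that reads the saved
 * frame pointer through rbp and then updates both registers.
 */
static void epilogue_leave(struct cpu *c)
{
	uint64_t saved_rbp = c->stack[c->rbp];

	c->rsp = c->rbp + 1;
	c->rbp = saved_rbp;
}

/*
 * Build a frame: push rbp; mov rsp,rbp; then move rsp further down by
 * `extra` words to mimic alignment padding and additional pushes.
 */
static struct cpu make_frame(uint64_t caller_rbp, unsigned int extra)
{
	struct cpu c = { .rsp = STACK_WORDS, .rbp = caller_rbp };

	push(&c, 0x1234);	/* stand-in for the return address */
	push(&c, c.rbp);	/* push %rbp */
	c.rbp = c.rsp;		/* mov %rsp,%rbp */
	while (extra--)
		push(&c, 0);	/* anything pushed below the frame pointer */
	return c;
}

/* Returns 1 if both epilogues leave rsp and rbp in the same state. */
static int epilogues_agree(uint64_t caller_rbp, unsigned int extra)
{
	struct cpu a = make_frame(caller_rbp, extra);
	struct cpu b = a;

	epilogue_mov_pop(&a);
	epilogue_leave(&b);
	return a.rsp == b.rsp && a.rbp == b.rbp;
}
```

In this model, epilogues_agree() returns 1 for any `extra` that fits in the toy stack, and after either epilogue rsp points at the slot holding the return-address stand-in, which is what the following RET consumes.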

Comments

Sean Christopherson April 15, 2025, 1:05 a.m. UTC | #1

On Mon, Apr 14, 2025, Uros Bizjak wrote:
> Micro-optimize vmx_do_interrupt_irqoff() by replacing the
> MOV %RBP,%RSP; POP %RBP instruction sequence with the equivalent
> LEAVE instruction. GCC does this by default for generic tuning
> and for all modern processors:

Out of curiosity, is LEAVE actually a performance win, or is the benefit essentially
just the few code bytes saved?

> DEF_TUNE (X86_TUNE_USE_LEAVE, "use_leave",
> 	  m_386 | m_CORE_ALL | m_K6_GEODE | m_AMD_MULTIPLE | m_ZHAOXIN
> 	  | m_TREMONT | m_CORE_HYBRID | m_CORE_ATOM | m_GENERIC)
> 
> The new code also saves a couple of bytes, from:
> 
>   27:	48 89 ec             	mov    %rbp,%rsp
>   2a:	5d                   	pop    %rbp
> 
> to:
> 
>   27:	c9                   	leave
> 
> No functional change intended.
> 
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> ---
>  arch/x86/kvm/vmx/vmenter.S | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
> index f6986dee6f8c..0a6cf5bff2aa 100644
> --- a/arch/x86/kvm/vmx/vmenter.S
> +++ b/arch/x86/kvm/vmx/vmenter.S
> @@ -59,8 +59,7 @@
>  	 * without the explicit restore, thinks the stack is getting walloped.
>  	 * Using an unwind hint is problematic due to x86-64's dynamic alignment.
>  	 */
> -	mov %_ASM_BP, %_ASM_SP
> -	pop %_ASM_BP
> +	leave
>  	RET
>  .endm
>  
> -- 
> 2.49.0
>

Uros Bizjak April 15, 2025, 7:42 a.m. UTC | #2

On Tue, Apr 15, 2025 at 3:05 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Apr 14, 2025, Uros Bizjak wrote:
> > Micro-optimize vmx_do_interrupt_irqoff() by replacing the
> > MOV %RBP,%RSP; POP %RBP instruction sequence with the equivalent
> > LEAVE instruction. GCC does this by default for generic tuning
> > and for all modern processors:
>
> Out of curiosity, is LEAVE actually a performance win, or is the benefit essentially
> just the few code bytes saved?

It is hard to say for out-of-order execution cores, especially when
the stack engine is thrown into the mix (these two instructions, plus
the following RET, all update %rsp).

The pragmatic solution was to do what the compiler does and use the
compiler's choice, based on the tuning below.

> > DEF_TUNE (X86_TUNE_USE_LEAVE, "use_leave",
> >         m_386 | m_CORE_ALL | m_K6_GEODE | m_AMD_MULTIPLE | m_ZHAOXIN
> >         | m_TREMONT | m_CORE_HYBRID | m_CORE_ATOM | m_GENERIC)

The tuning is updated when a new target is introduced to the compiler
and is based on various measurements by the processor manufacturer.
The above covers the majority of recent processors (plus generic
tuning), so I guess we won't go wrong by following suit. OTOH, any
performance difference will be negligible.

> > The new code also saves a couple of bytes, from:
> >
> >   27: 48 89 ec                mov    %rbp,%rsp
> >   2a: 5d                      pop    %rbp
> >
> > to:
> >
> >   27: c9                      leave

Thanks,
Uros.
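
The size difference quoted above can be checked directly from the opcode bytes in the disassembly. This is a minimal sketch; the array and function names are made up for illustration, and the bytes are the 64-bit encodings shown in the commit message (the 32-bit MOV encoding has no 0x48 prefix, so the saving there would be smaller).

```c
#include <assert.h>
#include <stddef.h>

/* Encodings copied from the disassembly in the commit message. */
static const unsigned char old_epilogue[] = {
	0x48, 0x89, 0xec,	/* mov %rbp,%rsp */
	0x5d,			/* pop %rbp */
};

static const unsigned char new_epilogue[] = {
	0xc9,			/* leave */
};

/* Compile-time check that the new sequence is the shorter one. */
static_assert(sizeof(new_epilogue) < sizeof(old_epilogue),
	      "LEAVE should encode in fewer bytes");

/* Bytes saved each time the epilogue is emitted. */
static size_t bytes_saved(void)
{
	return sizeof(old_epilogue) - sizeof(new_epilogue);
}
```

With these encodings, bytes_saved() returns 3: four bytes for the MOV/POP pair against one byte for LEAVE.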

diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index f6986dee6f8c..0a6cf5bff2aa 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -59,8 +59,7 @@ 
 	 * without the explicit restore, thinks the stack is getting walloped.
 	 * Using an unwind hint is problematic due to x86-64's dynamic alignment.
 	 */
-	mov %_ASM_BP, %_ASM_SP
-	pop %_ASM_BP
+	leave
 	RET
 .endm
