diff mbox series

[2/3] KVM: SVM: Manually zero/restore DEBUGCTL if LBR virtualization is disabled

Message ID 20250224181315.2376869-3-seanjc@google.com (mailing list archive)
State New
Headers show
Series KVM: SVM: Zero DEBUGCTL before VMRUN if necessary | expand

Commit Message

Sean Christopherson Feb. 24, 2025, 6:13 p.m. UTC
Manually zero DEBUGCTL prior to VMRUN if the host's value is non-zero and
LBR virtualization is disabled, as hardware only context switches DEBUGCTL
if LBR virtualization is fully enabled.  Running the guest with the host's
value has likely been mildly problematic for quite some time, e.g. it will
result in undesirable behavior if host is running with BTF=1.

But the bug became fatal with the introduction of Bus Lock Trap ("Detect"
in kernel paralance) support for AMD (commit 408eb7417a92
("x86/bus_lock: Add support for AMD")), as a bus lock in the guest will
trigger an unexpected #DB.

Note, suppressing the bus lock #DB, i.e. simply resuming the guest without
injecting a #DB, is not an option.  It wouldn't address the general issue
with DEBUGCTL, e.g. for things like BTF, and there are other guest-visible
side effects if BusLockTrap is left enabled.

If BusLockTrap is disabled, then DR6.BLD is reserved-to-1; any attempts to
clear it by software are ignored.  But if BusLockTrap is enabled, software
can clear DR6.BLD:

  Software enables bus lock trap by setting DebugCtl MSR[BLCKDB] (bit 2)
  to 1.  When bus lock trap is enabled, ... The processor indicates that
  this #DB was caused by a bus lock by clearing DR6[BLD] (bit 11).  DR6[11]
  previously had been defined to be always 1.

and clearing DR6.BLD is "sticky" in that it's not set (i.e. lowered) by
other #DBs:

  All other #DB exceptions leave DR6[BLD] unmodified

E.g. leaving BusLockTrap enable can confuse a legacy guest that writes '0'
to reset DR6.

Reported-by: rangemachine@gmail.com
Reported-by: whanos@sergal.fun
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219787
Closes: https://lore.kernel.org/all/bug-219787-28872@https.bugzilla.kernel.org%2F
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

Comments

Ravi Bangoria Feb. 26, 2025, 6:15 a.m. UTC | #1
Hi Sean,

On 24-Feb-25 11:43 PM, Sean Christopherson wrote:
> Manually zero DEBUGCTL prior to VMRUN if the host's value is non-zero and
> LBR virtualization is disabled, as hardware only context switches DEBUGCTL
> if LBR virtualization is fully enabled.  Running the guest with the host's
> value has likely been mildly problematic for quite some time, e.g. it will
> result in undesirable behavior if host is running with BTF=1.
> 
> But the bug became fatal with the introduction of Bus Lock Trap ("Detect"
> in kernel paralance) support for AMD (commit 408eb7417a92
> ("x86/bus_lock: Add support for AMD")), as a bus lock in the guest will
> trigger an unexpected #DB.
> 
> Note, suppressing the bus lock #DB, i.e. simply resuming the guest without
> injecting a #DB, is not an option.  It wouldn't address the general issue
> with DEBUGCTL, e.g. for things like BTF, and there are other guest-visible
> side effects if BusLockTrap is left enabled.
> 
> If BusLockTrap is disabled, then DR6.BLD is reserved-to-1; any attempts to
> clear it by software are ignored.  But if BusLockTrap is enabled, software
> can clear DR6.BLD:
> 
>   Software enables bus lock trap by setting DebugCtl MSR[BLCKDB] (bit 2)
>   to 1.  When bus lock trap is enabled, ... The processor indicates that
>   this #DB was caused by a bus lock by clearing DR6[BLD] (bit 11).  DR6[11]
>   previously had been defined to be always 1.
> 
> and clearing DR6.BLD is "sticky" in that it's not set (i.e. lowered) by
> other #DBs:
> 
>   All other #DB exceptions leave DR6[BLD] unmodified
> 
> E.g. leaving BusLockTrap enable can confuse a legacy guest that writes '0'
> to reset DR6.

What if guest sets DEBUGCTL[BusLockTrapEn] and runs an application which
causes a bus lock? Guest will receive #DB due to bus lock, even though
guest CPUID says BusLockTrap isn't supported. Should KVM prevent guest
to write to DEBUGCTL[BusLockTrapEn]? Something like:

---
@@ -3168,6 +3168,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		if (data & DEBUGCTL_RESERVED_BITS)
 			return 1;
 
+		if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
+		    !guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
+			return 1;
+
 		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
 		svm_update_lbrv(vcpu);
 		break;
---

Thanks,
Ravi
Sean Christopherson Feb. 26, 2025, 3:42 p.m. UTC | #2
On Wed, Feb 26, 2025, Ravi Bangoria wrote:
> Hi Sean,
> 
> On 24-Feb-25 11:43 PM, Sean Christopherson wrote:
> > Manually zero DEBUGCTL prior to VMRUN if the host's value is non-zero and
> > LBR virtualization is disabled, as hardware only context switches DEBUGCTL
> > if LBR virtualization is fully enabled.  Running the guest with the host's
> > value has likely been mildly problematic for quite some time, e.g. it will
> > result in undesirable behavior if host is running with BTF=1.
> > 
> > But the bug became fatal with the introduction of Bus Lock Trap ("Detect"
> > in kernel paralance) support for AMD (commit 408eb7417a92
> > ("x86/bus_lock: Add support for AMD")), as a bus lock in the guest will
> > trigger an unexpected #DB.
> > 
> > Note, suppressing the bus lock #DB, i.e. simply resuming the guest without
> > injecting a #DB, is not an option.  It wouldn't address the general issue
> > with DEBUGCTL, e.g. for things like BTF, and there are other guest-visible
> > side effects if BusLockTrap is left enabled.
> > 
> > If BusLockTrap is disabled, then DR6.BLD is reserved-to-1; any attempts to
> > clear it by software are ignored.  But if BusLockTrap is enabled, software
> > can clear DR6.BLD:
> > 
> >   Software enables bus lock trap by setting DebugCtl MSR[BLCKDB] (bit 2)
> >   to 1.  When bus lock trap is enabled, ... The processor indicates that
> >   this #DB was caused by a bus lock by clearing DR6[BLD] (bit 11).  DR6[11]
> >   previously had been defined to be always 1.
> > 
> > and clearing DR6.BLD is "sticky" in that it's not set (i.e. lowered) by
> > other #DBs:
> > 
> >   All other #DB exceptions leave DR6[BLD] unmodified
> > 
> > E.g. leaving BusLockTrap enable can confuse a legacy guest that writes '0'
> > to reset DR6.
> 
> What if guest sets DEBUGCTL[BusLockTrapEn] and runs an application which
> causes a bus lock? Guest will receive #DB due to bus lock, even though
> guest CPUID says BusLockTrap isn't supported. Should KVM prevent guest
> to write to DEBUGCTL[BusLockTrapEn]? Something like:

Ugh, right, AMD's legacy DEBUGCTL_RESERVED_BITS weirdness.  Ideally, KVM would
make bits 5:2 reserved.  I suspect we could get away with that, because VMX has
rejected all bits except BTF and LBR since the beginning.  But I really, really
don't want to deal with more guest breakage due to sending such a change to
stable kernels, so for an immediate fix, I'll add a patch to drop those bits.
That'll still be a guest-visible change, e.g. if the guest is enabling LBRs *and*
the legacy PBi bits, then the state of the PBi bits would be accurate.  But given
KVM's craptastic handling of DEBUGCTL, I highly doubt dropping bits 5:2 will break
anything.

*sigh*

And that's exposes yet another bug in this code.  Zeroing DEBUGCTL before VMRUN
is wrong if the guest has enabled BTF.  KVM should *load* the guest's desired
value if DEBUGCTL == BTF, i.e. if BTF is enabled but LBRs are not.

> ---
> @@ -3168,6 +3168,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		if (data & DEBUGCTL_RESERVED_BITS)
>  			return 1;
>  
> +		if ((data & DEBUGCTLMSR_BUS_LOCK_DETECT) &&
> +		    !guest_cpu_cap_has(vcpu, X86_FEATURE_BUS_LOCK_DETECT))
> +			return 1;
> +
>  		svm_get_lbr_vmcb(svm)->save.dbgctl = data;
>  		svm_update_lbrv(vcpu);
>  		break;
> ---
> 
> Thanks,
> Ravi
diff mbox series

Patch

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b8aa0f36850f..d5519e592cb3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4253,6 +4253,16 @@  static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
 	clgi();
 	kvm_load_guest_xsave_state(vcpu);
 
+	/*
+	 * Hardware only context switches DEBUGCTL if LBR virtualization is
+	 * enabled.  Manually zero DEBUGCTL if necessary (and restore it after
+	 * VM-Exit), as running with the host's DEBUGCTL can negatively affect
+	 * guest state and can even be fatal, e.g. due to Bus Lock Detect.
+	 */
+	if (vcpu->arch.host_debugctl &&
+	    !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK))
+		update_debugctlmsr(0);
+
 	kvm_wait_lapic_expire(vcpu);
 
 	/*
@@ -4280,6 +4290,10 @@  static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu,
 	if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
 
+	if (vcpu->arch.host_debugctl &&
+	    !(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK))
+		update_debugctlmsr(vcpu->arch.host_debugctl);
+
 	kvm_load_host_xsave_state(vcpu);
 	stgi();