Message ID | 1556134382-58814-13-git-send-email-fenghua.yu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/split_lock: Enable split lock detection | expand |
On Wed, 24 Apr 2019, Fenghua Yu wrote: > > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx) > +{ > + u64 host_msr_test_ctl; > + > + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) > + return; Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask. > + host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache); > + > + if (host_msr_test_ctl == vmx->msr_test_ctl) { This still assumes that the only bit which can be set in the MSR is that lock detect bit. > + clear_atomic_switch_msr(vmx, MSR_TEST_CTL); > + } else { > + add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl, > + host_msr_test_ctl, false); So what happens here is that if any other bit is set on the host, VMENTER will happily clear it. guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl; That preserves any bits which are not exposed to the guest. But the way more interesting question is why are you exposing the MSR and the bit to the guest at all if the host has split lock detection enabled? That does not make any sense as you basically allow the guest to switch it off and then launch a slowdown attack. If the host has it enabled, then a guest has to be treated like any other process and the #AC trap has to be caught by the hypervisor which then kills the guest. Only if the host has split lock detection disabled, then you can expose it and allow the guest to turn it on and handle it on its own. Thanks, tglx
On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote: > On Wed, 24 Apr 2019, Fenghua Yu wrote: > > > > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx) > > +{ > > + u64 host_msr_test_ctl; > > + > > + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) > > + return; > > Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask. > > > + host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache); > > + > > + if (host_msr_test_ctl == vmx->msr_test_ctl) { > > This still assumes that the only bit which can be set in the MSR is that > lock detect bit. > > > + clear_atomic_switch_msr(vmx, MSR_TEST_CTL); > > + } else { > > + add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl, > > + host_msr_test_ctl, false); > > So what happens here is that if any other bit is set on the host, VMENTER > will happily clear it. There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit 29 and bit 31. Bit 31 is not used in kernel, and here we only need to switch bit 29 between host and guest. So should I also change the name to atomic_switch_split_lock_detect() to indicate that we only switch bit 29? > guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl; > > That preserves any bits which are not exposed to the guest. > > But the way more interesting question is why are you exposing the MSR and > the bit to the guest at all if the host has split lock detection enabled? > > That does not make any sense as you basically allow the guest to switch it > off and then launch a slowdown attack. If the host has it enabled, then a > guest has to be treated like any other process and the #AC trap has to be > caught by the hypervisor which then kills the guest. > > Only if the host has split lock detection disabled, then you can expose it > and allow the guest to turn it on and handle it on its own. Indeed, if we use split lock detection for protection purpose, when host has it enabled we should directly pass it to guest and forbid guest from disabling it. And only when host disables split lock detection, we can expose it and allow the guest to turn it on. If it is used for protection purpose, then it should follow what you said and this feature needs to be disabled by default. Because there are split lock issues in old/current kernels and BIOS. That will cause the existing guest booting failure and killed due to those split lock. If it is only used for debug purpose, I think it might be OK to enable this feature by default and make it indepedent between host and guest? So I think how to handle this feature between host and guest depends on how we use it? Once you give me a decision, I will follow it in next version. > Thanks, > > tglx > >
On Sat, 27 Apr 2019, Xiaoyao Li wrote: > On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote: > > On Wed, 24 Apr 2019, Fenghua Yu wrote: > > > > > > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx) > > > +{ > > > + u64 host_msr_test_ctl; > > > + > > > + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) > > > + return; > > > > Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask. > > > > > + host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache); > > > + > > > + if (host_msr_test_ctl == vmx->msr_test_ctl) { > > > > This still assumes that the only bit which can be set in the MSR is that > > lock detect bit. > > > > > + clear_atomic_switch_msr(vmx, MSR_TEST_CTL); > > > + } else { > > > + add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl, > > > + host_msr_test_ctl, false); > > > > So what happens here is that if any other bit is set on the host, VMENTER > > will happily clear it. > > There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit > 29 and bit 31. Bit 31 is not used in kernel, and here we only need to > switch bit 29 between host and guest. So should I also change the name > to atomic_switch_split_lock_detect() to indicate that we only switch bit > 29? No. Just because we ony use the split lock bit now, there is no jusification to name everything splitlock. This is going to have renamed when yet another bit is added in the future. The MSR is exposed to the guest and the restriction of bits happens to be splitlock today. > > guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl; > > > > That preserves any bits which are not exposed to the guest. > > > > But the way more interesting question is why are you exposing the MSR and > > the bit to the guest at all if the host has split lock detection enabled? > > > > That does not make any sense as you basically allow the guest to switch it > > off and then launch a slowdown attack. If the host has it enabled, then a > > guest has to be treated like any other process and the #AC trap has to be > > caught by the hypervisor which then kills the guest. > > > > Only if the host has split lock detection disabled, then you can expose it > > and allow the guest to turn it on and handle it on its own. > > Indeed, if we use split lock detection for protection purpose, when host > has it enabled we should directly pass it to guest and forbid guest from > disabling it. And only when host disables split lock detection, we can > expose it and allow the guest to turn it on. ? > If it is used for protection purpose, then it should follow what you said and > this feature needs to be disabled by default. Because there are split lock > issues in old/current kernels and BIOS. That will cause the existing guest > booting failure and killed due to those split lock. Rightfully so. > If it is only used for debug purpose, I think it might be OK to enable this > feature by default and make it indepedent between host and guest? No. It does not make sense. > So I think how to handle this feature between host and guest depends on how we > use it? Once you give me a decision, I will follow it in next version. As I said: The host kernel makes the decision. If the host kernel has it enabled then the guest is not allowed to change it. If the guest triggers an #AC it will be killed. If the host kernel has it disabled then the guest can enable it for it's own purposes. Thanks, tglx
On 4/28/2019 3:09 PM, Thomas Gleixner wrote: > On Sat, 27 Apr 2019, Xiaoyao Li wrote: >> On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote: >>> On Wed, 24 Apr 2019, Fenghua Yu wrote: >>>> >>>> +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx) >>>> +{ >>>> + u64 host_msr_test_ctl; >>>> + >>>> + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) >>>> + return; >>> >>> Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask. >>> >>>> + host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache); >>>> + >>>> + if (host_msr_test_ctl == vmx->msr_test_ctl) { >>> >>> This still assumes that the only bit which can be set in the MSR is that >>> lock detect bit. >>> >>>> + clear_atomic_switch_msr(vmx, MSR_TEST_CTL); >>>> + } else { >>>> + add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl, >>>> + host_msr_test_ctl, false); >>> >>> So what happens here is that if any other bit is set on the host, VMENTER >>> will happily clear it. >> >> There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit >> 29 and bit 31. Bit 31 is not used in kernel, and here we only need to >> switch bit 29 between host and guest. So should I also change the name >> to atomic_switch_split_lock_detect() to indicate that we only switch bit >> 29? > > No. Just because we ony use the split lock bit now, there is no > jusification to name everything splitlock. This is going to have renamed > when yet another bit is added in the future. The MSR is exposed to the > guest and the restriction of bits happens to be splitlock today. Got it. >>> guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl; >>> >>> That preserves any bits which are not exposed to the guest. >>> >>> But the way more interesting question is why are you exposing the MSR and >>> the bit to the guest at all if the host has split lock detection enabled? >>> >>> That does not make any sense as you basically allow the guest to switch it >>> off and then launch a slowdown attack. If the host has it enabled, then a >>> guest has to be treated like any other process and the #AC trap has to be >>> caught by the hypervisor which then kills the guest. >>> >>> Only if the host has split lock detection disabled, then you can expose it >>> and allow the guest to turn it on and handle it on its own. >> >> Indeed, if we use split lock detection for protection purpose, when host >> has it enabled we should directly pass it to guest and forbid guest from >> disabling it. And only when host disables split lock detection, we can >> expose it and allow the guest to turn it on. > ? >> If it is used for protection purpose, then it should follow what you said and >> this feature needs to be disabled by default. Because there are split lock >> issues in old/current kernels and BIOS. That will cause the existing guest >> booting failure and killed due to those split lock. > > Rightfully so. So, the patch 13 "Enable split lock detection by default" needs to be removed? >> If it is only used for debug purpose, I think it might be OK to enable this >> feature by default and make it indepedent between host and guest? > > No. It does not make sense. > >> So I think how to handle this feature between host and guest depends on how we >> use it? Once you give me a decision, I will follow it in next version. > > As I said: The host kernel makes the decision. > > If the host kernel has it enabled then the guest is not allowed to change > it. If the guest triggers an #AC it will be killed. > > If the host kernel has it disabled then the guest can enable it for it's > own purposes. > > Thanks, > > tglx >
Hi, Thomas, Base on your comments, I plan to make the design as following: 1) When host enables this feature, there is no switch between host and guest that guest running with it enabled by force. Since #AC in exception bitmap is set in current kvm, every #AC in guest will be trapped. And in handle_exception() handler in kvm, if #AC is caused by alignment check, kvm injects #AC back to guest; if #AC is caused by split lock, kvm sends a SIGBUS to userspace. 2) When host disables this feature, there needs atomic switch between host and guest if different. And in handle_exception() handler in kvm, we can just inject #AC back to guest, and let guest to handle it. Besides, I think there might be an optimization for case #1. When host has it enabled and guest also has it enabled, I think it's OK to inject #AC back to guest, not directly kill the guest. Because guest kernel has it enabled means it knows what this feature is and it also want to get aware of and fault every split lock. At this point, if guest has it enabled, we can leave it to guest. Only when guest's configuration is having it disabled, can it be regards as potentially harmful that we kill the guest once there is a #AC due to split lock. How do you think about the design and this optimization? Hi, Paolo, What's your opinion about this design of split lock in KVM? Thanks. On 4/28/2019 3:09 PM, Thomas Gleixner wrote: > On Sat, 27 Apr 2019, Xiaoyao Li wrote: >> On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote: >>> But the way more interesting question is why are you exposing the MSR and >>> the bit to the guest at all if the host has split lock detection enabled? >>> >>> That does not make any sense as you basically allow the guest to switch it >>> off and then launch a slowdown attack. If the host has it enabled, then a >>> guest has to be treated like any other process and the #AC trap has to be >>> caught by the hypervisor which then kills the guest. >>> >>> Only if the host has split lock detection disabled, then you can expose it >>> and allow the guest to turn it on and handle it on its own. >> >> Indeed, if we use split lock detection for protection purpose, when host >> has it enabled we should directly pass it to guest and forbid guest from >> disabling it. And only when host disables split lock detection, we can >> expose it and allow the guest to turn it on. > ? >> If it is used for protection purpose, then it should follow what you said and >> this feature needs to be disabled by default. Because there are split lock >> issues in old/current kernels and BIOS. That will cause the existing guest >> booting failure and killed due to those split lock. > > Rightfully so. > >> If it is only used for debug purpose, I think it might be OK to enable this >> feature by default and make it indepedent between host and guest? > > No. It does not make sense. > >> So I think how to handle this feature between host and guest depends on how we >> use it? Once you give me a decision, I will follow it in next version. > > As I said: The host kernel makes the decision. > > If the host kernel has it enabled then the guest is not allowed to change > it. If the guest triggers an #AC it will be killed. > > If the host kernel has it disabled then the guest can enable it for it's > own purposes. > > Thanks, > > tglx >
On Sun, 28 Apr 2019, Xiaoyao Li wrote: > On 4/28/2019 3:09 PM, Thomas Gleixner wrote: > > On Sat, 27 Apr 2019, Xiaoyao Li wrote: > > > Indeed, if we use split lock detection for protection purpose, when host > > > has it enabled we should directly pass it to guest and forbid guest from > > > disabling it. And only when host disables split lock detection, we can > > > expose it and allow the guest to turn it on. > > ? > > > If it is used for protection purpose, then it should follow what you said > > > and > > > this feature needs to be disabled by default. Because there are split lock > > > issues in old/current kernels and BIOS. That will cause the existing guest > > > booting failure and killed due to those split lock. > > > > Rightfully so. > > So, the patch 13 "Enable split lock detection by default" needs to be removed? Why? No. We enable it by default and everything which violates the rules gets what it deserves. If there is an issue, boot with ac_splitlock_off and be done with it. Thanks, tglx
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index b4e7d645275a..bbb9859350b5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -1663,6 +1663,11 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) u32 index; switch (msr_info->index) { + case MSR_TEST_CTL: + if (!vmx->msr_test_ctl_mask) + return 1; + msr_info->data = vmx->msr_test_ctl; + break; #ifdef CONFIG_X86_64 case MSR_FS_BASE: msr_info->data = vmcs_readl(GUEST_FS_BASE); @@ -1797,6 +1802,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) u32 index; switch (msr_index) { + case MSR_TEST_CTL: + if (!vmx->msr_test_ctl_mask || + (data & vmx->msr_test_ctl_mask) != data) + return 1; + vmx->msr_test_ctl = data; + break; case MSR_EFER: ret = kvm_set_msr_common(vcpu, msr_info); break; @@ -4106,6 +4117,16 @@ static void vmx_vcpu_setup(struct vcpu_vmx *vmx) } } +static u64 vmx_get_msr_test_ctl_mask(struct kvm_vcpu *vcpu) +{ + u64 mask = 0; + + if (vcpu->arch.core_capability & CORE_CAP_SPLIT_LOCK_DETECT) + mask |= TEST_CTL_SPLIT_LOCK_DETECT; + + return mask; +} + static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -4114,6 +4135,8 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx->rmode.vm86_active = 0; vmx->spec_ctrl = 0; + vmx->msr_test_ctl = 0; + vmx->msr_test_ctl_mask = vmx_get_msr_test_ctl_mask(vcpu); vcpu->arch.microcode_version = 0x100000000ULL; vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val(); @@ -6313,6 +6336,23 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) msrs[i].host, false); } +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx) +{ + u64 host_msr_test_ctl; + + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) + return; + + host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache); + + if (host_msr_test_ctl == vmx->msr_test_ctl) { + clear_atomic_switch_msr(vmx, MSR_TEST_CTL); + } else { + add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl, + host_msr_test_ctl, false); + } +} + static void vmx_arm_hv_timer(struct vcpu_vmx *vmx, u32 val) { vmcs_write32(VMX_PREEMPTION_TIMER_VALUE, val); @@ -6421,6 +6461,8 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu) atomic_switch_perf_msrs(vmx); + atomic_switch_msr_test_ctl(vmx); + vmx_update_hv_timer(vcpu); /* diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index f879529906b4..8690a1295548 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -190,6 +190,8 @@ struct vcpu_vmx { u64 msr_guest_kernel_gs_base; #endif + u64 msr_test_ctl; + u64 msr_test_ctl_mask; u64 spec_ctrl; u32 vm_entry_controls_shadow; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e88be97d47b9..60aaf75d0fe5 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1231,7 +1231,24 @@ EXPORT_SYMBOL_GPL(kvm_get_arch_capabilities); static u64 kvm_get_core_capability(void) { - return 0; + u64 data = 0; + + if (boot_cpu_has(X86_FEATURE_CORE_CAPABILITY)) { + rdmsrl(MSR_IA32_CORE_CAPABILITY, data); + + /* mask non-virtualizable functions */ + data &= CORE_CAP_SPLIT_LOCK_DETECT; + } else if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) { + /* + * There will be a list of FMS values that have split lock + * detection but lack the CORE CAPABILITY MSR. In this case, + * set CORE_CAP_SPLIT_LOCK_DETECT since we emulate + * MSR CORE_CAPABILITY. + */ + data |= CORE_CAP_SPLIT_LOCK_DETECT; + } + + return data; } static int kvm_get_msr_feature(struct kvm_msr_entry *msr)