[v8,12/15] kvm/vmx: Emulate MSR TEST_CTL

Message ID	1556134382-58814-13-git-send-email-fenghua.yu@intel.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Fenghua Yu <fenghua.yu@intel.com> To: "Thomas Gleixner" <tglx@linutronix.de>, "Ingo Molnar" <mingo@redhat.com>, "Borislav Petkov" <bp@alien8.de>, "H Peter Anvin" <hpa@zytor.com>, "Paolo Bonzini" <pbonzini@redhat.com>, "Dave Hansen" <dave.hansen@intel.com>, "Ashok Raj" <ashok.raj@intel.com>, "Peter Zijlstra" <peterz@infradead.org>, "Ravi V Shankar" <ravi.v.shankar@intel.com>, "Xiaoyao Li " <xiaoyao.li@intel.com>, "Christopherson Sean J" <sean.j.christopherson@intel.com>, "Kalle Valo" <kvalo@codeaurora.org>, "Michael Chan" <michael.chan@broadcom.com> Cc: "linux-kernel" <linux-kernel@vger.kernel.org>, "x86" <x86@kernel.org>, kvm@vger.kernel.org, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, Xiaoyao Li <xiaoyao.li@linux.intel.com>, Fenghua Yu <fenghua.yu@intel.com> Subject: [PATCH v8 12/15] kvm/vmx: Emulate MSR TEST_CTL Date: Wed, 24 Apr 2019 12:32:59 -0700 Message-Id: <1556134382-58814-13-git-send-email-fenghua.yu@intel.com> In-Reply-To: <1556134382-58814-1-git-send-email-fenghua.yu@intel.com> References: <1556134382-58814-1-git-send-email-fenghua.yu@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk
Series	x86/split_lock: Enable split lock detection \| expand [v8,00/15] x86/split_lock: Enable split lock detection [v8,01/15] x86/common: Align cpu_caps_cleared and cpu_caps_set to unsigned long [v8,02/15] drivers/net/b44: Align pwol_mask to unsigned long for better performance [v8,03/15] wlcore: simplify/fix/optimize reg_ch_conf_pending operations [v8,04/15] x86/split_lock: Align x86_capability to unsigned long to avoid split locked access [v8,05/15] x86/msr-index: Define MSR_IA32_CORE_CAPABILITY and split lock detection bit [v8,06/15] x86/cpufeatures: Enumerate MSR_IA32_CORE_CAPABILITY [v8,07/15] x86/split_lock: Enumerate split lock detection by MSR_IA32_CORE_CAPABILITY [v8,08/15] x86/split_lock: Enumerate split lock detection on Icelake mobile processor [v8,09/15] x86/split_lock: Define MSR TEST_CTL register [v8,10/15] x86/split_lock: Handle #AC exception for split lock [v8,11/15] kvm/x86: Emulate MSR IA32_CORE_CAPABILITY [v8,12/15] kvm/vmx: Emulate MSR TEST_CTL [v8,13/15] x86/split_lock: Enable split lock detection by default [v8,14/15] x86/split_lock: Disable split lock detection by kernel parameter "nosplit_lock_detect" [v8,15/15] x86/split_lock: Add a sysfs interface to enable/disable split lock detection during run …

Fenghua Yu April 24, 2019, 7:32 p.m. UTC

From: Xiaoyao Li <xiaoyao.li@linux.intel.com>

A control bit (bit 29) in TEST_CTL MSR 0x33 will be introduced in
future x86 processors. When bit 29 is set, the processor causes #AC
exception for split locked accesses at all CPL.

Please check the latest Intel 64 and IA-32 Architectures Software
Developer's Manual for more detailed information on the MSR and
the split lock bit.

This patch emulates MSR_TEST_CTL with vmx->msr_test_ctl and does the
following:
1. As MSR TEST_CTL of guest is emulated, enable the related bit
in CORE_CAPABILITY to correctly report this feature to guest.

2. Differentiate MSR_TEST_CTL between host and guest.

To avoid costly RDMSR of TEST_CTL when switching between host and guest
during vmentry, read per CPU variable msr_test_ctl_cache which caches
the MSR value.

Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
---
Changes in v7:
  - Add vmx->msr_test_ctl_mask to indicate the valid bits of
  guest's MSR_TEST_CTL.
  - Add X86_FEATURE_SPLIT_LOCK_DETECT check to determine if it needs
  switch MSR_TEST_CTL.
  - Use msr_test_ctl_cache to replace costly RDMSR.
  - minimal adjustment in kvm_get_core_capability(), making it more
  clear.

 arch/x86/kvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.h |  2 ++
 arch/x86/kvm/x86.c     | 19 ++++++++++++++++++-
 3 files changed, 62 insertions(+), 1 deletion(-)

Thomas Gleixner April 25, 2019, 7:42 a.m. UTC | #1

On Wed, 24 Apr 2019, Fenghua Yu wrote:
>  
> +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
> +{
> +	u64 host_msr_test_ctl;
> +
> +	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> +		return;

Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask.

> +	host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);
> +
> +	if (host_msr_test_ctl == vmx->msr_test_ctl) {

This still assumes that the only bit which can be set in the MSR is that
lock detect bit.

> +		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
> +	} else {
> +		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
> +				      host_msr_test_ctl, false);

So what happens here is that if any other bit is set on the host, VMENTER
will happily clear it.

     guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl;

That preserves any bits which are not exposed to the guest.

But the way more interesting question is why are you exposing the MSR and
the bit to the guest at all if the host has split lock detection enabled?

That does not make any sense as you basically allow the guest to switch it
off and then launch a slowdown attack. If the host has it enabled, then a
guest has to be treated like any other process and the #AC trap has to be
caught by the hypervisor which then kills the guest.

Only if the host has split lock detection disabled, then you can expose it
and allow the guest to turn it on and handle it on its own.

Thanks,

	tglx

Xiaoyao Li April 27, 2019, 12:20 p.m. UTC | #2

On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote:
> On Wed, 24 Apr 2019, Fenghua Yu wrote:
> >  
> > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
> > +{
> > +	u64 host_msr_test_ctl;
> > +
> > +	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> > +		return;
> 
> Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask.
> 
> > +	host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);
> > +
> > +	if (host_msr_test_ctl == vmx->msr_test_ctl) {
> 
> This still assumes that the only bit which can be set in the MSR is that
> lock detect bit.
> 
> > +		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
> > +	} else {
> > +		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
> > +				      host_msr_test_ctl, false);
> 
> So what happens here is that if any other bit is set on the host, VMENTER
> will happily clear it.

There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit 29 and
bit 31. Bit 31 is not used in kernel, and here we only need to switch bit 29
between host and guest. 
So should I also change the name to atomic_switch_split_lock_detect() to
indicate that we only switch bit 29?

>      guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl;
> 
> That preserves any bits which are not exposed to the guest.
> 
> But the way more interesting question is why are you exposing the MSR and
> the bit to the guest at all if the host has split lock detection enabled?
> 
> That does not make any sense as you basically allow the guest to switch it
> off and then launch a slowdown attack. If the host has it enabled, then a
> guest has to be treated like any other process and the #AC trap has to be
> caught by the hypervisor which then kills the guest.
> 
> Only if the host has split lock detection disabled, then you can expose it
> and allow the guest to turn it on and handle it on its own.

Indeed, if we use split lock detection for protection purpose, when host has it
enabled we should directly pass it to guest and forbid guest from disabling it.
And only when host disables split lock detection, we can expose it and allow the
guest to turn it on.

If it is used for protection purpose, then it should follow what you said and
this feature needs to be disabled by default. Because there are split lock
issues in old/current kernels and BIOS. That will cause the existing guest
booting failure and killed due to those split lock.

If it is only used for debug purpose, I think it might be OK to enable this
feature by default and make it indepedent between host and guest?

So I think how to handle this feature between host and guest depends on how we
use it? Once you give me a decision, I will follow it in next version.

> Thanks,
> 
> 	tglx
> 
>

Thomas Gleixner April 28, 2019, 7:09 a.m. UTC | #3

On Sat, 27 Apr 2019, Xiaoyao Li wrote:
> On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote:
> > On Wed, 24 Apr 2019, Fenghua Yu wrote:
> > >  
> > > +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
> > > +{
> > > +	u64 host_msr_test_ctl;
> > > +
> > > +	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> > > +		return;
> > 
> > Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask.
> > 
> > > +	host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);
> > > +
> > > +	if (host_msr_test_ctl == vmx->msr_test_ctl) {
> > 
> > This still assumes that the only bit which can be set in the MSR is that
> > lock detect bit.
> > 
> > > +		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
> > > +	} else {
> > > +		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
> > > +				      host_msr_test_ctl, false);
> > 
> > So what happens here is that if any other bit is set on the host, VMENTER
> > will happily clear it.
> 
> There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit
> 29 and bit 31. Bit 31 is not used in kernel, and here we only need to
> switch bit 29 between host and guest.  So should I also change the name
> to atomic_switch_split_lock_detect() to indicate that we only switch bit
> 29?

No. Just because we ony use the split lock bit now, there is no
jusification to name everything splitlock. This is going to have renamed
when yet another bit is added in the future. The MSR is exposed to the
guest and the restriction of bits happens to be splitlock today.

> >      guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl;
> > 
> > That preserves any bits which are not exposed to the guest.
> > 
> > But the way more interesting question is why are you exposing the MSR and
> > the bit to the guest at all if the host has split lock detection enabled?
> > 
> > That does not make any sense as you basically allow the guest to switch it
> > off and then launch a slowdown attack. If the host has it enabled, then a
> > guest has to be treated like any other process and the #AC trap has to be
> > caught by the hypervisor which then kills the guest.
> > 
> > Only if the host has split lock detection disabled, then you can expose it
> > and allow the guest to turn it on and handle it on its own.
> 
> Indeed, if we use split lock detection for protection purpose, when host
> has it enabled we should directly pass it to guest and forbid guest from
> disabling it.  And only when host disables split lock detection, we can
> expose it and allow the guest to turn it on.
?
> If it is used for protection purpose, then it should follow what you said and
> this feature needs to be disabled by default. Because there are split lock
> issues in old/current kernels and BIOS. That will cause the existing guest
> booting failure and killed due to those split lock.

Rightfully so.

> If it is only used for debug purpose, I think it might be OK to enable this
> feature by default and make it indepedent between host and guest?

No. It does not make sense.

> So I think how to handle this feature between host and guest depends on how we
> use it? Once you give me a decision, I will follow it in next version.

As I said: The host kernel makes the decision.

If the host kernel has it enabled then the guest is not allowed to change
it. If the guest triggers an #AC it will be killed.

If the host kernel has it disabled then the guest can enable it for it's
own purposes.

Thanks,

	tglx

Xiaoyao Li April 28, 2019, 7:34 a.m. UTC | #4

On 4/28/2019 3:09 PM, Thomas Gleixner wrote:
> On Sat, 27 Apr 2019, Xiaoyao Li wrote:
>> On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote:
>>> On Wed, 24 Apr 2019, Fenghua Yu wrote:
>>>>   
>>>> +static void atomic_switch_msr_test_ctl(struct vcpu_vmx *vmx)
>>>> +{
>>>> +	u64 host_msr_test_ctl;
>>>> +
>>>> +	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
>>>> +		return;
>>>
>>> Again: MSR_TST_CTL is not only about LOCK_DETECT. Check the control mask.
>>>
>>>> +	host_msr_test_ctl = this_cpu_read(msr_test_ctl_cache);
>>>> +
>>>> +	if (host_msr_test_ctl == vmx->msr_test_ctl) {
>>>
>>> This still assumes that the only bit which can be set in the MSR is that
>>> lock detect bit.
>>>
>>>> +		clear_atomic_switch_msr(vmx, MSR_TEST_CTL);
>>>> +	} else {
>>>> +		add_atomic_switch_msr(vmx, MSR_TEST_CTL, vmx->msr_test_ctl,
>>>> +				      host_msr_test_ctl, false);
>>>
>>> So what happens here is that if any other bit is set on the host, VMENTER
>>> will happily clear it.
>>
>> There are two bits of MSR TEST_CTL defined in Intel SDM now, which is bit
>> 29 and bit 31. Bit 31 is not used in kernel, and here we only need to
>> switch bit 29 between host and guest.  So should I also change the name
>> to atomic_switch_split_lock_detect() to indicate that we only switch bit
>> 29?
> 
> No. Just because we ony use the split lock bit now, there is no
> jusification to name everything splitlock. This is going to have renamed
> when yet another bit is added in the future. The MSR is exposed to the
> guest and the restriction of bits happens to be splitlock today.

Got it.

>>>       guest = (host & ~vmx->test_ctl_mask) | vmx->test_ctl;
>>>
>>> That preserves any bits which are not exposed to the guest.
>>>
>>> But the way more interesting question is why are you exposing the MSR and
>>> the bit to the guest at all if the host has split lock detection enabled?
>>>
>>> That does not make any sense as you basically allow the guest to switch it
>>> off and then launch a slowdown attack. If the host has it enabled, then a
>>> guest has to be treated like any other process and the #AC trap has to be
>>> caught by the hypervisor which then kills the guest.
>>>
>>> Only if the host has split lock detection disabled, then you can expose it
>>> and allow the guest to turn it on and handle it on its own.
>>
>> Indeed, if we use split lock detection for protection purpose, when host
>> has it enabled we should directly pass it to guest and forbid guest from
>> disabling it.  And only when host disables split lock detection, we can
>> expose it and allow the guest to turn it on.
> ?
>> If it is used for protection purpose, then it should follow what you said and
>> this feature needs to be disabled by default. Because there are split lock
>> issues in old/current kernels and BIOS. That will cause the existing guest
>> booting failure and killed due to those split lock.
> 
> Rightfully so.

So, the patch 13 "Enable split lock detection by default" needs to be 
removed?

>> If it is only used for debug purpose, I think it might be OK to enable this
>> feature by default and make it indepedent between host and guest?
> 
> No. It does not make sense.
> 
>> So I think how to handle this feature between host and guest depends on how we
>> use it? Once you give me a decision, I will follow it in next version.
> 
> As I said: The host kernel makes the decision.
> 
> If the host kernel has it enabled then the guest is not allowed to change
> it. If the guest triggers an #AC it will be killed.
> 
> If the host kernel has it disabled then the guest can enable it for it's
> own purposes.
> 
> Thanks,
> 
> 	tglx
>

Xiaoyao Li April 29, 2019, 5:21 a.m. UTC | #5

Hi, Thomas,

Base on your comments, I plan to make the design as following:

1) When host enables this feature, there is no switch between host and 
guest that guest running with it enabled by force. Since #AC in 
exception bitmap is set in current kvm, every #AC in guest will be 
trapped. And in handle_exception() handler in kvm, if #AC is caused by 
alignment check, kvm injects #AC back to guest; if #AC is caused by 
split lock, kvm sends a SIGBUS to userspace.

2) When host disables this feature, there needs atomic switch between 
host and guest if different. And in handle_exception() handler in kvm, 
we can just inject #AC back to guest, and let guest to handle it.

Besides, I think there might be an optimization for case #1.
When host has it enabled and guest also has it enabled, I think it's OK 
to inject #AC back to guest, not directly kill the guest.
Because guest kernel has it enabled means it knows what this feature is 
and it also want to get aware of and fault every split lock.
At this point, if guest has it enabled, we can leave it to guest. Only 
when guest's configuration is having it disabled, can it be regards as 
potentially harmful that we kill the guest once there is a #AC due to 
split lock.

How do you think about the design and this optimization?

Hi, Paolo,

What's your opinion about this design of split lock in KVM?

Thanks.

On 4/28/2019 3:09 PM, Thomas Gleixner wrote:
> On Sat, 27 Apr 2019, Xiaoyao Li wrote:
>> On Thu, 2019-04-25 at 09:42 +0200, Thomas Gleixner wrote:
>>> But the way more interesting question is why are you exposing the MSR and
>>> the bit to the guest at all if the host has split lock detection enabled?
>>>
>>> That does not make any sense as you basically allow the guest to switch it
>>> off and then launch a slowdown attack. If the host has it enabled, then a
>>> guest has to be treated like any other process and the #AC trap has to be
>>> caught by the hypervisor which then kills the guest.
>>>
>>> Only if the host has split lock detection disabled, then you can expose it
>>> and allow the guest to turn it on and handle it on its own.
>>
>> Indeed, if we use split lock detection for protection purpose, when host
>> has it enabled we should directly pass it to guest and forbid guest from
>> disabling it.  And only when host disables split lock detection, we can
>> expose it and allow the guest to turn it on.
> ?
>> If it is used for protection purpose, then it should follow what you said and
>> this feature needs to be disabled by default. Because there are split lock
>> issues in old/current kernels and BIOS. That will cause the existing guest
>> booting failure and killed due to those split lock.
> 
> Rightfully so.
> 
>> If it is only used for debug purpose, I think it might be OK to enable this
>> feature by default and make it indepedent between host and guest?
> 
> No. It does not make sense.
> 
>> So I think how to handle this feature between host and guest depends on how we
>> use it? Once you give me a decision, I will follow it in next version.
> 
> As I said: The host kernel makes the decision.
> 
> If the host kernel has it enabled then the guest is not allowed to change
> it. If the guest triggers an #AC it will be killed.
> 
> If the host kernel has it disabled then the guest can enable it for it's
> own purposes.
> 
> Thanks,
> 
> 	tglx
>

Thomas Gleixner April 29, 2019, 7:31 a.m. UTC | #6

On Sun, 28 Apr 2019, Xiaoyao Li wrote:
> On 4/28/2019 3:09 PM, Thomas Gleixner wrote:
> > On Sat, 27 Apr 2019, Xiaoyao Li wrote:
> > > Indeed, if we use split lock detection for protection purpose, when host
> > > has it enabled we should directly pass it to guest and forbid guest from
> > > disabling it.  And only when host disables split lock detection, we can
> > > expose it and allow the guest to turn it on.
> > ?
> > > If it is used for protection purpose, then it should follow what you said
> > > and
> > > this feature needs to be disabled by default. Because there are split lock
> > > issues in old/current kernels and BIOS. That will cause the existing guest
> > > booting failure and killed due to those split lock.
> > 
> > Rightfully so.
> 
> So, the patch 13 "Enable split lock detection by default" needs to be removed?

Why? No. We enable it by default and everything which violates the rules
gets what it deserves. If there is an issue, boot with ac_splitlock_off and
be done with it.

Thanks,

	tglx

[v8,12/15] kvm/vmx: Emulate MSR TEST_CTL

Commit Message

Comments

Patch