
[v5,1/9] x86/split_lock: Rework the initialization flow of split lock detection

Message ID 20200315050517.127446-2-xiaoyao.li@intel.com (mailing list archive)
State New, archived
Series x86/split_lock: Add feature split lock detection

Commit Message

Xiaoyao Li March 15, 2020, 5:05 a.m. UTC
The current initialization flow of split lock detection has the following issues:
1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
   zero. However, it's possible that BIOS/firmware has set it.

2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
   there is a virtualization flaw where FMS indicates the feature exists
   while it's actually not supported.

3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
   to verify that the feature really exists, so it cannot expose it to a guest.

To solve these issues, introduce a new sld_state, "sld_not_exist", as
the default value. It will be switched to another value if CORE_CAPABILITIES
or FMS enumerates split lock detection.

Only when sld_state != sld_not_exist does it go through the initialization flow.

In the initialization flow, it explicitly accesses MSR_TEST_CTRL and the
SPLIT_LOCK_DETECT bit to ensure there is no virtualization flaw, i.e., that
split lock detection is really supported. In detail,
 1. sld_off,   verify SPLIT_LOCK_DETECT bit can be cleared, and clear it;
 2. sld_warn,  verify SPLIT_LOCK_DETECT bit can be cleared and set,
               and set it;
 3. sld_fatal, verify SPLIT_LOCK_DETECT bit can be set, and set it;

Only when no MSR access fails can X86_FEATURE_SPLIT_LOCK_DETECT be
set. So KVM can use X86_FEATURE_SPLIT_LOCK_DETECT to check the existence
of the feature.

Also, since the MSR and the bit are checked in split_lock_init(), there
is no need to use the safe versions of RDMSR/WRMSR at runtime.
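
Condensed from the patch below (illustration only; "val" is the value just
read from MSR_TEST_CTRL via rdmsrl_safe()), the per-state handling is:

	switch (sld_state) {
	case sld_off:
		/* verify the bit can be cleared, and leave it cleared */
		if (wrmsrl_safe(MSR_TEST_CTRL, val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
			goto msr_broken;
		break;
	case sld_warn:
		/* verify the bit can be cleared, then fall through to set it */
		if (wrmsrl_safe(MSR_TEST_CTRL, val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
			goto msr_broken;
		fallthrough;
	case sld_fatal:
		/* verify the bit can be set, and leave it set */
		if (wrmsrl_safe(MSR_TEST_CTRL, val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
			goto msr_broken;
		break;
	default:
		break;
	}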

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kernel/cpu/intel.c | 64 +++++++++++++++++++++++++------------
 1 file changed, 44 insertions(+), 20 deletions(-)

Comments

Luck, Tony March 21, 2020, 12:41 a.m. UTC | #1
On Sun, Mar 15, 2020 at 01:05:09PM +0800, Xiaoyao Li wrote:
> To solve these issues, introduce a new sld_state, "sld_not_exist", as
> the default value. It will be switched to another value if CORE_CAPABILITIES
> or FMS enumerates split lock detection.

Is a better name for this state "sld_uninitialized"?

Otherwise looks good.

Reviewed-by: Tony Luck <tony.luck@intel.com>

-Tony
Thomas Gleixner March 23, 2020, 5:02 p.m. UTC | #2
Xiaoyao Li <xiaoyao.li@intel.com> writes:

> The current initialization flow of split lock detection has the following issues:
> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>    zero. However, it's possible that BIOS/firmware has set it.

Ok.

> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>    there is a virtualization flaw where FMS indicates the feature exists
>    while it's actually not supported.
>
> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>    to verify that the feature really exists, so it cannot expose it to a
>    guest.

Sorry, this does not make any sense. KVM is the hypervisor, so it should
be able to rely on the detect flag. Unless you talk about nested virt and a
broken L1 hypervisor.

> To solve these issues, introduce a new sld_state, "sld_not_exist",
> as

The usual naming convention is sld_not_supported.

> the default value. It will be switched to another value if CORE_CAPABILITIES
> or FMS enumerates split lock detection.
>
> Only when sld_state != sld_not_exist does it go through the initialization flow.
>
> In the initialization flow, it explicitly accesses MSR_TEST_CTRL and the
> SPLIT_LOCK_DETECT bit to ensure there is no virtualization flaw, i.e., that
> split lock detection is really supported. In detail,
>  1. sld_off,   verify SPLIT_LOCK_DETECT bit can be cleared, and clear it;

That's not what the patch does. It writes with the bit cleared and the
only thing it checks is whether the wrmsrl fails or not. Verification is
something completely different.
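
For comparison, read-back verification (which a later reply in this thread
implements as split_lock_verify_msr(); the function name below is
hypothetical) would look roughly like:

	static bool split_lock_try_set(void)
	{
		u64 ctrl, tmp;

		if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
			return false;
		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
		if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
			return false;
		/* A write that "succeeds" but does not stick is still a failure. */
		rdmsrl(MSR_TEST_CTRL, tmp);
		return ctrl == tmp;
	}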

>  2. sld_warn,  verify SPLIT_LOCK_DETECT bit can be cleared and set,
>                and set it;
>  3. sld_fatal, verify SPLIT_LOCK_DETECT bit can be set, and set it;
>
> Only when no MSR access fails can X86_FEATURE_SPLIT_LOCK_DETECT be
> set. So KVM can use X86_FEATURE_SPLIT_LOCK_DETECT to check the existence
> of the feature.

Again, this has nothing to do with KVM. 

>   * Processors which have self-snooping capability can handle conflicting
> @@ -585,7 +585,7 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
>  	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
>  }
>  
> -static void split_lock_init(void);
> +static void split_lock_init(struct cpuinfo_x86 *c);
>  
>  static void init_intel(struct cpuinfo_x86 *c)
>  {
> @@ -702,7 +702,8 @@ static void init_intel(struct cpuinfo_x86 *c)
>  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
>  		tsx_disable();
>  
> -	split_lock_init();
> +	if (sld_state != sld_not_exist)
> +		split_lock_init(c);

That conditional wants to be in split_lock_init() where it used to be.

> +/*
> + * Use the "safe" versions of rdmsr/wrmsr here because although code
> + * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
> + * exist, there may be glitches in virtualization that leave a guest
> + * with an incorrect view of real h/w capabilities.
> + * If the MSR is not broken, the "safe" versions need not be used at runtime.
> + */
> +static void split_lock_init(struct cpuinfo_x86 *c)
>  {
> -	if (sld_state == sld_off)
> -		return;
> +	u64 test_ctrl_val;
>  
> -	if (__sld_msr_set(true))
> -		return;
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> +		goto msr_broken;
> +
> +	switch (sld_state) {
> +	case sld_off:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		break;
> +	case sld_warn:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		fallthrough;
> +	case sld_fatal:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		break;

This does not make any sense either. Why do it any differently for warn
and fatal?

> +	default:
> +		break;

If there is ever a state added, then default will just fall through and
possibly nobody will notice because the compiler does not complain.

> +	}
> +
> +	set_cpu_cap(c, X86_FEATURE_SPLIT_LOCK_DETECT);
> +	return;
>  
> +msr_broken:
>  	/*
>  	 * If this is anything other than the boot-cpu, you've done
>  	 * funny things and you get to keep whatever pieces.
>  	 */
> -	pr_warn("MSR fail -- disabled\n");
> +	pr_warn_once("MSR fail -- disabled\n");
>  	sld_state = sld_off;

So you run this on every CPU. What's the point? If the hypervisor is so
broken that the MSR works on CPU0 but not on CPU1 then this is probably
the least of your worries.

Thanks,

        tglx
Thomas Gleixner March 23, 2020, 8:24 p.m. UTC | #3
Thomas Gleixner <tglx@linutronix.de> writes:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>
>> The current initialization flow of split lock detection has the following issues:
>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>    zero. However, it's possible that BIOS/firmware has set it.
>
> Ok.
>
>> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>    there is a virtualization flaw where FMS indicates the feature exists
>>    while it's actually not supported.
>>
>> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>>    to verify that the feature really exists, so it cannot expose it to a
>>    guest.
>
> Sorry, this does not make any sense. KVM is the hypervisor, so it should
> be able to rely on the detect flag. Unless you talk about nested virt and a
> broken L1 hypervisor.
>
>> To solve these issues, introduce a new sld_state, "sld_not_exist",
>> as
>
> The usual naming convention is sld_not_supported.

But this extra state is not needed at all, it already exists:

    X86_FEATURE_SPLIT_LOCK_DETECT

You just need to make split_lock_setup() a bit smarter. Something like
the below. It just wants to be split into separate patches.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -45,6 +45,7 @@ enum split_lock_detect_state {
  * split lock detect, unless there is a command line override.
  */
 static enum split_lock_detect_state sld_state = sld_off;
+static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -984,11 +985,32 @@ static inline bool match_option(const ch
 	return len == arglen && !strncmp(arg, opt, len);
 }
 
+static bool __init split_lock_verify_msr(bool on)
+{
+	u64 ctrl, tmp;
+
+	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
+		return false;
+	if (on)
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	else
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
+		return false;
+	rdmsrl(MSR_TEST_CTRL, tmp);
+	return ctrl == tmp;
+}
+
 static void __init split_lock_setup(void)
 {
 	char arg[20];
 	int i, ret;
 
+	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
+
 	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 	sld_state = sld_warn;
 
@@ -1007,7 +1029,6 @@ static void __init split_lock_setup(void
 	case sld_off:
 		pr_info("disabled\n");
 		break;
-
 	case sld_warn:
 		pr_info("warning about user-space split_locks\n");
 		break;
@@ -1018,44 +1039,40 @@ static void __init split_lock_setup(void
 	}
 }
 
-/*
- * Locking is not required at the moment because only bit 29 of this
- * MSR is implemented and locking would not prevent that the operation
- * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
- */
-static bool __sld_msr_set(bool on)
+static void split_lock_init(void)
 {
-	u64 test_ctrl_val;
+	u64 ctrl;
 
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
+		return;
 
-	if (on)
-		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	rdmsrl(MSR_TEST_CTRL, ctrl);
+	if (sld_state == sld_off)
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 	else
-		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	wrmsrl(MSR_TEST_CTRL, ctrl);
+	this_cpu_write(msr_test_ctrl_cache, ctrl);
 }
 
-static void split_lock_init(void)
+/*
+ * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
+ * is not implemented as one thread could undo the setting of the other
+ * thread immediately after dropping the lock anyway.
+ */
+static void msr_test_ctrl_update(bool on, u64 mask)
 {
-	if (sld_state == sld_off)
-		return;
+	u64 tmp, ctrl = this_cpu_read(msr_test_ctrl_cache);
 
-	if (__sld_msr_set(true))
-		return;
+	if (on)
+		tmp = ctrl | mask;
+	else
+		tmp = ctrl & ~mask;
 
-	/*
-	 * If this is anything other than the boot-cpu, you've done
-	 * funny things and you get to keep whatever pieces.
-	 */
-	pr_warn("MSR fail -- disabled\n");
-	sld_state = sld_off;
+	if (tmp != ctrl) {
+		wrmsrl(MSR_TEST_CTRL, ctrl);
+		this_cpu_write(msr_test_ctrl_cache, ctrl);
+	}
 }
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -1071,7 +1088,7 @@ bool handle_user_split_lock(struct pt_re
 	 * progress and set TIF_SLD so the detection is re-enabled via
 	 * switch_to_sld() when the task is scheduled out.
 	 */
-	__sld_msr_set(false);
+	msr_test_ctrl_update(false, MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
 	set_tsk_thread_flag(current, TIF_SLD);
 	return true;
 }
@@ -1085,7 +1102,7 @@ bool handle_user_split_lock(struct pt_re
  */
 void switch_to_sld(unsigned long tifn)
 {
-	__sld_msr_set(!(tifn & _TIF_SLD));
+	msr_test_ctrl_update(!(tifn & _TIF_SLD), MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
 }
 
 #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}
Xiaoyao Li March 24, 2020, 1:10 a.m. UTC | #4
On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>
>>> The current initialization flow of split lock detection has the following issues:
>>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>>     zero. However, it's possible that BIOS/firmware has set it.
>>
>> Ok.
>>
>>> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>>     there is a virtualization flaw where FMS indicates the feature exists
>>>     while it's actually not supported.
>>>
>>> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>>>     to verify that the feature really exists, so it cannot expose it to a
>>>     guest.
>>
>> Sorry, this does not make any sense. KVM is the hypervisor, so it should
>> be able to rely on the detect flag. Unless you talk about nested virt and a
>> broken L1 hypervisor.
>>
>>> To solve these issues, introduce a new sld_state, "sld_not_exist",
>>> as
>>
>> The usual naming convention is sld_not_supported.
> 
> But this extra state is not needed at all, it already exists:
> 
>      X86_FEATURE_SPLIT_LOCK_DETECT
> 
>> You just need to make split_lock_setup() a bit smarter. Something like
>> the below. It just wants to be split into separate patches.
> 
> Thanks,
> 
>          tglx
> ---
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>    * split lock detect, unless there is a command line override.
>    */
>   static enum split_lock_detect_state sld_state = sld_off;
> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);

I used a percpu cache in v3, but people preferred Tony's cache for the
reserved bits [1].

If you prefer the percpu cache, I'll use it in the next version.

[1]: https://lore.kernel.org/lkml/20200303192242.GU1439@linux.intel.com/

>   /*
>    * Processors which have self-snooping capability can handle conflicting
> @@ -984,11 +985,32 @@ static inline bool match_option(const ch
>   	return len == arglen && !strncmp(arg, opt, len);
>   }
>   
> +static bool __init split_lock_verify_msr(bool on)
> +{
> +	u64 ctrl, tmp;
> +
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
> +		return false;
> +	if (on)
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	else
> +		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
> +		return false;
> +	rdmsrl(MSR_TEST_CTRL, tmp);
> +	return ctrl == tmp;
> +}
> +
>   static void __init split_lock_setup(void)
>   {
>   	char arg[20];
>   	int i, ret;
>   
> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
> +

I did a similar thing in my v3; however, Sean raised the concern that
toggling the MSR bit before parsing the kernel param is bad behavior. [2]

[2]: https://lore.kernel.org/kvm/20200305162311.GG11500@linux.intel.com/
Thomas Gleixner March 24, 2020, 10:29 a.m. UTC | #5
Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>>    * split lock detect, unless there is a command line override.
>>    */
>>   static enum split_lock_detect_state sld_state = sld_off;
>> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
>
> I used a percpu cache in v3, but people preferred Tony's cache for the
> reserved bits [1].
>
> If you prefer the percpu cache, I'll use it in the next version.

I'm fine with the single variable.

>>   static void __init split_lock_setup(void)
>>   {
>>   	char arg[20];
>>   	int i, ret;
>>   
>> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
>> +		pr_info("MSR access failed: Disabled\n");
>> +		return;
>> +	}
>> +
>
> I did a similar thing in my v3; however, Sean raised the concern that
> toggling the MSR bit before parsing the kernel param is bad behavior. [2]

That's trivial enough to fix.

Thanks,

        tglx

8<---------------
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -44,7 +44,8 @@ enum split_lock_detect_state {
  * split_lock_setup() will switch this to sld_warn on systems that support
  * split lock detect, unless there is a command line override.
  */
-static enum split_lock_detect_state sld_state = sld_off;
+static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
+static u64 msr_test_ctrl_cache __ro_after_init;
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -984,78 +985,85 @@ static inline bool match_option(const ch
 	return len == arglen && !strncmp(arg, opt, len);
 }
 
+static bool __init split_lock_verify_msr(bool on)
+{
+	u64 ctrl, tmp;
+
+	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
+		return false;
+	if (on)
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	else
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
+		return false;
+	rdmsrl(MSR_TEST_CTRL, tmp);
+	return ctrl == tmp;
+}
+
 static void __init split_lock_setup(void)
 {
+	enum split_lock_detect_state state = sld_warn;
 	char arg[20];
 	int i, ret;
 
-	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
-	sld_state = sld_warn;
+	if (!split_lock_verify_msr(false)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
 
 	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
 				  arg, sizeof(arg));
 	if (ret >= 0) {
 		for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
 			if (match_option(arg, ret, sld_options[i].option)) {
-				sld_state = sld_options[i].state;
+				state = sld_options[i].state;
 				break;
 			}
 		}
 	}
 
-	switch (sld_state) {
+	switch (state) {
 	case sld_off:
 		pr_info("disabled\n");
-		break;
-
+		return;
 	case sld_warn:
 		pr_info("warning about user-space split_locks\n");
 		break;
-
 	case sld_fatal:
 		pr_info("sending SIGBUS on user-space split_locks\n");
 		break;
 	}
+
+	rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
+
+	if (!split_lock_verify_msr(true)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
+
+	sld_state = state;
+	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 }
 
 /*
- * Locking is not required at the moment because only bit 29 of this
- * MSR is implemented and locking would not prevent that the operation
- * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
+ * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
+ * is not implemented as one thread could undo the setting of the other
+ * thread immediately after dropping the lock anyway.
  */
-static bool __sld_msr_set(bool on)
+static void sld_update_msr(bool on)
 {
-	u64 test_ctrl_val;
-
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	u64 ctrl = msr_test_ctrl_cache;
 
 	if (on)
-		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-	else
-		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	wrmsrl(MSR_TEST_CTRL, ctrl);
 }
 
 static void split_lock_init(void)
 {
-	if (sld_state == sld_off)
-		return;
-
-	if (__sld_msr_set(true))
-		return;
-
-	/*
-	 * If this is anything other than the boot-cpu, you've done
-	 * funny things and you get to keep whatever pieces.
-	 */
-	pr_warn("MSR fail -- disabled\n");
-	sld_state = sld_off;
+	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
+		sld_update_msr(sld_state != sld_off);
 }
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -1071,7 +1079,7 @@ bool handle_user_split_lock(struct pt_re
 	 * progress and set TIF_SLD so the detection is re-enabled via
 	 * switch_to_sld() when the task is scheduled out.
 	 */
-	__sld_msr_set(false);
+	sld_update_msr(false);
 	set_tsk_thread_flag(current, TIF_SLD);
 	return true;
 }
@@ -1085,7 +1093,7 @@ bool handle_user_split_lock(struct pt_re
  */
 void switch_to_sld(unsigned long tifn)
 {
-	__sld_msr_set(!(tifn & _TIF_SLD));
+	sld_update_msr(!(tifn & _TIF_SLD));
 }
 
 #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}
Xiaoyao Li March 24, 2020, 11:51 a.m. UTC | #6
On 3/24/2020 1:02 AM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> 
>> The current initialization flow of split lock detection has the following issues:
>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>     zero. However, it's possible that BIOS/firmware has set it.
> 
> Ok.
> 
>> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>     there is a virtualization flaw where FMS indicates the feature exists
>>     while it's actually not supported.
>>
>> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>>     to verify that the feature really exists, so it cannot expose it to a
>>     guest.
> 
> Sorry, this does not make any sense. KVM is the hypervisor, so it should
> be able to rely on the detect flag. Unless you talk about nested virt and a
> broken L1 hypervisor.
> 

Yeah. It is for the nested virt on a broken L1 hypervisor.
Thomas Gleixner March 24, 2020, 1:31 p.m. UTC | #7
Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 1:02 AM, Thomas Gleixner wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> 
>>> The current initialization flow of split lock detection has the following issues:
>>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>>     zero. However, it's possible that BIOS/firmware has set it.
>> 
>> Ok.
>> 
>>> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>>     there is a virtualization flaw where FMS indicates the feature exists
>>>     while it's actually not supported.
>>>
>>> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>>>     to verify that the feature really exists, so it cannot expose it to a
>>>     guest.
>> 
>> Sorry, this does not make any sense. KVM is the hypervisor, so it should
>> be able to rely on the detect flag. Unless you talk about nested virt and a
>> broken L1 hypervisor.
>> 
>
> Yeah. It is for the nested virt on a broken L1 hypervisor.

Then please spell it out in the changelog. Changelogs which need crystal
balls to decode are pretty useless.

Thanks,

        tglx
Xiaoyao Li March 25, 2020, 12:18 a.m. UTC | #8
On 3/24/2020 6:29 PM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
>>> --- a/arch/x86/kernel/cpu/intel.c
>>> +++ b/arch/x86/kernel/cpu/intel.c
>>> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>>>     * split lock detect, unless there is a command line override.
>>>     */
>>>    static enum split_lock_detect_state sld_state = sld_off;
>>> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
>>
>> I used percpu cache in v3, but people prefer Tony's cache for reserved
>> bits[1].
>>
>> If you prefer percpu cache, I'll use it in next version.
> 
> I'm fine with the single variable.
> 
>>>    static void __init split_lock_setup(void)
>>>    {
>>>    	char arg[20];
>>>    	int i, ret;
>>>    
>>> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
>>> +		pr_info("MSR access failed: Disabled\n");
>>> +		return;
>>> +	}
>>> +
>>
>> I did similar thing like this in my v3, however Sean raised concern that
>> toggling MSR bit before parsing kernel param is bad behavior. [2]
> 
> That's trivial enough to fix.
> 
> Thanks,
> 
>          tglx
> 
> 8<---------------
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -44,7 +44,8 @@ enum split_lock_detect_state {
>    * split_lock_setup() will switch this to sld_warn on systems that support
>    * split lock detect, unless there is a command line override.
>    */
> -static enum split_lock_detect_state sld_state = sld_off;
> +static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
> +static u64 msr_test_ctrl_cache __ro_after_init;
>   
>   /*
>    * Processors which have self-snooping capability can handle conflicting
> @@ -984,78 +985,85 @@ static inline bool match_option(const ch
>   	return len == arglen && !strncmp(arg, opt, len);
>   }
>   
> +static bool __init split_lock_verify_msr(bool on)
> +{
> +	u64 ctrl, tmp;
> +
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
> +		return false;
> +	if (on)
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	else
> +		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
> +		return false;
> +	rdmsrl(MSR_TEST_CTRL, tmp);
> +	return ctrl == tmp;
> +}
> +
>   static void __init split_lock_setup(void)
>   {
> +	enum split_lock_detect_state state = sld_warn;
>   	char arg[20];
>   	int i, ret;
>   
> -	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
> -	sld_state = sld_warn;
> +	if (!split_lock_verify_msr(false)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
>   
>   	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
>   				  arg, sizeof(arg));
>   	if (ret >= 0) {
>   		for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
>   			if (match_option(arg, ret, sld_options[i].option)) {
> -				sld_state = sld_options[i].state;
> +				state = sld_options[i].state;
>   				break;
>   			}
>   		}
>   	}
>   
> -	switch (sld_state) {
> +	switch (state) {
>   	case sld_off:
>   		pr_info("disabled\n");
> -		break;
> -
> +		return;
Here, in the sld_off case, it just returns without calling
setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).

So for APs, it won't clear the SLD bit in split_lock_init().

And I remember why I used sld_not_exist rather than
setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).

Yes, we can call setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT)
for the sld_off case, and in split_lock_init(), explicitly call
sld_update_msr(false) to turn off SLD and clear_cpu_cap(c,
X86_FEATURE_SPLIT_LOCK_DETECT) to clear the cap. But due to
setup_force_cpu_cap(), split_lock_detect will still show up in
/proc/cpuinfo.
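
For context, this is because setup_force_cpu_cap() also records the bit in
cpu_caps_set, which identify_cpu() re-applies to every CPU after c_init()
has run. A sketch of the relevant helpers (paraphrased from kernels
contemporary with this series, not quoted verbatim):

	/* arch/x86/include/asm/cpufeature.h */
	#define setup_force_cpu_cap(bit) do {			\
		set_cpu_cap(&boot_cpu_data, bit);		\
		set_bit(bit, (unsigned long *)cpu_caps_set);	\
	} while (0)

	/* arch/x86/kernel/cpu/common.c */
	static void apply_forced_caps(struct cpuinfo_x86 *c)
	{
		int i;

		for (i = 0; i < NCAPINTS + NBUGINTS; i++) {
			c->x86_capability[i] &= ~cpu_caps_cleared[i];
			c->x86_capability[i] |= cpu_caps_set[i];
		}
	}

So a clear_cpu_cap() done during c_init() is undone when apply_forced_caps()
runs, and the flag keeps showing up in /proc/cpuinfo.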

>   	case sld_warn:
>   		pr_info("warning about user-space split_locks\n");
>   		break;
> -
>   	case sld_fatal:
>   		pr_info("sending SIGBUS on user-space split_locks\n");
>   		break;
>   	}
> +
> +	rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
> +
> +	if (!split_lock_verify_msr(true)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
> +
> +	sld_state = state;
> +	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
>   }
>   
>   /*
> - * Locking is not required at the moment because only bit 29 of this
> - * MSR is implemented and locking would not prevent that the operation
> - * of one thread is immediately undone by the sibling thread.
> - * Use the "safe" versions of rdmsr/wrmsr here because although code
> - * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
> - * exist, there may be glitches in virtualization that leave a guest
> - * with an incorrect view of real h/w capabilities.
> + * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
> + * is not implemented as one thread could undo the setting of the other
> + * thread immediately after dropping the lock anyway.
>    */
> -static bool __sld_msr_set(bool on)
> +static void sld_update_msr(bool on)
>   {
> -	u64 test_ctrl_val;
> -
> -	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> -		return false;
> +	u64 ctrl = msr_test_ctrl_cache;
>   
>   	if (on)
> -		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> -	else
> -		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> -
> -	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	wrmsrl(MSR_TEST_CTRL, ctrl);
>   }
>   
>   static void split_lock_init(void)
>   {
> -	if (sld_state == sld_off)
> -		return;
> -
> -	if (__sld_msr_set(true))
> -		return;
> -
> -	/*
> -	 * If this is anything other than the boot-cpu, you've done
> -	 * funny things and you get to keep whatever pieces.
> -	 */
> -	pr_warn("MSR fail -- disabled\n");
> -	sld_state = sld_off;
> +	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> +		sld_update_msr(sld_state != sld_off);
>   }
>   
>   bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> @@ -1071,7 +1079,7 @@ bool handle_user_split_lock(struct pt_re
>   	 * progress and set TIF_SLD so the detection is re-enabled via
>   	 * switch_to_sld() when the task is scheduled out.
>   	 */
> -	__sld_msr_set(false);
> +	sld_update_msr(false);
>   	set_tsk_thread_flag(current, TIF_SLD);
>   	return true;
>   }
> @@ -1085,7 +1093,7 @@ bool handle_user_split_lock(struct pt_re
>    */
>   void switch_to_sld(unsigned long tifn)
>   {
> -	__sld_msr_set(!(tifn & _TIF_SLD));
> +	sld_update_msr(!(tifn & _TIF_SLD));
>   }
>   
>   #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}
>
Thomas Gleixner March 25, 2020, 12:52 a.m. UTC | #9
Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 6:29 PM, Thomas Gleixner wrote:
>> -	switch (sld_state) {
>> +	switch (state) {
>>   	case sld_off:
>>   		pr_info("disabled\n");
>> -		break;
>> -
>> +		return;
> Here, in the sld_off case, it just returns without calling
> setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).
>
> So for APs, it won't clear the SLD bit in split_lock_init().

Trivial fix:

static void split_lock_init(void)
{
	split_lock_verify_msr(sld_state != sld_off);
}

You just need to remove the __init annotation from split_lock_verify_msr().

Thanks,

        tglx

Patch

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index db3e745e5d47..064ba12defc8 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -34,17 +34,17 @@ 
 #endif
 
 enum split_lock_detect_state {
-	sld_off = 0,
+	sld_not_exist = 0,
+	sld_off,
 	sld_warn,
 	sld_fatal,
 };
 
 /*
- * Default to sld_off because most systems do not support split lock detection
  * split_lock_setup() will switch this to sld_warn on systems that support
  * split lock detect, unless there is a command line override.
  */
-static enum split_lock_detect_state sld_state = sld_off;
+static enum split_lock_detect_state sld_state = sld_not_exist;
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -585,7 +585,7 @@  static void init_intel_misc_features(struct cpuinfo_x86 *c)
 	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
 }
 
-static void split_lock_init(void);
+static void split_lock_init(struct cpuinfo_x86 *c);
 
 static void init_intel(struct cpuinfo_x86 *c)
 {
@@ -702,7 +702,8 @@  static void init_intel(struct cpuinfo_x86 *c)
 	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
 		tsx_disable();
 
-	split_lock_init();
+	if (sld_state != sld_not_exist)
+		split_lock_init(c);
 }
 
 #ifdef CONFIG_X86_32
@@ -989,7 +990,6 @@  static void __init split_lock_setup(void)
 	char arg[20];
 	int i, ret;
 
-	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 	sld_state = sld_warn;
 
 	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
@@ -1015,6 +1015,8 @@  static void __init split_lock_setup(void)
 	case sld_fatal:
 		pr_info("sending SIGBUS on user-space split_locks\n");
 		break;
+	default:
+		break;
 	}
 }
 
@@ -1022,39 +1024,61 @@  static void __init split_lock_setup(void)
  * Locking is not required at the moment because only bit 29 of this
  * MSR is implemented and locking would not prevent that the operation
  * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
  */
-static bool __sld_msr_set(bool on)
+static void __sld_msr_set(bool on)
 {
 	u64 test_ctrl_val;
 
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	rdmsrl(MSR_TEST_CTRL, test_ctrl_val);
 
 	if (on)
 		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 	else
 		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
 }
 
-static void split_lock_init(void)
+/*
+ * Use the "safe" versions of rdmsr/wrmsr here because although code
+ * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
+ * exist, there may be glitches in virtualization that leave a guest
+ * with an incorrect view of real h/w capabilities.
+ * If the MSR is not broken, the "safe" versions need not be used at runtime.
+ */
+static void split_lock_init(struct cpuinfo_x86 *c)
 {
-	if (sld_state == sld_off)
-		return;
+	u64 test_ctrl_val;
 
-	if (__sld_msr_set(true))
-		return;
+	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
+		goto msr_broken;
+
+	switch (sld_state) {
+	case sld_off:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		break;
+	case sld_warn:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		fallthrough;
+	case sld_fatal:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		break;
+	default:
+		break;
+	}
+
+	set_cpu_cap(c, X86_FEATURE_SPLIT_LOCK_DETECT);
+	return;
 
+msr_broken:
 	/*
 	 * If this is anything other than the boot-cpu, you've done
 	 * funny things and you get to keep whatever pieces.
 	 */
-	pr_warn("MSR fail -- disabled\n");
+	pr_warn_once("MSR fail -- disabled\n");
 	sld_state = sld_off;
 }