Message ID | 1556134382-58814-16-git-send-email-fenghua.yu@intel.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Johannes Berg |
Headers | show |
Series | x86/split_lock: Enable split lock detection | expand |
* Fenghua Yu <fenghua.yu@intel.com> wrote: > To workaround or debug a split lock issue, the administrator may need to > disable or enable split lock detection during run time without rebooting > the system. > > The interface /sys/device/system/cpu/split_lock_detect is added to allow > the administrator to disable or enable split lock detection and show > current split lock detection setting. > > Writing [yY1] or [oO][nN] to the file enables split lock detection and > writing [nN0] or [oO][fF] disables split lock detection. Split lock > detection is enabled or disabled on all CPUs. > > Reading the file returns current global split lock detection setting: > 0: disabled > 1: enabled > > Add an ABI document entry for /sys/devices/system/cpu/split_lock_detect. > > Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> > --- > Not sure if the justification for the sysfs knob is valid. If not, this > patch could be removed from this patch set. > > .../ABI/testing/sysfs-devices-system-cpu | 22 ++++++++ > arch/x86/kernel/cpu/intel.c | 52 ++++++++++++++++++- > 2 files changed, 72 insertions(+), 2 deletions(-) > > diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu > index 9605dbd4b5b5..aad7b1698065 100644 > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu > @@ -67,6 +67,28 @@ Description: Discover NUMA node a CPU belongs to > /sys/devices/system/cpu/cpu42/node2 -> ../../node/node2 > > > +What: /sys/devices/system/cpu/split_lock_detect > +Date: March 2019 > +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> > +Description: (RW) Control split lock detection on Intel Tremont and > + future CPUs > + > + Reads return split lock detection status: > + 0: disabled > + 1: enabled > + > + Writes enable or disable split lock detection: > + The first character is one of 'Nn0' or [oO][fF] for off > + disables the feature. > + The first character is one of 'Yy1' or [oO][nN] for on > + enables the feature. > + > + Please note the interface only shows or controls global setting. > + During run time, split lock detection on one CPU may be > + disabled if split lock operation in kernel code happens on > + the CPU. The interface doesn't show or control split lock > + detection on individual CPU. I.e. implementation and possible actual state are out of sync. Why? Also, if it's a global flag, why waste memory on putting a sysfs knob into every CPU's sysfs file? Finally, why is a debugging facility in sysfs, why not a debugfs knob? Using a sysctl would solve the percpu vs. global confusion as well ... > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -35,6 +35,7 @@ > DEFINE_PER_CPU(u64, msr_test_ctl_cache); > EXPORT_PER_CPU_SYMBOL_GPL(msr_test_ctl_cache); > > +static DEFINE_MUTEX(split_lock_detect_mutex); > static bool split_lock_detect_enable; 'enable' is a verb in plain form - which we use for function names. For variable names that denotes current state we typically use past tense, i.e. 'enabled'. (The only case where we'd us the split_lock_detect_enable name for a flag if it's a flag to trigger some sort of enabling action - which this isn't.) Please review the whole series for various naming mishaps. > + mutex_lock(&split_lock_detect_mutex); > + > + split_lock_detect_enable = val; > + > + /* Update the split lock detection setting in MSR on all online CPUs. */ > + on_each_cpu(split_lock_update_msr, NULL, 1); > + > + if (split_lock_detect_enable) > + pr_info("enabled\n"); > + else > + pr_info("disabled\n"); > + > + mutex_unlock(&split_lock_detect_mutex); Instead of a mutex, please just use the global atomic debug flag which controls the warning printout. By using that flag both for the WARN()ing and for controlling MSR state all the races are solved and the code is simplified. Thanks, Ingo
On Thu, Apr 25, 2019 at 08:31:15AM +0200, Ingo Molnar wrote: > > * Fenghua Yu <fenghua.yu@intel.com> wrote: > > > + disabled if split lock operation in kernel code happens on > > + the CPU. The interface doesn't show or control split lock > > + detection on individual CPU. > > I.e. implementation and possible actual state are out of sync. Why? > > Also, if it's a global flag, why waste memory on putting a sysfs knob > into every CPU's sysfs file? > > Finally, why is a debugging facility in sysfs, why not a debugfs knob? > Using a sysctl would solve the percpu vs. global confusion as well ... Can I put the interface in /sys/kernel/debug/x86/split_lock_detect? > > > --- a/arch/x86/kernel/cpu/intel.c > > +++ b/arch/x86/kernel/cpu/intel.c > > @@ -35,6 +35,7 @@ > > DEFINE_PER_CPU(u64, msr_test_ctl_cache); > > EXPORT_PER_CPU_SYMBOL_GPL(msr_test_ctl_cache); > > > > +static DEFINE_MUTEX(split_lock_detect_mutex); > > static bool split_lock_detect_enable; > > 'enable' is a verb in plain form - which we use for function names. > > For variable names that denotes current state we typically use past > tense, i.e. 'enabled'. > > (The only case where we'd us the split_lock_detect_enable name for a flag > if it's a flag to trigger some sort of enabling action - which this > isn't.) > > Please review the whole series for various naming mishaps. OK. > > > + mutex_lock(&split_lock_detect_mutex); > > + > > + split_lock_detect_enable = val; > > + > > + /* Update the split lock detection setting in MSR on all online CPUs. */ > > + on_each_cpu(split_lock_update_msr, NULL, 1); > > + > > + if (split_lock_detect_enable) > > + pr_info("enabled\n"); > > + else > > + pr_info("disabled\n"); > > + > > + mutex_unlock(&split_lock_detect_mutex); > > Instead of a mutex, please just use the global atomic debug flag which > controls the warning printout. By using that flag both for the WARN()ing > and for controlling MSR state all the races are solved and the code is > simplified. So is it OK to define split_lock_debug and use it in #AC handler and in here? static atomic_t split_lock_debug; in #AC handler: + if (atomic_cmpxchg(&split_lock_debug, 0, 1) == 0) { + /* Only warn split lock once */ + WARN_ONCE(1, "split lock operation detected\n"); + atomic_set(&split_lock_debug, 0); + } And in split_lock_detect_store(), replace the mutex with split_lock_debug like this: - mutex_lock(&split_lock_detect_mutex); + while (atomic_cmpxchg(&split_lock_debug, 1, 0)) + cpu_relax(); .... - mutex_unlock(&split_lock_detect_mutex); + atomic_set(&split_lock_debug, 0); Is this right code for sync in both #AC handler and in split_lock_detect_store()? Thanks. -Fenghua
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu index 9605dbd4b5b5..aad7b1698065 100644 --- a/Documentation/ABI/testing/sysfs-devices-system-cpu +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu @@ -67,6 +67,28 @@ Description: Discover NUMA node a CPU belongs to /sys/devices/system/cpu/cpu42/node2 -> ../../node/node2 +What: /sys/devices/system/cpu/split_lock_detect +Date: March 2019 +Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org> +Description: (RW) Control split lock detection on Intel Tremont and + future CPUs + + Reads return split lock detection status: + 0: disabled + 1: enabled + + Writes enable or disable split lock detection: + The first character is one of 'Nn0' or [oO][fF] for off + disables the feature. + The first character is one of 'Yy1' or [oO][nN] for on + enables the feature. + + Please note the interface only shows or controls global setting. + During run time, split lock detection on one CPU may be + disabled if split lock operation in kernel code happens on + the CPU. The interface doesn't show or control split lock + detection on individual CPU. + What: /sys/devices/system/cpu/cpu#/topology/core_id /sys/devices/system/cpu/cpu#/topology/core_siblings /sys/devices/system/cpu/cpu#/topology/core_siblings_list diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 959ebf25beda..f257d1e92706 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -35,6 +35,7 @@ DEFINE_PER_CPU(u64, msr_test_ctl_cache); EXPORT_PER_CPU_SYMBOL_GPL(msr_test_ctl_cache); +static DEFINE_MUTEX(split_lock_detect_mutex); static bool split_lock_detect_enable; /* @@ -660,7 +661,7 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c) wrmsrl(MSR_MISC_FEATURES_ENABLES, msr); } -static void split_lock_update_msr(void) +static void split_lock_update_msr(void *__unused) { if (split_lock_detect_enable) { /* Enable split lock detection */ @@ -682,7 +683,7 @@ static void init_split_lock_detect(struct cpuinfo_x86 *c) rdmsrl(MSR_TEST_CTL, test_ctl_val); this_cpu_write(msr_test_ctl_cache, test_ctl_val); - split_lock_update_msr(); + split_lock_update_msr(NULL); } } @@ -1114,3 +1115,50 @@ void handle_split_lock_kernel_mode(void) this_cpu_and(msr_test_ctl_cache, ~TEST_CTL_SPLIT_LOCK_DETECT); WARN_ONCE(1, "split lock operation detected\n"); } + +static ssize_t +split_lock_detect_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return sprintf(buf, "%u\n", split_lock_detect_enable); +} + +static ssize_t +split_lock_detect_store(struct device *dev, struct device_attribute *attr, + const char *buf, size_t count) +{ + bool val; + int ret; + + ret = strtobool(buf, &val); + if (ret) + return ret; + + mutex_lock(&split_lock_detect_mutex); + + split_lock_detect_enable = val; + + /* Update the split lock detection setting in MSR on all online CPUs. */ + on_each_cpu(split_lock_update_msr, NULL, 1); + + if (split_lock_detect_enable) + pr_info("enabled\n"); + else + pr_info("disabled\n"); + + mutex_unlock(&split_lock_detect_mutex); + + return count; +} + +static DEVICE_ATTR_RW(split_lock_detect); + +static int __init split_lock_init(void) +{ + if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) + return -ENODEV; + + return device_create_file(cpu_subsys.dev_root, + &dev_attr_split_lock_detect); +} +subsys_initcall(split_lock_init);
To workaround or debug a split lock issue, the administrator may need to disable or enable split lock detection during run time without rebooting the system. The interface /sys/device/system/cpu/split_lock_detect is added to allow the administrator to disable or enable split lock detection and show current split lock detection setting. Writing [yY1] or [oO][nN] to the file enables split lock detection and writing [nN0] or [oO][fF] disables split lock detection. Split lock detection is enabled or disabled on all CPUs. Reading the file returns current global split lock detection setting: 0: disabled 1: enabled Add an ABI document entry for /sys/devices/system/cpu/split_lock_detect. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> --- Not sure if the justification for the sysfs knob is valid. If not, this patch could be removed from this patch set. .../ABI/testing/sysfs-devices-system-cpu | 22 ++++++++ arch/x86/kernel/cpu/intel.c | 52 ++++++++++++++++++- 2 files changed, 72 insertions(+), 2 deletions(-)