Message ID | 20250331082251.3171276-2-xin@zytor.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | MSR refactor with new MSR instructions support | expand |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
* Xin Li (Intel) <xin@zytor.com> wrote: > - __wrmsr (MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3, val >> 32); > + native_wrmsrl(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3); This is an improvement. > - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, plr->closid); > + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)plr->closid << 32 | rmid_p); > - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, closid_p); > + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid_p << 32 | rmid_p); This is not an improvement. Please provide a native_wrmsrl() API variant where natural [rmid_p, closid_p] high/lo parameters can be used, without the shift-uglification... Thanks, Ingo
On March 31, 2025 3:17:30 AM PDT, Ingo Molnar <mingo@kernel.org> wrote: > >* Xin Li (Intel) <xin@zytor.com> wrote: > >> - __wrmsr (MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3, val >> 32); >> + native_wrmsrl(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3); > >This is an improvement. > >> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, plr->closid); >> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)plr->closid << 32 | rmid_p); > >> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, closid_p); >> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid_p << 32 | rmid_p); > >This is not an improvement. > >Please provide a native_wrmsrl() API variant where natural [rmid_p, closid_p] >high/lo parameters can be used, without the shift-uglification... > >Thanks, > > Ingo Directing this question primarily to Ingo, who is more than anyone else the namespace consistency guardian: On the subject of msr function naming ... *msrl() has always been misleading. The -l suffix usually means 32 bits; sometimes it means the C type "long" (which in the kernel is used instead of size_t/uintptr_t, which might end up being "fun" when 128-bit architectures appear some time this century), but for a fixed 64-but type we normally use -q. Should we rename the *msrl() functions to *msrq() as part of this overhaul?
On 31/03/2025 9:22 am, Xin Li (Intel) wrote: > __wrmsr() is the lowest level primitive MSR write API, and its direct > use is NOT preferred. Use its wrapper function native_wrmsrl() instead. > > No functional change intended. > > Signed-off-by: Xin Li (Intel) <xin@zytor.com> The critical piece of information you're missing from the commit message is that the MSR_IMM instructions take a single u64. Therefore to use them, you've got to arrange for all callers to provide a single u64, rather than a split u32 pair. ~Andrew
On March 31, 2025 2:45:43 PM PDT, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >On 31/03/2025 9:22 am, Xin Li (Intel) wrote: >> __wrmsr() is the lowest level primitive MSR write API, and its direct >> use is NOT preferred. Use its wrapper function native_wrmsrl() instead. >> >> No functional change intended. >> >> Signed-off-by: Xin Li (Intel) <xin@zytor.com> > >The critical piece of information you're missing from the commit message >is that the MSR_IMM instructions take a single u64. > >Therefore to use them, you've got to arrange for all callers to provide >a single u64, rather than a split u32 pair. > >~Andrew That being said, there is nothing wrong with having a two-word convenience wrapper.
On 3/31/2025 10:13 PM, H. Peter Anvin wrote: > On March 31, 2025 2:45:43 PM PDT, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> On 31/03/2025 9:22 am, Xin Li (Intel) wrote: >>> __wrmsr() is the lowest level primitive MSR write API, and its direct >>> use is NOT preferred. Use its wrapper function native_wrmsrl() instead. >>> >>> No functional change intended. >>> >>> Signed-off-by: Xin Li (Intel) <xin@zytor.com> >> >> The critical piece of information you're missing from the commit message >> is that the MSR_IMM instructions take a single u64. >> >> Therefore to use them, you've got to arrange for all callers to provide >> a single u64, rather than a split u32 pair. >> >> ~Andrew > > That being said, there is nothing wrong with having a two-word convenience wrapper. > Yes, I ended up keeping the two-word convenience wrapper in this patch set, and the wrapper calls a lower level API that takes a u64 argument. And yes, as Ingo said, some of the conversion is NOT an improvement.
On 3/31/2025 1:32 PM, H. Peter Anvin wrote: > On March 31, 2025 3:17:30 AM PDT, Ingo Molnar <mingo@kernel.org> wrote: >> >> * Xin Li (Intel) <xin@zytor.com> wrote: >> >>> - __wrmsr (MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3, val >> 32); >>> + native_wrmsrl(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3); >> >> This is an improvement. >> >>> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, plr->closid); >>> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)plr->closid << 32 | rmid_p); >> >>> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, closid_p); >>> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid_p << 32 | rmid_p); >> >> This is not an improvement. >> >> Please provide a native_wrmsrl() API variant where natural [rmid_p, closid_p] >> high/lo parameters can be used, without the shift-uglification... >> >> Thanks, >> >> Ingo > > Directing this question primarily to Ingo, who is more than anyone else the namespace consistency guardian: > > On the subject of msr function naming ... *msrl() has always been misleading. The -l suffix usually means 32 bits; sometimes it means the C type "long" (which in the kernel is used instead of size_t/uintptr_t, which might end up being "fun" when 128-bit architectures appear some time this century), but for a fixed 64-but type we normally use -q. > > Should we rename the *msrl() functions to *msrq() as part of this overhaul? > Per "struct msr" defined in arch/x86/include/asm/shared/msr.h: struct msr { union { struct { u32 l; u32 h; }; u64 q; }; }; Probably *msrq() is what we want?
* H. Peter Anvin <hpa@zytor.com> wrote: > On March 31, 2025 3:17:30 AM PDT, Ingo Molnar <mingo@kernel.org> wrote: > > > >* Xin Li (Intel) <xin@zytor.com> wrote: > > > >> - __wrmsr (MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3, val >> 32); > >> + native_wrmsrl(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3); > > > >This is an improvement. > > > >> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, plr->closid); > >> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)plr->closid << 32 | rmid_p); > > > >> - __wrmsr (MSR_IA32_PQR_ASSOC, rmid_p, closid_p); > >> + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid_p << 32 | rmid_p); > > > >This is not an improvement. > > > >Please provide a native_wrmsrl() API variant where natural [rmid_p, closid_p] > >high/lo parameters can be used, without the shift-uglification... > > > >Thanks, > > > > Ingo > > Directing this question primarily to Ingo, who is more than anyone > else the namespace consistency guardian: > > On the subject of msr function naming ... *msrl() has always been > misleading. The -l suffix usually means 32 bits; sometimes it means > the C type "long" (which in the kernel is used instead of > size_t/uintptr_t, which might end up being "fun" when 128-bit > architectures appear some time this century), but for a fixed 64-but > type we normally use -q. Yeah, agreed - that's been bothering me for a while too. :-) > Should we rename the *msrl() functions to *msrq() as part of this > overhaul? Yeah, that's a good idea, and because talk is cheap I just implemented this in the tip:WIP.x86/msr branch with a couple of other cleanups in this area (see the shortlog & diffstat below), but the churn is high: 144 files changed, 1034 insertions(+), 1034 deletions(-) So this can only be done if regenerated and sent to Linus right before an -rc1 I think: git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip WIP.x86/msr Thanks, Ingo =======================> Ingo Molnar (18): x86/msr: Standardize on u64 in <asm/msr.h> x86/msr: Standardize on u64 in <asm/msr-index.h> x86/msr: Use u64 in rdmsrl_amd_safe() and wrmsrl_amd_safe() x86/msr: Use u64 in rdmsrl_safe() and paravirt_read_pmc() x86/msr: Rename 'rdmsrl()' to 'rdmsrq()' x86/msr: Rename 'wrmsrl()' to 'wrmsrq()' x86/msr: Rename 'rdmsrl_safe()' to 'rdmsrq_safe()' x86/msr: Rename 'wrmsrl_safe()' to 'wrmsrq_safe()' x86/msr: Rename 'rdmsrl_safe_on_cpu()' to 'rdmsrq_safe_on_cpu()' x86/msr: Rename 'wrmsrl_safe_on_cpu()' to 'wrmsrq_safe_on_cpu()' x86/msr: Rename 'rdmsrl_on_cpu()' to 'rdmsrq_on_cpu()' x86/msr: Rename 'wrmsrl_on_cpu()' to 'wrmsrq_on_cpu()' x86/msr: Rename 'mce_rdmsrl()' to 'mce_rdmsrq()' x86/msr: Rename 'mce_wrmsrl()' to 'mce_wrmsrq()' x86/msr: Rename 'rdmsrl_amd_safe()' to 'rdmsrq_amd_safe()' x86/msr: Rename 'wrmsrl_amd_safe()' to 'wrmsrq_amd_safe()' x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()' x86/msr: Rename 'wrmsrl_cstar()' to 'wrmsrq_cstar()' arch/x86/coco/sev/core.c | 2 +- arch/x86/events/amd/brs.c | 8 +- arch/x86/events/amd/core.c | 12 +-- arch/x86/events/amd/ibs.c | 26 ++--- arch/x86/events/amd/lbr.c | 20 ++-- arch/x86/events/amd/power.c | 10 +- arch/x86/events/amd/uncore.c | 12 +-- arch/x86/events/core.c | 42 ++++---- arch/x86/events/intel/core.c | 66 ++++++------- arch/x86/events/intel/cstate.c | 2 +- arch/x86/events/intel/ds.c | 10 +- arch/x86/events/intel/knc.c | 16 +-- arch/x86/events/intel/lbr.c | 44 ++++----- arch/x86/events/intel/p4.c | 24 ++--- arch/x86/events/intel/p6.c | 12 +-- arch/x86/events/intel/pt.c | 32 +++--- arch/x86/events/intel/uncore.c | 2 +- arch/x86/events/intel/uncore_discovery.c | 10 +- arch/x86/events/intel/uncore_nhmex.c | 70 ++++++------- arch/x86/events/intel/uncore_snb.c | 42 ++++---- arch/x86/events/intel/uncore_snbep.c | 50 +++++----- arch/x86/events/msr.c | 2 +- arch/x86/events/perf_event.h | 26 ++--- arch/x86/events/probe.c | 2 +- arch/x86/events/rapl.c | 8 +- arch/x86/events/zhaoxin/core.c | 16 +-- arch/x86/hyperv/hv_apic.c | 4 +- arch/x86/hyperv/hv_init.c | 66 ++++++------- arch/x86/hyperv/hv_spinlock.c | 6 +- arch/x86/hyperv/ivm.c | 2 +- arch/x86/include/asm/apic.h | 8 +- arch/x86/include/asm/debugreg.h | 4 +- arch/x86/include/asm/fsgsbase.h | 4 +- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/microcode.h | 2 +- arch/x86/include/asm/msr-index.h | 12 +-- arch/x86/include/asm/msr.h | 50 +++++----- arch/x86/include/asm/paravirt.h | 8 +- arch/x86/include/asm/spec-ctrl.h | 2 +- arch/x86/kernel/acpi/cppc.c | 8 +- arch/x86/kernel/amd_nb.c | 2 +- arch/x86/kernel/apic/apic.c | 16 +-- arch/x86/kernel/apic/apic_numachip.c | 6 +- arch/x86/kernel/cet.c | 2 +- arch/x86/kernel/cpu/amd.c | 28 +++--- arch/x86/kernel/cpu/aperfmperf.c | 28 +++--- arch/x86/kernel/cpu/bugs.c | 24 ++--- arch/x86/kernel/cpu/bus_lock.c | 18 ++-- arch/x86/kernel/cpu/common.c | 68 ++++++------- arch/x86/kernel/cpu/feat_ctl.c | 4 +- arch/x86/kernel/cpu/hygon.c | 6 +- arch/x86/kernel/cpu/intel.c | 10 +- arch/x86/kernel/cpu/intel_epb.c | 12 +-- arch/x86/kernel/cpu/mce/amd.c | 22 ++--- arch/x86/kernel/cpu/mce/core.c | 58 +++++------ arch/x86/kernel/cpu/mce/inject.c | 32 +++--- arch/x86/kernel/cpu/mce/intel.c | 32 +++--- arch/x86/kernel/cpu/mce/internal.h | 2 +- arch/x86/kernel/cpu/microcode/amd.c | 2 +- arch/x86/kernel/cpu/microcode/intel.c | 2 +- arch/x86/kernel/cpu/mshyperv.c | 12 +-- arch/x86/kernel/cpu/resctrl/core.c | 10 +- arch/x86/kernel/cpu/resctrl/monitor.c | 2 +- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 +- arch/x86/kernel/cpu/sgx/main.c | 2 +- arch/x86/kernel/cpu/topology.c | 2 +- arch/x86/kernel/cpu/topology_amd.c | 4 +- arch/x86/kernel/cpu/tsx.c | 20 ++-- arch/x86/kernel/cpu/umwait.c | 2 +- arch/x86/kernel/fpu/core.c | 2 +- arch/x86/kernel/fpu/xstate.c | 10 +- arch/x86/kernel/fpu/xstate.h | 2 +- arch/x86/kernel/fred.c | 20 ++-- arch/x86/kernel/hpet.c | 2 +- arch/x86/kernel/kvm.c | 28 +++--- arch/x86/kernel/kvmclock.c | 4 +- arch/x86/kernel/mmconf-fam10h_64.c | 8 +- arch/x86/kernel/process.c | 16 +-- arch/x86/kernel/process_64.c | 20 ++-- arch/x86/kernel/reboot_fixups_32.c | 2 +- arch/x86/kernel/shstk.c | 18 ++-- arch/x86/kernel/traps.c | 10 +- arch/x86/kernel/tsc.c | 2 +- arch/x86/kernel/tsc_sync.c | 14 +-- arch/x86/kvm/svm/avic.c | 2 +- arch/x86/kvm/svm/sev.c | 2 +- arch/x86/kvm/svm/svm.c | 16 +-- arch/x86/kvm/vmx/nested.c | 4 +- arch/x86/kvm/vmx/pmu_intel.c | 4 +- arch/x86/kvm/vmx/sgx.c | 8 +- arch/x86/kvm/vmx/vmx.c | 66 ++++++------- arch/x86/kvm/x86.c | 38 ++++---- arch/x86/lib/insn-eval.c | 6 +- arch/x86/lib/msr-smp.c | 16 +-- arch/x86/lib/msr.c | 4 +- arch/x86/mm/pat/memtype.c | 4 +- arch/x86/mm/tlb.c | 2 +- arch/x86/pci/amd_bus.c | 10 +- arch/x86/platform/olpc/olpc-xo1-rtc.c | 6 +- arch/x86/platform/olpc/olpc-xo1-sci.c | 2 +- arch/x86/power/cpu.c | 26 ++--- arch/x86/realmode/init.c | 2 +- arch/x86/virt/svm/sev.c | 20 ++-- arch/x86/xen/suspend.c | 6 +- drivers/acpi/acpi_extlog.c | 2 +- drivers/acpi/acpi_lpit.c | 2 +- drivers/cpufreq/acpi-cpufreq.c | 8 +- drivers/cpufreq/amd-pstate-ut.c | 6 +- drivers/cpufreq/amd-pstate.c | 22 ++--- drivers/cpufreq/amd_freq_sensitivity.c | 2 +- drivers/cpufreq/e_powersaver.c | 6 +- drivers/cpufreq/intel_pstate.c | 108 ++++++++++----------- drivers/cpufreq/longhaul.c | 24 ++--- drivers/cpufreq/powernow-k7.c | 14 +-- drivers/crypto/ccp/sev-dev.c | 2 +- drivers/edac/amd64_edac.c | 6 +- drivers/gpu/drm/i915/selftests/librapl.c | 4 +- drivers/hwmon/fam15h_power.c | 6 +- drivers/idle/intel_idle.c | 34 +++---- drivers/mtd/nand/raw/cs553x_nand.c | 6 +- drivers/platform/x86/intel/ifs/core.c | 4 +- drivers/platform/x86/intel/ifs/load.c | 20 ++-- drivers/platform/x86/intel/ifs/runtest.c | 16 +-- drivers/platform/x86/intel/pmc/cnp.c | 6 +- drivers/platform/x86/intel/pmc/core.c | 8 +- .../x86/intel/speed_select_if/isst_if_common.c | 18 ++-- .../x86/intel/speed_select_if/isst_if_mbox_msr.c | 14 +-- .../x86/intel/speed_select_if/isst_tpmi_core.c | 2 +- drivers/platform/x86/intel/tpmi_power_domains.c | 4 +- drivers/platform/x86/intel/turbo_max_3.c | 4 +- .../x86/intel/uncore-frequency/uncore-frequency.c | 10 +- drivers/platform/x86/intel_ips.c | 36 +++---- drivers/powercap/intel_rapl_msr.c | 6 +- .../int340x_thermal/processor_thermal_device.c | 2 +- drivers/thermal/intel/intel_hfi.c | 14 +-- drivers/thermal/intel/intel_powerclamp.c | 4 +- drivers/thermal/intel/intel_tcc_cooling.c | 4 +- drivers/thermal/intel/therm_throt.c | 10 +- drivers/video/fbdev/geode/gxfb_core.c | 2 +- drivers/video/fbdev/geode/lxfb_ops.c | 22 ++--- drivers/video/fbdev/geode/suspend_gx.c | 10 +- drivers/video/fbdev/geode/video_gx.c | 16 +-- include/hyperv/hvgdk_mini.h | 2 +- 144 files changed, 1034 insertions(+), 1034 deletions(-)
diff --git a/arch/x86/events/amd/brs.c b/arch/x86/events/amd/brs.c index ec3427463382..4a47f3c108de 100644 --- a/arch/x86/events/amd/brs.c +++ b/arch/x86/events/amd/brs.c @@ -44,7 +44,7 @@ static inline unsigned int brs_to(int idx) static __always_inline void set_debug_extn_cfg(u64 val) { /* bits[4:3] must always be set to 11b */ - __wrmsr(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3, val >> 32); + native_wrmsrl(MSR_AMD_DBG_EXTN_CFG, val | 3ULL << 3); } static __always_inline u64 get_debug_extn_cfg(void) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index c903d358405d..3345a819c859 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -214,7 +214,7 @@ static inline void native_apic_msr_write(u32 reg, u32 v) static inline void native_apic_msr_eoi(void) { - __wrmsr(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK, 0); + native_wrmsrl(APIC_BASE_MSR + (APIC_EOI >> 4), APIC_EOI_ACK); } static inline u32 native_apic_msr_read(u32 reg) diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 9397a319d165..27ea8793705d 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -144,10 +144,12 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr, static inline void notrace native_write_msr(unsigned int msr, u32 low, u32 high) { - __wrmsr(msr, low, high); + u64 val = (u64)high << 32 | low; + + native_wrmsrl(msr, val); if (tracepoint_enabled(write_msr)) - do_trace_write_msr(msr, ((u64)high << 32 | low), 0); + do_trace_write_msr(msr, val, 0); } /* Can be uninlined because referenced by paravirt */ diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 1f14c3308b6b..0eaeaba12df2 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -1306,7 +1306,7 @@ static noinstr bool mce_check_crashing_cpu(void) } if (mcgstatus & MCG_STATUS_RIPV) { - __wrmsr(MSR_IA32_MCG_STATUS, 0, 0); + native_wrmsrl(MSR_IA32_MCG_STATUS, 0); return true; } } diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index 01fa7890b43f..55536120c8d1 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -481,7 +481,7 @@ int resctrl_arch_pseudo_lock_fn(void *_plr) * cache. */ saved_msr = __rdmsr(MSR_MISC_FEATURE_CONTROL); - __wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0); + native_wrmsrl(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits); closid_p = this_cpu_read(pqr_state.cur_closid); rmid_p = this_cpu_read(pqr_state.cur_rmid); mem_r = plr->kmem; @@ -493,7 +493,7 @@ int resctrl_arch_pseudo_lock_fn(void *_plr) * pseudo-locked followed by reading of kernel memory to load it * into the cache. */ - __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, plr->closid); + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)plr->closid << 32 | rmid_p); /* * Cache was flushed earlier. Now access kernel memory to read it @@ -530,7 +530,7 @@ int resctrl_arch_pseudo_lock_fn(void *_plr) * Critical section end: restore closid with capacity bitmask that * does not overlap with pseudo-locked region. */ - __wrmsr(MSR_IA32_PQR_ASSOC, rmid_p, closid_p); + native_wrmsrl(MSR_IA32_PQR_ASSOC, (u64)closid_p << 32 | rmid_p); /* Re-enable the hardware prefetcher(s) */ wrmsrl(MSR_MISC_FEATURE_CONTROL, saved_msr);
__wrmsr() is the lowest level primitive MSR write API, and its direct use is NOT preferred. Use its wrapper function native_wrmsrl() instead. No functional change intended. Signed-off-by: Xin Li (Intel) <xin@zytor.com> --- arch/x86/events/amd/brs.c | 2 +- arch/x86/include/asm/apic.h | 2 +- arch/x86/include/asm/msr.h | 6 ++++-- arch/x86/kernel/cpu/mce/core.c | 2 +- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 6 +++--- 5 files changed, 10 insertions(+), 8 deletions(-)