Message ID | 20240104153939.129179-1-pbonzini@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL |
On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
> When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
> MSR emulation for extended PEBS") switched the initialization of
> cpuc->guest_switch_msrs to use compound literals, it screwed up
> the boolean logic:
>
> +	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> ...
> -	arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> -	arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
> +	.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
>
> Before the patch, the value of arr[0].guest would have been intel_ctrl &
> ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
> PEBS events as host-only because, while the guest runs, there is no way
> to tell the processor about the virtual address where to put PEBS records
> intended for the host.
>
> Unfortunately, the new expression can be expanded to
>
> 	(intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
>
> which makes no sense; it includes any bit that isn't *both* marked as
> exclude_guest and using PEBS. So, reinstate the old logic.

I think the old logic will completely disable the PEBS in guest
capability. Because the counter which is assigned to a guest PEBS event
will also be set in the pebs_mask. The old logic disables the counter in
GLOBAL_CTRL in guest. Nothing will be counted.

Like once proposed a fix in the intel_guest_get_msrs().
https://lore.kernel.org/lkml/20231129095055.88060-1-likexu@tencent.com/
It should work for the issue.

Ideally, we should prevent the host PEBS from profiling a guest via
rejecting the event creation in the perf. But I couldn't find a good way
to distinguish host-created PEBS and guest-created PEBS. So Like's
proposal should be a good alternative so far.

Thanks,
Kan

> Another
> way to write it could be "intel_ctrl & ~(cpuc->intel_ctrl_host_mask |
> pebs_mask)", presumably the intention of the author of the faulty.
> However, I personally find the repeated application of A AND NOT B to
> be a bit more readable.
>
> This shows up as guest failures when running concurrent long-running
> perf workloads on the host, and was reported to happen with rcutorture.
> All guests on a given host would die simultaneously with something like an
> instruction fault or a segmentation violation.
>
> Reported-by: Paul E. McKenney <paulmck@kernel.org>
> Analyzed-by: Sean Christopherson <seanjc@google.com>
> Tested-by: Paul E. McKenney <paulmck@kernel.org>
> Cc: stable@vger.kernel.org
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/events/intel/core.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index ce1c777227b4..0f2786d4e405 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4051,12 +4051,17 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
>  	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
>  	int global_ctrl, pebs_enable;
>
> +	/*
> +	 * In addition to obeying exclude_guest/exclude_host, remove bits being
> +	 * used for PEBS when running a guest, because PEBS writes to virtual
> +	 * addresses (not physical addresses).
> +	 */
>  	*nr = 0;
>  	global_ctrl = (*nr)++;
>  	arr[global_ctrl] = (struct perf_guest_switch_msr){
>  		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
>  		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> -		.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> +		.guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
>  	};
>
>  	if (!x86_pmu.pebs)
On Thu, Jan 04, 2024, Liang, Kan wrote:
>
>
> On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
> > When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
> > MSR emulation for extended PEBS") switched the initialization of
> > cpuc->guest_switch_msrs to use compound literals, it screwed up
> > the boolean logic:
> >
> > +	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> > ...
> > -	arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> > -	arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
> > +	.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> >
> > Before the patch, the value of arr[0].guest would have been intel_ctrl &
> > ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
> > PEBS events as host-only because, while the guest runs, there is no way
> > to tell the processor about the virtual address where to put PEBS records
> > intended for the host.
> >
> > Unfortunately, the new expression can be expanded to
> >
> > 	(intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
> >
> > which makes no sense; it includes any bit that isn't *both* marked as
> > exclude_guest and using PEBS. So, reinstate the old logic.
>
> I think the old logic will completely disable the PEBS in guest
> capability. Because the counter which is assigned to a guest PEBS event
> will also be set in the pebs_mask. The old logic disables the counter in
> GLOBAL_CTRL in guest. Nothing will be counted.
>
> Like once proposed a fix in the intel_guest_get_msrs().
> https://lore.kernel.org/lkml/20231129095055.88060-1-likexu@tencent.com/
> It should work for the issue.

No, that patch only affects the path where hardware supports enabling PEBS in the
guest, i.e. intel_guest_get_msrs() will bail before getting to that code due
to the lack of x86_pmu.pebs_ept support, which IIUC is all pre-Icelake Intel CPUs.

	if (!kvm_pmu || !x86_pmu.pebs_ept)
		return arr;
On 2024-01-04 1:22 p.m., Sean Christopherson wrote:
> On Thu, Jan 04, 2024, Liang, Kan wrote:
>>
>>
>> On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
>>> When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
>>> MSR emulation for extended PEBS") switched the initialization of
>>> cpuc->guest_switch_msrs to use compound literals, it screwed up
>>> the boolean logic:
>>>
>>> +	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
>>> ...
>>> -	arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
>>> -	arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
>>> +	.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
>>>
>>> Before the patch, the value of arr[0].guest would have been intel_ctrl &
>>> ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
>>> PEBS events as host-only because, while the guest runs, there is no way
>>> to tell the processor about the virtual address where to put PEBS records
>>> intended for the host.
>>>
>>> Unfortunately, the new expression can be expanded to
>>>
>>> 	(intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
>>>
>>> which makes no sense; it includes any bit that isn't *both* marked as
>>> exclude_guest and using PEBS. So, reinstate the old logic.
>>
>> I think the old logic will completely disable the PEBS in guest
>> capability. Because the counter which is assigned to a guest PEBS event
>> will also be set in the pebs_mask. The old logic disables the counter in
>> GLOBAL_CTRL in guest. Nothing will be counted.
>>
>> Like once proposed a fix in the intel_guest_get_msrs().
>> https://lore.kernel.org/lkml/20231129095055.88060-1-likexu@tencent.com/
>> It should work for the issue.
>
> No, that patch only affects the path where hardware supports enabling PEBS in the
> guest, i.e. intel_guest_get_msrs() will bail before getting to that code due
> to the lack of x86_pmu.pebs_ept support, which IIUC is all pre-Icelake Intel CPUs.
>
> 	if (!kvm_pmu || !x86_pmu.pebs_ept)
> 		return arr;
>

True, we have to disable all PEBS counters for pre-ICL as well.

I think what I missed is that the disable here is temporary. The
arr[global_ctrl].guest will be updated later for the x86_pmu.pebs_ept
platform, so the guest PEBS event should still work.

The patch looks good to me.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>

Thanks,
Kan
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ce1c777227b4..0f2786d4e405 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4051,12 +4051,17 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
 	int global_ctrl, pebs_enable;
 
+	/*
+	 * In addition to obeying exclude_guest/exclude_host, remove bits being
+	 * used for PEBS when running a guest, because PEBS writes to virtual
+	 * addresses (not physical addresses).
+	 */
 	*nr = 0;
 	global_ctrl = (*nr)++;
 	arr[global_ctrl] = (struct perf_guest_switch_msr){
 		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
 		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
-		.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
+		.guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
 	};
 
 	if (!x86_pmu.pebs)
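As a rough illustration of the masking logic this patch restores, the three expressions from the commit message can be compared in user space. This is only a sketch and not kernel code: the function names `guest_buggy`, `guest_fixed` and `guest_demorgan` are hypothetical, `uint64_t` stands in for the kernel's `u64`, and the parameters are plain values standing in for `intel_ctrl`, `cpuc->intel_ctrl_host_mask` and `pebs_mask`.

```c
#include <stdint.h>

/*
 * User-space sketch of the three mask expressions discussed in the thread.
 * All names here are hypothetical stand-ins for the kernel fields.
 */

/* Expression introduced by c59a1f106f5c: clears only bits set in BOTH masks. */
static uint64_t guest_buggy(uint64_t intel_ctrl, uint64_t host_mask,
			    uint64_t pebs_mask)
{
	return intel_ctrl & (~host_mask | ~pebs_mask);
}

/* Expression restored by the patch: clears bits set in EITHER mask. */
static uint64_t guest_fixed(uint64_t intel_ctrl, uint64_t host_mask,
			    uint64_t pebs_mask)
{
	return intel_ctrl & ~host_mask & ~pebs_mask;
}

/* Alternative spelling from the commit message, equal by De Morgan's law. */
static uint64_t guest_demorgan(uint64_t intel_ctrl, uint64_t host_mask,
			       uint64_t pebs_mask)
{
	return intel_ctrl & ~(host_mask | pebs_mask);
}
```

With made-up inputs intel_ctrl = 0xf, host_mask = 0x3 (counters 0 and 1 are exclude_guest) and pebs_mask = 0x6 (counters 1 and 2 use PEBS), guest_buggy() yields 0xd, clearing only counter 1, which is in both masks, so PEBS counter 2 stays enabled while the guest runs. guest_fixed() and guest_demorgan() both yield 0x8, leaving only counter 3 enabled.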