Message ID | 20220411101946.20262-1-likexu@tencent.com (mailing list archive) |
---|---|
Headers | show |
Series | KVM: x86/pmu: Add basic support to enable guest PEBS via DS | expand |
Queued, thanks, but only because I have not done my job very well in handling this patch series (and LBR too) and I feel bad about it. Sending such a large patch series with no kvm-unit-tests should not happen, and I'd be grateful if you wrote testcases after the fact. Paolo
Like Xu <like.xu.linux@gmail.com> writes:
...
Hi, the following commit
> KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
(currently in kvm/queue) breaks a number of selftests, e.g.:
# ./tools/testing/selftests/kvm/x86_64/state_test
==== Test Assertion Failure ====
lib/x86_64/processor.c:1207: r == nmsrs
pid=6702 tid=6702 errno=7 - Argument list too long
1 0x000000000040da11: vcpu_save_state at processor.c:1207 (discriminator 4)
2 0x00000000004024e5: main at state_test.c:209 (discriminator 6)
3 0x00007f9f48c2d55f: ?? ??:0
4 0x00007f9f48c2d60b: ?? ??:0
5 0x00000000004026d4: _start at ??:?
Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1)
I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE
in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS.
On 19/5/2022 8:14 pm, Vitaly Kuznetsov wrote: > Like Xu <like.xu.linux@gmail.com> writes: > > ... > > Hi, the following commit > >> KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS > > (currently in kvm/queue) breaks a number of selftests, e.g.: Indeed, e.g.: x86_64/hyperv_clock x86_64/max_vcpuid_cap_test x86_64/mmu_role_test > > # ./tools/testing/selftests/kvm/x86_64/state_test This test continues to be silent after the top commit a3808d884612 ("KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64"), which implies a root cause. Anyway, thanks for this git-bisect report. > ==== Test Assertion Failure ==== > lib/x86_64/processor.c:1207: r == nmsrs > pid=6702 tid=6702 errno=7 - Argument list too long > 1 0x000000000040da11: vcpu_save_state at processor.c:1207 (discriminator 4) > 2 0x00000000004024e5: main at state_test.c:209 (discriminator 6) > 3 0x00007f9f48c2d55f: ?? ??:0 > 4 0x00007f9f48c2d60b: ?? ??:0 > 5 0x00000000004026d4: _start at ??:? > Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1) > > I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE > in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS. >
On 19/5/2022 9:31 pm, Like Xu wrote: > ==== Test Assertion Failure ==== > lib/x86_64/processor.c:1207: r == nmsrs > pid=6702 tid=6702 errno=7 - Argument list too long > 1 0x000000000040da11: vcpu_save_state at processor.c:1207 > (discriminator 4) > 2 0x00000000004024e5: main at state_test.c:209 (discriminator 6) > 3 0x00007f9f48c2d55f: ?? ??:0 > 4 0x00007f9f48c2d60b: ?? ??:0 > 5 0x00000000004026d4: _start at ??:? > Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1) > > I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE > in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS. One of the lessons I learned here is that the members of msrs_to_save_all[] are part of the KVM ABI. We don't add feature-related MSRs until the last step of the KVM exposure feature (in this case, adding MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take effect along with exposing the CPUID bits). Awaiting a ruling from the core guardian on this part of the git-bisect deficiency.
Like Xu <like.xu.linux@gmail.com> writes: > On 19/5/2022 9:31 pm, Like Xu wrote: >> ==== Test Assertion Failure ==== >> lib/x86_64/processor.c:1207: r == nmsrs >> pid=6702 tid=6702 errno=7 - Argument list too long >> 1 0x000000000040da11: vcpu_save_state at processor.c:1207 >> (discriminator 4) >> 2 0x00000000004024e5: main at state_test.c:209 (discriminator 6) >> 3 0x00007f9f48c2d55f: ?? ??:0 >> 4 0x00007f9f48c2d60b: ?? ??:0 >> 5 0x00000000004026d4: _start at ??:? >> Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1) >> >> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE >> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS. > > One of the lessons I learned here is that the members of msrs_to_save_all[] > are part of the KVM ABI. We don't add feature-related MSRs until the last > step of the KVM exposure feature (in this case, adding MSR_IA32_PEBS_ENABLE, > MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take > effect along with exposing the CPUID bits). AFAIR the basic rule here is that whatever gets returned with KVM_GET_MSR_INDEX_LIST can be passed to KVM_GET_MSRS and read successfully by the host (not necessarily by the guest) so my guess is that MSR_IA32_PEBS_ENABLE is now returned in KVM_GET_MSR_INDEX_LIST but can't be read with KVM_GET_MSRS. Later, the expectation is that what was returned by KVM_GET_MSRS can be set successfully with KVM_SET_MSRS.
On 19/5/2022 10:46 pm, Vitaly Kuznetsov wrote: > Like Xu <like.xu.linux@gmail.com> writes: > >> On 19/5/2022 9:31 pm, Like Xu wrote: >>> ==== Test Assertion Failure ==== >>> lib/x86_64/processor.c:1207: r == nmsrs >>> pid=6702 tid=6702 errno=7 - Argument list too long >>> 1 0x000000000040da11: vcpu_save_state at processor.c:1207 >>> (discriminator 4) >>> 2 0x00000000004024e5: main at state_test.c:209 (discriminator 6) >>> 3 0x00007f9f48c2d55f: ?? ??:0 >>> 4 0x00007f9f48c2d60b: ?? ??:0 >>> 5 0x00000000004026d4: _start at ??:? >>> Unexpected result from KVM_GET_MSRS, r: 29 (failed MSR was 0x3f1) >>> >>> I don't think any of these failing tests care about MSR_IA32_PEBS_ENABLE >>> in particular, they're just trying to do KVM_GET_MSRS/KVM_SET_MSRS. >> >> One of the lessons I learned here is that the members of msrs_to_save_all[] >> are part of the KVM ABI. We don't add feature-related MSRs until the last >> step of the KVM exposure feature (in this case, adding MSR_IA32_PEBS_ENABLE, >> MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG to msrs_to_save_all[] should take >> effect along with exposing the CPUID bits). > > AFAIR the basic rule here is that whatever gets returned with > KVM_GET_MSR_INDEX_LIST can be passed to KVM_GET_MSRS and read > successfully by the host (not necessarily by the guest) so my guess is > that MSR_IA32_PEBS_ENABLE is now returned in KVM_GET_MSR_INDEX_LIST but > can't be read with KVM_GET_MSRS. Later, the expectation is that what was > returned by KVM_GET_MSRS can be set successfully with KVM_SET_MSRS. > Thanks for the clarification. Some kvm x86 selftests have been failing due to this issue even after the last commit. I blame myself for not passing the msr_info->host_initiated to the intel_is_valid_msr(), meanwhile I pondered further whether we should check only the MSR addrs range in the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the pmu_set/get_msr(). Vitaly && Paolo, any preference to move forward ? Thanks, Like Xu
On 5/25/22 09:56, Like Xu wrote: > Thanks for the clarification. > > Some kvm x86 selftests have been failing due to this issue even after > the last commit. > > I blame myself for not passing the msr_info->host_initiated to the > intel_is_valid_msr(), > meanwhile I pondered further whether we should check only the MSR addrs > range in > the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the > pmu_set/get_msr(). > > Vitaly && Paolo, any preference to move forward ? I'm not sure what I did wrong to not see the failure, so I'll fix it myself. But from now on, I'll have a hard rule of no new processor features enabled without KVM unit tests or selftests. In fact, it would be nice if you wrote some for PEBS. Paolo
On 25/5/2022 4:14 pm, Paolo Bonzini wrote: > On 5/25/22 09:56, Like Xu wrote: >> Thanks for the clarification. >> >> Some kvm x86 selftests have been failing due to this issue even after the last >> commit. >> >> I blame myself for not passing the msr_info->host_initiated to the >> intel_is_valid_msr(), >> meanwhile I pondered further whether we should check only the MSR addrs range in >> the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the >> pmu_set/get_msr(). >> >> Vitaly && Paolo, any preference to move forward ? > > I'm not sure what I did wrong to not see the failure, so I'll fix it myself. More info, some Skylake hosts fail the tests like x86_64/state_test due to this issue. > > But from now on, I'll have a hard rule of no new processor features enabled > without KVM unit tests or selftests. In fact, it would be nice if you wrote > some for PEBS. Great, my team (or at least me) is committed to contributing more tests on vPMU features. We may update the process document to the Documentation/virt/kvm/review-checklist.rst. > > Paolo >
On Wed, 2022-05-25 at 16:32 +0800, Like Xu wrote: > On 25/5/2022 4:14 pm, Paolo Bonzini wrote: > > On 5/25/22 09:56, Like Xu wrote: > > > Thanks for the clarification. > > > > > > Some kvm x86 selftests have been failing due to this issue even after the last > > > commit. > > > > > > I blame myself for not passing the msr_info->host_initiated to the > > > intel_is_valid_msr(), > > > meanwhile I pondered further whether we should check only the MSR addrs range in > > > the kvm_pmu_is_valid_msr() and apply this kind of sanity check in the > > > pmu_set/get_msr(). > > > > > > Vitaly && Paolo, any preference to move forward ? > > > > I'm not sure what I did wrong to not see the failure, so I'll fix it myself. > > More info, some Skylake hosts fail the tests like x86_64/state_test due to this > issue. > > > But from now on, I'll have a hard rule of no new processor features enabled > > without KVM unit tests or selftests. In fact, it would be nice if you wrote > > some for PEBS. > > Great, my team (or at least me) is committed to contributing more tests on vPMU > features. > > We may update the process document to the > Documentation/virt/kvm/review-checklist.rst. > > > Paolo > > FYI, this patch series also break 'msr' test in kvm-unit tests. (kvm/queue of today, and master of the kvm-unit-tests repo) The test tries to set the MSR_IA32_MISC_ENABLE to 0x400c51889 and gets #GP. Commenting this out, gets rid of #GP, but test still fails with unexpected result if (!msr_info->host_initiated && ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL)) return 1; It is very possible that the test is broken, I'll check this later. Best regards, Maxim Levitsky
On 5/25/22 16:12, Maxim Levitsky wrote: > FYI, this patch series also break 'msr' test in kvm-unit tests. > (kvm/queue of today, and master of the kvm-unit-tests repo) > > The test tries to set the MSR_IA32_MISC_ENABLE to 0x400c51889 and gets #GP. > > > Commenting this out, gets rid of #GP, but test still fails with unexpected result > > if (!msr_info->host_initiated && > ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL)) > return 1; > > > > > It is very possible that the test is broken, I'll check this later. Yes, for that I've sent a patch already: https://lore.kernel.org/kvm/20220520183207.7952-1-pbonzini@redhat.com/ Paolo
On Wed, 2022-05-25 at 16:13 +0200, Paolo Bonzini wrote: > On 5/25/22 16:12, Maxim Levitsky wrote: > > FYI, this patch series also break 'msr' test in kvm-unit tests. > > (kvm/queue of today, and master of the kvm-unit-tests repo) > > > > The test tries to set the MSR_IA32_MISC_ENABLE to 0x400c51889 and gets #GP. > > > > > > Commenting this out, gets rid of #GP, but test still fails with unexpected result > > > > if (!msr_info->host_initiated && > > ((old_val ^ data) & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL)) > > return 1; > > > > > > > > > > It is very possible that the test is broken, I'll check this later. > > Yes, for that I've sent a patch already: > > https://lore.kernel.org/kvm/20220520183207.7952-1-pbonzini@redhat.com/ > > Paolo > Thank you very much! Best regards, Maxim Levitsky