Message ID | 2f16f83e-ed60-fcb7-7f3d-0fa216c41cb9@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: arm64: Fix smp_processor_id() call in preemptible context | expand |
On Tue, Jun 06, 2023, Sebastian Ott wrote: > Fixes: 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") > Signed-off-by: Sebastian Ott <sebott@redhat.com> > --- > arch/arm64/kvm/pmu-emul.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > index 491ca7eb2a4c..f9e4e4334875 100644 > --- a/arch/arm64/kvm/pmu-emul.c > +++ b/arch/arm64/kvm/pmu-emul.c > @@ -700,6 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > > mutex_lock(&arm_pmus_lock); > > + preempt_disable(); get_cpu() + put_cpu() would be more succinct and self-documenting. > cpu = smp_processor_id(); > list_for_each_entry(entry, &arm_pmus, entry) { > tmp = entry->arm_pmu; > @@ -709,7 +710,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > break; > } > } > - > + preempt_enable(); > mutex_unlock(&arm_pmus_lock); > > return pmu; > -- > 2.40.1 >
Hi Sebastian, On Tue, Jun 06, 2023 at 12:37:30PM +0200, Sebastian Ott wrote: > Commit 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for > default PMU") introduced a smp_processor_id() call in preemtible context: > > [70506.110187] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/3078242 > [70506.119077] caller is debug_smp_processor_id+0x20/0x30 > [70506.124229] CPU: 129 PID: 3078242 Comm: qemu-system-aar Tainted: G W 6.4.0-rc5 #25 > [70506.133176] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020 > [70506.140559] Call trace: > [70506.142993] dump_backtrace+0xa4/0x130 > [70506.146737] show_stack+0x20/0x38 > [70506.150040] dump_stack_lvl+0x48/0x60 > [70506.153704] dump_stack+0x18/0x28 > [70506.157007] check_preemption_disabled+0xe4/0x108 > [70506.161701] debug_smp_processor_id+0x20/0x30 > [70506.166046] kvm_arm_pmu_v3_set_attr+0x460/0x628 > [70506.170662] kvm_arm_vcpu_arch_set_attr+0x88/0xd8 > [70506.175363] kvm_arch_vcpu_ioctl+0x258/0x4a8 > [70506.179632] kvm_vcpu_ioctl+0x32c/0x6b8 > [70506.183465] __arm64_sys_ioctl+0xb4/0x100 > [70506.187467] invoke_syscall+0x78/0x108 > [70506.191205] el0_svc_common.constprop.0+0x4c/0x100 > [70506.195984] do_el0_svc+0x34/0x50 > [70506.199287] el0_svc+0x34/0x108 > [70506.202416] el0t_64_sync_handler+0xf4/0x120 > [70506.206674] el0t_64_sync+0x194/0x198 > > Just disable preemption for this section. The call from a preemptible context is intentional, so this really should just be raw_smp_processor_id(). Do you mind if we fix it with the following? From 2f4680ee6a5aea5c3cf826c84b86172b0b2c1a67 Mon Sep 17 00:00:00 2001 From: Oliver Upton <oliver.upton@linux.dev> Date: Tue, 6 Jun 2023 06:44:54 -0700 Subject: [PATCH] KVM: arm64: Use raw_smp_processor_id() in kvm_pmu_probe_armpmu() Sebastian reports that commit 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") introduced the following splat with CONFIG_DEBUG_PREEMPT enabled: [70506.110187] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/3078242 [70506.119077] caller is debug_smp_processor_id+0x20/0x30 [70506.124229] CPU: 129 PID: 3078242 Comm: qemu-system-aar Tainted: G W 6.4.0-rc5 #25 [70506.133176] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020 [70506.140559] Call trace: [70506.142993] dump_backtrace+0xa4/0x130 [70506.146737] show_stack+0x20/0x38 [70506.150040] dump_stack_lvl+0x48/0x60 [70506.153704] dump_stack+0x18/0x28 [70506.157007] check_preemption_disabled+0xe4/0x108 [70506.161701] debug_smp_processor_id+0x20/0x30 [70506.166046] kvm_arm_pmu_v3_set_attr+0x460/0x628 [70506.170662] kvm_arm_vcpu_arch_set_attr+0x88/0xd8 [70506.175363] kvm_arch_vcpu_ioctl+0x258/0x4a8 [70506.179632] kvm_vcpu_ioctl+0x32c/0x6b8 [70506.183465] __arm64_sys_ioctl+0xb4/0x100 [70506.187467] invoke_syscall+0x78/0x108 [70506.191205] el0_svc_common.constprop.0+0x4c/0x100 [70506.195984] do_el0_svc+0x34/0x50 [70506.199287] el0_svc+0x34/0x108 [70506.202416] el0t_64_sync_handler+0xf4/0x120 [70506.206674] el0t_64_sync+0x194/0x198 Nonetheless, there's no functional requirement for disabling preemption, as the cpu # is only used to walk the arm_pmus list. Fix it by using raw_smp_processor_id() instead. Fixes: 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") Reported-by: Sebastian Ott <sebott@redhat.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> --- arch/arm64/kvm/pmu-emul.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 491ca7eb2a4c..933a6331168b 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -700,7 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) mutex_lock(&arm_pmus_lock); - cpu = smp_processor_id(); + cpu = raw_smp_processor_id(); list_for_each_entry(entry, &arm_pmus, entry) { tmp = entry->arm_pmu; base-commit: 9561de3a55bed6bdd44a12820ba81ec416e705a7
On Tue, 6 Jun 2023, Oliver Upton wrote: > The call from a preemptible context is intentional, so this really > should just be raw_smp_processor_id(). Do you mind if we fix it with the > following? Works for me. Thanks, Sebastian
On Tue, Jun 06, 2023, Oliver Upton wrote: > The call from a preemptible context is intentional, so this really > should just be raw_smp_processor_id(). Do you mind if we fix it with the > following? ... > Nonetheless, there's no functional requirement for disabling preemption, > as the cpu # is only used to walk the arm_pmus list. Fix it by using > raw_smp_processor_id() instead. As a partial outsider, that needs an explanation, and the code could really use a comment. I assume KVM's ABI is that it's userspace's responsibility to ensure that the CPU(s) used for KVM_RUN is compatible with the CPU used for KVM_ARM_VCPU_PMU_V3_CTRL, but neither the original changelog nor the above state that, nor does anything explain what happens if userspace doesn't uphold its side of things. That stuff might be covered in documentation somewhere, but for someone just looking at git blame, this is all very magical. > Fixes: 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") > Reported-by: Sebastian Ott <sebott@redhat.com> > Signed-off-by: Oliver Upton <oliver.upton@linux.dev> > --- > arch/arm64/kvm/pmu-emul.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > index 491ca7eb2a4c..933a6331168b 100644 > --- a/arch/arm64/kvm/pmu-emul.c > +++ b/arch/arm64/kvm/pmu-emul.c > @@ -700,7 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > > mutex_lock(&arm_pmus_lock); > > - cpu = smp_processor_id(); > + cpu = raw_smp_processor_id(); > list_for_each_entry(entry, &arm_pmus, entry) { > tmp = entry->arm_pmu; > > > base-commit: 9561de3a55bed6bdd44a12820ba81ec416e705a7 > -- > 2.41.0.rc0.172.g3f132b7071-goog >
On Tue, Jun 06, 2023 at 07:29:16AM -0700, Sean Christopherson wrote: > On Tue, Jun 06, 2023, Oliver Upton wrote: > > The call from a preemptible context is intentional, so this really > > should just be raw_smp_processor_id(). Do you mind if we fix it with the > > following? > > ... > > > Nonetheless, there's no functional requirement for disabling preemption, > > as the cpu # is only used to walk the arm_pmus list. Fix it by using > > raw_smp_processor_id() instead. > > As a partial outsider, that needs an explanation, and the code could really use a > comment. I assume KVM's ABI is that it's userspace's responsibility to ensure that > the CPU(s) used for KVM_RUN is compatible with the CPU used for KVM_ARM_VCPU_PMU_V3_CTRL, > but neither the original changelog nor the above state that, nor does anything > explain what happens if userspace doesn't uphold its side of things. See commit 40e54cad4540 ("KVM: arm64: Document default vPMU behavior on heterogeneous systems"), which documents the subtlety of vCPU scheduling with the 'old' ABI at the callsite of this function. I don't want to bleed any details of this crap into user documentation, since the entire scheme is irretrievably broken. See Documentation/virt/kvm/api/devices/vcpu.rst 1.4 for the 'new' ABI where userspace explicitly selects a vPMU instance. > That stuff might be covered in documentation somewhere, but for someone > just looking at git blame, this is all very magical. Personally, I find any other fix that involves disabling preemption to be quite a lot more 'magical', as there isn't any percpu data we're working with in the loop.
On Tue, Jun 06, 2023, Oliver Upton wrote: > On Tue, Jun 06, 2023 at 07:29:16AM -0700, Sean Christopherson wrote: > > On Tue, Jun 06, 2023, Oliver Upton wrote: > > > The call from a preemptible context is intentional, so this really > > > should just be raw_smp_processor_id(). Do you mind if we fix it with the > > > following? > > > > ... > > > > > Nonetheless, there's no functional requirement for disabling preemption, > > > as the cpu # is only used to walk the arm_pmus list. Fix it by using > > > raw_smp_processor_id() instead. > > > > As a partial outsider, that needs an explanation, and the code could really use a > > comment. I assume KVM's ABI is that it's userspace's responsibility to ensure that > > the CPU(s) used for KVM_RUN is compatible with the CPU used for KVM_ARM_VCPU_PMU_V3_CTRL, > > but neither the original changelog nor the above state that, nor does anything > > explain what happens if userspace doesn't uphold its side of things. > > See commit 40e54cad4540 ("KVM: arm64: Document default vPMU behavior on > heterogeneous systems"), which documents the subtlety of vCPU scheduling > with the 'old' ABI at the callsite of this function. I don't want to > bleed any details of this crap into user documentation, since the entire > scheme is irretrievably broken. I wasn't suggesting adding anything to user documentation, I was suggesting/asking that the changelog explain the assertion "there's no functional requirement for disabling preemption". > See Documentation/virt/kvm/api/devices/vcpu.rst 1.4 for the 'new' ABI > where userspace explicitly selects a vPMU instance. > > > That stuff might be covered in documentation somewhere, but for someone > > just looking at git blame, this is all very magical. > > Personally, I find any other fix that involves disabling preemption to be > quite a lot more 'magical', as there isn't any percpu data we're working > with in the loop. Heh, and? I wasn't suggesting you take Sebastian's fix, nor was I arguing that disabling preemption is any more or less magical. I was simply calling out that for folks that aren't already familiar with the code, it's not obvious why using an unstable CPU is functionally correct. Poking around more, it looks like the answer is that kvm_arch_vcpu_load() will mark the vCPU as being loaded on an unsupported pCPU, which will prevent running on the target pCPU, and thus presumably prevent creating new perf events on the incompatible pCPU. Though following the breadcrumbs from the commit you referenced above, the set of "supported" CPUs at this point is all possible CPUs in the system. So unless I'm missing something, testing that the current, unstable CPU is a "supported" CPU is unnecessary because running on an impossible CPU is, um, impossible. I.e. can't you just do? --- arch/arm64/kvm/pmu-emul.c | 31 +++++++------------------------ 1 file changed, 7 insertions(+), 24 deletions(-) diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 491ca7eb2a4c..b899d580ff73 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -692,29 +692,6 @@ void kvm_host_pmu_init(struct arm_pmu *pmu) mutex_unlock(&arm_pmus_lock); } -static struct arm_pmu *kvm_pmu_probe_armpmu(void) -{ - struct arm_pmu *tmp, *pmu = NULL; - struct arm_pmu_entry *entry; - int cpu; - - mutex_lock(&arm_pmus_lock); - - cpu = smp_processor_id(); - list_for_each_entry(entry, &arm_pmus, entry) { - tmp = entry->arm_pmu; - - if (cpumask_test_cpu(cpu, &tmp->supported_cpus)) { - pmu = tmp; - break; - } - } - - mutex_unlock(&arm_pmus_lock); - - return pmu; -} - u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool pmceid1) { unsigned long *bmap = vcpu->kvm->arch.pmu_filter; @@ -900,8 +877,14 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr) * upholds the preexisting behavior on heterogeneous systems * where vCPUs can be scheduled on any core but the guest * counters could stop working. + * + * In short, the set of "supported" CPUS for the default PMU is + * all possible CPUs in the system, simply grab the first PMU. */ - kvm->arch.arm_pmu = kvm_pmu_probe_armpmu(); + mutex_lock(&arm_pmus_lock); + kvm->arch.arm_pmu = list_first_entry_or_null(&arm_pmus); + mutex_unlock(&arm_pmus_lock); + if (!kvm->arch.arm_pmu) return -ENODEV; } base-commit: 40e54cad454076172cc3e2bca60aa924650a3e4b
On Tue, 06 Jun 2023 15:10:44 +0100, Oliver Upton <oliver.upton@linux.dev> wrote: > > Hi Sebastian, > > On Tue, Jun 06, 2023 at 12:37:30PM +0200, Sebastian Ott wrote: > > Commit 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for > > default PMU") introduced a smp_processor_id() call in preemtible context: > > > > [70506.110187] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/3078242 > > [70506.119077] caller is debug_smp_processor_id+0x20/0x30 > > [70506.124229] CPU: 129 PID: 3078242 Comm: qemu-system-aar Tainted: G W 6.4.0-rc5 #25 > > [70506.133176] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020 > > [70506.140559] Call trace: > > [70506.142993] dump_backtrace+0xa4/0x130 > > [70506.146737] show_stack+0x20/0x38 > > [70506.150040] dump_stack_lvl+0x48/0x60 > > [70506.153704] dump_stack+0x18/0x28 > > [70506.157007] check_preemption_disabled+0xe4/0x108 > > [70506.161701] debug_smp_processor_id+0x20/0x30 > > [70506.166046] kvm_arm_pmu_v3_set_attr+0x460/0x628 > > [70506.170662] kvm_arm_vcpu_arch_set_attr+0x88/0xd8 > > [70506.175363] kvm_arch_vcpu_ioctl+0x258/0x4a8 > > [70506.179632] kvm_vcpu_ioctl+0x32c/0x6b8 > > [70506.183465] __arm64_sys_ioctl+0xb4/0x100 > > [70506.187467] invoke_syscall+0x78/0x108 > > [70506.191205] el0_svc_common.constprop.0+0x4c/0x100 > > [70506.195984] do_el0_svc+0x34/0x50 > > [70506.199287] el0_svc+0x34/0x108 > > [70506.202416] el0t_64_sync_handler+0xf4/0x120 > > [70506.206674] el0t_64_sync+0x194/0x198 > > > > Just disable preemption for this section. > > The call from a preemptible context is intentional, so this really > should just be raw_smp_processor_id(). Do you mind if we fix it with the > following? > > From 2f4680ee6a5aea5c3cf826c84b86172b0b2c1a67 Mon Sep 17 00:00:00 2001 > From: Oliver Upton <oliver.upton@linux.dev> > Date: Tue, 6 Jun 2023 06:44:54 -0700 > Subject: [PATCH] KVM: arm64: Use raw_smp_processor_id() in > kvm_pmu_probe_armpmu() > > Sebastian reports that commit 1c913a1c35aa ("KVM: arm64: Iterate > arm_pmus list to probe for default PMU") introduced the following splat > with CONFIG_DEBUG_PREEMPT enabled: > > [70506.110187] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/3078242 > [70506.119077] caller is debug_smp_processor_id+0x20/0x30 > [70506.124229] CPU: 129 PID: 3078242 Comm: qemu-system-aar Tainted: G W 6.4.0-rc5 #25 > [70506.133176] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020 > [70506.140559] Call trace: > [70506.142993] dump_backtrace+0xa4/0x130 > [70506.146737] show_stack+0x20/0x38 > [70506.150040] dump_stack_lvl+0x48/0x60 > [70506.153704] dump_stack+0x18/0x28 > [70506.157007] check_preemption_disabled+0xe4/0x108 > [70506.161701] debug_smp_processor_id+0x20/0x30 > [70506.166046] kvm_arm_pmu_v3_set_attr+0x460/0x628 > [70506.170662] kvm_arm_vcpu_arch_set_attr+0x88/0xd8 > [70506.175363] kvm_arch_vcpu_ioctl+0x258/0x4a8 > [70506.179632] kvm_vcpu_ioctl+0x32c/0x6b8 > [70506.183465] __arm64_sys_ioctl+0xb4/0x100 > [70506.187467] invoke_syscall+0x78/0x108 > [70506.191205] el0_svc_common.constprop.0+0x4c/0x100 > [70506.195984] do_el0_svc+0x34/0x50 > [70506.199287] el0_svc+0x34/0x108 > [70506.202416] el0t_64_sync_handler+0xf4/0x120 > [70506.206674] el0t_64_sync+0x194/0x198 > > Nonetheless, there's no functional requirement for disabling preemption, > as the cpu # is only used to walk the arm_pmus list. Fix it by using > raw_smp_processor_id() instead. > > Fixes: 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") > Reported-by: Sebastian Ott <sebott@redhat.com> > Signed-off-by: Oliver Upton <oliver.upton@linux.dev> > --- > arch/arm64/kvm/pmu-emul.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > index 491ca7eb2a4c..933a6331168b 100644 > --- a/arch/arm64/kvm/pmu-emul.c > +++ b/arch/arm64/kvm/pmu-emul.c > @@ -700,7 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > > mutex_lock(&arm_pmus_lock); > > - cpu = smp_processor_id(); > + cpu = raw_smp_processor_id(); > list_for_each_entry(entry, &arm_pmus, entry) { > tmp = entry->arm_pmu; > > If preemption doesn't matter (and I really don't think it does), why are we looking for a the current CPU? I'd rather we pick the PMU that is associated with CPU0 (we're pretty sure it exists), and be done with it. Something like: diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 491ca7eb2a4c..fce9d07fe26b 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -696,15 +696,14 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) { struct arm_pmu *tmp, *pmu = NULL; struct arm_pmu_entry *entry; - int cpu; mutex_lock(&arm_pmus_lock); - cpu = smp_processor_id(); list_for_each_entry(entry, &arm_pmus, entry) { tmp = entry->arm_pmu; - if (cpumask_test_cpu(cpu, &tmp->supported_cpus)) { + /* Pick the CPU associated with CPU0 as the default */ + if (cpumask_test_cpu(0, &tmp->supported_cpus)) { pmu = tmp; break; } At least, it saves us wondering about the rationale for picking one or the other. Thanks, M.
On Tue, Jun 06, 2023 at 05:17:34PM +0100, Marc Zyngier wrote: > On Tue, 06 Jun 2023 15:10:44 +0100, Oliver Upton <oliver.upton@linux.dev> wrote: > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > > index 491ca7eb2a4c..933a6331168b 100644 > > --- a/arch/arm64/kvm/pmu-emul.c > > +++ b/arch/arm64/kvm/pmu-emul.c > > @@ -700,7 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > > > > mutex_lock(&arm_pmus_lock); > > > > - cpu = smp_processor_id(); > > + cpu = raw_smp_processor_id(); > > list_for_each_entry(entry, &arm_pmus, entry) { > > tmp = entry->arm_pmu; > > > > > > If preemption doesn't matter (and I really don't think it does), why > are we looking for a the current CPU? I'd rather we pick the PMU that > is associated with CPU0 (we're pretty sure it exists), and be done > with it. Getting the current CPU is still useful, we just don't care about that cpu# being stale. Unconditionally using CPU0 could break existing usage patterns. A not-too-contrived example would be to taskset QEMU onto a cluster of cores in a big.LITTLE system (I do this). The current behavior would assign the right PMU to the guest. I've made my opinions about the 'old' ABI quite clear, but I don't have too great of an appetite for breakage, though fragile. Can we proceed with the fix I had suggested along with a more complete description of the baggage that we're carrying?
On Tue, Jun 06, 2023 at 03:46:09PM +0000, Sean Christopherson wrote: > On Tue, Jun 06, 2023, Oliver Upton wrote: > > On Tue, Jun 06, 2023 at 07:29:16AM -0700, Sean Christopherson wrote: > > > On Tue, Jun 06, 2023, Oliver Upton wrote: > > > > The call from a preemptible context is intentional, so this really > > > > should just be raw_smp_processor_id(). Do you mind if we fix it with the > > > > following? > > > > > > ... > > > > > > > Nonetheless, there's no functional requirement for disabling preemption, > > > > as the cpu # is only used to walk the arm_pmus list. Fix it by using > > > > raw_smp_processor_id() instead. > > > > > > As a partial outsider, that needs an explanation, and the code could really use a > > > comment. I assume KVM's ABI is that it's userspace's responsibility to ensure that > > > the CPU(s) used for KVM_RUN is compatible with the CPU used for KVM_ARM_VCPU_PMU_V3_CTRL, > > > but neither the original changelog nor the above state that, nor does anything > > > explain what happens if userspace doesn't uphold its side of things. > > > > See commit 40e54cad4540 ("KVM: arm64: Document default vPMU behavior on > > heterogeneous systems"), which documents the subtlety of vCPU scheduling > > with the 'old' ABI at the callsite of this function. I don't want to > > bleed any details of this crap into user documentation, since the entire > > scheme is irretrievably broken. > > I wasn't suggesting adding anything to user documentation, I was suggesting/asking > that the changelog explain the assertion "there's no functional requirement for > disabling preemption". Let's continue this on Marc's reply (don't want to spread context over multiple threads). > Poking around more, it looks like the answer is that kvm_arch_vcpu_load() will > mark the vCPU as being loaded on an unsupported pCPU, which will prevent running > on the target pCPU, and thus presumably prevent creating new perf events on the > incompatible pCPU. That's only the case with the 'new' ABI, where userspace has explicitly selected a PMU instance. The answer for the 'old' interface is here: /* * No PMU set, get the default one. * * The observant among you will notice that the supported_cpus * mask does not get updated for the default PMU even though it * is quite possible the selected instance supports only a * subset of cores in the system. This is intentional, and * upholds the preexisting behavior on heterogeneous systems * where vCPUs can be scheduled on any core but the guest * counters could stop working. */ kvm->arch.arm_pmu = kvm_pmu_probe_armpmu(); > Though following the breadcrumbs from the commit you referenced above, the set of > "supported" CPUs at this point is all possible CPUs in the system. So unless I'm > missing something, testing that the current, unstable CPU is a "supported" CPU is > unnecessary because running on an impossible CPU is, um, impossible. Careful -- arm_pmu->supported_cpus is different from kvm->arch.supported_cpus. The former describes a PMU instance in the system and the latter enforces our scheduling requirements when userspace _explicitly_ sets a PMU type, i.e. KVM_ARM_VCPU_PMU_V3_SET_PMU. So it is probable that some of the PMUs in the arm_pmus list *do not* support the calling CPU. > I.e. can't you just do? No, this would break the 'old' ABI, but if we decide to deliberately break it I prefer your suggestion over using CPU0.
On Tue, Jun 06, 2023, Oliver Upton wrote: > On Tue, Jun 06, 2023 at 03:46:09PM +0000, Sean Christopherson wrote: > > I.e. can't you just do? > > No, this would break the 'old' ABI, but if we decide to deliberately > break it I prefer your suggestion over using CPU0. Ah, if userspace pins the task when doing KVM_ARM_VCPU_PMU_V3_CTRL, then factoring in the pCPU matters.
On Tue, 06 Jun 2023 17:48:14 +0100, Oliver Upton <oliver.upton@linux.dev> wrote: > > On Tue, Jun 06, 2023 at 05:17:34PM +0100, Marc Zyngier wrote: > > On Tue, 06 Jun 2023 15:10:44 +0100, Oliver Upton <oliver.upton@linux.dev> wrote: > > > diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c > > > index 491ca7eb2a4c..933a6331168b 100644 > > > --- a/arch/arm64/kvm/pmu-emul.c > > > +++ b/arch/arm64/kvm/pmu-emul.c > > > @@ -700,7 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) > > > > > > mutex_lock(&arm_pmus_lock); > > > > > > - cpu = smp_processor_id(); > > > + cpu = raw_smp_processor_id(); > > > list_for_each_entry(entry, &arm_pmus, entry) { > > > tmp = entry->arm_pmu; > > > > > > > > > > If preemption doesn't matter (and I really don't think it does), why > > are we looking for a the current CPU? I'd rather we pick the PMU that > > is associated with CPU0 (we're pretty sure it exists), and be done > > with it. > > Getting the current CPU is still useful, we just don't care about that > cpu# being stale. Unconditionally using CPU0 could break existing usage > patterns. > > A not-too-contrived example would be to taskset QEMU onto a cluster of > cores in a big.LITTLE system (I do this). The current behavior would > assign the right PMU to the guest. I've made my opinions about the 'old' > ABI quite clear, but I don't have too great of an appetite for breakage, > though fragile. Fair enough. > > Can we proceed with the fix I had suggested along with a more complete > description of the baggage that we're carrying? Sure. Please post a separate patch and I'll queue that together with Reiji's EL0 PMU stuff for the next bag of fixes. Thanks, M.
diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index 491ca7eb2a4c..f9e4e4334875 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -700,6 +700,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) mutex_lock(&arm_pmus_lock); + preempt_disable(); cpu = smp_processor_id(); list_for_each_entry(entry, &arm_pmus, entry) { tmp = entry->arm_pmu; @@ -709,7 +710,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void) break; } } - + preempt_enable(); mutex_unlock(&arm_pmus_lock); return pmu;
Commit 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") introduced a smp_processor_id() call in preemtible context: [70506.110187] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-aar/3078242 [70506.119077] caller is debug_smp_processor_id+0x20/0x30 [70506.124229] CPU: 129 PID: 3078242 Comm: qemu-system-aar Tainted: G W 6.4.0-rc5 #25 [70506.133176] Hardware name: GIGABYTE R181-T92-00/MT91-FS4-00, BIOS F34 08/13/2020 [70506.140559] Call trace: [70506.142993] dump_backtrace+0xa4/0x130 [70506.146737] show_stack+0x20/0x38 [70506.150040] dump_stack_lvl+0x48/0x60 [70506.153704] dump_stack+0x18/0x28 [70506.157007] check_preemption_disabled+0xe4/0x108 [70506.161701] debug_smp_processor_id+0x20/0x30 [70506.166046] kvm_arm_pmu_v3_set_attr+0x460/0x628 [70506.170662] kvm_arm_vcpu_arch_set_attr+0x88/0xd8 [70506.175363] kvm_arch_vcpu_ioctl+0x258/0x4a8 [70506.179632] kvm_vcpu_ioctl+0x32c/0x6b8 [70506.183465] __arm64_sys_ioctl+0xb4/0x100 [70506.187467] invoke_syscall+0x78/0x108 [70506.191205] el0_svc_common.constprop.0+0x4c/0x100 [70506.195984] do_el0_svc+0x34/0x50 [70506.199287] el0_svc+0x34/0x108 [70506.202416] el0t_64_sync_handler+0xf4/0x120 [70506.206674] el0t_64_sync+0x194/0x198 Just disable preemption for this section. Fixes: 1c913a1c35aa ("KVM: arm64: Iterate arm_pmus list to probe for default PMU") Signed-off-by: Sebastian Ott <sebott@redhat.com> --- arch/arm64/kvm/pmu-emul.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)