From patchwork Mon Aug 23 14:30:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 12452955 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E779C4320A for ; Mon, 23 Aug 2021 14:31:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E8333613CD for ; Mon, 23 Aug 2021 14:31:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229979AbhHWOcb (ORCPT ); Mon, 23 Aug 2021 10:32:31 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:27226 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230225AbhHWOc3 (ORCPT ); Mon, 23 Aug 2021 10:32:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629729106; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IYFUg3KjJacWl8OLaf6vH1G2cmWUV+wxHB9g+ejqdCs=; b=azXVdd4F3tJNwWUmfIC7vPCUtrF+1E6PZKGPC9S76KPWsAcGl1X+kNEG+bMDB1+zMEjw7H WYToCoxKFjiQuuUVZoUtS+twNOadbuvLnkNvKhSbMwL9iJKcS5HUa2Xbigirz2ZaQJr7ue bfzSpqllGWDJOkomMapnKZ2F4DxcBc0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-5-hRW3pwkzMmGFHNHC6pAsnA-1; Mon, 23 Aug 2021 10:30:41 -0400 X-MC-Unique: hRW3pwkzMmGFHNHC6pAsnA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C4AD81026200; Mon, 23 Aug 2021 14:30:39 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.195.132]) by smtp.corp.redhat.com (Postfix) with ESMTP id D2D5727091; Mon, 23 Aug 2021 14:30:37 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, Paolo Bonzini Cc: Sean Christopherson , Wanpeng Li , Jim Mattson , "Dr. David Alan Gilbert" , Nitesh Narayan Lal , linux-kernel@vger.kernel.org Subject: [PATCH v2 1/4] KVM: Clean up benign vcpu->cpu data races when kicking vCPUs Date: Mon, 23 Aug 2021 16:30:25 +0200 Message-Id: <20210823143028.649818-2-vkuznets@redhat.com> In-Reply-To: <20210823143028.649818-1-vkuznets@redhat.com> References: <20210823143028.649818-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Fix a benign data race reported by syzbot+KCSAN[*] by ensuring vcpu->cpu is read exactly once, and by ensuring the vCPU is booted from guest mode if kvm_arch_vcpu_should_kick() returns true. Fix a similar race in kvm_make_vcpus_request_mask() by ensuring the vCPU is interrupted if kvm_request_needs_ipi() returns true. Reading vcpu->cpu before vcpu->mode (via kvm_arch_vcpu_should_kick() or kvm_request_needs_ipi()) means the target vCPU could get migrated (change vcpu->cpu) and enter !OUTSIDE_GUEST_MODE between reading vcpu->cpud and reading vcpu->mode. If that happens, the kick/IPI will be sent to the old pCPU, not the new pCPU that is now running the vCPU or reading SPTEs. Although failing to kick the vCPU is not exactly ideal, practically speaking it cannot cause a functional issue unless there is also a bug in the caller, and any such bug would exist regardless of kvm_vcpu_kick()'s behavior. The purpose of sending an IPI is purely to get a vCPU into the host (or out of reading SPTEs) so that the vCPU can recognize a change in state, e.g. a KVM_REQ_* request. If vCPU's handling of the state change is required for correctness, KVM must ensure either the vCPU sees the change before entering the guest, or that the sender sees the vCPU as running in guest mode. All architectures handle this by (a) sending the request before calling kvm_vcpu_kick() and (b) checking for requests _after_ setting vcpu->mode. x86's READING_SHADOW_PAGE_TABLES has similar requirements; KVM needs to ensure it kicks and waits for vCPUs that started reading SPTEs _before_ MMU changes were finalized, but any vCPU that starts reading after MMU changes were finalized will see the new state and can continue on uninterrupted. For uses of kvm_vcpu_kick() that are not paired with a KVM_REQ_*, e.g. x86's kvm_arch_sync_dirty_log(), the order of the kick must not be relied upon for functional correctness, e.g. in the dirty log case, userspace cannot assume it has a 100% complete log if vCPUs are still running. All that said, eliminate the benign race since the cost of doing so is an "extra" atomic cmpxchg() in the case where the target vCPU is loaded by the current pCPU or is not loaded at all. I.e. the kick will be skipped due to kvm_vcpu_exiting_guest_mode() seeing a compatible vcpu->mode as opposed to the kick being skipped because of the cpu checks. Keep the "cpu != me" checks even though they appear useless/impossible at first glance. x86 processes guest IPI writes in a fast path that runs in IN_GUEST_MODE, i.e. can call kvm_vcpu_kick() from IN_GUEST_MODE. And calling kvm_vm_bugged()->kvm_make_vcpus_request_mask() from IN_GUEST or READING_SHADOW_PAGE_TABLES is perfectly reasonable. Note, a race with the cpu_online() check in kvm_vcpu_kick() likely persists, e.g. the vCPU could exit guest mode and get offlined between the cpu_online() check and the sending of smp_send_reschedule(). But, the online check appears to exist only to avoid a WARN in x86's native_smp_send_reschedule() that fires if the target CPU is not online. The reschedule WARN exists because CPU offlining takes the CPU out of the scheduling pool, i.e. the WARN is intended to detect the case where the kernel attempts to schedule a task on an offline CPU. The actual sending of the IPI is a non-issue as at worst it will simpy be dropped on the floor. In other words, KVM's usurping of the reschedule IPI could theoretically trigger a WARN if the stars align, but there will be no loss of functionality. [*] https://syzkaller.appspot.com/bug?extid=cd4154e502f43f10808a Cc: Venkatesh Srinivas Cc: Vitaly Kuznetsov Fixes: 97222cc83163 ("KVM: Emulate local APIC in kernel") Signed-off-by: Sean Christopherson Signed-off-by: Vitaly Kuznetsov --- virt/kvm/kvm_main.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3e67c93ca403..786b914db98f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -273,14 +273,26 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, continue; kvm_make_request(req, vcpu); - cpu = vcpu->cpu; if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) continue; - if (tmp != NULL && cpu != -1 && cpu != me && - kvm_request_needs_ipi(vcpu, req)) - __cpumask_set_cpu(cpu, tmp); + /* + * Note, the vCPU could get migrated to a different pCPU at any + * point after kvm_request_needs_ipi(), which could result in + * sending an IPI to the previous pCPU. But, that's ok because + * the purpose of the IPI is to ensure the vCPU returns to + * OUTSIDE_GUEST_MODE, which is satisfied if the vCPU migrates. + * Entering READING_SHADOW_PAGE_TABLES after this point is also + * ok, as the requirement is only that KVM wait for vCPUs that + * were reading SPTEs _before_ any changes were finalized. See + * kvm_vcpu_kick() for more details on handling requests. + */ + if (tmp != NULL && kvm_request_needs_ipi(vcpu, req)) { + cpu = READ_ONCE(vcpu->cpu); + if (cpu != -1 && cpu != me) + __cpumask_set_cpu(cpu, tmp); + } } called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT)); @@ -3309,16 +3321,24 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up); */ void kvm_vcpu_kick(struct kvm_vcpu *vcpu) { - int me; - int cpu = vcpu->cpu; + int me, cpu; if (kvm_vcpu_wake_up(vcpu)) return; + /* + * Note, the vCPU could get migrated to a different pCPU at any point + * after kvm_arch_vcpu_should_kick(), which could result in sending an + * IPI to the previous pCPU. But, that's ok because the purpose of the + * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the + * vCPU also requires it to leave IN_GUEST_MODE. + */ me = get_cpu(); - if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) - if (kvm_arch_vcpu_should_kick(vcpu)) + if (kvm_arch_vcpu_should_kick(vcpu)) { + cpu = READ_ONCE(vcpu->cpu); + if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu)) smp_send_reschedule(cpu); + } put_cpu(); } EXPORT_SYMBOL_GPL(kvm_vcpu_kick); From patchwork Mon Aug 23 14:30:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 12452957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B60FDC4338F for ; Mon, 23 Aug 2021 14:31:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9DB0861425 for ; Mon, 23 Aug 2021 14:31:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230336AbhHWOcd (ORCPT ); Mon, 23 Aug 2021 10:32:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:35325 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230290AbhHWOcc (ORCPT ); Mon, 23 Aug 2021 10:32:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629729109; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/WbznmYSWXy3CwdDh9G5nMXx6sTahDe95v+6YXmfShI=; b=JpN7qj4OYj2c/htc75vIkWUpGZ/teAkwLVhaQy+tyHPBD959DR/UEUz8yW45ZlA4sZSyEo jDz7V37tbZmCGsdDUuZ33/a9cvqsBo96vtIBCRgBjJqweHo2EA5qZXTjKTWi081zBiDn8O NckR3mTfVwjOKwPYkpdF+c6sMQp3jiA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-9-EA4ViK-VP-iBsqhJwEhxlw-1; Mon, 23 Aug 2021 10:30:43 -0400 X-MC-Unique: EA4ViK-VP-iBsqhJwEhxlw-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0E2D5190B2A9; Mon, 23 Aug 2021 14:30:42 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.195.132]) by smtp.corp.redhat.com (Postfix) with ESMTP id 200342707F; Mon, 23 Aug 2021 14:30:39 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, Paolo Bonzini Cc: Sean Christopherson , Wanpeng Li , Jim Mattson , "Dr. David Alan Gilbert" , Nitesh Narayan Lal , linux-kernel@vger.kernel.org Subject: [PATCH v2 2/4] KVM: Guard cpusmask NULL check with CONFIG_CPUMASK_OFFSTACK Date: Mon, 23 Aug 2021 16:30:26 +0200 Message-Id: <20210823143028.649818-3-vkuznets@redhat.com> In-Reply-To: <20210823143028.649818-1-vkuznets@redhat.com> References: <20210823143028.649818-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Sean Christopherson Check for a NULL cpumask_var_t when kicking multiple vCPUs if and only if cpumasks are configured to be allocated off-stack. This is a meaningless optimization, e.g. avoids a TEST+Jcc and TEST+CMOV on x86, but more importantly helps document that the NULL check is necessary even though all callers pass in a local variable. No functional change intended. Signed-off-by: Sean Christopherson Signed-off-by: Vitaly Kuznetsov --- virt/kvm/kvm_main.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 786b914db98f..82c5280dd5ce 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -247,7 +247,7 @@ static void ack_flush(void *_completed) static inline bool kvm_kick_many_cpus(const struct cpumask *cpus, bool wait) { - if (unlikely(!cpus)) + if (IS_ENABLED(CONFIG_CPUMASK_OFFSTACK) && unlikely(!cpus)) cpus = cpu_online_mask; if (cpumask_empty(cpus)) @@ -277,6 +277,14 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) continue; + /* + * tmp can be NULL if cpumasks are allocated off stack, as + * allocation of the mask is deliberately not fatal and is + * handled by falling back to kicking all online CPUs. + */ + if (IS_ENABLED(CONFIG_CPUMASK_OFFSTACK) && !tmp) + continue; + /* * Note, the vCPU could get migrated to a different pCPU at any * point after kvm_request_needs_ipi(), which could result in @@ -288,7 +296,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, * were reading SPTEs _before_ any changes were finalized. See * kvm_vcpu_kick() for more details on handling requests. */ - if (tmp != NULL && kvm_request_needs_ipi(vcpu, req)) { + if (kvm_request_needs_ipi(vcpu, req)) { cpu = READ_ONCE(vcpu->cpu); if (cpu != -1 && cpu != me) __cpumask_set_cpu(cpu, tmp); From patchwork Mon Aug 23 14:30:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 12452951 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D4D3C4338F for ; Mon, 23 Aug 2021 14:30:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E951F613B1 for ; Mon, 23 Aug 2021 14:30:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230178AbhHWObb (ORCPT ); Mon, 23 Aug 2021 10:31:31 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:59267 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230177AbhHWOba (ORCPT ); Mon, 23 Aug 2021 10:31:30 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629729047; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wq3GtJnJ0UcnS9tRpeI+vww4phaIYcteU9bLlmRBJQw=; b=EpoMxETy+QrxL5qU/thIJ+phgmfr5g4K+d94k6vaj2i/7FTClOtYSr+JOGVAtRy0jhvIc/ MdKwpkRRtBVX4dPQhBQWOX8YHkDwfXjhZozpB6+5eRAzT5mqt7CPavvn89i+OFvmh+J9LC Du9AZ2mu+Z5o+wiWtpS/iONxrbjzqS0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-503--guwZZ_nMUOC-VfR-27R4Q-1; Mon, 23 Aug 2021 10:30:45 -0400 X-MC-Unique: -guwZZ_nMUOC-VfR-27R4Q-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5D7DF87D545; Mon, 23 Aug 2021 14:30:44 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.195.132]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5E31A99BD; Mon, 23 Aug 2021 14:30:42 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, Paolo Bonzini Cc: Sean Christopherson , Wanpeng Li , Jim Mattson , "Dr. David Alan Gilbert" , Nitesh Narayan Lal , linux-kernel@vger.kernel.org Subject: [PATCH v2 3/4] KVM: Optimize kvm_make_vcpus_request_mask() a bit Date: Mon, 23 Aug 2021 16:30:27 +0200 Message-Id: <20210823143028.649818-4-vkuznets@redhat.com> In-Reply-To: <20210823143028.649818-1-vkuznets@redhat.com> References: <20210823143028.649818-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Iterating over set bits in 'vcpu_bitmap' should be faster than going through all vCPUs, especially when just a few bits are set. Signed-off-by: Vitaly Kuznetsov --- virt/kvm/kvm_main.c | 83 ++++++++++++++++++++++++++------------------- 1 file changed, 49 insertions(+), 34 deletions(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 82c5280dd5ce..e9c2ce2d273f 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -257,49 +257,64 @@ static inline bool kvm_kick_many_cpus(const struct cpumask *cpus, bool wait) return true; } +static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu, + unsigned int req, cpumask_var_t tmp, + int current_cpu) +{ + int cpu = vcpu->cpu; + + kvm_make_request(req, vcpu); + + if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) + return; + + /* + * tmp can be NULL if cpumasks are allocated off stack, as allocation of + * the mask is deliberately not fatal and is handled by falling back to + * kicking all online CPUs. + */ + if (IS_ENABLED(CONFIG_CPUMASK_OFFSTACK) && !tmp) + return; + + /* + * Note, the vCPU could get migrated to a different pCPU at any point + * after kvm_request_needs_ipi(), which could result in sending an IPI + * to the previous pCPU. But, that's ok because the purpose of the IPI + * is to ensure the vCPU returns to OUTSIDE_GUEST_MODE, which is + * satisfied if the vCPU migrates. Entering READING_SHADOW_PAGE_TABLES + * after this point is also ok, as the requirement is only that KVM wait + * for vCPUs that were reading SPTEs _before_ any changes were + * finalized. See kvm_vcpu_kick() for more details on handling requests. + */ + if (kvm_request_needs_ipi(vcpu, req)) { + cpu = READ_ONCE(vcpu->cpu); + if (cpu != -1 && cpu != current_cpu) + __cpumask_set_cpu(cpu, tmp); + } +} + bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req, struct kvm_vcpu *except, unsigned long *vcpu_bitmap, cpumask_var_t tmp) { - int i, cpu, me; + int i, me; struct kvm_vcpu *vcpu; bool called; me = get_cpu(); - kvm_for_each_vcpu(i, vcpu, kvm) { - if ((vcpu_bitmap && !test_bit(i, vcpu_bitmap)) || - vcpu == except) - continue; - - kvm_make_request(req, vcpu); - - if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu)) - continue; - - /* - * tmp can be NULL if cpumasks are allocated off stack, as - * allocation of the mask is deliberately not fatal and is - * handled by falling back to kicking all online CPUs. - */ - if (IS_ENABLED(CONFIG_CPUMASK_OFFSTACK) && !tmp) - continue; - - /* - * Note, the vCPU could get migrated to a different pCPU at any - * point after kvm_request_needs_ipi(), which could result in - * sending an IPI to the previous pCPU. But, that's ok because - * the purpose of the IPI is to ensure the vCPU returns to - * OUTSIDE_GUEST_MODE, which is satisfied if the vCPU migrates. - * Entering READING_SHADOW_PAGE_TABLES after this point is also - * ok, as the requirement is only that KVM wait for vCPUs that - * were reading SPTEs _before_ any changes were finalized. See - * kvm_vcpu_kick() for more details on handling requests. - */ - if (kvm_request_needs_ipi(vcpu, req)) { - cpu = READ_ONCE(vcpu->cpu); - if (cpu != -1 && cpu != me) - __cpumask_set_cpu(cpu, tmp); + if (vcpu_bitmap) { + for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) { + vcpu = kvm_get_vcpu(kvm, i); + if (!vcpu || vcpu == except) + continue; + kvm_make_vcpu_request(kvm, vcpu, req, tmp, me); + } + } else { + kvm_for_each_vcpu(i, vcpu, kvm) { + if (vcpu == except) + continue; + kvm_make_vcpu_request(kvm, vcpu, req, tmp, me); } } From patchwork Mon Aug 23 14:30:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Vitaly Kuznetsov X-Patchwork-Id: 12452953 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69B4EC4338F for ; Mon, 23 Aug 2021 14:30:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C124613CE for ; Mon, 23 Aug 2021 14:30:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230211AbhHWObg (ORCPT ); Mon, 23 Aug 2021 10:31:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:24199 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230180AbhHWObf (ORCPT ); Mon, 23 Aug 2021 10:31:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1629729052; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gAk+4KSA7ARdoYI0TMZIFG727Vb77x9IC9rj5AiT9Dc=; b=W6bD25Ygc5UyK38mWfJ2orFblNj21NendQSpPfnEQcSnR8RHb+4pVD/O0y5LP3w9E7wqrq Ck8YLY/uOn3IFf7s5HTOkyIl1g5p2LCG0oVIPN8x3fN+2sgtEt6tE/mITi6EvhbDTjeTja nd5RnuLtHpBUO0wusqmVQBlRQprhrxU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-47-o4UY9sSgMpG0XHeU6jl7BQ-1; Mon, 23 Aug 2021 10:30:51 -0400 X-MC-Unique: o4UY9sSgMpG0XHeU6jl7BQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id B1E021082922; Mon, 23 Aug 2021 14:30:46 +0000 (UTC) Received: from vitty.brq.redhat.com (unknown [10.40.195.132]) by smtp.corp.redhat.com (Postfix) with ESMTP id AC7352719D; Mon, 23 Aug 2021 14:30:44 +0000 (UTC) From: Vitaly Kuznetsov To: kvm@vger.kernel.org, Paolo Bonzini Cc: Sean Christopherson , Wanpeng Li , Jim Mattson , "Dr. David Alan Gilbert" , Nitesh Narayan Lal , linux-kernel@vger.kernel.org Subject: [PATCH v2 4/4] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect() Date: Mon, 23 Aug 2021 16:30:28 +0200 Message-Id: <20210823143028.649818-5-vkuznets@redhat.com> In-Reply-To: <20210823143028.649818-1-vkuznets@redhat.com> References: <20210823143028.649818-1-vkuznets@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org KASAN reports the following issue: BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798 CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- --- Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020 Call Trace: dump_stack+0xa5/0xe6 print_address_description.constprop.0+0x18/0x130 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] __kasan_report.cold+0x7f/0x114 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kasan_report+0x38/0x50 kasan_check_range+0xf5/0x1d0 kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm] ? kvm_arch_exit+0x110/0x110 [kvm] ? sched_clock+0x5/0x10 ioapic_write_indirect+0x59f/0x9e0 [kvm] ? static_obj+0xc0/0xc0 ? __lock_acquired+0x1d2/0x8c0 ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm] The problem appears to be that 'vcpu_bitmap' is allocated as a single long on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear the lower 16 bits of it with bitmap_zero() for no particular reason (my guess would be that 'bitmap' and 'vcpu_bitmap' variables in kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed 16-bit long, the later should accommodate all possible vCPUs). Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs") Fixes: 9a2ae9f6b6bb ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap") Reported-by: Dr. David Alan Gilbert Signed-off-by: Vitaly Kuznetsov Reviewed-by: Maxim Levitsky Signed-off-by: Eduardo Habkost --- arch/x86/kvm/ioapic.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c index ff005fe738a4..92cd4b02e9ba 100644 --- a/arch/x86/kvm/ioapic.c +++ b/arch/x86/kvm/ioapic.c @@ -319,7 +319,7 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) unsigned index; bool mask_before, mask_after; union kvm_ioapic_redirect_entry *e; - unsigned long vcpu_bitmap; + unsigned long vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]; int old_remote_irr, old_delivery_status, old_dest_id, old_dest_mode; switch (ioapic->ioregsel) { @@ -384,9 +384,9 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) irq.shorthand = APIC_DEST_NOSHORT; irq.dest_id = e->fields.dest_id; irq.msi_redir_hint = false; - bitmap_zero(&vcpu_bitmap, 16); + bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS); kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, - &vcpu_bitmap); + vcpu_bitmap); if (old_dest_mode != e->fields.dest_mode || old_dest_id != e->fields.dest_id) { /* @@ -399,10 +399,10 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val) kvm_lapic_irq_dest_mode( !!e->fields.dest_mode); kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq, - &vcpu_bitmap); + vcpu_bitmap); } kvm_make_scan_ioapic_request_mask(ioapic->kvm, - &vcpu_bitmap); + vcpu_bitmap); } else { kvm_make_scan_ioapic_request(ioapic->kvm); }