From patchwork Wed Aug 17 00:36:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12945371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3ED5C32773 for ; Wed, 17 Aug 2022 00:36:23 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5F66E8D0006; Tue, 16 Aug 2022 20:36:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 57ED28D0001; Tue, 16 Aug 2022 20:36:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 35C388D0006; Tue, 16 Aug 2022 20:36:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 1DA948D0001 for ; Tue, 16 Aug 2022 20:36:22 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id EA5481A05E2 for ; Wed, 17 Aug 2022 00:36:21 +0000 (UTC) X-FDA: 79807218162.14.89EFEAB Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf24.hostedemail.com (Postfix) with ESMTP id 81D9F180079 for ; Wed, 17 Aug 2022 00:36:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1660696581; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FLbPJ7hy11GyuxEWqZ14BS1jQ5aTLxQT7Al42LT3IDU=; b=bRuOTLYTE4r3KVhlDU00D0JvYewqSqsxg9HpbGf2Mdv16vezd46eVxocoBUKusB+mm1Vu6 CxuywuvktSGevRR4JZT2hkPYZMSIx6/69VQ0aGd8rFd7ReIIA0bnjsC83hkoszTjIZuTvY AtmrNkqJDDe9gHXZ305zMqMeO5DcLDo= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-199-bhtp0OzSOjuuuuLb-fnHjQ-1; Tue, 16 Aug 2022 20:36:19 -0400 X-MC-Unique: bhtp0OzSOjuuuuLb-fnHjQ-1 Received: by mail-qk1-f197.google.com with SMTP id u15-20020a05620a0c4f00b006b8b3f41303so10417428qki.8 for ; Tue, 16 Aug 2022 17:36:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=FLbPJ7hy11GyuxEWqZ14BS1jQ5aTLxQT7Al42LT3IDU=; b=wAGYc2e1xCMsOvXqUcfVjVwkFumSw1fd5k0gD3vmBXXygA6VPLOu7PLAIOyBuhw+Wt WWkFLbvza5stzwi+1eF8Mw8W4TV5MSuH3aBxMyO2oA6aH/sb6SPLlRieuMtKvMNPA7SF 8azq+sMqtAXmqrQq2ODE+e1z9xCnQI6L6ihppwXeVlHSUYNg6pfU0QmP7YszR919TkX3 jhku12n9mQ2GM+lH/LpqTWHyr1tminkaWetd+z9l2jtf9MdtiWdRPa1xCSQbCZ3MzGCH tn3y2fIyuXK/AVjN87QEEKlln76F7i0Ls3SvBact9ZddEwYF6ZbyjLelXp9firqeIWyy 8Ogg== X-Gm-Message-State: ACgBeo2PBn5+VKHW85g0GLXa/IHS4fMxtyfg/Y0gqw/stbYGzwAKPz/c XZyNYzs7XSUgV5ZI65W1y0Za8HCvlFK50fcmkgej+SlhI4kHo78nIf8o8O7PUCAQlJiZSzXSvoV KzSjrChvlE44= X-Received: by 2002:ae9:e004:0:b0:6bb:5a15:2889 with SMTP id m4-20020ae9e004000000b006bb5a152889mr5337111qkk.235.1660696579349; Tue, 16 Aug 2022 17:36:19 -0700 (PDT) X-Google-Smtp-Source: AA6agR5QpgX57yEUqRa2TlfnMEJNrK5HHiJcugAOY7ms5GMSgdMLg91tEP6F3QIsUQIES+lP5ek7EA== X-Received: by 2002:ae9:e004:0:b0:6bb:5a15:2889 with SMTP id m4-20020ae9e004000000b006bb5a152889mr5337102qkk.235.1660696579048; Tue, 16 Aug 2022 17:36:19 -0700 (PDT) Received: from localhost.localdomain (bras-base-aurron9127w-grc-35-70-27-3-10.dsl.bell.ca. [70.27.3.10]) by smtp.gmail.com with ESMTPSA id c13-20020ac87dcd000000b0034358bfc3c8sm12007175qte.67.2022.08.16.17.36.17 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 16 Aug 2022 17:36:18 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: Sean Christopherson , David Hildenbrand , Andrew Morton , Andrea Arcangeli , peterx@redhat.com, Paolo Bonzini , "Dr . David Alan Gilbert" , Linux MM Mailing List , John Hubbard Subject: [PATCH v3 2/3] kvm: Add new pfn error KVM_PFN_ERR_SIGPENDING Date: Tue, 16 Aug 2022 20:36:13 -0400 Message-Id: <20220817003614.58900-3-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220817003614.58900-1-peterx@redhat.com> References: <20220817003614.58900-1-peterx@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-type: text/plain ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1660696581; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=FLbPJ7hy11GyuxEWqZ14BS1jQ5aTLxQT7Al42LT3IDU=; b=NKA3300PiG2reakF4+VuETHPBX4lRPfMAMt1sv3Gp6/hItJ6nnZvTlf/y9mZSEibAd+q8J Dq3QrlxnvJk2zonlkpOpIUa4idvdESbx7D30PTbM+yVhvHH3oe0GgkSXA4Sk3kithSyBea XtVfQMGZwSIIPm1oQO8DoxE1Ukm8mcU= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=bRuOTLYT; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf24.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1660696581; a=rsa-sha256; cv=none; b=k1zX6C/wV18Ri0bNwJN6RBBMgAUuOSu0IdbCnrZn0DjTavuz62lc2PXYCOj+rtF9qLW2+S uOqluqfMaWo1EJKGw+PwKFPMkzjgpAdL3UmpSNyBY8eY4coFSSJovs7g8tdTSPBYRooutW 86xOjW6pR1mIZj1xgXHwF0717wQdxks= Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=bRuOTLYT; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf24.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com X-Rspam-User: X-Stat-Signature: nc6oir5538ux8atk565uy1hfi1y71fks X-Rspamd-Queue-Id: 81D9F180079 X-Rspamd-Server: rspam11 X-HE-Tag: 1660696581-226558 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add one new PFN error type to show when KVM got interrupted when fetching the PFN due to any kind of signal pending. Add a new "interruptible" flag showing that the caller is willing to be interrupted by signals during the __gfn_to_pfn_memslot() request. Wire it up with a FOLL_INTERRUPTIBLE flag that we've just introduced. This prepares KVM to be able to respond to SIGUSR1 (for QEMU that's the SIGIPI) even during e.g. handling an userfaultfd page fault. Since at it, renaming kvm_handle_bad_page to kvm_handle_error_pfn assuming that'll match better with what it does, e.g. KVM_PFN_ERR_SIGPENDING is not accurately a bad page but just one kind of errors. No functional change intended so far with the patch. Signed-off-by: Peter Xu --- arch/arm64/kvm/mmu.c | 2 +- arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +- arch/powerpc/kvm/book3s_64_mmu_radix.c | 2 +- arch/x86/kvm/mmu/mmu.c | 13 +++++++---- include/linux/kvm_host.h | 14 ++++++++++-- virt/kvm/kvm_main.c | 30 ++++++++++++++++---------- virt/kvm/kvm_mm.h | 4 ++-- virt/kvm/pfncache.c | 2 +- 8 files changed, 46 insertions(+), 23 deletions(-) diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index 87f1cd0df36e..8b730e809e9f 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1204,7 +1204,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, */ smp_rmb(); - pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL, + pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, write_fault, &writable, NULL); if (pfn == KVM_PFN_ERR_HWPOISON) { kvm_send_hwpoison_signal(hva, vma_shift); diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 514fd45c1994..7aed5ef6588e 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -598,7 +598,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu, write_ok = true; } else { /* Call KVM generic code to do the slow-path check */ - pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL, + pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, writing, &write_ok, NULL); if (is_error_noslot_pfn(pfn)) return -EFAULT; diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c index 42851c32ff3b..9991f9d9ee59 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c @@ -845,7 +845,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu, unsigned long pfn; /* Call KVM generic code to do the slow-path check */ - pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL, + pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, writing, upgrade_p, NULL); if (is_error_noslot_pfn(pfn)) return -EFAULT; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3e1317325e1f..23dc46da2f18 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3134,8 +3134,13 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct * send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, PAGE_SHIFT, tsk); } -static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn) +static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn) { + if (is_sigpending_pfn(pfn)) { + kvm_handle_signal_exit(vcpu); + return -EINTR; + } + /* * Do not cache the mmio info caused by writing the readonly gfn * into the spte otherwise read access on readonly gfn also can @@ -3157,7 +3162,7 @@ static int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fau { /* The pfn is invalid, report the error! */ if (unlikely(is_error_pfn(fault->pfn))) - return kvm_handle_bad_page(vcpu, fault->gfn, fault->pfn); + return kvm_handle_error_pfn(vcpu, fault->gfn, fault->pfn); if (unlikely(!fault->slot)) { gva_t gva = fault->is_tdp ? 0 : fault->addr; @@ -4155,7 +4160,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) } async = false; - fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, &async, + fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async, fault->write, &fault->map_writable, &fault->hva); if (!async) @@ -4172,7 +4177,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) } } - fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, NULL, + fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, NULL, fault->write, &fault->map_writable, &fault->hva); return RET_PF_CONTINUE; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 1c480b1821e1..1d792f60cfda 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -96,6 +96,7 @@ #define KVM_PFN_ERR_FAULT (KVM_PFN_ERR_MASK) #define KVM_PFN_ERR_HWPOISON (KVM_PFN_ERR_MASK + 1) #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2) +#define KVM_PFN_ERR_SIGPENDING (KVM_PFN_ERR_MASK + 3) /* * error pfns indicate that the gfn is in slot but faild to @@ -106,6 +107,15 @@ static inline bool is_error_pfn(kvm_pfn_t pfn) return !!(pfn & KVM_PFN_ERR_MASK); } +/* + * KVM_PFN_ERR_SIGPENDING indicates that fetching the PFN was interrupted + * by a pending signal. Note, the signal may or may not be fatal. + */ +static inline bool is_sigpending_pfn(kvm_pfn_t pfn) +{ + return pfn == KVM_PFN_ERR_SIGPENDING; +} + /* * error_noslot pfns indicate that the gfn can not be * translated to pfn - it is not in slot or failed to @@ -1141,8 +1151,8 @@ kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault, kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn); kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn); kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn, - bool atomic, bool *async, bool write_fault, - bool *writable, hva_t *hva); + bool atomic, bool interruptible, bool *async, + bool write_fault, bool *writable, hva_t *hva); void kvm_release_pfn_clean(kvm_pfn_t pfn); void kvm_release_pfn_dirty(kvm_pfn_t pfn); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 32896c845ffe..56dde0e141b4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -2498,7 +2498,7 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault, * 1 indicates success, -errno is returned if error is detected. */ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault, - bool *writable, kvm_pfn_t *pfn) + bool interruptible, bool *writable, kvm_pfn_t *pfn) { unsigned int flags = FOLL_HWPOISON; struct page *page; @@ -2513,6 +2513,8 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault, flags |= FOLL_WRITE; if (async) flags |= FOLL_NOWAIT; + if (interruptible) + flags |= FOLL_INTERRUPTIBLE; npages = get_user_pages_unlocked(addr, 1, &page, flags); if (npages != 1) @@ -2622,6 +2624,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, * Pin guest page in memory and return its pfn. * @addr: host virtual address which maps memory to the guest * @atomic: whether this function can sleep + * @interruptible: whether the process can be interrupted by non-fatal signals * @async: whether this function need to wait IO complete if the * host page is not in the memory * @write_fault: whether we should get a writable host page @@ -2632,8 +2635,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma, * 2): @write_fault = false && @writable, @writable will tell the caller * whether the mapping is writable. */ -kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async, - bool write_fault, bool *writable) +kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible, + bool *async, bool write_fault, bool *writable) { struct vm_area_struct *vma; kvm_pfn_t pfn; @@ -2648,9 +2651,12 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async, if (atomic) return KVM_PFN_ERR_FAULT; - npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn); + npages = hva_to_pfn_slow(addr, async, write_fault, interruptible, + writable, &pfn); if (npages == 1) return pfn; + if (npages == -EINTR) + return KVM_PFN_ERR_SIGPENDING; mmap_read_lock(current->mm); if (npages == -EHWPOISON || @@ -2681,8 +2687,8 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async, } kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn, - bool atomic, bool *async, bool write_fault, - bool *writable, hva_t *hva) + bool atomic, bool interruptible, bool *async, + bool write_fault, bool *writable, hva_t *hva) { unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault); @@ -2707,7 +2713,7 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn, writable = NULL; } - return hva_to_pfn(addr, atomic, async, write_fault, + return hva_to_pfn(addr, atomic, interruptible, async, write_fault, writable); } EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot); @@ -2715,20 +2721,22 @@ EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot); kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault, bool *writable) { - return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, NULL, - write_fault, writable, NULL); + return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, false, + NULL, write_fault, writable, NULL); } EXPORT_SYMBOL_GPL(gfn_to_pfn_prot); kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn) { - return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL, NULL); + return __gfn_to_pfn_memslot(slot, gfn, false, false, NULL, true, + NULL, NULL); } EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot); kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn) { - return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL, NULL); + return __gfn_to_pfn_memslot(slot, gfn, true, false, NULL, true, + NULL, NULL); } EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic); diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h index 41da467d99c9..a1ab15006af3 100644 --- a/virt/kvm/kvm_mm.h +++ b/virt/kvm/kvm_mm.h @@ -24,8 +24,8 @@ #define KVM_MMU_READ_UNLOCK(kvm) spin_unlock(&(kvm)->mmu_lock) #endif /* KVM_HAVE_MMU_RWLOCK */ -kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async, - bool write_fault, bool *writable); +kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible, + bool *async, bool write_fault, bool *writable); #ifdef CONFIG_HAVE_KVM_PFNCACHE void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c index ab519f72f2cd..6dacb58bf4d0 100644 --- a/virt/kvm/pfncache.c +++ b/virt/kvm/pfncache.c @@ -181,7 +181,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc) } /* We always request a writeable mapping */ - new_pfn = hva_to_pfn(gpc->uhva, false, NULL, true, NULL); + new_pfn = hva_to_pfn(gpc->uhva, false, false, NULL, true, NULL); if (is_error_noslot_pfn(new_pfn)) goto out_error;