From patchwork Thu Aug 3 14:32:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13340197
From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 kvm@vger.kernel.org, linux-kselftest@vger.kernel.org,
 David Hildenbrand, Andrew Morton, Linus Torvalds, liubo, Peter Xu,
 Matthew Wilcox, Hugh Dickins, Jason Gunthorpe, John Hubbard,
 Mel Gorman, Shuah Khan, Paolo Bonzini
Subject: [PATCH v3 3/7] kvm: explicitly set FOLL_HONOR_NUMA_FAULT in
 hva_to_pfn_slow()
Date: Thu, 3 Aug 2023 16:32:04 +0200
Message-ID: <20230803143208.383663-4-david@redhat.com>
In-Reply-To: <20230803143208.383663-1-david@redhat.com>
References: <20230803143208.383663-1-david@redhat.com>

KVM is *the* case we know that really wants to honor NUMA hinting faults.
As we want to stop setting FOLL_HONOR_NUMA_FAULT implicitly, set
FOLL_HONOR_NUMA_FAULT whenever we might obtain pages on behalf of a VCPU
to map them into a secondary MMU, and add a comment explaining why.

Do that unconditionally in hva_to_pfn_slow() when calling
get_user_pages_unlocked().

kvmppc_book3s_instantiate_page(), hva_to_pfn_fast() and
gfn_to_page_many_atomic() are similarly used to map pages into a
secondary MMU.
However, FOLL_WRITE and get_user_page_fast_only() always
implicitly honor NUMA hinting faults -- as documented for
FOLL_HONOR_NUMA_FAULT -- so we can limit this change to a single location
for now.

Don't set it in check_user_page_hwpoison(), where we really only want to
check if the mapped page is HW-poisoned.

We won't set it for other KVM users of get_user_pages()/pin_user_pages():
* arch/powerpc/kvm/book3s_64_mmu_hv.c: not used to map pages into a
  secondary MMU.
* arch/powerpc/kvm/e500_mmu.c: only used on shared TLB pages with userspace
* arch/s390/kvm/*: s390x only supports a single NUMA node either way
* arch/x86/kvm/svm/sev.c: not used to map pages into a secondary MMU.

This is a preparation for making FOLL_HONOR_NUMA_FAULT no longer
implicitly be set by get_user_pages() and friends.

Signed-off-by: David Hildenbrand
---
 virt/kvm/kvm_main.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dfbaafbe3a00..6e4f2b81541e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2517,7 +2517,18 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
 static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 			   bool interruptible, bool *writable, kvm_pfn_t *pfn)
 {
-	unsigned int flags = FOLL_HWPOISON;
+	/*
+	 * When a VCPU accesses a page that is not mapped into the secondary
+	 * MMU, we lookup the page using GUP to map it, so the guest VCPU can
+	 * make progress. We always want to honor NUMA hinting faults in that
+	 * case, because GUP usage corresponds to memory accesses from the VCPU.
+	 * Otherwise, we'd not trigger NUMA hinting faults once a page is
+	 * mapped into the secondary MMU and gets accessed by a VCPU.
+	 *
+	 * Note that get_user_page_fast_only() and FOLL_WRITE for now
+	 * implicitly honor NUMA hinting faults and don't need this flag.
+	 */
+	unsigned int flags = FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT;
 	struct page *page;
 	int npages;
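
For readers who want to see the effect of the flag change in isolation,
here is a minimal userspace sketch of how the slow path seeds its GUP
flags after this patch. It is not kernel code. The bit values are
placeholders chosen for illustration (the real FOLL_* definitions live in
the kernel's mm headers), slow_path_gup_flags() is a hypothetical helper,
and the FOLL_WRITE branch assumes that the part of hva_to_pfn_slow() not
shown in this hunk adds FOLL_WRITE for write faults.

```c
#include <stdbool.h>

/*
 * Placeholder bit values for illustration only. These are NOT the
 * kernel's actual FOLL_* values.
 */
#define FOLL_WRITE            (1u << 0)
#define FOLL_HWPOISON         (1u << 1)
#define FOLL_HONOR_NUMA_FAULT (1u << 2)

/*
 * Hypothetical helper that mirrors the initialization introduced by the
 * patch: the slow path always starts from
 * FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT, independent of the fault type.
 */
static unsigned int slow_path_gup_flags(bool write_fault)
{
	unsigned int flags = FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT;

	/* Assumption: write faults additionally request write access. */
	if (write_fault)
		flags |= FOLL_WRITE;

	return flags;
}
```

Under these assumptions the returned mask carries FOLL_HONOR_NUMA_FAULT
for read and write faults alike, so the slow path no longer depends on
GUP setting the flag implicitly.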