From patchwork Tue Aug 1 12:48:39 2023
X-Patchwork-Submitter: David Hildenbrand <david@redhat.com>
X-Patchwork-Id: 13336678
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, kvm@vger.kernel.org,
    linux-kselftest@vger.kernel.org, David Hildenbrand, Andrew Morton,
    Linus Torvalds, liubo, Peter Xu, Matthew Wilcox, Hugh Dickins,
    Jason Gunthorpe, John Hubbard, Mel Gorman, Shuah Khan, Paolo Bonzini
Subject: [PATCH v2 3/8] kvm: explicitly set FOLL_HONOR_NUMA_FAULT in hva_to_pfn_slow()
Date: Tue, 1 Aug 2023 14:48:39 +0200
Message-ID: <20230801124844.278698-4-david@redhat.com>
In-Reply-To: <20230801124844.278698-1-david@redhat.com>
References: <20230801124844.278698-1-david@redhat.com>

KVM is *the* case we know that really wants to honor NUMA hinting faults.
As we want to stop setting FOLL_HONOR_NUMA_FAULT implicitly, set
FOLL_HONOR_NUMA_FAULT whenever we might obtain pages on behalf of a VCPU
to map them into a secondary MMU, and add a comment why.

Do that unconditionally in hva_to_pfn_slow() when calling
get_user_pages_unlocked().
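
(Illustration only, not part of the patch: a minimal sketch of the kind of
GUP call described above. The helper name example_map_page_for_vcpu() is
made up here, and the real KVM plumbing -- async handling, writability
tracking, error reporting -- is omitted.)

#include <linux/mm.h>

/*
 * Hypothetical example: look up one page on behalf of a VCPU so it can be
 * mapped into a secondary MMU. Passing FOLL_HONOR_NUMA_FAULT makes GUP
 * trigger NUMA hinting faults, just like a direct VCPU access would.
 */
static struct page *example_map_page_for_vcpu(unsigned long hva, bool write_fault)
{
	unsigned int flags = FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT;
	struct page *page;
	long npages;

	if (write_fault)
		flags |= FOLL_WRITE;

	npages = get_user_pages_unlocked(hva, 1, &page, flags);
	if (npages != 1)
		return NULL;

	return page;	/* the caller releases it via put_page() */
}

The hva_to_pfn_slow() change in the diff below does essentially this flag
setup, just within KVM's existing helper.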

kvmppc_book3s_instantiate_page(), hva_to_pfn_fast() and
gfn_to_page_many_atomic() are similarly used to map pages into a
secondary MMU. However, FOLL_WRITE and get_user_page_fast_only() always
implicitly honor NUMA hinting faults -- as documented for
FOLL_HONOR_NUMA_FAULT -- so we can limit this change to a single
location for now.

Don't set it in check_user_page_hwpoison(), where we really only want to
check if the mapped page is HW-poisoned.

We won't set it for other KVM users of get_user_pages()/pin_user_pages():
* arch/powerpc/kvm/book3s_64_mmu_hv.c: not used to map pages into a
  secondary MMU.
* arch/powerpc/kvm/e500_mmu.c: only used on shared TLB pages with userspace
* arch/s390/kvm/*: s390x only supports a single NUMA node either way
* arch/x86/kvm/svm/sev.c: not used to map pages into a secondary MMU.

This is a preparation for making FOLL_HONOR_NUMA_FAULT no longer
implicitly be set by get_user_pages() and friends.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 virt/kvm/kvm_main.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dfbaafbe3a00..6e4f2b81541e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2517,7 +2517,18 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
 static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 			   bool interruptible, bool *writable, kvm_pfn_t *pfn)
 {
-	unsigned int flags = FOLL_HWPOISON;
+	/*
+	 * When a VCPU accesses a page that is not mapped into the secondary
+	 * MMU, we lookup the page using GUP to map it, so the guest VCPU can
+	 * make progress. We always want to honor NUMA hinting faults in that
+	 * case, because GUP usage corresponds to memory accesses from the VCPU.
+	 * Otherwise, we'd not trigger NUMA hinting faults once a page is
+	 * mapped into the secondary MMU and gets accessed by a VCPU.
+	 *
+	 * Note that get_user_page_fast_only() and FOLL_WRITE for now
+	 * implicitly honor NUMA hinting faults and don't need this flag.
+	 */
+	unsigned int flags = FOLL_HWPOISON | FOLL_HONOR_NUMA_FAULT;
 	struct page *page;
 	int npages;
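
(For context only, also not part of the patch: conceptually, a GUP path
that honors the flag refuses to follow a NUMA-hinting, PROT_NONE-mapped
PTE, so the access is retried via the fault path -- the same thing a
direct VCPU access would trigger. The helper below is a simplified
assumption for illustration, not the kernel's actual GUP implementation.)

/* Simplified illustration -- not the kernel's actual GUP code. */
static bool example_gup_can_follow_protnone(unsigned int gup_flags)
{
	/*
	 * Callers passing FOLL_HONOR_NUMA_FAULT (such as hva_to_pfn_slow()
	 * after this patch) want GUP to back off on PROT_NONE-mapped
	 * (NUMA hinting) pages so that a NUMA hinting fault is triggered.
	 */
	return !(gup_flags & FOLL_HONOR_NUMA_FAULT);
}

hva_to_pfn_fast(), in contrast, relies on get_user_page_fast_only() and
FOLL_WRITE honoring NUMA hinting faults implicitly, as noted above.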