From patchwork Wed Feb 5 23:17:16 2025
X-Patchwork-Submitter: Alex Williamson
X-Patchwork-Id: 13962018
From: Alex Williamson
To: alex.williamson@redhat.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, peterx@redhat.com,
 mitchell.augustin@canonical.com, clg@redhat.com,
 akpm@linux-foundation.org, linux-mm@kvack.org
Subject: [PATCH 0/5] vfio: Improve DMA mapping performance for huge pfnmaps
Date: Wed, 5 Feb 2025 16:17:16 -0700
Message-ID: <20250205231728.2527186-1-alex.williamson@redhat.com>

As GPU BAR sizes increase, the overhead of DMA mapping pfnmap ranges has
become significant for VMs making use of device assignment. Not only
does each mapping require upwards of a few seconds, but BARs are mapped
in and out of the VM address space multiple times during guest boot.
Also factor in that multi-GPU configurations are increasingly
commonplace and BAR sizes are continuing to increase.
Configurations today can already be delayed by minutes during guest
boot.

We've taken steps to make Linux a better guest by batching PCI BAR
sizing operations[1], but that only provides an incremental
improvement.

This series attempts to fully address the issue by leveraging the huge
pfnmap support added in v6.12. When we insert pfnmaps using pud and pmd
mappings, we can later use the mapping-level page mask to iterate at
the relevant mapping stride. In the commonly achieved optimal case,
this results in a reduction of pfn lookups by a factor of 256k. For a
local test system, an overhead of ~1s for DMA mapping a 32GB PCI BAR is
reduced to sub-millisecond (8M page sized operations reduced to 32 pud
sized operations).

Please review, test, and provide feedback. I hope that mm folks can ack
the trivial follow_pfnmap_args update to provide the mapping level page
mask. Naming is hard, so any preference other than pgmask is welcome.
Thanks,

Alex

[1]https://lore.kernel.org/all/20250120182202.1878581-1-alex.williamson@redhat.com/

Alex Williamson (5):
  vfio/type1: Catch zero from pin_user_pages_remote()
  vfio/type1: Convert all vaddr_get_pfns() callers to use vfio_batch
  vfio/type1: Use vfio_batch for vaddr_get_pfns()
  mm: Provide page mask in struct follow_pfnmap_args
  vfio/type1: Use mapping page mask for pfnmaps

 drivers/vfio/vfio_iommu_type1.c | 107 ++++++++++++++++++++------------
 include/linux/mm.h              |   2 +
 mm/memory.c                     |   1 +
 3 files changed, 72 insertions(+), 38 deletions(-)
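
The 8M-to-32 figure quoted in the cover letter can be checked with a small
userspace sketch. This is not code from the series: count_lookups() is a
hypothetical helper, and it assumes each lookup covers the remainder of the
mapping that contains the current address, as described by a page mask of
the form ~(mapping_size - 1).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch series: count how many
 * lookups are needed to walk [start, start + size) when every lookup
 * covers the rest of the mapping containing the current address.
 * pgmask is assumed to be ~(mapping_size - 1), e.g. ~0xfffULL for
 * 4KiB pages or ~((1ULL << 30) - 1) for 1GiB pud mappings.
 */
static uint64_t count_lookups(uint64_t start, uint64_t size, uint64_t pgmask)
{
	uint64_t addr = start;
	uint64_t end = start + size;
	uint64_t lookups = 0;

	/* The sketch does not handle ranges that wrap the address space. */
	assert(end >= start);

	while (addr < end) {
		/* First address past the mapping that contains addr. */
		uint64_t next = (addr | ~pgmask) + 1;

		/* Advance one mapping stride, clamped to the range end. */
		addr = next < end ? next : end;
		lookups++;
	}

	return lookups;
}
```

For a 1GiB-aligned 32GiB range, this returns 8388608 (8M) with a 4KiB
mask and 32 with a 1GiB mask, a ratio of 262144 (256k), which matches the
numbers above under the stated assumptions.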