From: Qinyun Tan
To: Andrew Morton, Alex Williamson
Cc: linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Qinyun Tan
Subject: [PATCH v1: vfio: avoid unnecessary pin memory when dma map io address space 0/2]
Date: Thu, 24 Oct 2024 17:34:42 +0800
X-Mailer: git-send-email 2.46.0
X-Patchwork-Submitter: Qinyun Tan
X-Patchwork-Id: 13848631

When a user application calls ioctl(VFIO_IOMMU_MAP_DMA) to map a DMA
address, the general handler 'vfio_pin_map_dma' attempts to pin the
memory and then create the mapping in the IOMMU. However, some mappings
are not backed by a struct page, for example an mmap'd MMIO range for
our own or another device. In this scenario, where the vma has the flags
VM_IO | VM_PFNMAP, the pin operation will fail. Moreover, the pin
operation incurs a large overhead, which results in a longer startup
time for the VM.
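
For readers less familiar with the userspace side, the call described
above can be sketched as follows. This is a minimal illustration using
the existing type1 uAPI from <linux/vfio.h>; the helper names
fill_dma_map() and map_dma() are made up for this sketch, and
container_fd is assumed to be an already configured VFIO container.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical helper: fill a type1 DMA map request that maps
 * [vaddr, vaddr + size) at the given IOVA, readable and writable. */
static void fill_dma_map(struct vfio_iommu_type1_dma_map *map,
                         void *vaddr, uint64_t iova, uint64_t size)
{
    memset(map, 0, sizeof(*map));
    map->argsz = sizeof(*map);
    map->flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map->vaddr = (uint64_t)(uintptr_t)vaddr;
    map->iova  = iova;
    map->size  = size;
}

/* Hypothetical helper: issue the map request on a VFIO container fd.
 * Returns 0 on success, -1 with errno set on failure. */
static int map_dma(int container_fd, void *vaddr,
                   uint64_t iova, uint64_t size)
{
    struct vfio_iommu_type1_dma_map map;

    fill_dma_map(&map, vaddr, iova, size);
    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}
```

In the scenario this series targets, vaddr would be the address returned
by mmap() on a BAR region of a VFIO device fd, so the range is not
backed by struct pages.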
We don't actually need a pin in this scenario. To address this issue, we
introduce a new DMA map flag 'VFIO_DMA_MAP_FLAG_MMIO_DONT_PIN' to skip
the 'vfio_pin_pages_remote' operation in the DMA map process for MMIO
memory. Additionally, we add the 'VM_PGOFF_IS_PFN' flag for the
vfio_pci_mmap address range, ensuring that we can obtain the pfn
directly through vma->vm_pgoff.

This approach allows us to avoid unnecessary memory pinning operations,
which would otherwise introduce additional overhead during DMA mapping.

In my tests, using vfio to pass through an 8-card AMD GPU setup with a
large BAR size (128GB*8), the time to map the 192GB*8 BAR was reduced
from about 50.79s to 1.57s.

Qinyun Tan (2):
  mm: introduce vma flag VM_PGOFF_IS_PFN
  vfio: avoid unnecessary pin memory when dma map io address space

 drivers/vfio/pci/vfio_pci_core.c |  2 +-
 drivers/vfio/vfio_iommu_type1.c  | 64 +++++++++++++++++++++++++-------
 include/linux/mm.h               |  6 +++
 include/uapi/linux/vfio.h        | 11 ++++++
 4 files changed, 68 insertions(+), 15 deletions(-)
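
As an illustration of the "obtain the pfn directly through
vma->vm_pgoff" idea in the cover letter, here is a small userspace model
of the address arithmetic. It is not code from the patches: struct
sketch_vma, sketch_vaddr_to_pfn() and SKETCH_PAGE_SHIFT are hypothetical
stand-ins, and it assumes 4 KiB pages and that vm_pgoff holds the pfn of
the first page of the vma when VM_PGOFF_IS_PFN is set.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the few vm_area_struct fields used here. */
struct sketch_vma {
    uint64_t vm_start; /* first user virtual address covered */
    uint64_t vm_end;   /* one past the last covered address */
    uint64_t vm_pgoff; /* assumed: pfn of the first page of the vma */
};

/* Translate a user virtual address inside the vma to a pfn without
 * pinning or walking page tables.
 * Returns 0 and stores the pfn, or -1 if vaddr is outside the vma. */
static int sketch_vaddr_to_pfn(const struct sketch_vma *vma,
                               uint64_t vaddr, uint64_t *pfn)
{
    if (vaddr < vma->vm_start || vaddr >= vma->vm_end)
        return -1;

    *pfn = vma->vm_pgoff +
           ((vaddr - vma->vm_start) >> SKETCH_PAGE_SHIFT);
    return 0;
}
```

With this model, an address two pages into the vma resolves to
vm_pgoff + 2, which is the kind of lookup that lets the map path skip
the pin step for MMIO ranges.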