From patchwork Fri Jan 17 10:03:32 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943116 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21E46C02185 for ; Fri, 17 Jan 2025 10:04:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 996B0280004; Fri, 17 Jan 2025 05:04:06 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 941B9280001; Fri, 17 Jan 2025 05:04:06 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7BC17280004; Fri, 17 Jan 2025 05:04:06 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 5E063280001 for ; Fri, 17 Jan 2025 05:04:06 -0500 (EST) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 144CC14170C for ; Fri, 17 Jan 2025 10:04:06 +0000 (UTC) X-FDA: 83016508092.05.AFB6EBF Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf22.hostedemail.com (Postfix) with ESMTP id 65F51C0007 for ; Fri, 17 Jan 2025 10:04:04 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=t5vYGL12; spf=pass (imf22.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737108244; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: 
content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Tnr0OKnvJaatGiIAeE+nswae8hX4XeH5kGaseYREXnM=; b=eh/1xqwJDchB9W0JSTupnVQBNoZutXfv8f3wxYGxCibmKprRjpqgKSBCYnjSpRvYXBTUmd HOadNLNek6xb+xR2kQtw/rTMh8SSshPYotiL63CSXo6BgVgRn5MTsvzTwloR59c6vOvz6d gIYy3LZV79ENwECD1rdM2ByDNvPJr9E= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=t5vYGL12; spf=pass (imf22.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737108244; a=rsa-sha256; cv=none; b=WOQa4v06ngez+9dTrcoX7J5TlmfsfSE4luRwP5YTJ0zqQGaJKTqad8nB5x/jp4EMWkl28i Jnl982tmDfmw8MeCSdybPk+OhW5nP+rxwGmtb/HmUwdW/bIrSBgWEQI0+J9YwdBVElae+B 8PDr104sP5NF1/Dwpg4pZBNwplmWyoE= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id CBE80A42B2C; Fri, 17 Jan 2025 10:02:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A385FC4CEDD; Fri, 17 Jan 2025 10:04:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1737108243; bh=pFa9TE7b6sgm+nu0XsE2bcdNK8a5tHk/BkZ6iI4p+pA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t5vYGL12izCzdNdTDhvTGqw4yPDYIcRxE+iqauIaAKg+HkWOcjcA7UCii3oiLFEKz ON/8fGFbBmYnsko+d7BeFqGS/7kIjDeA/FW+MhxU4pz8w+N1NZNALYvYLBzdKsItvc Qm5jLvhSoHadfxlPLEp2bvP6z+5Vg1A2cz0ysfB3aCOKAt4pDIPRqQdt1PshNCAubz 0lz93K/mRN5RRqC++sSKIIqJg6XV57NORt1W9uHEbCSeeK37KxZf/j+vewwBfrVRdl 8rSqmd+tOsgZCWJRCxaZ3OJNg1FrwMDf4y5DkXivFwUZIj6bD9PmeQDDtduotXnLUp LJsKMuIovCX6A== From: Leon Romanovsky To: Christoph Hellwig , Jason Gunthorpe , Robin Murphy Cc: Jens Axboe , Joerg Roedel , Will Deacon , Sagi Grimberg , "Keith Busch" , "Bjorn Helgaas" , "Logan Gunthorpe" , "Yishai Hadas" , "Shameer Kolothum" , 
"Kevin Tian" , "Alex Williamson" , "Marek Szyprowski" , =?utf-8?b?SsOpcsO0bWUgR2xp?= =?utf-8?b?c3Nl?= , "Andrew Morton" , "Jonathan Corbet" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap" Subject: [PATCH v6 01/17] PCI/P2PDMA: Refactor the p2pdma mapping helpers Date: Fri, 17 Jan 2025 12:03:32 +0200 Message-ID: X-Mailer: git-send-email 2.47.1 In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 65F51C0007 X-Stat-Signature: cb3mzwtdko5piehnxc6nfp1my9344gt7 X-Rspam-User: X-HE-Tag: 1737108244-373361 X-HE-Meta: U2FsdGVkX1+TlNpcngpqbVaxbrObTTd4gnlBNJ98k9/c8Y35qZLNEJogT0DqacWbMsD+K2Z1dC5zmCIjSyPykIQQKt7aNcTKrZB7yAJaWChNPkiiNLm8PRUUH3MwLg1L0v1jX0niCNrqTQXNeoHEznJbmzpRhpRABt+8ic+Z7gLCZAD9hE/Tls+76UmGjHFUsKWVIzA9Nnj7Pk3K/NFeE5U2qRVHEeAP33511Ilzgtb5thtnSL1D5xwhn5ZbUo8zYRMcso0Z3zWnMwYDCFsfEeJ/vzTolVR7fT/oj2RM5R1tsTSjbTcDmKHKmNVdxriGe6XNPs8AhjNamMq5VyI+qWon79nt0DCOQEnTtOuYMqzu2xuG0ls8AczesMt+r2Ska5aIw/tm0zU0fIr33Fe+k0EpmtOMGGMz5ANveKoKSWpfEEvBmbI/RiJdTCdjWEmzASL/nC4JiQp4fKvqJhnEDPMfxJ/lOSN3xcCzU0NJz0ZBPWWTPDKy7HqiiEqpgrsLdKHQdu5YAqzVH/cc+MXFTnvu3iEcvkttgczQruvlo8gCEWPoQZPvtCW/GvX9wNH5crgBLHyuvu/0AnjrRYAWrxS06Q4JuMm8u/ZGgfVNXUxVJt9WbXZ481isxM+KeJK7FQrecXzNNu6AImNx4Kr0pKdD0F6ae/N+oDRfvxXQbFhGu8wTheRoygid4XnwnhAmdEAp0ZMCekP/RAWIFcuHZeyMlcbgBcG1wutPcy+J7SrQMFWWTFJYptUm8LbkZ/RUpO1XPxJt3Pq7APrCw6CYet5uSF6cJOXzsLt9XjRpPeZnenNkqwGt8cyacpD5xnd3yZsDiOtQXVfmYkQpLkq0pGnqt5Vdz4CgQWhvXqSFGqfd0zhQda9E30qo/xnowDEAljwSmkOOjD2wIlri7mnACZvHlAEzy9KA1GEVY3QazCFw/KrlX5XF8iGjeaGr8/fmJpLdJ3TMtXhG2TbZnEN /Ia5+9N/ 
ll4J1AsOWXK+H8okjGmK+lUXV2EAwiiX0pOu6nu6ZbnbbiWCk59RpqMLEKVay2DEM8rP57piL9MwSEUwvXwbip17oRtNlZcnPGNXNHrL8AQ5ZUPVxK4sx6mKmKk/LMINm9Isk/TPj0kJwyvvC+WjlQsNKFY8HHPjJ2ThM5YjvQygiWq8j/pYkH+ZEV9IKqN32gLrkTV7fkR2fYHEVb+A8NRD14vlncuCsBDdXhazURtqio8qq387JmJYbDZyjkcY8wIqAPfl+aPEXDVjQgET6asRC42jFUwmHXvJhAr895pg0v51J4gVRPocg4g==
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
List-Subscribe:
List-Unsubscribe:

From: Christoph Hellwig

The current scheme, with a single helper that both determines the P2P
status and maps a scatterlist segment, forces users to always use the
map_sg helper to DMA map, which we're trying to get away from because
scatterlists are very cache inefficient.

Refactor the code so that there is a single helper that checks the P2P
state for a page, including the result that it is not a P2P page to
simplify the callers, and a second one that performs the address
translation for a bus mapped P2P transfer and does not depend on the
scatterlist structure.

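As a rough illustration of the split described above, the following is a
userspace model, not the kernel code. All model_* and p2p_* names are
hypothetical stand-ins: the real helpers take struct device and struct page,
and the real map type is computed from the PCI topology rather than stored in
the pagemap. The sketch only shows the shape of the new calling convention: one
call classifies a page and caches the per-pagemap state, and a second call
translates a physical address for the bus-address case, with no scatterlist
involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum p2p_map_type {			/* models enum pci_p2pdma_map_type */
	P2P_MAP_UNKNOWN = 0,
	P2P_MAP_NONE,
	P2P_MAP_NOT_SUPPORTED,
	P2P_MAP_BUS_ADDR,
	P2P_MAP_THRU_HOST_BRIDGE,
};

struct model_pgmap {			/* hypothetical stand-in for dev_pagemap */
	enum p2p_map_type type;		/* assumption: precomputed, no topology walk */
	uint64_t bus_offset;
};

struct model_page {			/* hypothetical stand-in for struct page */
	struct model_pgmap *pgmap;	/* NULL means "not a P2PDMA page" */
	uint64_t phys;
};

struct p2p_state {			/* models struct pci_p2pdma_map_state */
	struct model_pgmap *pgmap;
	enum p2p_map_type map;
	uint64_t bus_off;
};

int p2p_slow_path_calls;		/* counts refreshes of the cached state */

/* Slow path: refresh the cached state when the pagemap changes. */
static void p2p_update_state(struct p2p_state *state, struct model_page *page)
{
	p2p_slow_path_calls++;
	state->pgmap = page->pgmap;
	state->map = page->pgmap->type;
	state->bus_off = page->pgmap->bus_offset;
}

/* First helper: classify a page, reporting P2P_MAP_NONE for normal memory. */
static enum p2p_map_type p2p_state(struct p2p_state *state,
				   struct model_page *page)
{
	if (page->pgmap) {
		if (state->pgmap != page->pgmap)
			p2p_update_state(state, page);
		return state->map;
	}
	return P2P_MAP_NONE;
}

/* Second helper: physical to bus address translation, no scatterlist. */
static uint64_t p2p_bus_addr_map(const struct p2p_state *state, uint64_t paddr)
{
	return paddr + state->bus_off;
}

/* Caller: map n pages into out[]. Returns 0 on success, -1 if unsupported. */
int model_map_pages(struct model_page *pages, size_t n, uint64_t *out)
{
	struct p2p_state state = { 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		switch (p2p_state(&state, &pages[i])) {
		case P2P_MAP_NONE:
		case P2P_MAP_THRU_HOST_BRIDGE:
			/* the model uses an identity "regular" mapping */
			out[i] = pages[i].phys;
			break;
		case P2P_MAP_BUS_ADDR:
			out[i] = p2p_bus_addr_map(&state, pages[i].phys);
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because the state is declared once outside the loop, consecutive pages from the
same pagemap reuse the cached map type and bus offset and only the first one
takes the slow path.
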
Signed-off-by: Christoph Hellwig
Reviewed-by: Logan Gunthorpe
Acked-by: Bjorn Helgaas
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   | 47 +++++++++++++++++-----------------
 drivers/pci/p2pdma.c        | 38 ++++-----------------------
 include/linux/dma-map-ops.h | 51 +++++++++++++++++++++++++++++--------
 kernel/dma/direct.c         | 43 +++++++++++++++----------------
 4 files changed, 91 insertions(+), 88 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 2a9fa0c8cc00..5746ffaf0061 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1382,7 +1382,6 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	struct scatterlist *s, *prev = NULL;
 	int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs);
 	struct pci_p2pdma_map_state p2pdma_state = {};
-	enum pci_p2pdma_map_type map;
 	dma_addr_t iova;
 	size_t iova_len = 0;
 	unsigned long mask = dma_get_seg_boundary(dev);
@@ -1412,28 +1411,30 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 		size_t s_length = s->length;
 		size_t pad_len = (mask - iova_len + 1) & mask;
 
-		if (is_pci_p2pdma_page(sg_page(s))) {
-			map = pci_p2pdma_map_segment(&p2pdma_state, dev, s);
-			switch (map) {
-			case PCI_P2PDMA_MAP_BUS_ADDR:
-				/*
-				 * iommu_map_sg() will skip this segment as
-				 * it is marked as a bus address,
-				 * __finalise_sg() will copy the dma address
-				 * into the output segment.
-				 */
-				continue;
-			case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
-				/*
-				 * Mapping through host bridge should be
-				 * mapped with regular IOVAs, thus we
-				 * do nothing here and continue below.
-				 */
-				break;
-			default:
-				ret = -EREMOTEIO;
-				goto out_restore_sg;
-			}
+		switch (pci_p2pdma_state(&p2pdma_state, dev, sg_page(s))) {
+		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
+			/*
+			 * Mapping through host bridge should be mapped with
+			 * regular IOVAs, thus we do nothing here and continue
+			 * below.
+			 */
+			break;
+		case PCI_P2PDMA_MAP_NONE:
+			break;
+		case PCI_P2PDMA_MAP_BUS_ADDR:
+			/*
+			 * iommu_map_sg() will skip this segment as it is marked
+			 * as a bus address, __finalise_sg() will copy the dma
+			 * address into the output segment.
+			 */
+			s->dma_address = pci_p2pdma_bus_addr_map(&p2pdma_state,
+						sg_phys(s));
+			sg_dma_len(s) = sg->length;
+			sg_dma_mark_bus_address(s);
+			continue;
+		default:
+			ret = -EREMOTEIO;
+			goto out_restore_sg;
 		}
 
 		sg_dma_address(s) = s_iova_off;
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 7abd4f546d3c..82b6ed736f0f 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -995,40 +995,12 @@ static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
 	return type;
 }
 
-/**
- * pci_p2pdma_map_segment - map an sg segment determining the mapping type
- * @state: State structure that should be declared outside of the for_each_sg()
- *	loop and initialized to zero.
- * @dev: DMA device that's doing the mapping operation
- * @sg: scatterlist segment to map
- *
- * This is a helper to be used by non-IOMMU dma_map_sg() implementations where
- * the sg segment is the same for the page_link and the dma_address.
- *
- * Attempt to map a single segment in an SGL with the PCI bus address.
- * The segment must point to a PCI P2PDMA page and thus must be
- * wrapped in a is_pci_p2pdma_page(sg_page(sg)) check.
- *
- * Returns the type of mapping used and maps the page if the type is
- * PCI_P2PDMA_MAP_BUS_ADDR.
- */
-enum pci_p2pdma_map_type
-pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
-		       struct scatterlist *sg)
+void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
+		struct device *dev, struct page *page)
 {
-	if (state->pgmap != sg_page(sg)->pgmap) {
-		state->pgmap = sg_page(sg)->pgmap;
-		state->map = pci_p2pdma_map_type(state->pgmap, dev);
-		state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset;
-	}
-
-	if (state->map == PCI_P2PDMA_MAP_BUS_ADDR) {
-		sg->dma_address = sg_phys(sg) + state->bus_off;
-		sg_dma_len(sg) = sg->length;
-		sg_dma_mark_bus_address(sg);
-	}
-
-	return state->map;
+	state->pgmap = page->pgmap;
+	state->map = pci_p2pdma_map_type(state->pgmap, dev);
+	state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset;
 }
 
 /**
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index e172522cd936..63dd480e209b 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -443,6 +443,11 @@ enum pci_p2pdma_map_type {
 	 */
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
 
+	/*
+	 * Not a PCI P2PDMA transfer.
+	 */
+	PCI_P2PDMA_MAP_NONE,
+
 	/*
 	 * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will
 	 * traverse the host bridge and the host bridge is not in the
@@ -471,21 +476,47 @@ enum pci_p2pdma_map_type {
 
 struct pci_p2pdma_map_state {
 	struct dev_pagemap *pgmap;
-	int map;
+	enum pci_p2pdma_map_type map;
 	u64 bus_off;
 };
 
-#ifdef CONFIG_PCI_P2PDMA
-enum pci_p2pdma_map_type
-pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
-		       struct scatterlist *sg);
-#else /* CONFIG_PCI_P2PDMA */
+/* helper for pci_p2pdma_state(), do not use directly */
+void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
+		struct device *dev, struct page *page);
+
+/**
+ * pci_p2pdma_state - check the P2P transfer state of a page
+ * @state: P2P state structure
+ * @dev: device to transfer to/from
+ * @page: page to map
+ *
+ * Check if @page is a PCI P2PDMA page, and if yes of what kind. Returns the
+ * map type, and updates @state with all information needed for a P2P transfer.
+ */
 static inline enum pci_p2pdma_map_type
-pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
-		       struct scatterlist *sg)
+pci_p2pdma_state(struct pci_p2pdma_map_state *state, struct device *dev,
+		struct page *page)
+{
+	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && is_pci_p2pdma_page(page)) {
+		if (state->pgmap != page->pgmap)
+			__pci_p2pdma_update_state(state, dev, page);
+		return state->map;
+	}
+	return PCI_P2PDMA_MAP_NONE;
+}
+
+/**
+ * pci_p2pdma_bus_addr_map - map a PCI_P2PDMA_MAP_BUS_ADDR P2P transfer
+ * @state: P2P state structure
+ * @paddr: physical address to map
+ *
+ * Map a physically contigous PCI_P2PDMA_MAP_BUS_ADDR transfer.
+ */
+static inline dma_addr_t
+pci_p2pdma_bus_addr_map(struct pci_p2pdma_map_state *state, phys_addr_t paddr)
 {
-	return PCI_P2PDMA_MAP_NOT_SUPPORTED;
+	WARN_ON_ONCE(state->map != PCI_P2PDMA_MAP_BUS_ADDR);
+	return paddr + state->bus_off;
 }
-#endif /* CONFIG_PCI_P2PDMA */
 
 #endif /* _LINUX_DMA_MAP_OPS_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 5b4e6d3bf7bc..e289ad27d1b5 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -462,34 +462,33 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents,
 		enum dma_data_direction dir, unsigned long attrs)
 {
 	struct pci_p2pdma_map_state p2pdma_state = {};
-	enum pci_p2pdma_map_type map;
 	struct scatterlist *sg;
 	int i, ret;
 
 	for_each_sg(sgl, sg, nents, i) {
-		if (is_pci_p2pdma_page(sg_page(sg))) {
-			map = pci_p2pdma_map_segment(&p2pdma_state, dev, sg);
-			switch (map) {
-			case PCI_P2PDMA_MAP_BUS_ADDR:
-				continue;
-			case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
-				/*
-				 * Any P2P mapping that traverses the PCI
-				 * host bridge must be mapped with CPU physical
-				 * address and not PCI bus addresses. This is
-				 * done with dma_direct_map_page() below.
-				 */
-				break;
-			default:
-				ret = -EREMOTEIO;
+		switch (pci_p2pdma_state(&p2pdma_state, dev, sg_page(sg))) {
+		case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
+			/*
+			 * Any P2P mapping that traverses the PCI host bridge
+			 * must be mapped with CPU physical address and not PCI
+			 * bus addresses.
+			 */
+			break;
+		case PCI_P2PDMA_MAP_NONE:
+			sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
+					sg->offset, sg->length, dir, attrs);
+			if (sg->dma_address == DMA_MAPPING_ERROR) {
+				ret = -EIO;
 				goto out_unmap;
 			}
-		}
-
-		sg->dma_address = dma_direct_map_page(dev, sg_page(sg),
-				sg->offset, sg->length, dir, attrs);
-		if (sg->dma_address == DMA_MAPPING_ERROR) {
-			ret = -EIO;
+			break;
+		case PCI_P2PDMA_MAP_BUS_ADDR:
+			sg->dma_address = pci_p2pdma_bus_addr_map(&p2pdma_state,
+					sg_phys(sg));
+			sg_dma_mark_bus_address(sg);
+			continue;
+		default:
+			ret = -EREMOTEIO;
 			goto out_unmap;
 		}
 		sg_dma_len(sg) = sg->length;

From patchwork Fri Jan 17 10:03:33 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943119
From: Leon Romanovsky
Subject: [PATCH v6 02/17] dma-mapping: move the PCI P2PDMA mapping helpers to pci-p2pdma.h
Date: Fri, 17 Jan 2025 12:03:33 +0200
Message-ID: <15e9becd1a061b538b44cbe02a47beeed0f53771.1737106761.git.leon@kernel.org>

From: Christoph Hellwig

To support the upcoming non-scatterlist mapping helpers, we need to go
back to having them called outside of the DMA API. Thus move them out of
dma-map-ops.h, which is only for DMA API implementations, to
pci-p2pdma.h, which is for driver use.

Note that the core helper is still not exported, as the mapping is
expected to be done only by very high-level subsystem code, at least for
now.

Signed-off-by: Christoph Hellwig
Reviewed-by: Logan Gunthorpe
Acked-by: Bjorn Helgaas
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   |  1 +
 include/linux/dma-map-ops.h | 85 -------------------------------------
 include/linux/pci-p2pdma.h  | 84 ++++++++++++++++++++++++++++++++++++
 kernel/dma/direct.c         |  1 +
 4 files changed, 86 insertions(+), 85 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 5746ffaf0061..853247c42f7d 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 63dd480e209b..f48e5fb88bd5 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -434,89 +434,4 @@ static inline void debug_dma_dump_mappings(struct device *dev)
 #endif /* CONFIG_DMA_API_DEBUG */
 
 extern const struct dma_map_ops dma_dummy_ops;
-
-enum pci_p2pdma_map_type {
-	/*
-	 * PCI_P2PDMA_MAP_UNKNOWN: Used internally for indicating the mapping
-	 * type hasn't been calculated yet. Functions that return this enum
-	 * never return this value.
-	 */
-	PCI_P2PDMA_MAP_UNKNOWN = 0,
-
-	/*
-	 * Not a PCI P2PDMA transfer.
-	 */
-	PCI_P2PDMA_MAP_NONE,
-
-	/*
-	 * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will
-	 * traverse the host bridge and the host bridge is not in the
-	 * allowlist. DMA Mapping routines should return an error when
-	 * this is returned.
-	 */
-	PCI_P2PDMA_MAP_NOT_SUPPORTED,
-
-	/*
-	 * PCI_P2PDMA_BUS_ADDR: Indicates that two devices can talk to
-	 * each other directly through a PCI switch and the transaction will
-	 * not traverse the host bridge. Such a mapping should program
-	 * the DMA engine with PCI bus addresses.
-	 */
-	PCI_P2PDMA_MAP_BUS_ADDR,
-
-	/*
-	 * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: Indicates two devices can talk
-	 * to each other, but the transaction traverses a host bridge on the
-	 * allowlist. In this case, a normal mapping either with CPU physical
-	 * addresses (in the case of dma-direct) or IOVA addresses (in the
-	 * case of IOMMUs) should be used to program the DMA engine.
-	 */
-	PCI_P2PDMA_MAP_THRU_HOST_BRIDGE,
-};
-
-struct pci_p2pdma_map_state {
-	struct dev_pagemap *pgmap;
-	enum pci_p2pdma_map_type map;
-	u64 bus_off;
-};
-
-/* helper for pci_p2pdma_state(), do not use directly */
-void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
-		struct device *dev, struct page *page);
-
-/**
- * pci_p2pdma_state - check the P2P transfer state of a page
- * @state: P2P state structure
- * @dev: device to transfer to/from
- * @page: page to map
- *
- * Check if @page is a PCI P2PDMA page, and if yes of what kind. Returns the
- * map type, and updates @state with all information needed for a P2P transfer.
- */
-static inline enum pci_p2pdma_map_type
-pci_p2pdma_state(struct pci_p2pdma_map_state *state, struct device *dev,
-		struct page *page)
-{
-	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && is_pci_p2pdma_page(page)) {
-		if (state->pgmap != page->pgmap)
-			__pci_p2pdma_update_state(state, dev, page);
-		return state->map;
-	}
-	return PCI_P2PDMA_MAP_NONE;
-}
-
-/**
- * pci_p2pdma_bus_addr_map - map a PCI_P2PDMA_MAP_BUS_ADDR P2P transfer
- * @state: P2P state structure
- * @paddr: physical address to map
- *
- * Map a physically contigous PCI_P2PDMA_MAP_BUS_ADDR transfer.
- */
-static inline dma_addr_t
-pci_p2pdma_bus_addr_map(struct pci_p2pdma_map_state *state, phys_addr_t paddr)
-{
-	WARN_ON_ONCE(state->map != PCI_P2PDMA_MAP_BUS_ADDR);
-	return paddr + state->bus_off;
-}
-
 #endif /* _LINUX_DMA_MAP_OPS_H */
diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h
index 2c07aa6b7665..e839f52b512b 100644
--- a/include/linux/pci-p2pdma.h
+++ b/include/linux/pci-p2pdma.h
@@ -104,4 +104,88 @@ static inline struct pci_dev *pci_p2pmem_find(struct device *client)
 	return pci_p2pmem_find_many(&client, 1);
 }
 
+enum pci_p2pdma_map_type {
+	/*
+	 * PCI_P2PDMA_MAP_UNKNOWN: Used internally for indicating the mapping
+	 * type hasn't been calculated yet. Functions that return this enum
+	 * never return this value.
+	 */
+	PCI_P2PDMA_MAP_UNKNOWN = 0,
+
+	/*
+	 * Not a PCI P2PDMA transfer.
+	 */
+	PCI_P2PDMA_MAP_NONE,
+
+	/*
+	 * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will
+	 * traverse the host bridge and the host bridge is not in the
+	 * allowlist. DMA Mapping routines should return an error when
+	 * this is returned.
+	 */
+	PCI_P2PDMA_MAP_NOT_SUPPORTED,
+
+	/*
+	 * PCI_P2PDMA_BUS_ADDR: Indicates that two devices can talk to
+	 * each other directly through a PCI switch and the transaction will
+	 * not traverse the host bridge. Such a mapping should program
+	 * the DMA engine with PCI bus addresses.
+	 */
+	PCI_P2PDMA_MAP_BUS_ADDR,
+
+	/*
+	 * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: Indicates two devices can talk
+	 * to each other, but the transaction traverses a host bridge on the
+	 * allowlist. In this case, a normal mapping either with CPU physical
+	 * addresses (in the case of dma-direct) or IOVA addresses (in the
+	 * case of IOMMUs) should be used to program the DMA engine.
+	 */
+	PCI_P2PDMA_MAP_THRU_HOST_BRIDGE,
+};
+
+struct pci_p2pdma_map_state {
+	struct dev_pagemap *pgmap;
+	enum pci_p2pdma_map_type map;
+	u64 bus_off;
+};
+
+/* helper for pci_p2pdma_state(), do not use directly */
+void __pci_p2pdma_update_state(struct pci_p2pdma_map_state *state,
+		struct device *dev, struct page *page);
+
+/**
+ * pci_p2pdma_state - check the P2P transfer state of a page
+ * @state: P2P state structure
+ * @dev: device to transfer to/from
+ * @page: page to map
+ *
+ * Check if @page is a PCI P2PDMA page, and if yes of what kind. Returns the
+ * map type, and updates @state with all information needed for a P2P transfer.
+ */
+static inline enum pci_p2pdma_map_type
+pci_p2pdma_state(struct pci_p2pdma_map_state *state, struct device *dev,
+		struct page *page)
+{
+	if (IS_ENABLED(CONFIG_PCI_P2PDMA) && is_pci_p2pdma_page(page)) {
+		if (state->pgmap != page->pgmap)
+			__pci_p2pdma_update_state(state, dev, page);
+		return state->map;
+	}
+	return PCI_P2PDMA_MAP_NONE;
+}
+
+/**
+ * pci_p2pdma_bus_addr_map - map a PCI_P2PDMA_MAP_BUS_ADDR P2P transfer
+ * @state: P2P state structure
+ * @paddr: physical address to map
+ *
+ * Map a physically contigous PCI_P2PDMA_MAP_BUS_ADDR transfer.
+ */
+static inline dma_addr_t
+pci_p2pdma_bus_addr_map(struct pci_p2pdma_map_state *state, phys_addr_t paddr)
+{
+	WARN_ON_ONCE(state->map != PCI_P2PDMA_MAP_BUS_ADDR);
+	return paddr + state->bus_off;
+}
+
 #endif /* _LINUX_PCI_P2P_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index e289ad27d1b5..c9b3893257d4 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include "direct.h"
 
 /*

From patchwork Fri Jan 17 10:03:34 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943117
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg, Keith Busch,
	Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas, Shameer Kolothum,
	Kevin Tian, Alex Williamson, Marek Szyprowski, Jérôme Glisse,
	Andrew Morton, Jonathan Corbet, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
	linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org, linux-mm@kvack.org, Randy Dunlap
Subject: [PATCH v6 03/17] iommu: generalize the batched sync after map interface
Date: Fri, 17 Jan 2025 12:03:34 +0200

From: Christoph Hellwig

For the upcoming IOVA-based DMA API we want to use the interface to
batch the sync after mapping multiple entries from dma-iommu without
having a scatterlist.

For that, move more sanity checks from the callers into __iommu_map and
make that function available outside of iommu.c as iommu_map_nosync.

Add a wrapper for the iotlb_sync_map method as iommu_sync_map so that
callers don't need to poke into the methods directly.
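To illustrate the calling pattern this enables, here is a minimal
userspace sketch, not kernel code. The toy_* functions and struct
toy_domain are hypothetical stand-ins for iommu_map_nosync() and
iommu_sync_map(); the model only counts calls to show that several
ranges are mapped first and the sync is issued once for the whole span.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_domain {
	unsigned int maps;	/* number of ranges mapped */
	unsigned int syncs;	/* number of sync-after-map calls */
	size_t mapped;		/* total bytes mapped */
};

/* Stand-in for iommu_map_nosync(): record a mapping, do not sync. */
static int toy_map_nosync(struct toy_domain *d, unsigned long iova,
			  size_t size)
{
	(void)iova;
	if (!size)
		return -1;
	d->maps++;
	d->mapped += size;
	return 0;
}

/* Stand-in for iommu_sync_map(): one sync for [iova, iova + size). */
static int toy_sync_map(struct toy_domain *d, unsigned long iova,
			size_t size)
{
	(void)iova;
	(void)size;
	d->syncs++;
	return 0;
}

/* Map n equally sized ranges back to back, then sync once for the lot. */
static int toy_map_batch(struct toy_domain *d, unsigned long iova,
			 size_t size, unsigned int n)
{
	size_t done = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		int ret = toy_map_nosync(d, iova + done, size);

		if (ret)
			return ret;
		done += size;
	}
	return toy_sync_map(d, iova, done);
}
```

A real caller would additionally have to unmap the already mapped part
when a step fails, as iommu_map() does in this patch when the sync
returns an error.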
Signed-off-by: Christoph Hellwig
Acked-by: Will Deacon
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/iommu.c | 65 +++++++++++++++++++------------------------
 include/linux/iommu.h |  4 +++
 2 files changed, 33 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9bc0c74cca3c..ec75d14497bf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2412,8 +2412,8 @@ static size_t iommu_pgsize(struct iommu_domain *domain, unsigned long iova,
 	return pgsize;
 }
 
-static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
-		       phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova,
+		phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
 	const struct iommu_domain_ops *ops = domain->ops;
 	unsigned long orig_iova = iova;
@@ -2422,12 +2422,19 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 	phys_addr_t orig_paddr = paddr;
 	int ret = 0;
 
+	might_sleep_if(gfpflags_allow_blocking(gfp));
+
 	if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING)))
 		return -EINVAL;
 
 	if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL))
 		return -ENODEV;
 
+	/* Discourage passing strange GFP flags */
+	if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 |
+				__GFP_HIGHMEM)))
+		return -EINVAL;
+
 	/* find out the minimum page size supported */
 	min_pagesz = 1 << __ffs(domain->pgsize_bitmap);
 
@@ -2475,31 +2482,27 @@ static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
 	return ret;
 }
 
-int iommu_map(struct iommu_domain *domain, unsigned long iova,
-	      phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+int iommu_sync_map(struct iommu_domain *domain, unsigned long iova, size_t size)
 {
 	const struct iommu_domain_ops *ops = domain->ops;
-	int ret;
-
-	might_sleep_if(gfpflags_allow_blocking(gfp));
 
-	/* Discourage passing strange GFP flags */
-	if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 |
-				__GFP_HIGHMEM)))
-		return -EINVAL;
+	if (!ops->iotlb_sync_map)
+		return 0;
+	return ops->iotlb_sync_map(domain, iova, size);
+}
 
-	ret = __iommu_map(domain, iova, paddr, size, prot, gfp);
-	if (ret == 0 && ops->iotlb_sync_map) {
-		ret = ops->iotlb_sync_map(domain, iova, size);
-		if (ret)
-			goto out_err;
-	}
+int iommu_map(struct iommu_domain *domain, unsigned long iova,
+	      phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+	int ret;
 
-	return ret;
+	ret = iommu_map_nosync(domain, iova, paddr, size, prot, gfp);
+	if (ret)
+		return ret;
 
-out_err:
-	/* undo mappings already done */
-	iommu_unmap(domain, iova, size);
+	ret = iommu_sync_map(domain, iova, size);
+	if (ret)
+		iommu_unmap(domain, iova, size);
 
 	return ret;
 }
@@ -2599,26 +2602,17 @@ ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 		     struct scatterlist *sg, unsigned int nents, int prot,
 		     gfp_t gfp)
 {
-	const struct iommu_domain_ops *ops = domain->ops;
 	size_t len = 0, mapped = 0;
 	phys_addr_t start;
 	unsigned int i = 0;
 	int ret;
 
-	might_sleep_if(gfpflags_allow_blocking(gfp));
-
-	/* Discourage passing strange GFP flags */
-	if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 |
-				__GFP_HIGHMEM)))
-		return -EINVAL;
-
 	while (i <= nents) {
 		phys_addr_t s_phys = sg_phys(sg);
 
 		if (len && s_phys != start + len) {
-			ret = __iommu_map(domain, iova + mapped, start,
+			ret = iommu_map_nosync(domain, iova + mapped, start,
 					len, prot, gfp);
-
 			if (ret)
 				goto out_err;
 
@@ -2641,11 +2635,10 @@ ssize_t iommu_map_sg(struct iommu_domain *domain, unsigned long iova,
 			sg = sg_next(sg);
 	}
 
-	if (ops->iotlb_sync_map) {
-		ret = ops->iotlb_sync_map(domain, iova, mapped);
-		if (ret)
-			goto out_err;
-	}
+	ret = iommu_sync_map(domain, iova, mapped);
+	if (ret)
+		goto out_err;
+
 	return mapped;
 
 out_err:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 318d27841130..de77012f76d5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -862,6 +862,10 @@ extern struct iommu_domain *iommu_get_domain_for_dev(struct device *dev);
 extern struct iommu_domain *iommu_get_dma_domain(struct device *dev);
 extern int iommu_map(struct iommu_domain *domain, unsigned long iova,
 		     phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova,
+		phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+int iommu_sync_map(struct iommu_domain *domain, unsigned long iova,
+		size_t size);
 extern size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova,
 			  size_t size);
 extern size_t iommu_unmap_fast(struct iommu_domain *domain,

From patchwork Fri Jan 17 10:03:35 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943118
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg,
	Keith Busch, Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas,
	Shameer Kolothum, Kevin Tian, Alex Williamson, Marek Szyprowski,
	Jérôme Glisse, Andrew Morton, Jonathan Corbet,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	Randy Dunlap, Jason Gunthorpe
Subject: [PATCH v6 04/17] iommu: add kernel-doc for iommu_unmap and iommu_unmap_fast
Date: Fri, 17 Jan 2025 12:03:35 +0200
Message-ID: <0ae577f8b99f7e03c679729434c87ea7daf78955.1737106761.git.leon@kernel.org>

From: Leon Romanovsky

Add a kernel-doc section for iommu_unmap and iommu_unmap_fast to document
the existing limitation of the underlying functions, which can't split
individual ranges.
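The documented rule can be modelled as a simple range check. This is a
userspace sketch under stated assumptions, not how the IOMMU core or any
driver implements unmapping: struct toy_range and toy_unmap_range_ok()
are hypothetical names, and maps[] is assumed to be sorted by IOVA and
non-overlapping. The check accepts an unmap request only if it covers
whole, contiguous, previously mapped ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one earlier map call. */
struct toy_range {
	unsigned long iova;
	size_t size;
};

/*
 * Return true if [iova, iova + size) is made up of whole recorded
 * mappings with no gap between them, i.e. no mapping would be split.
 * maps[] must be sorted by iova and must not overlap.
 */
static bool toy_unmap_range_ok(const struct toy_range *maps, size_t n,
			       unsigned long iova, size_t size)
{
	unsigned long cur = iova, end = iova + size;
	size_t i;

	for (i = 0; i < n && cur < end; i++) {
		if (maps[i].iova + maps[i].size <= cur)
			continue;	/* mapping lies before the request */
		if (maps[i].iova != cur)
			return false;	/* gap, or request starts mid-mapping */
		if (maps[i].size > end - cur)
			return false;	/* request ends mid-mapping */
		cur += maps[i].size;
	}
	return size != 0 && cur == end;
}
```

With mappings of 0x1000 bytes at 0x1000 and 0x2000 bytes at 0x2000, a
request for (0x1000, 0x3000) aggregates both and is accepted, while
(0x2000, 0x1000) would split the second mapping and is rejected.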
Suggested-by: Jason Gunthorpe
Acked-by: Will Deacon
Reviewed-by: Christoph Hellwig
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/iommu.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ec75d14497bf..c86a57abe292 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2590,6 +2590,25 @@ size_t iommu_unmap(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_unmap);
 
+/**
+ * iommu_unmap_fast() - Remove mappings from a range of IOVA without IOTLB sync
+ * @domain: Domain to manipulate
+ * @iova: IO virtual address to start
+ * @size: Length of the range starting from @iova
+ * @iotlb_gather: range information for a pending IOTLB flush
+ *
+ * iommu_unmap_fast() will remove a translation created by iommu_map().
+ * It can't subdivide a mapping created by iommu_map(), so it should be
+ * called with IOVA ranges that match what was passed to iommu_map(). The
+ * range can aggregate contiguous iommu_map() calls so long as no individual
+ * range is split.
+ *
+ * Basically iommu_unmap_fast() is the same as iommu_unmap() but for callers
+ * which manage the IOTLB flushing externally to perform a batched sync.
+ *
+ * Returns: Number of bytes of IOVA unmapped. iova + res will be the point
+ * unmapping stopped.
+ */
 size_t iommu_unmap_fast(struct iommu_domain *domain,
 			unsigned long iova, size_t size,
 			struct iommu_iotlb_gather *iotlb_gather)

From patchwork Fri Jan 17 10:03:36 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943122
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg,
	Keith Busch, Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas,
	Shameer Kolothum, Kevin Tian, Alex Williamson, Marek Szyprowski,
	Jérôme Glisse, Andrew Morton, Jonathan Corbet,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
	iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
	linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	Randy Dunlap
Subject: [PATCH v6 05/17] dma-mapping: Provide an interface to allow allocate IOVA
Date: Fri, 17 Jan 2025 12:03:36 +0200

From: Leon Romanovsky

The existing .map_page() callback both allocates IOVA and links DMA
pages. That combination works well for most callers, which use it in
control paths, but is less efficient in fast paths where map_page() may
be called multiple times.

These advanced callers already manage their data in some sort of
database and can allocate the IOVA in advance, leaving only the range
linkage in the fast path.

Provide an interface to allocate and deallocate IOVA. The next patch
adds link/unlink of DMA ranges to that specific IOVA.

In the new API a DMA mapping transaction is identified by a struct
dma_iova_state, which holds precomputed information for the transaction
that does not change for each page being mapped. Add a check for whether
IOVA can be used for the specific transaction.

The API is exported from dma-iommu, as it is the only supported
implementation. The namespace is clearly different from the iommu_*
functions, which callers are not allowed to use. This code layout saves
one function call per API call in the datapath, as well as a lot of
boilerplate code.
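The state handling described above can be modelled roughly as follows.
This is a userspace sketch, not the dma-iommu code: every toy_* name is
hypothetical, TOY_GRANULE assumes a 4 KiB IOVA granule, and a bump
allocator stands in for the real IOVA allocator. It mirrors two points
from the patch: the allocation keeps the sub-granule offset of the
physical address in the returned address, and the top bit of the size
field is reserved as a flag, so a non-zero size doubles as the "IOVA
path in use" marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_USE_SWIOTLB	(1ULL << 63)	/* mirrors DMA_IOVA_USE_SWIOTLB */
#define TOY_GRANULE	4096ULL		/* assumed IOVA granule size */

/* Hypothetical counterpart of struct dma_iova_state. */
struct toy_iova_state {
	uint64_t addr;
	uint64_t size;	/* top bit reserved for TOY_USE_SWIOTLB */
};

/* Bump allocator standing in for the real IOVA allocator. */
static uint64_t toy_next_iova = 0x100000;

static uint64_t toy_iova_offset(uint64_t phys)
{
	return phys & (TOY_GRANULE - 1);
}

static uint64_t toy_iova_align(uint64_t size)
{
	return (size + TOY_GRANULE - 1) & ~(TOY_GRANULE - 1);
}

/* Reserve granule-aligned space covering size bytes plus the offset. */
static bool toy_iova_try_alloc(struct toy_iova_state *state, uint64_t phys,
			       uint64_t size)
{
	uint64_t off = toy_iova_offset(phys);
	uint64_t base;

	state->addr = 0;
	state->size = 0;
	if (!size || (size & TOY_USE_SWIOTLB))
		return false;

	base = toy_next_iova;
	toy_next_iova += toy_iova_align(size + off);
	state->addr = base + off;
	state->size = size;
	return true;
}

/* A zero size means the regular DMA API has to be used instead. */
static bool toy_use_iova(const struct toy_iova_state *state)
{
	return state->size != 0;
}

/* Strip the flag bit to recover the plain size. */
static uint64_t toy_iova_size(const struct toy_iova_state *state)
{
	return state->size & ~TOY_USE_SWIOTLB;
}
```

A caller would try the allocation once per transaction and fall back to
the regular mapping calls when it returns false.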
Reviewed-by: Christoph Hellwig
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   | 86 +++++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h | 48 +++++++++++++++++++++
 2 files changed, 134 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 853247c42f7d..309d278b1d86 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1746,6 +1746,92 @@ size_t iommu_dma_max_mapping_size(struct device *dev)
 	return SIZE_MAX;
 }
 
+/**
+ * dma_iova_try_alloc - Try to allocate an IOVA space
+ * @dev: Device to allocate the IOVA space for
+ * @state: IOVA state
+ * @phys: physical address
+ * @size: IOVA size
+ *
+ * Check if @dev supports the IOVA-based DMA API, and if yes allocate IOVA space
+ * for the given base address and size.
+ *
+ * Note: @phys is only used to calculate the IOVA alignment. Callers that always
+ * do PAGE_SIZE aligned transfers can safely pass 0 here.
+ *
+ * Returns %true if the IOVA-based DMA API can be used and IOVA space has been
+ * allocated, or %false if the regular DMA API should be used.
+ */
+bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
+		phys_addr_t phys, size_t size)
+{
+	struct iommu_dma_cookie *cookie;
+	struct iommu_domain *domain;
+	struct iova_domain *iovad;
+	size_t iova_off;
+	dma_addr_t addr;
+
+	memset(state, 0, sizeof(*state));
+	if (!use_dma_iommu(dev))
+		return false;
+
+	domain = iommu_get_dma_domain(dev);
+	cookie = domain->iova_cookie;
+	iovad = &cookie->iovad;
+	iova_off = iova_offset(iovad, phys);
+
+	if (static_branch_unlikely(&iommu_deferred_attach_enabled) &&
+	    iommu_deferred_attach(dev, iommu_get_domain_for_dev(dev)))
+		return false;
+
+	if (WARN_ON_ONCE(!size))
+		return false;
+
+	/*
+	 * DMA_IOVA_USE_SWIOTLB is flag which is set by dma-iommu
+	 * internals, make sure that caller didn't set it and/or
+	 * didn't use this interface to map SIZE_MAX.
+	 */
+	if (WARN_ON_ONCE((u64)size & DMA_IOVA_USE_SWIOTLB))
+		return false;
+
+	addr = iommu_dma_alloc_iova(domain,
+			iova_align(iovad, size + iova_off),
+			dma_get_mask(dev), dev);
+	if (!addr)
+		return false;
+
+	state->addr = addr + iova_off;
+	state->__size = size;
+	return true;
+}
+EXPORT_SYMBOL_GPL(dma_iova_try_alloc);
+
+/**
+ * dma_iova_free - Free an IOVA space
+ * @dev: Device to free the IOVA space for
+ * @state: IOVA state
+ *
+ * Undoes a successful dma_iova_try_alloc().
+ *
+ * Note that all dma_iova_link() calls need to be undone first. For callers
+ * that never call dma_iova_unlink(), dma_iova_destroy() can be used instead
+ * which unlinks all ranges and frees the IOVA space in a single efficient
+ * operation.
+ */
+void dma_iova_free(struct device *dev, struct dma_iova_state *state)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, state->addr);
+	size_t size = dma_iova_size(state);
+
+	iommu_dma_free_iova(cookie, state->addr - iova_start_pad,
+			iova_align(iovad, size + iova_start_pad), NULL);
+}
+EXPORT_SYMBOL_GPL(dma_iova_free);
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index b79925b1c433..de7f73810d54 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -72,6 +72,22 @@
 
 #define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
 
+struct dma_iova_state {
+	dma_addr_t addr;
+	u64 __size;
+};
+
+/*
+ * Use the high bit to mark if we used swiotlb for one or more ranges.
+ */ +#define DMA_IOVA_USE_SWIOTLB (1ULL << 63) + +static inline size_t dma_iova_size(struct dma_iova_state *state) +{ + /* Casting is needed for 32-bits systems */ + return (size_t)(state->__size & ~DMA_IOVA_USE_SWIOTLB); +} + #ifdef CONFIG_DMA_API_DEBUG void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr); void debug_dma_map_single(struct device *dev, const void *addr, @@ -277,6 +293,38 @@ static inline int dma_mmap_noncontiguous(struct device *dev, } #endif /* CONFIG_HAS_DMA */ +#ifdef CONFIG_IOMMU_DMA +/** + * dma_use_iova - check if the IOVA API is used for this state + * @state: IOVA state + * + * Return %true if the DMA transfers uses the dma_iova_*() calls or %false if + * they can't be used. + */ +static inline bool dma_use_iova(struct dma_iova_state *state) +{ + return state->__size != 0; +} + +bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state, + phys_addr_t phys, size_t size); +void dma_iova_free(struct device *dev, struct dma_iova_state *state); +#else /* CONFIG_IOMMU_DMA */ +static inline bool dma_use_iova(struct dma_iova_state *state) +{ + return false; +} +static inline bool dma_iova_try_alloc(struct device *dev, + struct dma_iova_state *state, phys_addr_t phys, size_t size) +{ + return false; +} +static inline void dma_iova_free(struct device *dev, + struct dma_iova_state *state) +{ +} +#endif /* CONFIG_IOMMU_DMA */ + #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC) void __dma_sync_single_for_cpu(struct device *dev, dma_addr_t addr, size_t size, enum dma_data_direction dir); From patchwork Fri Jan 17 10:03:37 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943120 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
009A3C02183 for ; Fri, 17 Jan 2025 10:04:21 +0000 (UTC)
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg, "Keith Busch",
 "Bjorn Helgaas", "Logan Gunthorpe", "Yishai Hadas", "Shameer Kolothum",
 "Kevin Tian", "Alex Williamson", "Marek Szyprowski", Jérôme Glisse,
 "Andrew Morton", "Jonathan Corbet", linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap"
Subject: [PATCH v6 06/17] iommu/dma: Factor out a iommu_dma_map_swiotlb helper
Date: Fri, 17 Jan 2025 12:03:37 +0200
X-Mailer: git-send-email 2.47.1
MIME-Version: 1.0

From: Christoph Hellwig

Split the iommu logic from iommu_dma_map_page into a separate helper.
This not only keeps the code neatly separated, but will also allow for
reuse in another caller.
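
For untrusted devices, the helper factored out here zeroes the padding
around the bounced data so that the device never sees stale kernel bytes.
The arithmetic can be tried out in userspace with the minimal sketch below.
It is only an illustration: align_down(), align_up() and
zero_bounce_padding() are hypothetical names, the buffer is addressed by
offsets into a granule-aligned slot rather than by kernel virtual
addresses, and the two alignment helpers merely assume the usual
power-of-two rounding that iova_align_down() and iova_align() perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round v down to a multiple of granule (granule must be a power of two). */
static size_t align_down(size_t v, size_t granule)
{
	return v & ~(granule - 1);
}

/* Round v up to a multiple of granule (granule must be a power of two). */
static size_t align_up(size_t v, size_t granule)
{
	return (v + granule - 1) & ~(granule - 1);
}

/*
 * Zero the bytes that share a granule with the data but are not part of it.
 * slot is assumed to start on a granule boundary; the data occupies
 * [offset, offset + size) inside it and is left untouched.
 */
static void zero_bounce_padding(unsigned char *slot, size_t granule,
		size_t offset, size_t size)
{
	size_t start;

	assert(granule && (granule & (granule - 1)) == 0);

	/* Pre-padding: from the granule boundary below the data up to it. */
	start = align_down(offset, granule);
	memset(slot + start, 0, offset - start);

	/* Post-padding: from the end of the data to the next boundary. */
	start = offset + size;
	memset(slot + start, 0, align_up(start, granule) - start);
}
```

With a 16-byte granule and 6 bytes of data at offset 5, bytes 0..4 and
11..15 are cleared while the data and the following granule stay as they
were.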
Signed-off-by: Christoph Hellwig
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c | 73 ++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 309d278b1d86..80cc2c51ac99 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1161,6 +1161,43 @@ void iommu_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
 			arch_sync_dma_for_device(sg_phys(sg), sg->length, dir);
 }
 
+static phys_addr_t iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iova_domain *iovad = &domain->iova_cookie->iovad;
+
+	if (!is_swiotlb_active(dev)) {
+		dev_warn_once(dev, "DMA bounce buffers are inactive, unable to map unaligned transaction.\n");
+		return DMA_MAPPING_ERROR;
+	}
+
+	trace_swiotlb_bounced(dev, phys, size);
+
+	phys = swiotlb_tbl_map_single(dev, phys, size, iova_mask(iovad), dir,
+			attrs);
+
+	/*
+	 * Untrusted devices should not see padding areas with random leftover
+	 * kernel data, so zero the pre- and post-padding.
+	 * swiotlb_tbl_map_single() has initialized the bounce buffer proper to
+	 * the contents of the original memory buffer.
+	 */
+	if (phys != DMA_MAPPING_ERROR && dev_is_untrusted(dev)) {
+		size_t start, virt = (size_t)phys_to_virt(phys);
+
+		/* Pre-padding */
+		start = iova_align_down(iovad, virt);
+		memset((void *)start, 0, virt - start);
+
+		/* Post-padding */
+		start = virt + size;
+		memset((void *)start, 0, iova_align(iovad, start) - start);
+	}
+
+	return phys;
+}
+
 dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 		unsigned long offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs)
@@ -1174,42 +1211,14 @@ dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
 	dma_addr_t iova, dma_mask = dma_get_mask(dev);
 
 	/*
-	 * If both the physical buffer start address and size are
-	 * page aligned, we don't need to use a bounce page.
+	 * If both the physical buffer start address and size are page aligned,
+	 * we don't need to use a bounce page.
 	 */
 	if (dev_use_swiotlb(dev, size, dir) &&
 	    iova_offset(iovad, phys | size)) {
-		if (!is_swiotlb_active(dev)) {
-			dev_warn_once(dev, "DMA bounce buffers are inactive, unable to map unaligned transaction.\n");
-			return DMA_MAPPING_ERROR;
-		}
-
-		trace_swiotlb_bounced(dev, phys, size);
-
-		phys = swiotlb_tbl_map_single(dev, phys, size,
-					      iova_mask(iovad), dir, attrs);
-
+		phys = iommu_dma_map_swiotlb(dev, phys, size, dir, attrs);
 		if (phys == DMA_MAPPING_ERROR)
-			return DMA_MAPPING_ERROR;
-
-		/*
-		 * Untrusted devices should not see padding areas with random
-		 * leftover kernel data, so zero the pre- and post-padding.
-		 * swiotlb_tbl_map_single() has initialized the bounce buffer
-		 * proper to the contents of the original memory buffer.
-		 */
-		if (dev_is_untrusted(dev)) {
-			size_t start, virt = (size_t)phys_to_virt(phys);
-
-			/* Pre-padding */
-			start = iova_align_down(iovad, virt);
-			memset((void *)start, 0, virt - start);
-
-			/* Post-padding */
-			start = virt + size;
-			memset((void *)start, 0,
-			       iova_align(iovad, start) - start);
-		}
+			return phys;
 	}
 
 	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))

From patchwork Fri Jan 17 10:03:38 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943121
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg,
 "Keith Busch", "Bjorn Helgaas", "Logan Gunthorpe", "Yishai Hadas",
 "Shameer Kolothum", "Kevin Tian", "Alex Williamson", "Marek Szyprowski",
 Jérôme Glisse, "Andrew Morton", "Jonathan Corbet",
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 "Randy Dunlap"
Subject: [PATCH v6 07/17] dma-mapping: Implement link/unlink ranges API
Date: Fri, 17 Jan 2025 12:03:38 +0200
Message-ID: <7a858031dda0a888609c214cdf8210a813b2df42.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1

From: Leon Romanovsky

Introduce new DMA APIs to perform DMA linkage of buffers
in layers higher than DMA.

In the proposed API, callers perform the following steps.
In map path:
	if (dma_can_use_iova(...))
	    dma_iova_alloc()
	    for (page in range)
	       dma_iova_link_next(...)
	    dma_iova_sync(...)
	else
	     /* Fallback to legacy map pages */
	     for (all pages)
	       dma_map_page(...)

In unmap path:
	if (dma_can_use_iova(...))
	     dma_iova_destroy()
	else
	     for (all pages)
		dma_unmap_page(...)
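
The map path above can also be sketched with the prototypes this patch
adds to include/linux/dma-mapping.h (dma_iova_link(), dma_iova_sync(),
dma_iova_destroy()) together with the existing dma_iova_try_alloc().
What follows is a userspace mock and not kernel code: the type
definitions, the stub bodies, the mock_* counters and the map_range()
caller are all assumptions made so that the control flow compiles and
runs. Only the function signatures are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for kernel types; these are NOT the real definitions. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;
enum dma_data_direction { DMA_BIDIRECTIONAL, DMA_TO_DEVICE, DMA_FROM_DEVICE };
struct device { bool has_iommu; };
struct dma_iova_state { dma_addr_t addr; size_t size; };

/* Mock bookkeeping so the call sequence can be observed. */
static int mock_links, mock_syncs, mock_destroys;
static size_t mock_destroyed_len;
static int mock_fail_link_at = -1;	/* index of the link call to fail */

/* Stubs: signatures follow the patch, bodies are invented for the mock. */
static bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
		phys_addr_t phys, size_t size)
{
	(void)phys;
	if (!dev->has_iommu)
		return false;
	state->addr = 0x100000;	/* arbitrary fake IOVA base */
	state->size = size;
	return true;
}

static int dma_iova_link(struct device *dev, struct dma_iova_state *state,
		phys_addr_t phys, size_t offset, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	(void)dev; (void)state; (void)phys; (void)offset; (void)size;
	(void)dir; (void)attrs;
	if (mock_links == mock_fail_link_at)
		return -ENOMEM;
	mock_links++;
	return 0;
}

static int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
		size_t offset, size_t size)
{
	(void)dev; (void)state; (void)offset; (void)size;
	mock_syncs++;
	return 0;
}

static void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
		size_t mapped_len, enum dma_data_direction dir,
		unsigned long attrs)
{
	(void)dev; (void)state; (void)dir; (void)attrs;
	mock_destroys++;
	mock_destroyed_len = mapped_len;
}

/*
 * Hypothetical caller: link nr equally sized pages back to back in IOVA
 * space and sync the IOTLB once at the end. A return of -EOPNOTSUPP tells
 * the caller to fall back to per-page dma_map_page().
 */
static int map_range(struct device *dev, struct dma_iova_state *state,
		const phys_addr_t *pages, size_t nr, size_t page_size,
		enum dma_data_direction dir)
{
	size_t mapped = 0;
	int ret;

	assert(nr > 0);
	if (!dma_iova_try_alloc(dev, state, pages[0], nr * page_size))
		return -EOPNOTSUPP;

	for (size_t i = 0; i < nr; i++) {
		ret = dma_iova_link(dev, state, pages[i], mapped, page_size,
				dir, 0);
		if (ret)
			goto err_destroy;
		mapped += page_size;
	}

	ret = dma_iova_sync(dev, state, 0, mapped);
	if (ret)
		goto err_destroy;
	return 0;

err_destroy:
	/* Unlinks whatever was linked so far and frees the IOVA space. */
	dma_iova_destroy(dev, state, mapped, dir, 0);
	return ret;
}
```

The point of the split is that the IOTLB sync happens once per range
rather than once per page, and that a single dma_iova_destroy() call
covers both the error path and the normal unmap path.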
Reviewed-by: Christoph Hellwig
Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   | 261 ++++++++++++++++++++++++++++++++++++
 include/linux/dma-mapping.h |  32 +++++
 2 files changed, 293 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 80cc2c51ac99..67cb537bde3f 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1841,6 +1841,267 @@ void dma_iova_free(struct device *dev, struct dma_iova_state *state)
 }
 EXPORT_SYMBOL_GPL(dma_iova_free);
 
+static int __dma_iova_link(struct device *dev, dma_addr_t addr,
+		phys_addr_t phys, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	bool coherent = dev_is_dma_coherent(dev);
+
+	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		arch_sync_dma_for_device(phys, size, dir);
+
+	return iommu_map_nosync(iommu_get_dma_domain(dev), addr, phys, size,
+			dma_info_to_prot(dir, coherent, attrs), GFP_ATOMIC);
+}
+
+static int iommu_dma_iova_bounce_and_link(struct device *dev, dma_addr_t addr,
+		phys_addr_t phys, size_t bounce_len,
+		enum dma_data_direction dir, unsigned long attrs,
+		size_t iova_start_pad)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iova_domain *iovad = &domain->iova_cookie->iovad;
+	phys_addr_t bounce_phys;
+	int error;
+
+	bounce_phys = iommu_dma_map_swiotlb(dev, phys, bounce_len, dir, attrs);
+	if (bounce_phys == DMA_MAPPING_ERROR)
+		return -ENOMEM;
+
+	error = __dma_iova_link(dev, addr - iova_start_pad,
+			bounce_phys - iova_start_pad,
+			iova_align(iovad, bounce_len), dir, attrs);
+	if (error)
+		swiotlb_tbl_unmap_single(dev, bounce_phys, bounce_len, dir,
+				attrs);
+	return error;
+}
+
+static int iommu_dma_iova_link_swiotlb(struct device *dev,
+		struct dma_iova_state *state, phys_addr_t phys, size_t offset,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, phys);
+	size_t iova_end_pad = iova_offset(iovad, phys + size);
+	dma_addr_t addr = state->addr + offset;
+	size_t mapped = 0;
+	int error;
+
+	if (iova_start_pad) {
+		size_t bounce_len = min(size, iovad->granule - iova_start_pad);
+
+		error = iommu_dma_iova_bounce_and_link(dev, addr, phys,
+				bounce_len, dir, attrs, iova_start_pad);
+		if (error)
+			return error;
+		state->__size |= DMA_IOVA_USE_SWIOTLB;
+
+		mapped += bounce_len;
+		size -= bounce_len;
+		if (!size)
+			return 0;
+	}
+
+	size -= iova_end_pad;
+	error = __dma_iova_link(dev, addr + mapped, phys + mapped, size, dir,
+			attrs);
+	if (error)
+		goto out_unmap;
+	mapped += size;
+
+	if (iova_end_pad) {
+		error = iommu_dma_iova_bounce_and_link(dev, addr + mapped,
+				phys + mapped, iova_end_pad, dir, attrs, 0);
+		if (error)
+			goto out_unmap;
+		state->__size |= DMA_IOVA_USE_SWIOTLB;
+	}
+
+	return 0;
+
+out_unmap:
+	dma_iova_unlink(dev, state, 0, mapped, dir, attrs);
+	return error;
+}
+
+/**
+ * dma_iova_link - Link a range of IOVA space
+ * @dev: DMA device
+ * @state: IOVA state
+ * @phys: physical address to link
+ * @offset: offset into the IOVA state to map into
+ * @size: size of the buffer
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Link a range of IOVA space for the given IOVA state without IOTLB sync.
+ * This function is used to link multiple physical addresses in contiguous
+ * IOVA space without performing costly IOTLB sync.
+ *
+ * The caller is responsible for calling dma_iova_sync() to sync the IOTLB
+ * at the end of linkage.
+ */
+int dma_iova_link(struct device *dev, struct dma_iova_state *state,
+		phys_addr_t phys, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, phys);
+
+	if (WARN_ON_ONCE(iova_start_pad && offset > 0))
+		return -EIO;
+
+	if (dev_use_swiotlb(dev, size, dir) && iova_offset(iovad, phys | size))
+		return iommu_dma_iova_link_swiotlb(dev, state, phys, offset,
+				size, dir, attrs);
+
+	return __dma_iova_link(dev, state->addr + offset - iova_start_pad,
+			phys - iova_start_pad,
+			iova_align(iovad, size + iova_start_pad), dir, attrs);
+}
+EXPORT_SYMBOL_GPL(dma_iova_link);
+
+/**
+ * dma_iova_sync - Sync IOTLB
+ * @dev: DMA device
+ * @state: IOVA state
+ * @offset: offset into the IOVA state to sync
+ * @size: size of the buffer
+ *
+ * Sync IOTLB for the given IOVA state. This function should be called on
+ * the IOVA-contiguous range created by one or more dma_iova_link() calls
+ * to sync the IOTLB.
+ */
+int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	dma_addr_t addr = state->addr + offset;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+
+	return iommu_sync_map(domain, addr - iova_start_pad,
+		      iova_align(iovad, size + iova_start_pad));
+}
+EXPORT_SYMBOL_GPL(dma_iova_sync);
+
+static void iommu_dma_iova_unlink_range_slow(struct device *dev,
+		dma_addr_t addr, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+	dma_addr_t end = addr + size;
+
+	do {
+		phys_addr_t phys;
+		size_t len;
+
+		phys = iommu_iova_to_phys(domain, addr);
+		if (WARN_ON(!phys))
+			/* Something very horrible happened here */
+			return;
+
+		len = min_t(size_t,
+			end - addr, iovad->granule - iova_start_pad);
+
+		if (!dev_is_dma_coherent(dev) &&
+		    !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+			arch_sync_dma_for_cpu(phys, len, dir);
+
+		swiotlb_tbl_unmap_single(dev, phys, len, dir, attrs);
+
+		addr += len;
+		iova_start_pad = 0;
+	} while (addr < end);
+}
+
+static void __iommu_dma_iova_unlink(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs,
+		bool free_iova)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	dma_addr_t addr = state->addr + offset;
+	size_t iova_start_pad = iova_offset(iovad, addr);
+	struct iommu_iotlb_gather iotlb_gather;
+	size_t unmapped;
+
+	if ((state->__size & DMA_IOVA_USE_SWIOTLB) ||
+	    (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)))
+		iommu_dma_iova_unlink_range_slow(dev, addr, size, dir, attrs);
+
+	iommu_iotlb_gather_init(&iotlb_gather);
+	iotlb_gather.queued = free_iova && READ_ONCE(cookie->fq_domain);
+
+	size = iova_align(iovad, size + iova_start_pad);
+	addr -= iova_start_pad;
+	unmapped = iommu_unmap_fast(domain, addr, size, &iotlb_gather);
+	WARN_ON(unmapped != size);
+
+	if (!iotlb_gather.queued)
+		iommu_iotlb_sync(domain, &iotlb_gather);
+	if (free_iova)
+		iommu_dma_free_iova(cookie, addr, size, &iotlb_gather);
+}
+
+/**
+ * dma_iova_unlink - Unlink a range of IOVA space
+ * @dev: DMA device
+ * @state: IOVA state
+ * @offset: offset into the IOVA state to unlink
+ * @size: size of the buffer
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink a range of IOVA space for the given IOVA state.
+ */
+void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	__iommu_dma_iova_unlink(dev, state, offset, size, dir, attrs, false);
+}
+EXPORT_SYMBOL_GPL(dma_iova_unlink);
+
+/**
+ * dma_iova_destroy - Finish a DMA mapping transaction
+ * @dev: DMA device
+ * @state: IOVA state
+ * @mapped_len: number of bytes to unmap
+ * @dir: DMA direction
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink the IOVA range up to @mapped_len and free the entire IOVA space. The
+ * range of IOVA from dma_addr to @mapped_len must all be linked, and be the
+ * only linked IOVA in state.
+ */
+void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
+		size_t mapped_len, enum dma_data_direction dir,
+		unsigned long attrs)
+{
+	if (mapped_len)
+		__iommu_dma_iova_unlink(dev, state, 0, mapped_len, dir, attrs,
+				true);
+	else
+		/*
+		 * We can be here if the first call to dma_iova_link() failed
+		 * and there is nothing to unlink, so free the IOVA space explicitly.
+		 */
+		dma_iova_free(dev, state);
+}
+EXPORT_SYMBOL_GPL(dma_iova_destroy);
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index de7f73810d54..a71e110f1e9d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -309,6 +309,17 @@ static inline bool dma_use_iova(struct dma_iova_state *state)
 bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
 		phys_addr_t phys, size_t size);
 void dma_iova_free(struct device *dev, struct dma_iova_state *state);
+void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
+		size_t mapped_len, enum dma_data_direction dir,
+		unsigned long attrs);
+int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size);
+int dma_iova_link(struct device *dev, struct dma_iova_state *state,
+		phys_addr_t phys, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs);
+void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+		size_t offset, size_t size, enum dma_data_direction dir,
+		unsigned long attrs);
 #else /* CONFIG_IOMMU_DMA */
 static inline bool dma_use_iova(struct dma_iova_state *state)
 {
@@ -323,6 +334,27 @@ static inline void dma_iova_free(struct device *dev,
 		struct dma_iova_state *state)
 {
 }
+static inline void dma_iova_destroy(struct device *dev,
+		struct dma_iova_state *state, size_t mapped_len,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+}
+static inline int dma_iova_sync(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size)
+{
+	return -EOPNOTSUPP;
+}
+static inline int dma_iova_link(struct device *dev,
+		struct dma_iova_state *state, phys_addr_t phys, size_t offset,
+		size_t size, enum dma_data_direction dir, unsigned long attrs)
+{
+	return -EOPNOTSUPP;
+}
+static inline void dma_iova_unlink(struct device *dev,
+		struct dma_iova_state *state, size_t offset, size_t size,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+}
 #endif /* CONFIG_IOMMU_DMA */
 
 #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)

From patchwork Fri Jan 17 10:03:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943126
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg, "Keith Busch",
 "Bjorn Helgaas", "Logan Gunthorpe", "Yishai Hadas", "Shameer Kolothum",
 "Kevin Tian", "Alex Williamson", "Marek Szyprowski", Jérôme Glisse,
 "Andrew Morton", "Jonathan Corbet", linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap"
Subject: [PATCH v6 08/17] dma-mapping: add a dma_need_unmap helper
Date: Fri, 17 Jan 2025 12:03:39 +0200
Message-ID: <9458f681274f4f0e5dd3bf3959e0b8a1ac9a4e7b.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1

From: Christoph Hellwig

Add a helper that allows a driver to skip calling dma_unmap_* if the
DMA layer can guarantee that they are no-ops.

Signed-off-by: Christoph Hellwig
Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h |  5 +++++
 kernel/dma/mapping.c        | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index a71e110f1e9d..d2f358c5a25d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -406,6 +406,7 @@ static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	return dma_dev_need_sync(dev) ? __dma_need_sync(dev, dma_addr) : false;
 }
+bool dma_need_unmap(struct device *dev);
 #else /* !CONFIG_HAS_DMA || !CONFIG_DMA_NEED_SYNC */
 static inline bool dma_dev_need_sync(const struct device *dev)
 {
@@ -431,6 +432,10 @@ static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 {
 	return false;
 }
+static inline bool dma_need_unmap(struct device *dev)
+{
+	return false;
+}
 #endif /* !CONFIG_HAS_DMA || !CONFIG_DMA_NEED_SYNC */
 
 struct page *dma_alloc_pages(struct device *dev, size_t size,
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index cda127027e48..3c3204ad2839 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -443,6 +443,24 @@ bool __dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 }
 EXPORT_SYMBOL_GPL(__dma_need_sync);
 
+/**
+ * dma_need_unmap - does this device need dma_unmap_* operations
+ * @dev: device to check
+ *
+ * If this function returns %false, drivers can skip calling dma_unmap_* after
+ * finishing an I/O. This function must be called after all mappings that might
+ * need to be unmapped have been performed.
+ */
+bool dma_need_unmap(struct device *dev)
+{
+	if (!dma_map_direct(dev, get_dma_ops(dev)))
+		return true;
+	if (!dev->dma_skip_sync)
+		return true;
+	return IS_ENABLED(CONFIG_DMA_API_DEBUG);
+}
+EXPORT_SYMBOL_GPL(dma_need_unmap);
+
 static void dma_setup_need_sync(struct device *dev)
 {
 	const struct dma_map_ops *ops = get_dma_ops(dev);

From patchwork Fri Jan 17 10:03:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943123
From: Leon Romanovsky
To: Christoph Hellwig, Jason Gunthorpe, Robin Murphy
Cc: Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg, "Keith Busch",
 "Bjorn Helgaas", "Logan Gunthorpe", "Yishai Hadas", "Shameer Kolothum",
 "Kevin Tian", "Alex Williamson", "Marek Szyprowski", Jérôme Glisse,
 "Andrew Morton", "Jonathan Corbet", linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap"
Subject: [PATCH v6 09/17] docs: core-api: document the IOVA-based API
Date: Fri, 17 Jan 2025 12:03:40 +0200
Message-ID: <6060892b1f02ec6a640928e4866cd6752b56b8d8.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1
X-HE-Meta:
U2FsdGVkX1+1CnUp2f7W44DrSQiLLCmWTirz0NABXymEqemZ/3MiG5G1wr7IISCQhPS7fBihVXvdFs4hmqcsh9FK2tWrsEhRtvH9Nbv8nkG0ikmrK6NuiOhQ0/HXIYiOZVzCVytrzZD58H7IgUiMJP1YmggUoieKvNi97QUOyxBq9GQuo5ROr8KMPJP6T9yCdHI9g4OKaFRO3voEVX224o/Z5PQsTrFFZRFMEmwZQVNQM08/Y6ybO9YWrEbEm/hBuQwqDPghmHeF3h891Q2fblChTXz694KkDuEwvyqTI2IOm9Z97ha0DmYEtFXR13ld6/6X5NgHj/x74LXUwn/gI0pj8mqinJOiHdsPshEjMcTBuB97lhyok94oercQvGct2mYYIM3OEpfpvRiwCYe3lRvADCLz/0kQdz8r5dXNda5WI+IbuX2tzxTBbrb0TrcphHP6wqmcO3/GoFy21kCJ6ETKuskbUCBKa/GsDfgLc/ZS2ClB2JUDyrXyagu76kx0CRuxMRZ4xW1Qrd6lbKZB+5rt7XTaHvd4Nj2cEpbNUdOdp6z6IUIJgBODKvZzYjCHuGC/lrNtmSKDbFD+FVsQbLEKHeyJm7OqtFKO1KOIRERaZpvp8h7pn3sZuT6s0bTQivlhj3HwSmp+sfZQTEczNkOIfMTPTButqvvKbEe16aWVSuQHDKE9dSX161K+7Gc0rlLBCctyj9xDMOJuovXNr3V7hK+VrVkcYPx4qCT1S2kcnGli3/iHAnpS1vJyy2XFgHXEBIfgOGkP3NW8XctJWveq5wxDdh47Cg+ulOlZ1qGwxqHmifrYd+8NBvepwEfr9o9FFxEPiQfKAff5XChTutvQoZhM9b0+wU5gj1PXL5fejA1YHOr7r5Ue6DxOz+8fpRcHU6NT6/OErJ/emYsqKgSax6uWK1xktaPGSPV6RkiWohVY3I+BktWsA8N/ZiT7DpjF1mLHtCOlYUvyWmB iPiBMaUR 91/kfUY4mHWatHdiuAQOjGE4igwmvAGDgha++bDjC02U4bfO4eaRzrdU/VH2EtorMAvD+dgCVBZwloidyqZmf9BBIVkfSFGqNda00g7Uxsgsj8KTRv31Y6tU8F0x+lK4cuqMrZT5lJW3t37qjSS0uX8wacH5mLcCNzsBxo2lOvmbLdLc6008ziAScmxxMnTTep2E0veKXjTso+/DCXD+/z+PPWPVkAsXux0oASUaaySiIMLQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Christoph Hellwig Add an explanation of the newly added IOVA-based mapping API. Signed-off-by: Christoph Hellwig Signed-off-by: Leon Romanovsky --- Documentation/core-api/dma-api.rst | 70 ++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) diff --git a/Documentation/core-api/dma-api.rst b/Documentation/core-api/dma-api.rst index 8e3cce3d0a23..61d6f4fe3d88 100644 --- a/Documentation/core-api/dma-api.rst +++ b/Documentation/core-api/dma-api.rst @@ -530,6 +530,76 @@ routines, e.g.::: .... 
 	}
 
+Part Ie - IOVA-based DMA mappings
+---------------------------------
+
+These APIs allow a very efficient mapping when using an IOMMU. They are an
+optional path that requires extra code and are only recommended for drivers
+where DMA mapping performance, or the space usage for storing the DMA addresses
+matter. All the considerations from the previous section apply here as well.
+
+::
+
+	bool dma_iova_try_alloc(struct device *dev, struct dma_iova_state *state,
+			phys_addr_t phys, size_t size);
+
+Is used to try to allocate IOVA space for a mapping operation. If it returns
+false, this API can't be used for the given device and the normal streaming
+DMA mapping API should be used. The ``struct dma_iova_state`` is allocated
+by the driver and must be kept around until unmap time.
+
+::
+
+	static inline bool dma_use_iova(struct dma_iova_state *state)
+
+Can be used by the driver to check if the IOVA-based API is used after a
+call to ``dma_iova_try_alloc()``. This can be useful in the unmap path.
+
+::
+
+	int dma_iova_link(struct device *dev, struct dma_iova_state *state,
+			phys_addr_t phys, size_t offset, size_t size,
+			enum dma_data_direction dir, unsigned long attrs);
+
+Is used to link ranges to the IOVA previously allocated. The start of every
+range except the first linked with ``dma_iova_link()`` for a given state must
+be aligned to the DMA merge boundary returned by ``dma_get_merge_boundary()``,
+and the size of every range except the last must be aligned to the DMA merge
+boundary as well.
+
+::
+
+	int dma_iova_sync(struct device *dev, struct dma_iova_state *state,
+			size_t offset, size_t size);
+
+Must be called to sync the IOMMU page tables for the IOVA range mapped by one
+or more calls to ``dma_iova_link()``.
+
+For drivers that use a one-shot mapping, all ranges can be unmapped and the
+IOVA freed by calling:
+
+::
+
+	void dma_iova_destroy(struct device *dev, struct dma_iova_state *state,
+			enum dma_data_direction dir, unsigned long attrs);
+
+Alternatively, drivers can dynamically manage the IOVA space by unmapping
+and mapping individual regions. In that case
+
+::
+
+	void dma_iova_unlink(struct device *dev, struct dma_iova_state *state,
+			size_t offset, size_t size, enum dma_data_direction dir,
+			unsigned long attrs);
+
+is used to unmap a range previously mapped, and
+
+::
+
+	void dma_iova_free(struct device *dev, struct dma_iova_state *state);
+
+is used to free the IOVA space. All regions must have been unmapped using
+``dma_iova_unlink()`` before calling ``dma_iova_free()``.
 
 Part II - Non-coherent DMA allocations
 --------------------------------------

From patchwork Fri Jan 17 10:03:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943124
From: Leon Romanovsky
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg,
 Keith Busch, Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas, Shameer Kolothum,
 Kevin Tian, Alex Williamson, Marek Szyprowski, Jérôme Glisse, Andrew Morton,
 Jonathan Corbet, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Randy Dunlap
Subject: [PATCH v6 10/17] mm/hmm: let users to tag specific PFN with DMA mapped bit
Date: Fri, 17 Jan 2025 12:03:41 +0200
Message-ID: <0f89921b4197830f410330ae7a5e89ecbe7739d5.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1

From: Leon Romanovsky

Introduce a new sticky flag (HMM_PFN_DMA_MAPPED), which isn't overwritten by
the HMM range fault. This flag lets users tag specific PFNs to record that
they have already been DMA mapped.

Signed-off-by: Leon Romanovsky
---
 include/linux/hmm.h | 17 +++++++++++++++
 mm/hmm.c            | 51 ++++++++++++++++++++++++++++-----------------
 2 files changed, 49 insertions(+), 19 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 126a36571667..a1ddbedc19c0 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -23,6 +23,8 @@ struct mmu_interval_notifier;
  * HMM_PFN_WRITE - if the page memory can be written to (requires HMM_PFN_VALID)
  * HMM_PFN_ERROR - accessing the pfn is impossible and the device should
  *                 fail. ie poisoned memory, special pages, no vma, etc
+ * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
+ *                      to mark that page is already DMA mapped
  *
  * On input:
  * 0                 - Return the current state of the page, do not fault it.
@@ -36,6 +38,13 @@ enum hmm_pfn_flags {
 	HMM_PFN_VALID = 1UL << (BITS_PER_LONG - 1),
 	HMM_PFN_WRITE = 1UL << (BITS_PER_LONG - 2),
 	HMM_PFN_ERROR = 1UL << (BITS_PER_LONG - 3),
+
+	/*
+	 * Sticky flags, carried from input to output,
+	 * don't forget to update HMM_PFN_INOUT_FLAGS
+	 */
+	HMM_PFN_DMA_MAPPED = 1UL << (BITS_PER_LONG - 7),
+
 	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 8),
 
 	/* Input flags */
@@ -57,6 +66,14 @@ static inline struct page *hmm_pfn_to_page(unsigned long hmm_pfn)
 	return pfn_to_page(hmm_pfn & ~HMM_PFN_FLAGS);
 }
 
+/*
+ * hmm_pfn_to_phys() - return physical address pointed to by a device entry
+ */
+static inline phys_addr_t hmm_pfn_to_phys(unsigned long hmm_pfn)
+{
+	return __pfn_to_phys(hmm_pfn & ~HMM_PFN_FLAGS);
+}
+
 /*
  * hmm_pfn_to_map_order() - return the CPU mapping size order
  *
diff --git a/mm/hmm.c b/mm/hmm.c
index 7e0229ae4a5a..da5743f6d854 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -39,13 +39,20 @@ enum {
 	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
 };
 
+enum {
+	/* These flags are carried from input-to-output */
+	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED,
+};
+
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
 			 struct hmm_range *range, unsigned long cpu_flags)
 {
 	unsigned long i = (addr - range->start) >> PAGE_SHIFT;
 
-	for (; addr < end; addr += PAGE_SIZE, i++)
-		range->hmm_pfns[i] = cpu_flags;
+	for (; addr < end; addr += PAGE_SIZE, i++) {
+		range->hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
+		range->hmm_pfns[i] |= cpu_flags;
+	}
 	return 0;
 }
 
@@ -202,8 +209,10 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 		return hmm_vma_fault(addr, end, required_fault, walk);
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
-		hmm_pfns[i] = pfn | cpu_flags;
+	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
+		hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
+		hmm_pfns[i] |= pfn | cpu_flags;
+	}
 	return 0;
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -230,14 +239,14 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	unsigned long cpu_flags;
 	pte_t pte = ptep_get(ptep);
 	uint64_t pfn_req_flags = *hmm_pfn;
+	uint64_t new_pfn_flags = 0;
 
 	if (pte_none_mostly(pte)) {
 		required_fault =
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (required_fault)
 			goto fault;
-		*hmm_pfn = 0;
-		return 0;
+		goto out;
 	}
 
 	if (!pte_present(pte)) {
@@ -253,16 +262,14 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			cpu_flags = HMM_PFN_VALID;
 			if (is_writable_device_private_entry(entry))
 				cpu_flags |= HMM_PFN_WRITE;
-			*hmm_pfn = swp_offset_pfn(entry) | cpu_flags;
-			return 0;
+			new_pfn_flags = swp_offset_pfn(entry) | cpu_flags;
+			goto out;
 		}
 
 		required_fault =
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
-		if (!required_fault) {
-			*hmm_pfn = 0;
-			return 0;
-		}
+		if (!required_fault)
+			goto out;
 
 		if (!non_swap_entry(entry))
 			goto fault;
@@ -304,11 +311,13 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			pte_unmap(ptep);
 			return -EFAULT;
 		}
-		*hmm_pfn = HMM_PFN_ERROR;
-		return 0;
+		new_pfn_flags = HMM_PFN_ERROR;
+		goto out;
 	}
 
-	*hmm_pfn = pte_pfn(pte) | cpu_flags;
+	new_pfn_flags = pte_pfn(pte) | cpu_flags;
+out:
+	*hmm_pfn = (*hmm_pfn & HMM_PFN_INOUT_FLAGS) | new_pfn_flags;
 	return 0;
 
 fault:
@@ -448,8 +457,10 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		}
 
 		pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-		for (i = 0; i < npages; ++i, ++pfn)
-			hmm_pfns[i] = pfn | cpu_flags;
+		for (i = 0; i < npages; ++i, ++pfn) {
+			hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
+			hmm_pfns[i] |= pfn | cpu_flags;
+		}
 		goto out_unlock;
 	}
 
@@ -507,8 +518,10 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	}
 
 	pfn = pte_pfn(entry) + ((start & ~hmask) >> PAGE_SHIFT);
-	for (; addr < end; addr += PAGE_SIZE, i++, pfn++)
-		range->hmm_pfns[i] = pfn | cpu_flags;
+	for (; addr < end; addr += PAGE_SIZE, i++, pfn++) {
+		range->hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
+		range->hmm_pfns[i] |= pfn | cpu_flags;
+	}
 	spin_unlock(ptl);
 	return 0;
 

From patchwork Fri Jan 17 10:03:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943125
From: Leon Romanovsky
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon, Sagi Grimberg,
 Keith Busch, Bjorn Helgaas, Logan Gunthorpe, Yishai Hadas, Shameer Kolothum,
 Kevin Tian, Alex Williamson, Marek Szyprowski, Jérôme Glisse, Andrew Morton,
 Jonathan Corbet, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Randy Dunlap
Subject: [PATCH v6 11/17] mm/hmm: provide generic DMA managing logic
Date: Fri, 17 Jan 2025 12:03:42 +0200
Message-ID: <0c0862f31cb7243d4d31ec45bda488c99d534a68.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1

From: Leon Romanovsky

HMM callers use a PFN list to populate the range when calling
hmm_range_fault(); the conversion from PFN to DMA address is then done by the
callers with the help of a second list of DMA addresses. That second list is
wasteful on any modern platform and, with the right logic, it can be avoided.

Provide generic logic to manage these lists and give callers an interface to
map/unmap PFNs to DMA addresses, without requiring them to be experts in the
DMA core API.

Signed-off-by: Leon Romanovsky
---
 include/linux/hmm-dma.h |  33 ++++++
 include/linux/hmm.h     |   4 +
 mm/hmm.c                | 215 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 251 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/hmm-dma.h

diff --git a/include/linux/hmm-dma.h b/include/linux/hmm-dma.h
new file mode 100644
index 000000000000..f58b9fc71999
--- /dev/null
+++ b/include/linux/hmm-dma.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Copyright (c) 2024 NVIDIA Corporation & Affiliates */
+#ifndef LINUX_HMM_DMA_H
+#define LINUX_HMM_DMA_H
+
+#include
+
+struct dma_iova_state;
+struct pci_p2pdma_map_state;
+
+/*
+ * struct hmm_dma_map - array of PFNs and DMA addresses
+ *
+ * @state: DMA IOVA state
+ * @pfn_list: array of PFNs
+ * @dma_list: array of DMA addresses
+ * @dma_entry_size: size of each DMA entry in the array
+ */
+struct hmm_dma_map {
+	struct dma_iova_state state;
+	unsigned long *pfn_list;
+	dma_addr_t *dma_list;
+	size_t dma_entry_size;
+};
+
+int hmm_dma_map_alloc(struct device *dev, struct hmm_dma_map *map,
+		      size_t nr_entries, size_t dma_entry_size);
+void hmm_dma_map_free(struct device *dev, struct hmm_dma_map *map);
+dma_addr_t hmm_dma_map_pfn(struct device *dev, struct hmm_dma_map *map,
+			   size_t idx,
+			   struct pci_p2pdma_map_state *p2pdma_state);
+bool hmm_dma_unmap_pfn(struct device *dev, struct hmm_dma_map *map, size_t idx);
+#endif /* LINUX_HMM_DMA_H */
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index a1ddbedc19c0..1bc33e4c20ea 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -23,6 +23,8 @@ struct mmu_interval_notifier;
  * HMM_PFN_WRITE - if the page memory can be written to (requires HMM_PFN_VALID)
  * HMM_PFN_ERROR - accessing the pfn is impossible and the device should
  *                 fail. ie poisoned memory, special pages, no vma, etc
+ * HMM_PFN_P2PDMA - P2P page
+ * HMM_PFN_P2PDMA_BUS - Bus mapped P2P transfer
  * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
  *                      to mark that page is already DMA mapped
  *
@@ -43,6 +45,8 @@ enum hmm_pfn_flags {
 	 * Sticky flags, carried from input to output,
 	 * don't forget to update HMM_PFN_INOUT_FLAGS
 	 */
+	HMM_PFN_P2PDMA = 1UL << (BITS_PER_LONG - 5),
+	HMM_PFN_P2PDMA_BUS = 1UL << (BITS_PER_LONG - 6),
 	HMM_PFN_DMA_MAPPED = 1UL << (BITS_PER_LONG - 7),
 
 	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 8),
diff --git a/mm/hmm.c b/mm/hmm.c
index da5743f6d854..e7dfb9f6cd9b 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -10,6 +10,7 @@
  */
 #include
 #include
+#include
 #include
 #include
 #include
@@ -23,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -41,7 +43,8 @@ enum {
 
 enum {
 	/* These flags are carried from input-to-output */
-	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED,
+	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA |
+			      HMM_PFN_P2PDMA_BUS,
 };
 
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
@@ -620,3 +623,213 @@ int hmm_range_fault(struct hmm_range *range)
 	return ret;
 }
 EXPORT_SYMBOL(hmm_range_fault);
+
+/**
+ * hmm_dma_map_alloc - Allocate HMM map structure
+ * @dev: device to allocate structure for
+ * @map: HMM map to allocate
+ * @nr_entries: number of entries in the map
+ * @dma_entry_size: size of the DMA entry in the map
+ *
+ * Allocate the HMM map structure and all the lists it contains.
+ * Return 0 on success or a negative errno (-EINVAL, -EOPNOTSUPP, -ENOMEM).
+ */
+int hmm_dma_map_alloc(struct device *dev, struct hmm_dma_map *map,
+		      size_t nr_entries, size_t dma_entry_size)
+{
+	bool dma_need_sync = false;
+	bool use_iova;
+
+	if (!(nr_entries * PAGE_SIZE / dma_entry_size))
+		return -EINVAL;
+
+	/*
+	 * The HMM API violates our normal DMA buffer ownership rules and can't
+	 * transfer buffer ownership. The dma_addressing_limited() check is a
+	 * best approximation to ensure no swiotlb buffering happens.
+	 */
+#ifdef CONFIG_DMA_NEED_SYNC
+	dma_need_sync = !dev->dma_skip_sync;
+#endif /* CONFIG_DMA_NEED_SYNC */
+	if (dma_need_sync || dma_addressing_limited(dev))
+		return -EOPNOTSUPP;
+
+	map->dma_entry_size = dma_entry_size;
+	map->pfn_list =
+		kvcalloc(nr_entries, sizeof(*map->pfn_list), GFP_KERNEL);
+	if (!map->pfn_list)
+		return -ENOMEM;
+
+	use_iova = dma_iova_try_alloc(dev, &map->state, 0,
+			nr_entries * PAGE_SIZE);
+	if (!use_iova && dma_need_unmap(dev)) {
+		map->dma_list = kvcalloc(nr_entries, sizeof(*map->dma_list),
+					 GFP_KERNEL);
+		if (!map->dma_list)
+			goto err_dma;
+	}
+	return 0;
+
+err_dma:
+	kvfree(map->pfn_list);
+	return -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(hmm_dma_map_alloc);
+
+/**
+ * hmm_dma_map_free - Free HMM map structure
+ * @dev: device to free structure from
+ * @map: HMM map containing the various lists and state
+ *
+ * Free the HMM map structure and all the lists it contains.
+ */
+void hmm_dma_map_free(struct device *dev, struct hmm_dma_map *map)
+{
+	if (dma_use_iova(&map->state))
+		dma_iova_free(dev, &map->state);
+	kvfree(map->pfn_list);
+	kvfree(map->dma_list);
+}
+EXPORT_SYMBOL_GPL(hmm_dma_map_free);
+
+/**
+ * hmm_dma_map_pfn - Map a physical HMM page to DMA address
+ * @dev: Device to map the page for
+ * @map: HMM map
+ * @idx: Index into the PFN and dma address arrays
+ * @p2pdma_state: PCI P2P state.
+ *
+ * hmm_dma_map_alloc() tries to allocate IOVA space for the whole map with
+ * dma_iova_try_alloc(). If that succeeded, this function links the page at
+ * @idx into that IOVA space at offset @idx * dma_entry_size and returns
+ * state->addr plus that offset. Otherwise the page is mapped with
+ * dma_map_page(). Bus-address P2P pages are translated without any mapping.
+ * A PFN that is already marked HMM_PFN_DMA_MAPPED is not mapped again when
+ * its DMA address can be recovered from the map. Returns the DMA address on
+ * success, DMA_MAPPING_ERROR on failure.
+ */
+dma_addr_t hmm_dma_map_pfn(struct device *dev, struct hmm_dma_map *map,
+			   size_t idx,
+			   struct pci_p2pdma_map_state *p2pdma_state)
+{
+	struct dma_iova_state *state = &map->state;
+	dma_addr_t *dma_addrs = map->dma_list;
+	unsigned long *pfns = map->pfn_list;
+	struct page *page = hmm_pfn_to_page(pfns[idx]);
+	phys_addr_t paddr = hmm_pfn_to_phys(pfns[idx]);
+	size_t offset = idx * map->dma_entry_size;
+	unsigned long attrs = 0;
+	dma_addr_t dma_addr;
+	int ret;
+
+	if ((pfns[idx] & HMM_PFN_DMA_MAPPED) &&
+	    !(pfns[idx] & HMM_PFN_P2PDMA_BUS)) {
+		/*
+		 * We are in this flow when there is a need to resync flags,
+		 * for example when page was already linked in prefetch call
+		 * with READ flag and now we need to add WRITE flag
+		 *
+		 * This page was already programmed to HW and we don't want/need
+		 * to unlink and link it again just to resync flags.
+		 */
+		if (dma_use_iova(state))
+			return state->addr + offset;
+
+		/*
+		 * Without dma_need_unmap, the dma_addrs array is NULL, thus we
+		 * need to regenerate the address below even if there already
+		 * was a mapping. But !dma_need_unmap implies that the
+		 * mapping is stateless, so this is fine.
+		 */
+		if (dma_need_unmap(dev))
+			return dma_addrs[idx];
+
+		/* Continue to remapping */
+	}
+
+	switch (pci_p2pdma_state(p2pdma_state, dev, page)) {
+	case PCI_P2PDMA_MAP_NONE:
+		break;
+	case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE:
+		attrs |= DMA_ATTR_SKIP_CPU_SYNC;
+		pfns[idx] |= HMM_PFN_P2PDMA;
+		break;
+	case PCI_P2PDMA_MAP_BUS_ADDR:
+		pfns[idx] |= HMM_PFN_P2PDMA_BUS | HMM_PFN_DMA_MAPPED;
+		return pci_p2pdma_bus_addr_map(p2pdma_state, paddr);
+	default:
+		return DMA_MAPPING_ERROR;
+	}
+
+	if (dma_use_iova(state)) {
+		ret = dma_iova_link(dev, state, paddr, offset,
+				    map->dma_entry_size, DMA_BIDIRECTIONAL,
+				    attrs);
+		if (ret)
+			goto error;
+
+		ret = dma_iova_sync(dev, state, offset, map->dma_entry_size);
+		if (ret) {
+			dma_iova_unlink(dev, state, offset, map->dma_entry_size,
+					DMA_BIDIRECTIONAL, attrs);
+			goto error;
+		}
+
+		dma_addr = state->addr + offset;
+	} else {
+		if (WARN_ON_ONCE(dma_need_unmap(dev) && !dma_addrs))
+			goto error;
+
+		dma_addr = dma_map_page(dev, page, 0, map->dma_entry_size,
+					DMA_BIDIRECTIONAL);
+		if (dma_mapping_error(dev, dma_addr))
+			goto error;
+
+		if (dma_need_unmap(dev))
+			dma_addrs[idx] = dma_addr;
+	}
+	pfns[idx] |= HMM_PFN_DMA_MAPPED;
+	return dma_addr;
+error:
+	pfns[idx] &= ~HMM_PFN_P2PDMA;
+	return DMA_MAPPING_ERROR;
+
+}
+EXPORT_SYMBOL_GPL(hmm_dma_map_pfn);
+
+/**
+ * hmm_dma_unmap_pfn - Unmap a physical HMM page from DMA address
+ * @dev: Device to unmap the page from
+ * @map: HMM map
+ * @idx: Index of the PFN to unmap
+ *
+ * Returns true if the PFN was mapped and has been unmapped, false otherwise.
+ */
+bool hmm_dma_unmap_pfn(struct device *dev, struct hmm_dma_map *map, size_t idx)
+{
+	struct dma_iova_state *state = &map->state;
+	dma_addr_t *dma_addrs = map->dma_list;
+	unsigned long *pfns = map->pfn_list;
+	unsigned long attrs = 0;
+
+#define HMM_PFN_VALID_DMA (HMM_PFN_VALID | HMM_PFN_DMA_MAPPED)
+	if ((pfns[idx] & HMM_PFN_VALID_DMA) != HMM_PFN_VALID_DMA)
+		return false;
+#undef HMM_PFN_VALID_DMA
+
+	if (pfns[idx] & HMM_PFN_P2PDMA_BUS)
+		; /* no need to unmap bus address P2P mappings */
+	else if (dma_use_iova(state)) {
+		if (pfns[idx] & HMM_PFN_P2PDMA)
+			attrs |= DMA_ATTR_SKIP_CPU_SYNC;
+		dma_iova_unlink(dev, state, idx * map->dma_entry_size,
+				map->dma_entry_size, DMA_BIDIRECTIONAL, attrs);
+	} else if (dma_need_unmap(dev))
+		dma_unmap_page(dev, dma_addrs[idx], map->dma_entry_size,
+			       DMA_BIDIRECTIONAL);
+
+	pfns[idx] &=
+		~(HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA | HMM_PFN_P2PDMA_BUS);
+	return true;
+}
+EXPORT_SYMBOL_GPL(hmm_dma_unmap_pfn);

From patchwork Fri Jan 17 10:03:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943129
From: Leon Romanovsky
To:
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon,
 Sagi Grimberg, Keith Busch, Bjorn Helgaas, Logan Gunthorpe,
 Yishai Hadas, Shameer Kolothum, Kevin Tian, Alex Williamson,
 Marek Szyprowski, Jérôme Glisse, Andrew Morton, Jonathan Corbet,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Randy Dunlap
Subject: [PATCH v6 12/17] RDMA/umem: Store ODP access mask information in PFN
Date: Fri, 17 Jan 2025 12:03:43 +0200
Message-ID:
X-Mailer: git-send-email 2.47.1
MIME-Version: 1.0

From: Leon Romanovsky

As preparation for removing dma_list, store the access mask in the PFN
entry (pfn_list) rather than in the low bits of dma_addr_t.

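The change in where the access mask lives can be pictured with a small userspace sketch. It is only an illustration under stated assumptions: the DEMO_* constants and demo_make_mtt() are hypothetical stand-ins, not the kernel's HMM_PFN_* or MLX5_IB_MTT_* definitions, and the bit positions are chosen for the example. The point it shows is that the DMA address stays clean while read/write permission is derived from the flags kept in the PFN entry at the time the device entry is built.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for flag bits kept in the high bits of a PFN entry. */
#define DEMO_PFN_VALID (1ULL << 63)
#define DEMO_PFN_WRITE (1ULL << 62)

/* Hypothetical stand-ins for device permission bits in the low address bits. */
#define DEMO_MTT_READ  (1ULL << 0)
#define DEMO_MTT_WRITE (1ULL << 1)

/*
 * Build a device translation entry from a page-aligned DMA address and the
 * flags stored in the PFN entry. An entry without the valid flag yields 0.
 * With downgrade set, write permission is withheld even for writable pages.
 */
static uint64_t demo_make_mtt(uint64_t dma_addr, uint64_t pfn_entry,
			      int downgrade)
{
	uint64_t mtt;

	/* The DMA address itself no longer carries permission bits. */
	assert((dma_addr & (DEMO_MTT_READ | DEMO_MTT_WRITE)) == 0);

	if (!(pfn_entry & DEMO_PFN_VALID))
		return 0;

	mtt = dma_addr | DEMO_MTT_READ;
	if ((pfn_entry & DEMO_PFN_WRITE) && !downgrade)
		mtt |= DEMO_MTT_WRITE;
	return mtt;
}
```

Because the permission bits are no longer packed into the stored address, the address array needs no masking on unmap, which is what makes dropping it later straightforward.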
Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 103 +++++++++++---------------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 1 + drivers/infiniband/hw/mlx5/odp.c | 37 +++++----- include/rdma/ib_umem_odp.h | 14 +--- 4 files changed, 64 insertions(+), 91 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index e9fa22d31c23..e1a5a567efb3 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -296,22 +296,11 @@ EXPORT_SYMBOL(ib_umem_odp_release); static int ib_umem_odp_map_dma_single_page( struct ib_umem_odp *umem_odp, unsigned int dma_index, - struct page *page, - u64 access_mask) + struct page *page) { struct ib_device *dev = umem_odp->umem.ibdev; dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index]; - if (*dma_addr) { - /* - * If the page is already dma mapped it means it went through - * a non-invalidating trasition, like read-only to writable. - * Resync the flags. - */ - *dma_addr = (*dma_addr & ODP_DMA_ADDR_MASK) | access_mask; - return 0; - } - *dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift, DMA_BIDIRECTIONAL); if (ib_dma_mapping_error(dev, *dma_addr)) { @@ -319,7 +308,6 @@ static int ib_umem_odp_map_dma_single_page( return -EFAULT; } umem_odp->npages++; - *dma_addr |= access_mask; return 0; } @@ -355,9 +343,6 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, struct hmm_range range = {}; unsigned long timeout; - if (access_mask == 0) - return -EINVAL; - if (user_virt < ib_umem_start(umem_odp) || user_virt + bcnt > ib_umem_end(umem_odp)) return -EFAULT; @@ -383,7 +368,7 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, if (fault) { range.default_flags = HMM_PFN_REQ_FAULT; - if (access_mask & ODP_WRITE_ALLOWED_BIT) + if (access_mask & HMM_PFN_WRITE) range.default_flags |= HMM_PFN_REQ_WRITE; } @@ -415,22 +400,17 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 
user_virt, for (pfn_index = 0; pfn_index < num_pfns; pfn_index += 1 << (page_shift - PAGE_SHIFT), dma_index++) { - if (fault) { - /* - * Since we asked for hmm_range_fault() to populate - * pages it shouldn't return an error entry on success. - */ - WARN_ON(range.hmm_pfns[pfn_index] & HMM_PFN_ERROR); - WARN_ON(!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)); - } else { - if (!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)) { - WARN_ON(umem_odp->dma_list[dma_index]); - continue; - } - access_mask = ODP_READ_ALLOWED_BIT; - if (range.hmm_pfns[pfn_index] & HMM_PFN_WRITE) - access_mask |= ODP_WRITE_ALLOWED_BIT; - } + /* + * Since we asked for hmm_range_fault() to populate + * pages it shouldn't return an error entry on success. + */ + WARN_ON(fault && range.hmm_pfns[pfn_index] & HMM_PFN_ERROR); + WARN_ON(fault && !(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)); + if (!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)) + continue; + + if (range.hmm_pfns[pfn_index] & HMM_PFN_DMA_MAPPED) + continue; hmm_order = hmm_pfn_to_map_order(range.hmm_pfns[pfn_index]); /* If a hugepage was detected and ODP wasn't set for, the umem @@ -445,13 +425,14 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, } ret = ib_umem_odp_map_dma_single_page( - umem_odp, dma_index, hmm_pfn_to_page(range.hmm_pfns[pfn_index]), - access_mask); + umem_odp, dma_index, + hmm_pfn_to_page(range.hmm_pfns[pfn_index])); if (ret < 0) { ibdev_dbg(umem_odp->umem.ibdev, "ib_umem_odp_map_dma_single_page failed with error %d\n", ret); break; } + range.hmm_pfns[pfn_index] |= HMM_PFN_DMA_MAPPED; } /* upon success lock should stay on hold for the callee */ if (!ret) @@ -471,7 +452,6 @@ EXPORT_SYMBOL(ib_umem_odp_map_dma_and_lock); void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, u64 bound) { - dma_addr_t dma_addr; dma_addr_t dma; int idx; u64 addr; @@ -482,34 +462,37 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, virt = max_t(u64, virt, 
ib_umem_start(umem_odp)); bound = min_t(u64, bound, ib_umem_end(umem_odp)); for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) { + unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> + PAGE_SHIFT; + struct page *page = + hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); + idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift; dma = umem_odp->dma_list[idx]; - /* The access flags guaranteed a valid DMA address in case was NULL */ - if (dma) { - unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> PAGE_SHIFT; - struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); - - dma_addr = dma & ODP_DMA_ADDR_MASK; - ib_dma_unmap_page(dev, dma_addr, - BIT(umem_odp->page_shift), - DMA_BIDIRECTIONAL); - if (dma & ODP_WRITE_ALLOWED_BIT) { - struct page *head_page = compound_head(page); - /* - * set_page_dirty prefers being called with - * the page lock. However, MMU notifiers are - * called sometimes with and sometimes without - * the lock. We rely on the umem_mutex instead - * to prevent other mmu notifiers from - * continuing and allowing the page mapping to - * be removed. - */ - set_page_dirty(head_page); - } - umem_odp->dma_list[idx] = 0; - umem_odp->npages--; + if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_VALID)) + goto clear; + if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_DMA_MAPPED)) + goto clear; + + ib_dma_unmap_page(dev, dma, BIT(umem_odp->page_shift), + DMA_BIDIRECTIONAL); + if (umem_odp->pfn_list[pfn_idx] & HMM_PFN_WRITE) { + struct page *head_page = compound_head(page); + /* + * set_page_dirty prefers being called with + * the page lock. However, MMU notifiers are + * called sometimes with and sometimes without + * the lock. We rely on the umem_mutex instead + * to prevent other mmu notifiers from + * continuing and allowing the page mapping to + * be removed. 
+ */ + set_page_dirty(head_page); } + umem_odp->npages--; +clear: + umem_odp->pfn_list[pfn_idx] &= ~HMM_PFN_FLAGS; } } EXPORT_SYMBOL(ib_umem_odp_unmap_dma_pages); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index a01b592aa716..c4946d4f0ad7 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -336,6 +336,7 @@ struct mlx5_ib_flow_db { #define MLX5_IB_UPD_XLT_PD BIT(4) #define MLX5_IB_UPD_XLT_ACCESS BIT(5) #define MLX5_IB_UPD_XLT_INDIRECT BIT(6) +#define MLX5_IB_UPD_XLT_DOWNGRADE BIT(7) /* Private QP creation flags to be passed in ib_qp_init_attr.create_flags. * diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 4b37446758fd..78887500ce15 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -34,6 +34,7 @@ #include #include #include +#include #include "mlx5_ib.h" #include "cmd.h" @@ -158,22 +159,12 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries, } } -static u64 umem_dma_to_mtt(dma_addr_t umem_dma) -{ - u64 mtt_entry = umem_dma & ODP_DMA_ADDR_MASK; - - if (umem_dma & ODP_READ_ALLOWED_BIT) - mtt_entry |= MLX5_IB_MTT_READ; - if (umem_dma & ODP_WRITE_ALLOWED_BIT) - mtt_entry |= MLX5_IB_MTT_WRITE; - - return mtt_entry; -} - static void populate_mtt(__be64 *pas, size_t idx, size_t nentries, struct mlx5_ib_mr *mr, int flags) { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem); + bool downgrade = flags & MLX5_IB_UPD_XLT_DOWNGRADE; + unsigned long pfn; dma_addr_t pa; size_t i; @@ -181,8 +172,17 @@ static void populate_mtt(__be64 *pas, size_t idx, size_t nentries, return; for (i = 0; i < nentries; i++) { + pfn = odp->pfn_list[idx + i]; + if (!(pfn & HMM_PFN_VALID)) + /* ODP initialization */ + continue; + pa = odp->dma_list[idx + i]; - pas[i] = cpu_to_be64(umem_dma_to_mtt(pa)); + pa |= MLX5_IB_MTT_READ; + if ((pfn & HMM_PFN_WRITE) && !downgrade) + pa |= MLX5_IB_MTT_WRITE; + + pas[i] = 
cpu_to_be64(pa); } } @@ -286,8 +286,7 @@ static bool mlx5_ib_invalidate_range(struct mmu_interval_notifier *mni, * estimate the cost of another UMR vs. the cost of bigger * UMR. */ - if (umem_odp->dma_list[idx] & - (ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT)) { + if (umem_odp->pfn_list[idx] & HMM_PFN_VALID) { if (!in_block) { blk_start_idx = idx; in_block = 1; @@ -668,7 +667,7 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp, { int page_shift, ret, np; bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE; - u64 access_mask; + u64 access_mask = 0; u64 start_idx; bool fault = !(flags & MLX5_PF_FLAGS_SNAPSHOT); u32 xlt_flags = MLX5_IB_UPD_XLT_ATOMIC; @@ -676,12 +675,14 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp, if (flags & MLX5_PF_FLAGS_ENABLE) xlt_flags |= MLX5_IB_UPD_XLT_ENABLE; + if (flags & MLX5_PF_FLAGS_DOWNGRADE) + xlt_flags |= MLX5_IB_UPD_XLT_DOWNGRADE; + page_shift = odp->page_shift; start_idx = (user_va - ib_umem_start(odp)) >> page_shift; - access_mask = ODP_READ_ALLOWED_BIT; if (odp->umem.writable && !downgrade) - access_mask |= ODP_WRITE_ALLOWED_BIT; + access_mask |= HMM_PFN_WRITE; np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault); if (np < 0) diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 0844c1d05ac6..a345c26a745d 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -8,6 +8,7 @@ #include #include +#include struct ib_umem_odp { struct ib_umem umem; @@ -67,19 +68,6 @@ static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp) umem_odp->page_shift; } -/* - * The lower 2 bits of the DMA address signal the R/W permissions for - * the entry. To upgrade the permissions, provide the appropriate - * bitmask to the map_dma_pages function. - * - * Be aware that upgrading a mapped address might result in change of - * the DMA address for the page. 
- */
-#define ODP_READ_ALLOWED_BIT  (1<<0ULL)
-#define ODP_WRITE_ALLOWED_BIT (1<<1ULL)
-
-#define ODP_DMA_ADDR_MASK (~(ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT))
-
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 
 struct ib_umem_odp *

From patchwork Fri Jan 17 10:03:44 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13943127
From: Leon Romanovsky
To:
Cc: Leon Romanovsky, Jens Axboe, Joerg Roedel, Will Deacon,
 Sagi Grimberg, Keith Busch, Bjorn Helgaas, Logan Gunthorpe,
 Yishai Hadas, Shameer Kolothum, Kevin Tian, Alex Williamson,
 Marek Szyprowski, Jérôme Glisse, Andrew Morton, Jonathan Corbet,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-rdma@vger.kernel.org,
 iommu@lists.linux.dev, linux-nvme@lists.infradead.org,
 linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 Randy Dunlap
Subject: [PATCH v6 13/17] RDMA/core: Convert UMEM ODP DMA mapping to caching IOVA and page linkage
Date: Fri, 17 Jan 2025 12:03:44 +0200
Message-ID: <2f309e04cc483ec57d501f5df120794e07b19901.1737106761.git.leon@kernel.org>
X-Mailer: git-send-email 2.47.1
MIME-Version: 1.0

From: Leon Romanovsky

Reuse the newly added DMA API to cache the IOVA and only link/unlink pages
in the fast path of the UMEM ODP flow.

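The "cache the IOVA, then only link/unlink" idea can be modelled in a few lines of userspace C. This is a rough sketch under stated assumptions, not the kernel implementation: struct demo_iova_map and the demo_* helpers are hypothetical names, and the only behaviour borrowed from the series is that a slot's device address is the cached base plus idx times the entry size, and that unmapping reports whether the slot was actually mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ENTRIES 64

/* Hypothetical model of one contiguous IOVA range reserved up front. */
struct demo_iova_map {
	uint64_t base;                  /* start of the reserved range */
	size_t entry_size;              /* bytes covered by each slot */
	size_t nr_entries;              /* number of slots in the range */
	bool linked[DEMO_MAX_ENTRIES];  /* per-slot "page is linked" state */
};

/* Slow path: reserve the range once, with every slot initially unlinked. */
static void demo_map_init(struct demo_iova_map *m, uint64_t base,
			  size_t entry_size, size_t nr_entries)
{
	size_t i;

	assert(nr_entries <= DEMO_MAX_ENTRIES);
	m->base = base;
	m->entry_size = entry_size;
	m->nr_entries = nr_entries;
	for (i = 0; i < nr_entries; i++)
		m->linked[i] = false;
}

/* Fast path: link slot idx and return its fixed device address. */
static uint64_t demo_link(struct demo_iova_map *m, size_t idx)
{
	assert(idx < m->nr_entries);
	m->linked[idx] = true;
	return m->base + (uint64_t)idx * m->entry_size;
}

/* Fast path: returns true only if the slot was linked and is now unlinked. */
static bool demo_unlink(struct demo_iova_map *m, size_t idx)
{
	assert(idx < m->nr_entries);
	if (!m->linked[idx])
		return false;
	m->linked[idx] = false;
	return true;
}
```

In this model the address of a slot never changes across unlink and relink, so no per-page address array has to be kept once the range is reserved.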
Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 104 ++++++--------------------- drivers/infiniband/hw/mlx5/mlx5_ib.h | 11 +-- drivers/infiniband/hw/mlx5/odp.c | 40 +++++++---- drivers/infiniband/hw/mlx5/umr.c | 12 +++- include/rdma/ib_umem_odp.h | 13 +--- 5 files changed, 69 insertions(+), 111 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index e1a5a567efb3..30cd8f353476 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -50,6 +51,7 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, const struct mmu_interval_notifier_ops *ops) { + struct ib_device *dev = umem_odp->umem.ibdev; int ret; umem_odp->umem.is_odp = 1; @@ -59,7 +61,6 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, size_t page_size = 1UL << umem_odp->page_shift; unsigned long start; unsigned long end; - size_t ndmas, npfns; start = ALIGN_DOWN(umem_odp->umem.address, page_size); if (check_add_overflow(umem_odp->umem.address, @@ -70,36 +71,23 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, if (unlikely(end < page_size)) return -EOVERFLOW; - ndmas = (end - start) >> umem_odp->page_shift; - if (!ndmas) - return -EINVAL; - - npfns = (end - start) >> PAGE_SHIFT; - umem_odp->pfn_list = kvcalloc( - npfns, sizeof(*umem_odp->pfn_list), GFP_KERNEL); - if (!umem_odp->pfn_list) - return -ENOMEM; - - umem_odp->dma_list = kvcalloc( - ndmas, sizeof(*umem_odp->dma_list), GFP_KERNEL); - if (!umem_odp->dma_list) { - ret = -ENOMEM; - goto out_pfn_list; - } + ret = hmm_dma_map_alloc(dev->dma_device, &umem_odp->map, + (end - start) >> PAGE_SHIFT, + 1 << umem_odp->page_shift); + if (ret) + return ret; ret = mmu_interval_notifier_insert(&umem_odp->notifier, umem_odp->umem.owning_mm, start, end - start, ops); if (ret) - goto out_dma_list; + goto out_free_map; } return 0; -out_dma_list: - 
kvfree(umem_odp->dma_list); -out_pfn_list: - kvfree(umem_odp->pfn_list); +out_free_map: + hmm_dma_map_free(dev->dma_device, &umem_odp->map); return ret; } @@ -262,6 +250,8 @@ EXPORT_SYMBOL(ib_umem_odp_get); void ib_umem_odp_release(struct ib_umem_odp *umem_odp) { + struct ib_device *dev = umem_odp->umem.ibdev; + /* * Ensure that no more pages are mapped in the umem. * @@ -274,48 +264,17 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) ib_umem_end(umem_odp)); mutex_unlock(&umem_odp->umem_mutex); mmu_interval_notifier_remove(&umem_odp->notifier); - kvfree(umem_odp->dma_list); - kvfree(umem_odp->pfn_list); + hmm_dma_map_free(dev->dma_device, &umem_odp->map); } put_pid(umem_odp->tgid); kfree(umem_odp); } EXPORT_SYMBOL(ib_umem_odp_release); -/* - * Map for DMA and insert a single page into the on-demand paging page tables. - * - * @umem: the umem to insert the page to. - * @dma_index: index in the umem to add the dma to. - * @page: the page struct to map and add. - * @access_mask: access permissions needed for this page. - * - * The function returns -EFAULT if the DMA mapping operation fails. - * - */ -static int ib_umem_odp_map_dma_single_page( - struct ib_umem_odp *umem_odp, - unsigned int dma_index, - struct page *page) -{ - struct ib_device *dev = umem_odp->umem.ibdev; - dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index]; - - *dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift, - DMA_BIDIRECTIONAL); - if (ib_dma_mapping_error(dev, *dma_addr)) { - *dma_addr = 0; - return -EFAULT; - } - umem_odp->npages++; - return 0; -} - /** * ib_umem_odp_map_dma_and_lock - DMA map userspace memory in an ODP MR and lock it. * * Maps the range passed in the argument to DMA addresses. - * The DMA addresses of the mapped pages is updated in umem_odp->dma_list. * Upon success the ODP MR will be locked to let caller complete its device * page table update. 
* @@ -372,7 +331,7 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, range.default_flags |= HMM_PFN_REQ_WRITE; } - range.hmm_pfns = &(umem_odp->pfn_list[pfn_start_idx]); + range.hmm_pfns = &(umem_odp->map.pfn_list[pfn_start_idx]); timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); retry: @@ -423,16 +382,6 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, __func__, hmm_order, page_shift); break; } - - ret = ib_umem_odp_map_dma_single_page( - umem_odp, dma_index, - hmm_pfn_to_page(range.hmm_pfns[pfn_index])); - if (ret < 0) { - ibdev_dbg(umem_odp->umem.ibdev, - "ib_umem_odp_map_dma_single_page failed with error %d\n", ret); - break; - } - range.hmm_pfns[pfn_index] |= HMM_PFN_DMA_MAPPED; } /* upon success lock should stay on hold for the callee */ if (!ret) @@ -452,32 +401,23 @@ EXPORT_SYMBOL(ib_umem_odp_map_dma_and_lock); void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, u64 bound) { - dma_addr_t dma; - int idx; - u64 addr; struct ib_device *dev = umem_odp->umem.ibdev; + u64 addr; lockdep_assert_held(&umem_odp->umem_mutex); virt = max_t(u64, virt, ib_umem_start(umem_odp)); bound = min_t(u64, bound, ib_umem_end(umem_odp)); for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) { - unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> - PAGE_SHIFT; - struct page *page = - hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); - - idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift; - dma = umem_odp->dma_list[idx]; + u64 offset = addr - ib_umem_start(umem_odp); + size_t idx = offset >> umem_odp->page_shift; + unsigned long pfn = umem_odp->map.pfn_list[idx]; - if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_VALID)) - goto clear; - if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_DMA_MAPPED)) + if (!hmm_dma_unmap_pfn(dev->dma_device, &umem_odp->map, idx)) goto clear; - ib_dma_unmap_page(dev, dma, BIT(umem_odp->page_shift), - DMA_BIDIRECTIONAL); - if 
(umem_odp->pfn_list[pfn_idx] & HMM_PFN_WRITE) { + if (pfn & HMM_PFN_WRITE) { + struct page *page = hmm_pfn_to_page(pfn); struct page *head_page = compound_head(page); /* * set_page_dirty prefers being called with @@ -492,7 +432,7 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, } umem_odp->npages--; clear: - umem_odp->pfn_list[pfn_idx] &= ~HMM_PFN_FLAGS; + umem_odp->map.pfn_list[idx] &= ~HMM_PFN_FLAGS; } } EXPORT_SYMBOL(ib_umem_odp_unmap_dma_pages); diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index c4946d4f0ad7..6fa171e74754 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1445,8 +1445,8 @@ void mlx5_ib_odp_cleanup_one(struct mlx5_ib_dev *ibdev); int __init mlx5_ib_odp_init(void); void mlx5_ib_odp_cleanup(void); int mlx5_odp_init_mkey_cache(struct mlx5_ib_dev *dev); -void mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, - struct mlx5_ib_mr *mr, int flags); +int mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, + struct mlx5_ib_mr *mr, int flags); int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice, @@ -1467,8 +1467,11 @@ static inline int mlx5_odp_init_mkey_cache(struct mlx5_ib_dev *dev) { return 0; } -static inline void mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, - struct mlx5_ib_mr *mr, int flags) {} +static inline int mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, + struct mlx5_ib_mr *mr, int flags) +{ + return -EOPNOTSUPP; +} static inline int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd, diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 78887500ce15..fbb2a5670c32 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -35,6 +35,8 @@ #include #include #include +#include +#include #include "mlx5_ib.h" #include "cmd.h" @@ -159,40 +161,50 @@ static void populate_klm(struct mlx5_klm 
*pklm, size_t idx, size_t nentries, } } -static void populate_mtt(__be64 *pas, size_t idx, size_t nentries, - struct mlx5_ib_mr *mr, int flags) +static int populate_mtt(__be64 *pas, size_t start, size_t nentries, + struct mlx5_ib_mr *mr, int flags) { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem); bool downgrade = flags & MLX5_IB_UPD_XLT_DOWNGRADE; - unsigned long pfn; - dma_addr_t pa; + struct pci_p2pdma_map_state p2pdma_state = {}; + struct ib_device *dev = odp->umem.ibdev; size_t i; if (flags & MLX5_IB_UPD_XLT_ZAP) - return; + return 0; for (i = 0; i < nentries; i++) { - pfn = odp->pfn_list[idx + i]; + unsigned long pfn = odp->map.pfn_list[start + i]; + dma_addr_t dma_addr; + + pfn = odp->map.pfn_list[start + i]; if (!(pfn & HMM_PFN_VALID)) /* ODP initialization */ continue; - pa = odp->dma_list[idx + i]; - pa |= MLX5_IB_MTT_READ; + dma_addr = hmm_dma_map_pfn(dev->dma_device, &odp->map, + start + i, &p2pdma_state); + if (ib_dma_mapping_error(dev, dma_addr)) + return -EFAULT; + + dma_addr |= MLX5_IB_MTT_READ; if ((pfn & HMM_PFN_WRITE) && !downgrade) - pa |= MLX5_IB_MTT_WRITE; + dma_addr |= MLX5_IB_MTT_WRITE; - pas[i] = cpu_to_be64(pa); + pas[i] = cpu_to_be64(dma_addr); + odp->npages++; } + return 0; } -void mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, - struct mlx5_ib_mr *mr, int flags) +int mlx5_odp_populate_xlt(void *xlt, size_t idx, size_t nentries, + struct mlx5_ib_mr *mr, int flags) { if (flags & MLX5_IB_UPD_XLT_INDIRECT) { populate_klm(xlt, idx, nentries, mr, flags); + return 0; } else { - populate_mtt(xlt, idx, nentries, mr, flags); + return populate_mtt(xlt, idx, nentries, mr, flags); } } @@ -286,7 +298,7 @@ static bool mlx5_ib_invalidate_range(struct mmu_interval_notifier *mni, * estimate the cost of another UMR vs. the cost of bigger * UMR. 
*/ - if (umem_odp->pfn_list[idx] & HMM_PFN_VALID) { + if (umem_odp->map.pfn_list[idx] & HMM_PFN_VALID) { if (!in_block) { blk_start_idx = idx; in_block = 1; diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c index 887fd6fa3ba9..d7fa94ab23cf 100644 --- a/drivers/infiniband/hw/mlx5/umr.c +++ b/drivers/infiniband/hw/mlx5/umr.c @@ -811,7 +811,17 @@ int mlx5r_umr_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages, size_to_map = npages * desc_size; dma_sync_single_for_cpu(ddev, sg.addr, sg.length, DMA_TO_DEVICE); - mlx5_odp_populate_xlt(xlt, idx, npages, mr, flags); + /* + * npages is the maximum number of pages to map, but we + * can't guarantee that all pages are actually mapped. + * + * For example, if page is p2p of type which is not supported + * for mapping, the number of pages mapped will be less than + * requested. + */ + err = mlx5_odp_populate_xlt(xlt, idx, npages, mr, flags); + if (err) + return err; dma_sync_single_for_device(ddev, sg.addr, sg.length, DMA_TO_DEVICE); sg.length = ALIGN(size_to_map, MLX5_UMR_FLEX_ALIGNMENT); diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index a345c26a745d..2a24bf791c10 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -8,24 +8,17 @@ #include #include -#include +#include struct ib_umem_odp { struct ib_umem umem; struct mmu_interval_notifier notifier; struct pid *tgid; - /* An array of the pfns included in the on-demand paging umem. */ - unsigned long *pfn_list; + struct hmm_dma_map map; /* - * An array with DMA addresses mapped for pfns in pfn_list. - * The lower two bits designate access permissions. - * See ODP_READ_ALLOWED_BIT and ODP_WRITE_ALLOWED_BIT. - */ - dma_addr_t *dma_list; - /* - * The umem_mutex protects the page_list and dma_list fields of an ODP + * The umem_mutex protects the page_list field of an ODP * umem, allowing only a single thread to map/unmap pages. The mutex * also protects access to the mmu notifier counters. 
*/ From patchwork Fri Jan 17 10:03:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943128 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4462FC02183 for ; Fri, 17 Jan 2025 10:04:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D1553280010; Fri, 17 Jan 2025 05:04:49 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id C76C0280001; Fri, 17 Jan 2025 05:04:49 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AF0DE280010; Fri, 17 Jan 2025 05:04:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 8D884280001 for ; Fri, 17 Jan 2025 05:04:49 -0500 (EST) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 4D97F16189C for ; Fri, 17 Jan 2025 10:04:49 +0000 (UTC) X-FDA: 83016509898.29.A021AE8 Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf15.hostedemail.com (Postfix) with ESMTP id B7CFFA0016 for ; Fri, 17 Jan 2025 10:04:47 +0000 (UTC) Authentication-Results: imf15.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=bkB3BLpB; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf15.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737108287; a=rsa-sha256; cv=none; b=3sgLNgCbZTEBtv1/Ubk47UBg2WGSqsnk9NMoaeakKeMA8SNX7yTJ4o/Y0ZcZ+QMRc2EeL/ zstfKvjd660owKSinww+Mv7Et/JWL9x5kcoRXj9UqVVG1asdM3D5RwSgw7uJtsH3fywXdk 
K8sEXpnmRqIvapS1M1IKmHDGmv8AkR4= ARC-Authentication-Results: i=1; imf15.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=bkB3BLpB; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf15.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737108287; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=7cQNlFpbnVX6gUAB9amacDtsmGpZEeXHQqQoZZ1JWjE=; b=MgdAnbiLkduesIhAOQ8ni2a9Z/5zbjv9Pf0Nfb2LCtg0efsNzOLzr/uOcd9KbYUhlIkOAm 9oiVdZjeK/JBz/knADGYrVZduP3kBLv32NqugMX41HiDWjiF8f7VVNIBThdhz4v3C5Zk2i NK4n8yGUZNuCTenGUXqhOT8nFrvesCE= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id 3B93BA42B36; Fri, 17 Jan 2025 10:02:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 23080C4CEDD; Fri, 17 Jan 2025 10:04:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1737108286; bh=+jQHGvycC9oSv0D+RK/HRGf1ID+TlvYCc0BZmHQ5ysU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bkB3BLpBVVoZs3Gx90avfaINhDI6JUWwXrziH40Dqga5ISniEhoAjLj5RvuyIUnP2 j/FhgetCrfNwKwEBwX8TNjmKU0lsM7Pm5v/n4Uw9GUPIGtcA2vS3q9LsCvpPPK/eIv L8uldIG5M+jGNP4539BpUxo7/4WJ5ooV80isa85oburt/kXaLUEMukr35gGp5oWMrr OEGrAwLaD7HePIiOAuBjT8b0JnBHsO9P0LXkXPEKGZiTDDCKItyho29JtXg589lTaC 2LCMMv38ZwdUq7o3kc2lfF3uOxJzIpqKgOoJ+ZOvSKGrnUZXYC4jq1uiVhj9+wJGVG cE1kPpwtR3fHA== From: Leon Romanovsky To: Cc: Leon Romanovsky , Jens Axboe , Joerg Roedel , Will Deacon , Sagi Grimberg , "Keith Busch" , "Bjorn Helgaas" , "Logan Gunthorpe" , "Yishai Hadas" , "Shameer Kolothum" , "Kevin Tian" , "Alex Williamson" , 
"Marek Szyprowski" , =?utf-8?b?SsOpcsO0bWUgR2xp?= =?utf-8?b?c3Nl?= , "Andrew Morton" , "Jonathan Corbet" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap" Subject: [PATCH v6 14/17] RDMA/umem: Separate implicit ODP initialization from explicit ODP Date: Fri, 17 Jan 2025 12:03:45 +0200 Message-ID: <5c284b0ef955d4cbbe1f0adbddfa3cd3acdd831d.1737106761.git.leon@kernel.org> X-Mailer: git-send-email 2.47.1 In-Reply-To: References: MIME-Version: 1.0 X-Rspam-User: X-Rspamd-Queue-Id: B7CFFA0016 X-Rspamd-Server: rspam10 X-Stat-Signature: 3cfdytjkwbzmhoeim8dae1rp6ggybdb4 X-HE-Tag: 1737108287-86116 X-HE-Meta: U2FsdGVkX18uJXG3A57Wywk3cuGiOthHF1xQoAQPJjzImRH3BZANOYSRF+zLMP+Q6ZfB3WwxHbO32Hzm0Ty4KapBypHwbQWiAQnIU6Z6uXTIu55oMXz3NGQmRZMM89x3KXbdcISx+rMYD+VTn8BFjHiZNef26kfa0sXqzsjVQ9yG6dVGEtf3yWcdgp0rE6g6ByDxHsUopN1MILupW67ZBTKcvXKYG67ZTvZBEUY1kcTohVCppQzmezIZOIquBhxuhJ0lgRfd6xHDsGSoJnz5ovDZLbQLsbGrFrOMnKTNRJGtJUGzHPi+gyTptpAtJowzxL9nIWY3JxoEzDozHmonr1gZvfGZGTC+t1o07HRktUjyS39dVdPzNM5QPF+JDATqsjg8j0DmEG9nissRu+H/eOjvOp9KSAWVv044KSdL/IrQouXp/F4GshMnSj4wEv9VHw9TWotClRmkKYaocsdAj5c/tubEhv3NczAthBS3VPYIR/YoKJLCe3/yBu5vTIOCC53c/2tH02e/lWiaKroUiHj9o0jCsptOslceg4H0DEL/U06xb6jiSzwHsn5+ifLhjSFAM1WS++5tTHHu56zo/U9v/sdaOh/VMgw17oOwhsZiB+SASWEp1d6aTKECkMTvPsenDiWGGj+C0cLcJ+pm4Wn+IjOY2pnD04JHgcGNrXohtj1CTkmPm9EfgYMUbwhbmsQwldlVfccl9ng0JeW3hkMc1jQTumdVU0zrhq4Fjm1DRbIzhp4MG3JowjHz1T7Ch2mIHjsr7G9jfZ6+sd73KeiO958QgS3+7OpuROwkdTRQguDsCVr6dxD8Ayt1fqAiIRM0Faw1XY81UUVuaWFNxz1ivowZgYR99BbjLZn9OFdUan5cNX6hDV1kNrxwCI+ijL70A1JN812PdHO9We8M0tbpI6GMTrPkfjSqqPSZRL98jXlgb8QYnVLWdR0UvYwxiOnH8RUbU7yLgXvJXja /yqrzk/3 
EBLb2EWl9zabKBmhxnDWv3x+KpbagEjHk5kgDf+dR1y7WrBeJhv25efL8LX4zVgAY0IO6PLRXcqWJFrf5+KJlQwzZ51HNuIxfWoMtx8olSBNMB5aWlUHA7r1KB94ya4BUzlz+Hpr+ZcGa/1wO/hgLFXLolrfFBLytjDC0nQhFkqnknRYV8ci7utucGy1j4xyK2T3xJ5sGXednpVjiJRCUc3AsQXgo27i2sxo2n0xwv4HDd5I= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Leon Romanovsky Create separate functions for the implicit ODP initialization which is different from the explicit ODP initialization. Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 91 +++++++++++++++--------------- 1 file changed, 46 insertions(+), 45 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 30cd8f353476..51d518989914 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -48,41 +48,44 @@ #include "uverbs.h" -static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, - const struct mmu_interval_notifier_ops *ops) +static void ib_init_umem_implicit_odp(struct ib_umem_odp *umem_odp) +{ + umem_odp->is_implicit_odp = 1; + umem_odp->umem.is_odp = 1; + mutex_init(&umem_odp->umem_mutex); +} + +static int ib_init_umem_odp(struct ib_umem_odp *umem_odp, + const struct mmu_interval_notifier_ops *ops) { struct ib_device *dev = umem_odp->umem.ibdev; + size_t page_size = 1UL << umem_odp->page_shift; + unsigned long start; + unsigned long end; int ret; umem_odp->umem.is_odp = 1; mutex_init(&umem_odp->umem_mutex); - if (!umem_odp->is_implicit_odp) { - size_t page_size = 1UL << umem_odp->page_shift; - unsigned long start; - unsigned long end; - - start = ALIGN_DOWN(umem_odp->umem.address, page_size); - if (check_add_overflow(umem_odp->umem.address, - (unsigned long)umem_odp->umem.length, - &end)) - return -EOVERFLOW; - end = ALIGN(end, page_size); - if (unlikely(end < page_size)) - return -EOVERFLOW; - - ret = 
hmm_dma_map_alloc(dev->dma_device, &umem_odp->map, - (end - start) >> PAGE_SHIFT, - 1 << umem_odp->page_shift); - if (ret) - return ret; - - ret = mmu_interval_notifier_insert(&umem_odp->notifier, - umem_odp->umem.owning_mm, - start, end - start, ops); - if (ret) - goto out_free_map; - } + start = ALIGN_DOWN(umem_odp->umem.address, page_size); + if (check_add_overflow(umem_odp->umem.address, + (unsigned long)umem_odp->umem.length, &end)) + return -EOVERFLOW; + end = ALIGN(end, page_size); + if (unlikely(end < page_size)) + return -EOVERFLOW; + + ret = hmm_dma_map_alloc(dev->dma_device, &umem_odp->map, + (end - start) >> PAGE_SHIFT, + 1 << umem_odp->page_shift); + if (ret) + return ret; + + ret = mmu_interval_notifier_insert(&umem_odp->notifier, + umem_odp->umem.owning_mm, start, + end - start, ops); + if (ret) + goto out_free_map; return 0; @@ -106,7 +109,6 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_device *device, { struct ib_umem *umem; struct ib_umem_odp *umem_odp; - int ret; if (access & IB_ACCESS_HUGETLB) return ERR_PTR(-EINVAL); @@ -118,16 +120,10 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_device *device, umem->ibdev = device; umem->writable = ib_access_writable(access); umem->owning_mm = current->mm; - umem_odp->is_implicit_odp = 1; umem_odp->page_shift = PAGE_SHIFT; umem_odp->tgid = get_task_pid(current->group_leader, PIDTYPE_PID); - ret = ib_init_umem_odp(umem_odp, NULL); - if (ret) { - put_pid(umem_odp->tgid); - kfree(umem_odp); - return ERR_PTR(ret); - } + ib_init_umem_implicit_odp(umem_odp); return umem_odp; } EXPORT_SYMBOL(ib_umem_odp_alloc_implicit); @@ -248,7 +244,7 @@ struct ib_umem_odp *ib_umem_odp_get(struct ib_device *device, } EXPORT_SYMBOL(ib_umem_odp_get); -void ib_umem_odp_release(struct ib_umem_odp *umem_odp) +static void ib_umem_odp_free(struct ib_umem_odp *umem_odp) { struct ib_device *dev = umem_odp->umem.ibdev; @@ -258,14 +254,19 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) * It is the 
driver's responsibility to ensure, before calling us, * that the hardware will not attempt to access the MR any more. */ - if (!umem_odp->is_implicit_odp) { - mutex_lock(&umem_odp->umem_mutex); - ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), - ib_umem_end(umem_odp)); - mutex_unlock(&umem_odp->umem_mutex); - mmu_interval_notifier_remove(&umem_odp->notifier); - hmm_dma_map_free(dev->dma_device, &umem_odp->map); - } + mutex_lock(&umem_odp->umem_mutex); + ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp), + ib_umem_end(umem_odp)); + mutex_unlock(&umem_odp->umem_mutex); + mmu_interval_notifier_remove(&umem_odp->notifier); + hmm_dma_map_free(dev->dma_device, &umem_odp->map); +} + +void ib_umem_odp_release(struct ib_umem_odp *umem_odp) +{ + if (!umem_odp->is_implicit_odp) + ib_umem_odp_free(umem_odp); + put_pid(umem_odp->tgid); kfree(umem_odp); } From patchwork Fri Jan 17 10:03:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943132 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1AC0C02185 for ; Fri, 17 Jan 2025 10:05:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6ACE6280014; Fri, 17 Jan 2025 05:05:05 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 65CFF280001; Fri, 17 Jan 2025 05:05:05 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4D65D280014; Fri, 17 Jan 2025 05:05:05 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 26535280001 for ; Fri, 17 Jan 2025 05:05:05 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 
[10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id D57ACC1893 for ; Fri, 17 Jan 2025 10:05:04 +0000 (UTC) X-FDA: 83016510528.09.020C6C0 Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf18.hostedemail.com (Postfix) with ESMTP id 41E861C000D for ; Fri, 17 Jan 2025 10:05:03 +0000 (UTC) Authentication-Results: imf18.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=dPnPlsl4; spf=pass (imf18.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737108303; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=xHL+0PwsYkqy9hrcZzisDE7voF41Wm2cb546EfOK7HQ=; b=4f2KOQHylKlhUKax7Q4/1+0ev7f0r356UGqytxNuMrhMW2Wto2S3nFHCWz/pamqr1z3m+h oqnIgGHqyxxsWJd3cLJGGaD1vw0Jg1jr26iOxfQUiTJLgAfQndBsM+6VLSuA0JKf1+HN91 NH6u/4o0/gVGT1sBHQuwnVpRCkDNSXc= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=dPnPlsl4; spf=pass (imf18.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737108303; a=rsa-sha256; cv=none; b=c2ZuWv/G495Rg6g5wDxUVfmAXgqGXJjTtCAddqgM1K7w6nfmOQHP3B+5WIoKZ5STcMV2Lg X0j58rWuZ/Fr5KkNCNd9dix5t4Rh1uGtQIBvWIpb9lYCrIleYo2F8G1upL5eDSO8yNywoZ AQ8xbRY8eYpjJrNkQa0/I8uJGJcg6c8= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id A832EA42B35; Fri, 17 Jan 2025 10:03:14 +0000 (UTC) Received: by 
smtp.kernel.org (Postfix) with ESMTPSA id 750C6C4CEDD; Fri, 17 Jan 2025 10:05:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1737108302; bh=NjoF5CkFbjfbiiDh3lbvpEgTINBqFJZ8gAtQ0qoZOsM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dPnPlsl4l3XEbZ6+YufhPOPVz19IDZFEVZFvoB2c4laBaP28NRzb/ZXJ06r4Me1/r ZtZjVoJHCnUfeZFeD4gCzg+vtq91HawcJ5qyJEdu23sfd9OtIUfRVcYghCJJDEurAp glHdtj4q1NaA0Z6Lra2VOeUfxHHriM0Q8hT9DTFrZDnqMncxg2Ra4WEHoYA8QI7+Mr AyV1GlmFAZg0fVxyqWovsMk/wbm3vWsw7G2di4YbfG4lUqsZLKMFUfLunEjTLAkdNb yYu6xI98I+5MpSRovIfbQr5xCjXxzzDgayLWG4w8ylx7iPalO6zWqSy4HYQSzkh58u aJO6ADhEGgmZg== From: Leon Romanovsky To: Cc: Leon Romanovsky , Jens Axboe , Joerg Roedel , Will Deacon , Sagi Grimberg , "Keith Busch" , "Bjorn Helgaas" , "Logan Gunthorpe" , "Yishai Hadas" , "Shameer Kolothum" , "Kevin Tian" , "Alex Williamson" , "Marek Szyprowski" , =?utf-8?b?SsOpcsO0bWUgR2xp?= =?utf-8?b?c3Nl?= , "Andrew Morton" , "Jonathan Corbet" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap" Subject: [PATCH v6 15/17] vfio/mlx5: Explicitly use number of pages instead of allocated length Date: Fri, 17 Jan 2025 12:03:46 +0200 Message-ID: X-Mailer: git-send-email 2.47.1 In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Queue-Id: 41E861C000D X-Stat-Signature: b91dkqe6zf4e6bh9n5bik3qt46mwynfz X-Rspamd-Server: rspam08 X-Rspam-User: X-HE-Tag: 1737108303-865641 X-HE-Meta: 
U2FsdGVkX18xLn4HQIu/Dwf3KAlc8avFijtkn4yUcYkxZ1P7Rjh70wKE/3uEN+/cZegH5zWu/yfSnri/tGCB+Q2SJusWnse6/gjBBGoe9gxUmsprDlaBS2l6c8CY1t5f3J/uoSTRmnj4wYrrp4lob8fvg+G9eX2o2vtz5PHRhMLiU8i1GkVxenLrijPkiNlNEqf2qOY08Knm96BXNdRkiQmyQURTwBV+UQcb+oFeQhxcZGI8+kWl6rcYvaId3+EflNFIPNDL2d5Bx6Y4Gsmdqyj9T/24rPqdUZm3ZoZL9sbH4EIrEjT6ziZxsYN7y/SoXRn6yP6rKNtusJs3pqDzdjAVWIpNhsF+cI8Y8pPZ0FoRHCyFfpK2bwgrLAeJgo3LuBolGvaasyFRqRek8+BSi7Shapiq3JxiFOYEnqPiP/mR+LuxtF49GmD5oQxNAgqTeWoiAb5qMzjwQ4fSJ6+lhy/qzlYFlAZkin2LKsdnnKO9omHzLVWkWKNqx2vdIRANX4gfeuYxg/WVH8blMBwQBTwnlmQ9XJbt0GpOixCxvZ2UIA++GOksHd3TLipiltOlRwb+3YfweR3Z3k19OE1lWOYzfh9X3rXSe75Zm2wdvaKjvt7zamOTHp3SRj4rP3s/IsmoLqvDVDf/BPxGoC0H1DQ99/r0nHNY/zGnXKWMivQEd3rC797E4voOh+5jluMlMPX8CszAVWeuDwVLMrjjMmr5p2vDakOy4sPILDfvI860lD4CTyxs9Em+m2+BYZtxix1dam23jBZp7XQr8clPKc0WuVlOpAdwCXy2uE4/n6jZdhLD0mCsD5Uz5ZBmcitBrbqcJ46v1gwQyhAlJQccElwzo3LnZbuIidZQwghiuhZh8KPlcX0k29nhZij6lsGc6NtYIHxU4UZjZ8YL/CyZMBWy5v/Ae6PdtFhqIb1HVCpMIBLWRwpcfLIBjZjUYoDh4PVuW6sc9zoeut1xCQh 4myPtYQT zicAE4GlGRSpIiVFRKR6cadx4Pd8/m1CZRzKLQ5HfY8Mf70KEryyzA3JKkhA/RDkLYPnsYXJ7NCA/mlcvFYIzbETnMpDy4oOY+4cBSty+UJFjWbUAsT2ZcE7ul8VUxeyf+Ur3wQe2IwiY4ilpuMlrVTdLGDabpuMwedqupCokdilxr9IZAu+zdnajiGq18qAAUoI/vQulD9TXfBjGeBZEUopMC3TLN2s9I5HqwS4lgvXjc4Y= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Leon Romanovsky allocated_length is always the number of pages multiplied by the page size, so change the functions to accept the number of pages instead. This opens an avenue to combine the receive and send paths and improves code readability.
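The conversion this change relies on can be sketched in plain user-space C. This is only an illustration, not code from the patch: SKETCH_PAGE_SIZE (assumed to be 4096 here), SKETCH_DIV_ROUND_UP and the sketch_* helpers are hypothetical names that mirror the kernel's PAGE_SIZE and DIV_ROUND_UP() as used in the diff below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this user-space sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Same rounding rule as the kernel's DIV_ROUND_UP() macro. */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: whole pages needed to hold `length` bytes. */
static uint32_t sketch_bytes_to_npages(size_t length)
{
	return (uint32_t)SKETCH_DIV_ROUND_UP(length, (size_t)SKETCH_PAGE_SIZE);
}

/* Hypothetical helper: bytes backed by `npages` whole pages. */
static uint64_t sketch_npages_to_bytes(uint32_t npages)
{
	return (uint64_t)npages * SKETCH_PAGE_SIZE;
}

/*
 * A buffer sized in pages always covers the byte length it was
 * requested for, which is why callers can compare page counts
 * instead of byte lengths.
 */
static void sketch_check_covers(size_t length)
{
	uint32_t npages = sketch_bytes_to_npages(length);

	assert(sketch_npages_to_bytes(npages) >= length);
}
```

Callers that still think in bytes round up once at the call site, and the few places that need a byte size (for example the size field of the save command) multiply the page count back by the page size.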
Signed-off-by: Leon Romanovsky --- drivers/vfio/pci/mlx5/cmd.c | 32 ++++++++++----------- drivers/vfio/pci/mlx5/cmd.h | 10 +++---- drivers/vfio/pci/mlx5/main.c | 56 +++++++++++++++++++++++------------- 3 files changed, 57 insertions(+), 41 deletions(-) diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index 7527e277c898..88e76afba606 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -318,8 +318,7 @@ static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, struct mlx5_vhca_recv_buf *recv_buf, u32 *mkey) { - size_t npages = buf ? DIV_ROUND_UP(buf->allocated_length, PAGE_SIZE) : - recv_buf->npages; + size_t npages = buf ? buf->npages : recv_buf->npages; int err = 0, inlen; __be64 *mtt; void *mkc; @@ -375,7 +374,7 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) if (mvdev->mdev_detach) return -ENOTCONN; - if (buf->dmaed || !buf->allocated_length) + if (buf->dmaed || !buf->npages) return -EINVAL; ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); @@ -445,7 +444,7 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf, if (ret) goto err_append; - buf->allocated_length += filled * PAGE_SIZE; + buf->npages += filled; /* clean input for another bulk allocation */ memset(page_list, 0, filled * sizeof(*page_list)); to_fill = min_t(unsigned int, to_alloc, @@ -464,8 +463,7 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf, } struct mlx5_vhca_data_buffer * -mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, +mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, enum dma_data_direction dma_dir) { struct mlx5_vhca_data_buffer *buf; @@ -477,9 +475,8 @@ mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, buf->dma_dir = dma_dir; buf->migf = migf; - if (length) { - ret = mlx5vf_add_migration_pages(buf, - DIV_ROUND_UP_ULL(length, PAGE_SIZE)); + if (npages) { + ret = 
mlx5vf_add_migration_pages(buf, npages); if (ret) goto end; @@ -505,8 +502,8 @@ void mlx5vf_put_data_buffer(struct mlx5_vhca_data_buffer *buf) } struct mlx5_vhca_data_buffer * -mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir) +mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir) { struct mlx5_vhca_data_buffer *buf, *temp_buf; struct list_head free_list; @@ -521,7 +518,7 @@ mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, list_for_each_entry_safe(buf, temp_buf, &migf->avail_list, buf_elm) { if (buf->dma_dir == dma_dir) { list_del_init(&buf->buf_elm); - if (buf->allocated_length >= length) { + if (buf->npages >= npages) { spin_unlock_irq(&migf->list_lock); goto found; } @@ -535,7 +532,7 @@ mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, } } spin_unlock_irq(&migf->list_lock); - buf = mlx5vf_alloc_data_buffer(migf, length, dma_dir); + buf = mlx5vf_alloc_data_buffer(migf, npages, dma_dir); found: while ((temp_buf = list_first_entry_or_null(&free_list, @@ -716,7 +713,7 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, MLX5_SET(save_vhca_state_in, in, op_mod, 0); MLX5_SET(save_vhca_state_in, in, vhca_id, mvdev->vhca_id); MLX5_SET(save_vhca_state_in, in, mkey, buf->mkey); - MLX5_SET(save_vhca_state_in, in, size, buf->allocated_length); + MLX5_SET(save_vhca_state_in, in, size, buf->npages * PAGE_SIZE); MLX5_SET(save_vhca_state_in, in, incremental, inc); MLX5_SET(save_vhca_state_in, in, set_track, track); @@ -738,8 +735,11 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, } if (!header_buf) { - header_buf = mlx5vf_get_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + header_buf = mlx5vf_get_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(header_buf)) { err = PTR_ERR(header_buf); goto err_free; diff --git 
a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index df421dc6de04..7d4a833b6900 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -56,7 +56,7 @@ struct mlx5_vhca_data_buffer { struct sg_append_table table; loff_t start_pos; u64 length; - u64 allocated_length; + u32 npages; u32 mkey; enum dma_data_direction dma_dir; u8 dmaed:1; @@ -217,12 +217,12 @@ int mlx5vf_cmd_alloc_pd(struct mlx5_vf_migration_file *migf); void mlx5vf_cmd_dealloc_pd(struct mlx5_vf_migration_file *migf); void mlx5fv_cmd_clean_migf_resources(struct mlx5_vf_migration_file *migf); struct mlx5_vhca_data_buffer * -mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir); +mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir); void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf); struct mlx5_vhca_data_buffer * -mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir); +mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir); void mlx5vf_put_data_buffer(struct mlx5_vhca_data_buffer *buf); struct page *mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf, unsigned long offset); diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index 8833e60d42f5..83247f016441 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -308,6 +308,7 @@ static struct mlx5_vhca_data_buffer * mlx5vf_mig_file_get_stop_copy_buf(struct mlx5_vf_migration_file *migf, u8 index, size_t required_length) { + u32 npages = DIV_ROUND_UP(required_length, PAGE_SIZE); struct mlx5_vhca_data_buffer *buf = migf->buf[index]; u8 chunk_num; @@ -315,12 +316,11 @@ mlx5vf_mig_file_get_stop_copy_buf(struct mlx5_vf_migration_file *migf, chunk_num = buf->stop_copy_chunk_num; buf->migf->buf[index] = NULL; /* Checking whether the pre-allocated 
buffer can fit */ - if (buf->allocated_length >= required_length) + if (buf->npages >= npages) return buf; mlx5vf_put_data_buffer(buf); - buf = mlx5vf_get_data_buffer(buf->migf, required_length, - DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer(buf->migf, npages, DMA_FROM_DEVICE); if (IS_ERR(buf)) return buf; @@ -373,7 +373,8 @@ static int mlx5vf_add_stop_copy_header(struct mlx5_vf_migration_file *migf, u8 *to_buff; int ret; - header_buf = mlx5vf_get_data_buffer(migf, size, DMA_NONE); + header_buf = mlx5vf_get_data_buffer(migf, DIV_ROUND_UP(size, PAGE_SIZE), + DMA_NONE); if (IS_ERR(header_buf)) return PTR_ERR(header_buf); @@ -388,7 +389,7 @@ static int mlx5vf_add_stop_copy_header(struct mlx5_vf_migration_file *migf, to_buff = kmap_local_page(page); memcpy(to_buff, &header, sizeof(header)); header_buf->length = sizeof(header); - data.stop_copy_size = cpu_to_le64(migf->buf[0]->allocated_length); + data.stop_copy_size = cpu_to_le64(migf->buf[0]->npages * PAGE_SIZE); memcpy(to_buff + sizeof(header), &data, sizeof(data)); header_buf->length += sizeof(data); kunmap_local(to_buff); @@ -437,15 +438,20 @@ static int mlx5vf_prep_stop_copy(struct mlx5vf_pci_core_device *mvdev, num_chunks = mvdev->chunk_mode ? MAX_NUM_CHUNKS : 1; for (i = 0; i < num_chunks; i++) { - buf = mlx5vf_get_data_buffer(migf, inc_state_size, DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer( + migf, DIV_ROUND_UP(inc_state_size, PAGE_SIZE), + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto err; } migf->buf[i] = buf; - buf = mlx5vf_get_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + buf = mlx5vf_get_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto err; @@ -553,7 +559,8 @@ static long mlx5vf_precopy_ioctl(struct file *filp, unsigned int cmd, * We finished transferring the current state and the device has a * dirty state, save a new state to be ready for. 
*/ - buf = mlx5vf_get_data_buffer(migf, inc_length, DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer(migf, DIV_ROUND_UP(inc_length, PAGE_SIZE), + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); mlx5vf_mark_err(migf); @@ -675,8 +682,8 @@ mlx5vf_pci_save_device_data(struct mlx5vf_pci_core_device *mvdev, bool track) if (track) { /* leave the allocated buffer ready for the stop-copy phase */ - buf = mlx5vf_alloc_data_buffer(migf, - migf->buf[0]->allocated_length, DMA_FROM_DEVICE); + buf = mlx5vf_alloc_data_buffer(migf, migf->buf[0]->npages, + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto out_pd; @@ -917,11 +924,14 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, goto out_unlock; break; case MLX5_VF_LOAD_STATE_PREP_HEADER_DATA: - if (vhca_buf_header->allocated_length < migf->record_size) { + { + u32 npages = DIV_ROUND_UP(migf->record_size, PAGE_SIZE); + + if (vhca_buf_header->npages < npages) { mlx5vf_free_data_buffer(vhca_buf_header); - migf->buf_header[0] = mlx5vf_alloc_data_buffer(migf, - migf->record_size, DMA_NONE); + migf->buf_header[0] = mlx5vf_alloc_data_buffer( + migf, npages, DMA_NONE); if (IS_ERR(migf->buf_header[0])) { ret = PTR_ERR(migf->buf_header[0]); migf->buf_header[0] = NULL; @@ -934,6 +944,7 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, vhca_buf_header->start_pos = migf->max_pos; migf->load_state = MLX5_VF_LOAD_STATE_READ_HEADER_DATA; break; + } case MLX5_VF_LOAD_STATE_READ_HEADER_DATA: ret = mlx5vf_resume_read_header_data(migf, vhca_buf_header, &buf, &len, pos, &done); @@ -944,12 +955,13 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, { u64 size = max(migf->record_size, migf->stop_copy_prep_size); + u32 npages = DIV_ROUND_UP(size, PAGE_SIZE); - if (vhca_buf->allocated_length < size) { + if (vhca_buf->npages < npages) { mlx5vf_free_data_buffer(vhca_buf); - migf->buf[0] = mlx5vf_alloc_data_buffer(migf, - size, DMA_TO_DEVICE); + 
migf->buf[0] = mlx5vf_alloc_data_buffer( + migf, npages, DMA_TO_DEVICE); if (IS_ERR(migf->buf[0])) { ret = PTR_ERR(migf->buf[0]); migf->buf[0] = NULL; @@ -1037,8 +1049,11 @@ mlx5vf_pci_resume_device_data(struct mlx5vf_pci_core_device *mvdev) } migf->buf[0] = buf; - buf = mlx5vf_alloc_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + buf = mlx5vf_alloc_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto out_buf; @@ -1148,7 +1163,8 @@ mlx5vf_pci_step_device_state_locked(struct mlx5vf_pci_core_device *mvdev, MLX5VF_QUERY_INC | MLX5VF_QUERY_CLEANUP); if (ret) return ERR_PTR(ret); - buf = mlx5vf_get_data_buffer(migf, size, DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer(migf, + DIV_ROUND_UP(size, PAGE_SIZE), DMA_FROM_DEVICE); if (IS_ERR(buf)) return ERR_CAST(buf); /* pre_copy cleanup */ From patchwork Fri Jan 17 10:03:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943130 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23E7EC02183 for ; Fri, 17 Jan 2025 10:04:58 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A8030280012; Fri, 17 Jan 2025 05:04:57 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A2CF0280001; Fri, 17 Jan 2025 05:04:57 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8A912280012; Fri, 17 Jan 2025 05:04:57 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 6780C280001 for ; Fri, 17 Jan 2025 05:04:57 -0500 (EST) Received: from 
smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 2ADB1C188C for ; Fri, 17 Jan 2025 10:04:57 +0000 (UTC) X-FDA: 83016510234.26.D4DBFA0 Received: from nyc.source.kernel.org (nyc.source.kernel.org [147.75.193.91]) by imf12.hostedemail.com (Postfix) with ESMTP id 9045340002 for ; Fri, 17 Jan 2025 10:04:55 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=eDiukSfa; spf=pass (imf12.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737108295; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vetHNQg2R6fL5A23M2cGEW8QodRQ+CNA3NhS9/WeD4Y=; b=BLHkA9rJ/0LyIv/zlcKVPwpP+j43n3O/JXswYXJ5CZjCffmu9Rcizc1w0r4AzvT8lCId1V 6JMyw8iu8CG1Zlznc6Climv7Z7FAb5Jxk8HMtMfCcaVLgsxySYYz0Pu2mGrVKDtfb2dgQt vSJ1sRvWaNujoQNzKhFFONawzCBulTM= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737108295; a=rsa-sha256; cv=none; b=o50kICqtCmsbiOg7wOKwKTox3va2CjPTeP5HqGL2ocrJtauBcVvdV0Cepjik/WTXHOd0p1 KQSEEL6Gi+Y+kRfrnBnuvHBe9c2PWBy++wpB4UvsNRcJHx2dnaBkRIZ0ysOtsyoHfKpLqR Bx0LBWTSyP7Kx8APy/CcsVUX4PS8Yfw= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=eDiukSfa; spf=pass (imf12.hostedemail.com: domain of leon@kernel.org designates 147.75.193.91 as permitted sender) smtp.mailfrom=leon@kernel.org; dmarc=pass (policy=quarantine) header.from=kernel.org Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id 0AE9AA42B2E; Fri, 17 Jan 
2025 10:03:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BE887C4CEDD; Fri, 17 Jan 2025 10:04:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1737108294; bh=8CqQfzqAoc7KUQL2847UpQoyFG7eKSLI7450SxkhIcs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eDiukSfaFCfjd4svlPNun7iFLCImZITE+4Xy1iLXXDUtwq59Sm+nFqw1J1W2uG133 L5Ktx2wlWmHZNIKcRMTptSVbGWY1t2ei/Mg3ZDXII0jnv4gQyvwPLzMNznzA6Hq7uJ 1EQQWZu0ZhjsWNehLWYfMGQGMWvOxzksYzevnnELEvy5QKQrylGLFC3cSThCL0LXFU eKHr/GZubBrNboXXVFtcSigUftwNCgz7M4VinNqWgWJBj1Bz2ChGC0cd5fDBrvMR59 HdzAfHHJ30cdVHBJNr3XnV5Y/V/iHGLSYcbZZEZAob3ypXcy7ZJ4xynWQ2wde/ZC5T PPvT0aWyruK/w== From: Leon Romanovsky To: Cc: Leon Romanovsky , Jens Axboe , Joerg Roedel , Will Deacon , Sagi Grimberg , "Keith Busch" , "Bjorn Helgaas" , "Logan Gunthorpe" , "Yishai Hadas" , "Shameer Kolothum" , "Kevin Tian" , "Alex Williamson" , "Marek Szyprowski" , =?utf-8?b?SsOpcsO0bWUgR2xp?= =?utf-8?b?c3Nl?= , "Andrew Morton" , "Jonathan Corbet" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap" Subject: [PATCH v6 16/17] vfio/mlx5: Rewrite create mkey flow to allow better code reuse Date: Fri, 17 Jan 2025 12:03:47 +0200 Message-ID: X-Mailer: git-send-email 2.47.1 In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Queue-Id: 9045340002 X-Stat-Signature: a3umsgcqejpe9dw1c7ry6486eaczzzn6 X-Rspam-User: X-Rspamd-Server: rspam12 X-HE-Tag: 1737108295-401536 X-HE-Meta: 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Leon Romanovsky Change mkey creation to be performed in multiple steps: data allocation, DMA setup, and the actual call to HW to create the mkey. In this new flow, the whole input to the MKEY command is saved, which eliminates the need to keep a separate array of DMA addresses for the receive list and, in future patches, for the send list too. In addition to reducing memory usage and eliminating unnecessary data movement when setting the MKEY input, the code is prepared for future reuse.
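The multi-step flow described above can be illustrated with a small user-space sketch. This is not the driver code: every name below (sketch_buf, sketch_alloc_cmd_in, sketch_map_pages, sketch_create_key, sketch_setup) is hypothetical, the header size and page size are arbitrary placeholders, and the "DMA addresses" are made-up numbers. It only shows the idea of allocating the full command input once, writing addresses straight into it, and unwinding in reverse order when a later step fails.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, for illustration only */
#define SKETCH_HDR_WORDS 8u    /* arbitrary header size, not the real layout */

/* Hypothetical buffer that keeps its command input for its whole lifetime. */
struct sketch_buf {
	uint32_t npages;
	uint64_t *cmd_in; /* header words, then one 64-bit slot per page */
	uint32_t mapped;  /* pages currently "DMA mapped" */
	int has_key;      /* set once the simulated HW call succeeded */
};

/* Step 1: allocate the whole command input, translation slots included. */
static uint64_t *sketch_alloc_cmd_in(uint32_t npages)
{
	/* Pad the slot count to an even number, as round_up(npages, 2) does. */
	size_t slots = ((size_t)npages + 1) & ~(size_t)1;

	return calloc(SKETCH_HDR_WORDS + slots, sizeof(uint64_t));
}

/* Step 2: "map" each page and store its address straight into the input. */
static int sketch_map_pages(struct sketch_buf *b, int fail_at)
{
	uint64_t *slot = b->cmd_in + SKETCH_HDR_WORDS;
	uint32_t i;

	for (i = 0; i < b->npages; i++) {
		if (fail_at >= 0 && i == (uint32_t)fail_at) {
			b->mapped = 0; /* undo the pages mapped so far */
			return -1;
		}
		/* Fake address; a real driver would get this from the DMA API. */
		slot[i] = 0x100000u + (uint64_t)i * SKETCH_PAGE_SIZE;
		b->mapped++;
	}
	return 0;
}

/* Step 3: simulated hardware call that consumes the prepared input. */
static int sketch_create_key(struct sketch_buf *b, int fail)
{
	if (fail)
		return -1;
	b->has_key = 1;
	return 0;
}

/* Run the three steps in order and unwind in reverse order on failure. */
static int sketch_setup(struct sketch_buf *b, size_t len, int map_fail_at,
			int create_fail)
{
	int ret;

	/* Convert a byte length to pages, like DIV_ROUND_UP(len, PAGE_SIZE). */
	b->npages = (uint32_t)((len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
	b->mapped = 0;
	b->has_key = 0;

	b->cmd_in = sketch_alloc_cmd_in(b->npages);
	if (!b->cmd_in)
		return -1;

	ret = sketch_map_pages(b, map_fail_at);
	if (ret)
		goto err_map;

	ret = sketch_create_key(b, create_fail);
	if (ret)
		goto err_create;

	return 0;

err_create:
	b->mapped = 0; /* undo step 2 */
err_map:
	free(b->cmd_in); /* undo step 1 */
	b->cmd_in = NULL;
	return ret;
}
```

In this sketch a non-NULL cmd_in doubles as the "already set up" marker, which mirrors how the patch replaces the dmaed bit with a check on mkey_in. A caller that succeeds is expected to free cmd_in itself once the buffer is torn down.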
Signed-off-by: Leon Romanovsky --- drivers/vfio/pci/mlx5/cmd.c | 157 ++++++++++++++++++++---------------- drivers/vfio/pci/mlx5/cmd.h | 4 +- 2 files changed, 91 insertions(+), 70 deletions(-) diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index 88e76afba606..48c272ecb04f 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -313,39 +313,21 @@ static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id, return ret; } -static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, - struct mlx5_vhca_data_buffer *buf, - struct mlx5_vhca_recv_buf *recv_buf, - u32 *mkey) +static u32 *alloc_mkey_in(u32 npages, u32 pdn) { - size_t npages = buf ? buf->npages : recv_buf->npages; - int err = 0, inlen; - __be64 *mtt; + int inlen; void *mkc; u32 *in; inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + - sizeof(*mtt) * round_up(npages, 2); + sizeof(__be64) * round_up(npages, 2); - in = kvzalloc(inlen, GFP_KERNEL); + in = kvzalloc(inlen, GFP_KERNEL_ACCOUNT); if (!in) - return -ENOMEM; + return NULL; MLX5_SET(create_mkey_in, in, translations_octword_actual_size, DIV_ROUND_UP(npages, 2)); - mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); - - if (buf) { - struct sg_dma_page_iter dma_iter; - - for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0) - *mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter)); - } else { - int i; - - for (i = 0; i < npages; i++) - *mtt++ = cpu_to_be64(recv_buf->dma_addrs[i]); - } mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); @@ -359,9 +341,30 @@ static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, MLX5_SET(mkc, mkc, log_page_size, PAGE_SHIFT); MLX5_SET(mkc, mkc, translations_octword_size, DIV_ROUND_UP(npages, 2)); MLX5_SET64(mkc, mkc, len, npages * PAGE_SIZE); - err = mlx5_core_create_mkey(mdev, mkey, in, inlen); - kvfree(in); - return err; + + return in; +} + +static int create_mkey(struct 
mlx5_core_dev *mdev, u32 npages, + struct mlx5_vhca_data_buffer *buf, u32 *mkey_in, + u32 *mkey) +{ + __be64 *mtt; + int inlen; + + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); + if (buf) { + struct sg_dma_page_iter dma_iter; + + for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0) + *mtt++ = cpu_to_be64( + sg_page_iter_dma_address(&dma_iter)); + } + + inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + + sizeof(__be64) * round_up(npages, 2); + + return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen); } static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) @@ -374,20 +377,28 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) if (mvdev->mdev_detach) return -ENOTCONN; - if (buf->dmaed || !buf->npages) + if (buf->mkey_in || !buf->npages) return -EINVAL; ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); if (ret) return ret; - ret = _create_mkey(mdev, buf->migf->pdn, buf, NULL, &buf->mkey); - if (ret) + buf->mkey_in = alloc_mkey_in(buf->npages, buf->migf->pdn); + if (!buf->mkey_in) { + ret = -ENOMEM; goto err; + } - buf->dmaed = true; + ret = create_mkey(mdev, buf->npages, buf, buf->mkey_in, &buf->mkey); + if (ret) + goto err_create_mkey; return 0; + +err_create_mkey: + kvfree(buf->mkey_in); + buf->mkey_in = NULL; err: dma_unmap_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); return ret; @@ -401,8 +412,9 @@ void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf) lockdep_assert_held(&migf->mvdev->state_mutex); WARN_ON(migf->mvdev->mdev_detach); - if (buf->dmaed) { + if (buf->mkey_in) { mlx5_core_destroy_mkey(migf->mvdev->mdev, buf->mkey); + kvfree(buf->mkey_in); dma_unmap_sgtable(migf->mvdev->mdev->device, &buf->table.sgt, buf->dma_dir, 0); } @@ -783,7 +795,7 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev, if (mvdev->mdev_detach) return -ENOTCONN; - if (!buf->dmaed) { + if (!buf->mkey_in) { err = mlx5vf_dma_data_buffer(buf); if (err) return err; @@ 
-1384,56 +1396,54 @@ static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf, kvfree(recv_buf->page_list); return -ENOMEM; } +static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + u32 *mkey_in) +{ + dma_addr_t addr; + __be64 *mtt; + int i; + + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); + for (i = npages - 1; i >= 0; i--) { + addr = be64_to_cpu(mtt[i]); + dma_unmap_single(mdev->device, addr, PAGE_SIZE, + DMA_FROM_DEVICE); + } +} -static int register_dma_recv_pages(struct mlx5_core_dev *mdev, - struct mlx5_vhca_recv_buf *recv_buf) +static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + struct page **page_list, u32 *mkey_in) { - int i, j; + dma_addr_t addr; + __be64 *mtt; + int i; - recv_buf->dma_addrs = kvcalloc(recv_buf->npages, - sizeof(*recv_buf->dma_addrs), - GFP_KERNEL_ACCOUNT); - if (!recv_buf->dma_addrs) - return -ENOMEM; + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); - for (i = 0; i < recv_buf->npages; i++) { - recv_buf->dma_addrs[i] = dma_map_page(mdev->device, - recv_buf->page_list[i], - 0, PAGE_SIZE, - DMA_FROM_DEVICE); - if (dma_mapping_error(mdev->device, recv_buf->dma_addrs[i])) + for (i = 0; i < npages; i++) { + addr = dma_map_page(mdev->device, page_list[i], 0, PAGE_SIZE, + DMA_FROM_DEVICE); + if (dma_mapping_error(mdev->device, addr)) goto error; + + *mtt++ = cpu_to_be64(addr); } + return 0; error: - for (j = 0; j < i; j++) - dma_unmap_single(mdev->device, recv_buf->dma_addrs[j], - PAGE_SIZE, DMA_FROM_DEVICE); - - kvfree(recv_buf->dma_addrs); + unregister_dma_pages(mdev, i, mkey_in); return -ENOMEM; } -static void unregister_dma_recv_pages(struct mlx5_core_dev *mdev, - struct mlx5_vhca_recv_buf *recv_buf) -{ - int i; - - for (i = 0; i < recv_buf->npages; i++) - dma_unmap_single(mdev->device, recv_buf->dma_addrs[i], - PAGE_SIZE, DMA_FROM_DEVICE); - - kvfree(recv_buf->dma_addrs); -} - static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev, struct 
mlx5_vhca_qp *qp) { struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; mlx5_core_destroy_mkey(mdev, recv_buf->mkey); - unregister_dma_recv_pages(mdev, recv_buf); + unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in); + kvfree(recv_buf->mkey_in); free_recv_pages(&qp->recv_buf); } @@ -1449,18 +1459,29 @@ static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, if (err < 0) return err; - err = register_dma_recv_pages(mdev, recv_buf); - if (err) + recv_buf->mkey_in = alloc_mkey_in(npages, pdn); + if (!recv_buf->mkey_in) { + err = -ENOMEM; goto end; + } + + err = register_dma_pages(mdev, npages, recv_buf->page_list, + recv_buf->mkey_in); + if (err) + goto err_register_dma; - err = _create_mkey(mdev, pdn, NULL, recv_buf, &recv_buf->mkey); + err = create_mkey(mdev, npages, NULL, recv_buf->mkey_in, + &recv_buf->mkey); if (err) goto err_create_mkey; return 0; err_create_mkey: - unregister_dma_recv_pages(mdev, recv_buf); + unregister_dma_pages(mdev, npages, recv_buf->mkey_in); +err_register_dma: + kvfree(recv_buf->mkey_in); + recv_buf->mkey_in = NULL; end: free_recv_pages(recv_buf); return err; diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index 7d4a833b6900..25dd6ff54591 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -58,8 +58,8 @@ struct mlx5_vhca_data_buffer { u64 length; u32 npages; u32 mkey; + u32 *mkey_in; enum dma_data_direction dma_dir; - u8 dmaed:1; u8 stop_copy_chunk_num; struct list_head buf_elm; struct mlx5_vf_migration_file *migf; @@ -133,8 +133,8 @@ struct mlx5_vhca_cq { struct mlx5_vhca_recv_buf { u32 npages; struct page **page_list; - dma_addr_t *dma_addrs; u32 next_rq_offset; + u32 *mkey_in; u32 mkey; }; From patchwork Fri Jan 17 10:03:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13943131 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D592C02185 for ; Fri, 17 Jan 2025 10:05:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 99555280013; Fri, 17 Jan 2025 05:05:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 91B40280001; Fri, 17 Jan 2025 05:05:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 74906280013; Fri, 17 Jan 2025 05:05:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 4E7C3280001 for ; Fri, 17 Jan 2025 05:05:01 -0500 (EST) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 099111C8BBE for ; Fri, 17 Jan 2025 10:05:01 +0000 (UTC) X-FDA: 83016510402.14.CA735C6 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by imf24.hostedemail.com (Postfix) with ESMTP id 63FC2180005 for ; Fri, 17 Jan 2025 10:04:59 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=rIYLAAc2; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf24.hostedemail.com: domain of leon@kernel.org designates 139.178.84.217 as permitted sender) smtp.mailfrom=leon@kernel.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1737108299; a=rsa-sha256; cv=none; b=KHzFpJHatX2+zCWiJ3K970XbZkaGEtPg01LSqWFhoT4JzYhZh25Tl6WbuND4jfRTVo200q NDogdqSxC1iwxKZear3nu3EAPXRViweIArsLEW17DwdGnJWlwK2LYxJjeB1dlme9slK7W1 spbnKzOrRbXVc1yeHaY1r73DzsHSaoE= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=kernel.org header.s=k20201202 header.b=rIYLAAc2; dmarc=pass (policy=quarantine) header.from=kernel.org; spf=pass (imf24.hostedemail.com: domain of leon@kernel.org 
designates 139.178.84.217 as permitted sender) smtp.mailfrom=leon@kernel.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1737108299; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vurEggIOtlIWGm844n3q6u7hN02E7Wmyhwf7K8e8rS4=; b=HxXYGg64lf67EZJteb4Ie1bTFwwOHea36d/RnTq8rstCn9EijizFRUG7Q/w6rMu0RZV29j ojHyDY7WNrkk65Px74w7pJydZvnu5BcqJxQIrHegZZCYIn1FE0teT1Z3GOI+GWyrJ3Zj2h EQSc+WfCjMY/ARvIzvNOX4gctV9uFYs= Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by dfw.source.kernel.org (Postfix) with ESMTP id DC5995C0540; Fri, 17 Jan 2025 10:04:17 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 92380C4CEDD; Fri, 17 Jan 2025 10:04:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1737108298; bh=o+6vF1rEgX4bgHcNQOvXVocFHBkXLf/fQZQZ/AwPAtw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rIYLAAc2FMI5l7w1DtaWccEM3S+Rro6gTG7ZSBOmIEjNEUc9eT9kwd8Q5T6vmapv6 +IUMN+CS/RfufvCijiYqV1FRP+yaz8kBZXopoH6xweIs0bTrQ7OqPQcXQW+IEajbK9 IetoQN+icWnO0wqx6VQGDeD7DAOa7S2+hAtovlW0QGmABrPpnkX0h+4nQ0kkvGnAZ8 PufdtkugzXaL2juDizNRlkVJqaJpdqkBX1HHfQyovLWnJL+Su72l1JgU1A1hTpHSQ8 g6FYQh0ccpR8/bXRVJlVM/uzxmRSW8J4RDfrk8kWu1SG8ZwD36ca9i/38qjKQaeSIj vS3kpRr/NZhYg== From: Leon Romanovsky To: Cc: Leon Romanovsky , Jens Axboe , Joerg Roedel , Will Deacon , Sagi Grimberg , "Keith Busch" , "Bjorn Helgaas" , "Logan Gunthorpe" , "Yishai Hadas" , "Shameer Kolothum" , "Kevin Tian" , "Alex Williamson" , "Marek Szyprowski" , =?utf-8?b?SsOpcsO0bWUgR2xp?= =?utf-8?b?c3Nl?= , "Andrew Morton" , "Jonathan Corbet" , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, 
linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org, "Randy Dunlap" Subject: [PATCH v6 17/17] vfio/mlx5: Enable the DMA link API Date: Fri, 17 Jan 2025 12:03:48 +0200 Message-ID: <77932f39fd7cdccbe8e2eda50d1ee4727dfbfa9b.1737106761.git.leon@kernel.org> X-Mailer: git-send-email 2.47.1 In-Reply-To: References: MIME-Version: 1.0 X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 63FC2180005 X-Stat-Signature: gtykryy81rxwobrcu375c8u1n9mho76f X-Rspam-User: X-HE-Tag: 1737108299-682350 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Leon Romanovsky Remove intermediate scatter-gather table
completely and enable new DMA link API. Signed-off-by: Leon Romanovsky --- drivers/vfio/pci/mlx5/cmd.c | 299 ++++++++++++++++------------------- drivers/vfio/pci/mlx5/cmd.h | 21 ++- drivers/vfio/pci/mlx5/main.c | 31 ---- 3 files changed, 148 insertions(+), 203 deletions(-) diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index 48c272ecb04f..fba20abf240a 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -345,26 +345,82 @@ static u32 *alloc_mkey_in(u32 npages, u32 pdn) return in; } -static int create_mkey(struct mlx5_core_dev *mdev, u32 npages, - struct mlx5_vhca_data_buffer *buf, u32 *mkey_in, +static int create_mkey(struct mlx5_core_dev *mdev, u32 npages, u32 *mkey_in, u32 *mkey) { + int inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + + sizeof(__be64) * round_up(npages, 2); + + return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen); +} + +static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + u32 *mkey_in, struct dma_iova_state *state, + enum dma_data_direction dir) +{ + dma_addr_t addr; __be64 *mtt; - int inlen; + int i; - mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); - if (buf) { - struct sg_dma_page_iter dma_iter; + WARN_ON_ONCE(dir == DMA_NONE); - for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0) - *mtt++ = cpu_to_be64( - sg_page_iter_dma_address(&dma_iter)); + if (dma_use_iova(state)) { + dma_iova_destroy(mdev->device, state, npages * PAGE_SIZE, dir, + 0); + } else { + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, + klm_pas_mtt); + for (i = npages - 1; i >= 0; i--) { + addr = be64_to_cpu(mtt[i]); + dma_unmap_page(mdev->device, addr, PAGE_SIZE, dir); + } } +} - inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + - sizeof(__be64) * round_up(npages, 2); +static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + struct page **page_list, u32 *mkey_in, + struct dma_iova_state *state, + enum dma_data_direction dir) +{ + dma_addr_t addr; + size_t mapped = 0; + 
__be64 *mtt; + int i, err; - return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen); + WARN_ON_ONCE(dir == DMA_NONE); + + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); + + if (dma_iova_try_alloc(mdev->device, state, 0, npages * PAGE_SIZE)) { + addr = state->addr; + for (i = 0; i < npages; i++) { + err = dma_iova_link(mdev->device, state, + page_to_phys(page_list[i]), mapped, + PAGE_SIZE, dir, 0); + if (err) + goto error; + *mtt++ = cpu_to_be64(addr); + addr += PAGE_SIZE; + mapped += PAGE_SIZE; + } + err = dma_iova_sync(mdev->device, state, 0, mapped); + if (err) + goto error; + } else { + for (i = 0; i < npages; i++) { + addr = dma_map_page(mdev->device, page_list[i], 0, + PAGE_SIZE, dir); + err = dma_mapping_error(mdev->device, addr); + if (err) + goto error; + *mtt++ = cpu_to_be64(addr); + } + } + return 0; + +error: + unregister_dma_pages(mdev, i, mkey_in, state, dir); + return err; } static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) @@ -380,98 +436,91 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) if (buf->mkey_in || !buf->npages) return -EINVAL; - ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); - if (ret) - return ret; - buf->mkey_in = alloc_mkey_in(buf->npages, buf->migf->pdn); - if (!buf->mkey_in) { - ret = -ENOMEM; - goto err; - } + if (!buf->mkey_in) + return -ENOMEM; - ret = create_mkey(mdev, buf->npages, buf, buf->mkey_in, &buf->mkey); + ret = register_dma_pages(mdev, buf->npages, buf->page_list, + buf->mkey_in, &buf->state, buf->dma_dir); + if (ret) + goto err_register_dma; + + ret = create_mkey(mdev, buf->npages, buf->mkey_in, &buf->mkey); if (ret) goto err_create_mkey; return 0; err_create_mkey: + unregister_dma_pages(mdev, buf->npages, buf->mkey_in, &buf->state, + buf->dma_dir); +err_register_dma: kvfree(buf->mkey_in); buf->mkey_in = NULL; -err: - dma_unmap_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); return ret; } +static void free_page_list(u32 
npages, struct page **page_list) +{ + int i; + + /* Undo alloc_pages_bulk_array() */ + for (i = npages - 1; i >= 0; i--) + __free_page(page_list[i]); + + kvfree(page_list); +} + void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf) { - struct mlx5_vf_migration_file *migf = buf->migf; - struct sg_page_iter sg_iter; + struct mlx5vf_pci_core_device *mvdev = buf->migf->mvdev; + struct mlx5_core_dev *mdev = mvdev->mdev; - lockdep_assert_held(&migf->mvdev->state_mutex); - WARN_ON(migf->mvdev->mdev_detach); + lockdep_assert_held(&mvdev->state_mutex); + WARN_ON(mvdev->mdev_detach); if (buf->mkey_in) { - mlx5_core_destroy_mkey(migf->mvdev->mdev, buf->mkey); + mlx5_core_destroy_mkey(mdev, buf->mkey); + unregister_dma_pages(mdev, buf->npages, buf->mkey_in, + &buf->state, buf->dma_dir); kvfree(buf->mkey_in); - dma_unmap_sgtable(migf->mvdev->mdev->device, &buf->table.sgt, - buf->dma_dir, 0); } - /* Undo alloc_pages_bulk_array() */ - for_each_sgtable_page(&buf->table.sgt, &sg_iter, 0) - __free_page(sg_page_iter_page(&sg_iter)); - sg_free_append_table(&buf->table); + free_page_list(buf->npages, buf->page_list); kfree(buf); } -static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf, - unsigned int npages) +static int mlx5vf_add_pages(struct page ***page_list, unsigned int npages) { - unsigned int to_alloc = npages; - struct page **page_list; - unsigned long filled; - unsigned int to_fill; - int ret; + unsigned int filled, done = 0; int i; - to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*page_list)); - page_list = kvzalloc(to_fill * sizeof(*page_list), GFP_KERNEL_ACCOUNT); - if (!page_list) + *page_list = + kvcalloc(npages, sizeof(struct page *), GFP_KERNEL_ACCOUNT); + if (!*page_list) return -ENOMEM; - do { - filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_fill, - page_list); - if (!filled) { - ret = -ENOMEM; + for (;;) { + filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, + npages - done, + *page_list + done); + if (!filled) goto err; 
- } - to_alloc -= filled; - ret = sg_alloc_append_table_from_pages( - &buf->table, page_list, filled, 0, - filled << PAGE_SHIFT, UINT_MAX, SG_MAX_SINGLE_ALLOC, - GFP_KERNEL_ACCOUNT); - if (ret) - goto err_append; - buf->npages += filled; - /* clean input for another bulk allocation */ - memset(page_list, 0, filled * sizeof(*page_list)); - to_fill = min_t(unsigned int, to_alloc, - PAGE_SIZE / sizeof(*page_list)); - } while (to_alloc > 0); + done += filled; + if (done == npages) + break; + } - kvfree(page_list); return 0; -err_append: - for (i = filled - 1; i >= 0; i--) - __free_page(page_list[i]); err: - kvfree(page_list); - return ret; + for (i = 0; i < done; i++) + __free_page((*page_list)[i]); + + kvfree(*page_list); + *page_list = NULL; + return -ENOMEM; } struct mlx5_vhca_data_buffer * @@ -488,10 +537,12 @@ mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, buf->dma_dir = dma_dir; buf->migf = migf; if (npages) { - ret = mlx5vf_add_migration_pages(buf, npages); + ret = mlx5vf_add_pages(&buf->page_list, npages); if (ret) goto end; + buf->npages = npages; + if (dma_dir != DMA_NONE) { ret = mlx5vf_dma_data_buffer(buf); if (ret) @@ -1350,101 +1401,16 @@ static void mlx5vf_destroy_qp(struct mlx5_core_dev *mdev, kfree(qp); } -static void free_recv_pages(struct mlx5_vhca_recv_buf *recv_buf) -{ - int i; - - /* Undo alloc_pages_bulk_array() */ - for (i = 0; i < recv_buf->npages; i++) - __free_page(recv_buf->page_list[i]); - - kvfree(recv_buf->page_list); -} - -static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf, - unsigned int npages) -{ - unsigned int filled = 0, done = 0; - int i; - - recv_buf->page_list = kvcalloc(npages, sizeof(*recv_buf->page_list), - GFP_KERNEL_ACCOUNT); - if (!recv_buf->page_list) - return -ENOMEM; - - for (;;) { - filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, - npages - done, - recv_buf->page_list + done); - if (!filled) - goto err; - - done += filled; - if (done == npages) - break; - } - - recv_buf->npages 
= npages; - return 0; - -err: - for (i = 0; i < npages; i++) { - if (recv_buf->page_list[i]) - __free_page(recv_buf->page_list[i]); - } - - kvfree(recv_buf->page_list); - return -ENOMEM; -} -static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages, - u32 *mkey_in) -{ - dma_addr_t addr; - __be64 *mtt; - int i; - - mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); - for (i = npages - 1; i >= 0; i--) { - addr = be64_to_cpu(mtt[i]); - dma_unmap_single(mdev->device, addr, PAGE_SIZE, - DMA_FROM_DEVICE); - } -} - -static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages, - struct page **page_list, u32 *mkey_in) -{ - dma_addr_t addr; - __be64 *mtt; - int i; - - mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); - - for (i = 0; i < npages; i++) { - addr = dma_map_page(mdev->device, page_list[i], 0, PAGE_SIZE, - DMA_FROM_DEVICE); - if (dma_mapping_error(mdev->device, addr)) - goto error; - - *mtt++ = cpu_to_be64(addr); - } - - return 0; - -error: - unregister_dma_pages(mdev, i, mkey_in); - return -ENOMEM; -} - static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev, struct mlx5_vhca_qp *qp) { struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; mlx5_core_destroy_mkey(mdev, recv_buf->mkey); - unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in); + unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in, + &recv_buf->state, DMA_FROM_DEVICE); kvfree(recv_buf->mkey_in); - free_recv_pages(&qp->recv_buf); + free_page_list(recv_buf->npages, recv_buf->page_list); } static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, @@ -1455,10 +1421,12 @@ static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; int err; - err = alloc_recv_pages(recv_buf, npages); - if (err < 0) + err = mlx5vf_add_pages(&recv_buf->page_list, npages); + if (err) return err; + recv_buf->npages = npages; + recv_buf->mkey_in = 
alloc_mkey_in(npages, pdn); if (!recv_buf->mkey_in) { err = -ENOMEM; @@ -1466,24 +1434,25 @@ static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, } err = register_dma_pages(mdev, npages, recv_buf->page_list, - recv_buf->mkey_in); + recv_buf->mkey_in, &recv_buf->state, + DMA_FROM_DEVICE); if (err) goto err_register_dma; - err = create_mkey(mdev, npages, NULL, recv_buf->mkey_in, - &recv_buf->mkey); + err = create_mkey(mdev, npages, recv_buf->mkey_in, &recv_buf->mkey); if (err) goto err_create_mkey; return 0; err_create_mkey: - unregister_dma_pages(mdev, npages, recv_buf->mkey_in); + unregister_dma_pages(mdev, npages, recv_buf->mkey_in, &recv_buf->state, + DMA_FROM_DEVICE); err_register_dma: kvfree(recv_buf->mkey_in); recv_buf->mkey_in = NULL; end: - free_recv_pages(recv_buf); + free_page_list(npages, recv_buf->page_list); return err; } diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index 25dd6ff54591..d7821b5ca772 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -53,7 +53,8 @@ struct mlx5_vf_migration_header { }; struct mlx5_vhca_data_buffer { - struct sg_append_table table; + struct page **page_list; + struct dma_iova_state state; loff_t start_pos; u64 length; u32 npages; @@ -63,10 +64,6 @@ struct mlx5_vhca_data_buffer { u8 stop_copy_chunk_num; struct list_head buf_elm; struct mlx5_vf_migration_file *migf; - /* Optimize mlx5vf_get_migration_page() for sequential access */ - struct scatterlist *last_offset_sg; - unsigned int sg_last_entry; - unsigned long last_offset; }; struct mlx5vf_async_data { @@ -133,6 +130,7 @@ struct mlx5_vhca_cq { struct mlx5_vhca_recv_buf { u32 npages; struct page **page_list; + struct dma_iova_state state; u32 next_rq_offset; u32 *mkey_in; u32 mkey; @@ -224,8 +222,17 @@ struct mlx5_vhca_data_buffer * mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, enum dma_data_direction dma_dir); void mlx5vf_put_data_buffer(struct mlx5_vhca_data_buffer *buf); 
-struct page *mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf, - unsigned long offset); +static inline struct page * +mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf, + unsigned long offset) +{ + int page_entry = offset / PAGE_SIZE; + + if (page_entry >= buf->npages) + return NULL; + + return buf->page_list[page_entry]; +} void mlx5vf_state_mutex_unlock(struct mlx5vf_pci_core_device *mvdev); void mlx5vf_disable_fds(struct mlx5vf_pci_core_device *mvdev, enum mlx5_vf_migf_state *last_save_state); diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index 83247f016441..c528932e5739 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -34,37 +34,6 @@ static struct mlx5vf_pci_core_device *mlx5vf_drvdata(struct pci_dev *pdev) core_device); } -struct page * -mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf, - unsigned long offset) -{ - unsigned long cur_offset = 0; - struct scatterlist *sg; - unsigned int i; - - /* All accesses are sequential */ - if (offset < buf->last_offset || !buf->last_offset_sg) { - buf->last_offset = 0; - buf->last_offset_sg = buf->table.sgt.sgl; - buf->sg_last_entry = 0; - } - - cur_offset = buf->last_offset; - - for_each_sg(buf->last_offset_sg, sg, - buf->table.sgt.orig_nents - buf->sg_last_entry, i) { - if (offset < sg->length + cur_offset) { - buf->last_offset_sg = sg; - buf->sg_last_entry += i; - buf->last_offset = cur_offset; - return nth_page(sg_page(sg), - (offset - cur_offset) / PAGE_SIZE); - } - cur_offset += sg->length; - } - return NULL; -} - static void mlx5vf_disable_fd(struct mlx5_vf_migration_file *migf) { mutex_lock(&migf->lock);