From patchwork Thu Sep 12 11:15:36 2024
From: Leon Romanovsky
To: Jens Axboe, Jason Gunthorpe, Robin Murphy, Joerg Roedel, Will Deacon,
 Keith Busch, Christoph Hellwig, "Zeng, Oak", Chaitanya Kulkarni
Cc: Leon Romanovsky, Sagi Grimberg, Bjorn Helgaas, Logan Gunthorpe,
 Yishai Hadas, Shameer Kolothum, Kevin Tian, Alex Williamson,
 Marek Szyprowski, Jérôme Glisse, Andrew Morton,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
 linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
 kvm@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC v2 01/21] iommu/dma: Provide an interface to allow preallocate IOVA
Date: Thu, 12 Sep 2024 14:15:36 +0300
Message-ID: <8ae3944565cd7b140625a71b8c7e74ca466bd3ec.1726138681.git.leon@kernel.org>

From: Leon Romanovsky

Separate IOVA allocation into a dedicated callback. This allows the IOVA
to be cached and reused in fast paths by devices that support the ODP
(on-demand paging) mechanism.

Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c | 57 ++++++++++++++++++++++++++++++---------
 include/linux/iommu-dma.h | 11 ++++++++
 2 files changed, 56 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 65a38b5695f9..09deea2fc86b 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -358,7 +358,7 @@ int iommu_dma_init_fq(struct iommu_domain *domain)
 	atomic_set(&cookie->fq_timer_on, 0);
 	/*
 	 * Prevent incomplete fq state being observable. Pairs with path from
-	 * __iommu_dma_unmap() through iommu_dma_free_iova() to queue_iova()
+	 * __iommu_dma_unmap() through __iommu_dma_free_iova() to queue_iova()
 	 */
 	smp_wmb();
 	WRITE_ONCE(cookie->fq_domain, domain);
@@ -759,7 +759,7 @@ static int dma_info_to_prot(enum dma_data_direction dir, bool coherent,
 	}
 }
 
-static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
+static dma_addr_t __iommu_dma_alloc_iova(struct iommu_domain *domain,
 		size_t size, u64 dma_limit, struct device *dev)
 {
 	struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -805,7 +805,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 	return (dma_addr_t)iova << shift;
 }
 
-static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
+static void __iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
 		dma_addr_t iova, size_t size, struct iommu_iotlb_gather *gather)
 {
 	struct iova_domain *iovad = &cookie->iovad;
@@ -842,7 +842,7 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
 	if (!iotlb_gather.queued)
 		iommu_iotlb_sync(domain, &iotlb_gather);
 
-	iommu_dma_free_iova(cookie, dma_addr, size, &iotlb_gather);
+	__iommu_dma_free_iova(cookie, dma_addr, size, &iotlb_gather);
 }
 
 static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
@@ -865,12 +865,12 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 
 	size = iova_align(iovad, size + iova_off);
 
-	iova = iommu_dma_alloc_iova(domain, size, dma_mask, dev);
+	iova = __iommu_dma_alloc_iova(domain, size, dma_mask, dev);
 	if (!iova)
 		return DMA_MAPPING_ERROR;
 
 	if (iommu_map(domain, iova, phys - iova_off, size, prot, GFP_ATOMIC)) {
-		iommu_dma_free_iova(cookie, iova, size, NULL);
+		__iommu_dma_free_iova(cookie, iova, size, NULL);
 		return DMA_MAPPING_ERROR;
 	}
 	return iova + iova_off;
@@ -973,7 +973,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 		return NULL;
 
 	size = iova_align(iovad, size);
-	iova = iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, dev);
+	iova = __iommu_dma_alloc_iova(domain, size, dev->coherent_dma_mask, dev);
 	if (!iova)
 		goto out_free_pages;
 
@@ -1007,7 +1007,7 @@ static struct page **__iommu_dma_alloc_noncontiguous(struct device *dev,
 out_free_sg:
 	sg_free_table(sgt);
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, size, NULL);
+	__iommu_dma_free_iova(cookie, iova, size, NULL);
 out_free_pages:
 	__iommu_dma_free_pages(pages, count);
 	return NULL;
@@ -1434,7 +1434,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	if (!iova_len)
 		return __finalise_sg(dev, sg, nents, 0);
 
-	iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
+	iova = __iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
 	if (!iova) {
 		ret = -ENOMEM;
 		goto out_restore_sg;
@@ -1451,7 +1451,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, int nents,
 	return __finalise_sg(dev, sg, nents, iova);
 
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, iova_len, NULL);
+	__iommu_dma_free_iova(cookie, iova, iova_len, NULL);
 out_restore_sg:
 	__invalidate_sg(sg, nents);
 out:
@@ -1710,6 +1710,39 @@ size_t iommu_dma_max_mapping_size(struct device *dev)
 	return SIZE_MAX;
 }
 
+int iommu_dma_alloc_iova(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(state->dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	dma_addr_t addr;
+
+	size = iova_align(iovad, size + iova_offset(iovad, phys));
+	addr = __iommu_dma_alloc_iova(domain, size, dma_get_mask(state->dev),
+				      state->dev);
+	if (addr == DMA_MAPPING_ERROR)
+		return -EINVAL;
+
+	state->addr = addr;
+	state->size = size;
+	return 0;
+}
+
+void iommu_dma_free_iova(struct dma_iova_state *state)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(state->dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_off = iova_offset(iovad, state->addr);
+	struct iommu_iotlb_gather iotlb_gather;
+
+	iommu_iotlb_gather_init(&iotlb_gather);
+	__iommu_dma_free_iova(cookie, state->addr - iova_off,
+			      iova_align(iovad, state->size + iova_off),
+			      &iotlb_gather);
+}
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
@@ -1746,7 +1779,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	if (!msi_page)
 		return NULL;
 
-	iova = iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
+	iova = __iommu_dma_alloc_iova(domain, size, dma_get_mask(dev), dev);
 	if (!iova)
 		goto out_free_page;
 
@@ -1760,7 +1793,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	return msi_page;
 
 out_free_iova:
-	iommu_dma_free_iova(cookie, iova, size, NULL);
+	__iommu_dma_free_iova(cookie, iova, size, NULL);
 out_free_page:
 	kfree(msi_page);
 	return NULL;
diff --git a/include/linux/iommu-dma.h b/include/linux/iommu-dma.h
index 13874f95d77f..698df67b152a 100644
--- a/include/linux/iommu-dma.h
+++ b/include/linux/iommu-dma.h
@@ -57,6 +57,9 @@ void iommu_dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sgl,
 		int nelems, enum dma_data_direction dir);
 void iommu_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
 		int nelems, enum dma_data_direction dir);
+int iommu_dma_alloc_iova(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size);
+void iommu_dma_free_iova(struct dma_iova_state *state);
 #else
 static inline bool use_dma_iommu(struct device *dev)
 {
@@ -173,5 +176,13 @@ static inline void iommu_dma_sync_sg_for_device(struct device *dev,
 		enum dma_data_direction dir)
 {
 }
+static inline int iommu_dma_alloc_iova(struct dma_iova_state *state,
+		phys_addr_t phys, size_t size)
+{
+	return -EOPNOTSUPP;
+}
+static inline void iommu_dma_free_iova(struct dma_iova_state *state)
+{
+}
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* _LINUX_IOMMU_DMA_H */
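
As a reading aid, a minimal caller sketch of how this alloc/free pair is
meant to be used; it is not part of the patch. The example_* name is
hypothetical, and it assumes the dma_iova_state fields (dev, addr, size)
that later patches in this series introduce.

	/*
	 * Hypothetical caller: pre-allocate an IOVA range once in a slow
	 * path, reuse it from the fast path, release it on teardown.
	 */
	static int example_iova_prealloc(struct device *dev, phys_addr_t phys,
					 size_t size)
	{
		struct dma_iova_state state = { .dev = dev };
		int ret;

		/* Reserves an iova_align()-ed range; fills state.addr/size. */
		ret = iommu_dma_alloc_iova(&state, phys, size);
		if (ret)
			return ret;

		/* ... fast-path mapping into state.addr happens elsewhere ... */

		/* Returns the whole pre-allocated range to the allocator. */
		iommu_dma_free_iova(&state);
		return 0;
	}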
From patchwork Thu Sep 12 11:15:37 2024
From: Leon Romanovsky
Subject: [RFC v2 02/21] iommu/dma: Implement link/unlink ranges callbacks
Date: Thu, 12 Sep 2024 14:15:37 +0300

From: Leon Romanovsky

Add an implementation of the link/unlink interface to map/unmap pages
in the fast path against a pre-allocated IOVA.

Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c | 86 +++++++++++++++++++++++++++++++++++++++
 include/linux/iommu-dma.h | 25 ++++++++++++
 2 files changed, 111 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 09deea2fc86b..72763f76b712 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1743,6 +1743,92 @@ void iommu_dma_free_iova(struct dma_iova_state *state)
 			      &iotlb_gather);
 }
 
+int iommu_dma_start_range(struct device *dev)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+
+	if (static_branch_unlikely(&iommu_deferred_attach_enabled))
+		return iommu_deferred_attach(dev, domain);
+
+	return 0;
+}
+
+void iommu_dma_end_range(struct device *dev)
+{
+	/* TODO: Factor out ops->iotlb_sync_map(..) call from iommu_map()
+	 * and put it here to provide batched iotlb sync for the range.
+	 */
+}
+
+dma_addr_t iommu_dma_link_range(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(state->dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	size_t iova_off = iova_offset(iovad, phys);
+	bool coherent = dev_is_dma_coherent(state->dev);
+	int prot = dma_info_to_prot(state->dir, coherent, attrs);
+	dma_addr_t addr = state->addr + state->range_size;
+	int ret;
+
+	WARN_ON_ONCE(iova_off && state->range_size > 0);
+
+	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+		arch_sync_dma_for_device(phys, size, state->dir);
+
+	size = iova_align(iovad, size + iova_off);
+	ret = iommu_map(domain, addr, phys - iova_off, size, prot, GFP_ATOMIC);
+	if (ret)
+		return ret;
+
+	state->range_size += size;
+	return addr + iova_off;
+}
+
+static void iommu_sync_dma_for_cpu(struct iommu_domain *domain,
+		dma_addr_t start, size_t size, enum dma_data_direction dir)
+{
+	size_t sync_size, unmapped = 0;
+	phys_addr_t phys;
+
+	do {
+		phys = iommu_iova_to_phys(domain, start + unmapped);
+		if (WARN_ON(!phys))
+			continue;
+
+		sync_size = (unmapped + PAGE_SIZE > size) ? size % PAGE_SIZE :
+							    PAGE_SIZE;
+		arch_sync_dma_for_cpu(phys, sync_size, dir);
+		unmapped += sync_size;
+	} while (unmapped < size);
+}
+
+void iommu_dma_unlink_range(struct device *dev, dma_addr_t start, size_t size,
+		enum dma_data_direction dir, unsigned long attrs)
+{
+	struct iommu_domain *domain = iommu_get_dma_domain(dev);
+	struct iommu_dma_cookie *cookie = domain->iova_cookie;
+	struct iova_domain *iovad = &cookie->iovad;
+	struct iommu_iotlb_gather iotlb_gather;
+	bool coherent = dev_is_dma_coherent(dev);
+	size_t unmapped;
+
+	iommu_iotlb_gather_init(&iotlb_gather);
+	iotlb_gather.queued = READ_ONCE(cookie->fq_domain);
+
+	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && !coherent)
+		iommu_sync_dma_for_cpu(domain, start, size, dir);
+
+	size = iova_align(iovad, size);
+	unmapped = iommu_unmap_fast(domain, start, size, &iotlb_gather);
+	WARN_ON(unmapped != size);
+
+	if (!iotlb_gather.queued)
+		iommu_iotlb_sync(domain, &iotlb_gather);
+}
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/include/linux/iommu-dma.h b/include/linux/iommu-dma.h
index 698df67b152a..21b0341f52b8 100644
--- a/include/linux/iommu-dma.h
+++ b/include/linux/iommu-dma.h
@@ -60,6 +60,12 @@ void iommu_dma_sync_sg_for_device(struct device *dev, struct scatterlist *sgl,
 int iommu_dma_alloc_iova(struct dma_iova_state *state, phys_addr_t phys,
 		size_t size);
 void iommu_dma_free_iova(struct dma_iova_state *state);
+int iommu_dma_start_range(struct device *dev);
+void iommu_dma_end_range(struct device *dev);
+dma_addr_t iommu_dma_link_range(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size, unsigned long attrs);
+void iommu_dma_unlink_range(struct device *dev, dma_addr_t start, size_t size,
+		enum dma_data_direction dir, unsigned long attrs);
 #else
 static inline bool use_dma_iommu(struct device *dev)
 {
@@ -184,5 +190,24 @@ static inline void iommu_dma_free_iova(struct dma_iova_state *state)
 {
 }
+static inline int iommu_dma_start_range(struct device *dev)
+{
+	return -EOPNOTSUPP;
+}
+static inline void iommu_dma_end_range(struct device *dev)
+{
+}
+static inline dma_addr_t iommu_dma_link_range(struct dma_iova_state *state,
+					      phys_addr_t phys, size_t size,
+					      unsigned long attrs)
+{
+	return DMA_MAPPING_ERROR;
+}
+static inline void iommu_dma_unlink_range(struct device *dev, dma_addr_t start,
+					  size_t size,
+					  enum dma_data_direction dir,
+					  unsigned long attrs)
+{
+}
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* _LINUX_IOMMU_DMA_H */
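
As a reading aid, a minimal fast-path sketch of the link/unlink call
sequence; it is not part of the patch. The example_* name is
hypothetical, and the sketch assumes 'state' was set up by
iommu_dma_alloc_iova() and that the range_size field used by
iommu_dma_link_range() is present:

	/*
	 * Hypothetical fast-path caller: link a run of pages into the
	 * pre-allocated IOVA range, then unlink the whole run later.
	 */
	static int example_link_pages(struct dma_iova_state *state,
				      struct page **pages, int npages,
				      unsigned long attrs)
	{
		dma_addr_t dma;
		int i, ret;

		ret = iommu_dma_start_range(state->dev);
		if (ret)
			return ret;	/* deferred attach failed */

		for (i = 0; i < npages; i++) {
			dma = iommu_dma_link_range(state,
						   page_to_phys(pages[i]),
						   PAGE_SIZE, attrs);
			/* a real caller must check dma for an error here */
		}
		iommu_dma_end_range(state->dev);

		/* Teardown unlinks the contiguous run in one call: */
		iommu_dma_unlink_range(state->dev, state->addr, state->size,
				       state->dir, attrs);
		return 0;
	}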
From patchwork Thu Sep 12 11:15:38 2024
From: Leon Romanovsky
Subject: [RFC v2 03/21] iommu/dma: Add check if IOVA can be used
Date: Thu, 12 Sep 2024 14:15:38 +0300

From: Leon Romanovsky

Add a check of whether the IOVA path can be used for a given page and
size.

Signed-off-by: Leon Romanovsky
---
 drivers/iommu/dma-iommu.c   | 21 +++++++++++++++++++++
 drivers/pci/p2pdma.c        |  4 ++--
 include/linux/dma-map-ops.h |  7 +++++++
 include/linux/iommu-dma.h   |  7 +++++++
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 72763f76b712..3e2e382bb502 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1829,6 +1830,26 @@ void iommu_dma_unlink_range(struct device *dev, dma_addr_t start, size_t size,
 		iommu_iotlb_sync(domain, &iotlb_gather);
 }
 
+bool iommu_can_use_iova(struct device *dev, struct page *page, size_t size,
+		enum dma_data_direction dir)
+{
+	enum pci_p2pdma_map_type map;
+
+	if (is_swiotlb_force_bounce(dev) || dev_use_swiotlb(dev, size, dir))
+		return false;
+
+	/* TODO: Rewrite this check to rely on specific struct page flags */
+	if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
+		return false;
+
+	if (page && is_pci_p2pdma_page(page)) {
+		map = pci_p2pdma_map_type(page->pgmap, dev);
+		return map == PCI_P2PDMA_MAP_THRU_HOST_BRIDGE;
+	}
+
+	return true;
+}
+
 void iommu_setup_dma_ops(struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 4f47a13cb500..6ceea32bb041 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -964,8 +964,8 @@ void pci_p2pmem_publish(struct pci_dev *pdev, bool publish)
 }
 EXPORT_SYMBOL_GPL(pci_p2pmem_publish);
 
-static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
-		struct device *dev)
+enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
+		struct device *dev)
 {
 	enum pci_p2pdma_map_type type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	struct pci_dev *provider = to_p2p_pgmap(pgmap)->provider;
diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h
index 103d9c66c445..936e822e9f40 100644
--- a/include/linux/dma-map-ops.h
+++ b/include/linux/dma-map-ops.h
@@ -516,6 +516,8 @@ struct pci_p2pdma_map_state {
 enum pci_p2pdma_map_type
 pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
 		       struct scatterlist *sg);
+enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap,
+		struct device *dev);
 #else /* CONFIG_PCI_P2PDMA */
 static inline enum pci_p2pdma_map_type
 pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
@@ -523,6 +525,11 @@ pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev,
 {
 	return PCI_P2PDMA_MAP_NOT_SUPPORTED;
 }
+static inline enum pci_p2pdma_map_type
+pci_p2pdma_map_type(struct dev_pagemap *pgmap, struct device *dev)
+{
+	return PCI_P2PDMA_MAP_NOT_SUPPORTED;
+}
 #endif /* CONFIG_PCI_P2PDMA */
 #endif /* _LINUX_DMA_MAP_OPS_H */
diff --git a/include/linux/iommu-dma.h b/include/linux/iommu-dma.h
index 21b0341f52b8..561d81b12d9c 100644
--- a/include/linux/iommu-dma.h
+++ b/include/linux/iommu-dma.h
@@ -66,6 +66,8 @@ dma_addr_t iommu_dma_link_range(struct dma_iova_state *state, phys_addr_t phys,
 		size_t size, unsigned long attrs);
 void iommu_dma_unlink_range(struct device *dev, dma_addr_t start, size_t size,
 		enum dma_data_direction dir, unsigned long attrs);
+bool iommu_can_use_iova(struct device *dev, struct page *page, size_t size,
+		enum dma_data_direction dir);
 #else
 static inline bool use_dma_iommu(struct device *dev)
 {
@@ -209,5 +211,10 @@ static inline void iommu_dma_unlink_range(struct device *dev, dma_addr_t start,
 					  unsigned long attrs)
 {
 }
+static inline bool iommu_can_use_iova(struct device *dev, struct page *page,
+				      size_t size, enum dma_data_direction dir)
+{
+	return false;
+}
 #endif /* CONFIG_IOMMU_DMA */
 #endif /* _LINUX_IOMMU_DMA_H */
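
As a reading aid, a minimal sketch of how a caller might gate the fast
path on this check; it is not part of the patch and the example_* name
is hypothetical:

	/*
	 * Hypothetical caller: decide once per homogeneous buffer.
	 * iommu_can_use_iova() returns false for SWIOTLB bouncing, for
	 * memory-encrypted platforms, and for P2P pages that do not map
	 * through the host bridge (and therefore bypass the IOMMU).
	 */
	static bool example_use_fast_path(struct device *dev, struct page *page,
					  size_t size,
					  enum dma_data_direction dir)
	{
		if (iommu_can_use_iova(dev, page, size, dir))
			return true;	/* link/unlink into pre-allocated IOVA */

		return false;		/* fall back to dma_map_page_attrs() */
	}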
From patchwork Thu Sep 12 11:15:39 2024
From: Leon Romanovsky
Subject: [RFC v2 04/21] dma-mapping: initialize IOVA state struct
Date: Thu, 12 Sep 2024 14:15:39 +0300

From: Leon Romanovsky

Allow callers to properly initialize the IOVA state struct by providing
a new function to do so. This ensures that even users who do not zero
their allocated memory end up with a valid IOVA state struct.

Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index f693aafe221f..285075873077 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -76,6 +76,20 @@
 #define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
 
+struct dma_iova_state {
+	struct device *dev;
+	enum dma_data_direction dir;
+};
+
+static inline void dma_init_iova_state(struct dma_iova_state *state,
+				       struct device *dev,
+				       enum dma_data_direction dir)
+{
+	memset(state, 0, sizeof(*state));
+	state->dev = dev;
+	state->dir = dir;
+}
+
 #ifdef CONFIG_DMA_API_DEBUG
 void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 void debug_dma_map_single(struct device *dev, const void *addr,
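
As a reading aid, a minimal usage sketch; it is not part of the patch
and the example_* name is hypothetical:

	/*
	 * Hypothetical usage: the state typically lives on the stack and
	 * is not zeroed, so it must be initialized explicitly before use.
	 */
	static void example_state_init(struct device *dev)
	{
		struct dma_iova_state state;	/* uninitialized stack memory */

		/* Zeroes the struct, then records the device and direction. */
		dma_init_iova_state(&state, dev, DMA_TO_DEVICE);
	}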
From patchwork Thu Sep 12 11:15:40 2024
From: Leon Romanovsky
Subject: [RFC v2 05/21] dma-mapping: provide an interface to allocate IOVA
Date: Thu, 12 Sep 2024 14:15:40 +0300

From: Leon Romanovsky

The existing .map_page() callback provides two things at the same time:
it allocates an IOVA and links DMA pages. That combination works well
for most callers, who use it in control paths, but it is less effective
in fast paths. Such advanced callers already manage their data in some
sort of database and can perform IOVA allocation in advance, leaving the
range linkage operation for the fast path.

Provide an interface to allocate/deallocate an IOVA; the next patch adds
the interface to link/unlink DMA ranges to that specific IOVA.

Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h | 18 ++++++++++++++++++
 kernel/dma/mapping.c        | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 285075873077..6a51d8e96a9d 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -78,6 +78,8 @@
 struct dma_iova_state {
 	struct device *dev;
+	dma_addr_t addr;
+	size_t size;
 	enum dma_data_direction dir;
 };
 
@@ -115,6 +117,10 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 	return 0;
 }
 
+int dma_alloc_iova_unaligned(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size);
+void dma_free_iova(struct dma_iova_state *state);
+
 dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
 		size_t offset, size_t size, enum dma_data_direction dir,
 		unsigned long attrs);
@@ -164,6 +170,14 @@ void dma_vunmap_noncontiguous(struct device *dev, void *vaddr);
 int dma_mmap_noncontiguous(struct device *dev, struct vm_area_struct *vma,
 		size_t size, struct sg_table *sgt);
 #else /* CONFIG_HAS_DMA */
+static inline int dma_alloc_iova_unaligned(struct dma_iova_state *state,
+					   phys_addr_t phys, size_t size)
+{
+	return -EOPNOTSUPP;
+}
+static inline void dma_free_iova(struct dma_iova_state *state)
+{
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
@@ -370,6 +384,10 @@ static inline bool dma_need_sync(struct device *dev, dma_addr_t dma_addr)
 	return false;
 }
 #endif /* !CONFIG_HAS_DMA || !CONFIG_DMA_NEED_SYNC */
+static inline int dma_alloc_iova(struct dma_iova_state *state, size_t size)
+{
+	return dma_alloc_iova_unaligned(state, 0, size);
+}
 
 struct page *dma_alloc_pages(struct device *dev, size_t size,
 		dma_addr_t *dma_handle, enum dma_data_direction dir, gfp_t gfp);
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index fd9ecff8beee..4cd910f27dee 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -951,3 +951,38 @@ unsigned long dma_get_merge_boundary(struct device *dev)
 	return ops->get_merge_boundary(dev);
 }
 EXPORT_SYMBOL_GPL(dma_get_merge_boundary);
+
+/**
+ * dma_alloc_iova_unaligned - Allocate an IOVA space
+ * @state: IOVA state
+ * @phys: physical address
+ * @size: IOVA size
+ *
+ * Allocate an IOVA space for the given IOVA state and size. The IOVA
+ * space is sized for the worst case, as if the whole range will be used.
+ */
+int dma_alloc_iova_unaligned(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size)
+{
+	if (!use_dma_iommu(state->dev))
+		return 0;
+
+	WARN_ON_ONCE(!size);
+	return iommu_dma_alloc_iova(state, phys, size);
+}
+EXPORT_SYMBOL_GPL(dma_alloc_iova_unaligned);
+
+/**
+ * dma_free_iova - Free an IOVA space
+ * @state: IOVA state
+ *
+ * Free the IOVA space for the given IOVA state.
+ */
+void dma_free_iova(struct dma_iova_state *state)
+{
+	if (!use_dma_iommu(state->dev))
+		return;
+
+	iommu_dma_free_iova(state);
+}
+EXPORT_SYMBOL_GPL(dma_free_iova);
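
As a reading aid, a minimal driver-level sketch of the new pair; it is
not part of the patch and the example_* name is hypothetical:

	/*
	 * Hypothetical driver setup: reserve an IOVA range up front and
	 * release it on teardown. On non-IOMMU configurations the alloc
	 * is a no-op that returns 0.
	 */
	static int example_driver_setup(struct device *dev, size_t size)
	{
		struct dma_iova_state state;
		int ret;

		dma_init_iova_state(&state, dev, DMA_BIDIRECTIONAL);

		ret = dma_alloc_iova(&state, size);	/* phys == 0 variant */
		if (ret)
			return ret;

		/* ... link/unlink against state.addr in the fast path ... */

		dma_free_iova(&state);
		return 0;
	}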
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=aLB45VxSAnSX1Omnd4tNbgxYwGvlO0vlHzxS5XRXs3Jf+H6Sv4g2apXm3/BjzYZcU jJat8T1BsFMxKIbvmMGDI3d2Wav68reJ0E3LN7sVO+ms1PwvbS5VCjz/Zp+6dn0u38 Hjyfj6EN+qrGYfmm8UXj6bzSoYbXvDIEBCuKATHLVp+eKVznmpDYlgEn6mEfIV25ac CluAye/X57cQXU1+3AFoTAvO2BnVfHu3mTsIsvPd7Y8II6FIRfLqDJj5jqYb3qQCuB HoFkBLlicit7oilmLRe8VOlKBnHex2vD837rA3akUkz25QskU2jzMySkDTK/0QW30m CBd4EkuZoW2Aw== From: Leon Romanovsky To: Jens Axboe , Jason Gunthorpe , Robin Murphy , Joerg Roedel , Will Deacon , Keith Busch , Christoph Hellwig , "Zeng, Oak" , Chaitanya Kulkarni Cc: Leon Romanovsky , Sagi Grimberg , Bjorn Helgaas , Logan Gunthorpe , Yishai Hadas , Shameer Kolothum , Kevin Tian , Alex Williamson , Marek Szyprowski , =?utf-8?b?SsOpcsO0bWUgR2xpc3Nl?= , Andrew Morton , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org, iommu@lists.linux.dev, linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org Subject: [RFC v2 06/21] dma-mapping: set and query DMA IOVA state Date: Thu, 12 Sep 2024 14:15:41 +0300 Message-ID: <818f2fbdb80f07297ca2abe5d04443d3b665f445.1726138681.git.leon@kernel.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: References: MIME-Version: 1.0 X-Stat-Signature: 8x53833rafy9jfcxxozgdjx59jj9zhkc X-Rspamd-Queue-Id: 182BC1A0010 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1726139785-809130 X-HE-Meta: U2FsdGVkX1+hI+lNK9SnxwG4dr012Cm/36+xUk2nzezeRX8Nsz87fryd7LGTYlb7+fjzUu1IOZP9UwT8AgFZI1pFSDr2/c/HD/yUbzdTjSmY36YzcNjBsU7cc4uvWwB9zE8j6ZozvOkp9IHA7A+t8e9CnAfqYpr1PQdrdG395ZFv69vcQK0M9HUeh/VMyOL+PVZeLxZ6RApmpHcDNh01c42+UIaamzkCDKbdtTPpTJzDy9SehFTika8qlWXrhtIJ5zF9fNFUnUzG+8LGfBacD3FAG6SYN+jPAnapIArRuTut3ewrl/GTWUM9h6ujk9P+nNAmiEdx4H67K0iDWCk3Pbf41RpTmcTBceRa+zrb5FIMesW2G1cGbUAKhJVz6TWsgWi05blS2ssc2m3+bpGGg5ZVuTdz52gcCYXUXjhlBg/qkdq4Mh9W/G6HySj/F+BhMUrblwZG5esn+tHcHU2Xr6gRNXABEXPsS0qeX77sbTTh4rXmpOrKF2q/vjXxcyhReXMkOcYYIwfTQYMRiobs54AVuR4BFMTAbuU6JfCAXnM75CAT5/oP0AGkivY+CSbIsRSOSG3vIchGQeb8cRYTGaM+8oE4xUbhphIAjBR1MFv6nyeSQC220ohT6VQoOn1tgLcuwWrQuc1jzTzYjSptaVpFrbbzTV/RqKYhwe3TV+9EJhexggC1KR4AfC36gBNtFFRsk5J9cKIe2+TL4PXgAG1U0NpYC21Or+6v2qJFfkO1IyzkGlh1BIk1DFJRzha9zeGrxAVfmJWIfcoCnlPZPe4FggNN1ZwtVuQpdqdzUgokI48JWIcOaquFhDAYkDVp29F/zAFDaKRDpNiQpWz7V+WVqKJsCbwjnTnwE+KSD50joMLy9CE4mmqnrHMuZAcoS0iFRMjEIB9fynockLHUnRwE8+qaz7tOk04c/aAP601QHjH2zdHmSmrOt4Rz0I6ioP3YuYj4fUapmYPvC2L uCNYnK8+ DtKCXQDuyZ42Lg0SFC7/qSj3WZany9M81Mm72fRR+z6DV6RhXfV3ziSLQGZkuLjEHUVhpJIRqEZKvANlWX6Uk1JJe24i9PNdtAEE0g5ut4L/0BBBpw7NT53Kf8h6KOpbKngch3UbeRbi703thAxwoe8cDfQBpGfJja/FCrOiylEvGqZDqYX+3eRiucfF08uIbRW9YfZLW2inf9QBPkHqPc4Em+mstCPwloaBzeVdeOiDFJQzVBRknwz4j3w== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Leon Romanovsky Provide an option to query and set if IOMMU path can be taken. Callers who supply range of pages can perform it only once as the whole range is supposed to have same memory type. 
Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h | 12 ++++++++++++
 kernel/dma/mapping.c        | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 6a51d8e96a9d..2c74e68b0567 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -81,6 +81,7 @@ struct dma_iova_state {
 	dma_addr_t addr;
 	size_t size;
 	enum dma_data_direction dir;
+	u8 use_iova : 1;
 };
 
 static inline void dma_init_iova_state(struct dma_iova_state *state,
@@ -169,6 +170,9 @@ void *dma_vmap_noncontiguous(struct device *dev, size_t size,
 void dma_vunmap_noncontiguous(struct device *dev, void *vaddr);
 int dma_mmap_noncontiguous(struct device *dev, struct vm_area_struct *vma,
 		size_t size, struct sg_table *sgt);
+void dma_set_iova_state(struct dma_iova_state *state, struct page *page,
+		size_t size);
+bool dma_can_use_iova(struct dma_iova_state *state);
 #else /* CONFIG_HAS_DMA */
 static inline int dma_alloc_iova_unaligned(struct dma_iova_state *state,
 					   phys_addr_t phys, size_t size)
@@ -307,6 +311,14 @@ static inline int dma_mmap_noncontiguous(struct device *dev,
 {
 	return -EINVAL;
 }
+static inline void dma_set_iova_state(struct dma_iova_state *state,
+				      struct page *page, size_t size)
+{
+}
+static inline bool dma_can_use_iova(struct dma_iova_state *state)
+{
+	return false;
+}
 #endif /* CONFIG_HAS_DMA */
 
 #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 4cd910f27dee..16cb03d5d87d 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -6,6 +6,7 @@
  * Copyright (c) 2006  Tejun Heo
  */
 #include <...>	/* for max_pfn */
+#include <...>
 #include <...>
 #include <...>
 #include <...>
@@ -15,6 +16,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <...>
 #include "debug.h"
 #include "direct.h"
 
@@ -986,3 +988,39 @@ void dma_free_iova(struct dma_iova_state *state)
 	iommu_dma_free_iova(state);
 }
 EXPORT_SYMBOL_GPL(dma_free_iova);
+
+/**
+ * dma_set_iova_state - Set the IOVA state for the given page and size
+ * @state: IOVA state
+ * @page: page to check
+ * @size: size of the page
+ *
+ * Set the IOVA state for the given page and size. The IOVA state is set
+ * based on the device and the page.
+ */
+void dma_set_iova_state(struct dma_iova_state *state, struct page *page,
+			size_t size)
+{
+	if (!use_dma_iommu(state->dev))
+		return;
+
+	state->use_iova = iommu_can_use_iova(state->dev, page, size,
+					     state->dir);
+}
+EXPORT_SYMBOL_GPL(dma_set_iova_state);
+
+/**
+ * dma_can_use_iova - check if the device can use the IOMMU IOVA path
+ *		      and won't take the SWIOTLB path
+ * @state: IOVA state
+ *
+ * Return %true if the device will use the preallocated IOVA for the given
+ * buffer, else %false.
+ */
+bool dma_can_use_iova(struct dma_iova_state *state)
+{
+	if (!use_dma_iommu(state->dev))
+		return false;
+
+	return state->use_iova;
+}
+EXPORT_SYMBOL_GPL(dma_can_use_iova);
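The intended call pattern, as far as the commit message describes it, is one
dma_set_iova_state() per homogeneous range followed by cheap
dma_can_use_iova() queries on the fast path. A hedged sketch; the caller
function is hypothetical:

	/* Sketch: decide the mapping path once for a homogeneous range. */
	static bool example_choose_path(struct dma_iova_state *state,
					struct page *first_page, size_t size)
	{
		/*
		 * The whole range is supposed to share one memory type,
		 * so probing it once is enough.
		 */
		dma_set_iova_state(state, first_page, size);

		/* true: IOMMU/IOVA linkage path; false: dma_map_page() path */
		return dma_can_use_iova(state);
	}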
From patchwork Thu Sep 12 11:15:42 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801929
Subject: [RFC v2 07/21] dma-mapping: implement link range API
Date: Thu, 12 Sep 2024 14:15:42 +0300

From: Leon Romanovsky

Introduce new DMA APIs to perform DMA linkage of buffers in layers
higher than DMA. In the proposed API, the callers will perform the
following steps:

	dma_alloc_iova()
	if (dma_can_use_iova(...))
		dma_start_range(...)
		for (page in range)
			dma_link_range(...)
		dma_end_range(...)
	else
		/* Fallback to legacy map pages */
		dma_map_page(...)
Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h | 26 ++++++++++++++++
 kernel/dma/mapping.c        | 60 +++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 2c74e68b0567..bb541f8944e5 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -11,6 +11,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <...>
 
 /**
  * List of possible attributes associated with a DMA mapping. The semantics
@@ -82,6 +83,7 @@ struct dma_iova_state {
 	size_t size;
 	enum dma_data_direction dir;
 	u8 use_iova : 1;
+	size_t range_size;
 };
 
 static inline void dma_init_iova_state(struct dma_iova_state *state,
@@ -173,6 +175,11 @@ int dma_mmap_noncontiguous(struct device *dev, struct vm_area_struct *vma,
 void dma_set_iova_state(struct dma_iova_state *state, struct page *page,
 		size_t size);
 bool dma_can_use_iova(struct dma_iova_state *state);
+int dma_start_range(struct dma_iova_state *state);
+void dma_end_range(struct dma_iova_state *state);
+dma_addr_t dma_link_range_attrs(struct dma_iova_state *state, phys_addr_t phys,
+		size_t size, unsigned long attrs);
+void dma_unlink_range_attrs(struct dma_iova_state *state, unsigned long attrs);
 #else /* CONFIG_HAS_DMA */
 static inline int dma_alloc_iova_unaligned(struct dma_iova_state *state,
 					   phys_addr_t phys, size_t size)
@@ -319,6 +326,23 @@ static inline bool dma_can_use_iova(struct dma_iova_state *state)
 {
 	return false;
 }
+static inline int dma_start_range(struct dma_iova_state *state)
+{
+	return -EOPNOTSUPP;
+}
+static inline void dma_end_range(struct dma_iova_state *state)
+{
+}
+static inline dma_addr_t dma_link_range_attrs(struct dma_iova_state *state,
+					      phys_addr_t phys, size_t size,
+					      unsigned long attrs)
+{
+	return DMA_MAPPING_ERROR;
+}
+static inline void dma_unlink_range_attrs(struct dma_iova_state *state,
+					  unsigned long attrs)
+{
+}
 #endif /* CONFIG_HAS_DMA */
 
 #if defined(CONFIG_HAS_DMA) && defined(CONFIG_DMA_NEED_SYNC)
@@ -513,6 +537,8 @@ static inline void dma_sync_sgtable_for_device(struct device *dev,
 #define dma_unmap_page(d, a, s, r) dma_unmap_page_attrs(d, a, s, r, 0)
 #define dma_get_sgtable(d, t, v, h, s) dma_get_sgtable_attrs(d, t, v, h, s, 0)
 #define dma_mmap_coherent(d, v, c, h, s) dma_mmap_attrs(d, v, c, h, s, 0)
+#define dma_link_range(d, p, o) dma_link_range_attrs(d, p, o, 0)
+#define dma_unlink_range(d) dma_unlink_range_attrs(d, 0)
 
 bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size);
 
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 16cb03d5d87d..39fac8c21643 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -1024,3 +1024,63 @@ bool dma_can_use_iova(struct dma_iova_state *state)
 	return state->use_iova;
 }
 EXPORT_SYMBOL_GPL(dma_can_use_iova);
+
+/**
+ * dma_start_range - Start a range of IOVA space
+ * @state: IOVA state
+ *
+ * Start a range of IOVA space for the given IOVA state.
+ */
+int dma_start_range(struct dma_iova_state *state)
+{
+	if (!state->use_iova)
+		return 0;
+
+	return iommu_dma_start_range(state->dev);
+}
+EXPORT_SYMBOL_GPL(dma_start_range);
+
+/**
+ * dma_end_range - End a range of IOVA space
+ * @state: IOVA state
+ *
+ * End a range of IOVA space for the given IOVA state.
+ */
+void dma_end_range(struct dma_iova_state *state)
+{
+	if (!state->use_iova)
+		return;
+
+	iommu_dma_end_range(state->dev);
+}
+EXPORT_SYMBOL_GPL(dma_end_range);
+
+/**
+ * dma_link_range_attrs - Link a range of IOVA space
+ * @state: IOVA state
+ * @phys: physical address to link
+ * @size: size of the buffer
+ * @attrs: attributes of mapping properties
+ *
+ * Link a range of IOVA space for the given IOVA state.
+ */
+dma_addr_t dma_link_range_attrs(struct dma_iova_state *state, phys_addr_t phys,
+				size_t size, unsigned long attrs)
+{
+	return iommu_dma_link_range(state, phys, size, attrs);
+}
+EXPORT_SYMBOL_GPL(dma_link_range_attrs);
+
+/**
+ * dma_unlink_range_attrs - Unlink a range of IOVA space
+ * @state: IOVA state
+ * @attrs: attributes of mapping properties
+ *
+ * Unlink a range of IOVA space for the given IOVA state.
+ */
+void dma_unlink_range_attrs(struct dma_iova_state *state, unsigned long attrs)
+{
+	iommu_dma_unlink_range(state->dev, state->addr, state->range_size,
+			       state->dir, attrs);
+}
+EXPORT_SYMBOL_GPL(dma_unlink_range_attrs);
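Spelled out as a caller, the flow from the commit message looks roughly like
the sketch below; 'pages' and 'npages' are hypothetical arguments and the
error policy is illustrative only:

	/* Sketch: linking a contiguous array of pages into one IOVA range. */
	static int example_link_pages(struct dma_iova_state *state,
				      struct page **pages, unsigned long npages)
	{
		dma_addr_t addr = 0;
		unsigned long i;
		int ret;

		ret = dma_start_range(state);
		if (ret)
			return ret;

		for (i = 0; i < npages; i++) {
			addr = dma_link_range(state, page_to_phys(pages[i]),
					      PAGE_SIZE);
			if (dma_mapping_error(state->dev, addr))
				break;
		}

		dma_end_range(state);
		return dma_mapping_error(state->dev, addr) ? -ENOMEM : 0;
	}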
From patchwork Thu Sep 12 11:15:43 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801935
Subject: [RFC v2 08/21] mm/hmm: let users to tag specific PFN with DMA mapped bit
Date: Thu, 12 Sep 2024 14:15:43 +0300
Message-ID: <3c68ab13bcabe908c35388c66bf38a43f5d68c8b.1726138681.git.leon@kernel.org>
From: Leon Romanovsky

Introduce a new sticky flag (HMM_PFN_DMA_MAPPED) which isn't overwritten
by an HMM range fault. Such a flag allows users to tag specific PFNs with
the information that the PFN was already DMA mapped.

Signed-off-by: Leon Romanovsky
---
 include/linux/hmm.h |  4 ++++
 mm/hmm.c            | 34 +++++++++++++++++++++-------------
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 126a36571667..2999697db83a 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -23,6 +23,8 @@ struct mmu_interval_notifier;
 * HMM_PFN_WRITE - if the page memory can be written to (requires HMM_PFN_VALID)
 * HMM_PFN_ERROR - accessing the pfn is impossible and the device should
 *                 fail. ie poisoned memory, special pages, no vma, etc
+ * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
+ *                      to mark that the page is already DMA mapped
 *
 * On input:
 * 0                 - Return the current state of the page, do not fault it.
@@ -36,6 +38,8 @@ enum hmm_pfn_flags {
 	HMM_PFN_VALID = 1UL << (BITS_PER_LONG - 1),
 	HMM_PFN_WRITE = 1UL << (BITS_PER_LONG - 2),
 	HMM_PFN_ERROR = 1UL << (BITS_PER_LONG - 3),
+	/* Sticky flag, carried from input to output */
+	HMM_PFN_DMA_MAPPED = 1UL << (BITS_PER_LONG - 7),
 
 	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 8),
 
 	/* Input flags */
diff --git a/mm/hmm.c b/mm/hmm.c
index 7e0229ae4a5a..2a0c34d7cb2b 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -44,8 +44,10 @@ static int hmm_pfns_fill(unsigned long addr, unsigned long end,
 {
 	unsigned long i = (addr - range->start) >> PAGE_SHIFT;
 
-	for (; addr < end; addr += PAGE_SIZE, i++)
-		range->hmm_pfns[i] = cpu_flags;
+	for (; addr < end; addr += PAGE_SIZE, i++) {
+		range->hmm_pfns[i] &= HMM_PFN_DMA_MAPPED;
+		range->hmm_pfns[i] |= cpu_flags;
+	}
 	return 0;
 }
 
@@ -202,8 +204,10 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 		return hmm_vma_fault(addr, end, required_fault, walk);
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
-	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++)
-		hmm_pfns[i] = pfn | cpu_flags;
+	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
+		hmm_pfns[i] &= HMM_PFN_DMA_MAPPED;
+		hmm_pfns[i] |= pfn | cpu_flags;
+	}
 	return 0;
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -236,7 +240,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (required_fault)
 			goto fault;
-		*hmm_pfn = 0;
+		*hmm_pfn = *hmm_pfn & HMM_PFN_DMA_MAPPED;
 		return 0;
 	}
 
@@ -253,14 +257,14 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			cpu_flags = HMM_PFN_VALID;
 			if (is_writable_device_private_entry(entry))
 				cpu_flags |= HMM_PFN_WRITE;
-			*hmm_pfn = swp_offset_pfn(entry) | cpu_flags;
+			*hmm_pfn = (*hmm_pfn & HMM_PFN_DMA_MAPPED) | swp_offset_pfn(entry) | cpu_flags;
 			return 0;
 		}
 
 		required_fault =
 			hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0);
 		if (!required_fault) {
-			*hmm_pfn = 0;
+			*hmm_pfn = *hmm_pfn & HMM_PFN_DMA_MAPPED;
 			return 0;
 		}
 
@@ -304,11 +308,11 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			pte_unmap(ptep);
 			return -EFAULT;
 		}
 
-		*hmm_pfn = HMM_PFN_ERROR;
+		*hmm_pfn = (*hmm_pfn & HMM_PFN_DMA_MAPPED) | HMM_PFN_ERROR;
 		return 0;
 	}
 
-	*hmm_pfn = pte_pfn(pte) | cpu_flags;
+	*hmm_pfn = (*hmm_pfn & HMM_PFN_DMA_MAPPED) | pte_pfn(pte) | cpu_flags;
 	return 0;
 
 fault:
@@ -448,8 +452,10 @@ static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
 		}
 
 		pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
-		for (i = 0; i < npages; ++i, ++pfn)
-			hmm_pfns[i] = pfn | cpu_flags;
+		for (i = 0; i < npages; ++i, ++pfn) {
+			hmm_pfns[i] &= HMM_PFN_DMA_MAPPED;
+			hmm_pfns[i] |= pfn | cpu_flags;
+		}
 		goto out_unlock;
 	}
 
@@ -507,8 +513,10 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	}
 
 	pfn = pte_pfn(entry) + ((start & ~hmask) >> PAGE_SHIFT);
-	for (; addr < end; addr += PAGE_SIZE, i++, pfn++)
-		range->hmm_pfns[i] = pfn | cpu_flags;
+	for (; addr < end; addr += PAGE_SIZE, i++, pfn++) {
+		range->hmm_pfns[i] &= HMM_PFN_DMA_MAPPED;
+		range->hmm_pfns[i] |= pfn | cpu_flags;
+	}
 
 	spin_unlock(ptl);
 	return 0;
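The invariant the patch establishes is that every input-to-output transition
masks the old entry down to HMM_PFN_DMA_MAPPED before OR-ing in the new bits,
so the flag survives repeated hmm_range_fault() calls. A consumer-side sketch;
the walker function is hypothetical:

	/* Sketch: skip PFNs that were already DMA mapped on a previous fault. */
	static void example_walk(struct hmm_range *range, unsigned long npfns)
	{
		unsigned long i;

		for (i = 0; i < npfns; i++) {
			if (!(range->hmm_pfns[i] & HMM_PFN_VALID))
				continue;
			/* Sticky bit survived hmm_range_fault(). */
			if (range->hmm_pfns[i] & HMM_PFN_DMA_MAPPED)
				continue;

			/* ... DMA map the page here, then record the fact. */
			range->hmm_pfns[i] |= HMM_PFN_DMA_MAPPED;
		}
	}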
From patchwork Thu Sep 12 11:15:44 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801931
Subject: [RFC v2 09/21] dma-mapping: provide callbacks to link/unlink HMM PFNs to specific IOVA
Date: Thu, 12 Sep 2024 14:15:44 +0300
From: Leon Romanovsky

Introduce a new DMA link/unlink API to provide a way for HMM users to
link pages to an already preallocated IOVA.

Signed-off-by: Leon Romanovsky
---
 include/linux/dma-mapping.h |  15 ++++++
 kernel/dma/mapping.c        | 102 ++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index bb541f8944e5..8c2a468c5420 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -123,6 +123,10 @@ static inline int dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 int dma_alloc_iova_unaligned(struct dma_iova_state *state, phys_addr_t phys,
 			     size_t size);
 void dma_free_iova(struct dma_iova_state *state);
+dma_addr_t dma_hmm_link_page(struct dma_iova_state *state, unsigned long *pfn,
+			     dma_addr_t dma_offset);
+void dma_hmm_unlink_page(struct dma_iova_state *state, unsigned long *pfn,
+			 dma_addr_t dma_offset);
 
 dma_addr_t dma_map_page_attrs(struct device *dev, struct page *page,
 		size_t offset, size_t size, enum dma_data_direction dir,
@@ -189,6 +193,17 @@ static inline int dma_alloc_iova_unaligned(struct dma_iova_state *state,
 static inline void dma_free_iova(struct dma_iova_state *state)
 {
 }
+static inline dma_addr_t dma_hmm_link_page(struct dma_iova_state *state,
+					   unsigned long *pfn,
+					   dma_addr_t dma_offset)
+{
+	return DMA_MAPPING_ERROR;
+}
+static inline void dma_hmm_unlink_page(struct dma_iova_state *state,
+				       unsigned long *pfn,
+				       dma_addr_t dma_offset)
+{
+}
 static inline dma_addr_t dma_map_page_attrs(struct device *dev,
 		struct page *page, size_t offset, size_t size,
 		enum dma_data_direction dir, unsigned long attrs)
diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c
index 39fac8c21643..5354ddc3ac03 100644
--- a/kernel/dma/mapping.c
+++ b/kernel/dma/mapping.c
@@ -17,6 +17,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <...>
 #include "debug.h"
 #include "direct.h"
 
@@ -1084,3 +1085,104 @@ void dma_unlink_range_attrs(struct dma_iova_state *state, unsigned long attrs)
 			state->dir, attrs);
 }
 EXPORT_SYMBOL_GPL(dma_unlink_range_attrs);
+
+/**
+ * dma_hmm_link_page - Link a physical HMM page to a DMA address
+ * @state: IOVA state
+ * @pfn: HMM PFN
+ * @dma_offset: DMA offset from which this page needs to be linked
+ *
+ * dma_alloc_iova() allocates IOVA space based on the size specified by the
+ * caller in iova->size. Call this function after IOVA allocation to link a
+ * whole page and get its DMA address. Note that the very first call to this
+ * function will have @dma_offset set to 0 in the IOVA space allocated from
+ * dma_alloc_iova(). For subsequent calls to this function on the same @iova,
+ * @dma_offset needs to be advanced by the caller with the size of the
+ * previous page that was linked by this function.
+ */
+dma_addr_t dma_hmm_link_page(struct dma_iova_state *state, unsigned long *pfn,
+			     dma_addr_t dma_offset)
+{
+	struct device *dev = state->dev;
+	struct page *page = hmm_pfn_to_page(*pfn);
+	phys_addr_t phys = page_to_phys(page);
+	bool coherent = dev_is_dma_coherent(dev);
+	dma_addr_t addr;
+	int ret;
+
+	if (*pfn & HMM_PFN_DMA_MAPPED)
+		/*
+		 * We are in this flow when there is a need to resync flags,
+		 * for example when the page was already linked in a prefetch
+		 * call with the READ flag and now we need to add the WRITE
+		 * flag.
+		 *
+		 * This page was already programmed to HW and we don't
+		 * want/need to unlink and link it again just to resync flags.
+		 *
+		 * The DMA address calculation below is based on the fact
+		 * that HMM doesn't work with swiotlb.
+		 */
+		return (state->addr) ? state->addr + dma_offset :
+				       phys_to_dma(dev, phys);
+
+	state->range_size = dma_offset;
+
+	/*
+	 * The check below is based on the assumption that HMM range users
+	 * don't work with swiotlb and hence can be either in direct mode
+	 * or in IOMMU mode.
+	 */
+	if (!use_dma_iommu(dev)) {
+		if (!coherent)
+			arch_sync_dma_for_device(phys, PAGE_SIZE, state->dir);
+
+		addr = phys_to_dma(dev, phys);
+		goto done;
+	}
+
+	ret = dma_start_range(state);
+	if (ret)
+		return DMA_MAPPING_ERROR;
+
+	addr = dma_link_range(state, phys, PAGE_SIZE);
+	dma_end_range(state);
+	if (dma_mapping_error(state->dev, addr))
+		return addr;
+
+done:
+	kmsan_handle_dma(page, 0, PAGE_SIZE, state->dir);
+	*pfn |= HMM_PFN_DMA_MAPPED;
+	return addr;
+}
+EXPORT_SYMBOL_GPL(dma_hmm_link_page);
+
+/**
+ * dma_hmm_unlink_page - Unlink a physical HMM page from its DMA address
+ * @state: IOVA state
+ * @pfn: HMM PFN
+ * @dma_offset: DMA offset from which this page needs to be unlinked
+ *              from the IOVA space
+ */
+void dma_hmm_unlink_page(struct dma_iova_state *state, unsigned long *pfn,
+			 dma_addr_t dma_offset)
+{
+	struct device *dev = state->dev;
+	struct page *page;
+	phys_addr_t phys;
+
+	*pfn &= ~HMM_PFN_DMA_MAPPED;
+
+	if (!use_dma_iommu(dev)) {
+		page = hmm_pfn_to_page(*pfn);
+		phys = page_to_phys(page);
+
+		dma_direct_sync_single_for_cpu(dev, phys_to_dma(dev, phys),
+					       PAGE_SIZE, state->dir);
+		return;
+	}
+
+	iommu_dma_unlink_range(dev, state->addr + dma_offset, PAGE_SIZE,
+			       state->dir, 0);
+}
+EXPORT_SYMBOL_GPL(dma_hmm_unlink_page);
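The @dma_offset contract described in the kernel-doc above is easiest to see
in a loop: each PAGE_SIZE page lands at the next offset inside the
preallocated IOVA. A hedged sketch; pfns/npages are hypothetical and real
callers would also handle the resync-flags case:

	/* Sketch: linking a run of HMM PFNs at consecutive IOVA offsets. */
	static int example_link_hmm(struct dma_iova_state *state,
				    unsigned long *pfns, unsigned long npages)
	{
		dma_addr_t dma_offset = 0;
		dma_addr_t addr;
		unsigned long i;

		for (i = 0; i < npages; i++) {
			addr = dma_hmm_link_page(state, &pfns[i], dma_offset);
			if (dma_mapping_error(state->dev, addr))
				return -EFAULT;

			/* Advance by the size of the page just linked. */
			dma_offset += PAGE_SIZE;
		}
		return 0;
	}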
From patchwork Thu Sep 12 11:15:45 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801932
Subject: [RFC v2 10/21] RDMA/umem: Preallocate and cache IOVA for UMEM ODP
Date: Thu, 12 Sep 2024 14:15:45 +0300
From: Leon Romanovsky

As a preparation for providing a two-step interface to map pages,
preallocate IOVA when the UMEM is initialized.

Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/core/umem_odp.c | 13 ++++++++++++-
 include/rdma/ib_umem_odp.h         |  1 +
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index e9fa22d31c23..01cbf7f55b3a 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -50,6 +50,7 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 				   const struct mmu_interval_notifier_ops *ops)
 {
+	struct ib_device *dev = umem_odp->umem.ibdev;
 	int ret;
 
 	umem_odp->umem.is_odp = 1;
@@ -87,15 +88,24 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 			goto out_pfn_list;
 		}
 
+		dma_init_iova_state(&umem_odp->state, dev->dma_device,
+				    DMA_BIDIRECTIONAL);
+		ret = dma_alloc_iova(&umem_odp->state, end - start);
+		if (ret)
+			goto out_dma_list;
+
 		ret = mmu_interval_notifier_insert(&umem_odp->notifier,
 						   umem_odp->umem.owning_mm,
 						   start, end - start, ops);
 		if (ret)
-			goto out_dma_list;
+			goto out_free_iova;
 	}
 
 	return 0;
 
+out_free_iova:
+	dma_free_iova(&umem_odp->state);
 out_dma_list:
 	kvfree(umem_odp->dma_list);
 out_pfn_list:
@@ -274,6 +284,7 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 					    ib_umem_end(umem_odp));
 		mutex_unlock(&umem_odp->umem_mutex);
 		mmu_interval_notifier_remove(&umem_odp->notifier);
+		dma_free_iova(&umem_odp->state);
 		kvfree(umem_odp->dma_list);
 		kvfree(umem_odp->pfn_list);
 	}
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 0844c1d05ac6..c0c1215925eb 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -23,6 +23,7 @@ struct ib_umem_odp {
 	 * See ODP_READ_ALLOWED_BIT and ODP_WRITE_ALLOWED_BIT.
 	 */
 	dma_addr_t		*dma_list;
+	struct dma_iova_state state;
 	/*
 	 * The umem_mutex protects the page_list and dma_list fields of an ODP
 	 * umem, allowing only a single thread to map/unmap pages. The mutex
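One design note worth spelling out: the new out_free_iova label keeps the
usual reverse-order unwind, releasing the IOVA allocated just before the
notifier insertion fails. In miniature (the notifier stand-in is
hypothetical):

	/* Sketch of the acquire/unwind ordering used in ib_init_umem_odp(). */
	static int example_register_notifier(void)
	{
		return 0;	/* stand-in for mmu_interval_notifier_insert() */
	}

	static int example_init(struct dma_iova_state *state,
				struct device *dev, size_t len)
	{
		int ret;

		dma_init_iova_state(state, dev, DMA_BIDIRECTIONAL);
		ret = dma_alloc_iova(state, len);	/* acquire #1 */
		if (ret)
			return ret;

		ret = example_register_notifier();	/* acquire #2 */
		if (ret)
			goto out_free_iova;

		return 0;

	out_free_iova:
		dma_free_iova(state);			/* undo #1 */
		return ret;
	}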
From patchwork Thu Sep 12 11:15:46 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801933
Subject: [RFC v2 11/21] RDMA/umem: Store ODP access mask information in PFN
Date: Thu, 12 Sep 2024 14:15:46 +0300

From: Leon Romanovsky

As a preparation for the removal of dma_list, store the access mask in the
PFN pointer and not in dma_addr_t.
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/core/umem_odp.c   | 98 +++++++++++-----------------
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  1 +
 drivers/infiniband/hw/mlx5/odp.c     | 37 ++++++-----
 include/rdma/ib_umem_odp.h           | 14 +---
 4 files changed, 59 insertions(+), 91 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 01cbf7f55b3a..72885eca4181 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -307,22 +307,11 @@ EXPORT_SYMBOL(ib_umem_odp_release);
 static int ib_umem_odp_map_dma_single_page(
 		struct ib_umem_odp *umem_odp,
 		unsigned int dma_index,
-		struct page *page,
-		u64 access_mask)
+		struct page *page)
 {
 	struct ib_device *dev = umem_odp->umem.ibdev;
 	dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index];
 
-	if (*dma_addr) {
-		/*
-		 * If the page is already dma mapped it means it went through
-		 * a non-invalidating transition, like read-only to writable.
-		 * Resync the flags.
-		 */
-		*dma_addr = (*dma_addr & ODP_DMA_ADDR_MASK) | access_mask;
-		return 0;
-	}
-
 	*dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift,
 				    DMA_BIDIRECTIONAL);
 	if (ib_dma_mapping_error(dev, *dma_addr)) {
@@ -330,7 +319,6 @@ static int ib_umem_odp_map_dma_single_page(
 		return -EFAULT;
 	}
 	umem_odp->npages++;
-	*dma_addr |= access_mask;
 	return 0;
 }
 
@@ -366,9 +354,6 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 	struct hmm_range range = {};
 	unsigned long timeout;
 
-	if (access_mask == 0)
-		return -EINVAL;
-
 	if (user_virt < ib_umem_start(umem_odp) ||
 	    user_virt + bcnt > ib_umem_end(umem_odp))
 		return -EFAULT;
@@ -394,7 +379,7 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 	if (fault) {
 		range.default_flags = HMM_PFN_REQ_FAULT;
 
-		if (access_mask & ODP_WRITE_ALLOWED_BIT)
+		if (access_mask & HMM_PFN_WRITE)
 			range.default_flags |= HMM_PFN_REQ_WRITE;
 	}
 
@@ -426,22 +411,17 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 	for (pfn_index = 0; pfn_index < num_pfns;
 	     pfn_index += 1 << (page_shift - PAGE_SHIFT), dma_index++) {
 
-		if (fault) {
-			/*
-			 * Since we asked for hmm_range_fault() to populate
-			 * pages it shouldn't return an error entry on success.
-			 */
-			WARN_ON(range.hmm_pfns[pfn_index] & HMM_PFN_ERROR);
-			WARN_ON(!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID));
-		} else {
-			if (!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)) {
-				WARN_ON(umem_odp->dma_list[dma_index]);
-				continue;
-			}
-			access_mask = ODP_READ_ALLOWED_BIT;
-			if (range.hmm_pfns[pfn_index] & HMM_PFN_WRITE)
-				access_mask |= ODP_WRITE_ALLOWED_BIT;
-		}
+		/*
+		 * Since we asked for hmm_range_fault() to populate
+		 * pages it shouldn't return an error entry on success.
+		 */
+		WARN_ON(fault && range.hmm_pfns[pfn_index] & HMM_PFN_ERROR);
+		WARN_ON(fault && !(range.hmm_pfns[pfn_index] & HMM_PFN_VALID));
+		if (!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID))
+			continue;
+
+		if (range.hmm_pfns[pfn_index] & HMM_PFN_DMA_MAPPED)
+			continue;
 
 		hmm_order = hmm_pfn_to_map_order(range.hmm_pfns[pfn_index]);
 		/* If a hugepage was detected and ODP wasn't set for, the umem
@@ -456,13 +436,13 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt,
 		}
 
 		ret = ib_umem_odp_map_dma_single_page(
-				umem_odp, dma_index, hmm_pfn_to_page(range.hmm_pfns[pfn_index]),
-				access_mask);
+				umem_odp, dma_index, hmm_pfn_to_page(range.hmm_pfns[pfn_index]));
 		if (ret < 0) {
 			ibdev_dbg(umem_odp->umem.ibdev,
 				  "ib_umem_odp_map_dma_single_page failed with error %d\n", ret);
 			break;
 		}
+		range.hmm_pfns[pfn_index] |= HMM_PFN_DMA_MAPPED;
 	}
 	/* upon success lock should stay on hold for the callee */
 	if (!ret)
@@ -482,7 +462,6 @@ EXPORT_SYMBOL(ib_umem_odp_map_dma_and_lock);
 void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
 				 u64 bound)
 {
-	dma_addr_t dma_addr;
 	dma_addr_t dma;
 	int idx;
 	u64 addr;
@@ -493,34 +472,33 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
 	virt = max_t(u64, virt, ib_umem_start(umem_odp));
 	bound = min_t(u64, bound, ib_umem_end(umem_odp));
 	for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) {
+		unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> PAGE_SHIFT;
+		struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]);
+
 		idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift;
 		dma = umem_odp->dma_list[idx];
 
-		/* The access flags guaranteed a valid DMA address in case was NULL */
-		if (dma) {
-			unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> PAGE_SHIFT;
-			struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]);
-
-			dma_addr = dma & ODP_DMA_ADDR_MASK;
-			ib_dma_unmap_page(dev, dma_addr,
-					  BIT(umem_odp->page_shift),
-					  DMA_BIDIRECTIONAL);
-			if (dma & ODP_WRITE_ALLOWED_BIT) {
-				struct page *head_page = compound_head(page);
-				/*
-				 * set_page_dirty prefers being called with
-				 * the page lock. However, MMU notifiers are
-				 * called sometimes with and sometimes without
-				 * the lock. We rely on the umem_mutex instead
-				 * to prevent other mmu notifiers from
-				 * continuing and allowing the page mapping to
-				 * be removed.
-				 */
-				set_page_dirty(head_page);
-			}
-			umem_odp->dma_list[idx] = 0;
-			umem_odp->npages--;
+		if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_VALID))
+			continue;
+		if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_DMA_MAPPED))
+			continue;
+
+		ib_dma_unmap_page(dev, dma, BIT(umem_odp->page_shift),
+				  DMA_BIDIRECTIONAL);
+		if (umem_odp->pfn_list[pfn_idx] & HMM_PFN_WRITE) {
+			struct page *head_page = compound_head(page);
+			/*
+			 * set_page_dirty prefers being called with
+			 * the page lock. However, MMU notifiers are
+			 * called sometimes with and sometimes without
+			 * the lock. We rely on the umem_mutex instead
+			 * to prevent other mmu notifiers from
+			 * continuing and allowing the page mapping to
+			 * be removed.
+			 */
+			set_page_dirty(head_page);
 		}
+		umem_odp->npages--;
 	}
 }
 EXPORT_SYMBOL(ib_umem_odp_unmap_dma_pages);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index d5eb1b726675..8149b4c3d3db 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -347,6 +347,7 @@ struct mlx5_ib_flow_db {
 #define MLX5_IB_UPD_XLT_PD	      BIT(4)
 #define MLX5_IB_UPD_XLT_ACCESS	      BIT(5)
 #define MLX5_IB_UPD_XLT_INDIRECT      BIT(6)
+#define MLX5_IB_UPD_XLT_DOWNGRADE     BIT(7)
 
 /* Private QP creation flags to be passed in ib_qp_init_attr.create_flags.
 *
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index a524181f34df..4bf691fb266f 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -34,6 +34,7 @@
 #include <...>
 #include <...>
 #include <...>
+#include <...>
 
 #include "mlx5_ib.h"
 #include "cmd.h"
@@ -143,22 +144,12 @@ static void populate_klm(struct mlx5_klm *pklm, size_t idx, size_t nentries,
 	}
 }
 
-static u64 umem_dma_to_mtt(dma_addr_t umem_dma)
-{
-	u64 mtt_entry = umem_dma & ODP_DMA_ADDR_MASK;
-
-	if (umem_dma & ODP_READ_ALLOWED_BIT)
-		mtt_entry |= MLX5_IB_MTT_READ;
-	if (umem_dma & ODP_WRITE_ALLOWED_BIT)
-		mtt_entry |= MLX5_IB_MTT_WRITE;
-
-	return mtt_entry;
-}
-
 static void populate_mtt(__be64 *pas, size_t idx, size_t nentries,
 			 struct mlx5_ib_mr *mr, int flags)
 {
 	struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);
+	bool downgrade = flags & MLX5_IB_UPD_XLT_DOWNGRADE;
+	unsigned long pfn;
 	dma_addr_t pa;
 	size_t i;
 
@@ -166,8 +157,17 @@ static void populate_mtt(__be64 *pas, size_t idx, size_t nentries,
 		return;
 
 	for (i = 0; i < nentries; i++) {
+		pfn = odp->pfn_list[idx + i];
+		if (!(pfn & HMM_PFN_VALID))
+			/* Initial ODP init */
+			continue;
+
 		pa = odp->dma_list[idx + i];
-		pas[i] = cpu_to_be64(umem_dma_to_mtt(pa));
+		pa |= MLX5_IB_MTT_READ;
+		if ((pfn & HMM_PFN_WRITE) && !downgrade)
+			pa |= MLX5_IB_MTT_WRITE;
+
+		pas[i] = cpu_to_be64(pa);
 	}
 }
 
@@ -268,8 +268,7 @@ static bool mlx5_ib_invalidate_range(struct mmu_interval_notifier *mni,
 		 * estimate the cost of another UMR vs. the cost of bigger
 		 * UMR.
*/ - if (umem_odp->dma_list[idx] & - (ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT)) { + if (umem_odp->pfn_list[idx] & HMM_PFN_VALID) { if (!in_block) { blk_start_idx = idx; in_block = 1; } @@ -555,7 +554,7 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp, { int page_shift, ret, np; bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE; - u64 access_mask; + u64 access_mask = 0; u64 start_idx; bool fault = !(flags & MLX5_PF_FLAGS_SNAPSHOT); u32 xlt_flags = MLX5_IB_UPD_XLT_ATOMIC; @@ -563,12 +562,14 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp, if (flags & MLX5_PF_FLAGS_ENABLE) xlt_flags |= MLX5_IB_UPD_XLT_ENABLE; + if (flags & MLX5_PF_FLAGS_DOWNGRADE) + xlt_flags |= MLX5_IB_UPD_XLT_DOWNGRADE; + page_shift = odp->page_shift; start_idx = (user_va - ib_umem_start(odp)) >> page_shift; - access_mask = ODP_READ_ALLOWED_BIT; if (odp->umem.writable && !downgrade) - access_mask |= ODP_WRITE_ALLOWED_BIT; + access_mask |= HMM_PFN_WRITE; np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault); if (np < 0) diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index c0c1215925eb..f99911b478c4 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -8,6 +8,7 @@ #include #include +#include struct ib_umem_odp { struct ib_umem umem; @@ -68,19 +69,6 @@ static inline size_t ib_umem_odp_num_pages(struct ib_umem_odp *umem_odp) umem_odp->page_shift; } -/* - * The lower 2 bits of the DMA address signal the R/W permissions for - * the entry. To upgrade the permissions, provide the appropriate - * bitmask to the map_dma_pages function. - * - * Be aware that upgrading a mapped address might result in change of - * the DMA address for the page. - */ -#define ODP_READ_ALLOWED_BIT (1<<0ULL) -#define ODP_WRITE_ALLOWED_BIT (1<<1ULL) - -#define ODP_DMA_ADDR_MASK (~(ODP_READ_ALLOWED_BIT | ODP_WRITE_ALLOWED_BIT)) - #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING struct ib_umem_odp *
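For reference, a minimal sketch (not part of the patch) of how a pfn_list entry is interpreted once the private ODP_*_ALLOWED_BIT encoding is gone: presence and permissions come straight from the HMM pfn flags. HMM_PFN_VALID and HMM_PFN_WRITE are existing hmm.h flags, HMM_PFN_DMA_MAPPED is the flag this series adds, and MLX5_IB_MTT_READ/MLX5_IB_MTT_WRITE are the existing mlx5 MTT permission bits.

#include <linux/hmm.h>

/* An entry is usable only if it was faulted in and then DMA mapped. */
static bool odp_entry_mapped(unsigned long pfn)
{
	return (pfn & HMM_PFN_VALID) && (pfn & HMM_PFN_DMA_MAPPED);
}

/* Read access is implied for any valid entry; write access follows
 * HMM_PFN_WRITE unless the MR is being downgraded, mirroring the new
 * populate_mtt() above. */
static u64 odp_entry_mtt_access(unsigned long pfn, bool downgrade)
{
	u64 access = MLX5_IB_MTT_READ;

	if ((pfn & HMM_PFN_WRITE) && !downgrade)
		access |= MLX5_IB_MTT_WRITE;
	return access;
}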
From patchwork Thu Sep 12 11:15:47 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801934
From: Leon Romanovsky
Subject: [RFC v2 12/21] RDMA/core: Separate DMA mapping to caching IOVA and page linkage
Date: Thu, 12 Sep 2024 14:15:47 +0300
Message-ID: <32e9e95b05e49d95079dc7cbfd458b00b47b1c81.1726138681.git.leon@kernel.org>
From: Leon Romanovsky

Reuse the newly added DMA API to cache the IOVA and only link/unlink pages in the fast path.

Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 61 +++--------------------------- drivers/infiniband/hw/mlx5/odp.c | 7 +++- include/rdma/ib_umem_odp.h | 8 +--- kernel/dma/mapping.c | 7 +--- 4 files changed, 14 insertions(+), 69 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 72885eca4181..7bfa1e54454c 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -81,19 +81,12 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, if (!umem_odp->pfn_list) return -ENOMEM; - umem_odp->dma_list = kvcalloc( - ndmas, sizeof(*umem_odp->dma_list), GFP_KERNEL); - if (!umem_odp->dma_list) { - ret = -ENOMEM; - goto out_pfn_list; - } dma_init_iova_state(&umem_odp->state, dev->dma_device, DMA_BIDIRECTIONAL); ret = dma_alloc_iova(&umem_odp->state, end - start); if (ret) - goto out_dma_list; - + goto out_pfn_list; ret = mmu_interval_notifier_insert(&umem_odp->notifier, umem_odp->umem.owning_mm, @@ -106,8 +99,6 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, out_free_iova: dma_free_iova(&umem_odp->state); -out_dma_list: - kvfree(umem_odp->dma_list); out_pfn_list: kvfree(umem_odp->pfn_list); return ret; @@ -285,7 +276,6 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) mutex_unlock(&umem_odp->umem_mutex); mmu_interval_notifier_remove(&umem_odp->notifier); dma_free_iova(&umem_odp->state); - kvfree(umem_odp->dma_list); kvfree(umem_odp->pfn_list); } put_pid(umem_odp->tgid); @@ -293,40 +283,10 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp) } EXPORT_SYMBOL(ib_umem_odp_release); -/* - * Map for DMA and insert a single page into the on-demand paging page tables. - * - * @umem: the umem to insert the page to. - * @dma_index: index in the umem to add the dma to. - * @page: the page struct to map and add. - * @access_mask: access permissions needed for this page. - * - * The function returns -EFAULT if the DMA mapping operation fails.
- * - */ -static int ib_umem_odp_map_dma_single_page( - struct ib_umem_odp *umem_odp, - unsigned int dma_index, - struct page *page) -{ - struct ib_device *dev = umem_odp->umem.ibdev; - dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index]; - - *dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift, - DMA_BIDIRECTIONAL); - if (ib_dma_mapping_error(dev, *dma_addr)) { - *dma_addr = 0; - return -EFAULT; - } - umem_odp->npages++; - return 0; -} - /** * ib_umem_odp_map_dma_and_lock - DMA map userspace memory in an ODP MR and lock it. * * Maps the range passed in the argument to DMA addresses. - * The DMA addresses of the mapped pages is updated in umem_odp->dma_list. * Upon success the ODP MR will be locked to let caller complete its device * page table update. * @@ -434,15 +394,6 @@ int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, __func__, hmm_order, page_shift); break; } - - ret = ib_umem_odp_map_dma_single_page( - umem_odp, dma_index, hmm_pfn_to_page(range.hmm_pfns[pfn_index])); - if (ret < 0) { - ibdev_dbg(umem_odp->umem.ibdev, - "ib_umem_odp_map_dma_single_page failed with error %d\n", ret); - break; - } - range.hmm_pfns[pfn_index] |= HMM_PFN_DMA_MAPPED; } /* upon success lock should stay on hold for the callee */ if (!ret) @@ -462,10 +413,8 @@ EXPORT_SYMBOL(ib_umem_odp_map_dma_and_lock); void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, u64 bound) { - dma_addr_t dma; int idx; u64 addr; - struct ib_device *dev = umem_odp->umem.ibdev; lockdep_assert_held(&umem_odp->umem_mutex); @@ -473,19 +422,19 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, bound = min_t(u64, bound, ib_umem_end(umem_odp)); for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) { unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> PAGE_SHIFT; - struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift; - dma = umem_odp->dma_list[idx]; if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_VALID)) continue; if (!(umem_odp->pfn_list[pfn_idx] & HMM_PFN_DMA_MAPPED)) continue; - ib_dma_unmap_page(dev, dma, BIT(umem_odp->page_shift), - DMA_BIDIRECTIONAL); + dma_hmm_unlink_page(&umem_odp->state, + &umem_odp->pfn_list[pfn_idx], + idx * (1 << umem_odp->page_shift)); if (umem_odp->pfn_list[pfn_idx] & HMM_PFN_WRITE) { + struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); struct page *head_page = compound_head(page); /* * set_page_dirty prefers being called with diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c index 4bf691fb266f..f1fe2b941bb4 100644 --- a/drivers/infiniband/hw/mlx5/odp.c +++ b/drivers/infiniband/hw/mlx5/odp.c @@ -149,6 +149,7 @@ static void populate_mtt(__be64 *pas, size_t idx, size_t nentries, { struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem); bool downgrade = flags & MLX5_IB_UPD_XLT_DOWNGRADE; + struct ib_device *dev = odp->umem.ibdev; unsigned long pfn; dma_addr_t pa; size_t i; @@ -162,12 +163,16 @@ static void populate_mtt(__be64 *pas, size_t idx, size_t nentries, /* Initial ODP init */ continue; - pa = odp->dma_list[idx + i]; + pa = dma_hmm_link_page(&odp->state, &odp->pfn_list[idx + i], + (idx + i) * (1 << odp->page_shift)); + WARN_ON_ONCE(ib_dma_mapping_error(dev, pa)); + pa |= MLX5_IB_MTT_READ; if ((pfn & HMM_PFN_WRITE) && !downgrade) pa |= MLX5_IB_MTT_WRITE; pas[i] = cpu_to_be64(pa); + odp->npages++; } } diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h index 
f99911b478c4..cb081c69fd1a 100644 --- a/include/rdma/ib_umem_odp.h +++ b/include/rdma/ib_umem_odp.h @@ -18,15 +18,9 @@ struct ib_umem_odp { /* An array of the pfns included in the on-demand paging umem. */ unsigned long *pfn_list; - /* - * An array with DMA addresses mapped for pfns in pfn_list. - * The lower two bits designate access permissions. - * See ODP_READ_ALLOWED_BIT and ODP_WRITE_ALLOWED_BIT. - */ - dma_addr_t *dma_list; struct dma_iova_state state; /* - * The umem_mutex protects the page_list and dma_list fields of an ODP + * The umem_mutex protects the page_list field of an ODP * umem, allowing only a single thread to map/unmap pages. The mutex * also protects access to the mmu notifier counters. */ diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 5354ddc3ac03..38d7b3239dbb 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -1108,7 +1108,7 @@ dma_addr_t dma_hmm_link_page(struct dma_iova_state *state, unsigned long *pfn, struct page *page = hmm_pfn_to_page(*pfn); phys_addr_t phys = page_to_phys(page); bool coherent = dev_is_dma_coherent(dev); - dma_addr_t addr; + dma_addr_t addr = phys_to_dma(dev, phys); int ret; if (*pfn & HMM_PFN_DMA_MAPPED) @@ -1123,8 +1123,7 @@ dma_addr_t dma_hmm_link_page(struct dma_iova_state *state, unsigned long *pfn, * The DMA address calculation below is based on the fact that * HMM doesn't work with swiotlb. */ - return (state->addr) ? state->addr + dma_offset : - phys_to_dma(dev, phys); + return (state->addr) ? state->addr + dma_offset : addr; state->range_size = dma_offset; @@ -1136,8 +1135,6 @@ dma_addr_t dma_hmm_link_page(struct dma_iova_state *state, unsigned long *pfn, if (!use_dma_iommu(dev)) { if (!coherent) arch_sync_dma_for_device(phys, PAGE_SIZE, state->dir); - - addr = phys_to_dma(dev, phys); goto done; }
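To make the new split concrete, here is a rough usage sketch (not from the patch itself) of the API as this series uses it: dma_alloc_iova() runs once at MR creation, while page faults and invalidations only link and unlink individual pages inside the preallocated range.

/* Slow path, once per MR: reserve an IOVA range covering the umem. */
static int odp_reserve_iova(struct dma_iova_state *state,
			    struct device *dma_dev, size_t umem_len)
{
	dma_init_iova_state(state, dma_dev, DMA_BIDIRECTIONAL);
	return dma_alloc_iova(state, umem_len);
}

/* Fast path, per fault: no allocation, just link the page at its fixed
 * offset inside the range, as populate_mtt() does above. */
static dma_addr_t odp_link_one(struct dma_iova_state *state,
			       unsigned long *pfn_entry, size_t page_idx,
			       unsigned int page_shift)
{
	return dma_hmm_link_page(state, pfn_entry, page_idx << page_shift);
}

/* Invalidation: unlink the page but keep the IOVA range cached for the
 * next fault on the same index. */
static void odp_unlink_one(struct dma_iova_state *state,
			   unsigned long *pfn_entry, size_t page_idx,
			   unsigned int page_shift)
{
	dma_hmm_unlink_page(state, pfn_entry, page_idx << page_shift);
}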
From patchwork Thu Sep 12 11:15:48 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801941
From: Leon Romanovsky
Subject: [RFC v2 13/21] RDMA/umem: Prevent UMEM ODP creation with SWIOTLB
Date: Thu, 12 Sep 2024 14:15:48 +0300
From: Leon Romanovsky

RDMA UMEM never supported DMA addresses returned from SWIOTLB, as these addresses are programmed into hardware that has no idea it is accessing bounce buffers rather than the real pages. Instead of silently leaving a broken system for users who are unaware of this, be explicit and return an error to them.

Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/umem_odp.c | 78 +++++++++++++++--------------- drivers/iommu/dma-iommu.c | 1 + 2 files changed, 40 insertions(+), 39 deletions(-) diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c index 7bfa1e54454c..58fc3d4bfb73 100644 --- a/drivers/infiniband/core/umem_odp.c +++ b/drivers/infiniband/core/umem_odp.c @@ -42,7 +42,7 @@ #include #include #include - +#include #include #include "uverbs.h" @@ -51,49 +51,49 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp, const struct mmu_interval_notifier_ops *ops) { struct ib_device *dev = umem_odp->umem.ibdev; + size_t page_size = 1UL << umem_odp->page_shift; + unsigned long start, end; + size_t ndmas, npfns; int ret; umem_odp->umem.is_odp = 1; mutex_init(&umem_odp->umem_mutex); + if (umem_odp->is_implicit_odp) + return 0; + + if (!iommu_can_use_iova(dev->dma_device, NULL, page_size, + DMA_BIDIRECTIONAL)) + return -EOPNOTSUPP; + + start = ALIGN_DOWN(umem_odp->umem.address, page_size); + if (check_add_overflow(umem_odp->umem.address, + (unsigned long)umem_odp->umem.length, &end)) + return -EOVERFLOW; + end = ALIGN(end, page_size); + if (unlikely(end < page_size)) + return -EOVERFLOW; + + ndmas = (end - start) >> umem_odp->page_shift; + if (!ndmas) + return -EINVAL; + + npfns = (end - start) >> PAGE_SHIFT; + umem_odp->pfn_list = + kvcalloc(npfns, sizeof(*umem_odp->pfn_list), GFP_KERNEL); + if (!umem_odp->pfn_list) + return -ENOMEM; + + dma_init_iova_state(&umem_odp->state, dev->dma_device, + DMA_BIDIRECTIONAL); + ret = dma_alloc_iova(&umem_odp->state, end - start); + if (ret) + goto out_pfn_list; - if (!umem_odp->is_implicit_odp) { - size_t page_size = 1UL << umem_odp->page_shift; - unsigned long start; - unsigned long end; -
size_t ndmas, npfns; - - start = ALIGN_DOWN(umem_odp->umem.address, page_size); - if (check_add_overflow(umem_odp->umem.address, - (unsigned long)umem_odp->umem.length, - &end)) - return -EOVERFLOW; - end = ALIGN(end, page_size); - if (unlikely(end < page_size)) - return -EOVERFLOW; - - ndmas = (end - start) >> umem_odp->page_shift; - if (!ndmas) - return -EINVAL; - - npfns = (end - start) >> PAGE_SHIFT; - umem_odp->pfn_list = kvcalloc( - npfns, sizeof(*umem_odp->pfn_list), GFP_KERNEL); - if (!umem_odp->pfn_list) - return -ENOMEM; - - - dma_init_iova_state(&umem_odp->state, dev->dma_device, - DMA_BIDIRECTIONAL); - ret = dma_alloc_iova(&umem_odp->state, end - start); - if (ret) - goto out_pfn_list; - - ret = mmu_interval_notifier_insert(&umem_odp->notifier, - umem_odp->umem.owning_mm, - start, end - start, ops); - if (ret) - goto out_free_iova; - } + ret = mmu_interval_notifier_insert(&umem_odp->notifier, + umem_odp->umem.owning_mm, start, + end - start, ops); + if (ret) + goto out_free_iova; return 0; diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 3e2e382bb502..af3428ae150d 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1849,6 +1849,7 @@ bool iommu_can_use_iova(struct device *dev, struct page *page, size_t size, return true; } +EXPORT_SYMBOL_GPL(iommu_can_use_iova); void iommu_setup_dma_ops(struct device *dev) {
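Condensed into a sketch, the effect of the new guard added to ib_init_umem_odp() above: ODP MR creation now fails up front whenever the DMA layer cannot promise IOVA-based (non-bounce-buffer) mapping, instead of silently handing the device addresses it cannot use. iommu_can_use_iova() is the helper this patch exports; the wrapper name is illustrative only.

static int odp_check_no_bounce(struct device *dma_dev, size_t page_size)
{
	/* Would need swiotlb bounce buffers: refuse ODP registration. */
	if (!iommu_can_use_iova(dma_dev, NULL, page_size,
				DMA_BIDIRECTIONAL))
		return -EOPNOTSUPP;
	return 0;
}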
From patchwork Thu Sep 12 11:15:49 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801936
From: Leon Romanovsky
Subject: [RFC v2 14/21] vfio/mlx5: Explicitly use number of pages instead of allocated length
Date: Thu, 12 Sep 2024 14:15:49 +0300
Message-ID: <29dea17e8e4dbbd839f14d3b248f5f3d06d251fa.1726138681.git.leon@kernel.org>
From: Leon Romanovsky

allocated_length is always a multiple of the page size, i.e. a page count in disguise, so change the functions to accept a number of pages directly. This opens an avenue to combine the receive and send paths and improves code readability.

Signed-off-by: Leon Romanovsky --- drivers/vfio/pci/mlx5/cmd.c | 32 ++++++++++----------- drivers/vfio/pci/mlx5/cmd.h | 10 +++---- drivers/vfio/pci/mlx5/main.c | 56 +++++++++++++++++++++++------------- 3 files changed, 57 insertions(+), 41 deletions(-) diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index 41a4b0cf4297..fdc3e515741f 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -318,8 +318,7 @@ static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, struct mlx5_vhca_recv_buf *recv_buf, u32 *mkey) { - size_t npages = buf ? DIV_ROUND_UP(buf->allocated_length, PAGE_SIZE) : - recv_buf->npages; + size_t npages = buf ?
buf->npages : recv_buf->npages; int err = 0, inlen; __be64 *mtt; void *mkc; @@ -375,7 +374,7 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) if (mvdev->mdev_detach) return -ENOTCONN; - if (buf->dmaed || !buf->allocated_length) + if (buf->dmaed || !buf->npages) return -EINVAL; ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); @@ -444,7 +443,7 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf, if (ret) goto err; - buf->allocated_length += filled * PAGE_SIZE; + buf->npages += filled; /* clean input for another bulk allocation */ memset(page_list, 0, filled * sizeof(*page_list)); to_fill = min_t(unsigned int, to_alloc, @@ -460,8 +459,7 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf, } struct mlx5_vhca_data_buffer * -mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, +mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, enum dma_data_direction dma_dir) { struct mlx5_vhca_data_buffer *buf; @@ -473,9 +471,8 @@ mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, buf->dma_dir = dma_dir; buf->migf = migf; - if (length) { - ret = mlx5vf_add_migration_pages(buf, - DIV_ROUND_UP_ULL(length, PAGE_SIZE)); + if (npages) { + ret = mlx5vf_add_migration_pages(buf, npages); if (ret) goto end; @@ -501,8 +498,8 @@ void mlx5vf_put_data_buffer(struct mlx5_vhca_data_buffer *buf) } struct mlx5_vhca_data_buffer * -mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir) +mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir) { struct mlx5_vhca_data_buffer *buf, *temp_buf; struct list_head free_list; @@ -517,7 +514,7 @@ mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, list_for_each_entry_safe(buf, temp_buf, &migf->avail_list, buf_elm) { if (buf->dma_dir == dma_dir) { list_del_init(&buf->buf_elm); - if (buf->allocated_length >= length) { + if (buf->npages >= npages) { spin_unlock_irq(&migf->list_lock); goto found; } @@ -531,7 +528,7 @@ mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, } } spin_unlock_irq(&migf->list_lock); - buf = mlx5vf_alloc_data_buffer(migf, length, dma_dir); + buf = mlx5vf_alloc_data_buffer(migf, npages, dma_dir); found: while ((temp_buf = list_first_entry_or_null(&free_list, @@ -712,7 +709,7 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, MLX5_SET(save_vhca_state_in, in, op_mod, 0); MLX5_SET(save_vhca_state_in, in, vhca_id, mvdev->vhca_id); MLX5_SET(save_vhca_state_in, in, mkey, buf->mkey); - MLX5_SET(save_vhca_state_in, in, size, buf->allocated_length); + MLX5_SET(save_vhca_state_in, in, size, buf->npages * PAGE_SIZE); MLX5_SET(save_vhca_state_in, in, incremental, inc); MLX5_SET(save_vhca_state_in, in, set_track, track); @@ -734,8 +731,11 @@ int mlx5vf_cmd_save_vhca_state(struct mlx5vf_pci_core_device *mvdev, } if (!header_buf) { - header_buf = mlx5vf_get_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + header_buf = mlx5vf_get_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(header_buf)) { err = PTR_ERR(header_buf); goto err_free; diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index df421dc6de04..7d4a833b6900 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -56,7 +56,7 @@ struct mlx5_vhca_data_buffer { struct sg_append_table table; loff_t start_pos; u64 
length; - u64 allocated_length; + u32 npages; u32 mkey; enum dma_data_direction dma_dir; u8 dmaed:1; @@ -217,12 +217,12 @@ int mlx5vf_cmd_alloc_pd(struct mlx5_vf_migration_file *migf); void mlx5vf_cmd_dealloc_pd(struct mlx5_vf_migration_file *migf); void mlx5fv_cmd_clean_migf_resources(struct mlx5_vf_migration_file *migf); struct mlx5_vhca_data_buffer * -mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir); +mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir); void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf); struct mlx5_vhca_data_buffer * -mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, - size_t length, enum dma_data_direction dma_dir); +mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages, + enum dma_data_direction dma_dir); void mlx5vf_put_data_buffer(struct mlx5_vhca_data_buffer *buf); struct page *mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf, unsigned long offset); diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index 61d9b0f9146d..d899cd499e27 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -308,6 +308,7 @@ static struct mlx5_vhca_data_buffer * mlx5vf_mig_file_get_stop_copy_buf(struct mlx5_vf_migration_file *migf, u8 index, size_t required_length) { + u32 npages = DIV_ROUND_UP(required_length, PAGE_SIZE); struct mlx5_vhca_data_buffer *buf = migf->buf[index]; u8 chunk_num; @@ -315,12 +316,11 @@ mlx5vf_mig_file_get_stop_copy_buf(struct mlx5_vf_migration_file *migf, chunk_num = buf->stop_copy_chunk_num; buf->migf->buf[index] = NULL; /* Checking whether the pre-allocated buffer can fit */ - if (buf->allocated_length >= required_length) + if (buf->npages >= npages) return buf; mlx5vf_put_data_buffer(buf); - buf = mlx5vf_get_data_buffer(buf->migf, required_length, - DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer(buf->migf, npages, DMA_FROM_DEVICE); if (IS_ERR(buf)) return buf; @@ -373,7 +373,8 @@ static int mlx5vf_add_stop_copy_header(struct mlx5_vf_migration_file *migf, u8 *to_buff; int ret; - header_buf = mlx5vf_get_data_buffer(migf, size, DMA_NONE); + header_buf = mlx5vf_get_data_buffer(migf, DIV_ROUND_UP(size, PAGE_SIZE), + DMA_NONE); if (IS_ERR(header_buf)) return PTR_ERR(header_buf); @@ -388,7 +389,7 @@ static int mlx5vf_add_stop_copy_header(struct mlx5_vf_migration_file *migf, to_buff = kmap_local_page(page); memcpy(to_buff, &header, sizeof(header)); header_buf->length = sizeof(header); - data.stop_copy_size = cpu_to_le64(migf->buf[0]->allocated_length); + data.stop_copy_size = cpu_to_le64(migf->buf[0]->npages * PAGE_SIZE); memcpy(to_buff + sizeof(header), &data, sizeof(data)); header_buf->length += sizeof(data); kunmap_local(to_buff); @@ -437,15 +438,20 @@ static int mlx5vf_prep_stop_copy(struct mlx5vf_pci_core_device *mvdev, num_chunks = mvdev->chunk_mode ? 
MAX_NUM_CHUNKS : 1; for (i = 0; i < num_chunks; i++) { - buf = mlx5vf_get_data_buffer(migf, inc_state_size, DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer( + migf, DIV_ROUND_UP(inc_state_size, PAGE_SIZE), + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto err; } migf->buf[i] = buf; - buf = mlx5vf_get_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + buf = mlx5vf_get_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto err; @@ -553,7 +559,8 @@ static long mlx5vf_precopy_ioctl(struct file *filp, unsigned int cmd, * We finished transferring the current state and the device has a * dirty state, save a new state to be ready for. */ - buf = mlx5vf_get_data_buffer(migf, inc_length, DMA_FROM_DEVICE); + buf = mlx5vf_get_data_buffer(migf, DIV_ROUND_UP(inc_length, PAGE_SIZE), + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); mlx5vf_mark_err(migf); @@ -674,8 +681,8 @@ mlx5vf_pci_save_device_data(struct mlx5vf_pci_core_device *mvdev, bool track) if (track) { /* leave the allocated buffer ready for the stop-copy phase */ - buf = mlx5vf_alloc_data_buffer(migf, - migf->buf[0]->allocated_length, DMA_FROM_DEVICE); + buf = mlx5vf_alloc_data_buffer(migf, migf->buf[0]->npages, + DMA_FROM_DEVICE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto out_pd; @@ -918,11 +925,14 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, goto out_unlock; break; case MLX5_VF_LOAD_STATE_PREP_HEADER_DATA: - if (vhca_buf_header->allocated_length < migf->record_size) { + { + u32 npages = DIV_ROUND_UP(migf->record_size, PAGE_SIZE); + + if (vhca_buf_header->npages < npages) { mlx5vf_free_data_buffer(vhca_buf_header); - migf->buf_header[0] = mlx5vf_alloc_data_buffer(migf, - migf->record_size, DMA_NONE); + migf->buf_header[0] = mlx5vf_alloc_data_buffer( + migf, npages, DMA_NONE); if (IS_ERR(migf->buf_header[0])) { ret = PTR_ERR(migf->buf_header[0]); migf->buf_header[0] = NULL; @@ -935,6 +945,7 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, vhca_buf_header->start_pos = migf->max_pos; migf->load_state = MLX5_VF_LOAD_STATE_READ_HEADER_DATA; break; + } case MLX5_VF_LOAD_STATE_READ_HEADER_DATA: ret = mlx5vf_resume_read_header_data(migf, vhca_buf_header, &buf, &len, pos, &done); @@ -945,12 +956,13 @@ static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf, { u64 size = max(migf->record_size, migf->stop_copy_prep_size); + u32 npages = DIV_ROUND_UP(size, PAGE_SIZE); - if (vhca_buf->allocated_length < size) { + if (vhca_buf->npages < npages) { mlx5vf_free_data_buffer(vhca_buf); - migf->buf[0] = mlx5vf_alloc_data_buffer(migf, - size, DMA_TO_DEVICE); + migf->buf[0] = mlx5vf_alloc_data_buffer( + migf, npages, DMA_TO_DEVICE); if (IS_ERR(migf->buf[0])) { ret = PTR_ERR(migf->buf[0]); migf->buf[0] = NULL; @@ -1033,8 +1045,11 @@ mlx5vf_pci_resume_device_data(struct mlx5vf_pci_core_device *mvdev) } migf->buf[0] = buf; - buf = mlx5vf_alloc_data_buffer(migf, - sizeof(struct mlx5_vf_migration_header), DMA_NONE); + buf = mlx5vf_alloc_data_buffer( + migf, + DIV_ROUND_UP(sizeof(struct mlx5_vf_migration_header), + PAGE_SIZE), + DMA_NONE); if (IS_ERR(buf)) { ret = PTR_ERR(buf); goto out_buf; @@ -1151,7 +1166,8 @@ mlx5vf_pci_step_device_state_locked(struct mlx5vf_pci_core_device *mvdev, MLX5VF_QUERY_INC | MLX5VF_QUERY_CLEANUP); if (ret) return ERR_PTR(ret); - buf = mlx5vf_get_data_buffer(migf, size, DMA_FROM_DEVICE); + buf = 
mlx5vf_get_data_buffer(migf, + DIV_ROUND_UP(size, PAGE_SIZE), DMA_FROM_DEVICE); if (IS_ERR(buf)) return ERR_CAST(buf); /* pre_copy cleanup */
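The conversion rule this patch applies throughout, shown as a standalone sketch: byte lengths are rounded up to whole pages once at the API boundary, and buffer reuse compares page counts rather than byte lengths. The helper names are illustrative only.

/* Round a byte length up to whole pages, as the patch does with
 * DIV_ROUND_UP(length, PAGE_SIZE) at every call site. */
static u32 length_to_npages(u64 length)
{
	return DIV_ROUND_UP(length, PAGE_SIZE);
}

/* A preallocated buffer fits when it holds enough pages: e.g. a 10 KiB
 * record needs length_to_npages(10240) == 3 pages with 4 KiB pages, so
 * any buffer with npages >= 3 can be reused. */
static bool buf_fits(u32 buf_npages, u64 required_length)
{
	return buf_npages >= length_to_npages(required_length);
}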
From patchwork Thu Sep 12 11:15:50 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801937
From: Leon Romanovsky
Subject: [RFC v2 15/21] vfio/mlx5: Rewrite create mkey flow to allow better code reuse
Date: Thu, 12 Sep 2024 14:15:50 +0300
Message-ID: <22ea84d941b486127bd8bee0655abd2918035d81.1726138681.git.leon@kernel.org>

From: Leon Romanovsky

Change mkey creation to be performed in multiple steps: data allocation, DMA setup, and the actual call to HW to create that mkey. In this new flow, the whole input to the MKEY command is kept around, which eliminates the need for a separate array of DMA address pointers for the receive list, and in future patches for the send list too. In addition to reducing the memory footprint and eliminating unnecessary data movement when building the MKEY input, the code is prepared for future reuse.
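As an illustration, a sketch of the three-step flow using the helpers introduced in the diff below (alloc_mkey_in(), register_dma_pages(), create_mkey(), unregister_dma_pages()); it condenses what mlx5vf_alloc_qp_recv_resources() now does, and the wrapper name is illustrative only:

static int sketch_create_recv_mkey(struct mlx5_core_dev *mdev, u32 pdn,
				   u32 npages, struct page **pages,
				   u32 **mkey_in, u32 *mkey)
{
	int err;

	/* Step 1: allocate and pre-fill the whole MKEY command input. */
	*mkey_in = alloc_mkey_in(npages, pdn);
	if (!*mkey_in)
		return -ENOMEM;

	/* Step 2: DMA map the pages, writing the addresses straight into
	 * the command's MTT area - no separate dma_addrs[] array. */
	err = register_dma_pages(mdev, npages, pages, *mkey_in);
	if (err)
		goto err_free;

	/* Step 3: hand the prepared input to the device. */
	err = create_mkey(mdev, npages, NULL, *mkey_in, mkey);
	if (err)
		goto err_unmap;
	return 0;

err_unmap:
	unregister_dma_pages(mdev, npages, *mkey_in);
err_free:
	kvfree(*mkey_in);
	*mkey_in = NULL;
	return err;
}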
Signed-off-by: Leon Romanovsky --- drivers/vfio/pci/mlx5/cmd.c | 156 ++++++++++++++++++++---------------- drivers/vfio/pci/mlx5/cmd.h | 4 +- 2 files changed, 90 insertions(+), 70 deletions(-) diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c index fdc3e515741f..1832a6c1f35d 100644 --- a/drivers/vfio/pci/mlx5/cmd.c +++ b/drivers/vfio/pci/mlx5/cmd.c @@ -313,39 +313,21 @@ static int mlx5vf_cmd_get_vhca_id(struct mlx5_core_dev *mdev, u16 function_id, return ret; } -static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, - struct mlx5_vhca_data_buffer *buf, - struct mlx5_vhca_recv_buf *recv_buf, - u32 *mkey) +static u32 *alloc_mkey_in(u32 npages, u32 pdn) { - size_t npages = buf ? buf->npages : recv_buf->npages; - int err = 0, inlen; - __be64 *mtt; + int inlen; void *mkc; u32 *in; inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + - sizeof(*mtt) * round_up(npages, 2); + sizeof(__be64) * round_up(npages, 2); - in = kvzalloc(inlen, GFP_KERNEL); + in = kvzalloc(inlen, GFP_KERNEL_ACCOUNT); if (!in) - return -ENOMEM; + return NULL; MLX5_SET(create_mkey_in, in, translations_octword_actual_size, DIV_ROUND_UP(npages, 2)); - mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); - - if (buf) { - struct sg_dma_page_iter dma_iter; - - for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0) - *mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter)); - } else { - int i; - - for (i = 0; i < npages; i++) - *mtt++ = cpu_to_be64(recv_buf->dma_addrs[i]); - } mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); @@ -359,9 +341,29 @@ static int _create_mkey(struct mlx5_core_dev *mdev, u32 pdn, MLX5_SET(mkc, mkc, log_page_size, PAGE_SHIFT); MLX5_SET(mkc, mkc, translations_octword_size, DIV_ROUND_UP(npages, 2)); MLX5_SET64(mkc, mkc, len, npages * PAGE_SIZE); - err = mlx5_core_create_mkey(mdev, mkey, in, inlen); - kvfree(in); - return err; + + return in; +} + +static int create_mkey(struct mlx5_core_dev *mdev, u32 npages, + struct mlx5_vhca_data_buffer *buf, u32 *mkey_in, + u32 *mkey) +{ + __be64 *mtt; + int inlen; + + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); + if (buf) { + struct sg_dma_page_iter dma_iter; + + for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0) + *mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter)); + } + + inlen = MLX5_ST_SZ_BYTES(create_mkey_in) + + sizeof(__be64) * round_up(npages, 2); + + return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen); } static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) @@ -374,20 +376,28 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf) if (mvdev->mdev_detach) return -ENOTCONN; - if (buf->dmaed || !buf->npages) + if (buf->mkey_in || !buf->npages) return -EINVAL; ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); if (ret) return ret; - ret = _create_mkey(mdev, buf->migf->pdn, buf, NULL, &buf->mkey); - if (ret) + buf->mkey_in = alloc_mkey_in(buf->npages, buf->migf->pdn); + if (!buf->mkey_in) { + ret = -ENOMEM; goto err; + } - buf->dmaed = true; + ret = create_mkey(mdev, buf->npages, buf, buf->mkey_in, &buf->mkey); + if (ret) + goto err_create_mkey; return 0; + +err_create_mkey: + kvfree(buf->mkey_in); + buf->mkey_in = NULL; err: dma_unmap_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0); return ret; @@ -401,8 +411,9 @@ void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf) lockdep_assert_held(&migf->mvdev->state_mutex); WARN_ON(migf->mvdev->mdev_detach); - if 
(buf->dmaed) { + if (buf->mkey_in) { mlx5_core_destroy_mkey(migf->mvdev->mdev, buf->mkey); + kvfree(buf->mkey_in); dma_unmap_sgtable(migf->mvdev->mdev->device, &buf->table.sgt, buf->dma_dir, 0); } @@ -779,7 +790,7 @@ int mlx5vf_cmd_load_vhca_state(struct mlx5vf_pci_core_device *mvdev, if (mvdev->mdev_detach) return -ENOTCONN; - if (!buf->dmaed) { + if (!buf->mkey_in) { err = mlx5vf_dma_data_buffer(buf); if (err) return err; @@ -1380,56 +1391,54 @@ static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf, kvfree(recv_buf->page_list); return -ENOMEM; } +static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + u32 *mkey_in) +{ + dma_addr_t addr; + __be64 *mtt; + int i; + + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); + for (i = npages - 1; i >= 0; i--) { + addr = be64_to_cpu(mtt[i]); + dma_unmap_single(mdev->device, addr, PAGE_SIZE, + DMA_FROM_DEVICE); + } +} -static int register_dma_recv_pages(struct mlx5_core_dev *mdev, - struct mlx5_vhca_recv_buf *recv_buf) +static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages, + struct page **page_list, u32 *mkey_in) { - int i, j; + dma_addr_t addr; + __be64 *mtt; + int i; - recv_buf->dma_addrs = kvcalloc(recv_buf->npages, - sizeof(*recv_buf->dma_addrs), - GFP_KERNEL_ACCOUNT); - if (!recv_buf->dma_addrs) - return -ENOMEM; + mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt); - for (i = 0; i < recv_buf->npages; i++) { - recv_buf->dma_addrs[i] = dma_map_page(mdev->device, - recv_buf->page_list[i], - 0, PAGE_SIZE, - DMA_FROM_DEVICE); - if (dma_mapping_error(mdev->device, recv_buf->dma_addrs[i])) + for (i = 0; i < npages; i++) { + addr = dma_map_page(mdev->device, page_list[i], 0, PAGE_SIZE, + DMA_FROM_DEVICE); + if (dma_mapping_error(mdev->device, addr)) goto error; + + *mtt++ = cpu_to_be64(addr); } + return 0; error: - for (j = 0; j < i; j++) - dma_unmap_single(mdev->device, recv_buf->dma_addrs[j], - PAGE_SIZE, DMA_FROM_DEVICE); - - kvfree(recv_buf->dma_addrs); + unregister_dma_pages(mdev, i, mkey_in); return -ENOMEM; } -static void unregister_dma_recv_pages(struct mlx5_core_dev *mdev, - struct mlx5_vhca_recv_buf *recv_buf) -{ - int i; - - for (i = 0; i < recv_buf->npages; i++) - dma_unmap_single(mdev->device, recv_buf->dma_addrs[i], - PAGE_SIZE, DMA_FROM_DEVICE); - - kvfree(recv_buf->dma_addrs); -} - static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev, struct mlx5_vhca_qp *qp) { struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf; mlx5_core_destroy_mkey(mdev, recv_buf->mkey); - unregister_dma_recv_pages(mdev, recv_buf); + unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in); + kvfree(recv_buf->mkey_in); free_recv_pages(&qp->recv_buf); } @@ -1445,18 +1454,29 @@ static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev, if (err < 0) return err; - err = register_dma_recv_pages(mdev, recv_buf); - if (err) + recv_buf->mkey_in = alloc_mkey_in(npages, pdn); + if (!recv_buf->mkey_in) { + err = -ENOMEM; goto end; + } + + err = register_dma_pages(mdev, npages, recv_buf->page_list, + recv_buf->mkey_in); + if (err) + goto err_register_dma; - err = _create_mkey(mdev, pdn, NULL, recv_buf, &recv_buf->mkey); + err = create_mkey(mdev, npages, NULL, recv_buf->mkey_in, + &recv_buf->mkey); if (err) goto err_create_mkey; return 0; err_create_mkey: - unregister_dma_recv_pages(mdev, recv_buf); + unregister_dma_pages(mdev, npages, recv_buf->mkey_in); +err_register_dma: + kvfree(recv_buf->mkey_in); + recv_buf->mkey_in = NULL; end: free_recv_pages(recv_buf); return 
err; diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h index 7d4a833b6900..25dd6ff54591 100644 --- a/drivers/vfio/pci/mlx5/cmd.h +++ b/drivers/vfio/pci/mlx5/cmd.h @@ -58,8 +58,8 @@ struct mlx5_vhca_data_buffer { u64 length; u32 npages; u32 mkey; + u32 *mkey_in; enum dma_data_direction dma_dir; - u8 dmaed:1; u8 stop_copy_chunk_num; struct list_head buf_elm; struct mlx5_vf_migration_file *migf; @@ -133,8 +133,8 @@ struct mlx5_vhca_cq { struct mlx5_vhca_recv_buf { u32 npages; struct page **page_list; - dma_addr_t *dma_addrs; u32 next_rq_offset; + u32 *mkey_in; u32 mkey; };
From: Leon Romanovsky
Subject: [RFC v2 16/21] vfio/mlx5: Explicitly store page list
Date: Thu, 12 Sep 2024 14:15:51 +0300

From: Leon Romanovsky

As a preparation for removing the scatter-gather table and unifying the
receive and send lists, explicitly store the page list.
Signed-off-by: Leon Romanovsky
---
 drivers/vfio/pci/mlx5/cmd.c | 29 ++++++++++++-----------------
 drivers/vfio/pci/mlx5/cmd.h |  1 +
 2 files changed, 13 insertions(+), 17 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 1832a6c1f35d..34ae3e299a9e 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -422,6 +422,7 @@ void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf)
 	for_each_sgtable_page(&buf->table.sgt, &sg_iter, 0)
 		__free_page(sg_page_iter_page(&sg_iter));
 	sg_free_append_table(&buf->table);
+	kvfree(buf->page_list);
 	kfree(buf);
 }
 
@@ -434,39 +435,33 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf,
 	unsigned int to_fill;
 	int ret;
 
-	to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*page_list));
-	page_list = kvzalloc(to_fill * sizeof(*page_list), GFP_KERNEL_ACCOUNT);
+	to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*buf->page_list));
+	page_list = kvzalloc(to_fill * sizeof(*buf->page_list), GFP_KERNEL_ACCOUNT);
 	if (!page_list)
 		return -ENOMEM;
 
+	buf->page_list = page_list;
+
 	do {
 		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_fill,
-						page_list);
-		if (!filled) {
-			ret = -ENOMEM;
-			goto err;
-		}
+						buf->page_list + buf->npages);
+		if (!filled)
+			return -ENOMEM;
+
 		to_alloc -= filled;
 		ret = sg_alloc_append_table_from_pages(
-			&buf->table, page_list, filled, 0,
+			&buf->table, buf->page_list + buf->npages, filled, 0,
 			filled << PAGE_SHIFT, UINT_MAX, SG_MAX_SINGLE_ALLOC,
 			GFP_KERNEL_ACCOUNT);
 		if (ret)
-			goto err;
+			return ret;
 
 		buf->npages += filled;
-		/* clean input for another bulk allocation */
-		memset(page_list, 0, filled * sizeof(*page_list));
 		to_fill = min_t(unsigned int, to_alloc,
-				PAGE_SIZE / sizeof(*page_list));
+				PAGE_SIZE / sizeof(*buf->page_list));
 	} while (to_alloc > 0);
 
-	kvfree(page_list);
 	return 0;
-
-err:
-	kvfree(page_list);
-	return ret;
 }
 
 struct mlx5_vhca_data_buffer *
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 25dd6ff54591..5b764199db53 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -53,6 +53,7 @@ struct mlx5_vf_migration_header {
 };
 
 struct mlx5_vhca_data_buffer {
+	struct page **page_list;
 	struct sg_append_table table;
 	loff_t start_pos;
 	u64 length;
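
For illustration, the bookkeeping this patch adopts is easier to see in
isolation. The following is a minimal sketch of growing a flat page array
with alloc_pages_bulk_array(); the demo_* names are invented for the
example and are not part of the driver, and error unwinding is left to
the caller, which frees pages through the stored list.

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/slab.h>

struct demo_buf {
	struct page **page_list;	/* flat array, grows front to back */
	u32 npages;			/* slots filled so far */
};

static int demo_fill_pages(struct demo_buf *buf, unsigned int npages)
{
	unsigned int to_alloc = npages;
	unsigned long filled;

	buf->page_list = kvcalloc(npages, sizeof(*buf->page_list),
				  GFP_KERNEL_ACCOUNT);
	if (!buf->page_list)
		return -ENOMEM;

	do {
		/* Fills NULL slots starting right after the filled ones */
		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_alloc,
						buf->page_list + buf->npages);
		if (!filled)
			return -ENOMEM;
		buf->npages += filled;
		to_alloc -= filled;
	} while (to_alloc > 0);

	return 0;
}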
From patchwork Thu Sep 12 11:15:52 2024
From: Leon Romanovsky
Subject: [RFC v2 17/21] vfio/mlx5: Convert vfio to use DMA link API
Date: Thu, 12 Sep 2024 14:15:52 +0300
Message-ID: <6369f834edd1e1144fbe11fd4b3aed3f63e33ade.1726138681.git.leon@kernel.org>

From: Leon Romanovsky

Remove the intermediate scatter-gather table, as it is not needed when
the DMA link API is used. This conversion drastically reduces the memory
used to manage that table.

Signed-off-by: Leon Romanovsky
---
 drivers/vfio/pci/mlx5/cmd.c  | 211 ++++++++++++++++++-----------------
 drivers/vfio/pci/mlx5/cmd.h  |   8 +-
 drivers/vfio/pci/mlx5/main.c |  33 +-----
 3 files changed, 112 insertions(+), 140 deletions(-)

diff --git a/drivers/vfio/pci/mlx5/cmd.c b/drivers/vfio/pci/mlx5/cmd.c
index 34ae3e299a9e..aa2f1ec326c0 100644
--- a/drivers/vfio/pci/mlx5/cmd.c
+++ b/drivers/vfio/pci/mlx5/cmd.c
@@ -345,25 +345,78 @@ static u32 *alloc_mkey_in(u32 npages, u32 pdn)
 	return in;
 }
 
-static int create_mkey(struct mlx5_core_dev *mdev, u32 npages,
-		       struct mlx5_vhca_data_buffer *buf, u32 *mkey_in,
+static int create_mkey(struct mlx5_core_dev *mdev, u32 npages, u32 *mkey_in,
 		       u32 *mkey)
 {
+	int inlen = MLX5_ST_SZ_BYTES(create_mkey_in) +
+		    sizeof(__be64) * round_up(npages, 2);
+
+	return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen);
+}
+
+static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
+				 u32 *mkey_in, struct dma_iova_state *state)
+{
+	dma_addr_t addr;
 	__be64 *mtt;
-	int inlen;
+	int i;
 
-	mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
-	if (buf) {
-		struct sg_dma_page_iter dma_iter;
+	WARN_ON_ONCE(state->dir == DMA_NONE);
 
-		for_each_sgtable_dma_page(&buf->table.sgt, &dma_iter, 0)
-			*mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter));
+	if (state->use_iova) {
+		dma_unlink_range(state);
+	} else {
+		mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in,
+					     klm_pas_mtt);
+		for (i = npages - 1; i >= 0; i--) {
+			addr = be64_to_cpu(mtt[i]);
+			dma_unmap_page(state->dev, addr, PAGE_SIZE, state->dir);
+		}
 	}
+	dma_free_iova(state);
+}
 
-	inlen = MLX5_ST_SZ_BYTES(create_mkey_in) +
-		sizeof(__be64) * round_up(npages, 2);
+static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
+			      struct page **page_list, u32 *mkey_in,
+			      struct dma_iova_state *state)
+{
+	dma_addr_t addr;
+	__be64 *mtt;
+	int i, err;
 
-	return mlx5_core_create_mkey(mdev, mkey, mkey_in, inlen);
+	WARN_ON_ONCE(state->dir == DMA_NONE);
+
+	err = dma_alloc_iova(state, npages * PAGE_SIZE);
+	if (err)
+		return err;
+
+	dma_set_iova_state(state, page_list[0], PAGE_SIZE);
+
+	mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
+
+	err = dma_start_range(state);
+	if (err) {
+		dma_free_iova(state);
+		return err;
+	}
+	for (i = 0; i < npages; i++) {
+		if (state->use_iova)
+			addr = dma_link_range(state, page_to_phys(page_list[i]),
+					      PAGE_SIZE);
+		else
+			addr = dma_map_page(mdev->device, page_list[i], 0,
+					    PAGE_SIZE, state->dir);
+		err = dma_mapping_error(mdev->device, addr);
+		if (err)
+			goto error;
+		*mtt++ = cpu_to_be64(addr);
+	}
+	dma_end_range(state);
+
+	return 0;
+
+error:
+	unregister_dma_pages(mdev, i, mkey_in, state);
+	return err;
 }
 
 static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf)
@@ -379,50 +432,56 @@ static int mlx5vf_dma_data_buffer(struct mlx5_vhca_data_buffer *buf)
 	if (buf->mkey_in || !buf->npages)
 		return -EINVAL;
 
-	ret = dma_map_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0);
-	if (ret)
-		return ret;
-
 	buf->mkey_in = alloc_mkey_in(buf->npages, buf->migf->pdn);
-	if (!buf->mkey_in) {
-		ret = -ENOMEM;
-		goto err;
-	}
+	if (!buf->mkey_in)
+		return -ENOMEM;
 
-	ret = create_mkey(mdev, buf->npages, buf, buf->mkey_in, &buf->mkey);
+	ret = register_dma_pages(mdev, buf->npages, buf->page_list,
+				 buf->mkey_in, &buf->state);
+	if (ret)
+		goto err_register_dma;
+
+	ret = create_mkey(mdev, buf->npages, buf->mkey_in, &buf->mkey);
 	if (ret)
 		goto err_create_mkey;
 
 	return 0;
 
 err_create_mkey:
+	unregister_dma_pages(mdev, buf->npages, buf->mkey_in, &buf->state);
+err_register_dma:
 	kvfree(buf->mkey_in);
 	buf->mkey_in = NULL;
-err:
-	dma_unmap_sgtable(mdev->device, &buf->table.sgt, buf->dma_dir, 0);
 	return ret;
 }
 
+static void free_page_list(u32 npages, struct page **page_list)
+{
+	int i;
+
+	/* Undo alloc_pages_bulk_array() */
+	for (i = npages - 1; i >= 0; i--)
+		__free_page(page_list[i]);
+
+	kvfree(page_list);
+}
+
 void mlx5vf_free_data_buffer(struct mlx5_vhca_data_buffer *buf)
 {
-	struct mlx5_vf_migration_file *migf = buf->migf;
-	struct sg_page_iter sg_iter;
+	struct mlx5vf_pci_core_device *mvdev = buf->migf->mvdev;
+	struct mlx5_core_dev *mdev = mvdev->mdev;
 
-	lockdep_assert_held(&migf->mvdev->state_mutex);
-	WARN_ON(migf->mvdev->mdev_detach);
+	lockdep_assert_held(&mvdev->state_mutex);
+	WARN_ON(mvdev->mdev_detach);
 
 	if (buf->mkey_in) {
-		mlx5_core_destroy_mkey(migf->mvdev->mdev, buf->mkey);
+		mlx5_core_destroy_mkey(mdev, buf->mkey);
+		unregister_dma_pages(mdev, buf->npages, buf->mkey_in,
+				     &buf->state);
 		kvfree(buf->mkey_in);
-		dma_unmap_sgtable(migf->mvdev->mdev->device, &buf->table.sgt,
-				  buf->dma_dir, 0);
 	}
 
-	/* Undo alloc_pages_bulk_array() */
-	for_each_sgtable_page(&buf->table.sgt, &sg_iter, 0)
-		__free_page(sg_page_iter_page(&sg_iter));
-	sg_free_append_table(&buf->table);
-	kvfree(buf->page_list);
+	free_page_list(buf->npages, buf->page_list);
 	kfree(buf);
 }
 
@@ -433,7 +492,6 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf,
 	struct page **page_list;
 	unsigned long filled;
 	unsigned int to_fill;
-	int ret;
 
 	to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*buf->page_list));
 	page_list = kvzalloc(to_fill * sizeof(*buf->page_list), GFP_KERNEL_ACCOUNT);
@@ -443,22 +501,13 @@ static int mlx5vf_add_migration_pages(struct mlx5_vhca_data_buffer *buf,
 	buf->page_list = page_list;
 
 	do {
-		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_fill,
-						buf->page_list + buf->npages);
+		filled = alloc_pages_bulk_array(GFP_KERNEL_ACCOUNT, to_alloc,
+						buf->page_list + buf->npages);
 		if (!filled)
 			return -ENOMEM;
+
 		to_alloc -= filled;
-		ret = sg_alloc_append_table_from_pages(
-			&buf->table, buf->page_list + buf->npages, filled, 0,
-			filled << PAGE_SHIFT, UINT_MAX, SG_MAX_SINGLE_ALLOC,
-			GFP_KERNEL_ACCOUNT);
-		if (ret)
-			return ret;
-
 		buf->npages += filled;
-		to_fill = min_t(unsigned int, to_alloc,
-				PAGE_SIZE / sizeof(*buf->page_list));
 	} while (to_alloc > 0);
 
 	return 0;
@@ -468,6 +517,7 @@ struct mlx5_vhca_data_buffer *
 mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages,
 			 enum dma_data_direction dma_dir)
 {
+	struct mlx5_core_dev *mdev = migf->mvdev->mdev;
 	struct mlx5_vhca_data_buffer *buf;
 	int ret;
 
@@ -475,7 +525,7 @@ mlx5vf_alloc_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages,
 	if (!buf)
 		return ERR_PTR(-ENOMEM);
 
-	buf->dma_dir = dma_dir;
+	dma_init_iova_state(&buf->state, mdev->device, dma_dir);
 	buf->migf = migf;
 	if (npages) {
 		ret = mlx5vf_add_migration_pages(buf, npages);
@@ -518,7 +568,7 @@ mlx5vf_get_data_buffer(struct mlx5_vf_migration_file *migf, u32 npages,
 
 	spin_lock_irq(&migf->list_lock);
 	list_for_each_entry_safe(buf, temp_buf, &migf->avail_list, buf_elm) {
-		if (buf->dma_dir == dma_dir) {
+		if (buf->state.dir == dma_dir) {
 			list_del_init(&buf->buf_elm);
 			if (buf->npages >= npages) {
 				spin_unlock_irq(&migf->list_lock);
@@ -1340,17 +1390,6 @@ static void mlx5vf_destroy_qp(struct mlx5_core_dev *mdev,
 	kfree(qp);
 }
 
-static void free_recv_pages(struct mlx5_vhca_recv_buf *recv_buf)
-{
-	int i;
-
-	/* Undo alloc_pages_bulk_array() */
-	for (i = 0; i < recv_buf->npages; i++)
-		__free_page(recv_buf->page_list[i]);
-
-	kvfree(recv_buf->page_list);
-}
-
 static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf,
 			    unsigned int npages)
 {
@@ -1386,45 +1425,6 @@ static int alloc_recv_pages(struct mlx5_vhca_recv_buf *recv_buf,
 	kvfree(recv_buf->page_list);
 	return -ENOMEM;
 }
-static void unregister_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
-				 u32 *mkey_in)
-{
-	dma_addr_t addr;
-	__be64 *mtt;
-	int i;
-
-	mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
-	for (i = npages - 1; i >= 0; i--) {
-		addr = be64_to_cpu(mtt[i]);
-		dma_unmap_single(mdev->device, addr, PAGE_SIZE,
-				 DMA_FROM_DEVICE);
-	}
-}
-
-static int register_dma_pages(struct mlx5_core_dev *mdev, u32 npages,
-			      struct page **page_list, u32 *mkey_in)
-{
-	dma_addr_t addr;
-	__be64 *mtt;
-	int i;
-
-	mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, mkey_in, klm_pas_mtt);
-
-	for (i = 0; i < npages; i++) {
-		addr = dma_map_page(mdev->device, page_list[i], 0, PAGE_SIZE,
-				    DMA_FROM_DEVICE);
-		if (dma_mapping_error(mdev->device, addr))
-			goto error;
-
-		*mtt++ = cpu_to_be64(addr);
-	}
-
-	return 0;
-
-error:
-	unregister_dma_pages(mdev, i, mkey_in);
-	return -ENOMEM;
-}
 
 static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev,
 					  struct mlx5_vhca_qp *qp)
@@ -1432,9 +1432,10 @@ static void mlx5vf_free_qp_recv_resources(struct mlx5_core_dev *mdev,
 	struct mlx5_vhca_recv_buf *recv_buf = &qp->recv_buf;
 
 	mlx5_core_destroy_mkey(mdev, recv_buf->mkey);
-	unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in);
+	unregister_dma_pages(mdev, recv_buf->npages, recv_buf->mkey_in,
+			     &recv_buf->state);
 	kvfree(recv_buf->mkey_in);
-	free_recv_pages(&qp->recv_buf);
+	free_page_list(recv_buf->npages, recv_buf->page_list);
 }
 
 static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev,
@@ -1455,25 +1456,25 @@ static int mlx5vf_alloc_qp_recv_resources(struct mlx5_core_dev *mdev,
 		goto end;
 	}
 
+	recv_buf->state.dir = DMA_FROM_DEVICE;
 	err = register_dma_pages(mdev, npages, recv_buf->page_list,
-				 recv_buf->mkey_in);
+				 recv_buf->mkey_in, &recv_buf->state);
 	if (err)
 		goto err_register_dma;
 
-	err = create_mkey(mdev, npages, NULL, recv_buf->mkey_in,
-			  &recv_buf->mkey);
+	err = create_mkey(mdev, npages, recv_buf->mkey_in, &recv_buf->mkey);
 	if (err)
 		goto err_create_mkey;
 
 	return 0;
 
 err_create_mkey:
-	unregister_dma_pages(mdev, npages, recv_buf->mkey_in);
+	unregister_dma_pages(mdev, npages, recv_buf->mkey_in, &recv_buf->state);
 err_register_dma:
 	kvfree(recv_buf->mkey_in);
 	recv_buf->mkey_in = NULL;
 end:
-	free_recv_pages(recv_buf);
+	free_page_list(npages, recv_buf->page_list);
 	return err;
 }
 
diff --git a/drivers/vfio/pci/mlx5/cmd.h b/drivers/vfio/pci/mlx5/cmd.h
index 5b764199db53..8b0cd0ee11a0 100644
--- a/drivers/vfio/pci/mlx5/cmd.h
+++ b/drivers/vfio/pci/mlx5/cmd.h
@@ -54,20 +54,15 @@ struct mlx5_vf_migration_header {
 
 struct mlx5_vhca_data_buffer {
 	struct page **page_list;
-	struct sg_append_table table;
+	struct dma_iova_state state;
 	loff_t start_pos;
 	u64 length;
 	u32 npages;
 	u32 mkey;
 	u32 *mkey_in;
-	enum dma_data_direction dma_dir;
 	u8 stop_copy_chunk_num;
 	struct list_head buf_elm;
 	struct mlx5_vf_migration_file *migf;
-	/* Optimize mlx5vf_get_migration_page() for sequential access */
-	struct scatterlist *last_offset_sg;
-	unsigned int sg_last_entry;
-	unsigned long last_offset;
 };
 
 struct mlx5vf_async_data {
@@ -134,6 +129,7 @@ struct mlx5_vhca_cq {
 struct mlx5_vhca_recv_buf {
 	u32 npages;
 	struct page **page_list;
+	struct dma_iova_state state;
 	u32 next_rq_offset;
 	u32 *mkey_in;
 	u32 mkey;
diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c
index d899cd499e27..f395b526e0ef 100644
--- a/drivers/vfio/pci/mlx5/main.c
+++ b/drivers/vfio/pci/mlx5/main.c
@@ -34,35 +34,10 @@ static struct mlx5vf_pci_core_device *mlx5vf_drvdata(struct pci_dev *pdev)
 			    core_device);
 }
 
-struct page *
-mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf,
-			  unsigned long offset)
+struct page *mlx5vf_get_migration_page(struct mlx5_vhca_data_buffer *buf,
+				       unsigned long offset)
 {
-	unsigned long cur_offset = 0;
-	struct scatterlist *sg;
-	unsigned int i;
-
-	/* All accesses are sequential */
-	if (offset < buf->last_offset || !buf->last_offset_sg) {
-		buf->last_offset = 0;
-		buf->last_offset_sg = buf->table.sgt.sgl;
-		buf->sg_last_entry = 0;
-	}
-
-	cur_offset = buf->last_offset;
-
-	for_each_sg(buf->last_offset_sg, sg,
-		    buf->table.sgt.orig_nents - buf->sg_last_entry, i) {
-		if (offset < sg->length + cur_offset) {
-			buf->last_offset_sg = sg;
-			buf->sg_last_entry += i;
-			buf->last_offset = cur_offset;
-			return nth_page(sg_page(sg),
-					(offset - cur_offset) / PAGE_SIZE);
-		}
-		cur_offset += sg->length;
-	}
-	return NULL;
+	return buf->page_list[offset / PAGE_SIZE];
 }
 
 static void mlx5vf_disable_fd(struct mlx5_vf_migration_file *migf)
@@ -121,7 +96,7 @@ static void mlx5vf_buf_read_done(struct mlx5_vhca_data_buffer *vhca_buf)
 	struct mlx5_vf_migration_file *migf = vhca_buf->migf;
 
 	if (vhca_buf->stop_copy_chunk_num) {
-		bool is_header = vhca_buf->dma_dir == DMA_NONE;
+		bool is_header = vhca_buf->state.dir == DMA_NONE;
 		u8 chunk_num = vhca_buf->stop_copy_chunk_num;
 		size_t next_required_umem_size = 0;
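
To make the flow above easier to follow, here is a condensed sketch of
the call order the register/unregister pair implements. The dma_*
functions are the ones this series introduces in earlier patches
(signatures inferred from their use here); demo_map_pages() itself is
invented for the example, and its unwinding mirrors
unregister_dma_pages() above.

static int demo_map_pages(struct device *dev, enum dma_data_direction dir,
			  struct page **pages, u32 npages, dma_addr_t *out)
{
	struct dma_iova_state state = {};
	dma_addr_t addr;
	int i, err;

	dma_init_iova_state(&state, dev, dir);

	/* Try to grab one contiguous IOVA range for the whole buffer */
	err = dma_alloc_iova(&state, npages * PAGE_SIZE);
	if (err)
		return err;
	dma_set_iova_state(&state, pages[0], PAGE_SIZE);

	err = dma_start_range(&state);
	if (err) {
		dma_free_iova(&state);
		return err;
	}
	for (i = 0; i < npages; i++) {
		if (state.use_iova)	/* contiguous: link into the range */
			addr = dma_link_range(&state, page_to_phys(pages[i]),
					      PAGE_SIZE);
		else			/* fallback: classic per-page map */
			addr = dma_map_page(dev, pages[i], 0, PAGE_SIZE, dir);
		err = dma_mapping_error(dev, addr);
		if (err)
			goto err_unwind;
		out[i] = addr;		/* program into MTT/PRP/etc. */
	}
	dma_end_range(&state);
	return 0;

err_unwind:
	if (state.use_iova) {
		dma_unlink_range(&state);
	} else {
		while (--i >= 0)
			dma_unmap_page(dev, out[i], PAGE_SIZE, dir);
	}
	dma_free_iova(&state);
	return err;
}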
From patchwork Thu Sep 12 11:15:53 2024
From: Leon Romanovsky
Subject: [RFC v2 18/21] nvme-pci: remove optimizations for single DMA entry
Date: Thu, 12 Sep 2024 14:15:53 +0300
Message-ID: <875d92e2c453649e9d95080f27e631f196270008.1726138681.git.leon@kernel.org>

From: Leon Romanovsky

Future patches will remove SG table allocation from the NVMe PCI code,
which is what these single-DMA-entry fast paths were added to avoid. As
a preparation, remove them to unify the DMA mapping code.

Signed-off-by: Leon Romanovsky
---
 drivers/nvme/host/pci.c | 69 -----------------------------------------
 1 file changed, 69 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 6cd9395ba9ec..a9a66f184138 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -233,7 +233,6 @@ struct nvme_iod {
 	bool aborted;
 	s8 nr_allocations;	/* PRP list pool allocations. 0 means small pool in use */
-	unsigned int dma_len;	/* length of single DMA segment mapping */
 	dma_addr_t first_dma;
 	dma_addr_t meta_dma;
 	struct sg_table sgt;
@@ -541,12 +540,6 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 
-	if (iod->dma_len) {
-		dma_unmap_page(dev->dev, iod->first_dma, iod->dma_len,
-			       rq_dma_dir(req));
-		return;
-	}
-
 	WARN_ON_ONCE(!iod->sgt.nents);
 
 	dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0);
@@ -696,11 +689,6 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 	/* setting the transfer type as SGL */
 	cmd->flags = NVME_CMD_SGL_METABUF;
 
-	if (entries == 1) {
-		nvme_pci_sgl_set_data(&cmd->dptr.sgl, sg);
-		return BLK_STS_OK;
-	}
-
 	if (entries <= (256 / sizeof(struct nvme_sgl_desc))) {
 		pool = dev->prp_small_pool;
 		iod->nr_allocations = 0;
@@ -727,45 +715,6 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 	return BLK_STS_OK;
 }
 
-static blk_status_t nvme_setup_prp_simple(struct nvme_dev *dev,
-		struct request *req, struct nvme_rw_command *cmnd,
-		struct bio_vec *bv)
-{
-	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-	unsigned int offset = bv->bv_offset & (NVME_CTRL_PAGE_SIZE - 1);
-	unsigned int first_prp_len = NVME_CTRL_PAGE_SIZE - offset;
-
-	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
-	if (dma_mapping_error(dev->dev, iod->first_dma))
-		return BLK_STS_RESOURCE;
-	iod->dma_len = bv->bv_len;
-
-	cmnd->dptr.prp1 = cpu_to_le64(iod->first_dma);
-	if (bv->bv_len > first_prp_len)
-		cmnd->dptr.prp2 = cpu_to_le64(iod->first_dma + first_prp_len);
-	else
-		cmnd->dptr.prp2 = 0;
-	return BLK_STS_OK;
-}
-
-static blk_status_t nvme_setup_sgl_simple(struct nvme_dev *dev,
-		struct request *req, struct nvme_rw_command *cmnd,
-		struct bio_vec *bv)
-{
-	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-
-	iod->first_dma = dma_map_bvec(dev->dev, bv, rq_dma_dir(req), 0);
-	if (dma_mapping_error(dev->dev, iod->first_dma))
-		return BLK_STS_RESOURCE;
-	iod->dma_len = bv->bv_len;
-
-	cmnd->flags = NVME_CMD_SGL_METABUF;
-	cmnd->dptr.sgl.addr = cpu_to_le64(iod->first_dma);
-	cmnd->dptr.sgl.length = cpu_to_le32(iod->dma_len);
-	cmnd->dptr.sgl.type = NVME_SGL_FMT_DATA_DESC << 4;
-	return BLK_STS_OK;
-}
-
 static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 		struct nvme_command *cmnd)
 {
@@ -773,24 +722,6 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 	blk_status_t ret = BLK_STS_RESOURCE;
 	int rc;
 
-	if (blk_rq_nr_phys_segments(req) == 1) {
-		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
-		struct bio_vec bv = req_bvec(req);
-
-		if (!is_pci_p2pdma_page(bv.bv_page)) {
-			if ((bv.bv_offset & (NVME_CTRL_PAGE_SIZE - 1)) +
-			     bv.bv_len <= NVME_CTRL_PAGE_SIZE * 2)
-				return nvme_setup_prp_simple(dev, req,
-							     &cmnd->rw, &bv);
-
-			if (nvmeq->qid && sgl_threshold &&
-			    nvme_ctrl_sgl_supported(&dev->ctrl))
-				return nvme_setup_sgl_simple(dev, req,
-							     &cmnd->rw, &bv);
-		}
-	}
-
-	iod->dma_len = 0;
 	iod->sgt.sgl = mempool_alloc(dev->iod_mempool, GFP_ATOMIC);
 	if (!iod->sgt.sgl)
 		return BLK_STS_RESOURCE;
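
For context on what is being dropped: the removed nvme_setup_prp_simple()
covered the case where a request's single physical segment fits in at
most two controller pages, so prp1/prp2 could describe it without any
pool allocation. A worked example, assuming NVME_CTRL_PAGE_SIZE is 4096
(illustrative numbers, not from the patch):

/*
 *   bv_offset = 512, bv_len = 6144
 *   offset within ctrl page = 512 & 4095       = 512
 *   512 + 6144 = 6656 <= 2 * 4096              -> fast path applied
 *   prp1 = dma_addr
 *   first_prp_len = 4096 - 512                 = 3584
 *   bv_len (6144) > 3584                       -> prp2 = dma_addr + 3584
 *
 * After this patch, such a request goes through the common mapping path
 * instead.
 */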
From patchwork Thu Sep 12 11:15:54 2024
From: Leon Romanovsky
Subject: [RFC v2 19/21] nvme-pci: precalculate number of DMA entries for each command
Date: Thu, 12 Sep 2024 14:15:54 +0300
Message-ID: <8c5b0e5ab1716166fc93e76cb2d3e01ca9cf8769.1726138681.git.leon@kernel.org>

From: Leon Romanovsky

Calculate the number of DMA entries for each command in the request in
advance.

Signed-off-by: Leon Romanovsky
---
 drivers/nvme/host/pci.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index a9a66f184138..2b236b1d209e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -231,6 +231,7 @@ struct nvme_iod {
 	struct nvme_request req;
 	struct nvme_command cmd;
 	bool aborted;
+	u8 nr_dmas;
 	s8 nr_allocations;	/* PRP list pool allocations. 0 means small pool in use */
 	dma_addr_t first_dma;
@@ -766,6 +767,23 @@ static blk_status_t nvme_map_metadata(struct nvme_dev *dev, struct request *req,
 	return BLK_STS_OK;
 }
 
+static u8 nvme_calc_num_dmas(struct request *req)
+{
+	struct bio_vec bv;
+	u8 nr_dmas;
+
+	if (blk_rq_nr_phys_segments(req) == 0)
+		return 0;
+
+	nr_dmas = DIV_ROUND_UP(blk_rq_payload_bytes(req), NVME_CTRL_PAGE_SIZE);
+	bv = req_bvec(req);
+	if (bv.bv_offset && (bv.bv_offset + bv.bv_len) >= NVME_CTRL_PAGE_SIZE)
+		/* Accommodate for unaligned first page */
+		nr_dmas++;
+
+	return nr_dmas;
+}
+
 static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -779,6 +797,8 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 	if (ret)
 		return ret;
 
+	iod->nr_dmas = nvme_calc_num_dmas(req);
+
 	if (blk_rq_nr_phys_segments(req)) {
 		ret = nvme_map_data(dev, req, &iod->cmd);
 		if (ret)
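
A worked example of the calculation above, assuming NVME_CTRL_PAGE_SIZE
is 4096 (illustrative numbers, not from the patch):

/*
 *   payload = 8192 bytes, first bvec with bv_offset = 512, bv_len = 8192
 *   nr_dmas = DIV_ROUND_UP(8192, 4096)          = 2
 *   bv_offset != 0 && 512 + 8192 >= 4096        -> payload straddles an
 *                                                  extra controller page,
 *                                                  so nr_dmas = 3
 *
 * A page-aligned 8 KiB request (bv_offset = 0) keeps nr_dmas = 2.
 */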
From patchwork Thu Sep 12 11:15:55 2024
From: Leon Romanovsky
Subject: [RFC v2 20/21] nvme-pci: use new dma API
Date: Thu, 12 Sep 2024 14:15:55 +0300

From: Leon Romanovsky

This demonstrates how the new DMA API can fit into the NVMe driver and
replace the old DMA APIs. As this is an RFC, I expect more robust error
handling, optimizations, and in-depth testing for the final version once
we agree on the DMA API architecture.

Below is the performance comparison between the existing DMA API case
(sg_table) and the new one (dma_map). Once we have agreement on the new
DMA API design, I intend to collect similar profiling numbers for the
new DMA API.

sgl (sg_table + old dma API) vs no_sgl (iod_dma_map + new DMA API):

block size	IOPS (k), average of 3

4K
--------------------------------------------------------------
sg-list-fio-perf.bs-4k-1.fio:    68.6
sg-list-fio-perf.bs-4k-2.fio:    68      68.36
sg-list-fio-perf.bs-4k-3.fio:    68.5
no-sg-list-fio-perf.bs-4k-1.fio: 68.7
no-sg-list-fio-perf.bs-4k-2.fio: 68.5    68.43
no-sg-list-fio-perf.bs-4k-3.fio: 68.1
% Change default vs new DMA API = +0.0975%

8K
--------------------------------------------------------------
sg-list-fio-perf.bs-8k-1.fio:    67
sg-list-fio-perf.bs-8k-2.fio:    67.1    67.03
sg-list-fio-perf.bs-8k-3.fio:    67
no-sg-list-fio-perf.bs-8k-1.fio: 66.7
no-sg-list-fio-perf.bs-8k-2.fio: 66.7    66.7
no-sg-list-fio-perf.bs-8k-3.fio: 66.7
% Change default vs new DMA API = +0.4993%

16K
--------------------------------------------------------------
sg-list-fio-perf.bs-16k-1.fio:    63.8
sg-list-fio-perf.bs-16k-2.fio:    63.4   63.5
sg-list-fio-perf.bs-16k-3.fio:    63.3
no-sg-list-fio-perf.bs-16k-1.fio: 63.5
no-sg-list-fio-perf.bs-16k-2.fio: 63.4   63.33
no-sg-list-fio-perf.bs-16k-3.fio: 63.1
% Change default vs new DMA API = -0.2632%

32K
--------------------------------------------------------------
sg-list-fio-perf.bs-32k-1.fio:    59.3
sg-list-fio-perf.bs-32k-2.fio:    59.3   59.36
sg-list-fio-perf.bs-32k-3.fio:    59.5
no-sg-list-fio-perf.bs-32k-1.fio: 59.5
no-sg-list-fio-perf.bs-32k-2.fio: 59.6   59.43
no-sg-list-fio-perf.bs-32k-3.fio: 59.2
% Change default vs new DMA API = +0.1122%

64K
--------------------------------------------------------------
sg-list-fio-perf.bs-64k-1.fio:    53.7
sg-list-fio-perf.bs-64k-2.fio:    53.4   53.56
sg-list-fio-perf.bs-64k-3.fio:    53.6
no-sg-list-fio-perf.bs-64k-1.fio: 53.5
no-sg-list-fio-perf.bs-64k-2.fio: 53.8   53.63
no-sg-list-fio-perf.bs-64k-3.fio: 53.6
% Change default vs new DMA API = +0.1246%

128K
--------------------------------------------------------------
sg-list-fio-perf/bs-128k-1.fio:    48
sg-list-fio-perf/bs-128k-2.fio:    46.4  47.13
sg-list-fio-perf/bs-128k-3.fio:    47
no-sg-list-fio-perf/bs-128k-1.fio: 46.6
no-sg-list-fio-perf/bs-128k-2.fio: 47    46.9
no-sg-list-fio-perf/bs-128k-3.fio: 47.1
% Change default vs new DMA API = -0.495%

256K
--------------------------------------------------------------
sg-list-fio-perf/bs-256k-1.fio:    37
sg-list-fio-perf/bs-256k-2.fio:    41    39.93
sg-list-fio-perf/bs-256k-3.fio:    41.8
no-sg-list-fio-perf/bs-256k-1.fio: 37.5
no-sg-list-fio-perf/bs-256k-2.fio: 41.4  40.5
no-sg-list-fio-perf/bs-256k-3.fio: 42.6
% Change default vs new DMA API = +1.42%

512K
--------------------------------------------------------------
sg-list-fio-perf/bs-512k-1.fio:    28.5
sg-list-fio-perf/bs-512k-2.fio:    28.2  28.4
sg-list-fio-perf/bs-512k-3.fio:    28.5
no-sg-list-fio-perf/bs-512k-1.fio: 28.7
no-sg-list-fio-perf/bs-512k-2.fio: 28.6  28.7
no-sg-list-fio-perf/bs-512k-3.fio: 28.8
% Change default vs new DMA API = +1.06%
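
To summarize the mapping flow the diff below introduces in
nvme_map_data() (a condensed sketch, not compilable as-is; the dma_*
calls are the ones proposed by this series, and the field names follow
the diff):

/*
 *   dma_init_iova_state(&iod->state, dev->dev, rq_dma_dir(req));
 *   if (dma_can_use_iova(&iod->state)) {
 *           // one contiguous IOVA for the whole payload
 *           dma_alloc_iova_unaligned(...);
 *           rq_for_each_bvec(...)
 *                   dma_link_range(...);    // base kept in iod->dma.addr
 *   } else {
 *           // fallback: per-bvec dma_map_bvec(), kept in iod->map[]
 *   }
 *   dma_end_range(&iod->state);
 *   // PRPs/SGLs are then built from iod->dma.addr or iod->map[]
 */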
a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 2b236b1d209e..881cbf2c0cac 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -221,6 +221,12 @@ union nvme_descriptor { __le64 *prp_list; }; +/* TODO: move to common header */ +struct dma_entry { + dma_addr_t addr; + unsigned int len; +}; + /* * The nvme_iod describes the data in an I/O. * @@ -234,9 +240,11 @@ struct nvme_iod { u8 nr_dmas; s8 nr_allocations; /* PRP list pool allocations. 0 means small pool in use */ + struct dma_iova_state state; + struct dma_entry dma; + struct dma_entry *map; dma_addr_t first_dma; dma_addr_t meta_dma; - struct sg_table sgt; union nvme_descriptor list[NVME_MAX_NR_ALLOCATIONS]; }; @@ -540,10 +548,9 @@ static void nvme_free_prps(struct nvme_dev *dev, struct request *req) static void nvme_unmap_data(struct nvme_dev *dev, struct request *req) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - - WARN_ON_ONCE(!iod->sgt.nents); - - dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0); + struct req_iterator iter; + struct bio_vec bv; + int cnt = 0; if (iod->nr_allocations == 0) dma_pool_free(dev->prp_small_pool, iod->list[0].sg_list, @@ -553,20 +560,17 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req) iod->first_dma); else nvme_free_prps(dev, req); - mempool_free(iod->sgt.sgl, dev->iod_mempool); -} -static void nvme_print_sgl(struct scatterlist *sgl, int nents) -{ - int i; - struct scatterlist *sg; - - for_each_sg(sgl, sg, nents, i) { - dma_addr_t phys = sg_phys(sg); - pr_warn("sg[%d] phys_addr:%pad offset:%d length:%d " - "dma_address:%pad dma_length:%d\n", - i, &phys, sg->offset, sg->length, &sg_dma_address(sg), - sg_dma_len(sg)); + if (iod->map) { + rq_for_each_bvec(bv, req, iter) { + dma_unmap_page(dev->dev, iod->map[cnt].addr, + iod->map[cnt].len, rq_dma_dir(req)); + cnt++; + } + kfree(iod->map); + } else { + dma_unlink_range(&iod->state); + dma_free_iova(&iod->state); } } @@ -574,97 +578,63 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev, struct request *req, struct nvme_rw_command *cmnd) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - struct dma_pool *pool; - int length = blk_rq_payload_bytes(req); - struct scatterlist *sg = iod->sgt.sgl; - int dma_len = sg_dma_len(sg); - u64 dma_addr = sg_dma_address(sg); - int offset = dma_addr & (NVME_CTRL_PAGE_SIZE - 1); - __le64 *prp_list; - dma_addr_t prp_dma; - int nprps, i; - - length -= (NVME_CTRL_PAGE_SIZE - offset); - if (length <= 0) { - iod->first_dma = 0; - goto done; - } - - dma_len -= (NVME_CTRL_PAGE_SIZE - offset); - if (dma_len) { - dma_addr += (NVME_CTRL_PAGE_SIZE - offset); - } else { - sg = sg_next(sg); - dma_addr = sg_dma_address(sg); - dma_len = sg_dma_len(sg); - } + __le64 *prp_list = iod->list[0].prp_list; + int i = 0, idx = 0; + struct bio_vec bv; + struct req_iterator iter; + dma_addr_t offset = 0; - if (length <= NVME_CTRL_PAGE_SIZE) { - iod->first_dma = dma_addr; - goto done; + if (iod->nr_dmas <= 2) { + i = iod->nr_dmas; + /* We can use the inline PRP/SG list */ + goto set_addr; } - nprps = DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE); - if (nprps <= (256 / 8)) { - pool = dev->prp_small_pool; - iod->nr_allocations = 0; - } else { - pool = dev->prp_page_pool; - iod->nr_allocations = 1; - } + rq_for_each_bvec(bv, req, iter) { + dma_addr_t addr; - prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma); - if (!prp_list) { - iod->nr_allocations = -1; - return BLK_STS_RESOURCE; - } - iod->list[0].prp_list = prp_list; - iod->first_dma = prp_dma; - i = 0; - for (;;) { - if (i == 
NVME_CTRL_PAGE_SIZE >> 3) { - __le64 *old_prp_list = prp_list; - prp_list = dma_pool_alloc(pool, GFP_ATOMIC, &prp_dma); - if (!prp_list) - goto free_prps; - iod->list[iod->nr_allocations++].prp_list = prp_list; - prp_list[0] = old_prp_list[i - 1]; - old_prp_list[i - 1] = cpu_to_le64(prp_dma); - i = 1; + if (iod->map) + offset = 0; + + while (offset < bv.bv_len) { + if (iod->map) + addr = iod->map[i].addr; + else + addr = iod->dma.addr; + + prp_list[idx] = cpu_to_le64(addr + offset); + offset += NVME_CTRL_PAGE_SIZE; + idx++; } - prp_list[i++] = cpu_to_le64(dma_addr); - dma_len -= NVME_CTRL_PAGE_SIZE; - dma_addr += NVME_CTRL_PAGE_SIZE; - length -= NVME_CTRL_PAGE_SIZE; - if (length <= 0) - break; - if (dma_len > 0) - continue; - if (unlikely(dma_len < 0)) - goto bad_sgl; - sg = sg_next(sg); - dma_addr = sg_dma_address(sg); - dma_len = sg_dma_len(sg); - } -done: - cmnd->dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sgt.sgl)); - cmnd->dptr.prp2 = cpu_to_le64(iod->first_dma); + i++; + } + +set_addr: + if (iod->map) + cmnd->dptr.prp1 = cpu_to_le64(iod->map[0].addr); + else + cmnd->dptr.prp1 = cpu_to_le64(iod->dma.addr); + if (idx == 1 && i == 1) + cmnd->dptr.prp2 = 0; + else if (idx == 2 && i == 2) + if (iod->map) + cmnd->dptr.prp2 = + cpu_to_le64((iod->map[0].addr + NVME_CTRL_PAGE_SIZE) & + ~(NVME_CTRL_PAGE_SIZE - 1)); + else + cmnd->dptr.prp2 = + cpu_to_le64((iod->dma.addr + NVME_CTRL_PAGE_SIZE) & + ~(NVME_CTRL_PAGE_SIZE - 1)); + else + cmnd->dptr.prp2 = cpu_to_le64(iod->first_dma); return BLK_STS_OK; -free_prps: - nvme_free_prps(dev, req); - return BLK_STS_RESOURCE; -bad_sgl: - WARN(DO_ONCE(nvme_print_sgl, iod->sgt.sgl, iod->sgt.nents), - "Invalid SGL for payload:%d nents:%d\n", - blk_rq_payload_bytes(req), iod->sgt.nents); - return BLK_STS_IOERR; } -static void nvme_pci_sgl_set_data(struct nvme_sgl_desc *sge, - struct scatterlist *sg) +static void nvme_pci_sgl_set_data(struct nvme_sgl_desc *sge, dma_addr_t addr, + int len) { - sge->addr = cpu_to_le64(sg_dma_address(sg)); - sge->length = cpu_to_le32(sg_dma_len(sg)); + sge->addr = cpu_to_le64(addr); + sge->length = cpu_to_le32(len); sge->type = NVME_SGL_FMT_DATA_DESC << 4; } @@ -680,17 +650,77 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev, struct request *req, struct nvme_rw_command *cmd) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - struct dma_pool *pool; - struct nvme_sgl_desc *sg_list; - struct scatterlist *sg = iod->sgt.sgl; - unsigned int entries = iod->sgt.nents; - dma_addr_t sgl_dma; - int i = 0; + struct nvme_sgl_desc *sg_list = iod->list[0].sg_list; + struct bio_vec bv = req_bvec(req); + struct req_iterator iter; + int i = 0, idx = 0; + dma_addr_t offset = 0; /* setting the transfer type as SGL */ cmd->flags = NVME_CMD_SGL_METABUF; - if (entries <= (256 / sizeof(struct nvme_sgl_desc))) { + if (iod->nr_dmas <= 1) + /* We can use the inline PRP/SG list */ + goto set_addr; + + rq_for_each_bvec(bv, req, iter) { + dma_addr_t addr; + + if (iod->map) + offset = 0; + + while (offset < bv.bv_len) { + if (iod->map) + addr = iod->map[i].addr; + else + addr = iod->dma.addr; + + nvme_pci_sgl_set_data(&sg_list[idx], addr + offset, + bv.bv_len); + offset += NVME_CTRL_PAGE_SIZE; + idx++; + } + i++; + } + +set_addr: + nvme_pci_sgl_set_seg(&cmd->dptr.sgl, iod->first_dma, + blk_rq_nr_phys_segments(req)); + return BLK_STS_OK; +} + +static void nvme_pci_free_pool(struct nvme_dev *dev, struct request *req) +{ + struct nvme_iod *iod = blk_mq_rq_to_pdu(req); + + if (iod->nr_allocations == 0) + dma_pool_free(dev->prp_small_pool, 

@@ -680,17 +650,77 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 		struct request *req, struct nvme_rw_command *cmd)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-	struct dma_pool *pool;
-	struct nvme_sgl_desc *sg_list;
-	struct scatterlist *sg = iod->sgt.sgl;
-	unsigned int entries = iod->sgt.nents;
-	dma_addr_t sgl_dma;
-	int i = 0;
+	struct nvme_sgl_desc *sg_list = iod->list[0].sg_list;
+	struct bio_vec bv = req_bvec(req);
+	struct req_iterator iter;
+	int i = 0, idx = 0;
+	dma_addr_t offset = 0;
 
 	/* setting the transfer type as SGL */
 	cmd->flags = NVME_CMD_SGL_METABUF;
 
-	if (entries <= (256 / sizeof(struct nvme_sgl_desc))) {
+	if (iod->nr_dmas <= 1)
+		/* We can use the inline PRP/SG list */
+		goto set_addr;
+
+	rq_for_each_bvec(bv, req, iter) {
+		dma_addr_t addr;
+
+		if (iod->map)
+			offset = 0;
+
+		while (offset < bv.bv_len) {
+			if (iod->map)
+				addr = iod->map[i].addr;
+			else
+				addr = iod->dma.addr;
+
+			nvme_pci_sgl_set_data(&sg_list[idx], addr + offset,
+					      bv.bv_len);
+			offset += NVME_CTRL_PAGE_SIZE;
+			idx++;
+		}
+		i++;
+	}
+
+set_addr:
+	nvme_pci_sgl_set_seg(&cmd->dptr.sgl, iod->first_dma,
+			     blk_rq_nr_phys_segments(req));
+	return BLK_STS_OK;
+}
+
+static void nvme_pci_free_pool(struct nvme_dev *dev, struct request *req)
+{
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+	if (iod->nr_allocations == 0)
+		dma_pool_free(dev->prp_small_pool, iod->list[0].sg_list,
+			      iod->first_dma);
+	else if (iod->nr_allocations == 1)
+		dma_pool_free(dev->prp_page_pool, iod->list[0].sg_list,
+			      iod->first_dma);
+	else
+		nvme_free_prps(dev, req);
+}
+
+static blk_status_t nvme_pci_setup_pool(struct nvme_dev *dev,
+					struct request *req, bool is_sgl)
+{
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	struct dma_pool *pool;
+	size_t entry_sz;
+	dma_addr_t addr;
+	u8 entries;
+	void *list;
+
+	if (iod->nr_dmas <= 2)
+		/* Do nothing, we can use the inline PRP/SG list */
+		return BLK_STS_OK;
+
+	/* First DMA address goes to prp1 anyway */
+	entries = iod->nr_dmas - 1;
+	entry_sz = (is_sgl) ? sizeof(struct nvme_sgl_desc) : sizeof(__le64);
+	if (entries <= (256 / entry_sz)) {
 		pool = dev->prp_small_pool;
 		iod->nr_allocations = 0;
 	} else {
@@ -698,21 +728,20 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 		iod->nr_allocations = 1;
 	}
 
-	sg_list = dma_pool_alloc(pool, GFP_ATOMIC, &sgl_dma);
-	if (!sg_list) {
+	/* TBD: allocate multiple pools and chain them */
+	WARN_ON(entries > 512);
+
+	list = dma_pool_alloc(pool, GFP_ATOMIC, &addr);
+	if (!list) {
 		iod->nr_allocations = -1;
 		return BLK_STS_RESOURCE;
 	}
-	iod->list[0].sg_list = sg_list;
-	iod->first_dma = sgl_dma;
-
-	nvme_pci_sgl_set_seg(&cmd->dptr.sgl, sgl_dma, entries);
-	do {
-		nvme_pci_sgl_set_data(&sg_list[i++], sg);
-		sg = sg_next(sg);
-	} while (--entries > 0);
-
+	if (is_sgl)
+		iod->list[0].sg_list = list;
+	else
+		iod->list[0].prp_list = list;
+	iod->first_dma = addr;
 	return BLK_STS_OK;
 }
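
The sizing rule in nvme_pci_setup_pool() above is easy to restate: the first
DMA address always travels in prp1, so only nr_dmas - 1 descriptors need pool
space, at 8 bytes per PRP entry (__le64) or 16 bytes per SGL descriptor,
measured against the 256-byte small pool. A standalone sketch of that
decision, assuming those descriptor sizes (pick_small_pool is an illustrative
name):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_POOL_BYTES 256u	/* capacity of a dev->prp_small_pool entry */

/* True when the small pool suffices, mirroring the test above. */
static bool pick_small_pool(uint8_t nr_dmas, bool is_sgl)
{
	size_t entry_sz = is_sgl ? 16 : 8;	/* SGL descriptor vs. __le64 PRP */
	uint8_t entries = nr_dmas - 1;		/* first address lives in prp1 */

	return entries <= SMALL_POOL_BYTES / entry_sz;
}

So up to 32 PRP entries or 16 SGL descriptors beyond prp1 still fit the small
pool, and the WARN_ON(entries > 512) marks the single-page limit that stands
until list chaining is implemented (the TBD above).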

@@ -721,36 +750,84 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	blk_status_t ret = BLK_STS_RESOURCE;
-	int rc;
-
-	iod->sgt.sgl = mempool_alloc(dev->iod_mempool, GFP_ATOMIC);
-	if (!iod->sgt.sgl)
+	unsigned short n_segments = blk_rq_nr_phys_segments(req);
+	struct bio_vec bv = req_bvec(req);
+	struct req_iterator iter;
+	dma_addr_t dma_addr;
+	int rc, cnt = 0;
+	bool is_sgl;
+
+	dma_init_iova_state(&iod->state, dev->dev, rq_dma_dir(req));
+	dma_set_iova_state(&iod->state, bv.bv_page, bv.bv_len);
+
+	rc = dma_start_range(&iod->state);
+	if (rc)
 		return BLK_STS_RESOURCE;
-	sg_init_table(iod->sgt.sgl, blk_rq_nr_phys_segments(req));
-	iod->sgt.orig_nents = blk_rq_map_sg(req->q, req, iod->sgt.sgl);
-	if (!iod->sgt.orig_nents)
-		goto out_free_sg;
 
-	rc = dma_map_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req),
-			     DMA_ATTR_NO_WARN);
-	if (rc) {
-		if (rc == -EREMOTEIO)
-			ret = BLK_STS_TARGET;
-		goto out_free_sg;
+	iod->dma.len = 0;
+	iod->dma.addr = 0;
+
+	if (dma_can_use_iova(&iod->state)) {
+		iod->map = NULL;
+		rc = dma_alloc_iova_unaligned(&iod->state, bvec_phys(&bv),
+					      blk_rq_payload_bytes(req));
+		if (rc)
+			return BLK_STS_RESOURCE;
+
+		rq_for_each_bvec(bv, req, iter) {
+			dma_addr = dma_link_range(&iod->state, bvec_phys(&bv),
+						  bv.bv_len);
+			if (dma_mapping_error(dev->dev, dma_addr))
+				goto out_free;
+
+			if (!iod->dma.addr)
+				iod->dma.addr = dma_addr;
+		}
+		WARN_ON(blk_rq_payload_bytes(req) != iod->state.range_size);
+	} else {
+		iod->map = kmalloc_array(n_segments, sizeof(*iod->map),
+					 GFP_ATOMIC);
+		if (!iod->map)
+			return BLK_STS_RESOURCE;
+
+		rq_for_each_bvec(bv, req, iter) {
+			dma_addr = dma_map_bvec(dev->dev, &bv, rq_dma_dir(req), 0);
+			if (dma_mapping_error(dev->dev, dma_addr))
+				goto out_free;
+
+			iod->map[cnt].addr = dma_addr;
+			iod->map[cnt].len = bv.bv_len;
+			cnt++;
+		}
 	}
+	dma_end_range(&iod->state);
 
-	if (nvme_pci_use_sgls(dev, req, iod->sgt.nents))
+	is_sgl = nvme_pci_use_sgls(dev, req, n_segments);
+	ret = nvme_pci_setup_pool(dev, req, is_sgl);
+	if (ret != BLK_STS_OK)
+		goto out_free;
+
+	if (is_sgl)
 		ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw);
 	else
 		ret = nvme_pci_setup_prps(dev, req, &cmnd->rw);
 	if (ret != BLK_STS_OK)
-		goto out_unmap_sg;
+		goto out_free_pool;
+
 	return BLK_STS_OK;
-out_unmap_sg:
-	dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0);
-out_free_sg:
-	mempool_free(iod->sgt.sgl, dev->iod_mempool);
+out_free_pool:
+	nvme_pci_free_pool(dev, req);
+out_free:
+	if (iod->map) {
+		while (cnt--)
+			dma_unmap_page(dev->dev, iod->map[cnt].addr,
+				       iod->map[cnt].len, rq_dma_dir(req));
+		kfree(iod->map);
+	} else {
+		dma_unlink_range(&iod->state);
+		dma_free_iova(&iod->state);
+	}
 	return ret;
 }
 
@@ -791,7 +868,6 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 
 	iod->aborted = false;
 	iod->nr_allocations = -1;
-	iod->sgt.nents = 0;
 
 	ret = nvme_setup_cmd(req->q->queuedata, req);
 	if (ret)
 		return ret;
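
One note on the two paths in the rewritten nvme_map_data() before moving to
the final patch: the per-bvec fallback records each mapping in iod->map
precisely so the out_free label can unwind only the first cnt entries, while
the IOVA path tears down the whole range at once. A self-contained model of
the fallback bookkeeping, with stand-in map/unmap functions (map_segments,
fake_map and fake_unmap are hypothetical, not kernel API):

#include <stdint.h>
#include <stdlib.h>

struct dma_entry {		/* same shape as the patch's struct dma_entry */
	uint64_t addr;
	unsigned int len;
};

/* Stand-ins for dma_map_bvec()/dma_unmap_page(); always succeed here. */
static int fake_map(uint64_t phys, uint64_t *out) { *out = phys; return 0; }
static void fake_unmap(uint64_t addr, unsigned int len) { (void)addr; (void)len; }

/* Map n segments; on failure, unmap exactly the cnt already mapped. */
static int map_segments(const uint64_t *phys, const unsigned int *len,
			int n, struct dma_entry **out_map)
{
	struct dma_entry *map = calloc(n, sizeof(*map));
	int cnt;

	if (!map)
		return -1;
	for (cnt = 0; cnt < n; cnt++) {
		if (fake_map(phys[cnt], &map[cnt].addr)) {
			while (cnt--)	/* mirrors 'while (cnt--)' at out_free */
				fake_unmap(map[cnt].addr, map[cnt].len);
			free(map);
			return -1;
		}
		map[cnt].len = len[cnt];
	}
	*out_map = map;
	return 0;
}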
From patchwork Thu Sep 12 11:15:56 2024
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 13801943
From: Leon Romanovsky
Subject: [RFC v2 21/21] nvme-pci: don't allow mapping of bvecs with offset
Date: Thu, 12 Sep 2024 14:15:56 +0300
Message-ID: <63cdbb87e1b08464705fa343b65e561eb3abd5f9.1726138681.git.leon@kernel.org>
From: Leon Romanovsky

It is a hack, but it makes direct DMA work for now: reject any bvec with a
non-zero page offset on the per-bvec fallback path, so that every mapped
segment starts page-aligned.

Signed-off-by: Leon Romanovsky
---
 drivers/nvme/host/pci.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 881cbf2c0cac..1872fa91ac76 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -791,6 +791,9 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 			return BLK_STS_RESOURCE;
 
 		rq_for_each_bvec(bv, req, iter) {
+			if (bv.bv_offset != 0)
+				goto out_free;
+
 			dma_addr = dma_map_bvec(dev->dev, &bv, rq_dma_dir(req), 0);
 			if (dma_mapping_error(dev->dev, dma_addr))
 				goto out_free;
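
The offset check is narrower than it may look: NVMe requires every PRP entry
except the first of a command to point at a controller-page boundary, and the
rewritten PRP fill loop strides in NVME_CTRL_PAGE_SIZE steps from each
segment's base address, so a fallback segment starting at a non-zero offset
would emit unaligned entries. A small illustration of the failure mode (not
code from the series; bvec_ok_for_fallback is an invented name):

#include <stdbool.h>

#define NVME_CTRL_PAGE_SIZE 4096u

/*
 * A bvec at offset 512 would yield PRP entries base+512, base+4608, ...
 * Only the first PRP of a command may carry an offset, so the fallback
 * path simply refuses such bvecs for now.
 */
static bool bvec_ok_for_fallback(unsigned int bv_offset)
{
	return bv_offset == 0;
}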