From patchwork Fri Mar 14 10:10:19 2025
X-Patchwork-Submitter: Toke Høiland-Jørgensen
X-Patchwork-Id: 14016565
X-Patchwork-Delegate: kuba@kernel.org
From: Toke Høiland-Jørgensen
Date: Fri, 14 Mar 2025 11:10:19 +0100
Subject: [PATCH net-next 1/3] page_pool: Move pp_magic check into helper
 functions
Message-Id: <20250314-page-pool-track-dma-v1-1-c212e57a74c2@redhat.com>
References: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
In-Reply-To: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
To: "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, Saeed Mahameed,
 Leon Romanovsky, Tariq Toukan, Andrew Lunn, Eric Dumazet, Paolo Abeni,
 Ilias Apalodimas, Simon Horman, Andrew Morton, Mina Almasry, Yonglong Liu,
 Yunsheng Lin, Pavel Begunkov, Matthew Wilcox
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-mm@kvack.org, Toke Høiland-Jørgensen
X-Mailer: b4 0.14.2

Since we are about to stash some more information into the pp_magic
field, let's move the magic signature checks into a pair of helper
functions so it can be changed in one place.
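(Illustration, not part of the patch: a toy user-space sketch of the masked
signature test that the new page_pool_page_is_pp()/netmem_is_pp() helpers
centralise. The 0x40 signature value and the struct are stand-ins for the
kernel definitions, assumed here only for the demo.)

/* pp_magic_check.c -- toy model of the masked pp_magic signature test */
#include <stdbool.h>
#include <stdio.h>

#define PP_SIGNATURE   0x40UL            /* assumed stand-in value */
#define PP_MAGIC_MASK  (~0x3UL)          /* ignore bit 0 (compound head) and bit 1 (pfmemalloc) */

struct toy_page {
	unsigned long pp_magic;
};

static bool page_is_pp(const struct toy_page *page)
{
	/* A page_pool page carries the signature in pp_magic; the low two
	 * bits may be set for other reasons, so they are masked off.
	 */
	return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
}

int main(void)
{
	struct toy_page page = { .pp_magic = PP_SIGNATURE | 0x1UL }; /* bit 0 set elsewhere */

	printf("page_is_pp: %d\n", page_is_pp(&page)); /* still recognised: 1 */
	return 0;
}

Centralising the check matters because a later patch in this series changes
what PP_MAGIC_MASK has to mask out.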
Signed-off-by: Toke Høiland-Jørgensen
---
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c |  4 ++--
 include/net/page_pool/types.h                    | 18 ++++++++++++++++++
 mm/page_alloc.c                                  |  9 +++------
 net/core/netmem_priv.h                           |  5 +++++
 net/core/skbuff.c                                | 16 ++--------------
 net/core/xdp.c                                   |  4 ++--
 6 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 6f3094a479e1ec61854bb48a6a0c812167487173..70c6f0b2abb921778c98fbd428594ebd7986a302 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -706,8 +706,8 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
 		xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo);
 		page = xdpi.page.page;
 
-		/* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
-		 * as we know this is a page_pool page.
+		/* No need to check page_pool_page_is_pp() as we
+		 * know this is a page_pool page.
 		 */
 		page_pool_recycle_direct(page->pp, page);
 	} while (++n < num);
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 36eb57d73abc6cfc601e700ca08be20fb8281055..df0d3c1608929605224feb26173135ff37951ef8 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -54,6 +54,14 @@ struct pp_alloc_cache {
 	netmem_ref cache[PP_ALLOC_CACHE_SIZE];
 };
 
+/* Mask used for checking in page_pool_page_is_pp() below. page->pp_magic is
+ * OR'ed with PP_SIGNATURE after the allocation in order to preserve bit 0 for
+ * the head page of compound page and bit 1 for pfmemalloc page.
+ * page_is_pfmemalloc() is checked in __page_pool_put_page() to avoid recycling
+ * the pfmemalloc page.
+ */
+#define PP_MAGIC_MASK ~0x3UL
+
 /**
  * struct page_pool_params - page pool parameters
  * @fast:	params accessed frequently on hotpath
@@ -264,6 +272,11 @@ void page_pool_destroy(struct page_pool *pool);
 void page_pool_use_xdp_mem(struct page_pool *pool, void (*disconnect)(void *),
 			   const struct xdp_mem_info *mem);
 void page_pool_put_netmem_bulk(netmem_ref *data, u32 count);
+
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+	return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
+}
 #else
 static inline void page_pool_destroy(struct page_pool *pool)
 {
@@ -278,6 +291,11 @@ static inline void page_pool_use_xdp_mem(struct page_pool *pool,
 static inline void page_pool_put_netmem_bulk(netmem_ref *data, u32 count)
 {
 }
+
+static inline bool page_pool_page_is_pp(struct page *page)
+{
+	return false;
+}
 #endif
 
 void page_pool_put_unrefed_netmem(struct page_pool *pool, netmem_ref netmem,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 579789600a3c7bfb7b0d847d51af702a9d4b139a..0268b68935ceb27d9781e59a474a234a6a61ea74 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -55,6 +55,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "internal.h"
 #include "shuffle.h"
@@ -872,9 +873,7 @@ static inline bool page_expected_state(struct page *page,
 #ifdef CONFIG_MEMCG
 			page->memcg_data |
 #endif
-#ifdef CONFIG_PAGE_POOL
-			((page->pp_magic & ~0x3UL) == PP_SIGNATURE) |
-#endif
+			page_pool_page_is_pp(page) |
 			(page->flags & check_flags)))
 		return false;
 
@@ -901,10 +900,8 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
 	if (unlikely(page->memcg_data))
 		bad_reason = "page still charged to cgroup";
 #endif
-#ifdef CONFIG_PAGE_POOL
-	if (unlikely((page->pp_magic & ~0x3UL) == PP_SIGNATURE))
+	if (unlikely(page_pool_page_is_pp(page)))
 		bad_reason = "page_pool leak";
-#endif
 
 	return bad_reason;
 }
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index 7eadb8393e002fd1cc2cef8a313d2ea7df76f301..f33162fd281c23e109273ba09950c5d0a2829bc9 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -18,6 +18,11 @@ static inline void netmem_clear_pp_magic(netmem_ref netmem)
 	__netmem_clear_lsb(netmem)->pp_magic = 0;
 }
 
+static inline bool netmem_is_pp(netmem_ref netmem)
+{
+	return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
+}
+
 static inline void netmem_set_pp(netmem_ref netmem, struct page_pool *pool)
 {
 	__netmem_clear_lsb(netmem)->pp = pool;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index ab8acb737b93299f503e5c298b87e18edd59d555..a64d777488e403d5fdef83ae42ae9e4924c1a0dc 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -893,11 +893,6 @@ static void skb_clone_fraglist(struct sk_buff *skb)
 		skb_get(list);
 }
 
-static bool is_pp_netmem(netmem_ref netmem)
-{
-	return (netmem_get_pp_magic(netmem) & ~0x3UL) == PP_SIGNATURE;
-}
-
 int skb_pp_cow_data(struct page_pool *pool, struct sk_buff **pskb,
 		    unsigned int headroom)
 {
@@ -995,14 +990,7 @@ bool napi_pp_put_page(netmem_ref netmem)
 {
 	netmem = netmem_compound_head(netmem);
 
-	/* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation
-	 * in order to preserve any existing bits, such as bit 0 for the
-	 * head page of compound page and bit 1 for pfmemalloc page, so
-	 * mask those bits for freeing side when doing below checking,
-	 * and page_is_pfmemalloc() is checked in __page_pool_put_page()
-	 * to avoid recycling the pfmemalloc page.
-	 */
-	if (unlikely(!is_pp_netmem(netmem)))
+	if (unlikely(!netmem_is_pp(netmem)))
 		return false;
 
 	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, false);
@@ -1042,7 +1030,7 @@ static int skb_pp_frag_ref(struct sk_buff *skb)
 	for (i = 0; i < shinfo->nr_frags; i++) {
 		head_netmem = netmem_compound_head(shinfo->frags[i].netmem);
-		if (likely(is_pp_netmem(head_netmem)))
+		if (likely(netmem_is_pp(head_netmem)))
 			page_pool_ref_netmem(head_netmem);
 		else
 			page_ref_inc(netmem_to_page(head_netmem));
diff --git a/net/core/xdp.c b/net/core/xdp.c
index f86eedad586a77eb63a96a85aa6d068d3e94f077..0ba73943c6eed873b3d1c681b3b9a802b590f2d9 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -437,8 +437,8 @@ void __xdp_return(netmem_ref netmem, enum xdp_mem_type mem_type,
 	netmem = netmem_compound_head(netmem);
 	if (napi_direct && xdp_return_frame_no_direct())
 		napi_direct = false;
-	/* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE)
-	 * as mem->type knows this a page_pool page
+	/* No need to check netmem_is_pp() as mem->type knows this a
+	 * page_pool page
 	 */
 	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem,
 				  napi_direct);

From patchwork Fri Mar 14 10:10:20 2025
X-Patchwork-Submitter: Toke Høiland-Jørgensen
X-Patchwork-Id: 14016562
X-Patchwork-Delegate: kuba@kernel.org
From: Toke Høiland-Jørgensen
Date: Fri, 14 Mar 2025 11:10:20 +0100
Subject: [PATCH net-next 2/3] page_pool: Turn dma_sync and dma_sync_cpu
 fields into a bitmap
Message-Id: <20250314-page-pool-track-dma-v1-2-c212e57a74c2@redhat.com>
References: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
In-Reply-To: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
To: "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, Saeed Mahameed,
 Leon Romanovsky, Tariq Toukan, Andrew Lunn, Eric Dumazet, Paolo Abeni,
 Ilias Apalodimas, Simon Horman, Andrew Morton, Mina Almasry, Yonglong Liu,
 Yunsheng Lin, Pavel Begunkov, Matthew Wilcox
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-mm@kvack.org, Toke Høiland-Jørgensen
X-Mailer: b4 0.14.2

Change the single-bit booleans for dma_sync into an unsigned long with
BIT() definitions so that a subsequent patch can write them both with a
single WRITE_ONCE() on teardown. Also move the check for the sync_cpu
side into __page_pool_dma_sync_for_cpu() so it can be disabled for
non-netmem providers as well.
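(Illustration, not part of the patch: a toy model of the two dma_sync flags
living in one unsigned long, so that both can be cleared with a single store
at teardown. WRITE_ONCE()/READ_ONCE() are approximated with a volatile field;
the names mirror the patch, but the code is a sketch, not kernel code.)

/* bitmap_sync.c -- toy model of folding two sync booleans into one word */
#include <stdio.h>

#define BIT(n)          (1UL << (n))
#define PP_DMA_SYNC_DEV BIT(0)
#define PP_DMA_SYNC_CPU BIT(1)

struct toy_pool {
	volatile unsigned long dma_sync;	/* replaces two single-bit bools */
};

int main(void)
{
	struct toy_pool pool = { .dma_sync = PP_DMA_SYNC_CPU };

	pool.dma_sync |= PP_DMA_SYNC_DEV;	/* init with device syncing enabled */

	if (pool.dma_sync & PP_DMA_SYNC_DEV)
		printf("would sync for device\n");
	if (pool.dma_sync & PP_DMA_SYNC_CPU)
		printf("would sync for cpu\n");

	pool.dma_sync = 0;			/* teardown: both flags cleared in one store */
	printf("after teardown: %#lx\n", pool.dma_sync);
	return 0;
}

Folding the flags into one word is what lets the next patch disable both
syncs with a single WRITE_ONCE() when the pool is destroyed.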
Signed-off-by: Toke Høiland-Jørgensen
---
 include/net/page_pool/helpers.h | 6 +++---
 include/net/page_pool/types.h   | 8 ++++++--
 net/core/devmem.c               | 3 +--
 net/core/page_pool.c            | 9 +++++----
 4 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 582a3d00cbe2315edeb92850b6a42ab21e509e45..7ed32bde4b8944deb7fb22e291e95b8487be681a 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -443,6 +443,9 @@ static inline void __page_pool_dma_sync_for_cpu(const struct page_pool *pool,
 						const dma_addr_t dma_addr,
 						u32 offset, u32 dma_sync_size)
 {
+	if (!(READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_CPU))
+		return;
+
 	dma_sync_single_range_for_cpu(pool->p.dev, dma_addr,
 				      offset + pool->p.offset, dma_sync_size,
 				      page_pool_get_dma_dir(pool));
@@ -473,9 +476,6 @@ page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
 				  const netmem_ref netmem, u32 offset,
 				  u32 dma_sync_size)
 {
-	if (!pool->dma_sync_for_cpu)
-		return;
-
 	__page_pool_dma_sync_for_cpu(pool,
 				     page_pool_get_dma_addr_netmem(netmem),
 				     offset, dma_sync_size);
diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index df0d3c1608929605224feb26173135ff37951ef8..fbe34024b20061e8bcd1d4474f6ebfc70992f1eb 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -33,6 +33,10 @@
 #define PP_FLAG_ALL		(PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | \
 				 PP_FLAG_SYSTEM_POOL | PP_FLAG_ALLOW_UNREADABLE_NETMEM)
 
+/* bit values used in pp->dma_sync */
+#define PP_DMA_SYNC_DEV	BIT(0)
+#define PP_DMA_SYNC_CPU	BIT(1)
+
 /*
  * Fast allocation side cache array/stack
  *
@@ -175,12 +179,12 @@ struct page_pool {
 	bool has_init_callback:1;	/* slow::init_callback is set */
 	bool dma_map:1;			/* Perform DMA mapping */
-	bool dma_sync:1;		/* Perform DMA sync for device */
-	bool dma_sync_for_cpu:1;	/* Perform DMA sync for cpu */
 #ifdef CONFIG_PAGE_POOL_STATS
 	bool system:1;			/* This is a global percpu pool */
 #endif
 
+	unsigned long dma_sync;
+
 	__cacheline_group_begin_aligned(frag, PAGE_POOL_FRAG_GROUP_ALIGN);
 	long frag_users;
 	netmem_ref frag_page;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 7c6e0b5b6acb55f376ec725dfb71d1f70a4320c3..16e43752566feb510b3e47fbec2d8da0f26a6adc 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -337,8 +337,7 @@ int mp_dmabuf_devmem_init(struct page_pool *pool)
 	/* dma-buf dma addresses do not need and should not be used with
 	 * dma_sync_for_cpu/device. Force disable dma_sync.
 	 */
-	pool->dma_sync = false;
-	pool->dma_sync_for_cpu = false;
+	pool->dma_sync = 0;
 
 	if (pool->p.order != 0)
 		return -E2BIG;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index acef1fcd8ddcfd1853a6f2055c1f1820ab248e8d..d51ca4389dd62d8bc266a9a2b792838257173535 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -203,7 +203,7 @@ static int page_pool_init(struct page_pool *pool,
 	memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
 	pool->cpuid = cpuid;
-	pool->dma_sync_for_cpu = true;
+	pool->dma_sync = PP_DMA_SYNC_CPU;
 
 	/* Validate only known flags were used */
 	if (pool->slow.flags & ~PP_FLAG_ALL)
@@ -238,7 +238,7 @@ static int page_pool_init(struct page_pool *pool,
 		if (!pool->p.max_len)
 			return -EINVAL;
 
-		pool->dma_sync = true;
+		pool->dma_sync |= PP_DMA_SYNC_DEV;
 
 		/* pool->p.offset has to be set according to the address
 		 * offset used by the DMA engine to start copying rx data
@@ -291,7 +291,7 @@ static int page_pool_init(struct page_pool *pool,
 	}
 
 	if (pool->mp_ops) {
-		if (!pool->dma_map || !pool->dma_sync)
+		if (!pool->dma_map || !(pool->dma_sync & PP_DMA_SYNC_DEV))
 			return -EOPNOTSUPP;
 
 		if (WARN_ON(!is_kernel_rodata((unsigned long)pool->mp_ops))) {
@@ -466,7 +466,8 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
 			      netmem_ref netmem,
 			      u32 dma_sync_size)
 {
-	if (pool->dma_sync && dma_dev_need_sync(pool->p.dev))
+	if ((READ_ONCE(pool->dma_sync) & PP_DMA_SYNC_DEV) &&
+	    dma_dev_need_sync(pool->p.dev))
 		__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
 }

From patchwork Fri Mar 14 10:10:21 2025
X-Patchwork-Submitter: Toke Høiland-Jørgensen
X-Patchwork-Id: 14016563
X-Patchwork-Delegate: kuba@kernel.org
From: Toke Høiland-Jørgensen
Date: Fri, 14 Mar 2025 11:10:21 +0100
Subject: [PATCH net-next 3/3] page_pool: Track DMA-mapped pages and unmap
 them when destroying the pool
Message-Id: <20250314-page-pool-track-dma-v1-3-c212e57a74c2@redhat.com>
References: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
In-Reply-To: <20250314-page-pool-track-dma-v1-0-c212e57a74c2@redhat.com>
To: "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, Saeed Mahameed,
 Leon Romanovsky, Tariq Toukan, Andrew Lunn, Eric Dumazet, Paolo Abeni,
 Ilias Apalodimas, Simon Horman, Andrew Morton, Mina Almasry, Yonglong Liu,
 Yunsheng Lin, Pavel Begunkov, Matthew Wilcox
Cc: netdev@vger.kernel.org, bpf@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-mm@kvack.org, Toke Høiland-Jørgensen, Qiuling Ren, Yuying Ma
X-Mailer: b4 0.14.2

When enabling DMA mapping in page_pool, pages are kept DMA mapped until
they are released from the pool, to avoid the overhead of re-mapping the
pages every time they are used. This causes resource leaks and/or
crashes when there are pages still outstanding while the device is torn
down, because page_pool will attempt an unmap through a non-existent DMA
device on the subsequent page return.

To fix this, implement a simple tracking of outstanding DMA-mapped pages
in page pool using an xarray. This was first suggested by Mina[0], and
turns out to be fairly straightforward: We simply store pointers to
pages directly in the xarray with xa_alloc() when they are first DMA
mapped, and remove them from the array on unmap. Then, when a page pool
is torn down, it can simply walk the xarray and unmap all pages still
present there before returning, which also allows us to get rid of the
get/put_device() calls in page_pool. Using xa_cmpxchg(), no additional
synchronisation is needed, as a page will only ever be unmapped once.

To avoid having to walk the entire xarray on unmap to find the page
reference, we stash the ID assigned by xa_alloc() into the page
structure itself, using the upper bits of the pp_magic field. This
requires a couple of defines to avoid conflicting with the
POISON_POINTER_DELTA define, but this is all evaluated at compile-time,
so does not affect run-time performance. The bitmap calculations in this
patch give the following number of bits for different architectures:

- 24 bits on 32-bit architectures
- 21 bits on PPC64 (because of the definition of ILLEGAL_POINTER_VALUE)
- 32 bits on other 64-bit architectures

Since all the tracking is performed on DMA map/unmap, no additional code
is needed in the fast path, meaning the performance overhead of this
tracking is negligible. A micro-benchmark shows that the total overhead
of using xarray for this purpose is about 400 ns (39 cycles(tsc)
395.218 ns; sum for both map and unmap[1]). Since this cost is only paid
on DMA map and unmap, it seems like an acceptable cost to fix the late
unmap issue. Further optimisation can narrow the cases where this cost
is paid (for instance by eliding the tracking when DMA map/unmap is a
no-op).

The extra memory needed to track the pages is neatly encapsulated inside
xarray, which uses the 'struct xa_node' structure to track items. This
structure is 576 bytes long, with slots for 64 items, meaning that a
full node incurs only 9 bytes of overhead per slot it tracks (in
practice, it probably won't be this efficient, but in any case it should
be an acceptable overhead).
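(Illustration, not part of the patch: a small user-space re-computation of
the PP_DMA_INDEX_SHIFT/PP_DMA_INDEX_BITS arithmetic described above. The 0x40
signature base and the POISON_POINTER_DELTA values below are assumptions
chosen for the demo; with them the calculation reproduces the 24/21/32-bit
figures quoted in the commit message.)

/* dma_index_bits.c -- toy recomputation of the DMA-index bit budget */
#include <stdio.h>

static unsigned int my_fls(unsigned long long v)
{
	return 63 - __builtin_clzll(v);		/* index of highest set bit, like __fls() */
}

static unsigned int my_ffs(unsigned long long v)
{
	return __builtin_ctzll(v);		/* index of lowest set bit, like __ffs() */
}

static unsigned int dma_index_bits(unsigned int bits_per_long,
				   unsigned long long poison_delta)
{
	unsigned long long pp_signature = 0x40ULL + poison_delta;	/* assumed layout */
	unsigned int shift = 1 + my_fls(pp_signature - poison_delta);	/* PP_DMA_INDEX_SHIFT */
	unsigned int bits = bits_per_long - shift - 1;			/* _PP_DMA_INDEX_BITS */

	if (bits > 32)
		bits = 32;
	if (poison_delta) {		/* don't overlap the poison bits */
		unsigned int limit = my_ffs(poison_delta) - shift;

		if (bits > limit)
			bits = limit;
	}
	return bits;
}

int main(void)
{
	printf("32-bit, delta 0:                  %u bits\n", dma_index_bits(32, 0));
	printf("64-bit, delta 0x5deadbeef0000000: %u bits\n",
	       dma_index_bits(64, 0x5deadbeef0000000ULL));
	printf("64-bit, delta 0:                  %u bits\n", dma_index_bits(64, 0));
	return 0;
}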
[0] https://lore.kernel.org/all/CAHS8izPg7B5DwKfSuzz-iOop_YRbk3Sd6Y4rX7KBG9DcVJcyWg@mail.gmail.com/
[1] https://lore.kernel.org/r/ae07144c-9295-4c9d-a400-153bb689fe9e@huawei.com

Reported-by: Yonglong Liu
Closes: https://lore.kernel.org/r/8743264a-9700-4227-a556-5f931c720211@huawei.com
Fixes: ff7d6b27f894 ("page_pool: refurbish version of page_pool code")
Suggested-by: Mina Almasry
Reviewed-by: Mina Almasry
Reviewed-by: Jesper Dangaard Brouer
Tested-by: Jesper Dangaard Brouer
Tested-by: Qiuling Ren
Tested-by: Yuying Ma
Signed-off-by: Toke Høiland-Jørgensen
---
 include/net/page_pool/types.h | 36 +++++++++++++++++++---
 net/core/netmem_priv.h        | 28 ++++++++++++++++-
 net/core/page_pool.c          | 72 +++++++++++++++++++++++++++++++++++++------
 3 files changed, 121 insertions(+), 15 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index fbe34024b20061e8bcd1d4474f6ebfc70992f1eb..1e187489d392757c8dd2960870b0d875c1dde01b 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #define PP_FLAG_DMA_MAP		BIT(0) /* Should page_pool do the DMA
@@ -58,13 +59,38 @@ struct pp_alloc_cache {
 	netmem_ref cache[PP_ALLOC_CACHE_SIZE];
 };
 
+/*
+ * DMA mapping IDs
+ *
+ * When DMA-mapping a page, we allocate an ID (from an xarray) and stash this in
+ * the upper bits of page->pp_magic. The number of bits available here is
+ * constrained by the size of an unsigned long, and the definition of
+ * PP_SIGNATURE.
+ */
+#define PP_DMA_INDEX_SHIFT	(1 + __fls(PP_SIGNATURE - POISON_POINTER_DELTA))
+#define _PP_DMA_INDEX_BITS	MIN(32, BITS_PER_LONG - PP_DMA_INDEX_SHIFT - 1)
+
+/* PP_SIGNATURE includes POISON_POINTER_DELTA, so limit the size of the DMA
+ * index to not overlap with that if set
+ */
+#if POISON_POINTER_DELTA > 0
+#define PP_DMA_INDEX_BITS	MIN(_PP_DMA_INDEX_BITS, \
+				    __ffs(POISON_POINTER_DELTA) - PP_DMA_INDEX_SHIFT)
+#else
+#define PP_DMA_INDEX_BITS	_PP_DMA_INDEX_BITS
+#endif
+
+#define PP_DMA_INDEX_MASK	GENMASK(PP_DMA_INDEX_BITS + PP_DMA_INDEX_SHIFT - 1, \
+					PP_DMA_INDEX_SHIFT)
+#define PP_DMA_INDEX_LIMIT	XA_LIMIT(1, BIT(PP_DMA_INDEX_BITS) - 1)
+
 /* Mask used for checking in page_pool_page_is_pp() below. page->pp_magic is
  * OR'ed with PP_SIGNATURE after the allocation in order to preserve bit 0 for
- * the head page of compound page and bit 1 for pfmemalloc page.
- * page_is_pfmemalloc() is checked in __page_pool_put_page() to avoid recycling
- * the pfmemalloc page.
+ * the head page of compound page and bit 1 for pfmemalloc page, as well as the
+ * bits used for the DMA index. page_is_pfmemalloc() is checked in
+ * __page_pool_put_page() to avoid recycling the pfmemalloc page.
  */
-#define PP_MAGIC_MASK ~0x3UL
+#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
 
 /**
  * struct page_pool_params - page pool parameters
@@ -233,6 +259,8 @@ struct page_pool {
 	void *mp_priv;
 	const struct memory_provider_ops *mp_ops;
 
+	struct xarray dma_mapped;
+
 #ifdef CONFIG_PAGE_POOL_STATS
 	/* recycle stats are per-cpu to avoid locking */
 	struct page_pool_recycle_stats __percpu *recycle_stats;
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index f33162fd281c23e109273ba09950c5d0a2829bc9..cd95394399b40c3604934ba7898eeeeacb8aee99 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -5,7 +5,7 @@
 static inline unsigned long netmem_get_pp_magic(netmem_ref netmem)
 {
-	return __netmem_clear_lsb(netmem)->pp_magic;
+	return __netmem_clear_lsb(netmem)->pp_magic & ~PP_DMA_INDEX_MASK;
 }
 
 static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
@@ -15,6 +15,8 @@ static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
 
 static inline void netmem_clear_pp_magic(netmem_ref netmem)
 {
+	WARN_ON_ONCE(__netmem_clear_lsb(netmem)->pp_magic & PP_DMA_INDEX_MASK);
+
 	__netmem_clear_lsb(netmem)->pp_magic = 0;
 }
 
@@ -33,4 +35,28 @@ static inline void netmem_set_dma_addr(netmem_ref netmem,
 {
 	__netmem_clear_lsb(netmem)->dma_addr = dma_addr;
 }
+
+static inline unsigned long netmem_get_dma_index(netmem_ref netmem)
+{
+	unsigned long magic;
+
+	if (WARN_ON_ONCE(netmem_is_net_iov(netmem)))
+		return 0;
+
+	magic = __netmem_clear_lsb(netmem)->pp_magic;
+
+	return (magic & PP_DMA_INDEX_MASK) >> PP_DMA_INDEX_SHIFT;
+}
+
+static inline void netmem_set_dma_index(netmem_ref netmem,
+					unsigned long id)
+{
+	unsigned long magic;
+
+	if (WARN_ON_ONCE(netmem_is_net_iov(netmem)))
+		return;
+
+	magic = netmem_get_pp_magic(netmem) | (id << PP_DMA_INDEX_SHIFT);
+	__netmem_clear_lsb(netmem)->pp_magic = magic;
+}
 #endif
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index d51ca4389dd62d8bc266a9a2b792838257173535..5612d72d483cad8c24d2f703bb48aad185cfe59a 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -226,6 +226,8 @@ static int page_pool_init(struct page_pool *pool,
 			return -EINVAL;
 
 		pool->dma_map = true;
+
+		xa_init_flags(&pool->dma_mapped, XA_FLAGS_ALLOC1);
 	}
 
 	if (pool->slow.flags & PP_FLAG_DMA_SYNC_DEV) {
@@ -275,9 +277,6 @@ static int page_pool_init(struct page_pool *pool,
 	/* Driver calling page_pool_create() also call page_pool_destroy() */
 	refcount_set(&pool->user_cnt, 1);
 
-	if (pool->dma_map)
-		get_device(pool->p.dev);
-
 	if (pool->slow.flags & PP_FLAG_ALLOW_UNREADABLE_NETMEM) {
 		/* We rely on rtnl_lock()ing to make sure netdev_rx_queue
 		 * configuration doesn't change while we're initializing
@@ -325,7 +324,7 @@ static void page_pool_uninit(struct page_pool *pool)
 	ptr_ring_cleanup(&pool->ring, NULL);
 
 	if (pool->dma_map)
-		put_device(pool->p.dev);
+		xa_destroy(&pool->dma_mapped);
 
 #ifdef CONFIG_PAGE_POOL_STATS
 	if (!pool->system)
@@ -471,9 +470,11 @@ page_pool_dma_sync_for_device(const struct page_pool *pool,
 	__page_pool_dma_sync_for_device(pool, netmem, dma_sync_size);
 }
 
-static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
+static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t gfp)
 {
 	dma_addr_t dma;
+	int err;
+	u32 id;
 
 	/* Setup DMA mapping: use 'struct page' area for storing DMA-addr
 	 * since dma_addr_t can be either 32 or 64 bits and does not always fit
@@ -487,15 +488,28 @@ static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem)
 	if (dma_mapping_error(pool->p.dev, dma))
 		return false;
 
-	if (page_pool_set_dma_addr_netmem(netmem, dma))
+	if (in_softirq())
+		err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem),
+			       PP_DMA_INDEX_LIMIT, gfp);
+	else
+		err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem),
+				  PP_DMA_INDEX_LIMIT, gfp);
+	if (err) {
+		WARN_ONCE(1, "couldn't track DMA mapping, please report to netdev@");
 		goto unmap_failed;
+	}
 
+	if (page_pool_set_dma_addr_netmem(netmem, dma)) {
+		WARN_ONCE(1, "unexpected DMA address, please report to netdev@");
+		goto unmap_failed;
+	}
+
+	netmem_set_dma_index(netmem, id);
 	page_pool_dma_sync_for_device(pool, netmem, pool->p.max_len);
 
 	return true;
 
 unmap_failed:
-	WARN_ONCE(1, "unexpected DMA address, please report to netdev@");
 	dma_unmap_page_attrs(pool->p.dev, dma, PAGE_SIZE << pool->p.order,
 			     pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC |
 			     DMA_ATTR_WEAK_ORDERING);
@@ -512,7 +526,7 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
 	if (unlikely(!page))
 		return NULL;
 
-	if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page)))) {
+	if (pool->dma_map && unlikely(!page_pool_dma_map(pool, page_to_netmem(page), gfp))) {
 		put_page(page);
 		return NULL;
 	}
@@ -558,7 +572,7 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
 	 */
 	for (i = 0; i < nr_pages; i++) {
 		netmem = pool->alloc.cache[i];
-		if (dma_map && unlikely(!page_pool_dma_map(pool, netmem))) {
+		if (dma_map && unlikely(!page_pool_dma_map(pool, netmem, gfp))) {
 			put_page(netmem_to_page(netmem));
 			continue;
 		}
@@ -660,6 +674,8 @@ void page_pool_clear_pp_info(netmem_ref netmem)
 static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
 							  netmem_ref netmem)
 {
+	struct page *old, *page = netmem_to_page(netmem);
+	unsigned long id;
 	dma_addr_t dma;
 
 	if (!pool->dma_map)
@@ -668,6 +684,17 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
 		 */
 		return;
 
+	id = netmem_get_dma_index(netmem);
+	if (!id)
+		return;
+
+	if (in_softirq())
+		old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0);
+	else
+		old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0);
+	if (old != page)
+		return;
+
 	dma = page_pool_get_dma_addr_netmem(netmem);
 
 	/* When page is unmapped, it cannot be returned to our pool */
@@ -675,6 +702,7 @@ static __always_inline void __page_pool_release_page_dma(struct page_pool *pool,
 			     PAGE_SIZE << pool->p.order, pool->p.dma_dir,
 			     DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING);
 	page_pool_set_dma_addr_netmem(netmem, 0);
+	netmem_set_dma_index(netmem, 0);
 }
 
 /* Disconnects a page (from a page_pool). API users can have a need
@@ -1084,8 +1112,32 @@ static void page_pool_empty_alloc_cache_once(struct page_pool *pool)
 
 static void page_pool_scrub(struct page_pool *pool)
 {
+	unsigned long id;
+	void *ptr;
+
 	page_pool_empty_alloc_cache_once(pool);
-	pool->destroy_cnt++;
+	if (!pool->destroy_cnt++ && pool->dma_map) {
+		if (pool->dma_sync) {
+			/* paired with READ_ONCE in
+			 * page_pool_dma_sync_for_device() and
+			 * __page_pool_dma_sync_for_cpu()
+			 */
+			WRITE_ONCE(pool->dma_sync, false);
+
+			/* Make sure all concurrent returns that may see the old
+			 * value of dma_sync (and thus perform a sync) have
+			 * finished before doing the unmapping below. Skip the
+			 * wait if the device doesn't actually need syncing, or
+			 * if there are no outstanding mapped pages.
+			 */
+			if (dma_dev_need_sync(pool->p.dev) &&
+			    !xa_empty(&pool->dma_mapped))
+				synchronize_net();
+		}
+
+		xa_for_each(&pool->dma_mapped, id, ptr)
+			__page_pool_release_page_dma(pool, page_to_netmem(ptr));
+	}
 
 	/* No more consumers should exist, but producers could still
 	 * be in-flight.