From patchwork Sat Oct 9 09:37:21 2021
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 12547405
From: Yunsheng Lin
Subject: [PATCH net-next -v5 1/4] page_pool: disable dma mapping support for 32-bit arch with 64-bit DMA
Date: Sat, 9 Oct 2021 17:37:21 +0800
Message-ID: <20211009093724.10539-2-linyunsheng@huawei.com>
In-Reply-To: <20211009093724.10539-1-linyunsheng@huawei.com>
References: <20211009093724.10539-1-linyunsheng@huawei.com>
32-bit architectures with 64-bit DMA seem to be rare these days, and page
pool carries a fair amount of code and complexity for systems that possibly
don't exist, which gets in the way of adding pp page frag tracking support.

So disable DMA mapping support for such systems; if drivers really want to
work on such systems, they have to implement their own DMA-mapping fallback
tracking outside page_pool.

Reviewed-by: Ilias Apalodimas
Signed-off-by: Yunsheng Lin
---
 include/linux/mm_types.h | 13 +------------
 include/net/page_pool.h  | 12 +-----------
 net/core/page_pool.c     | 10 ++++++----
 3 files changed, 8 insertions(+), 27 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7f8ee09c711f..436e0946d691 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -104,18 +104,7 @@ struct page {
 			struct page_pool *pp;
 			unsigned long _pp_mapping_pad;
 			unsigned long dma_addr;
-			union {
-				/**
-				 * dma_addr_upper: might require a 64-bit
-				 * value on 32-bit architectures.
-				 */
-				unsigned long dma_addr_upper;
-				/**
-				 * For frag page support, not supported in
-				 * 32-bit architectures with 64-bit DMA.
-				 */
-				atomic_long_t pp_frag_count;
-			};
+			atomic_long_t pp_frag_count;
 		};
 		struct {	/* slab, slob and slub */
 			union {
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index a4082406a003..3855f069627f 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -216,24 +216,14 @@ static inline void page_pool_recycle_direct(struct page_pool *pool,
 	page_pool_put_full_page(pool, page, true);
 }
 
-#define PAGE_POOL_DMA_USE_PP_FRAG_COUNT	\
-		(sizeof(dma_addr_t) > sizeof(unsigned long))
-
 static inline dma_addr_t page_pool_get_dma_addr(struct page *page)
 {
-	dma_addr_t ret = page->dma_addr;
-
-	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
-		ret |= (dma_addr_t)page->dma_addr_upper << 16 << 16;
-
-	return ret;
+	return page->dma_addr;
 }
 
 static inline void page_pool_set_dma_addr(struct page *page, dma_addr_t addr)
 {
 	page->dma_addr = addr;
-	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT)
-		page->dma_addr_upper = upper_32_bits(addr);
 }
 
 static inline void page_pool_set_frag_count(struct page *page, long nr)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1a6978427d6c..9b60e4301a44 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -49,6 +49,12 @@ static int page_pool_init(struct page_pool *pool,
 	 * which is the XDP_TX use-case.
 	 */
 	if (pool->p.flags & PP_FLAG_DMA_MAP) {
+		/* DMA-mapping is not supported on 32-bit systems with
+		 * 64-bit DMA mapping.
+		 */
+		if (sizeof(dma_addr_t) > sizeof(unsigned long))
+			return -EOPNOTSUPP;
+
 		if ((pool->p.dma_dir != DMA_FROM_DEVICE) &&
 		    (pool->p.dma_dir != DMA_BIDIRECTIONAL))
 			return -EINVAL;
@@ -69,10 +75,6 @@ static int page_pool_init(struct page_pool *pool,
 		 */
 	}
 
-	if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT &&
-	    pool->p.flags & PP_FLAG_PAGE_FRAG)
-		return -EINVAL;
-
 	if (ptr_ring_init(&pool->ring, ring_qsize, GFP_KERNEL) < 0)
 		return -ENOMEM;
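[Editorial sketch, not part of the patch] The condition that now gates
PP_FLAG_DMA_MAP can be reproduced in a small standalone userspace C program;
the dma_addr_t typedef and the address value below are hypothetical stand-ins
for the kernel types:

/* Shows why a plain sizeof() comparison is enough to detect the unsupported
 * configuration: with 64-bit DMA addressing on a 32-bit build, dma_addr_t no
 * longer fits in the unsigned long that backs page->dma_addr once the
 * dma_addr_upper field is gone.
 */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t dma_addr_t;	/* assumption: 64-bit DMA addressing */

int main(void)
{
	dma_addr_t addr = 0x1ffff0000ULL;	/* hypothetical address above 4GB */
	unsigned long dma_addr_field;		/* stands in for page->dma_addr */

	if (sizeof(dma_addr_t) > sizeof(unsigned long)) {
		/* mirrors the new -EOPNOTSUPP path in page_pool_init() */
		printf("dma_addr_t wider than unsigned long, DMA mapping unsupported\n");
		return 0;
	}

	dma_addr_field = (unsigned long)addr;	/* safe only when widths match */
	printf("stored dma addr: 0x%lx\n", dma_addr_field);
	return 0;
}

On a 64-bit host the sketch takes the store path; on a 32-bit build with
64-bit dma_addr_t it takes the first branch, which is exactly the
configuration page_pool now refuses at init time.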
From patchwork Sat Oct 9 09:37:22 2021
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 12547409
From: Yunsheng Lin
Subject: [PATCH net-next -v5 2/4] page_pool: change BIAS_MAX to support incrementing
Date: Sat, 9 Oct 2021 17:37:22 +0800
Message-ID: <20211009093724.10539-3-linyunsheng@huawei.com>
In-Reply-To: <20211009093724.10539-1-linyunsheng@huawei.com>
References: <20211009093724.10539-1-linyunsheng@huawei.com>

page->pp_frag_count now needs to be incremented for pp page frag tracking
support, so change BIAS_MAX to (LONG_MAX / 2) to avoid overflow.

Signed-off-by: Yunsheng Lin
---
 net/core/page_pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9b60e4301a44..2c643b72ce16 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -24,7 +24,7 @@
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
-#define BIAS_MAX	LONG_MAX
+#define BIAS_MAX	(LONG_MAX / 2)
 
 static int page_pool_init(struct page_pool *pool,
 			  const struct page_pool_params *params)
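[Editorial sketch, not part of the patch] The headroom argument can be checked
with a few lines of standalone userspace C; the kernel counter is an
atomic_long_t, modelled here as a plain long:

#include <limits.h>
#include <stdio.h>

int main(void)
{
	long old_bias = LONG_MAX;	/* previous BIAS_MAX */
	long new_bias = LONG_MAX / 2;	/* BIAS_MAX after this patch */

	/* a single extra reference, as __skb_frag_ref() may now take */
	if (old_bias > LONG_MAX - 1)
		printf("old bias: one increment already overflows a signed long\n");

	printf("new bias: %ld increments of headroom remain\n",
	       LONG_MAX - new_bias);
	return 0;
}

With the counter pre-biased to LONG_MAX, any increment overflows; starting at
LONG_MAX / 2 leaves roughly LONG_MAX / 2 increments before that can happen.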
From patchwork Sat Oct 9 09:37:23 2021
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 12547403
From: Yunsheng Lin
Subject: [PATCH net-next -v5 3/4] mm: introduce __get_page() and __put_page()
Date: Sat, 9 Oct 2021 17:37:23 +0800
Message-ID: <20211009093724.10539-4-linyunsheng@huawei.com>
In-Reply-To: <20211009093724.10539-1-linyunsheng@huawei.com>
References: <20211009093724.10539-1-linyunsheng@huawei.com>

Introduce __get_page() and __put_page() to operate directly on a base page
or the head of a compound page, for the cases where the page is already
known to be one of those.

Signed-off-by: Yunsheng Lin
---
 include/linux/mm.h | 21 ++++++++++++++-------
 mm/swap.c          |  6 +++---
 2 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 73a52aba448f..5683313c3e9d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -902,7 +902,7 @@ static inline struct page *virt_to_head_page(const void *x)
 	return compound_head(page);
 }
 
-void __put_page(struct page *page);
+void __put_single_or_compound_page(struct page *page);
 
 void put_pages_list(struct list_head *pages);
 
@@ -1203,9 +1203,8 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
 #define page_ref_zero_or_close_to_overflow(page) \
 	((unsigned int) page_ref_count(page) + 127u <= 127u)
 
-static inline void get_page(struct page *page)
+static inline void __get_page(struct page *page)
 {
-	page = compound_head(page);
 	/*
 	 * Getting a normal page or the head of a compound page
 	 * requires to already have an elevated page->_refcount.
@@ -1214,6 +1213,11 @@ static inline void get_page(struct page *page)
 	page_ref_inc(page);
 }
 
+static inline void get_page(struct page *page)
+{
+	__get_page(compound_head(page));
+}
+
 bool __must_check try_grab_page(struct page *page, unsigned int flags);
 struct page *try_grab_compound_head(struct page *page, int refs,
 				    unsigned int flags);
@@ -1228,10 +1232,8 @@ static inline __must_check bool try_get_page(struct page *page)
 	return true;
 }
 
-static inline void put_page(struct page *page)
+static inline void __put_page(struct page *page)
 {
-	page = compound_head(page);
-
 	/*
 	 * For devmap managed pages we need to catch refcount transition from
 	 * 2 to 1, when refcount reach one it means the page is free and we
@@ -1244,7 +1246,12 @@ static inline void put_page(struct page *page)
 	}
 
 	if (put_page_testzero(page))
-		__put_page(page);
+		__put_single_or_compound_page(page);
+}
+
+static inline void put_page(struct page *page)
+{
+	__put_page(compound_head(page));
 }
 
 /*
diff --git a/mm/swap.c b/mm/swap.c
index af3cad4e5378..565cbde1caea 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -111,7 +111,7 @@ static void __put_compound_page(struct page *page)
 	destroy_compound_page(page);
 }
 
-void __put_page(struct page *page)
+void __put_single_or_compound_page(struct page *page)
 {
 	if (is_zone_device_page(page)) {
 		put_dev_pagemap(page->pgmap);
@@ -128,7 +128,7 @@ void __put_page(struct page *page)
 	else
 		__put_single_page(page);
 }
-EXPORT_SYMBOL(__put_page);
+EXPORT_SYMBOL(__put_single_or_compound_page);
 
 /**
  * put_pages_list() - release a list of pages
@@ -1153,7 +1153,7 @@ void put_devmap_managed_page(struct page *page)
 	if (count == 1)
 		free_devmap_managed_page(page);
 	else if (!count)
-		__put_page(page);
+		__put_single_or_compound_page(page);
 }
 EXPORT_SYMBOL(put_devmap_managed_page);
 #endif
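[Editorial sketch, not part of the patch] A simplified userspace model of the
resulting call structure; all names and fields below are stand-ins, not the
real struct page or the kernel helpers. get_page() still resolves the head
page first, while __get_page() trusts the caller and skips that lookup:

#include <stdio.h>

struct fake_page {
	struct fake_page *head;   /* models the compound_head() lookup */
	int refcount;             /* models page->_refcount */
};

static struct fake_page *fake_compound_head(struct fake_page *p)
{
	return p->head ? p->head : p;
}

static void fake__get_page(struct fake_page *p)   /* caller knows p is a head page */
{
	p->refcount++;
}

static void fake_get_page(struct fake_page *p)    /* p may be a tail page */
{
	fake__get_page(fake_compound_head(p));
}

int main(void)
{
	struct fake_page head = { .head = NULL, .refcount = 1 };
	struct fake_page tail = { .head = &head, .refcount = 0 };

	fake_get_page(&tail);    /* tail -> head lookup, then increment */
	fake__get_page(&head);   /* head already known, lookup skipped */
	printf("head refcount = %d\n", head.refcount);   /* prints 3 */
	return 0;
}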
From patchwork Sat Oct 9 09:37:24 2021
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 12547411
From: Yunsheng Lin
Subject: [PATCH net-next -v5 4/4] skbuff: keep track of pp page when pp_frag_count is used
Date: Sat, 9 Oct 2021 17:37:24 +0800
Message-ID: <20211009093724.10539-5-linyunsheng@huawei.com>
In-Reply-To: <20211009093724.10539-1-linyunsheng@huawei.com>
References: <20211009093724.10539-1-linyunsheng@huawei.com>

skb->pp_recycle and page->pp_magic may not be enough to track whether a
frag page is from page pool after __skb_frag_ref() has been called, mostly
because of a data race, see:

commit 2cc3aeb5eccc ("skbuff: Fix a potential race while recycling
page_pool packets")

There are also clone and expand-head cases that may lose track of whether
a frag page is from page pool or not. And not being able to keep track of
a pp page may cause problems for the skb_split() case in tso_fragment()
too. Suppose a skb has 3 frag pages, all coming from a page pool, and is
split into skb1 and skb2:

skb1: first frag page + first half of second frag page
skb2: second half of second frag page + third frag page

How do we set skb->pp_recycle for skb1 and skb2?

1. If we set both of them to 1, we may have a race similar to the one in
   the above commit for the second frag page.

2. If we set only one of them to 1, we may leak resources, as both the
   first frag page and the third frag page are indeed from a page pool.

So increment the pp_frag_count of a pp page frag in __skb_frag_ref(), and
use only page->pp_magic to indicate a pp page frag in __skb_frag_unref(),
in order to keep track of pp page frags. Similar handling is done for the
head page of a skb too. As the head page of a compound page is needed to
decide whether the page is from a page pool in the first place,
__page_frag_cache_drain() and page_ref_inc() are used to avoid unnecessary
compound_head() calls.
Signed-off-by: Yunsheng Lin
---
 include/linux/skbuff.h  | 30 ++++++++++++++++++++----------
 include/net/page_pool.h | 24 +++++++++++++++++++++++-
 net/core/page_pool.c    | 17 ++---------------
 net/core/skbuff.c       | 10 ++++++++--
 4 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 841e2f0f5240..c4f8b04a694c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3073,7 +3073,19 @@ static inline struct page *skb_frag_page(const skb_frag_t *frag)
  */
 static inline void __skb_frag_ref(skb_frag_t *frag)
 {
-	get_page(skb_frag_page(frag));
+	struct page *page = skb_frag_page(frag);
+
+	page = compound_head(page);
+
+#ifdef CONFIG_PAGE_POOL
+	if (page_pool_is_pp_page(page) &&
+	    page_pool_is_pp_page_frag(page)) {
+		page_pool_atomic_inc_frag_count(page);
+		return;
+	}
+#endif
+
+	__get_page(page);
 }
 
 /**
@@ -3100,11 +3112,16 @@ static inline void __skb_frag_unref(skb_frag_t *frag, bool recycle)
 {
 	struct page *page = skb_frag_page(frag);
 
+	page = compound_head(page);
+
 #ifdef CONFIG_PAGE_POOL
-	if (recycle && page_pool_return_skb_page(page))
+	if (page_pool_is_pp_page(page) &&
+	    (recycle || page_pool_is_pp_page_frag(page))) {
+		page_pool_return_skb_page(page);
 		return;
+	}
 #endif
-	put_page(page);
+	__put_page(page);
 }
 
 /**
@@ -4718,12 +4735,5 @@ static inline void skb_mark_for_recycle(struct sk_buff *skb)
 }
 #endif
 
-static inline bool skb_pp_recycle(struct sk_buff *skb, void *data)
-{
-	if (!IS_ENABLED(CONFIG_PAGE_POOL) || !skb->pp_recycle)
-		return false;
-	return page_pool_return_skb_page(virt_to_page(data));
-}
-
 #endif /* __KERNEL__ */
 #endif /* _LINUX_SKBUFF_H */
diff --git a/include/net/page_pool.h b/include/net/page_pool.h
index 3855f069627f..740a8ca7f4a6 100644
--- a/include/net/page_pool.h
+++ b/include/net/page_pool.h
@@ -164,7 +164,7 @@ inline enum dma_data_direction page_pool_get_dma_dir(struct page_pool *pool)
 	return pool->p.dma_dir;
 }
 
-bool page_pool_return_skb_page(struct page *page);
+void page_pool_return_skb_page(struct page *page);
 
 struct page_pool *page_pool_create(const struct page_pool_params *params);
 
@@ -231,6 +231,28 @@ static inline void page_pool_set_frag_count(struct page *page, long nr)
 	atomic_long_set(&page->pp_frag_count, nr);
 }
 
+static inline void page_pool_atomic_inc_frag_count(struct page *page)
+{
+	atomic_long_inc(&page->pp_frag_count);
+}
+
+static inline bool page_pool_is_pp_page(struct page *page)
+{
+	/* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation
+	 * in order to preserve any existing bits, such as bit 0 for the
+	 * head page of compound page and bit 1 for pfmemalloc page, so
+	 * mask those bits for freeing side when doing below checking,
+	 * and page_is_pfmemalloc() is checked in __page_pool_put_page()
+	 * to avoid recycling the pfmemalloc page.
+	 */
+	return (page->pp_magic & ~0x3UL) == PP_SIGNATURE;
+}
+
+static inline bool page_pool_is_pp_page_frag(struct page *page)
+{
+	return !!atomic_long_read(&page->pp_frag_count);
+}
+
 static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
 							   long nr)
 {
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 2c643b72ce16..d141e00459c9 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -219,6 +219,7 @@ static void page_pool_set_pp_info(struct page_pool *pool,
 {
 	page->pp = pool;
 	page->pp_magic |= PP_SIGNATURE;
+	page_pool_set_frag_count(page, 0);
 }
 
 static void page_pool_clear_pp_info(struct page *page)
@@ -736,22 +737,10 @@ void page_pool_update_nid(struct page_pool *pool, int new_nid)
 }
 EXPORT_SYMBOL(page_pool_update_nid);
 
-bool page_pool_return_skb_page(struct page *page)
+void page_pool_return_skb_page(struct page *page)
 {
 	struct page_pool *pp;
 
-	page = compound_head(page);
-
-	/* page->pp_magic is OR'ed with PP_SIGNATURE after the allocation
-	 * in order to preserve any existing bits, such as bit 0 for the
-	 * head page of compound page and bit 1 for pfmemalloc page, so
-	 * mask those bits for freeing side when doing below checking,
-	 * and page_is_pfmemalloc() is checked in __page_pool_put_page()
-	 * to avoid recycling the pfmemalloc page.
-	 */
-	if (unlikely((page->pp_magic & ~0x3UL) != PP_SIGNATURE))
-		return false;
-
 	pp = page->pp;
 
 	/* Driver set this to memory recycling info. Reset it on recycle.
@@ -760,7 +749,5 @@
 	 * 'flipped' fragment being in use or not.
 	 */
 	page_pool_put_full_page(pp, page, false);
-
-	return true;
 }
 EXPORT_SYMBOL(page_pool_return_skb_page);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 74601bbc56ac..e3691b025d30 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -646,9 +646,15 @@ static void skb_free_head(struct sk_buff *skb)
 	unsigned char *head = skb->head;
 
 	if (skb->head_frag) {
-		if (skb_pp_recycle(skb, head))
+		struct page *page = virt_to_head_page(head);
+
+		if (page_pool_is_pp_page(page) &&
+		    (skb->pp_recycle || page_pool_is_pp_page_frag(page))) {
+			page_pool_return_skb_page(page);
 			return;
-		skb_free_frag(head);
+		}
+
+		__page_frag_cache_drain(page, 1);
 	} else {
 		kfree(head);
 	}
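[Editorial sketch, not part of the patch] The recycle decision this patch
moves into __skb_frag_unref() and skb_free_head() can be modelled in a short
standalone userspace C program; PP_SIGNATURE and all fields below are
stand-ins for illustration only:

#include <stdbool.h>
#include <stdio.h>

#define FAKE_PP_SIGNATURE 0x40UL

struct fake_pp_page {
	unsigned long pp_magic;
	long pp_frag_count;
};

static bool is_pp_page(const struct fake_pp_page *p)
{
	/* low two bits masked off, as in page_pool_is_pp_page() */
	return (p->pp_magic & ~0x3UL) == FAKE_PP_SIGNATURE;
}

static bool should_recycle(const struct fake_pp_page *p, bool skb_pp_recycle)
{
	/* page goes back to the pool when it carries the pp signature and
	 * either the skb is marked for recycling or the frag count shows
	 * page_pool is still tracking the page
	 */
	return is_pp_page(p) && (skb_pp_recycle || p->pp_frag_count != 0);
}

int main(void)
{
	struct fake_pp_page frag = { .pp_magic = FAKE_PP_SIGNATURE, .pp_frag_count = 1 };

	/* even with skb->pp_recycle == 0, the non-zero frag count keeps the
	 * page tracked, which is the skb_split()/clone case from the log
	 */
	printf("recycle: %d\n", should_recycle(&frag, false));
	return 0;
}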