From patchwork Thu Nov 14 12:15:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yunsheng Lin X-Patchwork-Id: 13875058 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CFECED65C7E for ; Thu, 14 Nov 2024 12:22:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4E8BF6B0085; Thu, 14 Nov 2024 07:22:46 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 471176B0088; Thu, 14 Nov 2024 07:22:46 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 29AD36B0089; Thu, 14 Nov 2024 07:22:46 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 0EBB66B0085 for ; Thu, 14 Nov 2024 07:22:46 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 90E4A1C6940 for ; Thu, 14 Nov 2024 12:22:45 +0000 (UTC) X-FDA: 82784612946.30.EB9345D Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by imf20.hostedemail.com (Postfix) with ESMTP id D44811C000C for ; Thu, 14 Nov 2024 12:21:47 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf20.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.188 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731586900; a=rsa-sha256; cv=none; b=DyerxaSceC8P9d+5ooZld+RrFCqF5IhOjUh85s1F9y6twqF8UEWEia0eJarA2dbqNjGVWl sAPc4IJJn9RW9UfSbRj158SSVWw5ANM9AzzF1Ux9EbZxkDUYXLnV9ttlsvXLwLApybGZOr 5ky18tcD1jqz2USf0v1+6FxKmxtKvm0= 
ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf20.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.188 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731586900; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3xiqcNUxqGifLj5ro6FSlzRyYoA7Pm6KCQOaDJzaKxk=; b=NTawWnndFfTWOUnQwGiomK+iGs/jev6MczYVUwnpxtXjhyfk3iF5jwU6gl7Fu2x4IPkqsn J153g6c6M976HX9nQIb4MXtjRHFL23sveCjOTK3X51sRlGyw2dx9rYEL6mt06K4UUxtZp+ Kyk07NzvcNZ1p9o/0UzaR01War2UD+4= Received: from mail.maildlp.com (unknown [172.19.162.254]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4Xpzk46HcFzDrDS; Thu, 14 Nov 2024 20:19:56 +0800 (CST) Received: from dggpemf200006.china.huawei.com (unknown [7.185.36.61]) by mail.maildlp.com (Postfix) with ESMTPS id 7F9541800F2; Thu, 14 Nov 2024 20:22:37 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by dggpemf200006.china.huawei.com (7.185.36.61) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 14 Nov 2024 20:22:37 +0800 From: Yunsheng Lin To: , , CC: , , Yunsheng Lin , Alexander Duyck , Andrew Morton , Linux-MM Subject: [PATCH net-next v1 01/10] mm: page_frag: some minor refactoring before adding new API Date: Thu, 14 Nov 2024 20:15:56 +0800 Message-ID: <20241114121606.3434517-2-linyunsheng@huawei.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com> References: <20241114121606.3434517-1-linyunsheng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.90.30.45] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To 
dggpemf200006.china.huawei.com (7.185.36.61) X-Rspam-User: X-Rspamd-Queue-Id: D44811C000C X-Rspamd-Server: rspam11 X-Stat-Signature: osgnigj1wntojntbndnauzioqqxsuthw X-HE-Tag: 1731586907-787288 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Refactor the common code in __page_frag_alloc_align() into __page_frag_cache_prepare() and __page_frag_cache_commit(), so that the new API can make use of it. 
CC: Alexander Duyck CC: Andrew Morton CC: Linux-MM Signed-off-by: Yunsheng Lin --- include/linux/page_frag_cache.h | 34 ++++++++++++++++++++++++++-- mm/page_frag_cache.c | 40 ++++++++++++++++++++++++++------- 2 files changed, 64 insertions(+), 10 deletions(-) diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 41a91df82631..5ae97f93a0a1 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -5,6 +5,7 @@ #include #include +#include #include #include @@ -39,8 +40,37 @@ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc) void page_frag_cache_drain(struct page_frag_cache *nc); void __page_frag_cache_drain(struct page *page, unsigned int count); -void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, - gfp_t gfp_mask, unsigned int align_mask); +void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, + struct page_frag *pfrag, gfp_t gfp_mask, + unsigned int align_mask); +unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc, + struct page_frag *pfrag, + unsigned int used_sz); + +static inline unsigned int __page_frag_cache_commit(struct page_frag_cache *nc, + struct page_frag *pfrag, + unsigned int used_sz) +{ + VM_BUG_ON(!nc->pagecnt_bias); + nc->pagecnt_bias--; + + return __page_frag_cache_commit_noref(nc, pfrag, used_sz); +} + +static inline void *__page_frag_alloc_align(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask, + unsigned int align_mask) +{ + struct page_frag page_frag; + void *va; + + va = __page_frag_cache_prepare(nc, fragsz, &page_frag, gfp_mask, + align_mask); + if (likely(va)) + __page_frag_cache_commit(nc, &page_frag, fragsz); + + return va; +} static inline void *page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask, diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c index 3f7a203d35c6..f55d34cf7d43 100644 --- a/mm/page_frag_cache.c +++ 
b/mm/page_frag_cache.c @@ -90,9 +90,31 @@ void __page_frag_cache_drain(struct page *page, unsigned int count) } EXPORT_SYMBOL(__page_frag_cache_drain); -void *__page_frag_alloc_align(struct page_frag_cache *nc, - unsigned int fragsz, gfp_t gfp_mask, - unsigned int align_mask) +unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc, + struct page_frag *pfrag, + unsigned int used_sz) +{ + unsigned int orig_offset; + + VM_BUG_ON(used_sz > pfrag->size); + VM_BUG_ON(pfrag->page != encoded_page_decode_page(nc->encoded_page)); + VM_BUG_ON(pfrag->offset + pfrag->size > + (PAGE_SIZE << encoded_page_decode_order(nc->encoded_page))); + + /* pfrag->offset might be bigger than the nc->offset due to alignment */ + VM_BUG_ON(nc->offset > pfrag->offset); + + orig_offset = nc->offset; + nc->offset = pfrag->offset + used_sz; + + /* Return true size back to caller considering the offset alignment */ + return nc->offset - orig_offset; +} +EXPORT_SYMBOL(__page_frag_cache_commit_noref); + +void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, + struct page_frag *pfrag, gfp_t gfp_mask, + unsigned int align_mask) { unsigned long encoded_page = nc->encoded_page; unsigned int size, offset; @@ -114,6 +136,8 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; nc->offset = 0; + } else { + page = encoded_page_decode_page(encoded_page); } size = PAGE_SIZE << encoded_page_decode_order(encoded_page); @@ -132,8 +156,6 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, return NULL; } - page = encoded_page_decode_page(encoded_page); - if (!page_ref_sub_and_test(page, nc->pagecnt_bias)) goto refill; @@ -148,15 +170,17 @@ void *__page_frag_alloc_align(struct page_frag_cache *nc, /* reset page count bias and offset to start of new frag */ nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1; + nc->offset = 0; offset = 0; } - 
nc->pagecnt_bias--; - nc->offset = offset + fragsz; + pfrag->page = page; + pfrag->offset = offset; + pfrag->size = size - offset; return encoded_page_decode_virt(encoded_page) + offset; } -EXPORT_SYMBOL(__page_frag_alloc_align); +EXPORT_SYMBOL(__page_frag_cache_prepare); /* * Frees a page fragment allocated out of either a compound or order 0 page. From patchwork Thu Nov 14 12:15:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yunsheng Lin X-Patchwork-Id: 13875059 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8EB1D637A7 for ; Thu, 14 Nov 2024 12:22:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7F3466B0089; Thu, 14 Nov 2024 07:22:49 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 77C8C6B008A; Thu, 14 Nov 2024 07:22:49 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5F5CB6B008C; Thu, 14 Nov 2024 07:22:49 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 416D26B0089 for ; Thu, 14 Nov 2024 07:22:49 -0500 (EST) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 04DC3810F1 for ; Thu, 14 Nov 2024 12:22:48 +0000 (UTC) X-FDA: 82784613156.22.B7EF5D2 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by imf21.hostedemail.com (Postfix) with ESMTP id C4BBE1C000E for ; Thu, 14 Nov 2024 12:21:22 +0000 (UTC) Authentication-Results: imf21.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf21.hostedemail.com: domain of linyunsheng@huawei.com designates 
45.249.212.187 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731586836; a=rsa-sha256; cv=none; b=kL3zKEb7KTA3zmmejlcQhFsGvcGKw9MFquVs557OQiV9I3Iq+cD0MvIVkWANFBZz724Eic PSObz2w0+tw4teoaxQrNtT8GIhAZqWsqBCjX6+b1GeL0ucyK/49L/kEx8BIVEHpDNRoOo5 k4JMZJAw6EyF8a7C6QvzIsbjAVKbHCY= ARC-Authentication-Results: i=1; imf21.hostedemail.com; dkim=none; dmarc=pass (policy=quarantine) header.from=huawei.com; spf=pass (imf21.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.187 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731586836; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=B8BESktr9IjMF4qyErGXRqxqIzluNuuWawkiuUhmtRQ=; b=IKQ6keOkFCXmMYlUKLZCFV+d+qtaLwAN1yalCjId3DZdM7gWQhVwN2X81un3WGuHEapk96 U71YMnzRQyPX12K7toXyqHKyJ387M8N2Myzj/KYCDs6CiNjhCBioHkvkZVcoCvCjsoVco8 2WN/Ivk0yWRJ+D+knfJH8BS6ETfkhyA= Received: from mail.maildlp.com (unknown [172.19.88.105]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4Xpzl158h6z10V9f; Thu, 14 Nov 2024 20:20:45 +0800 (CST) Received: from dggpemf200006.china.huawei.com (unknown [7.185.36.61]) by mail.maildlp.com (Postfix) with ESMTPS id 264411400DC; Thu, 14 Nov 2024 20:22:41 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by dggpemf200006.china.huawei.com (7.185.36.61) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 14 Nov 2024 20:22:40 +0800 From: Yunsheng Lin To: , , CC: , , Yunsheng Lin , Alexander Duyck , Andrew Morton , Linux-MM , Alexander Duyck , Eric Dumazet , Simon Horman , David Ahern Subject: [PATCH net-next v1 02/10] net: rename skb_copy_to_page_nocache() helper Date: 
Thu, 14 Nov 2024 20:15:57 +0800 Message-ID: <20241114121606.3434517-3-linyunsheng@huawei.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com> References: <20241114121606.3434517-1-linyunsheng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.90.30.45] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpemf200006.china.huawei.com (7.185.36.61) X-Rspamd-Queue-Id: C4BBE1C000E X-Stat-Signature: zn75p9j9wnk7hzg589x7ye3qsj4znmaj X-Rspam-User: X-Rspamd-Server: rspam05 X-HE-Tag: 1731586882-397790 X-HE-Meta: U2FsdGVkX1/UVUD1DyDC28DRQbNT02PWOmCxtAp22W8kVxsS03N3ppYSMraOfy036NudPygsbS8wsJoojsJLcYDq4YwFp/Zn2+ZbwrVXKwHD9tK1tA7zSSVw69pgLjLn+er4YMZX14GMwop0JrP/bliNBvSFa3Y3p3qPfWfVBQITLxxU2DqfCjOTtHl9hmG/9Ml/AC2gViSsAIM2AU8OTKqkKs7BxL1zucEF9ndUfgZPP3in2MxJ6uSmD5Uqu+4bVf7SFU/mFCbV6AdcIG6K/H6WHIbUNggjHA2rez4c2MzfXNZ/KxAqBa7tbwAzo9yS/vHgabwArB52bGGBMLLCiHaweQNNVdIotUPxACugA2c9cbRUIXP5uJTxlldpXs3y2AgIolST3O71Bfc56+en1iOm/P7zTkWivaZm17Eh/lRllhBB4zCGObli0NPL82INZbeXVlhIvIGUKkXhIQjFQC8oafl+F7ddwJFTMzE9K1UCPO303PyZsQu33rn/9p2mDWGI2u9zZBeBJ+qG6uIMXi6CpLAnlg0it35GIwnU7RRsqIUwPibpqkMOxiEEOwD25Z6CshE8B1t1FvONmNkr6W3Po4QxGtLTm0kRefAjLoSp0SSgua2UcY0pMpnHeT5uZVN+VMX4dlZq6mAXlz+Y0H+JvA9XFzg32tt7xm6K4YB+LyzIY49vkcmJpTJp76ZuMr7oPpEtnX2PX4bJhlnUNjwV37+lo04hTrBjqVE9eohuwRS7GBufHxR4R+O++wROgaM4qrzTtqQePxqseNa+xOYLO0mqS6gY0rtl2xgm8MtYMhjBbS2VKTsPEq9DsX1igXRD0171jFfdO2/tpCZqI4cmndTWibekIqX94V7plOJJMetvGBwFlJvu1RC9I0r7R7pnME6o5Elgr2V7X5hH28hnn1MPUbt2ZWhOJBjVLxuBxblkZJBDaiC8UyDIUxRY927m+ZudHJdN4IRQVv/ iGVT+fQY r7++SmfrzVHOUYyvbRuews/DdzPVPBHmFmr7ArFqgN64f85lHpDfQ+W0/1yyj0yP8VB7HRMn+S7j9awESTfssMWJzlmE+oEl+PN2R1ng2M2IV0zhEE+De7jMzd623lbDs4iiib20yR4K9su5Fj0BXYFev+oix9cVU2eK8HikgzkmAGDMsVtKVXa9C2YwyZq53kgTjbRVT7U98/tipqQCSmUodDl/bIRRwCcOAJ/p7qhE+fWCXHisE5uE4RPNqRZU7Xh0dXPxsnmbJTYe14Gk017PFPJW4P5s6IUsbfTp6oD89OlBDG1jfZoaRiUUy+cjxzv49HXIwbfLNgVif+Lu7ggXAcfGQ91Ct7jjRFHQeqFwZxAb3QXQzgILN5b0D3gwNNcxBlSMePYDOtDrBbviH/gF3Z1hCwh2rbNOquMUJ4g8t6mA= 
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Rename skb_copy_to_page_nocache() to skb_copy_to_frag_nocache() to avoid calling virt_to_page() as we are about to pass virtual address directly. CC: Alexander Duyck CC: Andrew Morton CC: Linux-MM Signed-off-by: Yunsheng Lin Reviewed-by: Alexander Duyck --- include/net/sock.h | 9 ++++----- net/ipv4/tcp.c | 7 +++---- net/kcm/kcmsock.c | 7 +++---- 3 files changed, 10 insertions(+), 13 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index 7464e9f9f47c..cf037c870e3b 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2203,15 +2203,14 @@ static inline int skb_add_data_nocache(struct sock *sk, struct sk_buff *skb, return err; } -static inline int skb_copy_to_page_nocache(struct sock *sk, struct iov_iter *from, +static inline int skb_copy_to_frag_nocache(struct sock *sk, + struct iov_iter *from, struct sk_buff *skb, - struct page *page, - int off, int copy) + char *va, int copy) { int err; - err = skb_do_copy_data_nocache(sk, skb, from, page_address(page) + off, - copy, skb->len); + err = skb_do_copy_data_nocache(sk, skb, from, va, copy, skb->len); if (err) return err; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 0d704bda6c41..0fbf1e222cda 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1219,10 +1219,9 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (!copy) goto wait_for_space; - err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, - pfrag->page, - pfrag->offset, - copy); + err = skb_copy_to_frag_nocache(sk, &msg->msg_iter, skb, + page_address(pfrag->page) + + pfrag->offset, copy); if (err) goto do_error; diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 24aec295a51c..94719d4af5fa 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -856,10 +856,9 @@ static int kcm_sendmsg(struct socket *sock, 
struct msghdr *msg, size_t len) if (!sk_wmem_schedule(sk, copy)) goto wait_for_memory; - err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb, - pfrag->page, - pfrag->offset, - copy); + err = skb_copy_to_frag_nocache(sk, &msg->msg_iter, skb, + page_address(pfrag->page) + + pfrag->offset, copy); if (err) goto out_error; From patchwork Thu Nov 14 12:15:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yunsheng Lin X-Patchwork-Id: 13875060 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8DE8D65C7E for ; Thu, 14 Nov 2024 12:22:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 7CC716B008A; Thu, 14 Nov 2024 07:22:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 77BD46B008C; Thu, 14 Nov 2024 07:22:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 61D066B0092; Thu, 14 Nov 2024 07:22:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 46DD86B008A for ; Thu, 14 Nov 2024 07:22:50 -0500 (EST) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id E5798121143 for ; Thu, 14 Nov 2024 12:22:49 +0000 (UTC) X-FDA: 82784613240.12.90B8D73 Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [45.249.212.190]) by imf29.hostedemail.com (Postfix) with ESMTP id A2E94120024 for ; Thu, 14 Nov 2024 12:21:47 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=none; spf=pass (imf29.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.190 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com; dmarc=pass 
(policy=quarantine) header.from=huawei.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731586792; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VyfAy/0F+66WfZkRUvQ5hPfL76WheTY9mTIz1fIBfxM=; b=KHGnm708L56ut3Hs6nNfqGljFga2D6j4VWWvySKr4fSzNGZj0Aw2O9vzKfWd0uV/AJSax1 VXSzyCSshPHgQspVzWkMeZCi/cntxzn9aTeFGq6OkG0dRypJ2p5YpOLeqNmSu6n9ni6pAa 25x9jRuqdEbeCvOzRvP2tF3nufBuKuM= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=none; spf=pass (imf29.hostedemail.com: domain of linyunsheng@huawei.com designates 45.249.212.190 as permitted sender) smtp.mailfrom=linyunsheng@huawei.com; dmarc=pass (policy=quarantine) header.from=huawei.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731586792; a=rsa-sha256; cv=none; b=ctWE/TDc2El7fLi3ww1smhFfFHcUjZjCIzZOxLWgEe5EK+qyWcnXWPqJFv1M6zvPSpSYDp blrxxZYSfobKG143kBxcBT/EeAX8molrE56lYf1YdHqDHshHt+pNbwjNRv7ACRPeGPMP2A 00A5tTLPSwFpv5R0ayRy8pkFuyxTpmQ= Received: from mail.maildlp.com (unknown [172.19.163.17]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4Xpzl866ddz2Dgwf; Thu, 14 Nov 2024 20:20:52 +0800 (CST) Received: from dggpemf200006.china.huawei.com (unknown [7.185.36.61]) by mail.maildlp.com (Postfix) with ESMTPS id 3F9491A0188; Thu, 14 Nov 2024 20:22:44 +0800 (CST) Received: from localhost.localdomain (10.90.30.45) by dggpemf200006.china.huawei.com (7.185.36.61) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Thu, 14 Nov 2024 20:22:43 +0800 From: Yunsheng Lin To: , , CC: , , Yunsheng Lin , Alexander Duyck , Andrew Morton , Linux-MM , Jonathan Corbet , Subject: [PATCH net-next v1 03/10] mm: page_frag: update documentation for page_frag Date: Thu, 14 Nov 2024 20:15:58 +0800 Message-ID: 
<20241114121606.3434517-4-linyunsheng@huawei.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com> References: <20241114121606.3434517-1-linyunsheng@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.90.30.45] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To dggpemf200006.china.huawei.com (7.185.36.61) X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: A2E94120024 X-Stat-Signature: 3i4hjgr3b5deq3tiyd1jb7bgfchtnedo X-Rspam-User: X-HE-Tag: 1731586907-778236 X-HE-Meta: U2FsdGVkX1/mQ2qsrKDLR/osQxbgnHDnIiciPIpf5MIlcc7CUqh/IrdkMx8y7CPk827Vq/hVMgwwpnp6dU5oO99cM+jwUaJJmE3LAm9+w/Kp/GGsy0jttBYx5kTAPD2qMPw+WIXNaMPU9DkmpNdL91GydQJl2A1t3DIVwf36w5TNH3kqaGqi1LzkLSjs8G+VvPynTxWjWzTIAeExsU7kdGjTmXkwfuGMclyyTHgYF+c+xeEFdkUryygivGxYj4sKnyVlsAFTDjfrWvczTwoAAARD73hZx3X2s9vqnqJ+V5qoN4HrD8fYvszVFOxiH7r6RUsSeya7PvaMvvU9aGis3J24D5G3PLYGNfYEWWApN2DvYYeTd9NmgmQ+BGbzUoiIkWyQPhp6vBflubGPvs14x6exdEjdyclyrzuoDDgomiO/RSgyd9z4XX1Aq2GgFIhHY26QfGGU3oOQbjvTdOWQbBwHGQwV4YCuv9PfP6MtxtsbdstZNZTkzA7LOzmoUE2KKHFNS6Ssi1J/I+qzAACUXt7r4GNnKnWtFArGJmQozrTBG3GpU/Xc1kAUZmDRWYumbW8hUwBSr3dOwMlkStaxJmGJ3HHX8wnXrzmMyLVCygSiRoBPZXHFLhmGeGJJC6Oip5mWMi5DfbR8gBk6jcSN6S5idGodZkwcGq/ynba6S9IE3DJOhmhf2URaDM5EsJaGcz0TZpc9/t4vZK7hXgWoBuZTT+Bvm0dyyLvl/PSXmBuERe1pqlLrstPQBfgNnN+lWuvB3qcbQgD+Hq4xaEJVQLcCVIMv8YtslR6zHijQCj8jB7Tjqo1rB8yFiz5Mft1cxHvNpqO+bghRRM3WPvJwEvkh9S4Fxk22RTrD/4wuR/Wio03gCIL6YHlTdvahXPWOSgfedeVwns/Xlk+ec3IzHc6nocEUaScsDmPiZnrEPBeLmNV04knfTTOQQ2H7iS21ExkWPvmNRae5xvsBAOc MXT5Y9PC DGGYgdnhfWXq80phzlbkVniYj6XIvSy6bKuRGEO5EAxJs8DvQMASNGQewsn0XMmOKviWQ8aXBr39r6FRiXEjREQdIpM/7lkS0CrfKqCX03WvdltlzOdkqjFQAqShS7yawDcVj0ALj0iORUw7zfyWZU7aWuSd6IoxfvAY3PUrwLp131xOEDXTibirXdYa7CvqEqLXzftjIQ7t3UXiTifm22Uw300QmGRNW2EpXuUGQp2WxMEDNT5l+CQJZyf9zZQ2KuGqwfh+vXEJant2zdKGg4bgHw07mkydJwehQ7L5pvOl730FSf8F6tMBFsz7UmNZDOqC7VHlVcz5f4OHaBUF0BdgCeYNz7a/PtypEQOZOqvJAZhJxzw6ZgwAmw7bxvn3uFB7posoREiVCr0PlF3zUOqXfdQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, 
version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Update documentation about design, implementation and API usages for page_frag. CC: Alexander Duyck CC: Andrew Morton CC: Linux-MM Signed-off-by: Yunsheng Lin --- Documentation/mm/page_frags.rst | 110 +++++++++++++++++++++++++++++++- include/linux/page_frag_cache.h | 54 ++++++++++++++++ mm/page_frag_cache.c | 12 +++- 3 files changed, 173 insertions(+), 3 deletions(-) diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst index 503ca6cdb804..34e654c2956e 100644 --- a/Documentation/mm/page_frags.rst +++ b/Documentation/mm/page_frags.rst @@ -1,3 +1,5 @@ +.. SPDX-License-Identifier: GPL-2.0 + ============== Page fragments ============== @@ -40,4 +42,110 @@ page via a single call. The advantage to doing this is that it allows for cleaning up the multiple references that were added to a page in order to avoid calling get_page per allocation. -Alexander Duyck, Nov 29, 2016. + +Architecture overview +===================== + +.. 
code-block:: none + + +----------------------+ + | page_frag API caller | + +----------------------+ + | + | + v + +------------------------------------------------------------------+ + | request page fragment | + +------------------------------------------------------------------+ + | | | + | | | + | Cache not enough | + | | | + | +-----------------+ | + | | reuse old cache |--Usable-->| + | +-----------------+ | + | | | + | Not usable | + | | | + | v | + Cache empty +-----------------+ | + | | drain old cache | | + | +-----------------+ | + | | | + v_________________________________v | + | | + | | + _________________v_______________ | + | | Cache is enough + | | | + PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE | | + | | | + | PAGE_SIZE >= PAGE_FRAG_CACHE_MAX_SIZE | + v | | + +----------------------------------+ | | + | refill cache with order > 0 page | | | + +----------------------------------+ | | + | | | | + | | | | + | Refill failed | | + | | | | + | v v | + | +------------------------------------+ | + | | refill cache with order 0 page | | + | +----------------------------------=-+ | + | | | + Refill succeed | | + | Refill succeed | + | | | + v v v + +------------------------------------------------------------------+ + | allocate fragment from cache | + +------------------------------------------------------------------+ + +API interface +============= + +Depending on different aligning requirement, the page_frag API caller may call +page_frag_*_align*() to ensure the returned virtual address or offset of the +page is aligned according to the 'align/alignment' parameter. Note the size of +the allocated fragment is not aligned, the caller needs to provide an aligned +fragsz if there is an alignment requirement for the size of the fragment. + +.. kernel-doc:: include/linux/page_frag_cache.h + :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc + __page_frag_alloc_align page_frag_alloc_align page_frag_alloc + +.. 
kernel-doc:: mm/page_frag_cache.c + :identifiers: page_frag_cache_drain page_frag_free + +Coding examples +=============== + +Initialization and draining API +------------------------------- + +.. code-block:: c + + page_frag_cache_init(nc); + ... + page_frag_cache_drain(nc); + + +Allocation & freeing API +------------------------ + +.. code-block:: c + + void *va; + + va = page_frag_alloc_align(nc, size, gfp, align); + if (!va) + goto do_error; + + err = do_something(va, size); + if (err) + goto do_error; + + ... + + page_frag_free(va); diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 5ae97f93a0a1..a2b1127e8ac8 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -28,11 +28,28 @@ static inline bool encoded_page_decode_pfmemalloc(unsigned long encoded_page) return !!(encoded_page & PAGE_FRAG_CACHE_PFMEMALLOC_BIT); } +/** + * page_frag_cache_init() - Init page_frag cache. + * @nc: page_frag cache from which to init + * + * Inline helper to initialize the page_frag cache. + */ static inline void page_frag_cache_init(struct page_frag_cache *nc) { nc->encoded_page = 0; } +/** + * page_frag_cache_is_pfmemalloc() - Check for pfmemalloc. + * @nc: page_frag cache from which to check + * + * Check if the current page in page_frag cache is allocated from the pfmemalloc + * reserves. It has the same calling context expectation as the allocation API. + * + * Return: + * true if the current page in page_frag cache is allocated from the pfmemalloc + * reserves, otherwise return false. + */ static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc) { return encoded_page_decode_pfmemalloc(nc->encoded_page); @@ -57,6 +74,19 @@ static inline unsigned int __page_frag_cache_commit(struct page_frag_cache *nc, return __page_frag_cache_commit_noref(nc, pfrag, used_sz); } +/** + * __page_frag_alloc_align() - Allocate a page fragment with aligning + * requirement. 
+ * @nc: page_frag cache from which to allocate + * @fragsz: the requested fragment size + * @gfp_mask: the allocation gfp to use when cache need to be refilled + * @align_mask: the requested aligning requirement for the 'va' + * + * Allocate a page fragment from page_frag cache with aligning requirement. + * + * Return: + * Virtual address of the page fragment, otherwise return NULL. + */ static inline void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask, unsigned int align_mask) @@ -72,6 +102,19 @@ static inline void *__page_frag_alloc_align(struct page_frag_cache *nc, return va; } +/** + * page_frag_alloc_align() - Allocate a page fragment with aligning requirement. + * @nc: page_frag cache from which to allocate + * @fragsz: the requested fragment size + * @gfp_mask: the allocation gfp to use when cache needs to be refilled + * @align: the requested aligning requirement for the fragment + * + * WARN_ON_ONCE() checking for @align before allocating a page fragment from + * page_frag cache with aligning requirement. + * + * Return: + * virtual address of the page fragment, otherwise return NULL. + */ static inline void *page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask, unsigned int align) @@ -80,6 +123,17 @@ static inline void *page_frag_alloc_align(struct page_frag_cache *nc, return __page_frag_alloc_align(nc, fragsz, gfp_mask, -align); } +/** + * page_frag_alloc() - Allocate a page fragment. + * @nc: page_frag cache from which to allocate + * @fragsz: the requested fragment size + * @gfp_mask: the allocation gfp to use when cache need to be refilled + * + * Allocate a page fragment from page_frag cache. + * + * Return: + * virtual address of the page fragment, otherwise return NULL. 
+ */ static inline void *page_frag_alloc(struct page_frag_cache *nc, unsigned int fragsz, gfp_t gfp_mask) { diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c index f55d34cf7d43..d014130fb893 100644 --- a/mm/page_frag_cache.c +++ b/mm/page_frag_cache.c @@ -70,6 +70,10 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, return page; } +/** + * page_frag_cache_drain - Drain the current page from page_frag cache. + * @nc: page_frag cache from which to drain + */ void page_frag_cache_drain(struct page_frag_cache *nc) { if (!nc->encoded_page) @@ -182,8 +186,12 @@ void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, } EXPORT_SYMBOL(__page_frag_cache_prepare); -/* - * Frees a page fragment allocated out of either a compound or order 0 page. +/** + * page_frag_free - Free a page fragment. + * @addr: va of page fragment to be freed + * + * Free a page fragment allocated out of either a compound or order 0 page by + * virtual address. */ void page_frag_free(void *addr) { From patchwork Thu Nov 14 12:15:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yunsheng Lin X-Patchwork-Id: 13875061 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 93878D637CE for ; Thu, 14 Nov 2024 12:22:54 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E0D386B008C; Thu, 14 Nov 2024 07:22:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DBE2E6B0093; Thu, 14 Nov 2024 07:22:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C5F2E6B0095; Thu, 14 Nov 2024 07:22:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by 
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Jonathan Corbet
Subject: [PATCH net-next v1 04/10] mm: page_frag: introduce page_frag_alloc_abort() related API
Date: Thu, 14 Nov 2024 20:15:59 +0800
Message-ID: <20241114121606.3434517-5-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Some callers, such as tun_build_skb(), do not need the complicated prepare &
commit API. For them, add an abort API that undoes a page_frag_alloc_*() call
on an error path, where it is known that no one else has taken an extra
reference to the just-allocated fragment. Also add an abort_ref API that only
undoes the reference counting of the allocated fragment, for the case where
the fragment is already referenced by someone
else.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 Documentation/mm/page_frags.rst |  7 +++++--
 include/linux/page_frag_cache.h | 20 ++++++++++++++++++++
 mm/page_frag_cache.c            | 21 +++++++++++++++++++++
 3 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
index 34e654c2956e..339e641beb53 100644
--- a/Documentation/mm/page_frags.rst
+++ b/Documentation/mm/page_frags.rst
@@ -114,9 +114,10 @@ fragsz if there is an alignment requirement for the size of the fragment.
 .. kernel-doc:: include/linux/page_frag_cache.h
    :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc
                  __page_frag_alloc_align page_frag_alloc_align page_frag_alloc
+                 page_frag_alloc_abort
 
 .. kernel-doc:: mm/page_frag_cache.c
-   :identifiers: page_frag_cache_drain page_frag_free
+   :identifiers: page_frag_cache_drain page_frag_free page_frag_alloc_abort_ref
 
 Coding examples
 ===============
@@ -143,8 +144,10 @@ Allocation & freeing API
         goto do_error;
 
     err = do_something(va, size);
-    if (err)
+    if (err) {
+        page_frag_alloc_abort(nc, va, size);
         goto do_error;
+    }
 
     ...
 
diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index a2b1127e8ac8..c3347c97522c 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -141,5 +141,25 @@ static inline void *page_frag_alloc(struct page_frag_cache *nc,
 }
 
 void page_frag_free(void *addr);
+void page_frag_alloc_abort_ref(struct page_frag_cache *nc, void *va,
+			       unsigned int fragsz);
+
+/**
+ * page_frag_alloc_abort - Abort the page fragment allocation.
+ * @nc: page_frag cache to which the page fragment is aborted back
+ * @va: virtual address of page fragment to be aborted
+ * @fragsz: size of the page fragment to be aborted
+ *
+ * It is expected to be called from the same context as the allocation API.
+ * Mostly used for error handling cases to abort the fragment allocation knowing
+ * that no one else is taking extra reference to the just aborted fragment, so
+ * that the aborted fragment can be reused.
+ */
+static inline void page_frag_alloc_abort(struct page_frag_cache *nc, void *va,
+					 unsigned int fragsz)
+{
+	page_frag_alloc_abort_ref(nc, va, fragsz);
+	nc->offset -= fragsz;
+}
 
 #endif
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index d014130fb893..8c3cfdbe8c2b 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -201,3 +201,24 @@ void page_frag_free(void *addr)
 		free_unref_page(page, compound_order(page));
 }
 EXPORT_SYMBOL(page_frag_free);
+
+/**
+ * page_frag_alloc_abort_ref - Abort the reference of allocated fragment.
+ * @nc: page_frag cache to which the page fragment is aborted back
+ * @va: virtual address of page fragment to be aborted
+ * @fragsz: size of the page fragment to be aborted
+ *
+ * It is expected to be called from the same context as the allocation API.
+ * Mostly used for error handling cases to abort the reference of allocated
+ * fragment if the fragment has been referenced for other usages, to avoid the
+ * atomic operation of page_frag_free() API.
+ */
+void page_frag_alloc_abort_ref(struct page_frag_cache *nc, void *va,
+			       unsigned int fragsz)
+{
+	VM_BUG_ON(va + fragsz !=
+		  encoded_page_decode_virt(nc->encoded_page) + nc->offset);
+
+	nc->pagecnt_bias++;
+}
+EXPORT_SYMBOL(page_frag_alloc_abort_ref);

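The bookkeeping done by the two abort helpers in patch 04/10 can be illustrated with a small user-space sketch. This is not kernel code: `toy_frag_cache`, `toy_alloc()`, `toy_abort_ref()` and `toy_abort()` are hypothetical names, the 4096-byte buffer merely stands in for the cached page, and alignment and page refilling are left out. The sketch only mirrors what the diff shows: the abort_ref variant gives back one unit of pagecnt_bias, and the full abort additionally rewinds the offset by fragsz.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct page_frag_cache (hypothetical, user space only). */
struct toy_frag_cache {
	char buf[4096];			/* plays the role of the cached page */
	unsigned int offset;		/* next free byte inside buf */
	unsigned int pagecnt_bias;	/* references still owned by the cache */
};

/* Hand out fragsz bytes and one reference; NULL if nothing is left. */
static void *toy_alloc(struct toy_frag_cache *nc, unsigned int fragsz)
{
	void *va;

	if (!nc->pagecnt_bias || fragsz > sizeof(nc->buf) - nc->offset)
		return NULL;

	va = nc->buf + nc->offset;
	nc->offset += fragsz;
	nc->pagecnt_bias--;
	return va;
}

/* Give back only the reference; the bytes stay consumed. */
static void toy_abort_ref(struct toy_frag_cache *nc, void *va,
			  unsigned int fragsz)
{
	/* Only the most recent allocation may be aborted. */
	assert((char *)va + fragsz == nc->buf + nc->offset);
	nc->pagecnt_bias++;
}

/* Give back the reference and the bytes, so the space can be reused. */
static void toy_abort(struct toy_frag_cache *nc, void *va, unsigned int fragsz)
{
	toy_abort_ref(nc, va, fragsz);
	nc->offset -= fragsz;
}
```

In this model a caller would pair toy_alloc() with toy_abort() on an error path where nobody else holds the fragment, and with toy_abort_ref() when the bytes must stay reserved because something else already references them.
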
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Jonathan Corbet
Subject: [PATCH net-next v1 05/10] mm: page_frag: introduce refill prepare & commit API
Date: Thu, 14 Nov 2024 20:16:00 +0800
Message-ID: <20241114121606.3434517-6-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Currently page_frag only has an alloc API, which returns the virtual address
of a fragment of a specific size. Many use cases need a minimum amount of
memory in order to make forward progress, but perform better if more memory is
available, and expect to use the 'struct page' of the allocated fragment
directly instead of the virtual address.

Currently the skb_page_frag_refill() API is used for these use cases, but the
caller needs to know the internal details and access the data fields of
'struct page_frag' directly, and its implementation is similar to the one in
the mm subsystem.

To unify those two page_frag implementations, introduce a prepare API that
ensures the minimum memory requirement is satisfied and reports how much
memory is actually available to the caller. The caller then either calls the
commit API to report how much memory it actually used, or does not call it if
it decides not to use any memory.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 Documentation/mm/page_frags.rst |  43 ++++++++++++-
 include/linux/page_frag_cache.h | 110 ++++++++++++++++++++++++++++++++
 2 files changed, 152 insertions(+), 1 deletion(-)

diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
index 339e641beb53..4cfdbe7db55a 100644
--- a/Documentation/mm/page_frags.rst
+++ b/Documentation/mm/page_frags.rst
@@ -111,10 +111,18 @@ page is aligned according to the 'align/alignment' parameter. Note the size of
 the allocated fragment is not aligned, the caller needs to provide an aligned
 fragsz if there is an alignment requirement for the size of the fragment.
 
+There is a use case that needs minimum memory in order for forward progress, but
+more performant if more memory is available. By using the prepare and commit
+related API, the caller calls prepare API to request the minimum memory it
+needs and prepare API will return the maximum size of the fragment returned. The
+caller needs to either call the commit API to report how much memory it actually
+uses, or not do so if deciding to not use any memory.
+
 .. kernel-doc:: include/linux/page_frag_cache.h
    :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc
                  __page_frag_alloc_align page_frag_alloc_align page_frag_alloc
-                 page_frag_alloc_abort
+                 page_frag_alloc_abort __page_frag_refill_prepare_align
+                 page_frag_refill_prepare_align page_frag_refill_prepare
 
 .. kernel-doc:: mm/page_frag_cache.c
    :identifiers: page_frag_cache_drain page_frag_free page_frag_alloc_abort_ref
@@ -152,3 +160,36 @@ Allocation & freeing API
     ...
 
     page_frag_free(va);
+
+
+Refill Preparation & committing API
+-----------------------------------
+
+.. code-block:: c
+
+    struct page_frag page_frag, *pfrag;
+    bool merge = true;
+
+    pfrag = &page_frag;
+    if (!page_frag_refill_prepare(nc, 32U, pfrag, GFP_KERNEL))
+        goto wait_for_space;
+
+    copy = min_t(unsigned int, copy, pfrag->size);
+    if (!skb_can_coalesce(skb, i, pfrag->page, pfrag->offset)) {
+        if (i >= max_skb_frags)
+            goto new_segment;
+
+        merge = false;
+    }
+
+    copy = mem_schedule(copy);
+    if (!copy)
+        goto wait_for_space;
+
+    if (merge) {
+        skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+        page_frag_refill_commit_noref(nc, pfrag, copy);
+    } else {
+        skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy);
+        page_frag_refill_commit(nc, pfrag, copy);
+    }
diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index c3347c97522c..1e699334646a 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -140,6 +140,116 @@ static inline void *page_frag_alloc(struct page_frag_cache *nc,
 	return __page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u);
 }
 
+/**
+ * __page_frag_refill_prepare_align() - Prepare refilling a page_frag with
+ * aligning requirement.
+ * @nc: page_frag cache from which to refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ * @align_mask: the requested aligning requirement for the fragment
+ *
+ * Prepare refilling a page_frag from page_frag cache with aligning requirement.
+ *
+ * Return:
+ * True if prepare refilling succeeds, otherwise return false.
+ */
+static inline bool __page_frag_refill_prepare_align(struct page_frag_cache *nc,
+						    unsigned int fragsz,
+						    struct page_frag *pfrag,
+						    gfp_t gfp_mask,
+						    unsigned int align_mask)
+{
+	return !!__page_frag_cache_prepare(nc, fragsz, pfrag, gfp_mask,
+					   align_mask);
+}
+
+/**
+ * page_frag_refill_prepare_align() - Prepare refilling a page_frag with
+ * aligning requirement.
+ * @nc: page_frag cache from which to refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ * @align: the requested aligning requirement for the fragment
+ *
+ * WARN_ON_ONCE() checking for @align before prepare refilling a page_frag from
+ * page_frag cache with aligning requirement.
+ *
+ * Return:
+ * True if prepare refilling succeeds, otherwise return false.
+ */
+static inline bool page_frag_refill_prepare_align(struct page_frag_cache *nc,
+						  unsigned int fragsz,
+						  struct page_frag *pfrag,
+						  gfp_t gfp_mask,
+						  unsigned int align)
+{
+	WARN_ON_ONCE(!is_power_of_2(align));
+	return __page_frag_refill_prepare_align(nc, fragsz, pfrag, gfp_mask,
+						-align);
+}
+
+/**
+ * page_frag_refill_prepare() - Prepare refilling a page_frag.
+ * @nc: page_frag cache from which to refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ *
+ * Prepare refilling a page_frag from page_frag cache.
+ *
+ * Return:
+ * True if refill succeeds, otherwise return false.
+ */
+static inline bool page_frag_refill_prepare(struct page_frag_cache *nc,
+					    unsigned int fragsz,
+					    struct page_frag *pfrag,
+					    gfp_t gfp_mask)
+{
+	return __page_frag_refill_prepare_align(nc, fragsz, pfrag, gfp_mask,
+						~0u);
+}
+
+/**
+ * page_frag_refill_commit - Commit a prepare refilling.
+ * @nc: page_frag cache from which to commit
+ * @pfrag: the page_frag to be committed
+ * @used_sz: size of the page fragment that has been used
+ *
+ * Commit the actual used size for the refill that was prepared.
+ *
+ * Return:
+ * The true size of the fragment considering the offset alignment.
+ */
+static inline unsigned int page_frag_refill_commit(struct page_frag_cache *nc,
+						   struct page_frag *pfrag,
+						   unsigned int used_sz)
+{
+	return __page_frag_cache_commit(nc, pfrag, used_sz);
+}
+
+/**
+ * page_frag_refill_commit_noref - Commit a prepare refilling without taking
+ * refcount.
+ * @nc: page_frag cache from which to commit
+ * @pfrag: the page_frag to be committed
+ * @used_sz: size of the page fragment that has been used
+ *
+ * Commit the prepare refilling by passing the actual used size, but not taking
+ * refcount. Mostly used for fragment coalescing case when the current fragment
+ * can share the same refcount with previous fragment.
+ *
+ * Return:
+ * The true size of the fragment considering the offset alignment.
+ */
+static inline unsigned int
+page_frag_refill_commit_noref(struct page_frag_cache *nc,
+			      struct page_frag *pfrag, unsigned int used_sz)
+{
+	return __page_frag_cache_commit_noref(nc, pfrag, used_sz);
+}
+
 void page_frag_free(void *addr);
 void page_frag_alloc_abort_ref(struct page_frag_cache *nc, void *va,
 			       unsigned int fragsz);

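The prepare and commit flow described in patch 05/10 can be modelled in user space as below. This is only a sketch under stated assumptions: `toy_cache`, `toy_frag` and the `toy_refill_*()` functions are hypothetical names, and the 4096-byte buffer stands in for the cached page. Unlike the real API, the model simply fails when too little space is left instead of refilling the cache with a new page, and it ignores alignment, so commit returns exactly the size passed in. It shows the three properties the commit message relies on: prepare consumes nothing, commit consumes the used size and accounts for a reference, and the noref variant consumes the used size only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical); not the kernel structures. */
struct toy_cache {
	char buf[4096];		/* plays the role of the cached page */
	unsigned int offset;	/* bytes already committed */
	unsigned int refs;	/* references handed out so far */
};

struct toy_frag {
	unsigned int offset;	/* where the prepared region starts */
	unsigned int size;	/* how many bytes may be used */
};

/* Report the free region if at least min_sz bytes are left; consume nothing. */
static int toy_refill_prepare(const struct toy_cache *nc, unsigned int min_sz,
			      struct toy_frag *pfrag)
{
	unsigned int avail = sizeof(nc->buf) - nc->offset;

	if (avail < min_sz)
		return 0;

	pfrag->offset = nc->offset;
	pfrag->size = avail;
	return 1;
}

/* Consume used_sz bytes of the prepared region without a new reference. */
static unsigned int toy_refill_commit_noref(struct toy_cache *nc,
					    const struct toy_frag *pfrag,
					    unsigned int used_sz)
{
	assert(used_sz <= pfrag->size);
	nc->offset = pfrag->offset + used_sz;
	return used_sz;
}

/* Consume used_sz bytes and account for one new reference. */
static unsigned int toy_refill_commit(struct toy_cache *nc,
				      const struct toy_frag *pfrag,
				      unsigned int used_sz)
{
	nc->refs++;
	return toy_refill_commit_noref(nc, pfrag, used_sz);
}
```

A caller that decides not to use the prepared region simply skips both commit functions; in this model the cache is then left exactly as it was before the prepare call.
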
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Jonathan Corbet
Subject: [PATCH net-next v1 06/10] mm: page_frag: introduce alloc_refill prepare & commit API
Date: Thu, 14 Nov 2024 20:16:01 +0800
Message-ID: <20241114121606.3434517-7-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Currently the alloc related API returns the virtual address of the allocated
fragment, and the refill related API returns the page info of the allocated
fragment through 'struct page_frag'.

Some use cases need both the virtual address and the page info of the
allocated fragment. Introduce the alloc_refill API for those use cases.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 Documentation/mm/page_frags.rst | 45 +++++++++++++++++++++
 include/linux/page_frag_cache.h | 71 +++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+)

diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
index 4cfdbe7db55a..1c98f7090d92 100644
--- a/Documentation/mm/page_frags.rst
+++ b/Documentation/mm/page_frags.rst
@@ -111,6 +111,9 @@ page is aligned according to the 'align/alignment' parameter. Note the size of
 the allocated fragment is not aligned, the caller needs to provide an aligned
 fragsz if there is an alignment requirement for the size of the fragment.
 
+Depending on different use cases, callers expecting to deal with va, page or
+both va and page may call alloc, refill or alloc_refill API accordingly.
+
 There is a use case that needs minimum memory in order for forward progress, but
 more performant if more memory is available. By using the prepare and commit
 related API, the caller calls prepare API to request the minimum memory it
@@ -123,6 +126,9 @@ uses, or not do so if deciding to not use any memory.
                  __page_frag_alloc_align page_frag_alloc_align page_frag_alloc
                  page_frag_alloc_abort __page_frag_refill_prepare_align
                  page_frag_refill_prepare_align page_frag_refill_prepare
+                 __page_frag_alloc_refill_prepare_align
+                 page_frag_alloc_refill_prepare_align
+                 page_frag_alloc_refill_prepare
 
 .. kernel-doc:: mm/page_frag_cache.c
    :identifiers: page_frag_cache_drain page_frag_free page_frag_alloc_abort_ref
@@ -193,3 +199,42 @@ Refill Preparation & committing API
         skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy);
         page_frag_refill_commit(nc, pfrag, copy);
     }
+
+
+Alloc_Refill Preparation & committing API
+-----------------------------------------
+
+.. code-block:: c
+
+    struct page_frag page_frag, *pfrag;
+    bool merge = true;
+    void *va;
+
+    pfrag = &page_frag;
+    va = page_frag_alloc_refill_prepare(nc, 32U, pfrag, GFP_KERNEL);
+    if (!va)
+        goto wait_for_space;
+
+    copy = min_t(unsigned int, copy, pfrag->size);
+    if (!skb_can_coalesce(skb, i, pfrag->page, pfrag->offset)) {
+        if (i >= max_skb_frags)
+            goto new_segment;
+
+        merge = false;
+    }
+
+    copy = mem_schedule(copy);
+    if (!copy)
+        goto wait_for_space;
+
+    err = copy_from_iter_full_nocache(va, copy, iter);
+    if (err)
+        goto do_error;
+
+    if (merge) {
+        skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+        page_frag_refill_commit_noref(nc, pfrag, copy);
+    } else {
+        skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy);
+        page_frag_refill_commit(nc, pfrag, copy);
+    }
diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index 1e699334646a..329390afbe78 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -211,6 +211,77 @@ static inline bool page_frag_refill_prepare(struct page_frag_cache *nc,
 						~0u);
 }
 
+/**
+ * __page_frag_alloc_refill_prepare_align() - Prepare allocating a fragment and
+ * refilling a page_frag with aligning requirement.
+ * @nc: page_frag cache from which to allocate and refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ * @align_mask: the requested aligning requirement for the fragment.
+ *
+ * Prepare allocating a fragment and refilling a page_frag from page_frag cache.
+ *
+ * Return:
+ * virtual address of the page fragment, otherwise return NULL.
+ */
+static inline void
+*__page_frag_alloc_refill_prepare_align(struct page_frag_cache *nc,
+					unsigned int fragsz,
+					struct page_frag *pfrag,
+					gfp_t gfp_mask, unsigned int align_mask)
+{
+	return __page_frag_cache_prepare(nc, fragsz, pfrag, gfp_mask, align_mask);
+}
+
+/**
+ * page_frag_alloc_refill_prepare_align() - Prepare allocating a fragment and
+ * refilling a page_frag with aligning requirement.
+ * @nc: page_frag cache from which to allocate and refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ * @align: the requested aligning requirement for the fragment.
+ *
+ * WARN_ON_ONCE() checking for @align before prepare allocating a fragment and
+ * refilling a page_frag from page_frag cache.
+ *
+ * Return:
+ * virtual address of the page fragment, otherwise return NULL.
+ */
+static inline void
+*page_frag_alloc_refill_prepare_align(struct page_frag_cache *nc,
+				      unsigned int fragsz,
+				      struct page_frag *pfrag, gfp_t gfp_mask,
+				      unsigned int align)
+{
+	WARN_ON_ONCE(!is_power_of_2(align));
+	return __page_frag_alloc_refill_prepare_align(nc, fragsz, pfrag,
+						      gfp_mask, -align);
+}
+
+/**
+ * page_frag_alloc_refill_prepare() - Prepare allocating a fragment and
+ * refilling a page_frag.
+ * @nc: page_frag cache from which to allocate and refill
+ * @fragsz: the requested fragment size
+ * @pfrag: the page_frag to be refilled.
+ * @gfp_mask: the allocation gfp to use when cache needs to be refilled
+ *
+ * Prepare allocating a fragment and refilling a page_frag from page_frag cache.
+ *
+ * Return:
+ * virtual address of the page fragment, otherwise return NULL.
+ */ +static inline void *page_frag_alloc_refill_prepare(struct page_frag_cache *nc, + unsigned int fragsz, + struct page_frag *pfrag, + gfp_t gfp_mask) +{ + return __page_frag_alloc_refill_prepare_align(nc, fragsz, pfrag, + gfp_mask, ~0u); +} + /** * page_frag_refill_commit - Commit a prepare refilling. * @nc: page_frag cache from which to commit
From patchwork Thu Nov 14 12:16:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13875064
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Jonathan Corbet
Subject: [PATCH net-next v1 07/10] mm: page_frag: introduce probe related API
Date: Thu, 14 Nov 2024 20:16:02 +0800
Message-ID: <20241114121606.3434517-8-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Some use cases may need a bigger fragment if the current fragment can't be coalesced with the previous fragment, because a new fragment may need more space for some header. So introduce a probe related API to tell whether the cache has the minimum remaining memory needed for coalescing to the previous fragment, in order to save as much memory as possible.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 Documentation/mm/page_frags.rst | 10 +++++++-
 include/linux/page_frag_cache.h | 41 +++++++++++++++++++++++++++++++++
 mm/page_frag_cache.c | 35 ++++++++++++++++++++++++++++
 3 files changed, 85 insertions(+), 1 deletion(-)

diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst index 1c98f7090d92..3e34831a0029 100644 --- a/Documentation/mm/page_frags.rst +++ b/Documentation/mm/page_frags.rst @@ -119,7 +119,13 @@ more performant if more memory is available. By using the prepare and commit related API, the caller calls prepare API to requests the minimum memory it needs and prepare API will return the maximum size of the fragment returned. The caller needs to either call the commit API to report how much memory it actually -uses, or not do so if deciding to not use any memory. +uses, or not do so if deciding to not use any memory. Some use cases may need a +bigger fragment if the current fragment can't be coalesced with the previous +fragment, because a new fragment may need more space for some header. The probe +related API can be used to tell whether the cache has the minimum remaining +memory needed for coalescing to the previous fragment, in order to save as much +memory as possible. + ..
kernel-doc:: include/linux/page_frag_cache.h :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc @@ -129,9 +135,11 @@ uses, or not do so if deciding to not use any memory. __page_frag_alloc_refill_prepare_align page_frag_alloc_refill_prepare_align page_frag_alloc_refill_prepare + page_frag_alloc_refill_probe page_frag_refill_probe .. kernel-doc:: mm/page_frag_cache.c :identifiers: page_frag_cache_drain page_frag_free page_frag_alloc_abort_ref + __page_frag_alloc_refill_probe_align Coding examples =============== diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h index 329390afbe78..0f7e8da91a67 100644 --- a/include/linux/page_frag_cache.h +++ b/include/linux/page_frag_cache.h @@ -63,6 +63,10 @@ void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc, struct page_frag *pfrag, unsigned int used_sz); +void *__page_frag_alloc_refill_probe_align(struct page_frag_cache *nc, + unsigned int fragsz, + struct page_frag *pfrag, + unsigned int align_mask); static inline unsigned int __page_frag_cache_commit(struct page_frag_cache *nc, struct page_frag *pfrag, @@ -282,6 +286,43 @@ static inline void *page_frag_alloc_refill_prepare(struct page_frag_cache *nc, gfp_mask, ~0u); } +/** + * page_frag_alloc_refill_probe() - Probe allocating a fragment and refilling + * a page_frag. + * @nc: page_frag cache from which to allocate and refill + * @fragsz: the requested fragment size + * @pfrag: the page_frag to be refilled + * + * Probe allocating a fragment and refilling a page_frag from page_frag cache. + * + * Return: + * virtual address of the page fragment, otherwise return NULL. 
+ */ +static inline void *page_frag_alloc_refill_probe(struct page_frag_cache *nc, + unsigned int fragsz, + struct page_frag *pfrag) +{ + return __page_frag_alloc_refill_probe_align(nc, fragsz, pfrag, ~0u); +} + +/** + * page_frag_refill_probe() - Probe refilling a page_frag. + * @nc: page_frag cache from which to refill + * @fragsz: the requested fragment size + * @pfrag: the page_frag to be refilled + * + * Probe refilling a page_frag from page_frag cache. + * + * Return: + * True if refill succeeds, otherwise return false. + */ +static inline bool page_frag_refill_probe(struct page_frag_cache *nc, + unsigned int fragsz, + struct page_frag *pfrag) +{ + return !!page_frag_alloc_refill_probe(nc, fragsz, pfrag); +} + /** * page_frag_refill_commit - Commit a prepare refilling. * @nc: page_frag cache from which to commit diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c index 8c3cfdbe8c2b..ae40520d452a 100644 --- a/mm/page_frag_cache.c +++ b/mm/page_frag_cache.c @@ -116,6 +116,41 @@ unsigned int __page_frag_cache_commit_noref(struct page_frag_cache *nc, } EXPORT_SYMBOL(__page_frag_cache_commit_noref); +/** + * __page_frag_alloc_refill_probe_align() - Probe allocating a fragment and + * refilling a page_frag with aligning requirement. + * @nc: page_frag cache from which to allocate and refill + * @fragsz: the requested fragment size + * @pfrag: the page_frag to be refilled. + * @align_mask: the requested aligning requirement for the fragment. + * + * Probe allocating a fragment and refilling a page_frag from page_frag cache + * with aligning requirement. + * + * Return: + * virtual address of the page fragment, otherwise return NULL. 
+ */ +void *__page_frag_alloc_refill_probe_align(struct page_frag_cache *nc, + unsigned int fragsz, + struct page_frag *pfrag, + unsigned int align_mask) +{ + unsigned long encoded_page = nc->encoded_page; + unsigned int size, offset; + + size = PAGE_SIZE << encoded_page_decode_order(encoded_page); + offset = __ALIGN_KERNEL_MASK(nc->offset, ~align_mask); + if (unlikely(!encoded_page || offset + fragsz > size)) + return NULL; + + pfrag->page = encoded_page_decode_page(encoded_page); + pfrag->size = size - offset; + pfrag->offset = offset; + + return encoded_page_decode_virt(encoded_page) + offset; +} +EXPORT_SYMBOL(__page_frag_alloc_refill_probe_align); + void *__page_frag_cache_prepare(struct page_frag_cache *nc, unsigned int fragsz, struct page_frag *pfrag, gfp_t gfp_mask, unsigned int align_mask)
From patchwork Thu Nov 14 12:16:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13875065
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Shuah Khan
Subject: [PATCH net-next v1 08/10] mm: page_frag: add testing for the newly added API
Date: Thu, 14 Nov 2024 20:16:03 +0800
Message-ID: <20241114121606.3434517-9-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Add testing for the newly added prepare API, for both the aligned and non-aligned variants. The probe API is also tested along with the prepare API.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 .../selftests/mm/page_frag/page_frag_test.c | 76 +++++++++++++++++--
 tools/testing/selftests/mm/run_vmtests.sh | 4 +
 tools/testing/selftests/mm/test_page_frag.sh | 27 +++++++
 3 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/mm/page_frag/page_frag_test.c b/tools/testing/selftests/mm/page_frag/page_frag_test.c index e806c1866e36..3b3c32389def 100644 --- a/tools/testing/selftests/mm/page_frag/page_frag_test.c +++ b/tools/testing/selftests/mm/page_frag/page_frag_test.c @@ -32,6 +32,10 @@ static bool test_align; module_param(test_align, bool, 0); MODULE_PARM_DESC(test_align, "use align API for testing"); +static bool test_prepare; +module_param(test_prepare, bool, 0); +MODULE_PARM_DESC(test_prepare, "use prepare API for testing"); + static int test_alloc_len = 2048; module_param(test_alloc_len, int, 0); MODULE_PARM_DESC(test_alloc_len, "alloc len for testing"); @@ -74,6 +78,21 @@ static int page_frag_pop_thread(void *arg) return 0; } +static void frag_frag_test_commit(struct page_frag_cache *nc, + struct page_frag *prepare_pfrag, + struct page_frag *probe_pfrag, + unsigned int used_sz) +{ + if (prepare_pfrag->page !=
probe_pfrag->page || + prepare_pfrag->offset != probe_pfrag->offset || + prepare_pfrag->size != probe_pfrag->size) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong probed info\n"); + } + + page_frag_refill_commit(nc, prepare_pfrag, used_sz); +} + static int page_frag_push_thread(void *arg) { struct ptr_ring *ring = arg; @@ -86,15 +105,61 @@ static int page_frag_push_thread(void *arg) int ret; if (test_align) { - va = page_frag_alloc_align(&test_nc, test_alloc_len, - GFP_KERNEL, SMP_CACHE_BYTES); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare_align(&test_nc, + test_alloc_len, + &prepare_frag, + GFP_KERNEL, + SMP_CACHE_BYTES); + + probe_va = __page_frag_alloc_refill_probe_align(&test_nc, + test_alloc_len, + &probe_frag, + -SMP_CACHE_BYTES); + if (va != probe_va) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong va\n"); + } + + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc_align(&test_nc, + test_alloc_len, + GFP_KERNEL, + SMP_CACHE_BYTES); + } if ((unsigned long)va & (SMP_CACHE_BYTES - 1)) { force_exit = true; WARN_ONCE(true, TEST_FAILED_PREFIX "unaligned va returned\n"); } } else { - va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + if (test_prepare) { + struct page_frag prepare_frag, probe_frag; + void *probe_va; + + va = page_frag_alloc_refill_prepare(&test_nc, test_alloc_len, + &prepare_frag, GFP_KERNEL); + + probe_va = page_frag_alloc_refill_probe(&test_nc, test_alloc_len, + &probe_frag); + + if (va != probe_va) { + force_exit = true; + WARN_ONCE(true, TEST_FAILED_PREFIX "wrong va\n"); + } + + if (likely(va)) + frag_frag_test_commit(&test_nc, &prepare_frag, + &probe_frag, test_alloc_len); + } else { + va = page_frag_alloc(&test_nc, test_alloc_len, GFP_KERNEL); + } } if (!va) @@ -176,8 +241,9 @@ static int __init page_frag_test_init(void) } duration = 
(u64)ktime_us_delta(ktime_get(), start); - pr_info("%d of iterations for %s testing took: %lluus\n", nr_test, - test_align ? "aligned" : "non-aligned", duration); + pr_info("%d of iterations for %s %s API testing took: %lluus\n", nr_test, + test_align ? "aligned" : "non-aligned", + test_prepare ? "prepare" : "alloc", duration); out: ptr_ring_cleanup(&ptr_ring, NULL); diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index 2c5394584af4..f6ff9080a6f2 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -464,6 +464,10 @@ CATEGORY="page_frag" run_test ./test_page_frag.sh aligned CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned +CATEGORY="page_frag" run_test ./test_page_frag.sh aligned_prepare + +CATEGORY="page_frag" run_test ./test_page_frag.sh nonaligned_prepare + echo "SUMMARY: PASS=${count_pass} SKIP=${count_skip} FAIL=${count_fail}" | tap_prefix echo "1..${count_total}" | tap_output diff --git a/tools/testing/selftests/mm/test_page_frag.sh b/tools/testing/selftests/mm/test_page_frag.sh index f55b105084cf..1c757fd11844 100755 --- a/tools/testing/selftests/mm/test_page_frag.sh +++ b/tools/testing/selftests/mm/test_page_frag.sh @@ -43,6 +43,8 @@ check_test_failed_prefix() { SMOKE_PARAM="test_push_cpu=$TEST_CPU_0 test_pop_cpu=$TEST_CPU_1" NONALIGNED_PARAM="$SMOKE_PARAM test_alloc_len=75 nr_test=$NR_TEST" ALIGNED_PARAM="$NONALIGNED_PARAM test_align=1" +NONALIGNED_PREPARE_PARAM="$NONALIGNED_PARAM test_prepare=1" +ALIGNED_PREPARE_PARAM="$ALIGNED_PARAM test_prepare=1" check_test_requirements() { @@ -77,6 +79,20 @@ run_aligned_check() insmod $DRIVER $ALIGNED_PARAM > /dev/null 2>&1 } +run_nonaligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast nonaligned prepare API is." 
+ + insmod $DRIVER $NONALIGNED_PREPARE_PARAM > /dev/null 2>&1 +} + +run_aligned_prepare_check() +{ + echo "Run performance tests to evaluate how fast aligned prepare API is." + + insmod $DRIVER $ALIGNED_PREPARE_PARAM > /dev/null 2>&1 +} + run_smoke_check() { echo "Run smoke test." @@ -87,6 +103,7 @@ run_smoke_check() usage() { echo -n "Usage: $0 [ aligned ] | [ nonaligned ] | | [ smoke ] | " + echo "[ aligned_prepare ] | [ nonaligned_prepare ] | " echo "manual parameters" echo echo "Valid tests and parameters:" @@ -107,6 +124,12 @@ usage() echo "# Performance testing for aligned alloc API" echo "$0 aligned" echo + echo "# Performance testing for nonaligned prepare API" + echo "$0 nonaligned_prepare" + echo + echo "# Performance testing for aligned prepare API" + echo "$0 aligned_prepare" + echo exit 0 } @@ -158,6 +181,10 @@ function run_test() run_nonaligned_check elif [[ "$1" = "aligned" ]]; then run_aligned_check + elif [[ "$1" = "nonaligned_prepare" ]]; then + run_nonaligned_prepare_check + elif [[ "$1" = "aligned_prepare" ]]; then + run_aligned_prepare_check else run_manual_check $@ fi
From patchwork Thu Nov 14 12:16:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13875067
From: Yunsheng Lin
CC: Yunsheng Lin, Alexander Duyck, Andrew Morton, Linux-MM, Ayush Sawal, Andrew Lunn, Eric Dumazet, Willem de Bruijn, Jason Wang, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Simon Horman, John Fastabend, Jakub Sitnicki, David Ahern, Matthieu Baerts, Mat Martineau, Geliang Tang, Boris Pismenny
Subject: [PATCH net-next v1 09/10] net: replace page_frag with page_frag_cache
Date: Thu, 14 Nov 2024 20:16:04 +0800
Message-ID: <20241114121606.3434517-10-linyunsheng@huawei.com>
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>

Use the newly introduced prepare/probe/commit API to replace page_frag with page_frag_cache for sk_page_frag().
CC: Alexander Duyck CC: Andrew Morton CC: Linux-MM Signed-off-by: Yunsheng Lin --- .../chelsio/inline_crypto/chtls/chtls.h | 3 - .../chelsio/inline_crypto/chtls/chtls_io.c | 101 +++++------------- .../chelsio/inline_crypto/chtls/chtls_main.c | 3 - drivers/net/tun.c | 47 ++++---- include/linux/sched.h | 2 +- include/net/sock.h | 21 ++-- kernel/exit.c | 3 +- kernel/fork.c | 3 +- net/core/skbuff.c | 58 +++++----- net/core/skmsg.c | 12 ++- net/core/sock.c | 32 ++++-- net/ipv4/ip_output.c | 28 +++-- net/ipv4/tcp.c | 23 ++-- net/ipv4/tcp_output.c | 25 +++-- net/ipv6/ip6_output.c | 28 +++-- net/kcm/kcmsock.c | 18 ++-- net/mptcp/protocol.c | 47 ++++---- net/tls/tls_device.c | 100 ++++++++++------- 18 files changed, 293 insertions(+), 261 deletions(-) diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h index 21e0dfeff158..85ce0b2f1f3f 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h @@ -234,7 +234,6 @@ struct chtls_dev { struct list_head list_node; struct list_head rcu_node; struct list_head na_node; - unsigned int send_page_order; int max_host_sndbuf; u32 round_robin_cnt; struct key_map kmap; @@ -453,8 +452,6 @@ enum { /* The ULP mode/submode of an skbuff */ #define skb_ulp_mode(skb) (ULP_SKB_CB(skb)->ulp_mode) -#define TCP_PAGE(sk) (sk->sk_frag.page) -#define TCP_OFF(sk) (sk->sk_frag.offset) static inline struct chtls_dev *to_chtls_dev(struct tls_toe_device *tlsdev) { diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c index d567e42e1760..7b1760ab55ba 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c @@ -825,12 +825,6 @@ void skb_entail(struct sock *sk, struct sk_buff *skb, int flags) ULP_SKB_CB(skb)->flags = flags; __skb_queue_tail(&csk->txq, 
skb); sk->sk_wmem_queued += skb->truesize; - - if (TCP_PAGE(sk) && TCP_OFF(sk)) { - put_page(TCP_PAGE(sk)); - TCP_PAGE(sk) = NULL; - TCP_OFF(sk) = 0; - } } static struct sk_buff *get_tx_skb(struct sock *sk, int size) @@ -882,16 +876,12 @@ static void push_frames_if_head(struct sock *sk) chtls_push_frames(csk, 1); } -static int chtls_skb_copy_to_page_nocache(struct sock *sk, - struct iov_iter *from, - struct sk_buff *skb, - struct page *page, - int off, int copy) +static int chtls_skb_copy_to_va_nocache(struct sock *sk, struct iov_iter *from, + struct sk_buff *skb, char *va, int copy) { int err; - err = skb_do_copy_data_nocache(sk, skb, from, page_address(page) + - off, copy, skb->len); + err = skb_do_copy_data_nocache(sk, skb, from, va, copy, skb->len); if (err) return err; @@ -1114,82 +1104,45 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size) if (err) goto do_fault; } else { + struct page_frag_cache *nc = &sk->sk_frag; + struct page_frag page_frag, *pfrag; int i = skb_shinfo(skb)->nr_frags; - struct page *page = TCP_PAGE(sk); - int pg_size = PAGE_SIZE; - int off = TCP_OFF(sk); - bool merge; - - if (page) - pg_size = page_size(page); - if (off < pg_size && - skb_can_coalesce(skb, i, page, off)) { + bool merge = false; + void *va; + + pfrag = &page_frag; + va = page_frag_alloc_refill_prepare(nc, 32U, pfrag, + sk->sk_allocation); + if (unlikely(!va)) + goto wait_for_memory; + + if (skb_can_coalesce(skb, i, pfrag->page, + pfrag->offset)) merge = true; - goto copy; - } - merge = false; - if (i == (is_tls_tx(csk) ? (MAX_SKB_FRAGS - 1) : - MAX_SKB_FRAGS)) + else if (i == (is_tls_tx(csk) ? 
(MAX_SKB_FRAGS - 1) : + MAX_SKB_FRAGS)) goto new_buf; - if (page && off == pg_size) { - put_page(page); - TCP_PAGE(sk) = page = NULL; - pg_size = PAGE_SIZE; - } - - if (!page) { - gfp_t gfp = sk->sk_allocation; - int order = cdev->send_page_order; - - if (order) { - page = alloc_pages(gfp | __GFP_COMP | - __GFP_NOWARN | - __GFP_NORETRY, - order); - if (page) - pg_size <<= order; - } - if (!page) { - page = alloc_page(gfp); - pg_size = PAGE_SIZE; - } - if (!page) - goto wait_for_memory; - off = 0; - } -copy: - if (copy > pg_size - off) - copy = pg_size - off; + copy = min_t(int, copy, pfrag->size); if (is_tls_tx(csk)) copy = min_t(int, copy, csk->tlshws.txleft); - err = chtls_skb_copy_to_page_nocache(sk, &msg->msg_iter, - skb, page, - off, copy); - if (unlikely(err)) { - if (!TCP_PAGE(sk)) { - TCP_PAGE(sk) = page; - TCP_OFF(sk) = 0; - } + err = chtls_skb_copy_to_va_nocache(sk, &msg->msg_iter, + skb, va, copy); + if (unlikely(err)) goto do_fault; - } + /* Update the skb. */ if (merge) { skb_frag_size_add( &skb_shinfo(skb)->frags[i - 1], copy); + page_frag_refill_commit_noref(nc, pfrag, copy); } else { - skb_fill_page_desc(skb, i, page, off, copy); - if (off + copy < pg_size) { - /* space left keep page */ - get_page(page); - TCP_PAGE(sk) = page; - } else { - TCP_PAGE(sk) = NULL; - } + skb_fill_page_desc(skb, i, pfrag->page, + pfrag->offset, copy); + page_frag_refill_commit(nc, pfrag, copy); } - TCP_OFF(sk) = off + copy; } if (unlikely(skb->len == mss)) tx_skb_finalize(skb); diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c index 96fd31d75dfd..7284269174c5 100644 --- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c +++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_main.c @@ -34,7 +34,6 @@ static DEFINE_MUTEX(notify_mutex); static RAW_NOTIFIER_HEAD(listen_notify_list); static struct proto chtls_cpl_prot, chtls_cpl_protv6; struct request_sock_ops 
chtls_rsk_ops, chtls_rsk_opsv6; -static uint send_page_order = (14 - PAGE_SHIFT < 0) ? 0 : 14 - PAGE_SHIFT; static void register_listen_notifier(struct notifier_block *nb) { @@ -273,8 +272,6 @@ static void *chtls_uld_add(const struct cxgb4_lld_info *info) INIT_WORK(&cdev->deferq_task, process_deferq); spin_lock_init(&cdev->listen_lock); spin_lock_init(&cdev->idr_lock); - cdev->send_page_order = min_t(uint, get_order(32768), - send_page_order); cdev->max_host_sndbuf = 48 * 1024; if (lldi->vr->key.size) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index d7a865ef370b..4ca6590ef5fe 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1599,21 +1599,19 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile, } static struct sk_buff *__tun_build_skb(struct tun_file *tfile, - struct page_frag *alloc_frag, char *buf, - int buflen, int len, int pad) + char *buf, int buflen, int len, int pad) { struct sk_buff *skb = build_skb(buf, buflen); - if (!skb) + if (!skb) { + page_frag_free(buf); return ERR_PTR(-ENOMEM); + } skb_reserve(skb, pad); skb_put(skb, len); skb_set_owner_w(skb, tfile->socket.sk); - get_page(alloc_frag->page); - alloc_frag->offset += buflen; - return skb; } @@ -1661,8 +1659,8 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, struct virtio_net_hdr *hdr, int len, int *skb_xdp) { - struct page_frag *alloc_frag = &current->task_frag; struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; + struct page_frag_cache *nc = &current->task_frag; struct bpf_prog *xdp_prog; int buflen = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); char *buf; @@ -1677,16 +1675,16 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, buflen += SKB_DATA_ALIGN(len + pad); rcu_read_unlock(); - alloc_frag->offset = ALIGN((u64)alloc_frag->offset, SMP_CACHE_BYTES); - if (unlikely(!skb_page_frag_refill(buflen, alloc_frag, GFP_KERNEL))) + buf = page_frag_alloc_align(nc, buflen, GFP_KERNEL, + SMP_CACHE_BYTES); + if (unlikely(!buf)) return
ERR_PTR(-ENOMEM); - buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset; - copied = copy_page_from_iter(alloc_frag->page, - alloc_frag->offset + pad, - len, from); - if (copied != len) + copied = copy_from_iter(buf + pad, len, from); + if (copied != len) { + page_frag_alloc_abort(nc, buf, buflen); return ERR_PTR(-EFAULT); + } /* There's a small window that XDP may be set after the check * of xdp_prog above, this should be rare and for simplicity @@ -1694,8 +1692,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, */ if (hdr->gso_type || !xdp_prog) { *skb_xdp = 1; - return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, - pad); + return __tun_build_skb(tfile, buf, buflen, len, pad); } *skb_xdp = 0; @@ -1712,21 +1709,23 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, xdp_prepare_buff(&xdp, buf, pad, len, false); act = bpf_prog_run_xdp(xdp_prog, &xdp); - if (act == XDP_REDIRECT || act == XDP_TX) { - get_page(alloc_frag->page); - alloc_frag->offset += buflen; - } err = tun_xdp_act(tun, xdp_prog, &xdp, act); if (err < 0) { - if (act == XDP_REDIRECT || act == XDP_TX) - put_page(alloc_frag->page); + if (act == XDP_REDIRECT || act == XDP_TX) { + page_frag_alloc_abort_ref(nc, buf, buflen); + goto out; + } + + page_frag_alloc_abort(nc, buf, buflen); goto out; } if (err == XDP_REDIRECT) xdp_do_flush(); - if (err != XDP_PASS) + if (err != XDP_PASS) { + page_frag_alloc_abort(nc, buf, buflen); goto out; + } pad = xdp.data - xdp.data_hard_start; len = xdp.data_end - xdp.data; @@ -1735,7 +1734,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, rcu_read_unlock(); local_bh_enable(); - return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad); + return __tun_build_skb(tfile, buf, buflen, len, pad); out: bpf_net_ctx_clear(bpf_net_ctx); diff --git a/include/linux/sched.h b/include/linux/sched.h index bb343136ddd0..957ad219e509 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1379,7 +1379,7 @@ 
struct task_struct { /* Cache last used pipe for splice(): */ struct pipe_inode_info *splice_pipe; - struct page_frag task_frag; + struct page_frag_cache task_frag; #ifdef CONFIG_TASK_DELAY_ACCT struct task_delay_info *delays; diff --git a/include/net/sock.h b/include/net/sock.h index cf037c870e3b..9b24f53c29e7 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -303,7 +303,7 @@ struct sk_filter; * @sk_stamp: time stamp of last packet received * @sk_stamp_seq: lock for accessing sk_stamp on 32 bit architectures only * @sk_tsflags: SO_TIMESTAMPING flags - * @sk_use_task_frag: allow sk_page_frag() to use current->task_frag. + * @sk_use_task_frag: allow sk_page_frag_cache() to use current->task_frag. * Sockets that can be used under memory reclaim should * set this to false. * @sk_bind_phc: SO_TIMESTAMPING bind PHC index of PTP virtual clock @@ -462,7 +462,7 @@ struct sock { struct sk_buff_head sk_write_queue; u32 sk_dst_pending_confirm; u32 sk_pacing_status; /* see enum sk_pacing */ - struct page_frag sk_frag; + struct page_frag_cache sk_frag; struct timer_list sk_timer; unsigned long sk_pacing_rate; /* bytes per second */ @@ -2491,22 +2491,22 @@ static inline void sk_stream_moderate_sndbuf(struct sock *sk) } /** - * sk_page_frag - return an appropriate page_frag + * sk_page_frag_cache - return an appropriate page_frag_cache * @sk: socket * - * Use the per task page_frag instead of the per socket one for + * Use the per task page_frag_cache instead of the per socket one for * optimization when we know that we're in process context and own * everything that's associated with %current. * * Both direct reclaim and page faults can nest inside other - * socket operations and end up recursing into sk_page_frag() - * while it's already in use: explicitly avoid task page_frag + * socket operations and end up recursing into sk_page_frag_cache() + * while it's already in use: explicitly avoid task page_frag_cache * when users disable sk_use_task_frag. 
* * Return: a per task page_frag if context allows that, * otherwise a per socket one. */ -static inline struct page_frag *sk_page_frag(struct sock *sk) +static inline struct page_frag_cache *sk_page_frag_cache(struct sock *sk) { if (sk->sk_use_task_frag) return &current->task_frag; @@ -2514,7 +2514,12 @@ static inline struct page_frag *sk_page_frag(struct sock *sk) return &sk->sk_frag; } -bool sk_page_frag_refill(struct sock *sk, struct page_frag *pfrag); +bool sk_page_frag_refill_prepare(struct sock *sk, struct page_frag_cache *nc, + struct page_frag *pfrag); + +void *sk_page_frag_alloc_refill_prepare(struct sock *sk, + struct page_frag_cache *nc, + struct page_frag *pfrag); /* * Default write policy as shown to user space via poll/select/SIGIO diff --git a/kernel/exit.c b/kernel/exit.c index 619f0014c33b..5f9b7f58098d 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -974,8 +974,7 @@ void __noreturn do_exit(long code) if (tsk->splice_pipe) free_pipe_info(tsk->splice_pipe); - if (tsk->task_frag.page) - put_page(tsk->task_frag.page); + page_frag_cache_drain(&tsk->task_frag); exit_task_stack_account(tsk); diff --git a/kernel/fork.c b/kernel/fork.c index 22f43721d031..048e5bc1c3fe 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -81,6 +81,7 @@ #include #include #include +#include #include #include #include @@ -1160,10 +1161,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node) tsk->btrace_seq = 0; #endif tsk->splice_pipe = NULL; - tsk->task_frag.page = NULL; tsk->wake_q.next = NULL; tsk->worker_private = NULL; + page_frag_cache_init(&tsk->task_frag); kcov_task_init(tsk); kmsan_task_create(tsk); kmap_local_fork(tsk); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 6841e61a6bd0..684cd68ca4ab 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -3062,25 +3062,6 @@ static void sock_spd_release(struct splice_pipe_desc *spd, unsigned int i) put_page(spd->pages[i]); } -static struct page *linear_to_page(struct page *page, unsigned
int *len, - unsigned int *offset, - struct sock *sk) -{ - struct page_frag *pfrag = sk_page_frag(sk); - - if (!sk_page_frag_refill(sk, pfrag)) - return NULL; - - *len = min_t(unsigned int, *len, pfrag->size - pfrag->offset); - - memcpy(page_address(pfrag->page) + pfrag->offset, - page_address(page) + *offset, *len); - *offset = pfrag->offset; - pfrag->offset += *len; - - return pfrag->page; -} - static bool spd_can_coalesce(const struct splice_pipe_desc *spd, struct page *page, unsigned int offset) @@ -3091,6 +3072,37 @@ static bool spd_can_coalesce(const struct splice_pipe_desc *spd, spd->partial[spd->nr_pages - 1].len == offset); } +static bool spd_fill_linear_page(struct splice_pipe_desc *spd, + struct page *page, unsigned int offset, + unsigned int *len, struct sock *sk) +{ + struct page_frag_cache *nc = sk_page_frag_cache(sk); + struct page_frag page_frag, *pfrag; + void *va; + + pfrag = &page_frag; + va = sk_page_frag_alloc_refill_prepare(sk, nc, pfrag); + if (!va) + return true; + + *len = min_t(unsigned int, *len, pfrag->size); + memcpy(va, page_address(page) + offset, *len); + + if (spd_can_coalesce(spd, pfrag->page, pfrag->offset)) { + spd->partial[spd->nr_pages - 1].len += *len; + page_frag_refill_commit_noref(nc, pfrag, *len); + return false; + } + + page_frag_refill_commit(nc, pfrag, *len); + spd->pages[spd->nr_pages] = pfrag->page; + spd->partial[spd->nr_pages].len = *len; + spd->partial[spd->nr_pages].offset = pfrag->offset; + spd->nr_pages++; + + return false; +} + /* * Fill page/offset/length into spd, if it can hold more pages. 
*/ @@ -3103,11 +3115,9 @@ static bool spd_fill_page(struct splice_pipe_desc *spd, if (unlikely(spd->nr_pages == MAX_SKB_FRAGS)) return true; - if (linear) { - page = linear_to_page(page, len, &offset, sk); - if (!page) - return true; - } + if (linear) + return spd_fill_linear_page(spd, page, offset, len, sk); + if (spd_can_coalesce(spd, page, offset)) { spd->partial[spd->nr_pages - 1].len += *len; return false; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index b1dcbd3be89e..65ca688d73f1 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -27,23 +27,25 @@ static bool sk_msg_try_coalesce_ok(struct sk_msg *msg, int elem_first_coalesce) int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len, int elem_first_coalesce) { - struct page_frag *pfrag = sk_page_frag(sk); + struct page_frag_cache *nc = sk_page_frag_cache(sk); u32 osize = msg->sg.size; int ret = 0; len -= msg->sg.size; while (len > 0) { + struct page_frag page_frag, *pfrag; struct scatterlist *sge; u32 orig_offset; int use, i; - if (!sk_page_frag_refill(sk, pfrag)) { + pfrag = &page_frag; + if (!sk_page_frag_refill_prepare(sk, nc, pfrag)) { ret = -ENOMEM; goto msg_trim; } orig_offset = pfrag->offset; - use = min_t(int, len, pfrag->size - orig_offset); + use = min_t(int, len, pfrag->size); if (!sk_wmem_schedule(sk, use)) { ret = -ENOMEM; goto msg_trim; @@ -57,6 +59,7 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len, sg_page(sge) == pfrag->page && sge->offset + sge->length == orig_offset) { sge->length += use; + page_frag_refill_commit_noref(nc, pfrag, use); } else { if (sk_msg_full(msg)) { ret = -ENOSPC; @@ -66,13 +69,12 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len, sge = &msg->sg.data[msg->sg.end]; sg_unmark_end(sge); sg_set_page(sge, pfrag->page, use, orig_offset); - get_page(pfrag->page); + page_frag_refill_commit(nc, pfrag, use); sk_msg_iter_next(msg, end); } sk_mem_charge(sk, use); msg->sg.size += use; - pfrag->offset += use; len -= use; } diff --git 
a/net/core/sock.c b/net/core/sock.c index 7f398bd07fb7..f79776162f0d 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2270,10 +2270,7 @@ static void __sk_destruct(struct rcu_head *head) pr_debug("%s: optmem leakage (%d bytes) detected\n", __func__, atomic_read(&sk->sk_omem_alloc)); - if (sk->sk_frag.page) { - put_page(sk->sk_frag.page); - sk->sk_frag.page = NULL; - } + page_frag_cache_drain(&sk->sk_frag); /* We do not need to acquire sk->sk_peer_lock, we are the last user. */ put_cred(sk->sk_peer_cred); @@ -3029,16 +3026,33 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp) } EXPORT_SYMBOL(skb_page_frag_refill); -bool sk_page_frag_refill(struct sock *sk, struct page_frag *pfrag) +bool sk_page_frag_refill_prepare(struct sock *sk, struct page_frag_cache *nc, + struct page_frag *pfrag) { - if (likely(skb_page_frag_refill(32U, pfrag, sk->sk_allocation))) + if (likely(page_frag_refill_prepare(nc, 32U, pfrag, sk->sk_allocation))) return true; sk_enter_memory_pressure(sk); sk_stream_moderate_sndbuf(sk); return false; } -EXPORT_SYMBOL(sk_page_frag_refill); +EXPORT_SYMBOL(sk_page_frag_refill_prepare); + +void *sk_page_frag_alloc_refill_prepare(struct sock *sk, + struct page_frag_cache *nc, + struct page_frag *pfrag) +{ + void *va; + + va = page_frag_alloc_refill_prepare(nc, 32U, pfrag, sk->sk_allocation); + if (likely(va)) + return va; + + sk_enter_memory_pressure(sk); + sk_stream_moderate_sndbuf(sk); + return NULL; +} +EXPORT_SYMBOL(sk_page_frag_alloc_refill_prepare); void __lock_sock(struct sock *sk) __releases(&sk->sk_lock.slock) @@ -3560,8 +3574,8 @@ void sock_init_data_uid(struct socket *sock, struct sock *sk, kuid_t uid) sk->sk_error_report = sock_def_error_report; sk->sk_destruct = sock_def_destruct; - sk->sk_frag.page = NULL; - sk->sk_frag.offset = 0; + page_frag_cache_init(&sk->sk_frag); + sk->sk_peek_off = -1; sk->sk_peer_pid = NULL; diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 0065b1996c94..7033ae387062 
100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -953,7 +953,7 @@ static int __ip_append_data(struct sock *sk, struct flowi4 *fl4, struct sk_buff_head *queue, struct inet_cork *cork, - struct page_frag *pfrag, + struct page_frag_cache *nc, int getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb), void *from, int length, int transhdrlen, @@ -1233,13 +1233,19 @@ static int __ip_append_data(struct sock *sk, copy = err; wmem_alloc_delta += copy; } else if (!zc) { + struct page_frag page_frag, *pfrag; int i = skb_shinfo(skb)->nr_frags; + void *va; err = -ENOMEM; - if (!sk_page_frag_refill(sk, pfrag)) + pfrag = &page_frag; + va = sk_page_frag_alloc_refill_prepare(sk, nc, pfrag); + if (!va) goto error; skb_zcopy_downgrade_managed(skb); + copy = min_t(int, copy, pfrag->size); + if (!skb_can_coalesce(skb, i, pfrag->page, pfrag->offset)) { err = -EMSGSIZE; @@ -1247,18 +1253,18 @@ static int __ip_append_data(struct sock *sk, goto error; __skb_fill_page_desc(skb, i, pfrag->page, - pfrag->offset, 0); + pfrag->offset, copy); skb_shinfo(skb)->nr_frags = ++i; - get_page(pfrag->page); + page_frag_refill_commit(nc, pfrag, copy); + } else { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], + copy); + page_frag_refill_commit_noref(nc, pfrag, copy); } - copy = min_t(int, copy, pfrag->size - pfrag->offset); - if (getfrag(from, - page_address(pfrag->page) + pfrag->offset, - offset, copy, skb->len, skb) < 0) + + if (getfrag(from, va, offset, copy, skb->len, skb) < 0) goto error_efault; - pfrag->offset += copy; - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); skb_len_add(skb, copy); wmem_alloc_delta += copy; } else { @@ -1373,7 +1379,7 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4, } return __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base, - sk_page_frag(sk), getfrag, + sk_page_frag_cache(sk), getfrag, from, length, transhdrlen, flags); } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 
0fbf1e222cda..24068f949c4f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1193,9 +1193,13 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) if (zc == 0) { bool merge = true; int i = skb_shinfo(skb)->nr_frags; - struct page_frag *pfrag = sk_page_frag(sk); + struct page_frag_cache *nc = sk_page_frag_cache(sk); + struct page_frag page_frag, *pfrag; + void *va; - if (!sk_page_frag_refill(sk, pfrag)) + pfrag = &page_frag; + va = sk_page_frag_alloc_refill_prepare(sk, nc, pfrag); + if (!va) goto wait_for_space; if (!skb_can_coalesce(skb, i, pfrag->page, @@ -1207,7 +1211,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) merge = false; } - copy = min_t(int, copy, pfrag->size - pfrag->offset); + copy = min_t(int, copy, pfrag->size); if (unlikely(skb_zcopy_pure(skb) || skb_zcopy_managed(skb))) { if (tcp_downgrade_zcopy_pure(sk, skb)) @@ -1220,20 +1224,19 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size) goto wait_for_space; err = skb_copy_to_frag_nocache(sk, &msg->msg_iter, skb, - page_address(pfrag->page) + - pfrag->offset, copy); + va, copy); if (err) goto do_error; /* Update the skb. 
*/ if (merge) { skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + page_frag_refill_commit_noref(nc, pfrag, copy); } else { skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy); - page_ref_inc(pfrag->page); + page_frag_refill_commit(nc, pfrag, copy); } - pfrag->offset += copy; } else if (zc == MSG_ZEROCOPY) { /* First append to a fragless skb builds initial * pure zerocopy skb @@ -3393,11 +3396,7 @@ int tcp_disconnect(struct sock *sk, int flags) WARN_ON(inet->inet_num && !icsk->icsk_bind_hash); - if (sk->sk_frag.page) { - put_page(sk->sk_frag.page); - sk->sk_frag.page = NULL; - sk->sk_frag.offset = 0; - } + page_frag_cache_drain(&sk->sk_frag); sk_error_report(sk); return 0; } diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 5485a70b5fe5..d84b0d477a65 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -3968,9 +3968,11 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn) struct inet_connection_sock *icsk = inet_csk(sk); struct tcp_sock *tp = tcp_sk(sk); struct tcp_fastopen_request *fo = tp->fastopen_req; - struct page_frag *pfrag = sk_page_frag(sk); + struct page_frag_cache *nc = sk_page_frag_cache(sk); + struct page_frag page_frag, *pfrag; struct sk_buff *syn_data; int space, err = 0; + void *va; tp->rx_opt.mss_clamp = tp->advmss; /* If MSS is not cached */ if (!tcp_fastopen_cookie_check(sk, &tp->rx_opt.mss_clamp, &fo->cookie)) @@ -3989,21 +3991,25 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn) space = min_t(size_t, space, fo->size); - if (space && - !skb_page_frag_refill(min_t(size_t, space, PAGE_SIZE), - pfrag, sk->sk_allocation)) - goto fallback; + if (space) { + pfrag = &page_frag; + va = page_frag_alloc_refill_prepare(nc, + min_t(size_t, space, PAGE_SIZE), + pfrag, sk->sk_allocation); + if (!va) + goto fallback; + } + syn_data = tcp_stream_alloc_skb(sk, sk->sk_allocation, false); if (!syn_data) goto fallback; memcpy(syn_data->cb, syn->cb, sizeof(syn->cb)); if (space) { - 
space = min_t(size_t, space, pfrag->size - pfrag->offset); + space = min_t(size_t, space, pfrag->size); space = tcp_wmem_schedule(sk, space); } if (space) { - space = copy_page_from_iter(pfrag->page, pfrag->offset, - space, &fo->data->msg_iter); + space = _copy_from_iter(va, space, &fo->data->msg_iter); if (unlikely(!space)) { tcp_skb_tsorted_anchor_cleanup(syn_data); kfree_skb(syn_data); @@ -4011,8 +4017,7 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn) } skb_fill_page_desc(syn_data, 0, pfrag->page, pfrag->offset, space); - page_ref_inc(pfrag->page); - pfrag->offset += space; + page_frag_refill_commit(nc, pfrag, space); skb_len_add(syn_data, space); skb_zcopy_set(syn_data, fo->uarg, NULL); } diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index f7b4608bb316..d9e76a76b7ba 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1416,7 +1416,7 @@ static int __ip6_append_data(struct sock *sk, struct sk_buff_head *queue, struct inet_cork_full *cork_full, struct inet6_cork *v6_cork, - struct page_frag *pfrag, + struct page_frag_cache *nc, int getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb), void *from, size_t length, int transhdrlen, @@ -1762,13 +1762,19 @@ static int __ip6_append_data(struct sock *sk, copy = err; wmem_alloc_delta += copy; } else if (!zc) { + struct page_frag page_frag, *pfrag; int i = skb_shinfo(skb)->nr_frags; + void *va; err = -ENOMEM; - if (!sk_page_frag_refill(sk, pfrag)) + pfrag = &page_frag; + va = sk_page_frag_alloc_refill_prepare(sk, nc, pfrag); + if (!va) goto error; skb_zcopy_downgrade_managed(skb); + copy = min_t(int, copy, pfrag->size); + if (!skb_can_coalesce(skb, i, pfrag->page, pfrag->offset)) { err = -EMSGSIZE; @@ -1776,18 +1782,18 @@ static int __ip6_append_data(struct sock *sk, goto error; __skb_fill_page_desc(skb, i, pfrag->page, - pfrag->offset, 0); + pfrag->offset, copy); skb_shinfo(skb)->nr_frags = ++i; - get_page(pfrag->page); + 
page_frag_refill_commit(nc, pfrag, copy); + } else { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], + copy); + page_frag_refill_commit_noref(nc, pfrag, copy); } - copy = min_t(int, copy, pfrag->size - pfrag->offset); - if (getfrag(from, - page_address(pfrag->page) + pfrag->offset, - offset, copy, skb->len, skb) < 0) + + if (getfrag(from, va, offset, copy, skb->len, skb) < 0) goto error_efault; - pfrag->offset += copy; - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); skb->len += copy; skb->data_len += copy; skb->truesize += copy; @@ -1850,7 +1856,7 @@ int ip6_append_data(struct sock *sk, } return __ip6_append_data(sk, &sk->sk_write_queue, &inet->cork, - &np->cork, sk_page_frag(sk), getfrag, + &np->cork, sk_page_frag_cache(sk), getfrag, from, length, transhdrlen, flags, ipc6); } EXPORT_SYMBOL_GPL(ip6_append_data); diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c index 94719d4af5fa..8f241a7173ed 100644 --- a/net/kcm/kcmsock.c +++ b/net/kcm/kcmsock.c @@ -804,9 +804,13 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) while (msg_data_left(msg)) { bool merge = true; int i = skb_shinfo(skb)->nr_frags; - struct page_frag *pfrag = sk_page_frag(sk); + struct page_frag_cache *nc = sk_page_frag_cache(sk); + struct page_frag page_frag, *pfrag; + void *va; - if (!sk_page_frag_refill(sk, pfrag)) + pfrag = &page_frag; + va = sk_page_frag_alloc_refill_prepare(sk, nc, pfrag); + if (!va) goto wait_for_memory; if (!skb_can_coalesce(skb, i, pfrag->page, @@ -851,14 +855,12 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) if (head != skb) head->truesize += copy; } else { - copy = min_t(int, msg_data_left(msg), - pfrag->size - pfrag->offset); + copy = min_t(int, msg_data_left(msg), pfrag->size); if (!sk_wmem_schedule(sk, copy)) goto wait_for_memory; err = skb_copy_to_frag_nocache(sk, &msg->msg_iter, skb, - page_address(pfrag->page) + - pfrag->offset, copy); + va, copy); if (err) goto out_error; @@ -866,13 +868,13 
@@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len) if (merge) { skb_frag_size_add( &skb_shinfo(skb)->frags[i - 1], copy); + page_frag_refill_commit_noref(nc, pfrag, copy); } else { skb_fill_page_desc(skb, i, pfrag->page, pfrag->offset, copy); - get_page(pfrag->page); + page_frag_refill_commit(nc, pfrag, copy); } - pfrag->offset += copy; } copied += copy; diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index a6f2a25edb11..f158194cbec2 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -978,7 +978,6 @@ static bool mptcp_skb_can_collapse_to(u64 write_seq, } /* we can append data to the given data frag if: - * - there is space available in the backing page_frag * - the data frag tail matches the current page_frag free offset * - the data frag end sequence number matches the current write seq */ @@ -987,7 +986,6 @@ static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk, const struct mptcp_data_frag *df) { return df && pfrag->page == df->page && - pfrag->size - pfrag->offset > 0 && pfrag->offset == (df->offset + df->data_len) && df->data_seq + df->data_len == msk->write_seq; } @@ -1103,14 +1101,20 @@ static void mptcp_enter_memory_pressure(struct sock *sk) /* ensure we get enough memory for the frag hdr, beyond some minimal amount of * data */ -static bool mptcp_page_frag_refill(struct sock *sk, struct page_frag *pfrag) +static void *mptcp_page_frag_alloc_refill_prepare(struct sock *sk, + struct page_frag_cache *nc, + struct page_frag *pfrag) { - if (likely(skb_page_frag_refill(32U + sizeof(struct mptcp_data_frag), - pfrag, sk->sk_allocation))) - return true; + unsigned int fragsz = 32U + sizeof(struct mptcp_data_frag); + void *va; + + va = page_frag_alloc_refill_prepare(nc, fragsz, pfrag, + sk->sk_allocation); + if (likely(va)) + return va; mptcp_enter_memory_pressure(sk); - return false; + return NULL; } static struct mptcp_data_frag * @@ -1813,7 +1817,7 @@ static u32 mptcp_send_limit(const struct sock *sk) 
static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) { struct mptcp_sock *msk = mptcp_sk(sk); - struct page_frag *pfrag; + struct page_frag_cache *nc; size_t copied = 0; int ret = 0; long timeo; @@ -1847,14 +1851,16 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) if (unlikely(sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))) goto do_error; - pfrag = sk_page_frag(sk); + nc = sk_page_frag_cache(sk); while (msg_data_left(msg)) { + struct page_frag page_frag, *pfrag; int total_ts, frag_truesize = 0; struct mptcp_data_frag *dfrag; bool dfrag_collapsed; - size_t psize, offset; u32 copy_limit; + size_t psize; + void *va; /* ensure fitting the notsent_lowat() constraint */ copy_limit = mptcp_send_limit(sk); @@ -1865,21 +1871,26 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) * page allocator */ dfrag = mptcp_pending_tail(sk); - dfrag_collapsed = mptcp_frag_can_collapse_to(msk, pfrag, dfrag); + pfrag = &page_frag; + va = page_frag_alloc_refill_probe(nc, 1, pfrag); + dfrag_collapsed = va && mptcp_frag_can_collapse_to(msk, pfrag, + dfrag); if (!dfrag_collapsed) { - if (!mptcp_page_frag_refill(sk, pfrag)) + va = mptcp_page_frag_alloc_refill_prepare(sk, nc, + pfrag); + if (!va) goto wait_for_memory; dfrag = mptcp_carve_data_frag(msk, pfrag, pfrag->offset); frag_truesize = dfrag->overhead; + va += dfrag->overhead; } /* we do not bound vs wspace, to allow a single packet. 
 		 * memory accounting will prevent execessive memory usage
 		 * anyway
 		 */
-		offset = dfrag->offset + dfrag->data_len;
-		psize = pfrag->size - offset;
+		psize = pfrag->size - frag_truesize;
 		psize = min_t(size_t, psize, msg_data_left(msg));
 		psize = min_t(size_t, psize, copy_limit);
 		total_ts = psize + frag_truesize;
@@ -1887,8 +1898,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		if (!sk_wmem_schedule(sk, total_ts))
 			goto wait_for_memory;
 
-		ret = do_copy_data_nocache(sk, psize, &msg->msg_iter,
-					   page_address(dfrag->page) + offset);
+		ret = do_copy_data_nocache(sk, psize, &msg->msg_iter, va);
 		if (ret)
 			goto do_error;
 
@@ -1897,7 +1907,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		copied += psize;
 		dfrag->data_len += psize;
 		frag_truesize += psize;
-		pfrag->offset += frag_truesize;
 		WRITE_ONCE(msk->write_seq, msk->write_seq + psize);
 
 		/* charge data on mptcp pending queue to the msk socket
@@ -1905,10 +1914,12 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		 */
 		sk_wmem_queued_add(sk, frag_truesize);
 		if (!dfrag_collapsed) {
-			get_page(dfrag->page);
+			page_frag_refill_commit(nc, pfrag, frag_truesize);
 			list_add_tail(&dfrag->list, &msk->rtx_queue);
 			if (!msk->first_pending)
 				WRITE_ONCE(msk->first_pending, dfrag);
+		} else {
+			page_frag_refill_commit_noref(nc, pfrag, frag_truesize);
 		}
 		pr_debug("msk=%p dfrag at seq=%llu len=%u sent=%u new=%d\n", msk,
 			 dfrag->data_seq, dfrag->data_len, dfrag->already_sent,
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index dc063c2c7950..0f020293fe10 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -253,8 +253,8 @@ static void tls_device_resync_tx(struct sock *sk, struct tls_context *tls_ctx,
 }
 
 static void tls_append_frag(struct tls_record_info *record,
-			    struct page_frag *pfrag,
-			    int size)
+			    struct page_frag_cache *nc,
+			    struct page_frag *pfrag, int size)
 {
 	skb_frag_t *frag;
 
@@ -262,15 +262,34 @@ static void tls_append_frag(struct tls_record_info *record,
 	if (skb_frag_page(frag) == pfrag->page &&
 	    skb_frag_off(frag) + skb_frag_size(frag) == pfrag->offset) {
 		skb_frag_size_add(frag, size);
+		page_frag_refill_commit_noref(nc, pfrag, size);
 	} else {
 		++frag;
 		skb_frag_fill_page_desc(frag, pfrag->page, pfrag->offset,
 					size);
 		++record->num_frags;
+		page_frag_refill_commit(nc, pfrag, size);
+	}
+
+	record->len += size;
+}
+
+static void tls_append_dummy_frag(struct tls_record_info *record,
+				  struct page_frag *pfrag, int size)
+{
+	skb_frag_t *frag;
+
+	frag = &record->frags[record->num_frags - 1];
+	if (skb_frag_page(frag) == pfrag->page &&
+	    skb_frag_off(frag) + skb_frag_size(frag) == pfrag->offset) {
+		skb_frag_size_add(frag, size);
+	} else {
+		++frag;
+		skb_frag_fill_page_desc(frag, pfrag->page, pfrag->offset, size);
+		++record->num_frags;
 		get_page(pfrag->page);
 	}
 
-	pfrag->offset += size;
 	record->len += size;
 }
 
@@ -311,11 +330,11 @@ static int tls_push_record(struct sock *sk,
 static void tls_device_record_close(struct sock *sk,
 				    struct tls_context *ctx,
 				    struct tls_record_info *record,
-				    struct page_frag *pfrag,
+				    struct page_frag_cache *nc,
 				    unsigned char record_type)
 {
 	struct tls_prot_info *prot = &ctx->prot_info;
-	struct page_frag dummy_tag_frag;
+	struct page_frag dummy_tag_frag, *pfrag;
 
 	/* append tag
 	 * device will fill in the tag, we just need to append a placeholder
@@ -323,13 +342,16 @@ static void tls_device_record_close(struct sock *sk,
 	 * increases frag count)
 	 * if we can't allocate memory now use the dummy page
 	 */
-	if (unlikely(pfrag->size - pfrag->offset < prot->tag_size) &&
-	    !skb_page_frag_refill(prot->tag_size, pfrag, sk->sk_allocation)) {
+	pfrag = &dummy_tag_frag;
+	if (unlikely(!page_frag_refill_probe(nc, prot->tag_size, pfrag) &&
+		     !page_frag_refill_prepare(nc, prot->tag_size, pfrag,
+					       sk->sk_allocation))) {
 		dummy_tag_frag.page = dummy_page;
 		dummy_tag_frag.offset = 0;
-		pfrag = &dummy_tag_frag;
+		tls_append_dummy_frag(record, pfrag, prot->tag_size);
+	} else {
+		tls_append_frag(record, nc, pfrag, prot->tag_size);
 	}
-	tls_append_frag(record, pfrag, prot->tag_size);
 
 	/* fill prepend */
 	tls_fill_prepend(ctx, skb_frag_address(&record->frags[0]),
@@ -338,6 +360,7 @@ static void tls_device_record_close(struct sock *sk,
 }
 
 static int tls_create_new_record(struct tls_offload_context_tx *offload_ctx,
+				 struct page_frag_cache *nc,
 				 struct page_frag *pfrag,
 				 size_t prepend_size)
 {
@@ -352,8 +375,7 @@ static int tls_create_new_record(struct tls_offload_context_tx *offload_ctx,
 	skb_frag_fill_page_desc(frag, pfrag->page, pfrag->offset,
 				prepend_size);
 
-	get_page(pfrag->page);
-	pfrag->offset += prepend_size;
+	page_frag_refill_commit(nc, pfrag, prepend_size);
 
 	record->num_frags = 1;
 	record->len = prepend_size;
@@ -361,33 +383,34 @@ static int tls_create_new_record(struct tls_offload_context_tx *offload_ctx,
 	return 0;
 }
 
-static int tls_do_allocation(struct sock *sk,
-			     struct tls_offload_context_tx *offload_ctx,
-			     struct page_frag *pfrag,
-			     size_t prepend_size)
+static void *tls_do_allocation(struct sock *sk,
+			       struct tls_offload_context_tx *offload_ctx,
+			       struct page_frag_cache *nc,
+			       size_t prepend_size, struct page_frag *pfrag)
 {
 	int ret;
 
 	if (!offload_ctx->open_record) {
-		if (unlikely(!skb_page_frag_refill(prepend_size, pfrag,
-						   sk->sk_allocation))) {
+		void *va;
+
+		if (unlikely(!page_frag_refill_prepare(nc, prepend_size, pfrag,
+						       sk->sk_allocation))) {
 			READ_ONCE(sk->sk_prot)->enter_memory_pressure(sk);
 			sk_stream_moderate_sndbuf(sk);
-			return -ENOMEM;
+			return NULL;
 		}
 
-		ret = tls_create_new_record(offload_ctx, pfrag, prepend_size);
+		ret = tls_create_new_record(offload_ctx, nc, pfrag,
+					    prepend_size);
 		if (ret)
-			return ret;
+			return NULL;
 
-		if (pfrag->size > pfrag->offset)
-			return 0;
+		va = page_frag_alloc_refill_probe(nc, 1, pfrag);
+		if (va)
+			return va;
 	}
 
-	if (!sk_page_frag_refill(sk, pfrag))
-		return -ENOMEM;
-
-	return 0;
+	return sk_page_frag_alloc_refill_prepare(sk, nc, pfrag);
 }
 
 static int tls_device_copy_data(void *addr, size_t bytes, struct iov_iter *i)
@@ -424,8 +447,8 @@ static int tls_push_data(struct sock *sk,
 	struct tls_prot_info *prot = &tls_ctx->prot_info;
 	struct tls_offload_context_tx *ctx = tls_offload_ctx_tx(tls_ctx);
 	struct tls_record_info *record;
+	struct page_frag_cache *nc;
 	int tls_push_record_flags;
-	struct page_frag *pfrag;
 	size_t orig_size = size;
 	u32 max_open_record_len;
 	bool more = false;
@@ -454,7 +477,7 @@ static int tls_push_data(struct sock *sk,
 		return rc;
 	}
 
-	pfrag = sk_page_frag(sk);
+	nc = sk_page_frag_cache(sk);
 
 	/* TLS_HEADER_SIZE is not counted as part of the TLS record, and
 	 * we need to leave room for an authentication tag.
@@ -462,8 +485,12 @@ static int tls_push_data(struct sock *sk,
 	max_open_record_len = TLS_MAX_PAYLOAD_SIZE +
 			      prot->prepend_size;
 	do {
-		rc = tls_do_allocation(sk, ctx, pfrag, prot->prepend_size);
-		if (unlikely(rc)) {
+		struct page_frag page_frag, *pfrag;
+		void *va;
+
+		pfrag = &page_frag;
+		va = tls_do_allocation(sk, ctx, nc, prot->prepend_size, pfrag);
+		if (unlikely(!va)) {
 			rc = sk_stream_wait_memory(sk, &timeo);
 			if (!rc)
 				continue;
@@ -512,16 +539,15 @@ static int tls_push_data(struct sock *sk,
 
 			zc_pfrag.offset = off;
 			zc_pfrag.size = copy;
-			tls_append_frag(record, &zc_pfrag, copy);
+			tls_append_dummy_frag(record, &zc_pfrag, copy);
 		} else if (copy) {
-			copy = min_t(size_t, copy, pfrag->size - pfrag->offset);
+			copy = min_t(size_t, copy, pfrag->size);
 
-			rc = tls_device_copy_data(page_address(pfrag->page) +
-						  pfrag->offset, copy,
-						  iter);
+			rc = tls_device_copy_data(va, copy, iter);
 			if (rc)
 				goto handle_error;
-			tls_append_frag(record, pfrag, copy);
+
+			tls_append_frag(record, nc, pfrag, copy);
 		}
 
 		size -= copy;
@@ -539,7 +565,7 @@ static int tls_push_data(struct sock *sk,
 		if (done || record->len >= max_open_record_len ||
 		    (record->num_frags >= MAX_SKB_FRAGS - 1)) {
 			tls_device_record_close(sk, tls_ctx, record,
-						pfrag, record_type);
+						nc, record_type);
 
 			rc = tls_push_record(sk,
 					     tls_ctx,

From patchwork Thu Nov 14 12:16:05 2024
Content-Type:
text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yunsheng Lin
X-Patchwork-Id: 13875066
From: Yunsheng Lin
To: , ,
CC: , , Yunsheng Lin , Alexander Duyck , Andrew Morton , Linux-MM
Subject: [PATCH net-next v1 10/10] mm: page_frag: add an entry in MAINTAINERS for page_frag
Date: Thu, 14 Nov 2024 20:16:05 +0800
Message-ID: <20241114121606.3434517-11-linyunsheng@huawei.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20241114121606.3434517-1-linyunsheng@huawei.com>
References: <20241114121606.3434517-1-linyunsheng@huawei.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

After this patchset, page_frag is a small subsystem/library on its own,
so add an entry in MAINTAINERS to indicate the new subsystem/library's
maintainer, maillist,
status and file lists of page_frag.

Alexander is the original author of page_frag, so add him to
MAINTAINERS too.

CC: Alexander Duyck
CC: Andrew Morton
CC: Linux-MM
Signed-off-by: Yunsheng Lin
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 675bd38630b7..6b6b120a4f90 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17473,6 +17473,18 @@ F:	mm/page-writeback.c
 F:	mm/readahead.c
 F:	mm/truncate.c
 
+PAGE FRAG
+M:	Alexander Duyck
+M:	Yunsheng Lin
+L:	linux-mm@kvack.org
+L:	netdev@vger.kernel.org
+S:	Supported
+F:	Documentation/mm/page_frags.rst
+F:	include/linux/page_frag_cache.h
+F:	mm/page_frag_cache.c
+F:	tools/testing/selftests/mm/page_frag/
+F:	tools/testing/selftests/mm/test_page_frag.sh
+
 PAGE POOL
 M:	Jesper Dangaard Brouer
 M:	Ilias Apalodimas