From patchwork Wed May 8 13:34:07 2024
From: Yunsheng Lin <linyunsheng@huawei.com>
CC: Yunsheng Lin, Alexander Duyck, Jonathan Corbet, Andrew Morton
Subject: [PATCH net-next v3 12/13] mm: page_frag: update documentation for
 page_frag
Date: Wed, 8 May 2024 21:34:07 +0800
Message-ID: <20240508133408.54708-13-linyunsheng@huawei.com>
In-Reply-To: <20240508133408.54708-1-linyunsheng@huawei.com>
References: <20240508133408.54708-1-linyunsheng@huawei.com>
MIME-Version: 1.0

Update documentation about design, implementation and API usages for
page_frag.
CC: Alexander Duyck
Signed-off-by: Yunsheng Lin
---
 Documentation/mm/page_frags.rst | 156 +++++++++++++++++++++++++++++++-
 include/linux/page_frag_cache.h |  96 ++++++++++++++++++++
 mm/page_frag_cache.c            |  65 ++++++++++++-
 3 files changed, 314 insertions(+), 3 deletions(-)

diff --git a/Documentation/mm/page_frags.rst b/Documentation/mm/page_frags.rst
index 503ca6cdb804..9c25c0fd81f0 100644
--- a/Documentation/mm/page_frags.rst
+++ b/Documentation/mm/page_frags.rst
@@ -1,3 +1,5 @@
+.. SPDX-License-Identifier: GPL-2.0
+
 ==============
 Page fragments
 ==============
@@ -40,4 +42,156 @@ page via a single call.
 The advantage to doing this is that it allows for
 cleaning up the multiple references that were added to a page in order to
 avoid calling get_page per allocation.
-Alexander Duyck, Nov 29, 2016.
+
+Architecture overview
+=====================
+
+.. code-block:: none
+
+               +----------------------+
+               | page_frag API caller |
+               +----------------------+
+                          ^
+                          |
+                          |
+                          |
+                          v
+  +------------------------------------------------+
+  |              request page fragment             |
+  +------------------------------------------------+
+      ^                  ^                    ^
+      |                  | Cache not enough  |
+      | Cache empty      v                   |
+      |         +-----------------+          |
+      |         | drain old cache |          |
+      |         +-----------------+          |
+      |                  ^                   |
+      |                  |                   |
+      v                  v                   |
+  +----------------------------------+       |
+  |  refill cache with order 3 page  |       |
+  +----------------------------------+       |
+      ^                  ^                   |
+      |                  |                   |
+      |                  | Refill failed     |
+      |                  |                   | Cache is enough
+      |                  |                   |
+      |                  v                   |
+      |  +----------------------------------+
+      |  | refill cache with order 0 page   |
+      |  +----------------------------------+
+      |                  ^                   |
+      | Refill succeed   |                   |
+      |                  | Refill succeed    |
+      |                  |                   |
+      v                  v                   v
+  +------------------------------------------------+
+  |          allocate fragment from cache          |
+  +------------------------------------------------+
+
+API interface
+=============
+As the design and implementation of the page_frag API implies, the allocation
+side does not allow concurrent calls.
+Instead, the caller must ensure there are no concurrent alloc calls to the
+same page_frag_cache instance, either by using its own lock or by relying on
+a lockless guarantee such as NAPI softirq context.
+
+Depending on the alignment requirement, the page_frag API caller may call
+page_frag_alloc*_align*() to ensure the returned virtual address or offset of
+the page is aligned according to the 'align/alignment' parameter. Note that
+the size of the allocated fragment is not aligned; the caller needs to
+provide an aligned fragsz if there is an alignment requirement for the size
+of the fragment.
+
+Depending on the use case, callers that need to deal with the va, the page,
+or both may call the page_frag_alloc_va*, page_frag_alloc_pg*, or
+page_frag_alloc* API accordingly.
+
+There is also a use case that needs a minimum amount of memory in order to
+make forward progress, but can perform better if more memory is available.
+Using the page_frag_alloc_prepare() and page_frag_alloc_commit() related
+APIs, the caller requests the minimum memory it needs, and the prepare API
+returns the maximum size of the fragment available; the caller then either
+calls the commit API to report how much memory it actually used, or skips
+the commit if it decides not to use any of the memory.
+
+.. kernel-doc:: include/linux/page_frag_cache.h
+   :identifiers: page_frag_cache_init page_frag_cache_is_pfmemalloc
+                 page_frag_cache_page_offset page_frag_alloc_va
+                 page_frag_alloc_va_align page_frag_alloc_va_prepare_align
+                 page_frag_alloc_probe page_frag_alloc_commit
+                 page_frag_alloc_commit_noref
+
+.. kernel-doc:: mm/page_frag_cache.c
+   :identifiers: __page_frag_alloc_va_align page_frag_alloc_va_prepare
+                 page_frag_alloc_pg_prepare page_frag_alloc_prepare
+                 page_frag_cache_drain page_frag_free_va
+
+Coding examples
+===============
+
+Init & Drain API
+----------------
+
+.. code-block:: c
+
+   page_frag_cache_init(pfrag);
+   ...
+   page_frag_cache_drain(pfrag);
+
+
+Alloc & Free API
+----------------
+
+.. code-block:: c
+
+   void *va;
+
+   va = page_frag_alloc_va_align(pfrag, size, gfp, align);
+   if (!va)
+       goto do_error;
+
+   err = do_something(va, size);
+   if (err) {
+       page_frag_free_va(va);
+       goto do_error;
+   }
+
+Prepare & Commit API
+--------------------
+
+.. code-block:: c
+
+   unsigned int offset, size;
+   bool merge = true;
+   struct page *page;
+   void *va;
+
+   size = 32U;
+   page = page_frag_alloc_prepare(pfrag, &offset, &size, &va, gfp);
+   if (!page)
+       goto wait_for_space;
+
+   copy = min_t(int, copy, size);
+   if (!skb_can_coalesce(skb, i, page, offset)) {
+       if (i >= max_skb_frags)
+           goto new_segment;
+
+       merge = false;
+   }
+
+   copy = mem_schedule(copy);
+   if (!copy)
+       goto wait_for_space;
+
+   if (!copy_from_iter_full_nocache(va, copy, iter))
+       goto do_error;
+
+   if (merge) {
+       skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+       page_frag_alloc_commit_noref(pfrag, copy);
+   } else {
+       skb_fill_page_desc(skb, i, page, offset, copy);
+       page_frag_alloc_commit(pfrag, copy);
+   }

diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
index 30893638155b..8925397262a1 100644
--- a/include/linux/page_frag_cache.h
+++ b/include/linux/page_frag_cache.h
@@ -61,11 +61,28 @@ struct page_frag_cache {
 #endif
 };
 
+/**
+ * page_frag_cache_init() - Init page_frag cache.
+ * @nc: page_frag cache to init
+ *
+ * Inline helper to init the page_frag cache.
+ */
 static inline void page_frag_cache_init(struct page_frag_cache *nc)
 {
 	memset(nc, 0, sizeof(*nc));
 }
 
+/**
+ * page_frag_cache_is_pfmemalloc() - Check for pfmemalloc.
+ * @nc: page_frag cache to check
+ *
+ * Used to check if the current page in the page_frag cache is pfmemalloc'ed.
+ * It has the same calling context expectation as the alloc API.
+ *
+ * Return:
+ * Return true if the current page in the page_frag cache is pfmemalloc'ed,
+ * otherwise return false.
+ */
 static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
 {
 	return encoded_page_pfmemalloc(nc->encoded_va);
@@ -92,6 +109,19 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
 				 unsigned int fragsz, gfp_t gfp_mask,
 				 unsigned int align_mask);
 
+/**
+ * page_frag_alloc_va_align() - Alloc a page fragment with an alignment
+ * requirement.
+ * @nc: page_frag cache from which to allocate
+ * @fragsz: the requested fragment size
+ * @gfp_mask: the allocation gfp to use when the cache needs to be refilled
+ * @align: the requested alignment requirement for 'va'
+ *
+ * WARN_ON_ONCE() checks 'align' before allocating a page fragment from the
+ * page_frag cache with an alignment requirement for 'va'.
+ *
+ * Return:
+ * Return the va of the page fragment on success, otherwise return NULL.
+ */
 static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
 					     unsigned int fragsz,
 					     gfp_t gfp_mask, unsigned int align)
@@ -100,11 +130,32 @@ static inline void *page_frag_alloc_va_align(struct page_frag_cache *nc,
 	return __page_frag_alloc_va_align(nc, fragsz, gfp_mask, -align);
 }
 
+/**
+ * page_frag_cache_page_offset() - Return the current page fragment's offset.
+ * @nc: page_frag cache to check
+ *
+ * The API is only used in net/sched/em_meta.c for historical reasons; do not
+ * use it in new callers unless there is a strong reason to.
+ *
+ * Return:
+ * Return the offset of the current page fragment in the page_frag cache.
+ */
 static inline unsigned int page_frag_cache_page_offset(const struct page_frag_cache *nc)
 {
 	return __page_frag_cache_page_offset(nc->encoded_va, nc->remaining);
 }
 
+/**
+ * page_frag_alloc_va() - Alloc a page fragment.
+ * @nc: page_frag cache from which to allocate
+ * @fragsz: the requested fragment size
+ * @gfp_mask: the allocation gfp to use when the cache needs to be refilled
+ *
+ * Get a page fragment from the page_frag cache.
+ *
+ * Return:
+ * Return the va of the page fragment on success, otherwise return NULL.
+ */
 static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
 				       unsigned int fragsz, gfp_t gfp_mask)
 {
@@ -114,6 +165,21 @@ static inline void *page_frag_alloc_va(struct page_frag_cache *nc,
 void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
 				 unsigned int *fragsz, gfp_t gfp);
 
+/**
+ * page_frag_alloc_va_prepare_align() - Prepare to allocate a page fragment
+ * with an alignment requirement.
+ * @nc: page_frag cache from which to prepare
+ * @fragsz: in as the requested size, out as the available size
+ * @gfp: the allocation gfp to use when the cache needs to be refilled
+ * @align: the requested alignment requirement for 'va'
+ *
+ * WARN_ON_ONCE() checks 'align' before preparing an aligned page fragment
+ * with a minimum size of 'fragsz'; 'fragsz' is also used to report the
+ * maximum size of the page fragment the caller can use.
+ *
+ * Return:
+ * Return the va of the page fragment on success, otherwise return NULL.
+ */
 static inline void *page_frag_alloc_va_prepare_align(struct page_frag_cache *nc,
 						     unsigned int *fragsz,
 						     gfp_t gfp,
@@ -148,6 +214,19 @@ static inline struct encoded_va *__page_frag_alloc_probe(struct page_frag_cache
 	return encoded_va;
 }
 
+/**
+ * page_frag_alloc_probe - Probe the available page fragment.
+ * @nc: page_frag cache from which to probe
+ * @offset: out as the offset of the page fragment
+ * @fragsz: in as the requested size, out as the available size
+ * @va: out as the virtual address of the returned page fragment
+ *
+ * Probe the currently available memory for the caller without refilling the
+ * cache. If the cache is empty, return NULL.
+ *
+ * Return:
+ * Return the page fragment, otherwise return NULL.
+ */
 #define page_frag_alloc_probe(nc, offset, fragsz, va)		\
 ({								\
 	struct encoded_va *__encoded_va;			\
@@ -162,6 +241,13 @@ static inline struct encoded_va *__page_frag_alloc_probe(struct page_frag_cache
 	__page;							\
 })
 
+/**
+ * page_frag_alloc_commit - Commit allocating a page fragment.
+ * @nc: page_frag cache to commit
+ * @fragsz: size of the page fragment that has been used
+ *
+ * Commit the alloc preparation by passing the size actually used.
+ */
 static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
 					  unsigned int fragsz)
 {
@@ -170,6 +256,16 @@ static inline void page_frag_alloc_commit(struct page_frag_cache *nc,
 	nc->remaining -= fragsz;
 }
 
+/**
+ * page_frag_alloc_commit_noref - Commit allocating a page fragment without
+ * taking a page refcount.
+ * @nc: page_frag cache to commit
+ * @fragsz: size of the page fragment that has been used
+ *
+ * Commit the alloc preparation by passing the size actually used, but without
+ * taking a page refcount. Mostly used for the fragment coalescing case, when
+ * the current fragment can share the same refcount with the previous
+ * fragment.
+ */
 static inline void page_frag_alloc_commit_noref(struct page_frag_cache *nc,
 						unsigned int fragsz)
 {
diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
index eb8bf59b26bb..85e23d5cbdcc 100644
--- a/mm/page_frag_cache.c
+++ b/mm/page_frag_cache.c
@@ -89,6 +89,18 @@ static struct page *page_frag_cache_refill(struct page_frag_cache *nc,
 	return __page_frag_cache_refill(nc, gfp_mask);
 }
 
+/**
+ * page_frag_alloc_va_prepare() - Prepare to allocate a page fragment.
+ * @nc: page_frag cache from which to prepare
+ * @fragsz: in as the requested size, out as the available size
+ * @gfp: the allocation gfp to use when the cache needs to be refilled
+ *
+ * Prepare a page fragment with a minimum size of 'fragsz'; 'fragsz' is also
+ * used to report the maximum size of the page fragment the caller can use.
+ *
+ * Return:
+ * Return the va of the page fragment on success, otherwise return NULL.
+ */
 void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
 				 unsigned int *fragsz, gfp_t gfp)
 {
@@ -111,6 +123,19 @@ void *page_frag_alloc_va_prepare(struct page_frag_cache *nc,
 }
 EXPORT_SYMBOL(page_frag_alloc_va_prepare);
 
+/**
+ * page_frag_alloc_pg_prepare - Prepare to allocate a page fragment.
+ * @nc: page_frag cache from which to prepare
+ * @offset: out as the offset of the page fragment
+ * @fragsz: in as the requested size, out as the available size
+ * @gfp: the allocation gfp to use when the cache needs to be refilled
+ *
+ * Prepare a page fragment with a minimum size of 'fragsz'; 'fragsz' is also
+ * used to report the maximum size of the page fragment the caller can use.
+ *
+ * Return:
+ * Return the page fragment, otherwise return NULL.
+ */
 struct page *page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
 					unsigned int *offset,
 					unsigned int *fragsz, gfp_t gfp)
@@ -141,6 +166,21 @@ struct page *page_frag_alloc_pg_prepare(struct page_frag_cache *nc,
 }
 EXPORT_SYMBOL(page_frag_alloc_pg_prepare);
 
+/**
+ * page_frag_alloc_prepare - Prepare to allocate a page fragment.
+ * @nc: page_frag cache from which to prepare
+ * @offset: out as the offset of the page fragment
+ * @fragsz: in as the requested size, out as the available size
+ * @va: out as the virtual address of the returned page fragment
+ * @gfp: the allocation gfp to use when the cache needs to be refilled
+ *
+ * Prepare a page fragment with a minimum size of 'fragsz'; 'fragsz' is also
+ * used to report the maximum size of the page fragment. Return both the
+ * 'page' and the 'va' of the fragment to the caller.
+ *
+ * Return:
+ * Return the page fragment, otherwise return NULL.
+ */
 struct page *page_frag_alloc_prepare(struct page_frag_cache *nc,
 				     unsigned int *offset,
 				     unsigned int *fragsz,
@@ -173,6 +213,10 @@ struct page *page_frag_alloc_prepare(struct page_frag_cache *nc,
 }
 EXPORT_SYMBOL(page_frag_alloc_prepare);
 
+/**
+ * page_frag_cache_drain - Drain the current page from the page_frag cache.
+ * @nc: page_frag cache to drain
+ */
 void page_frag_cache_drain(struct page_frag_cache *nc)
 {
 	if (!nc->encoded_va)
@@ -193,6 +237,19 @@ void __page_frag_cache_drain(struct page *page, unsigned int count)
 }
 EXPORT_SYMBOL(__page_frag_cache_drain);
 
+/**
+ * __page_frag_alloc_va_align() - Alloc a page fragment with an alignment
+ * requirement.
+ * @nc: page_frag cache from which to allocate
+ * @fragsz: the requested fragment size
+ * @gfp_mask: the allocation gfp to use when the cache needs to be refilled
+ * @align_mask: the requested alignment requirement for the 'va'
+ *
+ * Get a page fragment from the page_frag cache with an alignment requirement.
+ *
+ * Return:
+ * Return the va of the page fragment on success, otherwise return NULL.
+ */
 void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
 				 unsigned int fragsz, gfp_t gfp_mask,
 				 unsigned int align_mask)
@@ -263,8 +320,12 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
 }
 EXPORT_SYMBOL(__page_frag_alloc_va_align);
 
-/*
- * Frees a page fragment allocated out of either a compound or order 0 page.
+/**
+ * page_frag_free_va - Free a page fragment.
+ * @addr: va of the page fragment to be freed
+ *
+ * Free a page fragment allocated out of either a compound or order 0 page by
+ * its virtual address.
  */
 void page_frag_free_va(void *addr)
 {