From patchwork Wed Jul 24 07:03:58 2024
From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, 21cnbao@gmail.com,
 ryan.roberts@arm.com, ziy@nvidia.com, ioworker0@gmail.com,
 da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 1/3] mm: shmem: add file length arg in shmem_get_folio()
 path
Date: Wed, 24 Jul 2024 15:03:58 +0800
Message-Id: <70972d294797b377bf24a7290659e9057b978287.1721720891.git.baolin.wang@linux.alibaba.com>

From: Daniel Gomez

In preparation for large folio support in the write and fallocate paths,
add a file length argument to the shmem_get_folio() path so that the
folio order can be calculated from the file size. The read, page cache
read, and vm fault paths keep passing order-0 (PAGE_SIZE). This enables
high-order folios in the write and fallocate paths once the folio order
is calculated from the length.
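As a rough illustration of the new contract, here is a minimal userspace
sketch (not kernel code; a 4K page size is assumed, and the two helper
functions are illustrative rather than functions from the patch) of the
length hints the different callers pass down:

    #include <stdio.h>
    #include <stddef.h>

    #define PAGE_SHIFT 12
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)

    /* Read, page cache read, and vm fault callers keep passing order-0. */
    static size_t read_or_fault_len(void)
    {
            return PAGE_SIZE;
    }

    /* shmem_fallocate() passes the remaining range being instantiated. */
    static size_t fallocate_len(unsigned long index, unsigned long end)
    {
            return (end - index) << PAGE_SHIFT;
    }

    int main(void)
    {
            printf("read/fault hint: %zu bytes\n", read_or_fault_len());
            /* 256 pages left to instantiate -> a 1MiB hint. */
            printf("fallocate hint:  %zu bytes\n", fallocate_len(0, 256));
            return 0;
    }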
Signed-off-by: Daniel Gomez
Signed-off-by: Baolin Wang
---
 fs/xfs/scrub/xfile.c     |  6 +++---
 fs/xfs/xfs_buf_mem.c     |  3 ++-
 include/linux/shmem_fs.h |  2 +-
 mm/khugepaged.c          |  3 ++-
 mm/shmem.c               | 28 ++++++++++++++++------------
 mm/userfaultfd.c         |  2 +-
 6 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c
index d848222f802b..d814d9d786d3 100644
--- a/fs/xfs/scrub/xfile.c
+++ b/fs/xfs/scrub/xfile.c
@@ -127,7 +127,7 @@ xfile_load(
 		unsigned int offset;
 
 		if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio,
-				SGP_READ) < 0)
+				SGP_READ, PAGE_SIZE) < 0)
 			break;
 		if (!folio) {
 			/*
@@ -197,7 +197,7 @@ xfile_store(
 		unsigned int offset;
 
 		if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio,
-				SGP_CACHE) < 0)
+				SGP_CACHE, PAGE_SIZE) < 0)
 			break;
 		if (filemap_check_wb_err(inode->i_mapping, 0)) {
 			folio_unlock(folio);
@@ -268,7 +268,7 @@ xfile_get_folio(
 
 	pflags = memalloc_nofs_save();
 	error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio,
-			(flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ);
+			(flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, PAGE_SIZE);
 	memalloc_nofs_restore(pflags);
 	if (error)
 		return ERR_PTR(error);
diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c
index 9bb2d24de709..784c81d35a1f 100644
--- a/fs/xfs/xfs_buf_mem.c
+++ b/fs/xfs/xfs_buf_mem.c
@@ -149,7 +149,8 @@ xmbuf_map_page(
 		return -ENOMEM;
 	}
 
-	error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE);
+	error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE,
+				PAGE_SIZE);
 	if (error)
 		return error;
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 1564d7d3ca61..34beaca2f853 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -144,7 +144,7 @@ enum sgp_type {
 };
 
 int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
-		enum sgp_type sgp);
+		enum sgp_type sgp, size_t len);
 
 struct folio *shmem_read_folio_gfp(struct address_space *mapping,
 		pgoff_t index, gfp_t gfp);
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a5ec03ef8722..3c9dbebbdf38 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1867,7 +1867,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 				xas_unlock_irq(&xas);
 				/* swap in or instantiate fallocated page */
 				if (shmem_get_folio(mapping->host, index,
-						&folio, SGP_NOALLOC)) {
+						&folio, SGP_NOALLOC,
+						PAGE_SIZE)) {
 					result = SCAN_FAIL;
 					goto xa_unlocked;
 				}
diff --git a/mm/shmem.c b/mm/shmem.c
index db8f74cac1a2..92ed09527682 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -980,7 +980,7 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index)
 	 * (although in some cases this is just a waste of time).
 	 */
 	folio = NULL;
-	shmem_get_folio(inode, index, &folio, SGP_READ);
+	shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE);
 	return folio;
 }
 
@@ -2094,7 +2094,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
  */
 static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 		struct folio **foliop, enum sgp_type sgp, gfp_t gfp,
-		struct vm_fault *vmf, vm_fault_t *fault_type)
+		struct vm_fault *vmf, vm_fault_t *fault_type, size_t len)
 {
 	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
 	struct mm_struct *fault_mm;
@@ -2297,10 +2297,10 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
  * Return: 0 if successful, else a negative error code.
  */
 int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
-		enum sgp_type sgp)
+		enum sgp_type sgp, size_t len)
 {
 	return shmem_get_folio_gfp(inode, index, foliop, sgp,
-			mapping_gfp_mask(inode->i_mapping), NULL, NULL);
+			mapping_gfp_mask(inode->i_mapping), NULL, NULL, len);
 }
 EXPORT_SYMBOL_GPL(shmem_get_folio);
 
@@ -2395,7 +2395,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf)
 	WARN_ON_ONCE(vmf->page != NULL);
 	err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE,
-			gfp, vmf, &ret);
+			gfp, vmf, &ret, PAGE_SIZE);
 	if (err)
 		return vmf_error(err);
 	if (folio) {
@@ -2895,6 +2895,9 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 	struct folio *folio;
 	int ret = 0;
 
+	if (!mapping_large_folio_support(mapping))
+		len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
+
 	/* i_rwsem is held by caller */
 	if (unlikely(info->seals & (F_SEAL_GROW |
 				   F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) {
@@ -2904,7 +2907,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 		return -EPERM;
 	}
 
-	ret = shmem_get_folio(inode, index, &folio, SGP_WRITE);
+	ret = shmem_get_folio(inode, index, &folio, SGP_WRITE, len);
 	if (ret)
 		return ret;
 
@@ -2975,7 +2978,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			break;
 		}
 
-		error = shmem_get_folio(inode, index, &folio, SGP_READ);
+		error = shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE);
 		if (error) {
 			if (error == -EINVAL)
 				error = 0;
@@ -3152,7 +3155,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 			break;
 
 		error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio,
-					SGP_READ);
+					SGP_READ, PAGE_SIZE);
 		if (error) {
 			if (error == -EINVAL)
 				error = 0;
@@ -3339,7 +3342,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 			error = -ENOMEM;
 		else
 			error = shmem_get_folio(inode, index, &folio,
-						SGP_FALLOC);
+						SGP_FALLOC,
+						(end - index) << PAGE_SHIFT);
 		if (error) {
 			info->fallocend = undo_fallocend;
 			/* Remove the !uptodate folios we added */
@@ -3690,7 +3694,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 	} else {
 		inode_nohighmem(inode);
 		inode->i_mapping->a_ops = &shmem_aops;
-		error = shmem_get_folio(inode, 0, &folio, SGP_WRITE);
+		error = shmem_get_folio(inode, 0, &folio, SGP_WRITE, PAGE_SIZE);
 		if (error)
 			goto out_remove_offset;
 		inode->i_op = &shmem_symlink_inode_operations;
@@ -3736,7 +3740,7 @@ static const char *shmem_get_link(struct dentry *dentry, struct inode *inode,
 			return ERR_PTR(-ECHILD);
 		}
 	} else {
-		error = shmem_get_folio(inode, 0, &folio, SGP_READ);
+		error = shmem_get_folio(inode, 0, &folio, SGP_READ, PAGE_SIZE);
 		if (error)
 			return ERR_PTR(error);
 		if (!folio)
@@ -5209,7 +5213,7 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping,
 	int error;
 
 	error = shmem_get_folio_gfp(inode, index, &folio, SGP_CACHE,
-			gfp, NULL, NULL);
+			gfp, NULL, NULL, PAGE_SIZE);
 	if (error)
 		return ERR_PTR(error);
 
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e54e5c8907fa..c275e34c435a 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -391,7 +391,7 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 	struct page *page;
 	int ret;
 
-	ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC);
+	ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC, PAGE_SIZE);
 	/* Our caller expects us to return -EFAULT if we failed to find folio */
 	if (ret == -ENOENT)
 		ret = -EFAULT;
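One detail of this patch worth calling out is the clamp added to
shmem_write_begin(): when the mapping does not support large folios, the
length hint is limited to the bytes remaining in the current page. A
runnable userspace model of that arithmetic (a sketch, assuming 4K
pages; clamp_len() is an illustrative helper, not a kernel function):

    #include <stdio.h>
    #include <stddef.h>

    #define PAGE_SIZE 4096UL

    /* Models: len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); */
    static size_t clamp_len(unsigned long pos, size_t len)
    {
            size_t room = PAGE_SIZE - (pos & (PAGE_SIZE - 1));

            return len < room ? len : room;
    }

    int main(void)
    {
            /* An 8192-byte write starting at offset 4000 may only cover
             * the 96 bytes left in the first page when large folios are
             * unsupported. */
            printf("clamped len = %zu\n", clamp_len(4000, 8192));
            return 0;
    }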
From patchwork Wed Jul 24 07:03:59 2024
From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, 21cnbao@gmail.com,
 ryan.roberts@arm.com, ziy@nvidia.com, ioworker0@gmail.com,
 da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/3] mm: shmem: add large folio support to the write
 and fallocate paths
Date: Wed, 24 Jul 2024 15:03:59 +0800
Message-Id: <05a31373ee1aa40a6a85d4897324a400686e2fb1.1721720891.git.baolin.wang@linux.alibaba.com>

From: Daniel Gomez

Add large folio support to the shmem write and fallocate paths, matching
the high-order preference mechanism used in the iomap buffered IO path
and in __filemap_get_folio(). Add shmem_mapping_size_order() to get an
order hint for the folio based on the file size, taking the mapping's
alignment requirements into account.

Swap does not support high-order folios for now, so fall back to order-0
when swap is enabled. If the top-level huge page control (set via
'/sys/kernel/mm/transparent_hugepage/shmem_enabled') is enabled, only
PMD-sized THP is allowed, to keep the interface backward compatible.
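The order calculation can be modelled in userspace as below (a sketch
only: MAX_PAGECACHE_ORDER is taken as 8 for illustration since the real
value is config dependent, and the mapping_large_folio_support() and
noswap checks from the patch are assumed to have already passed):

    #include <stdio.h>
    #include <stddef.h>

    #define PAGE_SHIFT          12
    #define MAX_PAGECACHE_ORDER 8   /* illustrative placeholder */

    /* ilog2() stand-in for nonzero 64-bit values. */
    static unsigned int ilog2_zu(size_t v)
    {
            return 63 - __builtin_clzll(v);
    }

    /* Userspace model of shmem_mapping_size_order(). */
    static unsigned int size_order(unsigned long index, size_t size)
    {
            unsigned int order = ilog2_zu(size);

            if (order <= PAGE_SHIFT)
                    return 0;
            order -= PAGE_SHIFT;

            /* If the index is not aligned to the order, use a smaller folio. */
            if (index & ((1UL << order) - 1))
                    order = __builtin_ctzl(index);

            if (order > MAX_PAGECACHE_ORDER)
                    order = MAX_PAGECACHE_ORDER;

            /* Order-1 is not supported due to THP dependencies. */
            return order == 1 ? 0 : order;
    }

    int main(void)
    {
            printf("%u\n", size_order(0, 256 * 1024));      /* 6: aligned 256K */
            printf("%u\n", size_order(4, 256 * 1024));      /* 2: index 4 caps it */
            printf("%u\n", size_order(0, 8 * 1024 * 1024)); /* 8: clamped to max */
            return 0;
    }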
Co-developed-by: Baolin Wang
Signed-off-by: Daniel Gomez
Signed-off-by: Baolin Wang
---
 include/linux/shmem_fs.h |  4 +--
 mm/huge_memory.c         |  2 +-
 mm/shmem.c               | 57 ++++++++++++++++++++++++++++++++++++----
 3 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 34beaca2f853..fb0771218f1b 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -113,11 +113,11 @@ int shmem_unuse(unsigned int type);
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 unsigned long shmem_allowable_huge_orders(struct inode *inode,
 				struct vm_area_struct *vma, pgoff_t index,
-				bool shmem_huge_force);
+				bool shmem_huge_force, size_t len);
 #else
 static inline unsigned long shmem_allowable_huge_orders(struct inode *inode,
 				struct vm_area_struct *vma, pgoff_t index,
-				bool shmem_huge_force)
+				bool shmem_huge_force, size_t len)
 {
 	return 0;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e555fcdd19d4..a8fc3b9e4034 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -162,7 +162,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 	if (!in_pf && shmem_file(vma->vm_file))
 		return shmem_allowable_huge_orders(file_inode(vma->vm_file),
 						   vma, vma->vm_pgoff,
-						   !enforce_sysfs);
+						   !enforce_sysfs, PAGE_SIZE);
 
 	if (!vma_is_anonymous(vma)) {
 		/*
diff --git a/mm/shmem.c b/mm/shmem.c
index 92ed09527682..cc0c1b790267 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1630,10 +1630,47 @@ static gfp_t limit_gfp_mask(gfp_t huge_gfp, gfp_t limit_gfp)
 	return result;
 }
 
+/**
+ * shmem_mapping_size_order - Get maximum folio order for the given file size.
+ * @mapping: Target address_space.
+ * @index: The page index.
+ * @size: The suggested size of the folio to create.
+ *
+ * This returns a high order for folios (when supported) based on the file size
+ * which the mapping currently allows at the given index. The index is relevant
+ * due to alignment considerations the mapping might have. The returned order
+ * may be less than the size passed.
+ *
+ * Like __filemap_get_folio order calculation.
+ *
+ * Return: The order.
+ */
+static inline unsigned int
+shmem_mapping_size_order(struct address_space *mapping, pgoff_t index,
+			 size_t size, struct shmem_sb_info *sbinfo)
+{
+	unsigned int order = ilog2(size);
+
+	if ((order <= PAGE_SHIFT) ||
+	    (!mapping_large_folio_support(mapping) || !sbinfo->noswap))
+		return 0;
+
+	order -= PAGE_SHIFT;
+
+	/* If we're not aligned, allocate a smaller folio */
+	if (index & ((1UL << order) - 1))
+		order = __ffs(index);
+
+	order = min_t(size_t, order, MAX_PAGECACHE_ORDER);
+
+	/* Order-1 not supported due to THP dependency */
+	return (order == 1) ? 0 : order;
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 unsigned long shmem_allowable_huge_orders(struct inode *inode,
 				struct vm_area_struct *vma, pgoff_t index,
-				bool shmem_huge_force)
+				bool shmem_huge_force, size_t len)
 {
 	unsigned long mask = READ_ONCE(huge_shmem_orders_always);
 	unsigned long within_size_orders = READ_ONCE(huge_shmem_orders_within_size);
@@ -1659,10 +1696,20 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 					    vma, vm_flags);
 	if (!vma || !vma_is_anon_shmem(vma)) {
 		/*
-		 * For tmpfs, we now only support PMD sized THP if huge page
-		 * is enabled, otherwise fallback to order 0.
+		 * For tmpfs, if top level huge page is enabled, we just allow
+		 * PMD size THP to keep interface backward compatibility.
+		 */
+		if (global_huge)
+			return BIT(HPAGE_PMD_ORDER);
+
+		/*
+		 * Otherwise, get a highest order hint based on the size of
+		 * write and fallocate paths, then will try each allowable
+		 * huge orders.
 		 */
-		return global_huge ? BIT(HPAGE_PMD_ORDER) : 0;
+		order = shmem_mapping_size_order(inode->i_mapping, index,
+						 len, SHMEM_SB(inode->i_sb));
+		return BIT(order + 1) - 1;
 	}
 
 	/*
@@ -2174,7 +2221,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 	}
 
 	/* Find hugepage orders that are allowed for anonymous shmem and tmpfs. */
-	orders = shmem_allowable_huge_orders(inode, vma, index, false);
+	orders = shmem_allowable_huge_orders(inode, vma, index, false, len);
 	if (orders > 0) {
 		gfp_t huge_gfp;
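Note that the new tmpfs return value, BIT(order + 1) - 1, is not a
single order but a bitmask of every order from 0 up to the size-based
hint, so allocation can fall back to smaller folios. A short runnable
demonstration of the mask arithmetic:

    #include <stdio.h>

    #define BIT(n) (1UL << (n))

    int main(void)
    {
            unsigned int order = 6;

            /* Every order from 0 up to the hint stays allowed, so the
             * allocator can fall back to smaller folios under pressure. */
            unsigned long orders = BIT(order + 1) - 1;

            printf("orders mask = %#lx\n", orders); /* 0x7f: orders 0-6 */
            return 0;
    }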
From patchwork Wed Jul 24 07:04:00 2024
From: Baolin Wang
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, 21cnbao@gmail.com,
 ryan.roberts@arm.com, ziy@nvidia.com, ioworker0@gmail.com,
 da.gomez@samsung.com, p.raghav@samsung.com, baolin.wang@linux.alibaba.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH 3/3] mm: shmem: use mTHP interface to control huge
 orders for tmpfs
Date: Wed, 24 Jul 2024 15:04:00 +0800
Message-Id: <03c77c20a8a6ccdd90678dcc6bf7d4aeaa9d29ad.1721720891.git.baolin.wang@linux.alibaba.com>

For the huge orders allowed by writable mmap() faults on tmpfs, use the
mTHP interface to control the allowable huge orders, while
'huge_shmem_orders_inherit' maintains backward compatibility with the
top-level interface. For the huge orders allowed by the write() and
fallocate() paths on tmpfs, get a highest-order hint based on the size
of the write or fallocate request, then try each allowable huge order
filtered by the mTHP interfaces, if set.
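The resulting selection logic can be sketched in userspace as below (an
illustrative model of the tail of shmem_allowable_huge_orders() only;
the final intersection with the caller-requested orders and the
within_size handling are omitted, and vma == NULL corresponds to the
write()/fallocate() callers):

    #include <stdio.h>

    #define BIT(n) (1UL << (n))

    /* 'mask' holds the orders enabled via the mTHP sysfs knobs. */
    static unsigned long tmpfs_orders(int have_vma, int global_huge,
                                      unsigned long mask, int highest_order)
    {
            if (!have_vma && !global_huge) {
                    if (!mask)
                            return highest_order > 0 ?
                                   BIT(highest_order + 1) - 1 : 0;
                    mask &= BIT(highest_order + 1) - 1;
            }
            return mask;
    }

    int main(void)
    {
            /* No mTHP knob set: the size hint alone decides (orders 0-6). */
            printf("%#lx\n", tmpfs_orders(0, 0, 0, 6));

            /* mTHP orders 4 and 9 enabled, size hint 6: order 9 is
             * filtered out, leaving only order 4. */
            printf("%#lx\n", tmpfs_orders(0, 0, BIT(4) | BIT(9), 6));
            return 0;
    }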
Signed-off-by: Baolin Wang
---
 mm/memory.c |  4 ++--
 mm/shmem.c  | 42 ++++++++++++++++++++++--------------------
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 802d0d8a40f9..3a7f43c66db7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4877,10 +4877,10 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 
 	/*
 	 * Using per-page fault to maintain the uffd semantics, and same
-	 * approach also applies to non-anonymous-shmem faults to avoid
+	 * approach also applies to non shmem/tmpfs faults to avoid
 	 * inflating the RSS of the process.
 	 */
-	if (!vma_is_anon_shmem(vma) || unlikely(userfaultfd_armed(vma))) {
+	if (!vma_is_shmem(vma) || unlikely(userfaultfd_armed(vma))) {
 		nr_pages = 1;
 	} else if (nr_pages > 1) {
 		pgoff_t idx = folio_page_idx(folio, page);
diff --git a/mm/shmem.c b/mm/shmem.c
index cc0c1b790267..8e60cc566196 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1692,26 +1692,6 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 	if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
 		return 0;
 
-	global_huge = shmem_huge_global_enabled(inode, index, shmem_huge_force,
-					    vma, vm_flags);
-	if (!vma || !vma_is_anon_shmem(vma)) {
-		/*
-		 * For tmpfs, if top level huge page is enabled, we just allow
-		 * PMD size THP to keep interface backward compatibility.
-		 */
-		if (global_huge)
-			return BIT(HPAGE_PMD_ORDER);
-
-		/*
-		 * Otherwise, get a highest order hint based on the size of
-		 * write and fallocate paths, then will try each allowable
-		 * huge orders.
-		 */
-		order = shmem_mapping_size_order(inode->i_mapping, index,
-						 len, SHMEM_SB(inode->i_sb));
-		return BIT(order + 1) - 1;
-	}
-
 	/*
 	 * Following the 'deny' semantics of the top level, force the huge
 	 * option off from all mounts.
@@ -1742,9 +1722,31 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
 	if (vm_flags & VM_HUGEPAGE)
 		mask |= READ_ONCE(huge_shmem_orders_madvise);
 
+	global_huge = shmem_huge_global_enabled(inode, index, shmem_huge_force,
+						vma, vm_flags);
 	if (global_huge)
 		mask |= READ_ONCE(huge_shmem_orders_inherit);
 
+	/*
+	 * For the huge orders allowed by writable mmap() faults on tmpfs,
+	 * the mTHP interface is used to control the allowable huge orders,
+	 * while 'huge_shmem_orders_inherit' maintains backward compatibility
+	 * with top-level interface.
+	 *
+	 * For the huge orders allowed by write() and fallocate() paths on tmpfs,
+	 * get a highest order hint based on the size of write and fallocate
+	 * paths, then will try each allowable huge orders filtered by the mTHP
+	 * interfaces if set.
+	 */
+	if (!vma && !global_huge) {
+		int highest_order = shmem_mapping_size_order(inode->i_mapping, index, len,
+							     SHMEM_SB(inode->i_sb));
+
+		if (!mask)
+			return highest_order > 0 ? BIT(highest_order + 1) - 1 : 0;
+
+		mask &= BIT(highest_order + 1) - 1;
+	}
+
 	return orders & mask;
 }