From patchwork Fri Jun 7 14:58:52 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690327
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com, brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de, linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan, linux-xfs@vger.kernel.org, p.raghav@samsung.com, linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de, gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 01/11] readahead: rework loop in page_cache_ra_unbounded()
Date: Fri, 7 Jun 2024 14:58:52 +0000
Message-ID: <20240607145902.1137853-2-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>

From: Hannes Reinecke

Rework the loop in page_cache_ra_unbounded() to advance with the number
of pages in a folio instead of just one page at a time. Note that the
index is incremented by 1 if filemap_add_folio() fails, because the size
of the folio we are trying to add is 1 (order 0).

Signed-off-by: Hannes Reinecke
Co-developed-by: Pankaj Raghav
Acked-by: Darrick J. Wong
Signed-off-by: Pankaj Raghav
---
 mm/readahead.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index c1b23989d9ca..75e934a1fd78 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -208,7 +208,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	struct address_space *mapping = ractl->mapping;
 	unsigned long index = readahead_index(ractl);
 	gfp_t gfp_mask = readahead_gfp_mask(mapping);
-	unsigned long i;
+	unsigned long i = 0;

 	/*
 	 * Partway through the readahead operation, we will have added
@@ -226,7 +226,7 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 	/*
 	 * Preallocate as many pages as we will need.
 	 */
-	for (i = 0; i < nr_to_read; i++) {
+	while (i < nr_to_read) {
 		struct folio *folio = xa_load(&mapping->i_pages, index + i);
 		int ret;

@@ -240,8 +240,8 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 			 * not worth getting one just for that.
 			 */
 			read_pages(ractl);
-			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			ractl->_index += folio_nr_pages(folio);
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}

@@ -256,13 +256,14 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
 				break;
 			read_pages(ractl);
 			ractl->_index++;
-			i = ractl->_index + ractl->_nr_pages - index - 1;
+			i = ractl->_index + ractl->_nr_pages - index;
 			continue;
 		}
 		if (i == nr_to_read - lookahead_size)
 			folio_set_readahead(folio);
 		ractl->_workingset |= folio_test_workingset(folio);
-		ractl->_nr_pages++;
+		ractl->_nr_pages += folio_nr_pages(folio);
+		i += folio_nr_pages(folio);
 	}

 	/*
From patchwork Fri Jun 7 14:58:53 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690328
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com, brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de, linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan, linux-xfs@vger.kernel.org, p.raghav@samsung.com, linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de, gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 02/11] fs: Allow fine-grained control of folio sizes
Date: Fri, 7 Jun 2024 14:58:53 +0000
Message-ID: <20240607145902.1137853-3-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
From: "Matthew Wilcox (Oracle)"

We need filesystems to be able to communicate acceptable folio sizes to
the pagecache for a variety of uses (e.g. large block sizes). Support a
range of folio sizes between order-0 and order-31.

Signed-off-by: Matthew Wilcox (Oracle)
Co-developed-by: Pankaj Raghav
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
Reviewed-by: Darrick J. Wong
---
 include/linux/pagemap.h | 86 ++++++++++++++++++++++++++++++++++-------
 mm/filemap.c            |  6 +--
 mm/readahead.c          |  4 +-
 3 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 8f09ed4a4451..228275e7049f 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -204,14 +204,21 @@ enum mapping_flags {
 	AS_EXITING	= 4, 	/* final truncate in progress */
 	/* writeback related tags are not used */
 	AS_NO_WRITEBACK_TAGS = 5,
-	AS_LARGE_FOLIO_SUPPORT = 6,
-	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
-	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+	AS_RELEASE_ALWAYS = 6,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES = 7,	/* must wait for writeback before modifying
 				   folio contents */
-	AS_UNMOVABLE,		/* The mapping cannot be moved, ever */
-	AS_INACCESSIBLE,	/* Do not attempt direct R/W access to the mapping */
+	AS_UNMOVABLE = 8,	/* The mapping cannot be moved, ever */
+	AS_INACCESSIBLE = 9,	/* Do not attempt direct R/W access to the mapping */
+	/* Bits 16-25 are used for FOLIO_ORDER */
+	AS_FOLIO_ORDER_BITS = 5,
+	AS_FOLIO_ORDER_MIN = 16,
+	AS_FOLIO_ORDER_MAX = AS_FOLIO_ORDER_MIN + AS_FOLIO_ORDER_BITS,
 };

+#define AS_FOLIO_ORDER_MASK	((1u << AS_FOLIO_ORDER_BITS) - 1)
+#define AS_FOLIO_ORDER_MIN_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MIN)
+#define AS_FOLIO_ORDER_MAX_MASK	(AS_FOLIO_ORDER_MASK << AS_FOLIO_ORDER_MAX)
+
 /**
  * mapping_set_error - record a writeback error in the address_space
  * @mapping: the mapping in which an error should be set
@@ -360,9 +367,49 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
 #define MAX_PAGECACHE_ORDER	8
 #endif

+/*
+ * mapping_set_folio_order_range() - Set the orders supported by a file.
+ * @mapping: The address space of the file.
+ * @min: Minimum folio order (between 0-MAX_PAGECACHE_ORDER inclusive).
+ * @max: Maximum folio order (between @min-MAX_PAGECACHE_ORDER inclusive).
+ *
+ * The filesystem should call this function in its inode constructor to
+ * indicate which base size (min) and maximum size (max) of folio the VFS
+ * can use to cache the contents of the file. This should only be used
+ * if the filesystem needs special handling of folio sizes (ie there is
+ * something the core cannot know).
+ * Do not tune it based on, eg, i_size.
+ *
+ * Context: This should not be called while the inode is active as it
+ * is non-atomic.
+ */
+static inline void mapping_set_folio_order_range(struct address_space *mapping,
+						 unsigned int min,
+						 unsigned int max)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return;
+
+	if (min > MAX_PAGECACHE_ORDER)
+		min = MAX_PAGECACHE_ORDER;
+	if (max > MAX_PAGECACHE_ORDER)
+		max = MAX_PAGECACHE_ORDER;
+	if (max < min)
+		max = min;
+
+	mapping->flags = (mapping->flags & ~AS_FOLIO_ORDER_MASK) |
+		(min << AS_FOLIO_ORDER_MIN) | (max << AS_FOLIO_ORDER_MAX);
+}
+
+static inline void mapping_set_folio_min_order(struct address_space *mapping,
+					       unsigned int min)
+{
+	mapping_set_folio_order_range(mapping, min, MAX_PAGECACHE_ORDER);
+}
+
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
- * @mapping: The file.
+ * @mapping: The address space of the file.
  *
  * The filesystem should call this function in its inode constructor to
  * indicate that the VFS can use large folios to cache the contents of
@@ -373,7 +420,23 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 static inline void mapping_set_large_folios(struct address_space *mapping)
 {
-	__set_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	mapping_set_folio_order_range(mapping, 0, MAX_PAGECACHE_ORDER);
+}
+
+static inline
+unsigned int mapping_max_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MAX_MASK) >> AS_FOLIO_ORDER_MAX;
+}
+
+static inline
+unsigned int mapping_min_folio_order(const struct address_space *mapping)
+{
+	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
+		return 0;
+	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }

 /*
@@ -382,16 +445,13 @@ static inline void mapping_set_large_folios(struct address_space *mapping)
  */
 static inline bool mapping_large_folio_support(struct address_space *mapping)
 {
-	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
-		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
+	return mapping_max_folio_order(mapping) > 0;
 }

 /* Return the maximum folio size for this pagecache mapping, in bytes. */
-static inline size_t mapping_max_folio_size(struct address_space *mapping)
+static inline size_t mapping_max_folio_size(const struct address_space *mapping)
 {
-	if (mapping_large_folio_support(mapping))
-		return PAGE_SIZE << MAX_PAGECACHE_ORDER;
-	return PAGE_SIZE;
+	return PAGE_SIZE << mapping_max_folio_order(mapping);
 }

 static inline int filemap_nr_thps(struct address_space *mapping)
diff --git a/mm/filemap.c b/mm/filemap.c
index 37061aafd191..46c7a6f59788 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1933,10 +1933,8 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 	if (WARN_ON_ONCE(!(fgp_flags & (FGP_LOCK | FGP_FOR_MMAP))))
 		fgp_flags |= FGP_LOCK;

-	if (!mapping_large_folio_support(mapping))
-		order = 0;
-	if (order > MAX_PAGECACHE_ORDER)
-		order = MAX_PAGECACHE_ORDER;
+	if (order > mapping_max_folio_order(mapping))
+		order = mapping_max_folio_order(mapping);

 	/* If we're not aligned, allocate a smaller folio */
 	if (index & ((1UL << order) - 1))
 		order = __ffs(index);
diff --git a/mm/readahead.c b/mm/readahead.c
index 75e934a1fd78..da34b28da02c 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -504,9 +504,9 @@ void page_cache_ra_order(struct readahead_control *ractl,

 	limit = min(limit, index + ra->size - 1);

-	if (new_order < MAX_PAGECACHE_ORDER) {
+	if (new_order < mapping_max_folio_order(mapping)) {
 		new_order += 2;
-		new_order = min_t(unsigned int, MAX_PAGECACHE_ORDER, new_order);
+		new_order = min(mapping_max_folio_order(mapping), new_order);
 		new_order = min_t(unsigned int, new_order, ilog2(ra->size));
 	}
From patchwork Fri Jun 7 14:58:54 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690329
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com, brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de, linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan, linux-xfs@vger.kernel.org, p.raghav@samsung.com, linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de, gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 03/11] filemap: allocate mapping_min_order folios in the page cache
Date: Fri, 7 Jun 2024 14:58:54 +0000
Message-ID: <20240607145902.1137853-4-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
From: Pankaj Raghav

filemap_create_folio() and do_read_cache_folio() always allocated folios
of order 0. __filemap_get_folio() tried to allocate higher-order folios
when fgp_flags had a higher-order hint set, but it would fall back to an
order-0 folio if the higher-order memory allocation failed.

Supporting mapping_min_order implies that we guarantee each folio in the
page cache has at least an order of mapping_min_order. When adding new
folios to the page cache we must also ensure the index used is aligned to
the mapping_min_order, as the page cache requires the index to be aligned
to the order of the folio.

Co-developed-by: Luis Chamberlain
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
Reviewed-by: Darrick J. Wong
---
 include/linux/pagemap.h | 20 ++++++++++++++++++++
 mm/filemap.c            | 26 ++++++++++++++++++--------
 2 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 228275e7049f..899b8d751768 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -439,6 +439,26 @@ unsigned int mapping_min_folio_order(const struct address_space *mapping)
 	return (mapping->flags & AS_FOLIO_ORDER_MIN_MASK) >> AS_FOLIO_ORDER_MIN;
 }

+static inline unsigned long mapping_min_folio_nrpages(struct address_space *mapping)
+{
+	return 1UL << mapping_min_folio_order(mapping);
+}
+
+/**
+ * mapping_align_start_index() - Align starting index based on the min
+ * folio order of the page cache.
+ * @mapping: The address_space.
+ *
+ * Ensure the index used is aligned to the minimum folio order when adding
+ * new folios to the page cache by rounding down to the nearest minimum
+ * folio number of pages.
+ */
+static inline pgoff_t mapping_align_start_index(struct address_space *mapping,
+						pgoff_t index)
+{
+	return round_down(index, mapping_min_folio_nrpages(mapping));
+}
+
 /*
  * Large folio support currently depends on THP. These dependencies are
  * being worked on but are not yet fixed.
diff --git a/mm/filemap.c b/mm/filemap.c
index 46c7a6f59788..8bb0d2bc93c5 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -859,6 +859,8 @@ noinline int __filemap_add_folio(struct address_space *mapping,

 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapbacked(folio), folio);
+	VM_BUG_ON_FOLIO(folio_order(folio) < mapping_min_folio_order(mapping),
+			folio);
 	mapping_set_update(&xas, mapping);

 	VM_BUG_ON_FOLIO(index & (folio_nr_pages(folio) - 1), folio);
@@ -1919,8 +1921,10 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 		folio_wait_stable(folio);
 no_page:
 	if (!folio && (fgp_flags & FGP_CREAT)) {
-		unsigned order = FGF_GET_ORDER(fgp_flags);
+		unsigned int min_order = mapping_min_folio_order(mapping);
+		unsigned int order = max(min_order, FGF_GET_ORDER(fgp_flags));
 		int err;
+		index = mapping_align_start_index(mapping, index);

 		if ((fgp_flags & FGP_WRITE) && mapping_can_writeback(mapping))
 			gfp |= __GFP_WRITE;
@@ -1943,7 +1947,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 			gfp_t alloc_gfp = gfp;

 			err = -ENOMEM;
-			if (order > 0)
+			if (order > min_order)
 				alloc_gfp |= __GFP_NORETRY | __GFP_NOWARN;
 			folio = filemap_alloc_folio(alloc_gfp, order);
 			if (!folio)
@@ -1958,7 +1962,7 @@ struct folio *__filemap_get_folio(struct address_space *mapping, pgoff_t index,
 				break;
 			folio_put(folio);
 			folio = NULL;
-		} while (order-- > 0);
+		} while (order-- > min_order);

 		if (err == -EEXIST)
 			goto repeat;
@@ -2447,13 +2451,16 @@ static int filemap_update_page(struct kiocb *iocb,
 }

 static int filemap_create_folio(struct file *file,
-		struct address_space *mapping, pgoff_t index,
+		struct address_space *mapping, loff_t pos,
 		struct folio_batch *fbatch)
 {
 	struct folio *folio;
 	int error;
+	unsigned int min_order = mapping_min_folio_order(mapping);
+	pgoff_t index;

-	folio = filemap_alloc_folio(mapping_gfp_mask(mapping), 0);
+	folio = filemap_alloc_folio(mapping_gfp_mask(mapping),
+				    min_order);
 	if (!folio)
 		return -ENOMEM;

@@ -2471,6 +2478,8 @@ static int filemap_create_folio(struct file *file,
 	 * well to keep locking rules simple.
 	 */
 	filemap_invalidate_lock_shared(mapping);
+	/* index in PAGE units but aligned to min_order number of pages. */
+	index = (pos >> (PAGE_SHIFT + min_order)) << min_order;
 	error = filemap_add_folio(mapping, folio, index,
 			mapping_gfp_constraint(mapping, GFP_KERNEL));
 	if (error == -EEXIST)
@@ -2531,8 +2540,7 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 	if (!folio_batch_count(fbatch)) {
 		if (iocb->ki_flags & (IOCB_NOWAIT | IOCB_WAITQ))
 			return -EAGAIN;
-		err = filemap_create_folio(filp, mapping,
-				iocb->ki_pos >> PAGE_SHIFT, fbatch);
+		err = filemap_create_folio(filp, mapping, iocb->ki_pos, fbatch);
 		if (err == AOP_TRUNCATED_PAGE)
 			goto retry;
 		return err;
@@ -3748,9 +3756,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
 repeat:
 	folio = filemap_get_folio(mapping, index);
 	if (IS_ERR(folio)) {
-		folio = filemap_alloc_folio(gfp, 0);
+		folio = filemap_alloc_folio(gfp,
+					    mapping_min_folio_order(mapping));
 		if (!folio)
 			return ERR_PTR(-ENOMEM);
+		index = mapping_align_start_index(mapping, index);
 		err = filemap_add_folio(mapping, folio, index, gfp);
 		if (unlikely(err)) {
 			folio_put(folio);
From: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com,
 brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan,
 linux-xfs@vger.kernel.org, p.raghav@samsung.com,
 linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de,
 gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 04/11] readahead: allocate folios with mapping_min_order in readahead
Date: Fri, 7 Jun 2024 14:58:55 +0000
Message-ID: <20240607145902.1137853-5-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
From: Pankaj Raghav

page_cache_ra_unbounded() was allocating single pages (order-0 folios) if
there was no folio found at an index. Allocate mapping_min_order folios
instead, as we need to guarantee the minimum order if it is set.

When read_pages() is triggered and a folio is already present, check for
truncation and move ractl->_index by mapping_min_nrpages if that folio was
truncated. This ensures we keep the alignment requirement while adding a
folio to the page cache.

page_cache_ra_order() tries to allocate folios of a higher order if the
index aligns with that order. Modify it so that the order does not go below
the mapping_min_order requirement of the page cache. This function will do
the right thing even if the new_order passed is less than the
mapping_min_order. When adding new folios to the page cache we must also
ensure the index used is aligned to the mapping_min_order, as the page
cache requires the index to be aligned to the order of the folio.

readahead_expand() is called from readahead aops to extend the range of the
readahead, so this function can assume ractl->_index to be aligned with
min_order.
Reviewed-by: Hannes Reinecke
Signed-off-by: Pankaj Raghav
---
 mm/readahead.c | 85 +++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 71 insertions(+), 14 deletions(-)

diff --git a/mm/readahead.c b/mm/readahead.c
index da34b28da02c..389cd802da63 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -206,9 +206,10 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
                 unsigned long nr_to_read, unsigned long lookahead_size)
 {
         struct address_space *mapping = ractl->mapping;
-        unsigned long index = readahead_index(ractl);
+        unsigned long ra_folio_index, index = readahead_index(ractl);
         gfp_t gfp_mask = readahead_gfp_mask(mapping);
-        unsigned long i = 0;
+        unsigned long mark, i = 0;
+        unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);

         /*
          * Partway through the readahead operation, we will have added
@@ -223,6 +224,22 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
         unsigned int nofs = memalloc_nofs_save();

         filemap_invalidate_lock_shared(mapping);
+        index = mapping_align_start_index(mapping, index);
+
+        /*
+         * As iterator `i` is aligned to min_nrpages, round_up the
+         * difference between nr_to_read and lookahead_size to mark the
+         * index that only has lookahead or "async_region" to set the
+         * readahead flag.
+         */
+        ra_folio_index = round_up(readahead_index(ractl) + nr_to_read - lookahead_size,
+                                  min_nrpages);
+        mark = ra_folio_index - index;
+        if (index != readahead_index(ractl)) {
+                nr_to_read += readahead_index(ractl) - index;
+                ractl->_index = index;
+        }
+
         /*
          * Preallocate as many pages as we will need.
          */
@@ -230,7 +247,9 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
                 struct folio *folio = xa_load(&mapping->i_pages, index + i);
                 int ret;

                 if (folio && !xa_is_value(folio)) {
+                        long nr_pages = folio_nr_pages(folio);
+
                         /*
                          * Page already present?  Kick off the current batch
                          * of contiguous pages before continuing with the
@@ -240,12 +259,24 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
                          * not worth getting one just for that.
                          */
                         read_pages(ractl);
-                        ractl->_index += folio_nr_pages(folio);
+
+                        /*
+                         * Move the ractl->_index by at least min_pages
+                         * if the folio got truncated to respect the
+                         * alignment constraint in the page cache.
+                         *
+                         */
+                        if (mapping != folio->mapping)
+                                nr_pages = min_nrpages;
+
+                        VM_BUG_ON_FOLIO(nr_pages < min_nrpages, folio);
+                        ractl->_index += nr_pages;
                         i = ractl->_index + ractl->_nr_pages - index;
                         continue;
                 }

-                folio = filemap_alloc_folio(gfp_mask, 0);
+                folio = filemap_alloc_folio(gfp_mask,
+                                            mapping_min_folio_order(mapping));
                 if (!folio)
                         break;

@@ -255,11 +286,11 @@ void page_cache_ra_unbounded(struct readahead_control *ractl,
                         if (ret == -ENOMEM)
                                 break;
                         read_pages(ractl);
-                        ractl->_index++;
+                        ractl->_index += min_nrpages;
                         i = ractl->_index + ractl->_nr_pages - index;
                         continue;
                 }
-                if (i == nr_to_read - lookahead_size)
+                if (i == mark)
                         folio_set_readahead(folio);
                 ractl->_workingset |= folio_test_workingset(folio);
                 ractl->_nr_pages += folio_nr_pages(folio);
@@ -493,13 +524,19 @@ void page_cache_ra_order(struct readahead_control *ractl,
 {
         struct address_space *mapping = ractl->mapping;
         pgoff_t index = readahead_index(ractl);
+        unsigned int min_order = mapping_min_folio_order(mapping);
         pgoff_t limit = (i_size_read(mapping->host) - 1) >> PAGE_SHIFT;
         pgoff_t mark = index + ra->size - ra->async_size;
         unsigned int nofs;
         int err = 0;
         gfp_t gfp = readahead_gfp_mask(mapping);
+        unsigned int min_ra_size = max(4, mapping_min_folio_nrpages(mapping));

-        if (!mapping_large_folio_support(mapping) || ra->size < 4)
+        /*
+         * Fallback when size < min_nrpages as each folio should be
+         * at least min_nrpages anyway.
+         */
+        if (!mapping_large_folio_support(mapping) || ra->size < min_ra_size)
                 goto fallback;

         limit = min(limit, index + ra->size - 1);
@@ -508,11 +545,20 @@ void page_cache_ra_order(struct readahead_control *ractl,
                 new_order += 2;
                 new_order = min(mapping_max_folio_order(mapping), new_order);
                 new_order = min_t(unsigned int, new_order, ilog2(ra->size));
+                new_order = max(new_order, min_order);
         }

         /* See comment in page_cache_ra_unbounded() */
         nofs = memalloc_nofs_save();
         filemap_invalidate_lock_shared(mapping);
+        /*
+         * If the new_order is greater than min_order and index is
+         * already aligned to new_order, then this will be noop as index
+         * aligned to new_order should also be aligned to min_order.
+         */
+        ractl->_index = mapping_align_start_index(mapping, index);
+        index = readahead_index(ractl);
+
         while (index <= limit) {
                 unsigned int order = new_order;
@@ -520,7 +566,7 @@ void page_cache_ra_order(struct readahead_control *ractl,
                 if (index & ((1UL << order) - 1))
                         order = __ffs(index);
                 /* Don't allocate pages past EOF */
-                while (index + (1UL << order) - 1 > limit)
+                while (order > min_order && index + (1UL << order) - 1 > limit)
                         order--;
                 err = ra_alloc_folio(ractl, index, mark, order, gfp);
                 if (err)
@@ -784,8 +830,15 @@ void readahead_expand(struct readahead_control *ractl,
         struct file_ra_state *ra = ractl->ra;
         pgoff_t new_index, new_nr_pages;
         gfp_t gfp_mask = readahead_gfp_mask(mapping);
+        unsigned long min_nrpages = mapping_min_folio_nrpages(mapping);
+        unsigned int min_order = mapping_min_folio_order(mapping);

         new_index = new_start / PAGE_SIZE;
+        /*
+         * Readahead code should have aligned the ractl->_index to
+         * min_nrpages before calling readahead aops.
+         */
+        VM_BUG_ON(!IS_ALIGNED(ractl->_index, min_nrpages));

         /* Expand the leading edge downwards */
         while (ractl->_index > new_index) {
@@ -795,9 +848,11 @@ void readahead_expand(struct readahead_control *ractl,
                 if (folio && !xa_is_value(folio))
                         return; /* Folio apparently present */

-                folio = filemap_alloc_folio(gfp_mask, 0);
+                folio = filemap_alloc_folio(gfp_mask, min_order);
                 if (!folio)
                         return;
+
+                index = mapping_align_start_index(mapping, index);
                 if (filemap_add_folio(mapping, folio, index, gfp_mask) < 0) {
                         folio_put(folio);
                         return;
@@ -807,7 +862,7 @@ void readahead_expand(struct readahead_control *ractl,
                         ractl->_workingset = true;
                         psi_memstall_enter(&ractl->_pflags);
                 }
-                ractl->_nr_pages++;
+                ractl->_nr_pages += min_nrpages;
                 ractl->_index = folio->index;
         }

@@ -822,9 +877,11 @@ void readahead_expand(struct readahead_control *ractl,
                 if (folio && !xa_is_value(folio))
                         return; /* Folio apparently present */

-                folio = filemap_alloc_folio(gfp_mask, 0);
+                folio = filemap_alloc_folio(gfp_mask, min_order);
                 if (!folio)
                         return;
+
+                index = mapping_align_start_index(mapping, index);
                 if (filemap_add_folio(mapping, folio, index, gfp_mask) < 0) {
                         folio_put(folio);
                         return;
@@ -834,10 +891,10 @@ void readahead_expand(struct readahead_control *ractl,
                         ractl->_workingset = true;
                         psi_memstall_enter(&ractl->_pflags);
                 }
-                ractl->_nr_pages++;
+                ractl->_nr_pages += min_nrpages;

                 if (ra) {
-                        ra->size++;
-                        ra->async_size++;
+                        ra->size += min_nrpages;
+                        ra->async_size += min_nrpages;
                 }
         }
 }

From patchwork Fri Jun 7 14:58:56 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690331
From: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com,
 brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan,
 linux-xfs@vger.kernel.org, p.raghav@samsung.com,
 linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de,
 gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 05/11] mm: split a folio in minimum folio order chunks
Date: Fri, 7 Jun 2024 14:58:56 +0000
Message-ID: <20240607145902.1137853-6-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
From: Luis Chamberlain

split_folio() and split_folio_to_list() assume order 0. To support minorder
for non-anonymous folios, we must expand these to check the folio mapping
order and use that.

Set new_order to be at least the minimum folio order, if one is set, in
split_huge_page_to_list() so that we can maintain the minimum folio order
requirement in the page cache. Update the debugfs write files used for
testing to ensure the order is respected as well. We simply enforce the min
order when a file mapping is used.

Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 include/linux/huge_mm.h | 14 ++++++++---
 mm/huge_memory.c        | 55 ++++++++++++++++++++++++++++++++++++++---
 2 files changed, 61 insertions(+), 8 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 020e2344eb86..15caa4e7b00e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -88,6 +88,8 @@ extern struct kobj_attribute shmem_enabled_attr;
 #define thp_vma_allowable_order(vma, vm_flags, tva_flags, order) \
         (!!thp_vma_allowable_orders(vma, vm_flags, tva_flags, BIT(order)))

+#define split_folio(f) split_folio_to_list(f, NULL)
+
 #ifdef CONFIG_PGTABLE_HAS_HUGE_LEAVES
 #define HPAGE_PMD_SHIFT PMD_SHIFT
 #define HPAGE_PUD_SHIFT PUD_SHIFT
@@ -307,9 +309,10 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
 bool can_split_folio(struct folio *folio, int *pextra_pins);
 int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
                 unsigned int new_order);
+int split_folio_to_list(struct folio *folio, struct list_head *list);
 static inline int split_huge_page(struct page *page)
 {
-        return split_huge_page_to_list_to_order(page, NULL, 0);
+        return split_folio(page_folio(page));
 }
 void deferred_split_folio(struct folio *folio);

@@ -474,6 +477,12 @@ static inline int split_huge_page(struct page *page)
 {
         return 0;
 }
+
+static inline int split_folio_to_list(struct folio *folio,
+                struct list_head *list)
+{
+        return 0;
+}
+
 static inline void deferred_split_folio(struct folio *folio) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
         do { } while (0)
@@ -578,7 +587,4 @@ static inline int split_folio_to_order(struct folio *folio, int new_order)
         return split_folio_to_list_to_order(folio, NULL, new_order);
 }

-#define split_folio_to_list(f, l) split_folio_to_list_to_order(f, l, 0)
-#define split_folio(f) split_folio_to_order(f, 0)
-
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e49f402d7c7..399a4f5125c7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3068,6 +3068,9 @@ bool can_split_folio(struct folio *folio, int *pextra_pins)
  * released, or if some unexpected race happened (e.g., anon VMA disappeared,
  * truncation).
  *
+ * Callers should ensure that the order respects the address space mapping
+ * min-order if one is set for non-anonymous folios.
+ *
  * Returns -EINVAL when trying to split to an order that is incompatible
  * with the folio. Splitting to order 0 is compatible with all folios.
  */
@@ -3143,6 +3146,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
                 mapping = NULL;
                 anon_vma_lock_write(anon_vma);
         } else {
+                unsigned int min_order;
                 gfp_t gfp;

                 mapping = folio->mapping;
@@ -3153,6 +3157,14 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
                         goto out;
                 }

+                min_order = mapping_min_folio_order(folio->mapping);
+                if (new_order < min_order) {
+                        VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
+                                     min_order);
+                        ret = -EINVAL;
+                        goto out;
+                }
+
                 gfp = current_gfp_context(mapping_gfp_mask(mapping) &
                                                         GFP_RECLAIM_MASK);

@@ -3264,6 +3276,21 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
         return ret;
 }

+int split_folio_to_list(struct folio *folio, struct list_head *list)
+{
+        unsigned int min_order = 0;
+
+        if (!folio_test_anon(folio)) {
+                if (!folio->mapping) {
+                        count_vm_event(THP_SPLIT_PAGE_FAILED);
+                        return -EBUSY;
+                }
+                min_order = mapping_min_folio_order(folio->mapping);
+        }
+
+        return split_huge_page_to_list_to_order(&folio->page, list, min_order);
+}
+
 void __folio_undo_large_rmappable(struct folio *folio)
 {
         struct deferred_split *ds_queue;
@@ -3493,6 +3520,8 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
                 struct vm_area_struct *vma = vma_lookup(mm, addr);
                 struct page *page;
                 struct folio *folio;
+                struct address_space *mapping;
+                unsigned int target_order = new_order;

                 if (!vma)
                         break;
@@ -3513,7 +3542,13 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
                 if (!is_transparent_hugepage(folio))
                         goto next;

-                if (new_order >= folio_order(folio))
+                if (!folio_test_anon(folio)) {
+                        mapping = folio->mapping;
+                        target_order = max(new_order,
+                                           mapping_min_folio_order(mapping));
+                }
+
+                if (target_order >= folio_order(folio))
                         goto next;

                 total++;
@@ -3529,9 +3564,13 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
                 if (!folio_trylock(folio))
                         goto next;

-                if (!split_folio_to_order(folio, new_order))
+                if (!folio_test_anon(folio) && folio->mapping != mapping)
+                        goto unlock;
+
+                if (!split_folio_to_order(folio, target_order))
                         split++;

+unlock:
                 folio_unlock(folio);
 next:
                 folio_put(folio);
@@ -3556,6 +3595,7 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
         pgoff_t index;
         int nr_pages = 1;
         unsigned long total = 0, split = 0;
+        unsigned int min_order;

         file = getname_kernel(file_path);
         if (IS_ERR(file))
@@ -3569,9 +3609,11 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
                  file_path, off_start, off_end);

         mapping = candidate->f_mapping;
+        min_order = mapping_min_folio_order(mapping);

         for (index = off_start; index < off_end; index += nr_pages) {
                 struct folio *folio = filemap_get_folio(mapping, index);
+                unsigned int target_order = new_order;

                 nr_pages = 1;
                 if (IS_ERR(folio))
@@ -3580,18 +3622,23 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
                 if (!folio_test_large(folio))
                         goto next;

+                target_order = max(new_order, min_order);
                 total++;
                 nr_pages = folio_nr_pages(folio);

-                if (new_order >= folio_order(folio))
+                if (target_order >= folio_order(folio))
                         goto next;

                 if (!folio_trylock(folio))
                         goto next;

-                if (!split_folio_to_order(folio, new_order))
+                if (folio->mapping != mapping)
+                        goto unlock;
+
+                if (!split_folio_to_order(folio, target_order))
                         split++;

+unlock:
                 folio_unlock(folio);
 next:
                 folio_put(folio);

From patchwork Fri Jun 7 14:58:57 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690332
From: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com,
 brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
 linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan,
 linux-xfs@vger.kernel.org, p.raghav@samsung.com,
 linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de,
 gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 06/11] filemap: cap PTE range to be created to allowed zero fill in folio_map_range()
Date: Fri, 7 Jun 2024 14:58:57 +0000
Message-ID: <20240607145902.1137853-7-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>

From: Pankaj Raghav

Usually the page cache does not extend beyond the size of the inode;
therefore, no PTEs are created for folios that extend beyond that size.

But with LBS support, we might extend the page cache beyond the size of
the inode, as we need to guarantee folios of minimum order. Cap the PTE
range to be created for the page cache up to the maximum allowed zero-fill
file end, which is aligned to PAGE_SIZE.

An fstests test has been created to trigger this edge case [0].

[0] https://lore.kernel.org/fstests/20240415081054.1782715-1-mcgrof@kernel.org/

Signed-off-by: Luis Chamberlain
Reviewed-by: Hannes Reinecke
Signed-off-by: Pankaj Raghav
Reviewed-by: Matthew Wilcox (Oracle)
---
 mm/filemap.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 8bb0d2bc93c5..0e48491b3d10 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3610,7 +3610,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
         struct vm_area_struct *vma = vmf->vma;
         struct file *file = vma->vm_file;
         struct address_space *mapping = file->f_mapping;
-        pgoff_t last_pgoff = start_pgoff;
+        pgoff_t file_end, last_pgoff = start_pgoff;
         unsigned long addr;
         XA_STATE(xas, &mapping->i_pages, start_pgoff);
         struct folio *folio;
@@ -3636,6 +3636,10 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
                 goto out;
         }

+        file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+        if (end_pgoff > file_end)
+                end_pgoff = file_end;
+
         folio_type = mm_counter_file(folio);
         do {
                 unsigned long end;

From patchwork Fri Jun 7 14:58:58 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690333
From: "Pankaj Raghav (Samsung)"
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com, brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de, linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan, linux-xfs@vger.kernel.org, p.raghav@samsung.com, linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de, gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 07/11] iomap: fix iomap_dio_zero() for fs bs > system page size
Date: Fri, 7 Jun 2024 14:58:58 +0000
Message-ID: <20240607145902.1137853-8-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
MIME-Version: 1.0

From: Pankaj Raghav

iomap_dio_zero() will pad an fs block with zeroes if the direct IO size < fs block size. iomap_dio_zero() has an implicit assumption that fs block size < page_size. This is true for most filesystems at the moment. If the block size > page size, this will send the contents of the page next to the zero page (as len > PAGE_SIZE) to the underlying block device, causing FS corruption.

iomap is generic infrastructure and should not make any assumptions about the fs block size and the page size of the system.

Signed-off-by: Pankaj Raghav
Reviewed-by: Hannes Reinecke
---
 fs/internal.h          |  5 +++++
 fs/iomap/buffered-io.c |  6 ++++++
 fs/iomap/direct-io.c   | 26 ++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 84f371193f74..30217f0ff4c6 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -35,6 +35,11 @@ static inline void bdev_cache_init(void)
 int __block_write_begin_int(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block, const struct iomap *iomap);

+/*
+ * iomap/direct-io.c
+ */
+int iomap_dio_init(void);
+
 /*
  * char_dev.c
  */

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 49938419fcc7..9f791db473e4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1990,6 +1990,12 @@ EXPORT_SYMBOL_GPL(iomap_writepages);

 static int __init iomap_init(void)
 {
+	int ret;
+
+	ret = iomap_dio_init();
+	if (ret)
+		return ret;
+
 	return bioset_init(&iomap_ioend_bioset, 4 * (PAGE_SIZE / SECTOR_SIZE),
 			   offsetof(struct iomap_ioend, io_bio),
 			   BIOSET_NEED_BVECS);

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index f3b43d223a46..b95600b254a3 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -27,6 +27,13 @@
 #define IOMAP_DIO_WRITE		(1U << 30)
 #define IOMAP_DIO_DIRTY		(1U << 31)

+/*
+ * Used for sub block zeroing in iomap_dio_zero()
+ */
+#define ZERO_FSB_SIZE (65536)
+#define ZERO_FSB_ORDER (get_order(ZERO_FSB_SIZE))
+static struct page *zero_fs_block;
+
 struct iomap_dio {
 	struct kiocb		*iocb;
 	const struct iomap_dio_ops *dops;
@@ -52,6 +59,16 @@ struct iomap_dio {
 	};
 };

+int iomap_dio_init(void)
+{
+	zero_fs_block = alloc_pages(GFP_KERNEL | __GFP_ZERO, ZERO_FSB_ORDER);
+
+	if (!zero_fs_block)
+		return -ENOMEM;
+
+	return 0;
+}
+
 static struct bio *iomap_dio_alloc_bio(const struct iomap_iter *iter,
 		struct iomap_dio *dio, unsigned short nr_vecs, blk_opf_t opf)
 {
@@ -236,17 +253,22 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 		loff_t pos, unsigned len)
 {
 	struct inode *inode = file_inode(dio->iocb->ki_filp);
-	struct page *page = ZERO_PAGE(0);
 	struct bio *bio;

+	/*
+	 * Max block size supported is 64k
+	 */
+	WARN_ON_ONCE(len > ZERO_FSB_SIZE);
+
 	bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
 	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
 				  GFP_KERNEL);
+
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;

-	__bio_add_page(bio, page, len, 0);
+	__bio_add_page(bio, zero_fs_block, len, 0);
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }

From patchwork Fri Jun 7 14:58:59 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690334
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v7 08/11] xfs: use kvmalloc for xattr buffers
Date: Fri, 7 Jun 2024 14:58:59 +0000
Message-ID: <20240607145902.1137853-9-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
MIME-Version: 1.0

From: Dave Chinner

Pankaj Raghav reported that when the filesystem block size is larger than the page size, the xattr code can use kmalloc() for high order allocations.
This triggers a useless warning in the allocator as it is a __GFP_NOFAIL allocation here:

static inline
struct page *rmqueue(struct zone *preferred_zone,
			struct zone *zone, unsigned int order,
			gfp_t gfp_flags, unsigned int alloc_flags,
			int migratetype)
{
	struct page *page;

	/*
	 * We most definitely don't want callers attempting to
	 * allocate greater than order-1 page units with __GFP_NOFAIL.
	 */
>>>>	WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));
	...

Fix this by changing all these call sites to use kvmalloc(), which will strip the NOFAIL from the kmalloc attempt and, if that fails, will do a __GFP_NOFAIL vmalloc(). This is not an issue that production systems will see, as filesystems with block size > page size cannot be mounted by the kernel; Pankaj is developing this functionality right now.

Reported-by: Pankaj Raghav
Fixes: f078d4ea8276 ("xfs: convert kmem_alloc() to kmalloc()")
Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Reviewed-by: Pankaj Raghav
---
 fs/xfs/libxfs/xfs_attr_leaf.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index b9e98950eb3d..09f4cb061a6e 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -1138,10 +1138,7 @@ xfs_attr3_leaf_to_shortform(
 	trace_xfs_attr_leaf_to_sf(args);

-	tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
-	if (!tmpbuffer)
-		return -ENOMEM;
-
+	tmpbuffer = kvmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
 	memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);

 	leaf = (xfs_attr_leafblock_t *)tmpbuffer;
@@ -1205,7 +1202,7 @@ xfs_attr3_leaf_to_shortform(
 	error = 0;
 out:
-	kfree(tmpbuffer);
+	kvfree(tmpbuffer);
 	return error;
 }

@@ -1613,7 +1610,7 @@ xfs_attr3_leaf_compact(
 	trace_xfs_attr_leaf_compact(args);

-	tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
+	tmpbuffer = kvmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
 	memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
 	memset(bp->b_addr, 0, args->geo->blksize);
 	leaf_src = (xfs_attr_leafblock_t *)tmpbuffer;
@@ -1651,7 +1648,7 @@ xfs_attr3_leaf_compact(
 	 */
 	xfs_trans_log_buf(trans, bp, 0, args->geo->blksize - 1);

-	kfree(tmpbuffer);
+	kvfree(tmpbuffer);
 }

@@ -2330,7 +2327,7 @@ xfs_attr3_leaf_unbalance(
 		struct xfs_attr_leafblock *tmp_leaf;
 		struct xfs_attr3_icleaf_hdr tmphdr;

-		tmp_leaf = kzalloc(state->args->geo->blksize,
+		tmp_leaf = kvzalloc(state->args->geo->blksize,
 				GFP_KERNEL | __GFP_NOFAIL);

 		/*
@@ -2371,7 +2368,7 @@ xfs_attr3_leaf_unbalance(
 		}
 		memcpy(save_leaf, tmp_leaf, state->args->geo->blksize);
 		savehdr = tmphdr; /* struct copy */
-		kfree(tmp_leaf);
+		kvfree(tmp_leaf);
 	}
 	xfs_attr3_leaf_hdr_to_disk(state->args->geo, save_leaf, &savehdr);

From patchwork Fri Jun 7 14:59:00 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690335
From: "Pankaj Raghav (Samsung)"
Subject: [PATCH v7 09/11] xfs: expose block size in stat
Date: Fri, 7 Jun 2024 14:59:00 +0000
Message-ID: <20240607145902.1137853-10-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
MIME-Version: 1.0

From: Pankaj Raghav

For block size larger than page size, the unit of efficient IO is the block size, not the page size. Leaving stat() to report PAGE_SIZE as the block size causes test programs like fsx to issue illegal ranges for operations that require block size alignment (e.g. fallocate() insert range). Hence update the preferred IO size to reflect the block size in this case.
This change is based on a patch originally from Dave Chinner. [1]

[1] https://lwn.net/ml/linux-fsdevel/20181107063127.3902-16-david@fromorbit.com/

Reviewed-by: Darrick J. Wong
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_iops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index ff222827e550..a7883303dee8 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -560,7 +560,7 @@ xfs_stat_blksize(
 		return 1U << mp->m_allocsize_log;
 	}

-	return PAGE_SIZE;
+	return max_t(uint32_t, PAGE_SIZE, mp->m_sb.sb_blocksize);
 }

 STATIC int

From patchwork Fri Jun 7 14:59:01 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690336
a=rsa-sha256; c=relaxed/relaxed; d=pankajraghav.com; s=MBO0001; t=1717772389; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bne/CkGLFb5Q+0r/mTCjdzHVWDyDH0TqoTxmXgBXKCE=; b=Cn0j/AWNicr8OHAovh1eFeMf2drV755n5F+zvZ1MdbQyUkqzCsMz2XxYhmPOC2AHubjaPN By+3Lnr4JK0aS27yj94k9MEkCJckWgq9jK1c1j3LFluOxy7/JDiNmX3uKasP8JO2jy6GXs qtnId+YktikV8HX2Fv9n7TpvJhdGTgqufWtf8fk40FJTjnJd+iomWBoBupp3MQc3ELjc3X lWt6MxodQqcZvTxBpdTkn7wBci1296lYMYsDoQYaUOU31D2WNUvNG6Mk1PrmUiDTt3/QKy Q7qtduMYWr9zGNNdSdW+wsgGejLyBrAGbz1EjEttBWc9C32hcnrBwyRzNil4Ng== From: "Pankaj Raghav (Samsung)" To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com, brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de, linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan , linux-xfs@vger.kernel.org, p.raghav@samsung.com, linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de, gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com Subject: [PATCH v7 10/11] xfs: make the calculation generic in xfs_sb_validate_fsb_count() Date: Fri, 7 Jun 2024 14:59:01 +0000 Message-ID: <20240607145902.1137853-11-kernel@pankajraghav.com> In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com> References: <20240607145902.1137853-1-kernel@pankajraghav.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 4D89D20052 X-Stat-Signature: muw6tuxeidx4bqib7xgnzwncd9u9cisf X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1717772398-437388 X-HE-Meta: 
From: Pankaj Raghav

Instead of assuming that PAGE_SHIFT is always higher than the blocklog,
make the calculation generic so that the page cache count can be
calculated correctly for LBS.

Reviewed-by: Darrick J. Wong
Signed-off-by: Pankaj Raghav
---
 fs/xfs/xfs_mount.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 09eef1721ef4..46cb0384143b 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -132,11 +132,19 @@ xfs_sb_validate_fsb_count(
 	xfs_sb_t	*sbp,
 	uint64_t	nblocks)
 {
+	uint64_t	max_index;
+	uint64_t	max_bytes;
+
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 
+	if (check_shl_overflow(nblocks, sbp->sb_blocklog, &max_bytes))
+		return -EFBIG;
+
 	/* Limited by ULONG_MAX of page cache index */
-	if (nblocks >> (PAGE_SHIFT - sbp->sb_blocklog) > ULONG_MAX)
+	max_index = max_bytes >> PAGE_SHIFT;
+
+	if (max_index > ULONG_MAX)
 		return -EFBIG;
 	return 0;
 }
From patchwork Fri Jun 7 14:59:02 2024
X-Patchwork-Submitter: "Pankaj Raghav (Samsung)"
X-Patchwork-Id: 13690337

From: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>
To: david@fromorbit.com, djwong@kernel.org, chandan.babu@oracle.com,
	brauner@kernel.org, akpm@linux-foundation.org, willy@infradead.org
Cc: mcgrof@kernel.org, linux-mm@kvack.org, hare@suse.de,
	linux-kernel@vger.kernel.org, yang@os.amperecomputing.com, Zi Yan,
	linux-xfs@vger.kernel.org, p.raghav@samsung.com,
	linux-fsdevel@vger.kernel.org, kernel@pankajraghav.com, hch@lst.de,
	gost.dev@samsung.com, cl@os.amperecomputing.com, john.g.garry@oracle.com
Subject: [PATCH v7 11/11] xfs: enable block size larger than page size support
Date: Fri, 7 Jun 2024 14:59:02 +0000
Message-ID: <20240607145902.1137853-12-kernel@pankajraghav.com>
In-Reply-To: <20240607145902.1137853-1-kernel@pankajraghav.com>
References: <20240607145902.1137853-1-kernel@pankajraghav.com>
From: Pankaj Raghav

The page cache now has the ability to have a minimum order when
allocating a folio, which is a prerequisite for adding support for
block size > page size.

Reviewed-by: Darrick J. Wong
Signed-off-by: Luis Chamberlain
Signed-off-by: Pankaj Raghav
---
 fs/xfs/libxfs/xfs_ialloc.c |  5 +++++
 fs/xfs/libxfs/xfs_shared.h |  3 +++
 fs/xfs/xfs_icache.c        |  6 ++++--
 fs/xfs/xfs_mount.c         |  1 -
 fs/xfs/xfs_super.c         | 18 ++++++++++--------
 5 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 14c81f227c5b..1e76431d75a4 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -3019,6 +3019,11 @@ xfs_ialloc_setup_geometry(
 		igeo->ialloc_align = mp->m_dalign;
 	else
 		igeo->ialloc_align = 0;
+
+	if (mp->m_sb.sb_blocksize > PAGE_SIZE)
+		igeo->min_folio_order = mp->m_sb.sb_blocklog - PAGE_SHIFT;
+	else
+		igeo->min_folio_order = 0;
 }
 
 /* Compute the location of the root directory inode that is laid out by mkfs. */
diff --git a/fs/xfs/libxfs/xfs_shared.h b/fs/xfs/libxfs/xfs_shared.h
index 34f104ed372c..e67a1c7cc0b0 100644
--- a/fs/xfs/libxfs/xfs_shared.h
+++ b/fs/xfs/libxfs/xfs_shared.h
@@ -231,6 +231,9 @@ struct xfs_ino_geometry {
 	/* precomputed value for di_flags2 */
 	uint64_t	new_diflags2;
 
+	/* minimum folio order of a page cache allocation */
+	unsigned int	min_folio_order;
+
 };
 
 #endif /* __XFS_SHARED_H__ */
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 0953163a2d84..5ed3dc9e7d90 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -89,7 +89,8 @@ xfs_inode_alloc(
 	/* VFS doesn't initialise i_mode or i_state! */
 	VFS_I(ip)->i_mode = 0;
 	VFS_I(ip)->i_state = 0;
-	mapping_set_large_folios(VFS_I(ip)->i_mapping);
+	mapping_set_folio_min_order(VFS_I(ip)->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 
 	XFS_STATS_INC(mp, vn_active);
 	ASSERT(atomic_read(&ip->i_pincount) == 0);
@@ -324,7 +325,8 @@ xfs_reinit_inode(
 	inode->i_rdev = dev;
 	inode->i_uid = uid;
 	inode->i_gid = gid;
-	mapping_set_large_folios(inode->i_mapping);
+	mapping_set_folio_min_order(inode->i_mapping,
+				    M_IGEO(mp)->min_folio_order);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 46cb0384143b..a99454208807 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -135,7 +135,6 @@ xfs_sb_validate_fsb_count(
 	uint64_t	max_index;
 	uint64_t	max_bytes;
 
-	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
 	ASSERT(sbp->sb_blocklog >= BBSHIFT);
 
 	if (check_shl_overflow(nblocks, sbp->sb_blocklog, &max_bytes))
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 27e9f749c4c7..b8a93a8f35ca 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1638,16 +1638,18 @@ xfs_fs_fill_super(
 		goto out_free_sb;
 	}
 
-	/*
-	 * Until this is fixed only page-sized or smaller data blocks work.
-	 */
 	if (mp->m_sb.sb_blocksize > PAGE_SIZE) {
-		xfs_warn(mp,
-		"File system with blocksize %d bytes. "
-		"Only pagesize (%ld) or less will currently work.",
+		if (!xfs_has_crc(mp)) {
+			xfs_warn(mp,
+"V4 Filesystem with blocksize %d bytes. Only pagesize (%ld) or less is supported.",
 				mp->m_sb.sb_blocksize, PAGE_SIZE);
-		error = -ENOSYS;
-		goto out_free_sb;
+			error = -ENOSYS;
+			goto out_free_sb;
+		}
+
+		xfs_warn(mp,
+"EXPERIMENTAL: V5 Filesystem with Large Block Size (%d bytes) enabled.",
+			mp->m_sb.sb_blocksize);
 	}
 
 	/* Ensure this filesystem fits in the page cache limits */