From patchwork Tue Feb 4 23:12:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13960165 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4082521C17E; Tue, 4 Feb 2025 23:12:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710738; cv=none; b=AbPfvXqyfT9qMKW1tH8/JFocBGGbfW5/q7HhgxkXkVz8b4z/drV3Wv+t3OMpfdiA+4FMLDGB+MqkEMTDAwvjaLWgQX28wkUZO5kn7npjWwgKRaECh025TSgqOWaZO+WKpyl8YMhA2wcNAla3DBSxy8m6B9kSKEhfGzQxN+BSeoM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710738; c=relaxed/simple; bh=o4HWFKGhOfRqQZejNjFvDKTVduD7VxcCSTYrB59N0Yg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lRYQA6sz0eUAzOej2DEolEDPWDzTYJpTpLZryxjRM2QC3bmkV71O6ocQ0OuDUf43MZsdq9CEhQXMg12LLvEvgm6iEAbeFlzQLiLo9SXhvL7EMyADGPCsViANUl/02IJjguhoMYtMpP8RfCsCzqXzd0/ZARGG4T1XakrDzKj2QOE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=dwYwqC0B; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="dwYwqC0B" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; 
h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=Z9eItY4c7G0s+c+amJ06bcyD3TNKrS6Yw3t1FzanFdU=; b=dwYwqC0BnaxfmaHsYRpSVX1szZ y51ea1k/HUWuGbfBMjZT8cR+F3m8SputDuamldmdZf26OFtaPx0ElsPayezYJxgw5K2dqiPUuWrgS UJwLbSyAoStA48qFVkBvCVCtSVR4ODs+VlzjWMsE33dibGXHO0jLRbZyVe4qM4di5obEtKYv3Rg+0 nBqbJHk/Lou9nFRSKHo1AS/uxHZASXTQ7zj3S1vbwUFgoeF0JCya92/amtnBc5LMYrhRoNzQlEpIs Qx+Mz+ZbBjWgGnIVp7mVUFaVm60IxAVsd/bddgwX6rQogTzV7PwUcb8tsSLx/dxLiQO8+lkMBa9dO 8hsBmURw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tfS5T-00000001nhM-14jK; Tue, 04 Feb 2025 23:12:11 +0000 From: Luis Chamberlain To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v2 1/8] fs/buffer: simplify block_read_full_folio() with bh_offset() Date: Tue, 4 Feb 2025 15:12:02 -0800 Message-ID: <20250204231209.429356-2-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org> References: <20250204231209.429356-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain When we read over all buffers in a folio we currently use the buffer index on the folio and blocksize to get the offset. Simplify this with bh_offset(). This simplifies the loop while making no functional changes. 
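The claim that the buffer index times the block size equals the buffer's offset in the folio can be checked in user space. The following is a minimal sketch under stated assumptions: `struct toy_bh`, `toy_bh_offset()` and `toy_offsets_match()` are hypothetical stand-ins, not the kernel's `struct buffer_head` or `bh_offset()`. The model assumes only that buffers sit back to back in a folio whose size is a power of two and whose start is aligned to that size, so that masking the data pointer yields the offset.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct buffer_head: only what the model needs. */
struct toy_bh {
	char *b_data;               /* start of this block's data in the folio */
	struct toy_bh *b_this_page; /* circular list of buffers on the folio */
};

/*
 * Toy analogue of bh_offset(): byte offset of the data within its folio.
 * Assumes folio_size is a power of two and the folio is folio_size-aligned.
 */
static size_t toy_bh_offset(const struct toy_bh *bh, size_t folio_size)
{
	return (uintptr_t)bh->b_data & (folio_size - 1);
}

/*
 * Walk the ring the way the read loop does and compare both formulas.
 * Returns 1 if i * blocksize matches the pointer-derived offset for every
 * buffer, 0 on a mismatch, -1 on unsupported sizes or allocation failure.
 */
static int toy_offsets_match(size_t folio_size, size_t blocksize)
{
	size_t nr, i;
	char *folio;
	struct toy_bh *bhs, *bh;
	int ok = 1;

	/* Reject sizes the model cannot represent. */
	if (folio_size == 0 || (folio_size & (folio_size - 1)) != 0 ||
	    blocksize == 0 || folio_size % blocksize != 0)
		return -1;

	nr = folio_size / blocksize;
	folio = aligned_alloc(folio_size, folio_size);
	bhs = calloc(nr, sizeof(*bhs));
	if (!folio || !bhs) {
		free(folio);
		free(bhs);
		return -1;
	}

	/* Lay the buffers out back to back and link them into a ring. */
	for (i = 0; i < nr; i++) {
		bhs[i].b_data = folio + i * blocksize;
		bhs[i].b_this_page = &bhs[(i + 1) % nr];
	}

	i = 0;
	bh = &bhs[0];
	do {
		if (i * blocksize != toy_bh_offset(bh, folio_size))
			ok = 0;
		i++;
	} while ((bh = bh->b_this_page) != &bhs[0]);

	assert(i == nr); /* the ring visits every buffer exactly once */
	free(folio);
	free(bhs);
	return ok;
}
```

Calling `toy_offsets_match(4096, 512)` models a 4 KiB folio with eight 512-byte buffers. In this model the separate counter `i` carries no information that the buffer itself does not already hold, which is why the loop can drop it.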
Suggested-by: Matthew Wilcox
Signed-off-by: Luis Chamberlain
---
 fs/buffer.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index cc8452f60251..b99560e8a142 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2381,7 +2381,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 	lblock = div_u64(limit + blocksize - 1, blocksize);
 	bh = head;
 	nr = 0;
-	i = 0;
 
 	do {
 		if (buffer_uptodate(bh))
@@ -2398,7 +2397,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 					page_error = true;
 			}
 			if (!buffer_mapped(bh)) {
-				folio_zero_range(folio, i * blocksize,
+				folio_zero_range(folio, bh_offset(bh),
 						blocksize);
 				if (!err)
 					set_buffer_uptodate(bh);
@@ -2412,7 +2411,7 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 				continue;
 		}
 		arr[nr++] = bh;
-	} while (i++, iblock++, (bh = bh->b_this_page) != head);
+	} while (iblock++, (bh = bh->b_this_page) != head);
 
 	if (fully_mapped)
 		folio_set_mappedtodisk(folio);

From patchwork Tue Feb 4 23:12:03 2025
X-Patchwork-Submitter: Luis Chamberlain
X-Patchwork-Id: 13960166
From: Luis Chamberlain
To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org
Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org
Subject: [PATCH v2 2/8] fs/buffer: remove batching from async read
Date: Tue, 4 Feb 2025 15:12:03 -0800
Message-ID: <20250204231209.429356-3-mcgrof@kernel.org>
In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org>
References: <20250204231209.429356-1-mcgrof@kernel.org>

From: Matthew Wilcox

The current implementation of a folio async read in
block_read_full_folio() first batches all buffer-heads which need IO
issued by putting them in an array of maximum size MAX_BUF_PER_PAGE.
After collection it locks the batched buffer-heads and finally submits
the pending reads. On systems where the page size is quite large, like
Hexagon with 256 KiB pages, this batching can lead to stack growth
warnings, so we want to avoid that.

Note the use of folio_end_read() in block_read_full_folio(): it is used
either when the folio is determined to be fully uptodate and no pending
read is needed, when an IO error happened on get_block(), or when an
out-of-bound read raced against batching collection to make our required
reads uptodate.

We can simplify this logic considerably and remove the stack growth
issues of MAX_BUF_PER_PAGE by replacing the batched logic with one which
only issues IO for the previous buffer-head. Keeping in mind that we'll
always have one buffer-head (the current one) on the folio with an async
flag, this will prevent any calls to folio_end_read().
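
The "submit the previous buffer-head" pattern described above can be modelled outside the kernel. This is only a sketch under stated assumptions: `struct toy_buf`, `toy_read_folio()` and `toy_run()` are hypothetical names, locking and real IO are left out, and "submitting" is reduced to counting. It shows that one trailing pointer is enough to replace the array, and that the caller finishes the read itself only when nothing was submitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer-head in this model. */
struct toy_buf {
	int uptodate;         /* nonzero: no read is needed */
	struct toy_buf *next; /* circular list, like b_this_page */
};

struct toy_result {
	int submitted;    /* number of reads issued */
	int ended_inline; /* 1 if the caller had to finish the read itself */
};

/* One pass over the ring, deferring each submission by one buffer. */
static struct toy_result toy_read_folio(struct toy_buf *head)
{
	struct toy_result r = { 0, 0 };
	struct toy_buf *bh = head, *prev = NULL;

	do {
		if (bh->uptodate)
			continue; /* jumps to the loop condition below */
		/* bh now counts as marked for async read */
		if (prev)
			r.submitted++; /* models submitting prev */
		prev = bh;
	} while ((bh = bh->next) != head);

	if (prev)
		r.submitted++;      /* the last marked buffer goes out here */
	else
		r.ended_inline = 1; /* nothing in flight: finish the read now */
	return r;
}

/* Test harness: build a ring of up to 8 buffers from an array of flags. */
static struct toy_result toy_run(const int *uptodate, size_t n)
{
	struct toy_buf bufs[8];
	size_t i;

	assert(n >= 1 && n <= 8);
	for (i = 0; i < n; i++) {
		bufs[i].uptodate = uptodate[i];
		bufs[i].next = &bufs[(i + 1) % n];
	}
	return toy_read_folio(&bufs[0]);
}
```

With flags `{0, 1, 0, 0}` the model issues three reads and never ends the read inline; with all buffers uptodate it issues none and ends the read in the caller. While the loop runs, the most recently marked buffer is always still unsubmitted, which models the "one buffer-head with an async flag" property the message refers to.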

So we accomplish two things with this:

 o Avoid large stack arrays with MAX_BUF_PER_PAGE
 o Make the need for folio_end_read() explicit and easier to read

Suggested-by: Matthew Wilcox
Signed-off-by: Luis Chamberlain
---
 fs/buffer.c | 51 +++++++++++++++++++++------------------------------
 1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index b99560e8a142..167fa3e33566 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2361,9 +2361,8 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 {
 	struct inode *inode = folio->mapping->host;
 	sector_t iblock, lblock;
-	struct buffer_head *bh, *head, *arr[MAX_BUF_PER_PAGE];
+	struct buffer_head *bh, *head, *prev = NULL;
 	size_t blocksize;
-	int nr, i;
 	int fully_mapped = 1;
 	bool page_error = false;
 	loff_t limit = i_size_read(inode);
@@ -2380,7 +2379,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 	iblock = div_u64(folio_pos(folio), blocksize);
 	lblock = div_u64(limit + blocksize - 1, blocksize);
 	bh = head;
-	nr = 0;
 
 	do {
 		if (buffer_uptodate(bh))
@@ -2410,40 +2408,33 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block)
 			if (buffer_uptodate(bh))
 				continue;
 		}
-		arr[nr++] = bh;
+
+		lock_buffer(bh);
+		if (buffer_uptodate(bh)) {
+			unlock_buffer(bh);
+			continue;
+		}
+
+		mark_buffer_async_read(bh);
+		if (prev)
+			submit_bh(REQ_OP_READ, prev);
+		prev = bh;
 	} while (iblock++, (bh = bh->b_this_page) != head);
 
 	if (fully_mapped)
 		folio_set_mappedtodisk(folio);
 
-	if (!nr) {
-		/*
-		 * All buffers are uptodate or get_block() returned an
-		 * error when trying to map them - we can finish the read.
-		 */
-		folio_end_read(folio, !page_error);
-		return 0;
-	}
-
-	/* Stage two: lock the buffers */
-	for (i = 0; i < nr; i++) {
-		bh = arr[i];
-		lock_buffer(bh);
-		mark_buffer_async_read(bh);
-	}
-
 	/*
-	 * Stage 3: start the IO. Check for uptodateness
-	 * inside the buffer lock in case another process reading
-	 * the underlying blockdev brought it uptodate (the sct fix).
+	 * All buffers are uptodate or get_block() returned an error
+	 * when trying to map them - we must finish the read because
+	 * end_buffer_async_read() will never be called on any buffer
+	 * in this folio.
 	 */
-	for (i = 0; i < nr; i++) {
-		bh = arr[i];
-		if (buffer_uptodate(bh))
-			end_buffer_async_read(bh, 1);
-		else
-			submit_bh(REQ_OP_READ, bh);
-	}
+	if (prev)
+		submit_bh(REQ_OP_READ, prev);
+	else
+		folio_end_read(folio, !page_error);
+
 	return 0;
 }
 EXPORT_SYMBOL(block_read_full_folio);

From patchwork Tue Feb 4 23:12:04 2025
X-Patchwork-Submitter: Luis Chamberlain
X-Patchwork-Id: 13960161
From: Luis Chamberlain
To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org
Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org, Hannes Reinecke
Subject: [PATCH v2 3/8] fs/mpage: avoid negative shift for large blocksize
Date: Tue, 4 Feb 2025 15:12:04 -0800
Message-ID: <20250204231209.429356-4-mcgrof@kernel.org>
In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org>
References: <20250204231209.429356-1-mcgrof@kernel.org>

From: Hannes Reinecke

For large blocksizes the number of block bits is larger than PAGE_SHIFT,
so the shift count PAGE_SHIFT - blkbits becomes negative. Instead use
folio_pos(folio) >> blkbits to calculate the sector number. This is
required to enable large folios with buffer-heads.

Signed-off-by: Luis Chamberlain
Signed-off-by: Hannes Reinecke
---
 fs/mpage.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 82aecf372743..a3c82206977f 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -181,7 +181,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args)
 	if (folio_buffers(folio))
 		goto confused;
 
-	block_in_file = (sector_t)folio->index << (PAGE_SHIFT - blkbits);
+	block_in_file = folio_pos(folio) >> blkbits;
 	last_block = block_in_file + args->nr_pages * blocks_per_page;
 	last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits;
 	if (last_block > last_block_in_file)
@@ -527,7 +527,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc,
 	 * The page has no buffers: map it to disk
 	 */
 	BUG_ON(!folio_test_uptodate(folio));
-	block_in_file = (sector_t)folio->index << (PAGE_SHIFT - blkbits);
+	block_in_file = folio_pos(folio) >> blkbits;
 	/*
 	 * Whole page beyond EOF? Skip allocating blocks to avoid leaking
 	 * space.
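
The arithmetic behind the negative shift can be worked through in user space. A minimal sketch, assuming 4 KiB pages: `TOY_PAGE_SHIFT`, `toy_block_from_index()` and `toy_block_from_pos()` are hypothetical names, and the byte position is assumed to be the page index shifted left by the page shift. The old formula is guarded so the sketch never executes a negative shift, which is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u /* assumption: 4 KiB pages */

/*
 * Old formula: index << (PAGE_SHIFT - blkbits).
 * Returns 1 and stores the block number when the shift count is valid,
 * returns 0 when blkbits exceeds the page shift and the count would be
 * negative.
 */
static int toy_block_from_index(uint64_t index, unsigned blkbits,
				uint64_t *out)
{
	if (blkbits > TOY_PAGE_SHIFT)
		return 0;
	*out = index << (TOY_PAGE_SHIFT - blkbits);
	return 1;
}

/* New formula: byte position shifted right by blkbits. */
static uint64_t toy_block_from_pos(uint64_t index, unsigned blkbits)
{
	uint64_t pos = index << TOY_PAGE_SHIFT; /* assumed byte position */

	assert(blkbits < 64);
	return pos >> blkbits;
}
```

For page index 8 and 512-byte blocks (blkbits = 9) both formulas give block 64. For 16 KiB blocks (blkbits = 14) the old formula has no valid shift count, while the position-based one gives 32768 >> 14 = 2.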

From patchwork Tue Feb 4 23:12:05 2025
X-Patchwork-Submitter: Luis Chamberlain
X-Patchwork-Id: 13960163
From: Luis Chamberlain
To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org
Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org
Subject: [PATCH v2 4/8] fs/mpage: use blocks_per_folio instead of blocks_per_page
Date: Tue, 4 Feb 2025 15:12:05 -0800
Message-ID: <20250204231209.429356-5-mcgrof@kernel.org>
In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org>
References: <20250204231209.429356-1-mcgrof@kernel.org>

From: Hannes Reinecke

Convert mpage to folios and associate the number of blocks with a folio
and not a page.
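
Why the block count has to follow the folio rather than the page shows up in plain arithmetic. A minimal sketch: `toy_blocks_per_unit()` is a hypothetical helper, and the 4 KiB page and 64 KiB folio sizes used with it are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks of size (1 << blkbits) that fit in unit_bytes. */
static unsigned toy_blocks_per_unit(size_t unit_bytes, unsigned blkbits)
{
	assert(blkbits < sizeof(size_t) * 8);
	return (unsigned)(unit_bytes >> blkbits);
}
```

With 512-byte blocks a 4 KiB page holds `toy_blocks_per_unit(4096, 9)`, which is 8 blocks. With 16 KiB blocks the per-page count `toy_blocks_per_unit(4096, 14)` is 0, so a loop bounded by it would never map a block, while a 64 KiB folio gives `toy_blocks_per_unit(65536, 14)`, which is 4.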
[mcgrof: keep 1 page request on mpage_read_folio()] Signed-off-by: Hannes Reinecke Signed-off-by: Luis Chamberlain --- fs/mpage.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/fs/mpage.c b/fs/mpage.c index a3c82206977f..c17d7a724e4b 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -107,7 +107,7 @@ static void map_buffer_to_folio(struct folio *folio, struct buffer_head *bh, * don't make any buffers if there is only one buffer on * the folio and the folio just needs to be set up to date */ - if (inode->i_blkbits == PAGE_SHIFT && + if (inode->i_blkbits == folio_shift(folio) && buffer_uptodate(bh)) { folio_mark_uptodate(folio); return; @@ -153,7 +153,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) struct folio *folio = args->folio; struct inode *inode = folio->mapping->host; const unsigned blkbits = inode->i_blkbits; - const unsigned blocks_per_page = PAGE_SIZE >> blkbits; + const unsigned blocks_per_folio = folio_size(folio) >> blkbits; const unsigned blocksize = 1 << blkbits; struct buffer_head *map_bh = &args->map_bh; sector_t block_in_file; @@ -161,7 +161,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) sector_t last_block_in_file; sector_t first_block; unsigned page_block; - unsigned first_hole = blocks_per_page; + unsigned first_hole = blocks_per_folio; struct block_device *bdev = NULL; int length; int fully_mapped = 1; @@ -182,7 +182,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) goto confused; block_in_file = folio_pos(folio) >> blkbits; - last_block = block_in_file + args->nr_pages * blocks_per_page; + last_block = block_in_file + args->nr_pages * blocks_per_folio; last_block_in_file = (i_size_read(inode) + blocksize - 1) >> blkbits; if (last_block > last_block_in_file) last_block = last_block_in_file; @@ -204,7 +204,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) clear_buffer_mapped(map_bh); 
break; } - if (page_block == blocks_per_page) + if (page_block == blocks_per_folio) break; page_block++; block_in_file++; @@ -216,7 +216,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) * Then do more get_blocks calls until we are done with this folio. */ map_bh->b_folio = folio; - while (page_block < blocks_per_page) { + while (page_block < blocks_per_folio) { map_bh->b_state = 0; map_bh->b_size = 0; @@ -229,7 +229,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) if (!buffer_mapped(map_bh)) { fully_mapped = 0; - if (first_hole == blocks_per_page) + if (first_hole == blocks_per_folio) first_hole = page_block; page_block++; block_in_file++; @@ -247,7 +247,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) goto confused; } - if (first_hole != blocks_per_page) + if (first_hole != blocks_per_folio) goto confused; /* hole -> non-hole */ /* Contiguous blocks? */ @@ -260,7 +260,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) if (relative_block == nblocks) { clear_buffer_mapped(map_bh); break; - } else if (page_block == blocks_per_page) + } else if (page_block == blocks_per_folio) break; page_block++; block_in_file++; @@ -268,7 +268,7 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) bdev = map_bh->b_bdev; } - if (first_hole != blocks_per_page) { + if (first_hole != blocks_per_folio) { folio_zero_segment(folio, first_hole << blkbits, PAGE_SIZE); if (first_hole == 0) { folio_mark_uptodate(folio); @@ -303,10 +303,10 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) relative_block = block_in_file - args->first_logical_block; nblocks = map_bh->b_size >> blkbits; if ((buffer_boundary(map_bh) && relative_block == nblocks) || - (first_hole != blocks_per_page)) + (first_hole != blocks_per_folio)) args->bio = mpage_bio_submit_read(args->bio); else - args->last_block_in_bio = first_block + blocks_per_page - 1; + 
args->last_block_in_bio = first_block + blocks_per_folio - 1; out: return args->bio; @@ -456,12 +456,12 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, struct address_space *mapping = folio->mapping; struct inode *inode = mapping->host; const unsigned blkbits = inode->i_blkbits; - const unsigned blocks_per_page = PAGE_SIZE >> blkbits; + const unsigned blocks_per_folio = folio_size(folio) >> blkbits; sector_t last_block; sector_t block_in_file; sector_t first_block; unsigned page_block; - unsigned first_unmapped = blocks_per_page; + unsigned first_unmapped = blocks_per_folio; struct block_device *bdev = NULL; int boundary = 0; sector_t boundary_block = 0; @@ -486,12 +486,12 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, */ if (buffer_dirty(bh)) goto confused; - if (first_unmapped == blocks_per_page) + if (first_unmapped == blocks_per_folio) first_unmapped = page_block; continue; } - if (first_unmapped != blocks_per_page) + if (first_unmapped != blocks_per_folio) goto confused; /* hole -> non-hole */ if (!buffer_dirty(bh) || !buffer_uptodate(bh)) @@ -536,7 +536,7 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, goto page_is_mapped; last_block = (i_size - 1) >> blkbits; map_bh.b_folio = folio; - for (page_block = 0; page_block < blocks_per_page; ) { + for (page_block = 0; page_block < blocks_per_folio; ) { map_bh.b_state = 0; map_bh.b_size = 1 << blkbits; @@ -618,14 +618,14 @@ static int __mpage_writepage(struct folio *folio, struct writeback_control *wbc, BUG_ON(folio_test_writeback(folio)); folio_start_writeback(folio); folio_unlock(folio); - if (boundary || (first_unmapped != blocks_per_page)) { + if (boundary || (first_unmapped != blocks_per_folio)) { bio = mpage_bio_submit_write(bio); if (boundary_block) { write_boundary_block(boundary_bdev, boundary_block, 1 << blkbits); } } else { - mpd->last_block_in_bio = first_block + blocks_per_page - 1; + 
mpd->last_block_in_bio = first_block + blocks_per_folio - 1; } goto out; From patchwork Tue Feb 4 23:12:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13960159 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 223B221C165; Tue, 4 Feb 2025 23:12:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; cv=none; b=fXsltvp0FV7h2KkbdjxL7f6yVP3Txt2JftRWOMyN3YEVM6cnD+HiAow89d5L6QlWMSQ4QWAbTKlYT7re9HZwvJ5h7uPekF5ffhIHv9PpPtaLF8tI9FTFvg0GwDO57XLMd9umFEHPSY9f+irPZW5tc3Zd1fUIU2t2aUzISj2UXPg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; c=relaxed/simple; bh=F7xenTjqnGD5aXkLVQCMlQcbOfxYcqUtL1eNNXM1DRY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FMk6b9e7Er8nlB8OxzcaP5BG51e1KPuewkzkRywkRQuk3J6pH3IgEJ6gUWEzogzWHxzMz1Y6qBdt4RNMr91HuGm+Rs3uSlqPb2Kt/N5DX97SiPgeRI0c6cTVhDQq1D1M28COykhMSuZdl05EUcNo5KkREDq+70gxpe4g0kEuJL0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=KQXrbBpY; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="KQXrbBpY" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; 
c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=gIJE1yXpif2CCfNIqrsdZ1ScxfCjjJhcAk10zquVIzE=; b=KQXrbBpYImm6WQgyMBh8uZWBFd 8fhOrvLjp0Y/9fv2pHE6+OciYfileSX4qGXBfSqqNu9NuyuYUkiJofTC4mo/bfK0FpAbSt6pdDTKx e3LX6yUYJrHxU5f+VyUaDtY8raa3hM28p6lQaeocODVMM1sVq4eLRSY2ePqRtbR3cjBM79DMzRDtf TIy7gt97JTaLPvTuNE6PocSik5aH8Vu9kzVyVX86STWSLRWJtL2OlOsgBQTvjR4PpS210bT8QSeoA +Vt5POzfHSI8ksLlgyPmqy283VLaWdT45QWNdGiDEEyLEehOpBPhZEzsPXJnJ7RmgrjQYwc8UZaMj y4Uc0O/g==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tfS5T-00000001nhU-1fWW; Tue, 04 Feb 2025 23:12:11 +0000 From: Luis Chamberlain To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v2 5/8] fs/buffer fs/mpage: remove large folio restriction Date: Tue, 4 Feb 2025 15:12:06 -0800 Message-ID: <20250204231209.429356-6-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org> References: <20250204231209.429356-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain Now that buffer-heads has been converted over to support large folios we can remove the built-in VM_BUG_ON_FOLIO() checks which prevents their use. 
Signed-off-by: Luis Chamberlain --- fs/buffer.c | 2 -- fs/mpage.c | 3 --- 2 files changed, 5 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 167fa3e33566..194eacbefc95 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -2371,8 +2371,6 @@ int block_read_full_folio(struct folio *folio, get_block_t *get_block) if (IS_ENABLED(CONFIG_FS_VERITY) && IS_VERITY(inode)) limit = inode->i_sb->s_maxbytes; - VM_BUG_ON_FOLIO(folio_test_large(folio), folio); - head = folio_create_buffers(folio, inode, 0); blocksize = head->b_size; diff --git a/fs/mpage.c b/fs/mpage.c index c17d7a724e4b..031230531a2a 100644 --- a/fs/mpage.c +++ b/fs/mpage.c @@ -170,9 +170,6 @@ static struct bio *do_mpage_readpage(struct mpage_readpage_args *args) unsigned relative_block; gfp_t gfp = mapping_gfp_constraint(folio->mapping, GFP_KERNEL); - /* MAX_BUF_PER_PAGE, for example */ - VM_BUG_ON_FOLIO(folio_test_large(folio), folio); - if (args->is_readahead) { opf |= REQ_RAHEAD; gfp |= __GFP_NORETRY | __GFP_NOWARN; From patchwork Tue Feb 4 23:12:07 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13960160 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4089521C182; Tue, 4 Feb 2025 23:12:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; cv=none; b=Dw9r7lRlhzqZ4kT/8YfOGj5d1we8ddfJqB/S3EateJg0BdBlnBqAdEWWXbfD4futlvFk1GhJZvghesi1GR/7XLUTK8WOJpq9hWlKIKIa/DMfU7vKAUPWJCFZKZGgy9pQkY3oE8awuHIaFh7dHjcoQMI1sSA5z4utvi2Dnar7MGQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; c=relaxed/simple; bh=HbYBj5rAZfZP5YdThJ5y7bQrat0JXHXg5QPwfqhiCLw=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IuM4D5LgEzs7YK08a9od2he8VNfPJ10Of6AQl1Zt8ll/8Zl4KYk/CUAJuwtg8wqGF7IS8NbTHRY7ChhJHAKEMxdelLsqSswMTvb6hQxmQdkLpgePAGREMTYOUWr5AsdJxVd2vDec082ZwXkkCQnb03zPFlczFXYh4ZWDGdcX4n0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=Y7kbPbrZ; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="Y7kbPbrZ" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=BqAmEHN/NB4fRpClNavDgsxh4n8ElEwRbJC+1e1GTe8=; b=Y7kbPbrZWUV0UxwwKcM89Dh4Cv 4dxa4VsxyQWeZvEu4hKO19mk9Oahn1gL/beoB8rGrrXjF6u0/dChAjDktCPfvzCpmcH4tRI8pz9MG A3Y2ewRHNTTYqRWySxth22/RQPFoYLru4nRX05nafsuR7gb19uX8vZLDt0wZsslpTJi7N/SKPgiyL Os4lK9jtkE6l8Ai4GI0oZfoXiZPFY4MbnfNhlsT1hHbSp0reE/ZrsEY3yWe85OVnH9RX0IhEg9wVW bVkvsCsUGGXielAB07u++sxl65h/dSO9scBeUGQQn1NgEqw4rgbP0GvNTnctfEc9KpiGA85cCl8ll ymPxZ9Yg==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tfS5T-00000001nhW-1oc0; Tue, 04 Feb 2025 23:12:11 +0000 From: Luis Chamberlain To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, 
linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v2 6/8] block/bdev: enable large folio support for large logical block sizes Date: Tue, 4 Feb 2025 15:12:07 -0800 Message-ID: <20250204231209.429356-7-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org> References: <20250204231209.429356-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain

From: Hannes Reinecke

Call mapping_set_folio_min_order() when modifying the logical block
size to ensure folios are allocated with the correct size.

Signed-off-by: Hannes Reinecke
---
 block/bdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index 9d73a8fbf7f9..8aadf1f23cb4 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -148,6 +148,8 @@ static void set_init_blocksize(struct block_device *bdev)
 		bsize <<= 1;
 	}
 	BD_INODE(bdev)->i_blkbits = blksize_bits(bsize);
+	mapping_set_folio_min_order(BD_INODE(bdev)->i_mapping,
+				    get_order(bsize));
 }

 int set_blocksize(struct file *file, int size)
@@ -169,6 +171,7 @@ int set_blocksize(struct file *file, int size)
 	if (inode->i_blkbits != blksize_bits(size)) {
 		sync_blockdev(bdev);
 		inode->i_blkbits = blksize_bits(size);
+		mapping_set_folio_min_order(inode->i_mapping, get_order(size));
 		kill_bdev(bdev);
 	}
 	return 0;

From patchwork Tue Feb 4 23:12:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13960158 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E969C1FF7A5; Tue, 4 Feb 2025 23:12:14 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; cv=none; b=qhC+4957EYgIidROCOmyZJ2IQkdrAlbBWEh5iIdm38v6+59TMBVALS5H6pykkMcMbejqEKdxjU2VgbfdxCm1XatrnBzBuQF88NwdFAQGyOrIJufpCwthwMeUOFtsmgH8i6S3iJber2yrKrjCLfA1grOZdKwghKELSBNEcjO4CXU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; c=relaxed/simple; bh=SmE0ANenCwNCRIh861iB/cVXFzTWu0vhYRGAtIA6Re4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eD5dE93zYvJ4G6CELBgYoOxbiVPOgZvOgAMxdpsQxy+CS4wFrr9p5Y1BGvQFYfmAh5Agg5vwXidvphObbwjRFztoT+GWQEliw4lB/xm/zNvH6CThYWUPW2nfcENyfoDIPozXRH7sw5OBNjpllpEILavkMfcd2FK5RHwwQsJGEVc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=btPrmYVU; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="btPrmYVU" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=uOeLYsw9Sv9eePIry8YkgaAM0u7ucnWKFkLyfp328sM=; b=btPrmYVUIxv/PUPNXWIIMwBM6P xdr5bCfUDfEXVLeDiWlkcc9wcAAlPLtZcKGPl/KPku29F6Heroqfk02KMNylpMuKL7Plx2J0Is6pN r29mKYQRjTW2qgleR4u8VtGuYVSGPqq9kZpsczBByM/09yCXiLysvwsVPSRQAsqDj+akDHjW4yyY6 lQx6hToYIMU3ptGlugC+owrMVO0Vzv/+2A5D0t9Y0gB4ZVnyMiVlPkCKl27D9rZR1EGoeRcC8nHKO 
hGbQrFB6d9WIpejPUoWvCwLG9N3tbA/1BoMoDt4Mt/VctQGHM6yhkDu+8qidTpDefS6Em56DHpcCr bHlfs/cQ==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tfS5T-00000001nhY-1xER; Tue, 04 Feb 2025 23:12:11 +0000 From: Luis Chamberlain To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v2 7/8] block/bdev: lift block size restrictions to 64k Date: Tue, 4 Feb 2025 15:12:08 -0800 Message-ID: <20250204231209.429356-8-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org> References: <20250204231209.429356-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain

We can now support block sizes larger than PAGE_SIZE, so in theory we
should be able to lift the restriction up to the max supported page
cache order. However, bound ourselves to what we can currently validate
and test. Through blktests and fstests we can validate up to 64k today.

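The bound described above, together with the minimum folio order that the previous patch derives from the block size, can be sketched in userspace as follows. This is only an illustration, not kernel code: the `sketch_` names are hypothetical, the page size is assumed to be 4 KiB, and `sketch_get_order()` is a simple stand-in for what the kernel's get_order() computes.

```c
#include <errno.h>
#include <stdbool.h>

/* Assumption: 4 KiB pages; the kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE	4096UL
/* Mirrors the 64k limit (BLK_MAX_BLOCK_SIZE) introduced by this patch. */
#define SKETCH_MAX_BLOCK_SIZE	(64UL * 1024UL)

static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Same condition as the patched blk_validate_block_size(). */
static int sketch_validate_block_size(unsigned long bsize)
{
	if (bsize < 512 || bsize > SKETCH_MAX_BLOCK_SIZE ||
	    !sketch_is_power_of_2(bsize))
		return -EINVAL;

	return 0;
}

/* Smallest order such that (SKETCH_PAGE_SIZE << order) >= size. */
static unsigned int sketch_get_order(unsigned long size)
{
	unsigned int order = 0;

	while ((SKETCH_PAGE_SIZE << order) < size)
		order++;

	return order;
}
```

Under the 4 KiB page assumption, a 16k block size passes validation and maps to order 2, and 64k maps to order 4, while 128k is rejected with -EINVAL.
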
Reviewed-by: Hannes Reinecke
Signed-off-by: Luis Chamberlain
---
 block/bdev.c           | 3 +--
 include/linux/blkdev.h | 9 ++++++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 8aadf1f23cb4..22806ce11e1d 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -183,8 +183,7 @@ int sb_set_blocksize(struct super_block *sb, int size)
 {
 	if (set_blocksize(sb->s_bdev_file, size))
 		return 0;
-	/* If we get here, we know size is power of two
-	 * and it's value is between 512 and PAGE_SIZE */
+	/* If we get here, we know size is validated */
 	sb->s_blocksize = size;
 	sb->s_blocksize_bits = blksize_bits(size);
 	return sb->s_blocksize;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 248416ecd01c..a89513302977 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include

 struct module;
 struct request_queue;
@@ -267,10 +268,16 @@ static inline dev_t disk_devt(struct gendisk *disk)
 	return MKDEV(disk->major, disk->first_minor);
 }

+/*
+ * We should strive for 1 << (PAGE_SHIFT + MAX_PAGECACHE_ORDER)
+ * however we constrain this to what we can validate and test.
+ */
+#define BLK_MAX_BLOCK_SIZE SZ_64K
+
 /* blk_validate_limits() validates bsize, so drivers don't usually need to */
 static inline int blk_validate_block_size(unsigned long bsize)
 {
-	if (bsize < 512 || bsize > PAGE_SIZE || !is_power_of_2(bsize))
+	if (bsize < 512 || bsize > BLK_MAX_BLOCK_SIZE || !is_power_of_2(bsize))
 		return -EINVAL;

 	return 0;

From patchwork Tue Feb 4 23:12:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Luis Chamberlain X-Patchwork-Id: 13960162 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4D6BA21C187; Tue, 4 Feb 2025 23:12:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.137.202.133 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; cv=none; b=h3OsAiwoDMoS3V5bpRFThoNfOzNNrD6yO9oqP31tu0y3xJ/IaP8AhORLrqpc/xT+Li6Mzx0hBJRjdFTH79FtMyaHIP7e8msMQipxr9rcvIF3JI//HITwtb9pwuj25YxazenX6HksNMb9JSdSQ4C92/3Nz3nXl413EYvsk3dWFFg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738710736; c=relaxed/simple; bh=0il+T3BQ5n+k4lrS7bxTfATTMygd4Dtzc6zHZySEXRM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Wj3uuhWeYzca/28zBRdLxA71zx/Fzd8TNuKwttBymg1ciDu+MmXGnR+ShTel+UMrZ4FbLWa5GNFPXvIYuM0HFnCYL7cy9/bR/eon4hGOT8HyxPcRCNifE4FDd23+YZ7imazsY8pKotvumsZH7kMGOD8v5AXYm6MFhX71DqxpcRc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org; spf=none smtp.mailfrom=infradead.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b=0nzOxxML; arc=none smtp.client-ip=198.137.202.133 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=kernel.org
Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=infradead.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="0nzOxxML" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20210309; h=Sender:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description; bh=vsI3NrYDCkaTTiln7YWklNd5/+H3Tzw74X2k8TbvG94=; b=0nzOxxMLwU+TflgBQMEtdhAsm8 +xmsBu/qY0JwCI1dXIoV5MXYXGahDwQF1Ibbnvkoarthv7xByRdwPYPpQ/searzACLLZUcBGbwb/f hsWuhQ1VpcW+/la2tFAWJkaS4gmr4MAUuS+SdmOEHQkdAErpyyqnSP6+j1uLcQhyu+w7AJGOwPZH/ 5zwlFRBnUrjElWvfumBsjkqRVi3pBEMglKpJgQcR6fRgwwIzyz+3bubKGK8cPLv01+aHwHmYpzTSK yr+uAl+OC6m/bYnoo6CNNI0vb3mMHP96V+hxotakRfuquvEe2QcVD+tT4Xnj8F+kvdZGCm75aRJOF YAgjVBWw==; Received: from mcgrof by bombadil.infradead.org with local (Exim 4.98 #2 (Red Hat Linux)) id 1tfS5T-00000001nha-261m; Tue, 04 Feb 2025 23:12:11 +0000 From: Luis Chamberlain To: hare@suse.de, willy@infradead.org, dave@stgolabs.net, david@fromorbit.com, djwong@kernel.org, kbusch@kernel.org Cc: john.g.garry@oracle.com, hch@lst.de, ritesh.list@gmail.com, linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org, gost.dev@samsung.com, p.raghav@samsung.com, da.gomez@samsung.com, kernel@pankajraghav.com, mcgrof@kernel.org Subject: [PATCH v2 8/8] bdev: use bdev_io_min() for statx block size Date: Tue, 4 Feb 2025 15:12:09 -0800 Message-ID: <20250204231209.429356-9-mcgrof@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250204231209.429356-1-mcgrof@kernel.org> References: <20250204231209.429356-1-mcgrof@kernel.org> Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Sender: Luis Chamberlain

You can use lsblk to query a block device's block size:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
  4096

The min-io is the minimum IO the block device prefers for optimal
performance. In turn we map this to the block device block size.
Currently the exposed block size is 4k even for block devices with an
LBA format of 16k. Likewise, devices which support a 4k LBA format but
have a larger Indirection Unit of 16k have an exposed block size of 4k.

This incurs read-modify-writes on direct IO against devices with a
min-io larger than the page size. To fix this, use the block device
min io, which is the minimal optimal IO the device prefers.

With this we now get:

lsblk -o MIN-IO /dev/nvme0n1
MIN-IO
 16384

And so userspace gets the appropriate information it needs for optimal
performance. This is verified by running blkalgn during mkfs against a
device with an LBA format of 4k but an NPWG of 16k (min io size):

mkfs.xfs -f -b size=16k /dev/nvme3n1
blkalgn -d nvme3n1 --ops Write

     Block size          : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Block size: 14 - 66
Block size: 17 - 2

     Algn size           : count    distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 66       |****************************************|
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 2        |*                                       |
Algn size: 14 - 66
Algn size: 17 - 2

Signed-off-by: Luis Chamberlain
---
 block/bdev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 22806ce11e1d..3bd948e6438d 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -1276,9 +1276,6 @@ void bdev_statx(struct path *path, struct kstat *stat,
 	struct inode *backing_inode;
 	struct block_device *bdev;

-	if (!(request_mask & (STATX_DIOALIGN | STATX_WRITE_ATOMIC)))
-		return;
-
 	backing_inode = d_backing_inode(path->dentry);

 	/*
@@ -1305,6 +1302,8 @@ void bdev_statx(struct path *path, struct kstat *stat,
 			queue_atomic_write_unit_max_bytes(bd_queue));
 	}

+	stat->blksize = bdev_io_min(bdev);
+
 	blkdev_put_no_open(bdev);
 }
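
For readers who want to see the block size as userspace receives it, the following is a minimal sketch that reads `st_blksize` through stat(2). The helper name `sketch_preferred_blksize()` is hypothetical. Whether a given block device node reports its min-io here depends on the running kernel carrying this change and on the device itself, so treat any particular value as an assumption to verify on your own system.

```c
#include <sys/stat.h>

/*
 * Return the preferred I/O block size that stat(2) reports for path,
 * or -1 if the path cannot be stat'ed.
 */
static long sketch_preferred_blksize(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;

	return (long)st.st_blksize;
}
```

Calling `sketch_preferred_blksize("/dev/nvme0n1")` on a kernel with this patch would be expected to return the device's min-io rather than a fixed 4k, but that has not been verified here.
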