From patchwork Sat Jul 1 07:34:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13298983
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong", Matthew Wilcox,
 Christoph Hellwig, Brian Foster, Andreas Gruenbacher,
 "Ritesh Harjani (IBM)", Christoph Hellwig
Subject: [PATCHv11 1/8] iomap: Rename iomap_page to iomap_folio_state and others
Date: Sat, 1 Jul 2023 13:04:34 +0530
X-Mailer: git-send-email 2.40.1
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

struct iomap_page actually tracks the per-block state of a folio, so it
makes sense to rename the related functions and data structures, e.g.:

1. struct iomap_page (iop) -> struct iomap_folio_state (ifs)
2. iomap_page_create() -> ifs_alloc()
3. iomap_page_release() -> ifs_free()
4. iomap_iop_set_range_uptodate() -> ifs_set_range_uptodate()
5. to_iomap_page() -> folio->private

Later patches will also add per-block dirty state tracking to
iomap_folio_state, so this patch also renames the "uptodate" and
"uptodate_lock" members of iomap_folio_state to "state" and "state_lock".

We don't really need the to_iomap_page() function; open code it directly
as folio->private instead.

Reviewed-by: Christoph Hellwig
Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 151 ++++++++++++++++++++---------------------
 1 file changed, 72 insertions(+), 79 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 063133ec77f4..2675a3e0ac1d 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -24,64 +24,57 @@
 #define IOEND_BATCH_SIZE	4096
 
 /*
- * Structure allocated for each folio when block size < folio size
- * to track sub-folio uptodate status and I/O completions.
+ * Structure allocated for each folio to track per-block uptodate state
+ * and I/O completions.
  */
-struct iomap_page {
+struct iomap_folio_state {
 	atomic_t		read_bytes_pending;
 	atomic_t		write_bytes_pending;
-	spinlock_t		uptodate_lock;
-	unsigned long		uptodate[];
+	spinlock_t		state_lock;
+	unsigned long		state[];
 };
 
-static inline struct iomap_page *to_iomap_page(struct folio *folio)
-{
-	if (folio_test_private(folio))
-		return folio_get_private(folio);
-	return NULL;
-}
-
 static struct bio_set iomap_ioend_bioset;
 
-static struct iomap_page *
-iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
+static struct iomap_folio_state *ifs_alloc(struct inode *inode,
+		struct folio *folio, unsigned int flags)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 	gfp_t gfp;
 
-	if (iop || nr_blocks <= 1)
-		return iop;
+	if (ifs || nr_blocks <= 1)
+		return ifs;
 
 	if (flags & IOMAP_NOWAIT)
 		gfp = GFP_NOWAIT;
 	else
 		gfp = GFP_NOFS | __GFP_NOFAIL;
 
-	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
+	ifs = kzalloc(struct_size(ifs, state, BITS_TO_LONGS(nr_blocks)),
 		      gfp);
-	if (iop) {
-		spin_lock_init(&iop->uptodate_lock);
+	if (ifs) {
+		spin_lock_init(&ifs->state_lock);
 		if (folio_test_uptodate(folio))
-			bitmap_fill(iop->uptodate, nr_blocks);
-		folio_attach_private(folio, iop);
+			bitmap_fill(ifs->state, nr_blocks);
+		folio_attach_private(folio, ifs);
 	}
-	return iop;
+	return ifs;
 }
 
-static void iomap_page_release(struct folio *folio)
+static void ifs_free(struct folio *folio)
 {
-	struct iomap_page *iop = folio_detach_private(folio);
+	struct iomap_folio_state *ifs = folio_detach_private(folio);
 	struct inode *inode = folio->mapping->host;
 	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
-	if (!iop)
+	if (!ifs)
 		return;
-	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
-	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
-	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
+	WARN_ON_ONCE(atomic_read(&ifs->read_bytes_pending));
+	WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending));
+	WARN_ON_ONCE(bitmap_full(ifs->state, nr_blocks) !=
 			folio_test_uptodate(folio));
-	kfree(iop);
+	kfree(ifs);
 }
 
 /*
@@ -90,7 +83,7 @@ static void iomap_page_release(struct folio *folio)
 static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 		loff_t *pos, loff_t length, size_t *offp, size_t *lenp)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 	loff_t orig_pos = *pos;
 	loff_t isize = i_size_read(inode);
 	unsigned block_bits = inode->i_blkbits;
@@ -105,12 +98,12 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	 * per-block uptodate status and adjust the offset and length if needed
 	 * to avoid reading in already uptodate ranges.
 	 */
-	if (iop) {
+	if (ifs) {
 		unsigned int i;
 
 		/* move forward for each leading block marked uptodate */
 		for (i = first; i <= last; i++) {
-			if (!test_bit(i, iop->uptodate))
+			if (!test_bit(i, ifs->state))
 				break;
 			*pos += block_size;
 			poff += block_size;
@@ -120,7 +113,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 
 		/* truncate len if we find any trailing uptodate block(s) */
 		for ( ; i <= last; i++) {
-			if (test_bit(i, iop->uptodate)) {
+			if (test_bit(i, ifs->state)) {
 				plen -= (last - i + 1) * block_size;
 				last = i - 1;
 				break;
@@ -144,26 +137,26 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	*lenp = plen;
 }
 
-static void iomap_iop_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
+static void ifs_set_range_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
 	struct inode *inode = folio->mapping->host;
 	unsigned first = off >> inode->i_blkbits;
 	unsigned last = (off + len - 1) >> inode->i_blkbits;
 	unsigned long flags;
 
-	spin_lock_irqsave(&iop->uptodate_lock, flags);
-	bitmap_set(iop->uptodate, first, last - first + 1);
-	if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_set(ifs->state, first, last - first + 1);
+	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
 		folio_mark_uptodate(folio);
-	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
 }
 
 static void iomap_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
+		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
-	if (iop)
-		iomap_iop_set_range_uptodate(folio, iop, off, len);
+	if (ifs)
+		ifs_set_range_uptodate(folio, ifs, off, len);
 	else
 		folio_mark_uptodate(folio);
 }
@@ -171,16 +164,16 @@ static void iomap_set_range_uptodate(struct folio *folio,
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		size_t len, int error)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 
 	if (unlikely(error)) {
 		folio_clear_uptodate(folio);
 		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(folio, iop, offset, len);
+		iomap_set_range_uptodate(folio, ifs, offset, len);
 	}
 
-	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
+	if (!ifs || atomic_sub_and_test(len, &ifs->read_bytes_pending))
 		folio_unlock(folio);
 }
 
@@ -213,7 +206,7 @@ struct iomap_readpage_ctx {
 static int iomap_read_inline_data(const struct iomap_iter *iter,
 		struct folio *folio)
 {
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
 	size_t poff = offset_in_page(iomap->offset);
@@ -231,15 +224,15 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (offset > 0)
-		iop = iomap_page_create(iter->inode, folio, iter->flags);
+		ifs = ifs_alloc(iter->inode, folio, iter->flags);
 	else
-		iop = to_iomap_page(folio);
+		ifs = folio->private;
 
 	addr = kmap_local_folio(folio, offset);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
-	iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(folio, ifs, offset, PAGE_SIZE - poff);
 	return 0;
 }
 
@@ -260,7 +253,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 	loff_t pos = iter->pos + offset;
 	loff_t length = iomap_length(iter) - offset;
 	struct folio *folio = ctx->cur_folio;
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
 	loff_t orig_pos = pos;
 	size_t poff, plen;
 	sector_t sector;
@@ -269,20 +262,20 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		return iomap_read_inline_data(iter, folio);
 
 	/* zero post-eof blocks as the page may be mapped */
-	iop = iomap_page_create(iter->inode, folio, iter->flags);
+	ifs = ifs_alloc(iter->inode, folio, iter->flags);
 	iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
 		folio_zero_range(folio, poff, plen);
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iomap_set_range_uptodate(folio, ifs, poff, plen);
 		goto done;
 	}
 
 	ctx->cur_folio_in_bio = true;
-	if (iop)
-		atomic_add(plen, &iop->read_bytes_pending);
+	if (ifs)
+		atomic_add(plen, &ifs->read_bytes_pending);
 
 	sector = iomap_sector(iomap, pos);
 	if (!ctx->bio ||
@@ -436,11 +429,11 @@ EXPORT_SYMBOL_GPL(iomap_readahead);
  */
 bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 	struct inode *inode = folio->mapping->host;
 	unsigned first, last, i;
 
-	if (!iop)
+	if (!ifs)
 		return false;
 
 	/* Caller's range may extend past the end of this folio */
@@ -451,7 +444,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 	last = (from + count - 1) >> inode->i_blkbits;
 
 	for (i = first; i <= last; i++)
-		if (!test_bit(i, iop->uptodate))
+		if (!test_bit(i, ifs->state))
 			return false;
 	return true;
 }
@@ -490,7 +483,7 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
 	 */
 	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return false;
-	iomap_page_release(folio);
+	ifs_free(folio);
 	return true;
 }
 EXPORT_SYMBOL_GPL(iomap_release_folio);
@@ -507,12 +500,12 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 	if (offset == 0 && len == folio_size(folio)) {
 		WARN_ON_ONCE(folio_test_writeback(folio));
 		folio_cancel_dirty(folio);
-		iomap_page_release(folio);
+		ifs_free(folio);
 	} else if (folio_test_large(folio)) {
-		/* Must release the iop so the page can be split */
+		/* Must release the ifs so the page can be split */
 		WARN_ON_ONCE(!folio_test_uptodate(folio) &&
 			     folio_test_dirty(folio));
-		iomap_page_release(folio);
+		ifs_free(folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
@@ -547,7 +540,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		size_t len, struct folio *folio)
 {
 	const struct iomap *srcmap = iomap_iter_srcmap(iter);
-	struct iomap_page *iop;
+	struct iomap_folio_state *ifs;
 	loff_t block_size = i_blocksize(iter->inode);
 	loff_t block_start = round_down(pos, block_size);
 	loff_t block_end = round_up(pos + len, block_size);
@@ -559,8 +552,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		return 0;
 	folio_clear_error(folio);
 
-	iop = iomap_page_create(iter->inode, folio, iter->flags);
-	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
+	ifs = ifs_alloc(iter->inode, folio, iter->flags);
+	if ((iter->flags & IOMAP_NOWAIT) && !ifs && nr_blocks > 1)
 		return -EAGAIN;
 
 	do {
@@ -589,7 +582,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iomap_set_range_uptodate(folio, ifs, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -696,7 +689,7 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 	flush_dcache_folio(folio);
 
 	/*
@@ -712,7 +705,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
-	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
+	iomap_set_range_uptodate(folio, ifs, offset_in_folio(folio, pos), len);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
@@ -1290,17 +1283,17 @@ EXPORT_SYMBOL_GPL(iomap_page_mkwrite);
 static void iomap_finish_folio_write(struct inode *inode, struct folio *folio,
 		size_t len, int error)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
+	struct iomap_folio_state *ifs = folio->private;
 
 	if (error) {
 		folio_set_error(folio);
 		mapping_set_error(inode->i_mapping, error);
 	}
 
-	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !iop);
-	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) <= 0);
+	WARN_ON_ONCE(i_blocks_per_folio(inode, folio) > 1 && !ifs);
+	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) <= 0);
 
-	if (!iop || atomic_sub_and_test(len, &iop->write_bytes_pending))
+	if (!ifs || atomic_sub_and_test(len, &ifs->write_bytes_pending))
 		folio_end_writeback(folio);
 }
 
@@ -1567,7 +1560,7 @@ iomap_can_add_to_ioend(struct iomap_writepage_ctx *wpc, loff_t offset,
  */
 static void
 iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
-		struct iomap_page *iop, struct iomap_writepage_ctx *wpc,
+		struct iomap_folio_state *ifs, struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct list_head *iolist)
 {
 	sector_t sector = iomap_sector(&wpc->iomap, pos);
@@ -1585,8 +1578,8 @@ iomap_add_to_ioend(struct inode *inode, loff_t pos, struct folio *folio,
 		bio_add_folio(wpc->ioend->io_bio, folio, len, poff);
 	}
 
-	if (iop)
-		atomic_add(len, &iop->write_bytes_pending);
+	if (ifs)
+		atomic_add(len, &ifs->write_bytes_pending);
 	wpc->ioend->io_size += len;
 	wbc_account_cgroup_owner(wbc, &folio->page, len);
 }
@@ -1612,7 +1605,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct folio *folio, u64 end_pos)
 {
-	struct iomap_page *iop = iomap_page_create(inode, folio, 0);
+	struct iomap_folio_state *ifs = ifs_alloc(inode, folio, 0);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1620,7 +1613,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
-	WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0);
+	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) != 0);
 
 	/*
 	 * Walk through the folio to find areas to write back. If we
@@ -1628,7 +1621,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
 	 */
 	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (iop && !test_bit(i, iop->uptodate))
+		if (ifs && !test_bit(i, ifs->state))
 			continue;
 
 		error = wpc->ops->map_blocks(wpc, inode, pos);
@@ -1639,7 +1632,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 			continue;
 		if (wpc->iomap.type == IOMAP_HOLE)
 			continue;
-		iomap_add_to_ioend(inode, pos, folio, iop, wpc, wbc,
+		iomap_add_to_ioend(inode, pos, folio, ifs, wpc, wbc,
 				 &submit_list);
 		count++;
 	}

From patchwork Sat Jul 1 07:34:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13298984
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong", Matthew Wilcox,
 Christoph Hellwig, Brian Foster, Andreas Gruenbacher,
 "Ritesh Harjani (IBM)", Christoph Hellwig
Subject: [PATCHv11 2/8] iomap: Drop ifs argument from iomap_set_range_uptodate()
Date: Sat, 1 Jul 2023 13:04:35 +0530
X-Mailer: git-send-email 2.40.1
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

iomap_folio_state (ifs) can be derived directly from the folio, making it
unnecessary to pass "ifs" as an argument to iomap_set_range_uptodate().
This patch removes the "ifs" argument from iomap_set_range_uptodate().

Also, the definitions of iomap_set_range_uptodate() and
ifs_set_range_uptodate() are moved above ifs_alloc(). Upcoming patches
introduce additional helper routines for handling dirty state, and the
intention is to consolidate all of the "ifs" state handling routines in
one place.

Reviewed-by: Christoph Hellwig
Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 67 +++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 2675a3e0ac1d..3ff7688b360a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -36,6 +36,33 @@ struct iomap_folio_state {
 
 static struct bio_set iomap_ioend_bioset;
 
+static void ifs_set_range_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs, size_t off, size_t len)
+{
+	struct inode *inode = folio->mapping->host;
+	unsigned int first_blk = off >> inode->i_blkbits;
+	unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
+	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ifs->state_lock, flags);
+	bitmap_set(ifs->state, first_blk, nr_blks);
+	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
+		folio_mark_uptodate(folio);
+	spin_unlock_irqrestore(&ifs->state_lock, flags);
+}
+
+static void iomap_set_range_uptodate(struct folio *folio, size_t off,
+		size_t len)
+{
+	struct iomap_folio_state *ifs = folio->private;
+
+	if (ifs)
+		ifs_set_range_uptodate(folio, ifs, off, len);
+	else
+		folio_mark_uptodate(folio);
+}
+
 static struct iomap_folio_state *ifs_alloc(struct inode *inode,
 		struct folio *folio, unsigned int flags)
 {
@@ -137,30 +164,6 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	*lenp = plen;
 }
 
-static void ifs_set_range_uptodate(struct folio *folio,
-		struct iomap_folio_state *ifs, size_t off, size_t len)
-{
-	struct inode *inode = folio->mapping->host;
-	unsigned first = off >> inode->i_blkbits;
-	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	unsigned long flags;
-
-	spin_lock_irqsave(&ifs->state_lock, flags);
-	bitmap_set(ifs->state, first, last - first + 1);
-	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
-		folio_mark_uptodate(folio);
-	spin_unlock_irqrestore(&ifs->state_lock, flags);
-}
-
-static void iomap_set_range_uptodate(struct folio *folio,
-		struct iomap_folio_state *ifs, size_t off, size_t len)
-{
-	if (ifs)
-		ifs_set_range_uptodate(folio, ifs, off, len);
-	else
-		folio_mark_uptodate(folio);
-}
-
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		size_t len, int error)
 {
@@ -170,7 +173,7 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		folio_clear_uptodate(folio);
 		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(folio, ifs, offset, len);
+		iomap_set_range_uptodate(folio, offset, len);
 	}
 
 	if (!ifs || atomic_sub_and_test(len, &ifs->read_bytes_pending))
@@ -206,7 +209,6 @@ struct iomap_readpage_ctx {
 static int iomap_read_inline_data(const struct iomap_iter *iter,
 		struct folio *folio)
 {
-	struct iomap_folio_state *ifs;
 	const struct iomap *iomap = iomap_iter_srcmap(iter);
 	size_t size = i_size_read(iter->inode) - iomap->offset;
 	size_t poff = offset_in_page(iomap->offset);
@@ -224,15 +226,13 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (offset > 0)
-		ifs = ifs_alloc(iter->inode, folio, iter->flags);
-	else
-		ifs = folio->private;
+		ifs_alloc(iter->inode, folio, iter->flags);
 
 	addr = kmap_local_folio(folio, offset);
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
-	iomap_set_range_uptodate(folio, ifs, offset, PAGE_SIZE - poff);
+	iomap_set_range_uptodate(folio, offset, PAGE_SIZE - poff);
 	return 0;
 }
 
@@ -269,7 +269,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
 		folio_zero_range(folio, poff, plen);
-		iomap_set_range_uptodate(folio, ifs, poff, plen);
+		iomap_set_range_uptodate(folio, poff, plen);
 		goto done;
 	}
 
@@ -582,7 +582,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(folio, ifs, poff, plen);
+		iomap_set_range_uptodate(folio, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -689,7 +689,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
-	struct iomap_folio_state *ifs = folio->private;
 	flush_dcache_folio(folio);
 
 	/*
@@ -705,7 +704,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
-	iomap_set_range_uptodate(folio, ifs, offset_in_folio(folio, pos), len);
+	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }

From patchwork Sat Jul 1 07:34:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13298985
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong", Matthew Wilcox,
 Christoph Hellwig, Brian Foster, Andreas Gruenbacher,
 "Ritesh Harjani (IBM)", Christoph Hellwig
Subject: [PATCHv11 3/8] iomap: Add some uptodate state handling helpers for ifs state bitmap
Date: Sat, 1 Jul 2023 13:04:36 +0530
Message-Id: <04ba7f53e55649a908943b6c7c27ef333d47c71f.1688188958.git.ritesh.list@gmail.com>
X-Mailer: git-send-email 2.40.1
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

This patch adds two helper routines, ifs_is_fully_uptodate() and
ifs_block_is_uptodate(), for managing the uptodate state in the "ifs"
state bitmap.

In later patches the ifs state bitmap array will also track the dirty
state of all blocks of a folio, hence these helpers for handling the
uptodate state of the ifs state bitmap.

Reviewed-by: Christoph Hellwig
Signed-off-by: Ritesh Harjani (IBM)
Reviewed-by: Darrick J. Wong
---
 fs/iomap/buffered-io.c | 28 ++++++++++++++++++++--------
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 3ff7688b360a..e45368e91eca 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -36,6 +36,20 @@ struct iomap_folio_state {
 
 static struct bio_set iomap_ioend_bioset;
 
+static inline bool ifs_is_fully_uptodate(struct folio *folio,
+		struct iomap_folio_state *ifs)
+{
+	struct inode *inode = folio->mapping->host;
+
+	return bitmap_full(ifs->state, i_blocks_per_folio(inode, folio));
+}
+
+static inline bool ifs_block_is_uptodate(struct iomap_folio_state *ifs,
+		unsigned int block)
+{
+	return test_bit(block, ifs->state);
+}
+
 static void ifs_set_range_uptodate(struct folio *folio,
 		struct iomap_folio_state *ifs, size_t off, size_t len)
 {
@@ -47,7 +61,7 @@ static void ifs_set_range_uptodate(struct folio *folio,
 
 	spin_lock_irqsave(&ifs->state_lock, flags);
 	bitmap_set(ifs->state, first_blk, nr_blks);
-	if (bitmap_full(ifs->state, i_blocks_per_folio(inode, folio)))
+	if (ifs_is_fully_uptodate(folio, ifs))
 		folio_mark_uptodate(folio);
 	spin_unlock_irqrestore(&ifs->state_lock, flags);
 }
@@ -92,14 +106,12 @@ static struct iomap_folio_state *ifs_alloc(struct inode *inode,
 static void ifs_free(struct folio *folio)
 {
 	struct iomap_folio_state *ifs = folio_detach_private(folio);
-	struct inode *inode = folio->mapping->host;
-	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
 	if (!ifs)
 		return;
 	WARN_ON_ONCE(atomic_read(&ifs->read_bytes_pending));
 	WARN_ON_ONCE(atomic_read(&ifs->write_bytes_pending));
-	WARN_ON_ONCE(bitmap_full(ifs->state, nr_blocks) !=
+	WARN_ON_ONCE(ifs_is_fully_uptodate(folio, ifs) !=
 			folio_test_uptodate(folio));
 	kfree(ifs);
 }
@@ -130,7 +142,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 
 		/* move forward for each leading block marked uptodate */
 		for (i = first; i <= last; i++) {
-			if (!test_bit(i, ifs->state))
+			if (!ifs_block_is_uptodate(ifs, i))
 				break;
 			*pos += block_size;
 			poff += block_size;
@@ -140,7 +152,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 
 		/* truncate len if we find any trailing uptodate block(s) */
 		for ( ; i <= last; i++) {
-			if (test_bit(i, ifs->state)) {
+			if (ifs_block_is_uptodate(ifs, i)) {
 				plen -= (last - i + 1) * block_size;
 				last = i - 1;
 				break;
@@ -444,7 +456,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 	last = (from + count - 1) >> inode->i_blkbits;
 
 	for (i = first; i <= last; i++)
-		if (!test_bit(i, ifs->state))
+		if (!ifs_block_is_uptodate(ifs, i))
 			return false;
 	return true;
 }
@@ -1620,7 +1632,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
*/ for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) { - if (ifs && !test_bit(i, ifs->state)) + if (ifs && !ifs_block_is_uptodate(ifs, i)) continue; error = wpc->ops->map_blocks(wpc, inode, pos); From patchwork Sat Jul 1 07:34:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13298986 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B9B1EB64DC for ; Sat, 1 Jul 2023 07:36:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229585AbjGAHf7 (ORCPT ); Sat, 1 Jul 2023 03:35:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229688AbjGAHfg (ORCPT ); Sat, 1 Jul 2023 03:35:36 -0400 Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 84175E46; Sat, 1 Jul 2023 00:35:35 -0700 (PDT) Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-666ecf9a081so2063797b3a.2; Sat, 01 Jul 2023 00:35:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688196934; x=1690788934; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=WZ/jQNKid21qXJou8CnoLfrwuQ5vopNmAWZzAmROpvo=; b=g1cnZGQHwHY4j5KqSn3DnVK6viYRmvBT8dDWQEStnmyYo0zrWwBOBgDAKfzjQ3O00e ia0xhnaXagXyWT+ExNhYx8B9R4Yg7/roQkvvse8PFvuPC2k2Zh8HRKQl/X9f035YdvK3 EhrcA61QPKcSDd8l1k2joX528ZdnUFlES/F5+KYNZceB75QKvQhZKax6v56YSxzVwd+X x0/PgC8rIkV0kftn0Uhag7cdxdO6eB/GXcGSWrjQRWBLdd6MCHVXXJILXJOiQKYwJyUL 
8O6yxXwLYIVDHcKDUoab3Q824p7RWhP0kJsWcnJp5V6oYJSJp3mnuPJH5ga0z3xNCbt/ eoBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688196934; x=1690788934; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WZ/jQNKid21qXJou8CnoLfrwuQ5vopNmAWZzAmROpvo=; b=bKOSFAOZ/XzqpRiXo9kBlBEx7Po5UStbCjdPUKy1tv7G7qBx95G8CENF5xl0E1SXvq ABc05lBRaLBm3igCvRwHNy6Q0WASs6Y5zrV6ZJT51tMlT9YpBr081qHJYHGgNmyLMmnm p6sWKcenbiKUIcYVTd/zsbkPeBeKT+WeFc9pcYMtVwj0DCmLlJUhjG+ow2TxtY8vHmD5 fESZfxxkyF0Lt876CZgvxOAhlcuHzkyj1NZIF5kp4UBG414gqysim+bZIutVVha8U7Yq pf/kKo1GSBLgz7fMIN62sHSEkW09Z2L2G7J1s9yO6YMhxzq/YTijRUwqF1ZVmCrAwr0t 6sxw== X-Gm-Message-State: ABy/qLbdXp87bqNiiEdwJaXXzaj1TchhZnUf5o22qhZar8416mXw6yXz EoU+k14CeAi8g2VRr4HC9lc1tt0YNAc= X-Google-Smtp-Source: APBJJlHv0wtr72ZPUzmgnzwvqNQjWmc+9jhDBZ6z+VKpMVHtjQGh+HFxjKQ7X5P9DC+9+dmPKBrBOQ== X-Received: by 2002:a05:6a00:9a8:b0:677:cda3:2222 with SMTP id u40-20020a056a0009a800b00677cda32222mr6554696pfg.14.1688196934433; Sat, 01 Jul 2023 00:35:34 -0700 (PDT) Received: from dw-tp.localdomain ([49.207.232.207]) by smtp.gmail.com with ESMTPSA id h14-20020aa786ce000000b0063aa1763146sm8603414pfo.17.2023.07.01.00.35.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 01 Jul 2023 00:35:33 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, "Darrick J . 
Wong" , Matthew Wilcox , Christoph Hellwig , Brian Foster , Andreas Gruenbacher , "Ritesh Harjani (IBM)" Subject: [PATCHv11 4/8] iomap: Fix possible overflow condition in iomap_write_delalloc_scan Date: Sat, 1 Jul 2023 13:04:37 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org folio_next_index() returns an unsigned long value which, when left-shifted by PAGE_SHIFT, could overflow on a 32-bit system. Instead use folio_pos(folio) + folio_size(folio), which computes the end offset as a loff_t and so does this correctly. Suggested-by: Matthew Wilcox Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.40.1 diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index e45368e91eca..cddf01b96d8a 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -933,7 +933,7 @@ static int iomap_write_delalloc_scan(struct inode *inode, * the end of this data range, not the end of the folio.
*/ *punch_start_byte = min_t(loff_t, end_byte, - folio_next_index(folio) << PAGE_SHIFT); + folio_pos(folio) + folio_size(folio)); } /* move offset to start of next folio in range */ From patchwork Sat Jul 1 07:34:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13298993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 673D3C0015E for ; Sat, 1 Jul 2023 07:36:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229847AbjGAHgO (ORCPT ); Sat, 1 Jul 2023 03:36:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229762AbjGAHfs (ORCPT ); Sat, 1 Jul 2023 03:35:48 -0400 Received: from mail-oi1-x229.google.com (mail-oi1-x229.google.com [IPv6:2607:f8b0:4864:20::229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D13E8E50; Sat, 1 Jul 2023 00:35:38 -0700 (PDT) Received: by mail-oi1-x229.google.com with SMTP id 5614622812f47-3a1ebb79579so1965836b6e.3; Sat, 01 Jul 2023 00:35:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688196937; x=1690788937; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=D04xZI5Sg1R/L1W0hLA/ZrQjdqsnxoz6I76PCXx+fj8=; b=eAGcNFGRySshnEzsaFrsJ9i7dY4MXF/kQcwEBGiOfjQN22gFj6udkXtxrk/qjHEwfJ tZRvZeEBk1ntC4NLg9o165Ff8c2qHeleaHItIsvmE1h2qLrsA47amsFqzZbAQfmStHkW D6wTys2uJVNES5t+4beTV2NpxTc9wnOyxSQuZwJ0PA7aqONaqdWJHlhlaX9TioWDvIPt vELhEl5FSeHoyyFZkXhBeR1E6Z1aytQTAVs8ufM0+F55UxVPal6274ejL2JJkocjYTNO nfq/OTZ13tonSDrnTBrv7rbEXQPmrNu3L0GEOjmMepyduHCNvF/RD+6d3LnbKVUm4JVQ vCRg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688196937; x=1690788937; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=D04xZI5Sg1R/L1W0hLA/ZrQjdqsnxoz6I76PCXx+fj8=; b=ggvmxeAIHQi/w/b/EerTtU2gTHZqKb0Fl+dAXsWKpu9WZAzHkj20+od0BlhfAie7R3 8tt3VSCiVHNZYAOIYW2a+o34vbipWE572ZZ5atUw/ug/eG7q/A5sQF23pcnusbgGBx5M fnOXEmlATvwuenG0GOPMaauv/T7hv9+ts+TkCU4EREdpFNfH7mzxKwRebwvQ/ACQ/mkA f89AbSM+rV60J1yMmkRbUGujTCcMxUmZEpgN2R8gI+GpKzN/Xr2+hLPZ7u6JHrStDL26 V4+XDNwCtUuA59Vzn73n18r/ZDe8tEV5SCWYmAsF7yJ1Y+kiLH6fK/8p/mxSnFLaVrZD cSZA== X-Gm-Message-State: AC+VfDwPNS08G2IcboUvqfTbG3YOJ2w4FYriTV0I2nGoRk+WshR9CDop PwQKiOQL4qc6mF0po3wp0rsKCWlg1a0= X-Google-Smtp-Source: ACHHUZ6nwBScw18iAZO6nvvtXop/fFa891zJjctsilB495dnmmca0IgUPLv11nAA8UPiKcIXr1WEQw== X-Received: by 2002:a05:6808:191c:b0:3a3:6e43:e681 with SMTP id bf28-20020a056808191c00b003a36e43e681mr5980844oib.58.1688196937422; Sat, 01 Jul 2023 00:35:37 -0700 (PDT) Received: from dw-tp.localdomain ([49.207.232.207]) by smtp.gmail.com with ESMTPSA id h14-20020aa786ce000000b0063aa1763146sm8603414pfo.17.2023.07.01.00.35.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 01 Jul 2023 00:35:36 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong" , Matthew Wilcox , Christoph Hellwig , Brian Foster , Andreas Gruenbacher , "Ritesh Harjani (IBM)" Subject: [PATCHv11 5/8] iomap: Use iomap_punch_t typedef Date: Sat, 1 Jul 2023 13:04:38 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org It makes it much easier if we have iomap_punch_t typedef for "punch" function pointer in all delalloc related punch, scan and release functions. 
It will be useful in later patches when we will factor out iomap_write_delalloc_punch() function. Suggested-by: Matthew Wilcox Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index cddf01b96d8a..33fc5ed0049f 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -23,6 +23,7 @@ #define IOEND_BATCH_SIZE 4096 +typedef int (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length); /* * Structure allocated for each folio to track per-block uptodate state * and I/O completions. @@ -900,7 +901,7 @@ EXPORT_SYMBOL_GPL(iomap_file_buffered_write); */ static int iomap_write_delalloc_scan(struct inode *inode, loff_t *punch_start_byte, loff_t start_byte, loff_t end_byte, - int (*punch)(struct inode *inode, loff_t offset, loff_t length)) + iomap_punch_t punch) { while (start_byte < end_byte) { struct folio *folio; @@ -978,8 +979,7 @@ static int iomap_write_delalloc_scan(struct inode *inode, * the code to subtle off-by-one bugs.... 
*/ static int iomap_write_delalloc_release(struct inode *inode, - loff_t start_byte, loff_t end_byte, - int (*punch)(struct inode *inode, loff_t pos, loff_t length)) + loff_t start_byte, loff_t end_byte, iomap_punch_t punch) { loff_t punch_start_byte = start_byte; loff_t scan_end_byte = min(i_size_read(inode), end_byte); @@ -1072,8 +1072,7 @@ static int iomap_write_delalloc_release(struct inode *inode, */ int iomap_file_buffered_write_punch_delalloc(struct inode *inode, struct iomap *iomap, loff_t pos, loff_t length, - ssize_t written, - int (*punch)(struct inode *inode, loff_t pos, loff_t length)) + ssize_t written, iomap_punch_t punch) { loff_t start_byte; loff_t end_byte; From patchwork Sat Jul 1 07:34:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13298994 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A001C001DE for ; Sat, 1 Jul 2023 07:36:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229786AbjGAHgQ (ORCPT ); Sat, 1 Jul 2023 03:36:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32812 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229768AbjGAHfs (ORCPT ); Sat, 1 Jul 2023 03:35:48 -0400 Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 237FAE58; Sat, 1 Jul 2023 00:35:42 -0700 (PDT) Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-668704a5b5bso2098845b3a.0; Sat, 01 Jul 2023 00:35:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688196941; x=1690788941; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Qo1Zdp1lRV/O3jeD7knI1qn5KUdKSMVI92S/6Gq9i5k=; b=cSXGBQpyUN9k8hqY+ufSHthOWI9WMrL4KbdPRlV6qHOtcvWbg2IzM5LCj7Bp0/qT3b KN1tXJWsq4KjXSBQPPd3P+okfe56u5ZkIeaXs9Dyb9xk/PLxha6BRzDJX25BA4m7aWbP +ojHX0c6yZl0Ihfr2DJjLoL+D9Yw/OMBE0HCu3TNC66KwPaXeuAgDbraQ/VhufpTK0q7 aw2KyWD3Qwqv4FyuFLNGcnkUr+lL9KIjs06ID5ukC87xXhveeEMlrzZCgkjZCfyqw0zk nCsyhiw3JlHmVuM61RV7A3Nm+IjQufwEYpCXQyz+tHIoMkYl5nSipFWe2mzzyXZHRIxJ nKPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688196941; x=1690788941; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Qo1Zdp1lRV/O3jeD7knI1qn5KUdKSMVI92S/6Gq9i5k=; b=NvMAWS6Yb1TeTQKW+/2zGkua7z/uDxg28Irq+iAInri1erUkWLL5VT9ghh5oEWIVZn sabsNXERsh/0NHr3vzejYK/B/oidskSPnQA16bxCGmkxAQpUUIr261LOajpZcC/PDZkX vRU3l/JTcXMRIiQ8HH8juVtZXHZxIwLam/qjcXfM9khIzRm+ZTefmHCsHtD96ZB6INpk MOF0FtZNj2Wc6uGJOLrPC9IZOsPh4ijrU4xJVzqRwldoIqIVX15Ka1IMk9BdsnPMBpKX ijIe+iTAx1yuU4DP/2kvzstZAQ5yTA365Y+ejol/iZekm7VKtpurpig/9QXJ4HENbgPA dkrA== X-Gm-Message-State: ABy/qLbEkkMr3KSRc0iCuRT/Kx5NlASEdmZY9IOHm5cKAunBmW4roAt+ cQwU101vIGUIAKiK0AkB4RRBPngSuoA= X-Google-Smtp-Source: APBJJlHq7lw8rVw0u2abDpLRtNWQvayECJM44Bxs8G2NHUari9tHAZ4YWbS3bJm0HVgAcMKFzX9wHA== X-Received: by 2002:a05:6a00:1806:b0:66b:6021:10fe with SMTP id y6-20020a056a00180600b0066b602110femr6668982pfa.31.1688196940693; Sat, 01 Jul 2023 00:35:40 -0700 (PDT) Received: from dw-tp.localdomain ([49.207.232.207]) by smtp.gmail.com with ESMTPSA id h14-20020aa786ce000000b0063aa1763146sm8603414pfo.17.2023.07.01.00.35.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 01 Jul 2023 00:35:40 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, "Darrick J . 
Wong" , Matthew Wilcox , Christoph Hellwig , Brian Foster , Andreas Gruenbacher , "Ritesh Harjani (IBM)" , Christoph Hellwig Subject: [PATCHv11 6/8] iomap: Refactor iomap_write_delalloc_punch() function out Date: Sat, 1 Jul 2023 13:04:39 +0530 Message-Id: <6ff2ed87400236d947ff270f591e77b5e16a38b4.1688188958.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This patch factors out the iomap_write_delalloc_punch() function, which is responsible for the actual punch-out operation. This avoids deep indentation when a later patch (which adds per-block dirty status handling to iomap) adds punch-out of individual non-dirty blocks within a dirty folio to avoid leaking delalloc blocks. Reviewed-by: Darrick J. Wong Reviewed-by: Christoph Hellwig Signed-off-by: Ritesh Harjani (IBM) --- fs/iomap/buffered-io.c | 53 ++++++++++++++++++++++++++---------------- 1 file changed, 33 insertions(+), 20 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 33fc5ed0049f..6abe19c41b30 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -882,6 +882,32 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i, } EXPORT_SYMBOL_GPL(iomap_file_buffered_write); +static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio, + loff_t *punch_start_byte, loff_t start_byte, loff_t end_byte, + iomap_punch_t punch) +{ + int ret = 0; + + if (!folio_test_dirty(folio)) + return ret; + + /* if dirty, punch up to offset */ + if (start_byte > *punch_start_byte) { + ret = punch(inode, *punch_start_byte, + start_byte - *punch_start_byte); + if (ret) + return ret; + } + /* + * Make sure the next punch start is correctly bound to + * the end of this data range, not the end of the folio.
+ */ + *punch_start_byte = min_t(loff_t, end_byte, + folio_pos(folio) + folio_size(folio)); + + return ret; +} + /* * Scan the data range passed to us for dirty page cache folios. If we find a * dirty folio, punch out the preceeding range and update the offset from which @@ -905,6 +931,7 @@ static int iomap_write_delalloc_scan(struct inode *inode, { while (start_byte < end_byte) { struct folio *folio; + int ret; /* grab locked page */ folio = filemap_lock_folio(inode->i_mapping, @@ -915,26 +942,12 @@ static int iomap_write_delalloc_scan(struct inode *inode, continue; } - /* if dirty, punch up to offset */ - if (folio_test_dirty(folio)) { - if (start_byte > *punch_start_byte) { - int error; - - error = punch(inode, *punch_start_byte, - start_byte - *punch_start_byte); - if (error) { - folio_unlock(folio); - folio_put(folio); - return error; - } - } - - /* - * Make sure the next punch start is correctly bound to - * the end of this data range, not the end of the folio. - */ - *punch_start_byte = min_t(loff_t, end_byte, - folio_pos(folio) + folio_size(folio)); + ret = iomap_write_delalloc_punch(inode, folio, punch_start_byte, + start_byte, end_byte, punch); + if (ret) { + folio_unlock(folio); + folio_put(folio); + return ret; } /* move offset to start of next folio in range */ From patchwork Sat Jul 1 07:34:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13298995 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BC29C0015E for ; Sat, 1 Jul 2023 07:36:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229514AbjGAHgR (ORCPT ); Sat, 1 Jul 2023 03:36:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32826 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229774AbjGAHfz (ORCPT ); Sat, 1 Jul 2023 03:35:55 -0400 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54BA5E5C; Sat, 1 Jul 2023 00:35:45 -0700 (PDT) Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-76731802203so235082185a.3; Sat, 01 Jul 2023 00:35:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688196944; x=1690788944; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qgq6MS9CqB7Jw3ef4fl/DnYThdTWoNrF7yr1DYSnI3I=; b=HQLJx/AvTAaY6LRfSwGVCdK3+cawzGhTkx9wnFTwdSvq2908mu8BKamD0vMQ1v4x4/ fvamADPfdtsWI9W/Wmpqxa8ZJzhYG/6tDWPAixXEXGCei1irJBPEJ8tXpwiHCCQKLoba MY7ZbaNu9Urv2rSmenU0m39ZAaAgQwnaLEvBIjXz5AhSlTpmc/kcqPLy+Cao+EDYhWht aJvPnCezuXw5+mGL4fpJ038x2DGfujPqvLq42sp5tDy7HlXSjgiItQ16ZC6uabiWrf5p l4di8RraIpoeDf+hFVZXsqIr3743sz6V4rKsJLNigDg9lWAV/uXY4Eu6BrsjilX+wpLq YdUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688196944; x=1690788944; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qgq6MS9CqB7Jw3ef4fl/DnYThdTWoNrF7yr1DYSnI3I=; b=kORSTGzXr+vlJkym4duq1t7a+qYyz5pZM7a6f2qal9MlhT/DRtEJP3C9glqGgs8QQn n3CGddtBOXEyOX3FfLDxQ+BQjDitTXLYlMmFcKAdMsJev4vSwbF8pRXANO2/aDxBtDqI 3HT1yzTiTp0/1PQowQsTqQyCWTzeyEw+rxkIXlmXHEIq3Gtvrke/Jtfn/WAlgxXJuweI RytXDiqpGoAGolcDN7fNw99wOUtqn3cWEFsYBkuHS2BQL0/8jrw5ijZuKSz7zsCWgS/L OHFA7R9Bmo3Sz4uILYmJB8LaiESOb/Wu2LpyJQ2afWlSRGCj1nBIghpKK06WpUnkQcym Klvg== X-Gm-Message-State: AC+VfDzTOeinfvmf7N2Q3bAYy0SY7eDLwG209mC1Sy1wtWPob+wSN/MM ueEmRYR0XfCQhPtgKF4njON7f9vcswY= X-Google-Smtp-Source: 
ACHHUZ4cwF/WU6hg1VJFXy+pZqM3Jx0Em66Mmig2DNTqBGw2xmsb3KvUeHLiUwYZJuHDwZhfmTQG1Q== X-Received: by 2002:a05:620a:8d8c:b0:767:1c73:69fc with SMTP id rc12-20020a05620a8d8c00b007671c7369fcmr3925950qkn.27.1688196943889; Sat, 01 Jul 2023 00:35:43 -0700 (PDT) Received: from dw-tp.localdomain ([49.207.232.207]) by smtp.gmail.com with ESMTPSA id h14-20020aa786ce000000b0063aa1763146sm8603414pfo.17.2023.07.01.00.35.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 01 Jul 2023 00:35:43 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong" , Matthew Wilcox , Christoph Hellwig , Brian Foster , Andreas Gruenbacher , "Ritesh Harjani (IBM)" , Christoph Hellwig Subject: [PATCHv11 7/8] iomap: Allocate ifs in ->write_begin() early Date: Sat, 1 Jul 2023 13:04:40 +0530 Message-Id: <62b33ebf74e876a0430e32940a7bc0f4868a5e5e.1688188958.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org We don't need to allocate an ifs in ->write_begin() for writes where the position and length completely overlap with the given folio, so such cases are skipped. Currently, when the folio is uptodate, we only allocate the ifs at writeback time (in iomap_writepage_map()). This has been fine so far, but once we add support for a per-block dirty state bitmap in the ifs, it could cause some performance degradation. The reason is that if we don't allocate the ifs during ->write_begin(), then we will never mark the necessary dirty bits in the ->write_end() call, and we will have to mark all the bits as dirty at writeback time. That could cause the same write amplification and performance problems as we have now. Reviewed-by: Darrick J. 
Wong Reviewed-by: Christoph Hellwig Signed-off-by: Ritesh Harjani (IBM) --- fs/iomap/buffered-io.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 6abe19c41b30..fb6c2b6a4358 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -561,14 +561,23 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos, size_t from = offset_in_folio(folio, pos), to = from + len; size_t poff, plen; - if (folio_test_uptodate(folio)) + /* + * If the write completely overlaps the current folio, then + * entire folio will be dirtied so there is no need for + * per-block state tracking structures to be attached to this folio. + */ + if (pos <= folio_pos(folio) && + pos + len >= folio_pos(folio) + folio_size(folio)) return 0; - folio_clear_error(folio); ifs = ifs_alloc(iter->inode, folio, iter->flags); if ((iter->flags & IOMAP_NOWAIT) && !ifs && nr_blocks > 1) return -EAGAIN; + if (folio_test_uptodate(folio)) + return 0; + folio_clear_error(folio); + do { iomap_adjust_read_range(iter->inode, folio, &block_start, block_end - block_start, &poff, &plen); From patchwork Sat Jul 1 07:34:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13298996 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8063FEB64DC for ; Sat, 1 Jul 2023 07:36:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229490AbjGAHgc (ORCPT ); Sat, 1 Jul 2023 03:36:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32832 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229787AbjGAHf4 (ORCPT ); Sat, 1 Jul 2023 03:35:56 -0400 Received: 
from mail-oi1-x229.google.com (mail-oi1-x229.google.com [IPv6:2607:f8b0:4864:20::229]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77727199; Sat, 1 Jul 2023 00:35:48 -0700 (PDT) Received: by mail-oi1-x229.google.com with SMTP id 5614622812f47-39ed11d6a50so1951530b6e.2; Sat, 01 Jul 2023 00:35:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1688196947; x=1690788947; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pBs/SVDx8HJAgGEhcKk2jHgmyHp/mIRnKT9oVm91jRs=; b=D/om7IVr3yJV4PwLn1yYdF4ZQVhS9k3TK05J5IfrXMIHvlA5el0S7cySegnpyfknyK MPMN83c93FIaHwmLyDtxUn+MHPhRPFQ41H8LLdjk7Gj2M+ht7s4xkJuWyKh2r7sXKqsH 5RQhkffYqyTvAnMAJXtFAOnDURIkLCgohB8d+VvKs4ZpE+nekAYwmAh2fg5MQLJd0T3U Tu6d0mEDrlGQH5GweXam10p3TItLg+MXPA8kSBDOzsCiHjZCGGbzWYzNUmW+9BbiE01K EGaquu5mG1+x1+6luvxywlE1o0e2ieJeXjym7QtILKKTYKEeJO+iQMeCx0tVj9zQAliT /dyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688196947; x=1690788947; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pBs/SVDx8HJAgGEhcKk2jHgmyHp/mIRnKT9oVm91jRs=; b=BcMQWptDMnFX28udX3dso6c5CCLe+QCZOKFNyQzLooNjkW3S7W3MPmZih8kgG80+zt Z8letoxKz90VaSkp2M5//YCP3pXbLyQ8uv41Eacztyl8e+yh18HIJXearKwxbRU6Tdku rESemGDgTb86cPdyoh/IYY5ylsiskYOQbQ1QtUE+z/u4geGbDooQ2sHARBWORlUhHgDV v0FiItTt+PUewThouVrzC4IVuDz7wpt9MydA3uUSPwqsIz90QiP/TwmWlK35Af8sQnnl UH73gKnYXwYhgIlggkctUAhM0fk86YBvxEddODsoTfRooUe2MH6VlV8sY34ehF69QVwH MpoQ== X-Gm-Message-State: AC+VfDw+i+AjyGrjArGvdXgQnjHYLPRcOrdDSYOQ3HPIumVpFzLK6LPi bomfPbb3Yqn70KZYPX3euMu9G9MDeSw= X-Google-Smtp-Source: ACHHUZ6MB6lqwzw4n4BIJrCaEL0BnmigltL63MRRK8IySNXQRBEC3BpQfVLqEUROX6sYTCcbybTHQQ== X-Received: by 2002:a05:6808:d4e:b0:399:8529:6726 with SMTP id w14-20020a0568080d4e00b0039985296726mr5961508oik.51.1688196947083; 
Sat, 01 Jul 2023 00:35:47 -0700 (PDT) Received: from dw-tp.localdomain ([49.207.232.207]) by smtp.gmail.com with ESMTPSA id h14-20020aa786ce000000b0063aa1763146sm8603414pfo.17.2023.07.01.00.35.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 01 Jul 2023 00:35:46 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, "Darrick J . Wong" , Matthew Wilcox , Christoph Hellwig , Brian Foster , Andreas Gruenbacher , "Ritesh Harjani (IBM)" , Aravinda Herle Subject: [PATCHv11 8/8] iomap: Add per-block dirty state tracking to improve performance Date: Sat, 1 Jul 2023 13:04:41 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When the filesystem blocksize is less than the folio size (either with mapping_large_folio_support() or with blocksize < pagesize) and the folio is uptodate in the pagecache, even a single-byte write can cause the entire folio to be written to disk during writeback. This happens because we currently don't have a mechanism to track per-block dirty state within struct iomap_folio_state; we only track uptodate state. This patch implements support for tracking per-block dirty state in the iomap_folio_state->state bitmap. This should help improve filesystem write performance and reduce write amplification. Performance testing of the fio workload below shows a ~16x performance improvement using nvme with XFS (4k blocksize) on Power (64K pagesize). FIO-reported write bandwidth improved from ~28 MBps to ~452 MBps. 1. [global] ioengine=psync rw=randwrite overwrite=1 pre_read=1 direct=0 bs=4k size=1G dir=./ numjobs=8 fdatasync=1 runtime=60 iodepth=64 group_reporting=1 [fio-run] 2. 
Also, our internal performance team reported that this patch improves their database workload performance by ~83% (with XFS on Power). Reported-by: Aravinda Herle Reported-by: Brian Foster Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/gfs2/aops.c | 2 +- fs/iomap/buffered-io.c | 149 ++++++++++++++++++++++++++++++++++++++--- fs/xfs/xfs_aops.c | 2 +- fs/zonefs/file.c | 2 +- include/linux/iomap.h | 1 + 5 files changed, 142 insertions(+), 14 deletions(-) diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index a5f4be6b9213..75efec3c3b71 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -746,7 +746,7 @@ static const struct address_space_operations gfs2_aops = { .writepages = gfs2_writepages, .read_folio = gfs2_read_folio, .readahead = gfs2_readahead, - .dirty_folio = filemap_dirty_folio, + .dirty_folio = iomap_dirty_folio, .release_folio = iomap_release_folio, .invalidate_folio = iomap_invalidate_folio, .bmap = gfs2_bmap, diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index fb6c2b6a4358..2fd9413838de 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -25,7 +25,7 @@ typedef int (*iomap_punch_t)(struct inode *inode, loff_t offset, loff_t length); /* - * Structure allocated for each folio to track per-block uptodate state + * Structure allocated for each folio to track per-block uptodate, dirty state * and I/O completions.
*/ struct iomap_folio_state { @@ -78,6 +78,61 @@ static void iomap_set_range_uptodate(struct folio *folio, size_t off, folio_mark_uptodate(folio); } +static inline bool ifs_block_is_dirty(struct folio *folio, + struct iomap_folio_state *ifs, int block) +{ + struct inode *inode = folio->mapping->host; + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + + return test_bit(block + blks_per_folio, ifs->state); +} + +static void ifs_clear_range_dirty(struct folio *folio, + struct iomap_folio_state *ifs, size_t off, size_t len) +{ + struct inode *inode = folio->mapping->host; + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + unsigned int first_blk = (off >> inode->i_blkbits); + unsigned int last_blk = (off + len - 1) >> inode->i_blkbits; + unsigned int nr_blks = last_blk - first_blk + 1; + unsigned long flags; + + spin_lock_irqsave(&ifs->state_lock, flags); + bitmap_clear(ifs->state, first_blk + blks_per_folio, nr_blks); + spin_unlock_irqrestore(&ifs->state_lock, flags); +} + +static void iomap_clear_range_dirty(struct folio *folio, size_t off, size_t len) +{ + struct iomap_folio_state *ifs = folio->private; + + if (ifs) + ifs_clear_range_dirty(folio, ifs, off, len); +} + +static void ifs_set_range_dirty(struct folio *folio, + struct iomap_folio_state *ifs, size_t off, size_t len) +{ + struct inode *inode = folio->mapping->host; + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + unsigned int first_blk = (off >> inode->i_blkbits); + unsigned int last_blk = (off + len - 1) >> inode->i_blkbits; + unsigned int nr_blks = last_blk - first_blk + 1; + unsigned long flags; + + spin_lock_irqsave(&ifs->state_lock, flags); + bitmap_set(ifs->state, first_blk + blks_per_folio, nr_blks); + spin_unlock_irqrestore(&ifs->state_lock, flags); +} + +static void iomap_set_range_dirty(struct folio *folio, size_t off, size_t len) +{ + struct iomap_folio_state *ifs = folio->private; + + if (ifs) + ifs_set_range_dirty(folio, ifs, off, len); +} + 
 static struct iomap_folio_state *ifs_alloc(struct inode *inode,
 		struct folio *folio, unsigned int flags)
 {
@@ -93,14 +148,24 @@ static struct iomap_folio_state *ifs_alloc(struct inode *inode,
 	else
 		gfp = GFP_NOFS | __GFP_NOFAIL;
 
-	ifs = kzalloc(struct_size(ifs, state, BITS_TO_LONGS(nr_blocks)),
-		      gfp);
-	if (ifs) {
-		spin_lock_init(&ifs->state_lock);
-		if (folio_test_uptodate(folio))
-			bitmap_fill(ifs->state, nr_blocks);
-		folio_attach_private(folio, ifs);
-	}
+	/*
+	 * ifs->state tracks two sets of state flags when the
+	 * filesystem block size is smaller than the folio size.
+	 * The first state tracks per-block uptodate and the
+	 * second tracks per-block dirty state.
+	 */
+	ifs = kzalloc(struct_size(ifs, state,
+		      BITS_TO_LONGS(2 * nr_blocks)), gfp);
+	if (!ifs)
+		return ifs;
+
+	spin_lock_init(&ifs->state_lock);
+	if (folio_test_uptodate(folio))
+		bitmap_set(ifs->state, 0, nr_blocks);
+	if (folio_test_dirty(folio))
+		bitmap_set(ifs->state, nr_blocks, nr_blocks);
+	folio_attach_private(folio, ifs);
+
 	return ifs;
 }
 
@@ -523,6 +588,17 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
 
+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio)
+{
+	struct inode *inode = mapping->host;
+	size_t len = folio_size(folio);
+
+	ifs_alloc(inode, folio, 0);
+	iomap_set_range_dirty(folio, 0, len);
+	return filemap_dirty_folio(mapping, folio);
+}
+EXPORT_SYMBOL_GPL(iomap_dirty_folio);
+
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
@@ -727,6 +803,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
 	iomap_set_range_uptodate(folio, offset_in_folio(folio, pos), len);
+	iomap_set_range_dirty(folio, offset_in_folio(folio, pos), copied);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
@@ -891,6 +968,43 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i,
 }
 EXPORT_SYMBOL_GPL(iomap_file_buffered_write);
 
+static int iomap_write_delalloc_ifs_punch(struct inode *inode,
+		struct folio *folio, loff_t start_byte, loff_t end_byte,
+		iomap_punch_t punch)
+{
+	unsigned int first_blk, last_blk, i;
+	loff_t last_byte;
+	u8 blkbits = inode->i_blkbits;
+	struct iomap_folio_state *ifs;
+	int ret = 0;
+
+	/*
+	 * When we have per-block dirty tracking, there can be
+	 * blocks within a folio which are marked uptodate
+	 * but not dirty. In that case it is necessary to punch
+	 * out such blocks to avoid leaking any delalloc blocks.
+	 */
+	ifs = folio->private;
+	if (!ifs)
+		return ret;
+
+	last_byte = min_t(loff_t, end_byte - 1,
+			folio_pos(folio) + folio_size(folio) - 1);
+	first_blk = offset_in_folio(folio, start_byte) >> blkbits;
+	last_blk = offset_in_folio(folio, last_byte) >> blkbits;
+	for (i = first_blk; i <= last_blk; i++) {
+		if (!ifs_block_is_dirty(folio, ifs, i)) {
+			ret = punch(inode, folio_pos(folio) + (i << blkbits),
+				    1 << blkbits);
+			if (ret)
+				return ret;
+		}
+	}
+
+	return ret;
+}
+
+
 static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio,
 		loff_t *punch_start_byte, loff_t start_byte, loff_t end_byte,
 		iomap_punch_t punch)
@@ -907,6 +1021,13 @@ static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio,
 		if (ret)
 			return ret;
 	}
+
+	/* Punch non-dirty blocks within folio */
+	ret = iomap_write_delalloc_ifs_punch(inode, folio, start_byte,
+			end_byte, punch);
+	if (ret)
+		return ret;
+
 	/*
 	 * Make sure the next punch start is correctly bound to
 	 * the end of this data range, not the end of the folio.
@@ -1637,7 +1758,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct folio *folio, u64 end_pos)
 {
-	struct iomap_folio_state *ifs = ifs_alloc(inode, folio, 0);
+	struct iomap_folio_state *ifs = folio->private;
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1645,6 +1766,11 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	int error = 0, count = 0, i;
 	LIST_HEAD(submit_list);
 
+	if (!ifs && nblocks > 1) {
+		ifs = ifs_alloc(inode, folio, 0);
+		iomap_set_range_dirty(folio, 0, folio_size(folio));
+	}
+
 	WARN_ON_ONCE(ifs && atomic_read(&ifs->write_bytes_pending) != 0);
 
 	/*
@@ -1653,7 +1779,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
 	 */
 	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (ifs && !ifs_block_is_uptodate(ifs, i))
+		if (ifs && !ifs_block_is_dirty(folio, ifs, i))
 			continue;
 
 		error = wpc->ops->map_blocks(wpc, inode, pos);
@@ -1697,6 +1823,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		}
 	}
 
+	iomap_clear_range_dirty(folio, 0, end_pos - folio_pos(folio));
 	folio_start_writeback(folio);
 	folio_unlock(folio);
 
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 451942fb38ec..2fca4b4e7fd8 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -578,7 +578,7 @@ const struct address_space_operations xfs_address_space_operations = {
 	.read_folio = xfs_vm_read_folio,
 	.readahead = xfs_vm_readahead,
 	.writepages = xfs_vm_writepages,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
 	.release_folio = iomap_release_folio,
 	.invalidate_folio = iomap_invalidate_folio,
 	.bmap = xfs_vm_bmap,
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..e508c8e97372 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -175,7 +175,7 @@ const struct address_space_operations zonefs_file_aops = {
 	.read_folio = zonefs_read_folio,
 	.readahead = zonefs_readahead,
 	.writepages = zonefs_writepages,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
 	.release_folio = iomap_release_folio,
 	.invalidate_folio = iomap_invalidate_folio,
 	.migrate_folio = filemap_migrate_folio,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index e2b836c2e119..eb9335c46bf3 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -264,6 +264,7 @@ bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
 struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos);
 bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
 		const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
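
Note for readers (not part of the patch): the bitmap layout used above may be easier to follow with a small userspace model. The sketch below is only an illustration under stated assumptions. Every demo_* name is hypothetical, it does no locking, and it uses plain malloc/calloc where the kernel code uses kzalloc, the bitmap helpers and ifs->state_lock. It mirrors one idea from the patch: bits [0, n) hold per-block uptodate state, bits [n, 2n) hold per-block dirty state, and a byte range maps to an inclusive block range via off >> blkbits and (off + len - 1) >> blkbits.

```c
/* Userspace sketch only; all demo_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for struct iomap_folio_state: one array, two logical bitmaps. */
struct demo_state {
	unsigned int nr_blocks;	/* blocks per folio (n) */
	unsigned long *bits;	/* [0, n) uptodate, [n, 2n) dirty */
};

struct demo_state *demo_alloc(unsigned int nr_blocks)
{
	size_t nbits = 2 * (size_t)nr_blocks;
	size_t words = (nbits + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD;
	struct demo_state *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	s->bits = calloc(words, sizeof(unsigned long));
	if (!s->bits) {
		free(s);
		return NULL;
	}
	s->nr_blocks = nr_blocks;
	return s;
}

void demo_free(struct demo_state *s)
{
	if (s) {
		free(s->bits);
		free(s);
	}
}

static inline void demo_assign_bit(unsigned long *bits, size_t nr, bool val)
{
	unsigned long mask = 1UL << (nr % DEMO_BITS_PER_WORD);

	if (val)
		bits[nr / DEMO_BITS_PER_WORD] |= mask;
	else
		bits[nr / DEMO_BITS_PER_WORD] &= ~mask;
}

static inline bool demo_test_bit(const unsigned long *bits, size_t nr)
{
	return (bits[nr / DEMO_BITS_PER_WORD] >> (nr % DEMO_BITS_PER_WORD)) & 1UL;
}

/* Set or clear the dirty bits for every block touched by [off, off + len). */
void demo_mark_range_dirty(struct demo_state *s, unsigned int blkbits,
			   size_t off, size_t len, bool dirty)
{
	size_t first_blk, last_blk, i;

	assert(len > 0);
	first_blk = off >> blkbits;
	last_blk = (off + len - 1) >> blkbits;
	assert(last_blk < s->nr_blocks);

	/* Dirty bits live in the second half, hence the + nr_blocks offset. */
	for (i = first_blk; i <= last_blk; i++)
		demo_assign_bit(s->bits, i + s->nr_blocks, dirty);
}

void demo_set_block_uptodate(struct demo_state *s, unsigned int block)
{
	assert(block < s->nr_blocks);
	demo_assign_bit(s->bits, block, true);
}

bool demo_block_is_uptodate(const struct demo_state *s, unsigned int block)
{
	assert(block < s->nr_blocks);
	return demo_test_bit(s->bits, block);
}

bool demo_block_is_dirty(const struct demo_state *s, unsigned int block)
{
	assert(block < s->nr_blocks);
	return demo_test_bit(s->bits, (size_t)block + s->nr_blocks);
}
```

With 16 blocks of 4 KiB (blkbits = 12), marking bytes [4096, 12288) dirty touches only blocks 1 and 2 and leaves the uptodate half of the array unchanged. This per-block granularity is what allows writeback to skip blocks that are uptodate but not dirty.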