From patchwork Sun May 7 19:27:56 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13233902
From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, Dave Chinner,
    Brian Foster, Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [RFCv5 1/5] iomap: Rename iomap_page_create/release() to iop_alloc/free()
Date: Mon, 8 May 2023 00:57:56 +0530
Message-Id: <03639dbe54a0a0ef2bd789f4e8318df22a4c5d12.1683485700.git.ritesh.list@gmail.com>

This patch renames the iomap_page_create()/iomap_page_release() functions
to iop_alloc()/iop_free(). Later patches add more functions for handling
the iop structure under the iop_** naming convention, hence
iop_alloc()/iop_free() makes more sense.

Note that this patch also moves folio_detach_private() to happen later,
after the bitmap_full() check. This is another small refactor: later
patches turn the bitmap_** helpers into iop_** helpers that take only a
folio, so folio_detach_private() should move to the end, just before
calling kfree(iop).

Signed-off-by: Ritesh Harjani (IBM)
---
 fs/iomap/buffered-io.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 6f4c97a6d7e9..cbd945d96584 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -43,8 +43,8 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
 
 static struct bio_set iomap_ioend_bioset;
 
-static struct iomap_page *
-iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
+static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
+		unsigned int flags)
 {
 	struct iomap_page *iop = to_iomap_page(folio);
 	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
@@ -69,9 +69,9 @@ iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
 	return iop;
 }
 
-static void iomap_page_release(struct folio *folio)
+static void iop_free(struct folio *folio)
 {
-	struct iomap_page *iop = folio_detach_private(folio);
+	struct iomap_page *iop = to_iomap_page(folio);
 	struct inode *inode = folio->mapping->host;
 	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
@@ -81,6 +81,7 @@ static void iomap_page_release(struct folio *folio)
 	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
 	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
 			folio_test_uptodate(folio));
+	folio_detach_private(folio);
 	kfree(iop);
 }
 
@@ -231,7 +232,7 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (offset > 0)
-		iop = iomap_page_create(iter->inode, folio, iter->flags);
+		iop = iop_alloc(iter->inode, folio, iter->flags);
 	else
 		iop = to_iomap_page(folio);
 
@@ -269,7 +270,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		return iomap_read_inline_data(iter, folio);
 
 	/* zero post-eof blocks as the page may be mapped */
-	iop = iomap_page_create(iter->inode, folio, iter->flags);
+	iop = iop_alloc(iter->inode, folio, iter->flags);
 	iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
@@ -497,7 +498,7 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags)
 	 */
 	if (folio_test_dirty(folio) || folio_test_writeback(folio))
 		return false;
-	iomap_page_release(folio);
+	iop_free(folio);
 	return true;
 }
 EXPORT_SYMBOL_GPL(iomap_release_folio);
@@ -514,12 +515,12 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 	if (offset == 0 && len == folio_size(folio)) {
 		WARN_ON_ONCE(folio_test_writeback(folio));
 		folio_cancel_dirty(folio);
-		iomap_page_release(folio);
+		iop_free(folio);
 	} else if (folio_test_large(folio)) {
 		/* Must release the iop so the page can be split */
 		WARN_ON_ONCE(!folio_test_uptodate(folio) &&
 			     folio_test_dirty(folio));
-		iomap_page_release(folio);
+		iop_free(folio);
 	}
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
@@ -566,7 +567,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 		return 0;
 	folio_clear_error(folio);
 
-	iop = iomap_page_create(iter->inode, folio, iter->flags);
+	iop = iop_alloc(iter->inode, folio, iter->flags);
+
 	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
 		return -EAGAIN;
 
@@ -1619,7 +1621,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct folio *folio, u64 end_pos)
 {
-	struct iomap_page *iop = iomap_page_create(inode, folio, 0);
+	struct iomap_page *iop = iop_alloc(inode, folio, 0);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	unsigned nblocks = i_blocks_per_folio(inode, folio);
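The reordering in iop_free() matters because the folio must stop pointing
at the iop before the iop memory is returned. A minimal user-space model
of that detach-then-free ordering (the folio and its private pointer here
are hypothetical stand-ins, not the kernel API):

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct folio { void *private; };

static void *folio_detach_private(struct folio *folio)
{
	void *p = folio->private;

	folio->private = NULL;
	return p;
}

static void iop_free(struct folio *folio)
{
	void *iop = folio->private;

	if (!iop)
		return;
	/* Detach first so the folio never points at freed memory. */
	folio_detach_private(folio);
	free(iop);
}

int main(void)
{
	struct folio folio = { .private = malloc(32) };

	iop_free(&folio);
	assert(folio.private == NULL);	/* no dangling private pointer */
	printf("iop detached and freed\n");
	return 0;
}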
From patchwork Sun May 7 19:27:57 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13233903

From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, Dave Chinner,
    Brian Foster, Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [RFCv5 2/5] iomap: Refactor iop_set_range_uptodate() function
Date: Mon, 8 May 2023 00:57:57 +0530
Message-Id: <203a9e25873f6c94c9de89823439aa1f6a7dc714.1683485700.git.ritesh.list@gmail.com>

This patch moves up and combines the definitions of two functions
(iomap_iop_set_range_uptodate() and iomap_set_range_uptodate()) into
iop_set_range_uptodate(), and refactors its arguments a bit. No
functionality change in this patch.
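For reference, the byte-range to block-range arithmetic used by the
combined helper works as in this user-space sketch (the values are
illustrative only, not kernel code):

#include <stdio.h>

int main(void)
{
	unsigned int blkbits = 12;		/* 4k blocks */
	size_t off = 5000, len = 9000;		/* byte range in the folio */

	unsigned int first_blk = off >> blkbits;
	unsigned int last_blk = (off + len - 1) >> blkbits;
	unsigned int nr_blks = last_blk - first_blk + 1;

	/* bytes 5000..13999 touch blocks 1..3, i.e. 3 blocks */
	printf("first=%u last=%u nr=%u\n", first_blk, last_blk, nr_blks);
	return 0;
}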
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/iomap/buffered-io.c | 57 ++++++++++++++++++++----------------------
 1 file changed, 27 insertions(+), 30 deletions(-)

--
2.39.2

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index cbd945d96584..e732581dc2d4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -43,6 +43,27 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
 
 static struct bio_set iomap_ioend_bioset;
 
+static void iop_set_range_uptodate(struct inode *inode, struct folio *folio,
+		size_t off, size_t len)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+	unsigned int first_blk = off >> inode->i_blkbits;
+	unsigned int last_blk = (off + len - 1) >> inode->i_blkbits;
+	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned long flags;
+
+	if (iop) {
+		spin_lock_irqsave(&iop->uptodate_lock, flags);
+		bitmap_set(iop->uptodate, first_blk, nr_blks);
+		if (bitmap_full(iop->uptodate,
+				i_blocks_per_folio(inode, folio)))
+			folio_mark_uptodate(folio);
+		spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+	} else {
+		folio_mark_uptodate(folio);
+	}
+}
+
 static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
 		unsigned int flags)
 {
@@ -145,30 +166,6 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 	*lenp = plen;
 }
 
-static void iomap_iop_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
-{
-	struct inode *inode = folio->mapping->host;
-	unsigned first = off >> inode->i_blkbits;
-	unsigned last = (off + len - 1) >> inode->i_blkbits;
-	unsigned long flags;
-
-	spin_lock_irqsave(&iop->uptodate_lock, flags);
-	bitmap_set(iop->uptodate, first, last - first + 1);
-	if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio)))
-		folio_mark_uptodate(folio);
-	spin_unlock_irqrestore(&iop->uptodate_lock, flags);
-}
-
-static void iomap_set_range_uptodate(struct folio *folio,
-		struct iomap_page *iop, size_t off, size_t len)
-{
-	if (iop)
-		iomap_iop_set_range_uptodate(folio, iop, off, len);
-	else
-		folio_mark_uptodate(folio);
-}
-
 static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		size_t len, int error)
 {
@@ -178,7 +175,8 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset,
 		folio_clear_uptodate(folio);
 		folio_set_error(folio);
 	} else {
-		iomap_set_range_uptodate(folio, iop, offset, len);
+		iop_set_range_uptodate(folio->mapping->host, folio, offset,
+				len);
 	}
 
 	if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending))
@@ -240,7 +238,7 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
 	memcpy(addr, iomap->inline_data, size);
 	memset(addr + size, 0, PAGE_SIZE - poff - size);
 	kunmap_local(addr);
-	iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff);
+	iop_set_range_uptodate(iter->inode, folio, offset, PAGE_SIZE - poff);
 	return 0;
 }
 
@@ -277,7 +275,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 
 	if (iomap_block_needs_zeroing(iter, pos)) {
 		folio_zero_range(folio, poff, plen);
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iop_set_range_uptodate(iter->inode, folio, poff, plen);
 		goto done;
 	}
 
@@ -598,7 +596,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 			if (status)
 				return status;
 		}
-		iomap_set_range_uptodate(folio, iop, poff, plen);
+		iop_set_range_uptodate(iter->inode, folio, poff, plen);
 	} while ((block_start += plen) < block_end);
 
 	return 0;
@@ -705,7 +703,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos,
 static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 		size_t copied, struct folio *folio)
 {
-	struct iomap_page *iop = to_iomap_page(folio);
 	flush_dcache_folio(folio);
 
 	/*
@@ -721,7 +718,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	 */
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
-	iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len);
+	iop_set_range_uptodate(inode, folio, offset_in_folio(folio, pos), len);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
From patchwork Sun May 7 19:27:58 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13233904

From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, Dave Chinner,
    Brian Foster, Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [RFCv5 3/5] iomap: Add iop's uptodate state handling functions
Date: Mon, 8 May 2023 00:57:58 +0530
Message-Id: <5372f29f986052f37b45c368a0eb8eed25eb8fdb.1683485700.git.ritesh.list@gmail.com>

First, this patch renames the iop->uptodate bitmap to iop->state,
because later patches add dirty state tracking to the same bitmap, so
iop->state is the better name. Second, it adds helpers for handling the
uptodate state held in iop->state. No functionality change in this
patch.

Signed-off-by: Ritesh Harjani (IBM)
---
 fs/iomap/buffered-io.c | 78 +++++++++++++++++++++++++++++++-----------
 1 file changed, 58 insertions(+), 20 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index e732581dc2d4..5103b644e115 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -24,14 +24,14 @@
 #define IOEND_BATCH_SIZE	4096
 
 /*
- * Structure allocated for each folio when block size < folio size
- * to track sub-folio uptodate status and I/O completions.
+ * Structure allocated for each folio to track per-block uptodate state
+ * and I/O completions.
  */
 struct iomap_page {
 	atomic_t		read_bytes_pending;
 	atomic_t		write_bytes_pending;
-	spinlock_t		uptodate_lock;
-	unsigned long		uptodate[];
+	spinlock_t		state_lock;
+	unsigned long		state[];
 };
 
 static inline struct iomap_page *to_iomap_page(struct folio *folio)
@@ -43,6 +43,47 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio)
 
 static struct bio_set iomap_ioend_bioset;
 
+/*
+ * inline helpers for bitmap operations on iop->state
+ */
+static inline void iop_set_range(struct iomap_page *iop, unsigned int start_blk,
+		unsigned int nr_blks)
+{
+	bitmap_set(iop->state, start_blk, nr_blks);
+}
+
+static inline bool iop_test_block(struct iomap_page *iop, unsigned int block)
+{
+	return test_bit(block, iop->state);
+}
+
+static inline bool iop_bitmap_full(struct iomap_page *iop,
+		unsigned int blks_per_folio)
+{
+	return bitmap_full(iop->state, blks_per_folio);
+}
+
+/*
+ * iop related helpers for checking uptodate/dirty state of per-block
+ * or range of blocks within a folio
+ */
+static bool iop_test_full_uptodate(struct folio *folio)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+	struct inode *inode = folio->mapping->host;
+
+	WARN_ON(!iop);
+	return iop_bitmap_full(iop, i_blocks_per_folio(inode, folio));
+}
+
+static bool iop_test_block_uptodate(struct folio *folio, unsigned int block)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+
+	WARN_ON(!iop);
+	return iop_test_block(iop, block);
+}
+
 static void iop_set_range_uptodate(struct inode *inode, struct folio *folio,
 		size_t off, size_t len)
 {
@@ -53,12 +94,11 @@ static void iop_set_range_uptodate(struct inode *inode, struct folio *folio,
 	unsigned long flags;
 
 	if (iop) {
-		spin_lock_irqsave(&iop->uptodate_lock, flags);
-		bitmap_set(iop->uptodate, first_blk, nr_blks);
-		if (bitmap_full(iop->uptodate,
-				i_blocks_per_folio(inode, folio)))
+		spin_lock_irqsave(&iop->state_lock, flags);
+		iop_set_range(iop, first_blk, nr_blks);
+		if (iop_test_full_uptodate(folio))
 			folio_mark_uptodate(folio);
-		spin_unlock_irqrestore(&iop->uptodate_lock, flags);
+		spin_unlock_irqrestore(&iop->state_lock, flags);
 	} else {
 		folio_mark_uptodate(folio);
 	}
@@ -79,12 +119,12 @@ static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
 	else
 		gfp = GFP_NOFS | __GFP_NOFAIL;
 
-	iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)),
+	iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)),
 		      gfp);
 	if (iop) {
-		spin_lock_init(&iop->uptodate_lock);
+		spin_lock_init(&iop->state_lock);
 		if (folio_test_uptodate(folio))
-			bitmap_fill(iop->uptodate, nr_blocks);
+			iop_set_range(iop, 0, nr_blocks);
 		folio_attach_private(folio, iop);
 	}
 	return iop;
@@ -93,15 +133,13 @@ static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
 static void iop_free(struct folio *folio)
 {
 	struct iomap_page *iop = to_iomap_page(folio);
-	struct inode *inode = folio->mapping->host;
-	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
 
 	if (!iop)
 		return;
 	WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending));
 	WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending));
-	WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) !=
-			folio_test_uptodate(folio));
+	WARN_ON_ONCE(iop_test_full_uptodate(folio) !=
+			folio_test_uptodate(folio));
 	folio_detach_private(folio);
 	kfree(iop);
 }
@@ -132,7 +170,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 
 		/* move forward for each leading block marked uptodate */
 		for (i = first; i <= last; i++) {
-			if (!test_bit(i, iop->uptodate))
+			if (!iop_test_block_uptodate(folio, i))
 				break;
 			*pos += block_size;
 			poff += block_size;
@@ -142,7 +180,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio,
 
 		/* truncate len if we find any trailing uptodate block(s) */
 		for ( ; i <= last; i++) {
-			if (test_bit(i, iop->uptodate)) {
+			if (iop_test_block_uptodate(folio, i)) {
 				plen -= (last - i + 1) * block_size;
 				last = i - 1;
 				break;
@@ -450,7 +488,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count)
 	last = (from + count - 1) >> inode->i_blkbits;
 
 	for (i = first; i <= last; i++)
-		if (!test_bit(i, iop->uptodate))
+		if (!iop_test_block_uptodate(folio, i))
 			return false;
 	return true;
 }
@@ -1634,7 +1672,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
 	 */
 	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (iop && !test_bit(i, iop->uptodate))
+		if (iop && !iop_test_block_uptodate(folio, i))
 			continue;
 
 		error = wpc->ops->map_blocks(wpc, inode, pos);
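As an illustration of the new helpers, a stand-alone user-space model of
the uptodate bits in iop->state; bitmap_set(), bitmap_full() and
test_bit() are reimplemented here for demonstration and are not the
kernel implementations:

#include <stdbool.h>
#include <stdio.h>

#define NR_BLKS 16	/* e.g. 64k folio with 4k blocks */

static unsigned long state;	/* one bit per block, block 0 = bit 0 */

static void iop_set_range(unsigned int start_blk, unsigned int nr_blks)
{
	for (unsigned int i = 0; i < nr_blks; i++)
		state |= 1UL << (start_blk + i);
}

static bool iop_test_block(unsigned int block)
{
	return state & (1UL << block);
}

static bool iop_bitmap_full(unsigned int blks_per_folio)
{
	for (unsigned int i = 0; i < blks_per_folio; i++)
		if (!iop_test_block(i))
			return false;
	return true;
}

int main(void)
{
	iop_set_range(0, 8);	/* first half of the folio read in */
	printf("block 3 uptodate: %d, folio uptodate: %d\n",
	       iop_test_block(3), iop_bitmap_full(NR_BLKS));
	iop_set_range(8, 8);	/* remaining blocks read in */
	printf("folio uptodate now: %d\n", iop_bitmap_full(NR_BLKS));
	return 0;
}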
From patchwork Sun May 7 19:27:59 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13233905

From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, Dave Chinner,
    Brian Foster, Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)"
Subject: [RFCv5 4/5] iomap: Allocate iop in ->write_begin() early
Date: Mon, 8 May 2023 00:57:59 +0530

Currently, when the folio is uptodate, we only allocate the iop at
writeback time (in iomap_writepage_map()). That is fine today, but once
we add support for a per-block dirty state bitmap in the iop, it could
cause a performance regression: if we don't allocate the iop during
->write_begin(), we can never mark the necessary dirty bits in the
->write_end() call, and would instead have to mark all the bits dirty at
writeback time, which causes the same write amplification and
performance problems we have now.

However, for writes whose (pos, len) completely overlap the given folio,
there is no need to allocate an iop during ->write_begin(), so skip
those cases.

Signed-off-by: Ritesh Harjani (IBM)
---
 fs/iomap/buffered-io.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 5103b644e115..25f20f269214 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -599,15 +599,25 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 	size_t from = offset_in_folio(folio, pos), to = from + len;
 	size_t poff, plen;
 
-	if (folio_test_uptodate(folio))
+	/*
+	 * If the write completely overlaps the current folio, then
+	 * entire folio will be dirtied so there is no need for
+	 * per-block state tracking structures to be attached to this folio.
+	 */
+	if (pos <= folio_pos(folio) &&
+	    pos + len >= folio_pos(folio) + folio_size(folio))
 		return 0;
-	folio_clear_error(folio);
 
 	iop = iop_alloc(iter->inode, folio, iter->flags);
+
 	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
 		return -EAGAIN;
 
+	if (folio_test_uptodate(folio))
+		return 0;
+	folio_clear_error(folio);
+
+
 	do {
 		iomap_adjust_read_range(iter->inode, folio, &block_start,
 				block_end - block_start, &poff, &plen);
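The skip condition above can be sanity-checked with a small user-space
model; folio_pos()/folio_size() are modeled as plain variables here and
the values are illustrative:

#include <stdbool.h>
#include <stdio.h>

static bool write_covers_folio(long long pos, long long len,
			       long long folio_pos, long long folio_size)
{
	return pos <= folio_pos && pos + len >= folio_pos + folio_size;
}

int main(void)
{
	/* a 64k folio starting at byte 65536 */
	long long fpos = 65536, fsize = 65536;

	/* whole-folio overwrite: no iop needed, every block gets dirtied */
	printf("%d\n", write_covers_folio(65536, 65536, fpos, fsize)); /* 1 */
	/* sub-folio write: iop must be allocated to track dirty blocks */
	printf("%d\n", write_covers_folio(70000, 4096, fpos, fsize));  /* 0 */
	return 0;
}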
From patchwork Sun May 7 19:28:00 2023
X-Patchwork-Submitter: "Ritesh Harjani (IBM)"
X-Patchwork-Id: 13233906

From: "Ritesh Harjani (IBM)"
To: linux-xfs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, Dave Chinner,
    Brian Foster, Ojaswin Mujoo, Disha Goel, "Ritesh Harjani (IBM)",
    Aravinda Herle
Subject: [RFCv5 5/5] iomap: Add per-block dirty state tracking to improve performance
Date: Mon, 8 May 2023 00:58:00 +0530
Message-Id: <86987466d8d7863bd0dca81e9d6c3eff7abd4964.1683485700.git.ritesh.list@gmail.com>

When the filesystem block size is less than the folio size (either with
mapping_large_folio_support() or with blocksize < pagesize) and the
folio is uptodate in the page cache, even a one-byte write can cause the
entire folio to be written to disk during writeback (for example, with
64K folios and 4k blocks, a one-byte write could previously force all 16
blocks out to disk). This happens because we currently have no mechanism
to track per-block dirty state within struct iomap_page; we only track
uptodate state.

This patch implements per-block dirty state tracking in the
iomap_page->state bitmap, which should improve filesystem write
performance and reduce write amplification.

1. Performance testing of the fio workload below shows a ~16x
   improvement using nvme with XFS (4k blocksize) on Power (64K
   pagesize); fio-reported write bandwidth improved from ~28 MBps to
   ~452 MBps.

   [global]
   ioengine=psync
   rw=randwrite
   overwrite=1
   pre_read=1
   direct=0
   bs=4k
   size=1G
   dir=./
   numjobs=8
   fdatasync=1
   runtime=60
   iodepth=64
   group_reporting=1

   [fio-run]

2. Our internal performance team also reported that this patch improves
   their database workload performance by around ~83% (with XFS on
   Power).

Reported-by: Aravinda Herle
Reported-by: Brian Foster
Signed-off-by: Ritesh Harjani (IBM)
---
 fs/gfs2/aops.c         |   2 +-
 fs/iomap/buffered-io.c | 115 ++++++++++++++++++++++++++++++++++++++---
 fs/xfs/xfs_aops.c      |   2 +-
 fs/zonefs/file.c       |   2 +-
 include/linux/iomap.h  |   1 +
 5 files changed, 112 insertions(+), 10 deletions(-)

--
2.39.2

diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c
index a5f4be6b9213..75efec3c3b71 100644
--- a/fs/gfs2/aops.c
+++ b/fs/gfs2/aops.c
@@ -746,7 +746,7 @@ static const struct address_space_operations gfs2_aops = {
 	.writepages = gfs2_writepages,
 	.read_folio = gfs2_read_folio,
 	.readahead = gfs2_readahead,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
 	.release_folio = iomap_release_folio,
 	.invalidate_folio = iomap_invalidate_folio,
 	.bmap = gfs2_bmap,
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 25f20f269214..c7f41b26280a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -52,6 +52,12 @@ static inline void iop_set_range(struct iomap_page *iop, unsigned int start_blk,
 	bitmap_set(iop->state, start_blk, nr_blks);
 }
 
+static inline void iop_clear_range(struct iomap_page *iop,
+		unsigned int start_blk, unsigned int nr_blks)
+{
+	bitmap_clear(iop->state, start_blk, nr_blks);
+}
+
 static inline bool iop_test_block(struct iomap_page *iop, unsigned int block)
 {
 	return test_bit(block, iop->state);
@@ -84,6 +90,16 @@ static bool iop_test_block_uptodate(struct folio *folio, unsigned int block)
 	return iop_test_block(iop, block);
 }
 
+static bool iop_test_block_dirty(struct folio *folio, int block)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+	struct inode *inode = folio->mapping->host;
+	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
+
+	WARN_ON(!iop);
+	return iop_test_block(iop, block + blks_per_folio);
+}
+
 static void iop_set_range_uptodate(struct inode *inode, struct folio *folio,
 		size_t off, size_t len)
 {
@@ -104,8 +120,42 @@ static void iop_set_range_uptodate(struct inode *inode, struct folio *folio,
 	}
 }
 
+static void iop_set_range_dirty(struct inode *inode, struct folio *folio,
+		size_t off, size_t len)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
+	unsigned int first_blk = (off >> inode->i_blkbits);
+	unsigned int last_blk = ((off + len - 1) >> inode->i_blkbits);
+	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned long flags;
+
+	if (!iop)
+		return;
+	spin_lock_irqsave(&iop->state_lock, flags);
+	iop_set_range(iop, first_blk + blks_per_folio, nr_blks);
+	spin_unlock_irqrestore(&iop->state_lock, flags);
+}
+
+static void iop_clear_range_dirty(struct folio *folio, size_t off, size_t len)
+{
+	struct iomap_page *iop = to_iomap_page(folio);
+	struct inode *inode = folio->mapping->host;
+	unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
+	unsigned int first_blk = (off >> inode->i_blkbits);
+	unsigned int last_blk = ((off + len - 1) >> inode->i_blkbits);
+	unsigned int nr_blks = last_blk - first_blk + 1;
+	unsigned long flags;
+
+	if (!iop)
+		return;
+	spin_lock_irqsave(&iop->state_lock, flags);
+	iop_clear_range(iop, first_blk + blks_per_folio, nr_blks);
+	spin_unlock_irqrestore(&iop->state_lock, flags);
+}
+
 static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
-		unsigned int flags)
+		unsigned int flags, bool is_dirty)
 {
 	struct iomap_page *iop = to_iomap_page(folio);
 	unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
@@ -119,12 +169,20 @@ static struct iomap_page *iop_alloc(struct inode *inode, struct folio *folio,
 	else
 		gfp = GFP_NOFS | __GFP_NOFAIL;
 
-	iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)),
+	/*
+	 * iop->state tracks two sets of state flags when the
+	 * filesystem block size is smaller than the folio size.
+	 * The first state tracks per-block uptodate and the
+	 * second tracks per-block dirty state.
+	 */
+	iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(2 * nr_blocks)),
 		      gfp);
 	if (iop) {
 		spin_lock_init(&iop->state_lock);
 		if (folio_test_uptodate(folio))
 			iop_set_range(iop, 0, nr_blocks);
+		if (is_dirty)
+			iop_set_range(iop, nr_blocks, nr_blocks);
 		folio_attach_private(folio, iop);
 	}
 	return iop;
@@ -268,7 +326,8 @@ static int iomap_read_inline_data(const struct iomap_iter *iter,
 	if (WARN_ON_ONCE(size > iomap->length))
 		return -EIO;
 	if (offset > 0)
-		iop = iop_alloc(iter->inode, folio, iter->flags);
+		iop = iop_alloc(iter->inode, folio, iter->flags,
+				folio_test_dirty(folio));
 	else
 		iop = to_iomap_page(folio);
 
@@ -306,7 +365,8 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter,
 		return iomap_read_inline_data(iter, folio);
 
 	/* zero post-eof blocks as the page may be mapped */
-	iop = iop_alloc(iter->inode, folio, iter->flags);
+	iop = iop_alloc(iter->inode, folio, iter->flags,
+			folio_test_dirty(folio));
 	iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen);
 	if (plen == 0)
 		goto done;
@@ -561,6 +621,18 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len)
 }
 EXPORT_SYMBOL_GPL(iomap_invalidate_folio);
 
+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio)
+{
+	struct iomap_page *iop;
+	struct inode *inode = mapping->host;
+	size_t len = i_blocks_per_folio(inode, folio) << inode->i_blkbits;
+
+	iop = iop_alloc(inode, folio, 0, false);
+	iop_set_range_dirty(inode, folio, 0, len);
+	return filemap_dirty_folio(mapping, folio);
+}
+EXPORT_SYMBOL_GPL(iomap_dirty_folio);
+
 static void
 iomap_write_failed(struct inode *inode, loff_t pos, unsigned len)
 {
@@ -608,7 +680,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos,
 	    pos + len >= folio_pos(folio) + folio_size(folio))
 		return 0;
 
-	iop = iop_alloc(iter->inode, folio, iter->flags);
+	iop = iop_alloc(iter->inode, folio, iter->flags,
+			folio_test_dirty(folio));
 
 	if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1)
 		return -EAGAIN;
@@ -767,6 +840,7 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len,
 	if (unlikely(copied < len && !folio_test_uptodate(folio)))
 		return 0;
 	iop_set_range_uptodate(inode, folio, offset_in_folio(folio, pos), len);
+	iop_set_range_dirty(inode, folio, offset_in_folio(folio, pos), copied);
 	filemap_dirty_folio(inode->i_mapping, folio);
 	return copied;
 }
@@ -954,6 +1028,10 @@ static int iomap_write_delalloc_scan(struct inode *inode,
 {
 	while (start_byte < end_byte) {
 		struct folio	*folio;
+		struct iomap_page *iop;
+		unsigned int first_blk, last_blk, blks_per_folio, i;
+		loff_t last_byte;
+		u8 blkbits = inode->i_blkbits;
 
 		/* grab locked page */
 		folio = filemap_lock_folio(inode->i_mapping,
@@ -978,6 +1056,28 @@ static int iomap_write_delalloc_scan(struct inode *inode,
 			}
 		}
 
+		/*
+		 * When we have per-block dirty tracking, there can be
+		 * blocks within a folio which are marked uptodate
+		 * but not dirty. In that case it is necessary to punch
+		 * out such blocks to avoid leaking any delalloc blocks.
+		 */
+		iop = to_iomap_page(folio);
+		if (!iop)
+			goto skip_iop_punch;
+		last_byte = min_t(loff_t, end_byte - 1,
+				(folio_next_index(folio) << PAGE_SHIFT) - 1);
+		first_blk = offset_in_folio(folio, start_byte) >>
+					blkbits;
+		last_blk = offset_in_folio(folio, last_byte) >>
+					blkbits;
+		blks_per_folio = i_blocks_per_folio(inode, folio);
+		for (i = first_blk; i <= last_blk; i++) {
+			if (!iop_test_block_dirty(folio, i))
+				punch(inode, i << blkbits,
+					     1 << blkbits);
+		}
+
+skip_iop_punch:
 		/*
 		 * Make sure the next punch start is correctly bound to
 		 * the end of this data range, not the end of the folio.
@@ -1666,7 +1766,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		struct writeback_control *wbc, struct inode *inode,
 		struct folio *folio, u64 end_pos)
 {
-	struct iomap_page *iop = iop_alloc(inode, folio, 0);
+	struct iomap_page *iop = iop_alloc(inode, folio, 0, true);
 	struct iomap_ioend *ioend, *next;
 	unsigned len = i_blocksize(inode);
 	unsigned nblocks = i_blocks_per_folio(inode, folio);
@@ -1682,7 +1782,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 	 * invalid, grab a new one.
 	 */
 	for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) {
-		if (iop && !iop_test_block_uptodate(folio, i))
+		if (iop && !iop_test_block_dirty(folio, i))
 			continue;
 
 		error = wpc->ops->map_blocks(wpc, inode, pos);
@@ -1726,6 +1826,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc,
 		}
 	}
 
+	iop_clear_range_dirty(folio, 0, end_pos - folio_pos(folio));
 	folio_start_writeback(folio);
 	folio_unlock(folio);
 
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 2ef78aa1d3f6..77c7332ae197 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -578,7 +578,7 @@ const struct address_space_operations xfs_address_space_operations = {
 	.read_folio = xfs_vm_read_folio,
 	.readahead = xfs_vm_readahead,
 	.writepages = xfs_vm_writepages,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
 	.release_folio = iomap_release_folio,
 	.invalidate_folio = iomap_invalidate_folio,
 	.bmap = xfs_vm_bmap,
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 132f01d3461f..e508c8e97372 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -175,7 +175,7 @@ const struct address_space_operations zonefs_file_aops = {
 	.read_folio = zonefs_read_folio,
 	.readahead = zonefs_readahead,
 	.writepages = zonefs_writepages,
-	.dirty_folio = filemap_dirty_folio,
+	.dirty_folio = iomap_dirty_folio,
 	.release_folio = iomap_release_folio,
 	.invalidate_folio = iomap_invalidate_folio,
 	.migrate_folio = filemap_migrate_folio,
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 0f8123504e5e..0c2bee80565c 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -264,6 +264,7 @@ bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count);
 struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos);
 bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags);
 void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len);
+bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
 int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
 		const struct iomap_ops *ops);
 int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
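To summarize the doubled bitmap layout this patch introduces: bits
[0, nr_blocks) of iop->state hold per-block uptodate state and bits
[nr_blocks, 2 * nr_blocks) hold per-block dirty state, so dirty lookups
index at block + blks_per_folio. A user-space model of that indexing
(assumes a single folio with 16 blocks; the bit helpers are stand-ins,
not the kernel's):

#include <stdbool.h>
#include <stdio.h>

#define BLKS_PER_FOLIO 16	/* e.g. 64k folio with 4k blocks */

static unsigned long state;	/* 2 * BLKS_PER_FOLIO bits */

static void set_bit_(unsigned int bit)  { state |= 1UL << bit; }
static bool test_bit_(unsigned int bit) { return state & (1UL << bit); }

static void iop_set_block_dirty(unsigned int block)
{
	set_bit_(block + BLKS_PER_FOLIO);	/* second half = dirty */
}

static bool iop_test_block_dirty(unsigned int block)
{
	return test_bit_(block + BLKS_PER_FOLIO);
}

int main(void)
{
	set_bit_(5);			/* block 5 uptodate (first half) */
	iop_set_block_dirty(5);		/* block 5 dirty (second half) */

	/* writeback now skips clean blocks: only block 5 is written */
	for (unsigned int i = 0; i < BLKS_PER_FOLIO; i++)
		if (iop_test_block_dirty(i))
			printf("write back block %u\n", i);
	return 0;
}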