From patchwork Mon Jun 5 10:55:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267206 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D7AFC7EE23 for ; Mon, 5 Jun 2023 10:55:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231800AbjFEKzi (ORCPT ); Mon, 5 Jun 2023 06:55:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231432AbjFEKzZ (ORCPT ); Mon, 5 Jun 2023 06:55:25 -0400 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7612911C; Mon, 5 Jun 2023 03:55:19 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1b01d912924so43170645ad.1; Mon, 05 Jun 2023 03:55:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962518; x=1688554518; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TxrzfIp1bwlxrggScZU2c1c7t8HyvxqKo5xx2x+BHzs=; b=a9yswcidqIQodT44r0/9muHlwWLyf8GVMw1Dm4Q0PvIpYODGAzzSarevkO9IEYWmUk uOAsSxRCWyb+9qLVgHAtJckaSupYP0QiRtgbgBIn+8VV/+w1RxbzBhCz9eOMMYYONkkR SyxBN99mt4/fUy4SlBfnWQyFlTz7wqPU3TiC/+RHlNisksAOtfOt6triJVHp8mnDJlSx 7Pin25efvEKHfu8sHjtrBEKhHnWar/dvH01ILzJ+4sP2EGvNTLQ/FZiJbc1CKRINcLvz KJQfCJYm3ZbL0ag/7tWjfByk/+s0exTR/4HlSVIPQEARYrtJNepTJEmukYRvbpT6G97c RV6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962518; x=1688554518; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TxrzfIp1bwlxrggScZU2c1c7t8HyvxqKo5xx2x+BHzs=; b=eAj3PaJy2fP0jZkk4JQVEHWP0yQs5Wm3M5witSBieeI0GucmPvP77vRfdQs6eERD3X prNaq19IY36i5TCTIE2LQ8aP5kG7UB8MxemysdDgUWKLpErvIK4gyZ6ho1poac+t+0DK gohue0Jtraz8aN031cqeAQHvVeE7jQ8X7YudPT8K0ScLKE/xRiUntAAfQ4e979c+Qgq8 zDAdI4BTdSQS6Owmh/D38WSmhn734244MQO9YXYovb16cP2YyVfmeGdUQra5a9HM7ssh YQNokKR+ImGAqECH8xRJh1MxyvFSrLU293YwuxqSdgisy19faw4D8jgWy+DIlbKR8xXS xXag== X-Gm-Message-State: AC+VfDz3rAjI8yQRXQEwvaGfWOvkLrIY5OguHKZx9oQ8uW8YdzxO/LTf i1dIVaY7M8p95k4IamRDPeA1QKCL33A= X-Google-Smtp-Source: ACHHUZ64dDKctH++/r2IBTDMoQhIUBSSuVb3O63nNI3T9gvaB5JEgO22OBuph0UeU1YCffStYCDI0g== X-Received: by 2002:a17:902:d506:b0:1b1:9233:bbf5 with SMTP id b6-20020a170902d50600b001b19233bbf5mr9197475plg.57.1685962518438; Mon, 05 Jun 2023 03:55:18 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:18 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv7 1/6] iomap: Rename iomap_page_create/release() to iomap_iop_alloc/free() Date: Mon, 5 Jun 2023 16:25:01 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org This patch renames the iomap_page_create/release() functions to iomap_iop_alloc/free() calls. Later patches adds more functions for handling iop structure with iomap_iop_** naming conventions. Hence iomap_iop_alloc/free() makes more sense to be consistent with all APIs. Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 063133ec77f4..4567bdd4fff9 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -43,8 +43,8 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio) static struct bio_set iomap_ioend_bioset; -static struct iomap_page * -iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags) +static struct iomap_page *iomap_iop_alloc(struct inode *inode, + struct folio *folio, unsigned int flags) { struct iomap_page *iop = to_iomap_page(folio); unsigned int nr_blocks = i_blocks_per_folio(inode, folio); @@ -69,7 +69,7 @@ iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags) return iop; } -static void iomap_page_release(struct folio *folio) +static void iomap_iop_free(struct folio *folio) { struct iomap_page *iop = folio_detach_private(folio); struct inode *inode = folio->mapping->host; @@ -231,7 +231,7 @@ static int iomap_read_inline_data(const struct iomap_iter *iter, if (WARN_ON_ONCE(size > iomap->length)) return -EIO; if (offset > 0) - iop = iomap_page_create(iter->inode, folio, iter->flags); + iop = iomap_iop_alloc(iter->inode, folio, iter->flags); else iop = to_iomap_page(folio); @@ -269,7 +269,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, return iomap_read_inline_data(iter, folio); /* zero post-eof blocks as the page may be mapped */ - iop = iomap_page_create(iter->inode, folio, iter->flags); + iop = iomap_iop_alloc(iter->inode, folio, iter->flags); iomap_adjust_read_range(iter->inode, folio, &pos, length, &poff, &plen); if (plen == 0) goto done; @@ -490,7 +490,7 @@ bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags) */ if (folio_test_dirty(folio) || folio_test_writeback(folio)) return false; - iomap_page_release(folio); + iomap_iop_free(folio); return true; } EXPORT_SYMBOL_GPL(iomap_release_folio); @@ -507,12 +507,12 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len) if (offset == 0 && len == folio_size(folio)) { WARN_ON_ONCE(folio_test_writeback(folio)); folio_cancel_dirty(folio); - iomap_page_release(folio); + iomap_iop_free(folio); } else if (folio_test_large(folio)) { /* Must release the iop so the page can be split */ WARN_ON_ONCE(!folio_test_uptodate(folio) && folio_test_dirty(folio)); - iomap_page_release(folio); + iomap_iop_free(folio); } } EXPORT_SYMBOL_GPL(iomap_invalidate_folio); @@ -559,7 +559,8 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos, return 0; folio_clear_error(folio); - iop = iomap_page_create(iter->inode, folio, iter->flags); + iop = iomap_iop_alloc(iter->inode, folio, iter->flags); + if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1) return -EAGAIN; @@ -1612,7 +1613,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, struct writeback_control *wbc, struct inode *inode, struct folio *folio, u64 end_pos) { - struct iomap_page *iop = iomap_page_create(inode, folio, 0); + struct iomap_page *iop = iomap_iop_alloc(inode, folio, 0); struct iomap_ioend *ioend, *next; unsigned len = i_blocksize(inode); unsigned nblocks = i_blocks_per_folio(inode, folio); From patchwork Mon Jun 5 10:55:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267207 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D1AAC7EE23 for ; Mon, 5 Jun 2023 10:55:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231264AbjFEKzp (ORCPT ); Mon, 5 Jun 2023 06:55:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60216 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231624AbjFEKza (ORCPT ); Mon, 5 Jun 2023 06:55:30 -0400 Received: from mail-pg1-x534.google.com (mail-pg1-x534.google.com [IPv6:2607:f8b0:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7A14E18E; Mon, 5 Jun 2023 03:55:23 -0700 (PDT) Received: by mail-pg1-x534.google.com with SMTP id 41be03b00d2f7-53202149ae2so2543185a12.3; Mon, 05 Jun 2023 03:55:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962522; x=1688554522; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ATOgTCD4yzJoITjwwVO7z1jeNiclQOzGdaaC13HT7dU=; b=U3XJZon/zyBw9Xr0o///D7ldH+DK7KtmJJ745H0oL3skfDSbj1nR2l36OpxK+evEAB 0I5vxqJziM7OD4g32unebKfgBVSOG5EwBjnrRJhA68QSR4OFajWdnnbh8H9YRb8HbVj7 xr6jw5K1hlsnSovZiz1Rspmd/ipqlUOTZUHAetqi53d2Gy4oXIQIke4CbrK/LivD16mC yAK6yk4ADioj+Y72Od4KPyd5vrSu9FIXwJvLuOep8D9FM89H1AqhKQ2P2tm0kbLcTfFt l1pa+F7C40nUbYXOegMsnaCxEZAB804cjB6fG1yZIwZT01hZI7pXSGErxDLbHuIRA/hB VM+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962522; x=1688554522; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ATOgTCD4yzJoITjwwVO7z1jeNiclQOzGdaaC13HT7dU=; b=PqT1Ptjn1naoRCki42Rs9P8K7x9qS+d/JIEU6XApL+MHicXCm/YQ3aZ7LWfhhG+TUN trRlWaUtCXGRSoFlYhsLT7s0RslcDDrPzO1IzQUlQ2IXPeq4Kk+CEJOiJIkLKGpJY9m9 H6FOcnz3cReKI0WRDe6JL2kp3NYrUAH7OcYXkL5mE9pYEj62i4Hslg87G5Qz0Q0FRR/A EeUw9CfDr3i+a/iTepj+2KCCOy2KUF3QtyrCrwr4hsYPa6RCt5u1A3ExlRsX4nwEB2IJ fmBs43PjQNl/ubi+RP0MDqiZ+BWDxLGIvv0G9AtZbmJdhWKZclMR5ylc1Lb7bL6ElHVy rWWw== X-Gm-Message-State: AC+VfDx33WvV15tIxSszmxu0eg7QGDn3o9iJc5G/FMf5Bf3eg0uHfbWd Wa4j2nEjpQJ1q7iRkik6SERHVLAnR4o= X-Google-Smtp-Source: ACHHUZ5ICNcF4CADAq0qIJmjOJH1ij3Pd+z6r9TrBRYClaEF9aFwSEeRpRLPIvl4RTC1Ty1o4uLTPA== X-Received: by 2002:a05:6a20:7d87:b0:10c:c5df:8bb7 with SMTP id v7-20020a056a207d8700b0010cc5df8bb7mr4001097pzj.30.1685962522459; Mon, 05 Jun 2023 03:55:22 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:22 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv7 2/6] iomap: Move folio_detach_private() in iomap_iop_free() to the end Date: Mon, 5 Jun 2023 16:25:02 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In later patches we will add other accessor APIs which will take inode and folio to operate over struct iomap_page. Since we need folio's private (iomap_page) in those functions, hence this function moves detaching of folio's private at the end just before calling kfree(iop). Signed-off-by: Ritesh Harjani (IBM) --- fs/iomap/buffered-io.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 4567bdd4fff9..6fffda355c45 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -71,7 +71,7 @@ static struct iomap_page *iomap_iop_alloc(struct inode *inode, static void iomap_iop_free(struct folio *folio) { - struct iomap_page *iop = folio_detach_private(folio); + struct iomap_page *iop = to_iomap_page(folio); struct inode *inode = folio->mapping->host; unsigned int nr_blocks = i_blocks_per_folio(inode, folio); @@ -81,6 +81,7 @@ static void iomap_iop_free(struct folio *folio) WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending)); WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != folio_test_uptodate(folio)); + folio_detach_private(folio); kfree(iop); } From patchwork Mon Jun 5 10:55:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267210 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 998FAC7EE2D for ; Mon, 5 Jun 2023 10:55:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231680AbjFEKzx (ORCPT ); Mon, 5 Jun 2023 06:55:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231747AbjFEKzh (ORCPT ); Mon, 5 Jun 2023 06:55:37 -0400 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 02C8E109; Mon, 5 Jun 2023 03:55:27 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1b1806264e9so23368895ad.0; Mon, 05 Jun 2023 03:55:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962526; x=1688554526; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mthp516lHjyz9NR6tUgrOY0idygGmolqlOw/E9u74Dw=; b=HHXwEk7ARBHOCa8IZOjSZwmd7uq7akNbzbZ/IhG4fIH8BbzfD/hFBEmV1Ik0qx4RUL JjpdBIVHY1sY8DX208ulcC9hDhiVe5FB44kjj0E5f28Tqk4bRbXFPSvu7UTYtHmT7VOJ Q28BRDvBIQ2LfilenRJOoWh9utDhLWbxdI+ox2qGgVRLrwTcaf9z6mSZ2NTJwQjB2A9V s8o553RSHl3DL+ycfI6xYYd+XiHn1SMKzh1WfTwGsSyi+w4/yfsEPLYTT4BXEXFnd65+ wnhb/Mqxbj8SirJldhB3EWWJBS0dB9sgUlSbiX/B4VNBmP4DRtt/VEenXf1hQIZlw3o2 oS5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962526; x=1688554526; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mthp516lHjyz9NR6tUgrOY0idygGmolqlOw/E9u74Dw=; b=KFmKpRHg6vP2aR+4753+V6eBerUSGPZWwBLbiHqhSg01kLz0PAhaDTK1dFFqElSJ2w GdvewJwGGC1R7F5BqNsCaKxNM2MHxY5pH1X0u6d1FiWb6TM+hU2tGPGuMQSr2+0FMYlr F2L4MppYxI93wC8OCVeTuQUJISqLe68UdYosPZzmeiiBUeS4iqRZFLwdcOG3uS4Gq13w eB38A9l3rpg5ZeXirVAEtw0fdOkc7v7QAmwpWEGzOXUP7DyiZJ+ERlUcDKHoy+OV8pnp 2v5dAg9+FQCoR9dJH4PvuymfANkMBCpBia8qtxUWkoqrrHtG03hoMqUEBfTIWl9yqA+6 TReA== X-Gm-Message-State: AC+VfDxd0z5/eLokChR/F3FUMUffYPNTln5HJF9e7ZpJP9etmCqRi2fY ktkgW2u2bGBihzx9nNyJ3Q34TXBHR1E= X-Google-Smtp-Source: ACHHUZ55jDYklzHjwVvzGVrqgrUjZM2pZljDm3YLb+OwJHxnIa0iWD/Plgo4/Crd/jMf/2hk5Vj+jw== X-Received: by 2002:a17:902:b694:b0:1af:b957:718b with SMTP id c20-20020a170902b69400b001afb957718bmr3273111pls.39.1685962526042; Mon, 05 Jun 2023 03:55:26 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:25 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv7 3/6] iomap: Refactor some iop related accessor functions Date: Mon, 5 Jun 2023 16:25:03 +0530 Message-Id: <4fe4937718d44c89e0c279175c65921717d9f591.1685962158.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org We would eventually use iomap_iop_** function naming by the rest of the buffered-io iomap code. This patch update function arguments and naming from iomap_set_range_uptodate() -> iomap_iop_set_range_uptodate(). iop_set_range_uptodate() then becomes an accessor function used by iomap_iop_** functions. Signed-off-by: Ritesh Harjani (IBM) --- fs/iomap/buffered-io.c | 111 +++++++++++++++++++++++------------------ 1 file changed, 63 insertions(+), 48 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 6fffda355c45..136f57ccd0be 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -24,14 +24,14 @@ #define IOEND_BATCH_SIZE 4096 /* - * Structure allocated for each folio when block size < folio size - * to track sub-folio uptodate status and I/O completions. + * Structure allocated for each folio to track per-block uptodate state + * and I/O completions. */ struct iomap_page { atomic_t read_bytes_pending; atomic_t write_bytes_pending; - spinlock_t uptodate_lock; - unsigned long uptodate[]; + spinlock_t state_lock; + unsigned long state[]; }; static inline struct iomap_page *to_iomap_page(struct folio *folio) @@ -43,6 +43,48 @@ static inline struct iomap_page *to_iomap_page(struct folio *folio) static struct bio_set iomap_ioend_bioset; +static bool iop_test_full_uptodate(struct folio *folio) +{ + struct iomap_page *iop = to_iomap_page(folio); + struct inode *inode = folio->mapping->host; + + return bitmap_full(iop->state, i_blocks_per_folio(inode, folio)); +} + +static bool iop_test_block_uptodate(struct folio *folio, unsigned int block) +{ + struct iomap_page *iop = to_iomap_page(folio); + + return test_bit(block, iop->state); +} + +static void iop_set_range_uptodate(struct inode *inode, struct folio *folio, + size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + unsigned int first_blk = off >> inode->i_blkbits; + unsigned int last_blk = (off + len - 1) >> inode->i_blkbits; + unsigned int nr_blks = last_blk - first_blk + 1; + unsigned long flags; + + spin_lock_irqsave(&iop->state_lock, flags); + bitmap_set(iop->state, first_blk, nr_blks); + if (iop_test_full_uptodate(folio)) + folio_mark_uptodate(folio); + spin_unlock_irqrestore(&iop->state_lock, flags); +} + +static void iomap_iop_set_range_uptodate(struct inode *inode, + struct folio *folio, size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + + if (iop) + iop_set_range_uptodate(inode, folio, off, len); + else + folio_mark_uptodate(folio); +} + static struct iomap_page *iomap_iop_alloc(struct inode *inode, struct folio *folio, unsigned int flags) { @@ -58,12 +100,12 @@ static struct iomap_page *iomap_iop_alloc(struct inode *inode, else gfp = GFP_NOFS | __GFP_NOFAIL; - iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)), + iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)), gfp); if (iop) { - spin_lock_init(&iop->uptodate_lock); + spin_lock_init(&iop->state_lock); if (folio_test_uptodate(folio)) - bitmap_fill(iop->uptodate, nr_blocks); + bitmap_fill(iop->state, nr_blocks); folio_attach_private(folio, iop); } return iop; @@ -72,14 +114,12 @@ static struct iomap_page *iomap_iop_alloc(struct inode *inode, static void iomap_iop_free(struct folio *folio) { struct iomap_page *iop = to_iomap_page(folio); - struct inode *inode = folio->mapping->host; - unsigned int nr_blocks = i_blocks_per_folio(inode, folio); if (!iop) return; WARN_ON_ONCE(atomic_read(&iop->read_bytes_pending)); WARN_ON_ONCE(atomic_read(&iop->write_bytes_pending)); - WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != + WARN_ON_ONCE(iop_test_full_uptodate(folio) != folio_test_uptodate(folio)); folio_detach_private(folio); kfree(iop); @@ -111,7 +151,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, /* move forward for each leading block marked uptodate */ for (i = first; i <= last; i++) { - if (!test_bit(i, iop->uptodate)) + if (!iop_test_block_uptodate(folio, i)) break; *pos += block_size; poff += block_size; @@ -121,7 +161,7 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, /* truncate len if we find any trailing uptodate block(s) */ for ( ; i <= last; i++) { - if (test_bit(i, iop->uptodate)) { + if (iop_test_block_uptodate(folio, i)) { plen -= (last - i + 1) * block_size; last = i - 1; break; @@ -145,30 +185,6 @@ static void iomap_adjust_read_range(struct inode *inode, struct folio *folio, *lenp = plen; } -static void iomap_iop_set_range_uptodate(struct folio *folio, - struct iomap_page *iop, size_t off, size_t len) -{ - struct inode *inode = folio->mapping->host; - unsigned first = off >> inode->i_blkbits; - unsigned last = (off + len - 1) >> inode->i_blkbits; - unsigned long flags; - - spin_lock_irqsave(&iop->uptodate_lock, flags); - bitmap_set(iop->uptodate, first, last - first + 1); - if (bitmap_full(iop->uptodate, i_blocks_per_folio(inode, folio))) - folio_mark_uptodate(folio); - spin_unlock_irqrestore(&iop->uptodate_lock, flags); -} - -static void iomap_set_range_uptodate(struct folio *folio, - struct iomap_page *iop, size_t off, size_t len) -{ - if (iop) - iomap_iop_set_range_uptodate(folio, iop, off, len); - else - folio_mark_uptodate(folio); -} - static void iomap_finish_folio_read(struct folio *folio, size_t offset, size_t len, int error) { @@ -178,7 +194,8 @@ static void iomap_finish_folio_read(struct folio *folio, size_t offset, folio_clear_uptodate(folio); folio_set_error(folio); } else { - iomap_set_range_uptodate(folio, iop, offset, len); + iomap_iop_set_range_uptodate(folio->mapping->host, folio, + offset, len); } if (!iop || atomic_sub_and_test(len, &iop->read_bytes_pending)) @@ -214,7 +231,6 @@ struct iomap_readpage_ctx { static int iomap_read_inline_data(const struct iomap_iter *iter, struct folio *folio) { - struct iomap_page *iop; const struct iomap *iomap = iomap_iter_srcmap(iter); size_t size = i_size_read(iter->inode) - iomap->offset; size_t poff = offset_in_page(iomap->offset); @@ -232,15 +248,14 @@ static int iomap_read_inline_data(const struct iomap_iter *iter, if (WARN_ON_ONCE(size > iomap->length)) return -EIO; if (offset > 0) - iop = iomap_iop_alloc(iter->inode, folio, iter->flags); - else - iop = to_iomap_page(folio); + iomap_iop_alloc(iter->inode, folio, iter->flags); addr = kmap_local_folio(folio, offset); memcpy(addr, iomap->inline_data, size); memset(addr + size, 0, PAGE_SIZE - poff - size); kunmap_local(addr); - iomap_set_range_uptodate(folio, iop, offset, PAGE_SIZE - poff); + iomap_iop_set_range_uptodate(iter->inode, folio, offset, + PAGE_SIZE - poff); return 0; } @@ -277,7 +292,7 @@ static loff_t iomap_readpage_iter(const struct iomap_iter *iter, if (iomap_block_needs_zeroing(iter, pos)) { folio_zero_range(folio, poff, plen); - iomap_set_range_uptodate(folio, iop, poff, plen); + iomap_iop_set_range_uptodate(iter->inode, folio, poff, plen); goto done; } @@ -452,7 +467,7 @@ bool iomap_is_partially_uptodate(struct folio *folio, size_t from, size_t count) last = (from + count - 1) >> inode->i_blkbits; for (i = first; i <= last; i++) - if (!test_bit(i, iop->uptodate)) + if (!iop_test_block_uptodate(folio, i)) return false; return true; } @@ -591,7 +606,7 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos, if (status) return status; } - iomap_set_range_uptodate(folio, iop, poff, plen); + iomap_iop_set_range_uptodate(iter->inode, folio, poff, plen); } while ((block_start += plen) < block_end); return 0; @@ -698,7 +713,6 @@ static int iomap_write_begin(struct iomap_iter *iter, loff_t pos, static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len, size_t copied, struct folio *folio) { - struct iomap_page *iop = to_iomap_page(folio); flush_dcache_folio(folio); /* @@ -714,7 +728,8 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len, */ if (unlikely(copied < len && !folio_test_uptodate(folio))) return 0; - iomap_set_range_uptodate(folio, iop, offset_in_folio(folio, pos), len); + iomap_iop_set_range_uptodate(inode, folio, offset_in_folio(folio, pos), + len); filemap_dirty_folio(inode->i_mapping, folio); return copied; } @@ -1630,7 +1645,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, * invalid, grab a new one. */ for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) { - if (iop && !test_bit(i, iop->uptodate)) + if (iop && !iop_test_block_uptodate(folio, i)) continue; error = wpc->ops->map_blocks(wpc, inode, pos); From patchwork Mon Jun 5 10:55:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267208 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09391C7EE23 for ; Mon, 5 Jun 2023 10:55:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230515AbjFEKzv (ORCPT ); Mon, 5 Jun 2023 06:55:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60470 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231751AbjFEKzh (ORCPT ); Mon, 5 Jun 2023 06:55:37 -0400 Received: from mail-pg1-x52a.google.com (mail-pg1-x52a.google.com [IPv6:2607:f8b0:4864:20::52a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4DA3510C; Mon, 5 Jun 2023 03:55:30 -0700 (PDT) Received: by mail-pg1-x52a.google.com with SMTP id 41be03b00d2f7-53fa455cd94so2116897a12.2; Mon, 05 Jun 2023 03:55:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962529; x=1688554529; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=WupwSFXXwQFHWk6jv9cWFxx36Vr+X7GVSjDECiCjfRE=; b=Kg5jNqx8MyZ+zzEUQ2iY7BALh0gh6CTVSKXxAtS4uerxTHBTFVFyztr8N6z3xnqUNu tvyUDcs34Eck3wkiK+6ch8qt4tetTpaAHKMmud+SVBmVysNyQQoaXDDrp+zQfBxUS6m/ zWjbf3r56f7dQvtxa+/tuWvQSy2Xm+jqZWTgCQSnA9IUURCdSdalGse6zf3CdXORWMxG KYrbMpSfhP1wMpmKuVCDHauzO8lufXLLfe+E/3YueuDfeulDti1ZnzEONcDC6S+sD1LK CmUYCff8KuKD2TOvolPVgQfB/Fy+35hlDvjURdeZXu4a5/l/AiPPkz1lksDUEm8nDbn7 4d1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962529; x=1688554529; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WupwSFXXwQFHWk6jv9cWFxx36Vr+X7GVSjDECiCjfRE=; b=DMt+lEK5iuhZv08OELtI+sNBqgYas+LE2qoWMI7N4usQyQtE+bJkHecPGcExQRpHH1 Gf1nJz/tV2OsQrehHKzeHIyqp4f1I2aYOk9y3KMc0qJGXLofWBmbJ2M7AKOxjAnUIvwU oaTotJmwDwL+8xbSGc4ftVryYquVLZpdkaxL/d8F4Cl60Tkm8cSmIXMgc6z+N3XKig1M OlXoxQoobGqAPK94+9fQrMOcW4UWClKDtiCQp1IFiY7IxkdMUYGTY0u1iEGns7NLc+oO vgNWy2sYqtpnJmAoQRrFmA96S6XTyYeby3tq5a784HCXTHxrfa1J02ffsPsIN6wH4pSb q/qw== X-Gm-Message-State: AC+VfDw/XVmKWh2NUeI6rq2EjCqUZ84LFW7diqgNMfau8tqidkyF64zB bhRcC/VbdmiCgvParY7f5xfUwf8a5MA= X-Google-Smtp-Source: ACHHUZ6EH7acXwk36tvkKicdhUuZu6jJpp4NbCtfjiLlqHBx+/Rmnz5nzobzokopySgRWDVDW/4fFw== X-Received: by 2002:a05:6a20:3ca1:b0:115:5910:c82d with SMTP id b33-20020a056a203ca100b001155910c82dmr1230613pzj.43.1685962529421; Mon, 05 Jun 2023 03:55:29 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:29 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv7 4/6] iomap: Refactor iomap_write_delalloc_punch() function out Date: Mon, 5 Jun 2023 16:25:04 +0530 Message-Id: <27c39cdf2150f19d91b7118b7399177d6889a358.1685962158.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org This patch moves iomap_write_delalloc_punch() out of iomap_write_delalloc_scan(). No functionality change in this patch. Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 54 ++++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 20 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 136f57ccd0be..f55a339f99ec 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -894,6 +894,33 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *i, } EXPORT_SYMBOL_GPL(iomap_file_buffered_write); +static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio, + loff_t *punch_start_byte, loff_t start_byte, loff_t end_byte, + int (*punch)(struct inode *inode, loff_t offset, loff_t length)) +{ + int ret = 0; + + if (!folio_test_dirty(folio)) + return ret; + + /* if dirty, punch up to offset */ + if (start_byte > *punch_start_byte) { + ret = punch(inode, *punch_start_byte, + start_byte - *punch_start_byte); + if (ret) + goto out; + } + /* + * Make sure the next punch start is correctly bound to + * the end of this data range, not the end of the folio. + */ + *punch_start_byte = min_t(loff_t, end_byte, + folio_next_index(folio) << PAGE_SHIFT); + +out: + return ret; +} + /* * Scan the data range passed to us for dirty page cache folios. If we find a * dirty folio, punch out the preceeding range and update the offset from which @@ -917,6 +944,7 @@ static int iomap_write_delalloc_scan(struct inode *inode, { while (start_byte < end_byte) { struct folio *folio; + int ret; /* grab locked page */ folio = filemap_lock_folio(inode->i_mapping, @@ -927,26 +955,12 @@ static int iomap_write_delalloc_scan(struct inode *inode, continue; } - /* if dirty, punch up to offset */ - if (folio_test_dirty(folio)) { - if (start_byte > *punch_start_byte) { - int error; - - error = punch(inode, *punch_start_byte, - start_byte - *punch_start_byte); - if (error) { - folio_unlock(folio); - folio_put(folio); - return error; - } - } - - /* - * Make sure the next punch start is correctly bound to - * the end of this data range, not the end of the folio. - */ - *punch_start_byte = min_t(loff_t, end_byte, - folio_next_index(folio) << PAGE_SHIFT); + ret = iomap_write_delalloc_punch(inode, folio, punch_start_byte, + start_byte, end_byte, punch); + if (ret) { + folio_unlock(folio); + folio_put(folio); + return ret; } /* move offset to start of next folio in range */ From patchwork Mon Jun 5 10:55:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267209 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 698D2C7EE2F for ; Mon, 5 Jun 2023 10:55:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231744AbjFEKzy (ORCPT ); Mon, 5 Jun 2023 06:55:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231825AbjFEKzi (ORCPT ); Mon, 5 Jun 2023 06:55:38 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6CAF11A; Mon, 5 Jun 2023 03:55:33 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1b02fcde49aso22701565ad.0; Mon, 05 Jun 2023 03:55:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962533; x=1688554533; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=d69PWHaRaZYvBhBWCvONZIqp7N6hQ/icpHNEJwjyDko=; b=GTXHUkwAlo5x63tRI0za5+P+gNvokdkwdWawVcMrBFIz2K4FRxNMszwLyxrFTGAfff E+9dTzVN89f5Odk+jZWPIJ7DLungw0qPoMt/RlMJ3dyqfijDxCWIZGxmf4w4ceShpbu6 uuZ0VMpJPVX11JoKKT6ppHWY3VUB3BMzj2VEFTCIl2fgbv1Pek/2tw9T4MxTmqp9qXeU H1Sc3iQU+fx+HNQ7/4vRvXDKvYJPetcAdOyK7PkqM+7hJpTyTV+i0AgbIRpD5pTHTstX PXTptAwv3nq6t3a1/EYu6zj6cyK7j+knxeIVY/diSCGMBpxX+/OkPmboUvOmMlxHk03m UMLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962533; x=1688554533; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=d69PWHaRaZYvBhBWCvONZIqp7N6hQ/icpHNEJwjyDko=; b=ZhHZLY2xh5glvrqfwXQ6Rc3wELIUmbB3c/xtsJEj4KoyLh2t4er3y8OTbtcFekzZvF q5vzBAC8fTO40xKCQNbfck250xjvobTmbgng2SnCjHNC8aru3saQeMBOxCbeSQpl7zDD WCFwHVkGdA6wQ1dh6MYossOLgJ2MAIxI8BQOZYG+aQREavFuwiot/jCkhmk7Y/SsSB8o KaoRroyRGlcd9QYSIqzuMbfHk/K68iXeGenFxbGnyZilcqlw95gB72Tsp0137njMlEFv h6RW8ENXF6JmQDpgTWnEQ8YpMtLgKXh5D0KEiVs/I+hpBFJVjHUaL4VqA3mNe4pWmi7P O/Vg== X-Gm-Message-State: AC+VfDwxE5XKq/n5XWPmDXNFpCkv8XPELM7o0U5Fc7LiPnublf4UYZr7 2TVNNnuHtd5hM4rb+2FsPr6kieUQTWM= X-Google-Smtp-Source: ACHHUZ5S2D4jwQX/6faQFgnLPamvlMXOz4oBslTkLKRw6MOSheIYrzX4vJzUm5nzOvLm5zc01xm3JA== X-Received: by 2002:a17:903:1252:b0:1ae:50cc:455 with SMTP id u18-20020a170903125200b001ae50cc0455mr3546490plh.39.1685962532819; Mon, 05 Jun 2023 03:55:32 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:32 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" Subject: [PATCHv7 5/6] iomap: Allocate iop in ->write_begin() early Date: Mon, 5 Jun 2023 16:25:05 +0530 Message-Id: X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org We dont need to allocate an iop in ->write_begin() for writes where the position and length completely overlap with the given folio. Therefore, such cases are skipped. Currently when the folio is uptodate, we only allocate iop at writeback time (in iomap_writepage_map()). This is ok until now, but when we are going to add support for per-block dirty state bitmap in iop, this could cause some performance degradation. The reason is that if we don't allocate iop during ->write_begin(), then we will never mark the necessary dirty bits in ->write_end() call. And we will have to mark all the bits as dirty at the writeback time, that could cause the same write amplification and performance problems as it is now. Signed-off-by: Ritesh Harjani (IBM) Reviewed-by: Darrick J. Wong --- fs/iomap/buffered-io.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index f55a339f99ec..2a97d73edb96 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -571,15 +571,24 @@ static int __iomap_write_begin(const struct iomap_iter *iter, loff_t pos, size_t from = offset_in_folio(folio, pos), to = from + len; size_t poff, plen; - if (folio_test_uptodate(folio)) + /* + * If the write completely overlaps the current folio, then + * entire folio will be dirtied so there is no need for + * per-block state tracking structures to be attached to this folio. + */ + if (pos <= folio_pos(folio) && + pos + len >= folio_pos(folio) + folio_size(folio)) return 0; - folio_clear_error(folio); iop = iomap_iop_alloc(iter->inode, folio, iter->flags); if ((iter->flags & IOMAP_NOWAIT) && !iop && nr_blocks > 1) return -EAGAIN; + if (folio_test_uptodate(folio)) + return 0; + folio_clear_error(folio); + do { iomap_adjust_read_range(iter->inode, folio, &block_start, block_end - block_start, &poff, &plen); From patchwork Mon Jun 5 10:55:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Ritesh Harjani (IBM)" X-Patchwork-Id: 13267218 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58178C77B73 for ; Mon, 5 Jun 2023 10:56:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231161AbjFEK4E (ORCPT ); Mon, 5 Jun 2023 06:56:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229902AbjFEKzj (ORCPT ); Mon, 5 Jun 2023 06:55:39 -0400 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A047F196; Mon, 5 Jun 2023 03:55:37 -0700 (PDT) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b01dac1a82so22508675ad.2; Mon, 05 Jun 2023 03:55:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1685962536; x=1688554536; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=U9pK+lGKax+PoiYWQq/agmbPKQaSYD3QlR6pu+SOJlA=; b=szVww/1wvJbiSlL7aIK2UUR1AWLL6BjSfDZuhUc6iMVx/FRoGJp78aZPGtiPxlf7oM h40E17GYRDJ5bB1/3UWrxPZW1xlp9f2kQofD5aDLLCR0iDyuoHREJpPUq2Eh6GZgieMa WpLgI7wFcBEN3LYyLLDdGGaiXmVpp5rsLJDK+9ye8iUK8JYVSw5hIjku3XDMQVFXUGQJ zuVklDlK/UlvRL/6MuyL1gmu4w44+AZE9D2z8cA6LpkNBe3UUJ3jfLQWGcbLlu1RLART 4tmwGi3V1QYLOGKAwU3aIuL3+M7cHZDSYr1C+MBzMZAS3hVt2EX7FmY8GuGRJ/eDzHXA RkHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685962536; x=1688554536; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=U9pK+lGKax+PoiYWQq/agmbPKQaSYD3QlR6pu+SOJlA=; b=NXIOhpQmz40McojcP64Bwx1klXbx2GYrMJoA7jiNq9M+qdC1y+PI8642in8yRO0UUQ UQVL1u998cC7bx8h3Z1qwej6hyCENjdt4XiiO2JJgPwlxXIqO8FeWqXmiCUzg5ltXI51 ROL0R4nPnyLU/7isTuP5DYXQgOrUkBwNvDOH5ytHweEeaUgVBhLYZ1tqhm/zdgr3JV5X mSU6KXtBf8ge7HSad2ciE6hau+/+75OvxBAwpGDL/PA+SeBEDdRBR+Jdawzx7l9g7YV0 qV1GMq564vrYtl1bxqf9kP1n1xkc7krzFK2pX4YMZ+GdYDQEwyTlSyUnhrLenZG2d51F RVTg== X-Gm-Message-State: AC+VfDys+c8XWIdhEsbu4FKh9OuEsz8+z66JRTo7X7W5ZcWeLIDqHlrZ rgoxDhu+hRdbajC7UCtpWjs6PWUZQOA= X-Google-Smtp-Source: ACHHUZ4kbigYTVuC+Qq0hJ20yRlY7GNsPiMVIEPN4PWJ3Uk4VkKLP4i8rQIBzWobJKEHAnGqY0i47g== X-Received: by 2002:a17:903:120f:b0:1b1:9d43:ad4c with SMTP id l15-20020a170903120f00b001b19d43ad4cmr3166631plh.40.1685962536393; Mon, 05 Jun 2023 03:55:36 -0700 (PDT) Received: from dw-tp.c4p-in.ibmmobiledemo.com ([129.41.58.19]) by smtp.gmail.com with ESMTPSA id q3-20020a17090311c300b001b0f727bc44sm6266883plh.16.2023.06.05.03.55.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 03:55:36 -0700 (PDT) From: "Ritesh Harjani (IBM)" To: linux-xfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox , Dave Chinner , Brian Foster , Christoph Hellwig , Andreas Gruenbacher , Ojaswin Mujoo , Disha Goel , "Ritesh Harjani (IBM)" , Aravinda Herle Subject: [PATCHv7 6/6] iomap: Add per-block dirty state tracking to improve performance Date: Mon, 5 Jun 2023 16:25:06 +0530 Message-Id: <1d83ed98de8d7896b4a7cc56c31d6f9c33be272f.1685962158.git.ritesh.list@gmail.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org When filesystem blocksize is less than folio size (either with mapping_large_folio_support() or with blocksize < pagesize) and when the folio is uptodate in pagecache, then even a byte write can cause an entire folio to be written to disk during writeback. This happens because we currently don't have a mechanism to track per-block dirty state within struct iomap_page. We currently only track uptodate state. This patch implements support for tracking per-block dirty state in iomap_page->state bitmap. This should help improve the filesystem write performance and help reduce write amplification. Performance testing of below fio workload reveals ~16x performance improvement using nvme with XFS (4k blocksize) on Power (64K pagesize) FIO reported write bw scores improved from around ~28 MBps to ~452 MBps. 1. [global] ioengine=psync rw=randwrite overwrite=1 pre_read=1 direct=0 bs=4k size=1G dir=./ numjobs=8 fdatasync=1 runtime=60 iodepth=64 group_reporting=1 [fio-run] 2. Also our internal performance team reported that this patch improves their database workload performance by around ~83% (with XFS on Power) Reported-by: Aravinda Herle Reported-by: Brian Foster Signed-off-by: Ritesh Harjani (IBM) --- fs/gfs2/aops.c | 2 +- fs/iomap/buffered-io.c | 120 +++++++++++++++++++++++++++++++++++++++-- fs/xfs/xfs_aops.c | 2 +- fs/zonefs/file.c | 2 +- include/linux/iomap.h | 1 + 5 files changed, 120 insertions(+), 7 deletions(-) -- 2.40.1 diff --git a/fs/gfs2/aops.c b/fs/gfs2/aops.c index a5f4be6b9213..75efec3c3b71 100644 --- a/fs/gfs2/aops.c +++ b/fs/gfs2/aops.c @@ -746,7 +746,7 @@ static const struct address_space_operations gfs2_aops = { .writepages = gfs2_writepages, .read_folio = gfs2_read_folio, .readahead = gfs2_readahead, - .dirty_folio = filemap_dirty_folio, + .dirty_folio = iomap_dirty_folio, .release_folio = iomap_release_folio, .invalidate_folio = iomap_invalidate_folio, .bmap = gfs2_bmap, diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c index 2a97d73edb96..e7d114b5b918 100644 --- a/fs/iomap/buffered-io.c +++ b/fs/iomap/buffered-io.c @@ -85,6 +85,63 @@ static void iomap_iop_set_range_uptodate(struct inode *inode, folio_mark_uptodate(folio); } +static bool iop_test_block_dirty(struct folio *folio, int block) +{ + struct iomap_page *iop = to_iomap_page(folio); + struct inode *inode = folio->mapping->host; + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + + return test_bit(block + blks_per_folio, iop->state); +} + +static void iop_set_range_dirty(struct inode *inode, struct folio *folio, + size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + unsigned int first_blk = off >> inode->i_blkbits; + unsigned int last_blk = (off + len - 1) >> inode->i_blkbits; + unsigned int nr_blks = last_blk - first_blk + 1; + unsigned long flags; + + spin_lock_irqsave(&iop->state_lock, flags); + bitmap_set(iop->state, first_blk + blks_per_folio, nr_blks); + spin_unlock_irqrestore(&iop->state_lock, flags); +} + +static void iomap_iop_set_range_dirty(struct inode *inode, struct folio *folio, + size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + + if (iop) + iop_set_range_dirty(inode, folio, off, len); +} + +static void iop_clear_range_dirty(struct inode *inode, struct folio *folio, + size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio); + unsigned int first_blk = off >> inode->i_blkbits; + unsigned int last_blk = (off + len - 1) >> inode->i_blkbits; + unsigned int nr_blks = last_blk - first_blk + 1; + unsigned long flags; + + spin_lock_irqsave(&iop->state_lock, flags); + bitmap_clear(iop->state, first_blk + blks_per_folio, nr_blks); + spin_unlock_irqrestore(&iop->state_lock, flags); +} + +static void iomap_iop_clear_range_dirty(struct inode *inode, + struct folio *folio, size_t off, size_t len) +{ + struct iomap_page *iop = to_iomap_page(folio); + + if (iop) + iop_clear_range_dirty(inode, folio, off, len); +} + static struct iomap_page *iomap_iop_alloc(struct inode *inode, struct folio *folio, unsigned int flags) { @@ -100,12 +157,20 @@ static struct iomap_page *iomap_iop_alloc(struct inode *inode, else gfp = GFP_NOFS | __GFP_NOFAIL; - iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)), + /* + * iop->state tracks two sets of state flags when the + * filesystem block size is smaller than the folio size. + * The first state tracks per-block uptodate and the + * second tracks per-block dirty state. + */ + iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(2 * nr_blocks)), gfp); if (iop) { spin_lock_init(&iop->state_lock); if (folio_test_uptodate(folio)) - bitmap_fill(iop->state, nr_blocks); + bitmap_set(iop->state, 0, nr_blocks); + if (folio_test_dirty(folio)) + bitmap_set(iop->state, nr_blocks, nr_blocks); folio_attach_private(folio, iop); } return iop; @@ -533,6 +598,17 @@ void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len) } EXPORT_SYMBOL_GPL(iomap_invalidate_folio); +bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio) +{ + struct inode *inode = mapping->host; + size_t len = folio_size(folio); + + iomap_iop_alloc(inode, folio, 0); + iomap_iop_set_range_dirty(inode, folio, 0, len); + return filemap_dirty_folio(mapping, folio); +} +EXPORT_SYMBOL_GPL(iomap_dirty_folio); + static void iomap_write_failed(struct inode *inode, loff_t pos, unsigned len) { @@ -739,6 +815,8 @@ static size_t __iomap_write_end(struct inode *inode, loff_t pos, size_t len, return 0; iomap_iop_set_range_uptodate(inode, folio, offset_in_folio(folio, pos), len); + iomap_iop_set_range_dirty(inode, folio, offset_in_folio(folio, pos), + copied); filemap_dirty_folio(inode->i_mapping, folio); return copied; } @@ -908,6 +986,10 @@ static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio, int (*punch)(struct inode *inode, loff_t offset, loff_t length)) { int ret = 0; + struct iomap_page *iop; + unsigned int first_blk, last_blk, i; + loff_t last_byte; + u8 blkbits = inode->i_blkbits; if (!folio_test_dirty(folio)) return ret; @@ -919,6 +1001,29 @@ static int iomap_write_delalloc_punch(struct inode *inode, struct folio *folio, if (ret) goto out; } + /* + * When we have per-block dirty tracking, there can be + * blocks within a folio which are marked uptodate + * but not dirty. In that case it is necessary to punch + * out such blocks to avoid leaking any delalloc blocks. + */ + iop = to_iomap_page(folio); + if (!iop) + goto skip_iop_punch; + + last_byte = min_t(loff_t, end_byte - 1, + (folio_next_index(folio) << PAGE_SHIFT) - 1); + first_blk = offset_in_folio(folio, start_byte) >> blkbits; + last_blk = offset_in_folio(folio, last_byte) >> blkbits; + for (i = first_blk; i <= last_blk; i++) { + if (!iop_test_block_dirty(folio, i)) { + ret = punch(inode, i << blkbits, 1 << blkbits); + if (ret) + goto out; + } + } + +skip_iop_punch: /* * Make sure the next punch start is correctly bound to * the end of this data range, not the end of the folio. @@ -1652,7 +1757,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, struct writeback_control *wbc, struct inode *inode, struct folio *folio, u64 end_pos) { - struct iomap_page *iop = iomap_iop_alloc(inode, folio, 0); + struct iomap_page *iop = to_iomap_page(folio); struct iomap_ioend *ioend, *next; unsigned len = i_blocksize(inode); unsigned nblocks = i_blocks_per_folio(inode, folio); @@ -1660,6 +1765,11 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, int error = 0, count = 0, i; LIST_HEAD(submit_list); + if (!iop && nblocks > 1) { + iop = iomap_iop_alloc(inode, folio, 0); + iomap_iop_set_range_dirty(inode, folio, 0, folio_size(folio)); + } + WARN_ON_ONCE(iop && atomic_read(&iop->write_bytes_pending) != 0); /* @@ -1668,7 +1778,7 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, * invalid, grab a new one. */ for (i = 0; i < nblocks && pos < end_pos; i++, pos += len) { - if (iop && !iop_test_block_uptodate(folio, i)) + if (iop && !iop_test_block_dirty(folio, i)) continue; error = wpc->ops->map_blocks(wpc, inode, pos); @@ -1712,6 +1822,8 @@ iomap_writepage_map(struct iomap_writepage_ctx *wpc, } } + iomap_iop_clear_range_dirty(inode, folio, 0, + end_pos - folio_pos(folio)); folio_start_writeback(folio); folio_unlock(folio); diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 2ef78aa1d3f6..77c7332ae197 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -578,7 +578,7 @@ const struct address_space_operations xfs_address_space_operations = { .read_folio = xfs_vm_read_folio, .readahead = xfs_vm_readahead, .writepages = xfs_vm_writepages, - .dirty_folio = filemap_dirty_folio, + .dirty_folio = iomap_dirty_folio, .release_folio = iomap_release_folio, .invalidate_folio = iomap_invalidate_folio, .bmap = xfs_vm_bmap, diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c index 132f01d3461f..e508c8e97372 100644 --- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -175,7 +175,7 @@ const struct address_space_operations zonefs_file_aops = { .read_folio = zonefs_read_folio, .readahead = zonefs_readahead, .writepages = zonefs_writepages, - .dirty_folio = filemap_dirty_folio, + .dirty_folio = iomap_dirty_folio, .release_folio = iomap_release_folio, .invalidate_folio = iomap_invalidate_folio, .migrate_folio = filemap_migrate_folio, diff --git a/include/linux/iomap.h b/include/linux/iomap.h index e2b836c2e119..eb9335c46bf3 100644 --- a/include/linux/iomap.h +++ b/include/linux/iomap.h @@ -264,6 +264,7 @@ bool iomap_is_partially_uptodate(struct folio *, size_t from, size_t count); struct folio *iomap_get_folio(struct iomap_iter *iter, loff_t pos); bool iomap_release_folio(struct folio *folio, gfp_t gfp_flags); void iomap_invalidate_folio(struct folio *folio, size_t offset, size_t len); +bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio); int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len, const struct iomap_ops *ops); int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,