From patchwork Thu Apr 14 22:54:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12814103 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53C04C433F5 for ; Thu, 14 Apr 2022 22:54:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343542AbiDNW4d (ORCPT ); Thu, 14 Apr 2022 18:56:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244039AbiDNW4c (ORCPT ); Thu, 14 Apr 2022 18:56:32 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9EF8C644CA for ; Thu, 14 Apr 2022 15:54:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3538C620B7 for ; Thu, 14 Apr 2022 22:54:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91283C385A1; Thu, 14 Apr 2022 22:54:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649976845; bh=wv/5MdiI8NXfH0XW5zbIZWFtBpjSyLmZEOrRlp/RF+8=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=ikwilpll2nI+V6NJ737S78NwZ/1LrkE2zOZDU3cDofTrKvvtUasNQawzDf9CRLy7J JFtD3yDGvi26PKXY/so2cD/MRXQn/xXG7Ymv4aqwg4jtV8VzFW0NLtoBgcT01WJuAN C0AJChgQWlEcDozbmrF2eo4W0VFnlsmd/vW8FEL981a49IwdteJ48TZA4p4ZlJhLeG IvPvKDBU7wJSXKI/kIqeQpwSLPnSn2/YNIcfriXadSFm8T+ht75TnOPhdY38UBFBui EjxQgwFsPiD2DPKx73J8Ioh67lRx/f2ChYENBgNZyx+axV0P+py2zKysmMet/SVXnq mFl5Aczj1NwJg== Subject: [PATCH 1/4] xfs: capture buffer ops in the 
xfs_buf tracepoints From: "Darrick J. Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Thu, 14 Apr 2022 15:54:05 -0700 Message-ID: <164997684506.383709.959361265801019630.stgit@magnolia> In-Reply-To: <164997683918.383709.10179435130868945685.stgit@magnolia> References: <164997683918.383709.10179435130868945685.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Record the buffer ops in the xfs_buf tracepoints so that we can monitor the alleged type of the buffer. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner --- fs/xfs/xfs_trace.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index b141ef78c755..ecde0be3030a 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -418,6 +418,7 @@ DECLARE_EVENT_CLASS(xfs_buf_class, __field(unsigned, lockval) __field(unsigned, flags) __field(unsigned long, caller_ip) + __field(const void *, buf_ops) ), TP_fast_assign( __entry->dev = bp->b_target->bt_dev; @@ -428,9 +429,10 @@ DECLARE_EVENT_CLASS(xfs_buf_class, __entry->lockval = bp->b_sema.count; __entry->flags = bp->b_flags; __entry->caller_ip = caller_ip; + __entry->buf_ops = bp->b_ops; ), TP_printk("dev %d:%d daddr 0x%llx bbcount 0x%x hold %d pincount %d " - "lock %d flags %s caller %pS", + "lock %d flags %s bufops %pS caller %pS", MAJOR(__entry->dev), MINOR(__entry->dev), (unsigned long long)__entry->bno, __entry->nblks, @@ -438,6 +440,7 @@ DECLARE_EVENT_CLASS(xfs_buf_class, __entry->pincount, __entry->lockval, __print_flags(__entry->flags, "|", XFS_BUF_FLAGS), + __entry->buf_ops, (void *)__entry->caller_ip) ) From patchwork Thu Apr 14 22:54:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 12814104 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 71F2BC433EF for ; Thu, 14 Apr 2022 22:54:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346896AbiDNW4i (ORCPT ); Thu, 14 Apr 2022 18:56:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49744 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244039AbiDNW4i (ORCPT ); Thu, 14 Apr 2022 18:56:38 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 35FFC65162 for ; Thu, 14 Apr 2022 15:54:12 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CAF84620B7 for ; Thu, 14 Apr 2022 22:54:11 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 37476C385A5; Thu, 14 Apr 2022 22:54:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649976851; bh=jJ2grDlVKv3W/amCyxA1/bBWHzFLG+8m8Hv/IVcMqSo=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=hYfpO2fHXQcJCtrR8V7vxCuu1UGaNy0HPeLAj4zASUeptRaFQj3BiE8H18Rs4m/yj 0HdjZUNuEhudO1SyOFysE0z1hT9rXiFO9IQnbw25M/YvuQDCPqmHhvHdHk2gv4kYZt O7BbFVkGDIEtfw7Qt7CYyciZ63Dc6/nrVQS2k+tbLRxGegnkh7SAVf4k063tcRf7YN r+ziTrRcWDdQYGAdxFDKv+/3ptMy0BrUNESbNa+DLb+C6pHoqKPsdIh6VXzcNI78XV 9C5B7SbzVWIh2xHOH4nJfeKvSKsG2Oocg/FnJZhi8rMCtVJ9813eYg521C8JA0ZC4Z GTYikpDoFhcAQ== Subject: [PATCH 2/4] xfs: simplify xfs_rmap_lookup_le call sites From: "Darrick J. 
Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Thu, 14 Apr 2022 15:54:10 -0700 Message-ID: <164997685075.383709.9161047695879739444.stgit@magnolia> In-Reply-To: <164997683918.383709.10179435130868945685.stgit@magnolia> References: <164997683918.383709.10179435130868945685.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Most callers of xfs_rmap_lookup_le will retrieve the btree record immediately if the lookup succeeds. The overlapped version of this function (xfs_rmap_lookup_le_range) will return the record if the lookup succeeds, so make the regular version do it too. Get rid of the useless len argument, since it's not part of the lookup key. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_rmap.c | 59 +++++++++++++++++----------------------------- fs/xfs/libxfs/xfs_rmap.h | 4 ++- fs/xfs/scrub/bmap.c | 24 +++---------------- 3 files changed, 28 insertions(+), 59 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index cd322174dbff..3eea8056e7bc 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -34,18 +34,32 @@ int xfs_rmap_lookup_le( struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len, uint64_t owner, uint64_t offset, unsigned int flags, + struct xfs_rmap_irec *irec, int *stat) { + int get_stat = 0; + int error; + cur->bc_rec.r.rm_startblock = bno; - cur->bc_rec.r.rm_blockcount = len; + cur->bc_rec.r.rm_blockcount = 0; cur->bc_rec.r.rm_owner = owner; cur->bc_rec.r.rm_offset = offset; cur->bc_rec.r.rm_flags = flags; - return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + + error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + if (error || !(*stat) || !irec) + return error; + + error = xfs_rmap_get_rec(cur, irec, &get_stat); + if (error) + return error; + if (!get_stat) + return -EFSCORRUPTED; + + return 0; } /* 
@@ -510,7 +524,7 @@ xfs_rmap_unmap( * for the AG headers at rm_startblock == 0 created by mkfs/growfs that * will not ever be removed from the tree. */ - error = xfs_rmap_lookup_le(cur, bno, len, owner, offset, flags, &i); + error = xfs_rmap_lookup_le(cur, bno, owner, offset, flags, &ltrec, &i); if (error) goto out_error; if (XFS_IS_CORRUPT(mp, i != 1)) { @@ -518,13 +532,6 @@ xfs_rmap_unmap( goto out_error; } - error = xfs_rmap_get_rec(cur, &ltrec, &i); - if (error) - goto out_error; - if (XFS_IS_CORRUPT(mp, i != 1)) { - error = -EFSCORRUPTED; - goto out_error; - } trace_xfs_rmap_lookup_le_range_result(cur->bc_mp, cur->bc_ag.pag->pag_agno, ltrec.rm_startblock, ltrec.rm_blockcount, ltrec.rm_owner, @@ -786,18 +793,11 @@ xfs_rmap_map( * record for our insertion point. This will also give us the record for * start block contiguity tests. */ - error = xfs_rmap_lookup_le(cur, bno, len, owner, offset, flags, + error = xfs_rmap_lookup_le(cur, bno, owner, offset, flags, &ltrec, &have_lt); if (error) goto out_error; if (have_lt) { - error = xfs_rmap_get_rec(cur, &ltrec, &have_lt); - if (error) - goto out_error; - if (XFS_IS_CORRUPT(mp, have_lt != 1)) { - error = -EFSCORRUPTED; - goto out_error; - } trace_xfs_rmap_lookup_le_range_result(cur->bc_mp, cur->bc_ag.pag->pag_agno, ltrec.rm_startblock, ltrec.rm_blockcount, ltrec.rm_owner, @@ -1022,7 +1022,7 @@ xfs_rmap_convert( * record for our insertion point. This will also give us the record for * start block contiguity tests. 
*/ - error = xfs_rmap_lookup_le(cur, bno, len, owner, offset, oldext, &i); + error = xfs_rmap_lookup_le(cur, bno, owner, offset, oldext, &PREV, &i); if (error) goto done; if (XFS_IS_CORRUPT(mp, i != 1)) { @@ -1030,13 +1030,6 @@ xfs_rmap_convert( goto done; } - error = xfs_rmap_get_rec(cur, &PREV, &i); - if (error) - goto done; - if (XFS_IS_CORRUPT(mp, i != 1)) { - error = -EFSCORRUPTED; - goto done; - } trace_xfs_rmap_lookup_le_range_result(cur->bc_mp, cur->bc_ag.pag->pag_agno, PREV.rm_startblock, PREV.rm_blockcount, PREV.rm_owner, @@ -1140,7 +1133,7 @@ xfs_rmap_convert( _RET_IP_); /* reset the cursor back to PREV */ - error = xfs_rmap_lookup_le(cur, bno, len, owner, offset, oldext, &i); + error = xfs_rmap_lookup_le(cur, bno, owner, offset, oldext, NULL, &i); if (error) goto done; if (XFS_IS_CORRUPT(mp, i != 1)) { @@ -2677,7 +2670,7 @@ xfs_rmap_record_exists( ASSERT(XFS_RMAP_NON_INODE_OWNER(owner) || (flags & XFS_RMAP_BMBT_BLOCK)); - error = xfs_rmap_lookup_le(cur, bno, len, owner, offset, flags, + error = xfs_rmap_lookup_le(cur, bno, owner, offset, flags, &irec, &has_record); if (error) return error; @@ -2686,14 +2679,6 @@ xfs_rmap_record_exists( return 0; } - error = xfs_rmap_get_rec(cur, &irec, &has_record); - if (error) - return error; - if (!has_record) { - *has_rmap = false; - return 0; - } - *has_rmap = (irec.rm_owner == owner && irec.rm_startblock <= bno && irec.rm_startblock + irec.rm_blockcount >= bno + len); return 0; diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index b718ebeda372..11ec9406a0ea 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -122,8 +122,8 @@ int xfs_rmap_free(struct xfs_trans *tp, struct xfs_buf *agbp, const struct xfs_owner_info *oinfo); int xfs_rmap_lookup_le(struct xfs_btree_cur *cur, xfs_agblock_t bno, - xfs_extlen_t len, uint64_t owner, uint64_t offset, - unsigned int flags, int *stat); + uint64_t owner, uint64_t offset, unsigned int flags, + struct xfs_rmap_irec *irec, int *stat); int 
xfs_rmap_lookup_eq(struct xfs_btree_cur *cur, xfs_agblock_t bno, xfs_extlen_t len, uint64_t owner, uint64_t offset, unsigned int flags, int *stat); diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index a4cbbc346f60..1bd5c1089bf8 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -133,29 +133,13 @@ xchk_bmap_get_rmap( if (info->is_shared) { error = xfs_rmap_lookup_le_range(info->sc->sa.rmap_cur, agbno, owner, offset, rflags, rmap, &has_rmap); - if (!xchk_should_check_xref(info->sc, &error, - &info->sc->sa.rmap_cur)) - return false; - goto out; + } else { + error = xfs_rmap_lookup_le(info->sc->sa.rmap_cur, agbno, + owner, offset, rflags, rmap, &has_rmap); } - - /* - * Otherwise, use the (faster) regular lookup. - */ - error = xfs_rmap_lookup_le(info->sc->sa.rmap_cur, agbno, 0, owner, - offset, rflags, &has_rmap); - if (!xchk_should_check_xref(info->sc, &error, - &info->sc->sa.rmap_cur)) - return false; - if (!has_rmap) - goto out; - - error = xfs_rmap_get_rec(info->sc->sa.rmap_cur, rmap, &has_rmap); - if (!xchk_should_check_xref(info->sc, &error, - &info->sc->sa.rmap_cur)) + if (!xchk_should_check_xref(info->sc, &error, &info->sc->sa.rmap_cur)) return false; -out: if (!has_rmap) xchk_fblock_xref_set_corrupt(info->sc, info->whichfork, irec->br_startoff); From patchwork Thu Apr 14 22:54:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 12814105 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A013BC433EF for ; Thu, 14 Apr 2022 22:54:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1347335AbiDNW4r (ORCPT ); Thu, 14 Apr 2022 18:56:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244039AbiDNW4r (ORCPT ); Thu, 14 Apr 2022 18:56:47 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5ED0F65162 for ; Thu, 14 Apr 2022 15:54:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0DDE1B82BDB for ; Thu, 14 Apr 2022 22:54:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C32CBC385A5; Thu, 14 Apr 2022 22:54:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649976856; bh=ifyGtJzhm1vlrZ+NzLcSboTIT+xNx1BqoWSI7rVCHpE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=LmTTDe97CbQwDgep9e40GNG83NkZVwzdg8XjT4aa/qVNMeIKM+GtTTp6NgLIYOzvN CT32tOw2XCcvlnpTFEyzR48zxBH0y1mjlj4mMSRiAF4zNw+nUxT1FccDdc71EeYXYW 9ODW9e9KEiE7cyO5uUy351bibVrOmp1e0QpI5snVbKe0LcdGiXn0h1zspb6pRJpLAv dbikd+3yJNrefPg6hFiacDCQ1x+6E+ahNXtRAicHQA/njXY/hxB1kl4ufR2iPvGgxf 4li1iZs0UptTM1hsmeJs79Scrjy4uwtAHi2+xSg77wp3KVJOa03z6pdo/tktR6r2vz uxzb/6Jzx0xfA== Subject: [PATCH 3/4] xfs: speed up rmap lookups by using non-overlapped lookups when possible From: "Darrick J. 
Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Thu, 14 Apr 2022 15:54:16 -0700 Message-ID: <164997685638.383709.4789775648712621300.stgit@magnolia> In-Reply-To: <164997683918.383709.10179435130868945685.stgit@magnolia> References: <164997683918.383709.10179435130868945685.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Reverse mapping on a reflink-capable filesystem has some pretty high overhead when performing file operations. This is because the rmap records for logically and physically adjacent extents might not be adjacent in the rmap index due to data block sharing. As a result, we use expensive overlapped-interval btree search, which walks every record that overlaps with the supplied key in the hopes of finding the record. However, profiling data shows that when the index contains a record that is an exact match for a query key, the non-overlapped btree search function can find the record much faster than the overlapped version. Try the non-overlapped lookup first, which will make scrub run much faster. Signed-off-by: Darrick J. 
Wong --- fs/xfs/libxfs/xfs_rmap.c | 38 ++++++++++++++++++++++++++++++++------ 1 file changed, 32 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 3eea8056e7bc..5aa94deb3afd 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -402,12 +402,38 @@ xfs_rmap_lookup_le_range( info.irec = irec; info.stat = stat; - trace_xfs_rmap_lookup_le_range(cur->bc_mp, - cur->bc_ag.pag->pag_agno, bno, 0, owner, offset, flags); - error = xfs_rmap_query_range(cur, &info.high, &info.high, - xfs_rmap_lookup_le_range_helper, &info); - if (error == -ECANCELED) - error = 0; + trace_xfs_rmap_lookup_le_range(cur->bc_mp, cur->bc_ag.pag->pag_agno, + bno, 0, owner, offset, flags); + + /* + * Historically, we always used the range query to walk every reverse + * mapping that could possibly overlap the key that the caller asked + * for, and filter out the ones that don't. That is very slow when + * there are a lot of records. + * + * However, there are two scenarios where the classic btree search can + * produce correct results -- if the index contains a record that is an + * exact match for the lookup key; and if there are no other records + * between the record we want and the key we supplied. + * + * As an optimization, try a non-overlapped lookup first. This makes + * scrub run much faster on most filesystems because bmbt records are + * usually an exact match for rmap records. If we don't find what we + * want, we fall back to the overlapped query. 
+ */ + error = xfs_rmap_lookup_le(cur, bno, owner, offset, flags, irec, stat); + if (error) + return error; + if (*stat) { + *stat = 0; + xfs_rmap_lookup_le_range_helper(cur, irec, &info); + } + if (!(*stat)) { + error = xfs_rmap_query_range(cur, &info.high, &info.high, + xfs_rmap_lookup_le_range_helper, &info); + if (error == -ECANCELED) + error = 0; + } if (*stat) trace_xfs_rmap_lookup_le_range_result(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec->rm_startblock, From patchwork Thu Apr 14 22:54:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12814106 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E698BC433EF for ; Thu, 14 Apr 2022 22:54:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244039AbiDNW4v (ORCPT ); Thu, 14 Apr 2022 18:56:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1347370AbiDNW4u (ORCPT ); Thu, 14 Apr 2022 18:56:50 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 764AEE1E for ; Thu, 14 Apr 2022 15:54:23 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1291B62071 for ; Thu, 14 Apr 2022 22:54:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6C57BC385A1; Thu, 14 Apr 2022 22:54:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649976862; 
bh=JJHgNs7wL3i8MsKwDqvRpiCDBMeux9LXZj+1ewn3VTw=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=QOMukMyp4w9eEcSsFr/Smo2c0BPQNrUsGQ1YDl5t5fYZ0FPjtOO6jLhxEBuPHZz/6 nIBjuWU5t4lg9N0LBP8vrT/2ToJ8sIFE5QSrxHUoI97fw31dYob/c98m3bu+ZzVZMu zdhffT4hp/r0xjLdhPc0Rx/3rjZOYRhfdTx88/fvm1Wvw4q73hjvbdOPLiazPPcv/e 10E6DIIdmiVpae5IMOgCsqmvug+A2lx9Ayhq3FY5y5Sbut4+fAQjDnsZV/+XgI4Rel 33shje7PR8vS//VgFfBwNGR4l6FZ/kY5GMxYJ1ZOSV9c3hBzICUN6GpTKVnY/rkRsV AlM8bB5h+lTgA== Subject: [PATCH 4/4] xfs: speed up write operations by using non-overlapped lookups when possible From: "Darrick J. Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Thu, 14 Apr 2022 15:54:22 -0700 Message-ID: <164997686196.383709.14448633533668211390.stgit@magnolia> In-Reply-To: <164997683918.383709.10179435130868945685.stgit@magnolia> References: <164997683918.383709.10179435130868945685.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Reverse mapping on a reflink-capable filesystem has some pretty high overhead when performing file operations. This is because the rmap records for logically and physically adjacent extents might not be adjacent in the rmap index due to data block sharing. As a result, we use expensive overlapped-interval btree search, which walks every record that overlaps with the supplied key in the hopes of finding the record. However, profiling data shows that when the index contains a record that is an exact match for a query key, the non-overlapped btree search function can find the record much faster than the overlapped version. Try the non-overlapped lookup first when we're trying to find the left neighbor rmap record for a given file mapping, which makes unwritten extent conversion and remap operations run faster if data block sharing is minimal in this part of the filesystem. Signed-off-by: Darrick J. 
Wong --- fs/xfs/libxfs/xfs_rmap.c | 35 ++++++++++++++++++++++++++++++----- fs/xfs/libxfs/xfs_rmap.h | 3 --- 2 files changed, 30 insertions(+), 8 deletions(-) diff --git a/fs/xfs/libxfs/xfs_rmap.c b/fs/xfs/libxfs/xfs_rmap.c index 5aa94deb3afd..bd394138df9e 100644 --- a/fs/xfs/libxfs/xfs_rmap.c +++ b/fs/xfs/libxfs/xfs_rmap.c @@ -299,7 +299,7 @@ xfs_rmap_find_left_neighbor_helper( * return a match with the same owner and adjacent physical and logical * block ranges. */ -int +STATIC int xfs_rmap_find_left_neighbor( struct xfs_btree_cur *cur, xfs_agblock_t bno, @@ -332,10 +332,35 @@ xfs_rmap_find_left_neighbor( trace_xfs_rmap_find_left_neighbor_query(cur->bc_mp, cur->bc_ag.pag->pag_agno, bno, 0, owner, offset, flags); - error = xfs_rmap_query_range(cur, &info.high, &info.high, - xfs_rmap_find_left_neighbor_helper, &info); - if (error == -ECANCELED) - error = 0; + /* + * Historically, we always used the range query to walk every reverse + * mapping that could possibly overlap the key that the caller asked + * for, and filter out the ones that don't. That is very slow when + * there are a lot of records. + * + * However, there are two scenarios where the classic btree search can + * produce correct results -- if the index contains a record that is an + * exact match for the lookup key; and if there are no other records + * between the record we want and the key we supplied. + * + * As an optimization, try a non-overlapped lookup first. This makes + * extent conversion and remap operations run a bit faster if the + * physical extents aren't being shared. If we don't find what we + * want, we fall back to the overlapped query. 
+ */ + error = xfs_rmap_lookup_le(cur, bno, owner, offset, flags, irec, stat); + if (error) + return error; + if (*stat) { + *stat = 0; + xfs_rmap_find_left_neighbor_helper(cur, irec, &info); + } + if (!(*stat)) { + error = xfs_rmap_query_range(cur, &info.high, &info.high, + xfs_rmap_find_left_neighbor_helper, &info); + if (error == -ECANCELED) + error = 0; + } if (*stat) trace_xfs_rmap_find_left_neighbor_result(cur->bc_mp, cur->bc_ag.pag->pag_agno, irec->rm_startblock, diff --git a/fs/xfs/libxfs/xfs_rmap.h b/fs/xfs/libxfs/xfs_rmap.h index 11ec9406a0ea..54741a591a17 100644 --- a/fs/xfs/libxfs/xfs_rmap.h +++ b/fs/xfs/libxfs/xfs_rmap.h @@ -184,9 +184,6 @@ int xfs_rmap_finish_one(struct xfs_trans *tp, enum xfs_rmap_intent_type type, xfs_fsblock_t startblock, xfs_filblks_t blockcount, xfs_exntst_t state, struct xfs_btree_cur **pcur); -int xfs_rmap_find_left_neighbor(struct xfs_btree_cur *cur, xfs_agblock_t bno, - uint64_t owner, uint64_t offset, unsigned int flags, - struct xfs_rmap_irec *irec, int *stat); int xfs_rmap_lookup_le_range(struct xfs_btree_cur *cur, xfs_agblock_t bno, uint64_t owner, uint64_t offset, unsigned int flags, struct xfs_rmap_irec *irec, int *stat);
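
Illustrative note, not part of the patches: the "try the non-overlapped lookup first, then fall back to the overlapped query" idea from patches 3 and 4 can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions. struct rec, key_cmp(), rec_matches(), lookup_le(), scan_overlaps() and find_mapping() are hypothetical names invented for this example and are not kernel functions. A sorted array stands in for the rmap btree, and a plain linear scan stands in for the overlapped-interval query done by xfs_rmap_query_range().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an rmap record. */
struct rec {
	uint32_t start;		/* first physical block */
	uint32_t len;		/* number of blocks mapped */
	uint64_t owner;		/* owning file */
	uint64_t offset;	/* first logical block of the mapping */
};

/* Order a record's key against the query key (bno, owner, offset). */
static int key_cmp(const struct rec *r, uint32_t bno, uint64_t owner,
		   uint64_t offset)
{
	if (r->start != bno)
		return r->start < bno ? -1 : 1;
	if (r->owner != owner)
		return r->owner < owner ? -1 : 1;
	if (r->offset != offset)
		return r->offset < offset ? -1 : 1;
	return 0;
}

/* Filter: does this record really map (bno, owner, offset)? */
static int rec_matches(const struct rec *r, uint32_t bno, uint64_t owner,
		       uint64_t offset)
{
	return r->owner == owner &&
	       bno >= r->start && bno - r->start < r->len &&
	       offset >= r->offset && offset - r->offset < r->len;
}

/* Fast path: binary search for the last record whose key is <= the query. */
static const struct rec *lookup_le(const struct rec *recs, size_t n,
				   uint32_t bno, uint64_t owner,
				   uint64_t offset)
{
	size_t lo = 0, hi = n;

	assert(recs != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (key_cmp(&recs[mid], bno, owner, offset) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &recs[lo - 1] : NULL;
}

/* Slow path: examine every record and keep the first one the filter accepts. */
static const struct rec *scan_overlaps(const struct rec *recs, size_t n,
				       uint32_t bno, uint64_t owner,
				       uint64_t offset)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (rec_matches(&recs[i], bno, owner, offset))
			return &recs[i];
	return NULL;
}

/*
 * Try the cheap lookup first and run its result through the same filter
 * the slow path uses; only if that fails, pay for the full scan.
 * *used_fallback reports which path produced the answer.
 */
const struct rec *find_mapping(const struct rec *recs, size_t n,
			       uint32_t bno, uint64_t owner, uint64_t offset,
			       int *used_fallback)
{
	const struct rec *r = lookup_le(recs, n, bno, owner, offset);

	*used_fallback = 0;
	if (r && rec_matches(r, bno, owner, offset))
		return r;

	*used_fallback = 1;
	return scan_overlaps(recs, n, bno, owner, offset);
}
```

With the three records {10, 20, 1, 0}, {15, 5, 2, 100} and {40, 8, 1, 50}, a query for (40, 1, 50) is an exact key match and is answered by the fast path alone. A query for (20, 1, 10) lands on the shared record owned by file 2, fails the filter, and is answered by the fallback scan. That second case is why the fast-path result still has to go through the same helper as the range query before it can be trusted.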