From patchwork Fri Sep 27 17:17:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164863 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 43C9E112B for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2A88121906 for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726294AbfI0RSD (ORCPT ); Fri, 27 Sep 2019 13:18:03 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33828 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726033AbfI0RSD (ORCPT ); Fri, 27 Sep 2019 13:18:03 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4E01F30B4A82 for ; Fri, 27 Sep 2019 17:18:03 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 084FE26327 for ; Fri, 27 Sep 2019 17:18:02 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 01/11] xfs: track active state of allocation btree cursors Date: Fri, 27 Sep 2019 13:17:52 -0400 Message-Id: <20190927171802.45582-2-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Fri, 27 Sep 2019 17:18:03 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The upcoming allocation algorithm update searches multiple allocation btree cursors concurrently. As such, it requires an active state to track when a particular cursor should continue searching. While active state will be modified based on higher level logic, we can define base functionality based on the result of allocation btree lookups. Define an active flag in the private area of the btree cursor. Update it based on the result of lookups in the existing allocation btree helpers. Finally, provide a new helper to query the current state. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 24 +++++++++++++++++++++--- fs/xfs/libxfs/xfs_alloc_btree.c | 1 + fs/xfs/libxfs/xfs_btree.h | 3 +++ 3 files changed, 25 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 533b04aaf6f6..0ecc142c833b 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -146,9 +146,13 @@ xfs_alloc_lookup_eq( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; + cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); + cur->bc_private.a.priv.abt.active = (*stat == 1); + return error; } /* @@ -162,9 +166,13 @@ xfs_alloc_lookup_ge( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; + cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); + cur->bc_private.a.priv.abt.active = (*stat == 1); + return error; } /* @@ -178,9 +186,19 @@ xfs_alloc_lookup_le( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + cur->bc_private.a.priv.abt.active = (*stat == 1); + return error; +} + +static inline bool +xfs_alloc_cur_active( + struct xfs_btree_cur *cur) +{ + return cur && cur->bc_private.a.priv.abt.active; } /* diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c index 2a94543857a1..279694d73e4e 100644 --- a/fs/xfs/libxfs/xfs_alloc_btree.c +++ b/fs/xfs/libxfs/xfs_alloc_btree.c @@ -507,6 +507,7 @@ xfs_allocbt_init_cursor( cur->bc_private.a.agbp = agbp; cur->bc_private.a.agno = agno; + cur->bc_private.a.priv.abt.active = false; if (xfs_sb_version_hascrc(&mp->m_sb)) cur->bc_flags |= XFS_BTREE_CRC_BLOCKS; diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index ced1e65d1483..b4e3ec1d7ff9 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -183,6 +183,9 @@ union xfs_btree_cur_private { unsigned long nr_ops; /* # record updates */ int shape_changes; /* # of extent splits */ } refc; + struct { + bool active; /* allocation cursor state */ + } abt; }; /* From patchwork Fri Sep 27 17:17:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164865 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 469ED1599 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 191E0217D9 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726883AbfI0RSE (ORCPT ); Fri, 27 Sep 2019 13:18:04 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56902 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726033AbfI0RSE (ORCPT ); Fri, 27 Sep 2019 13:18:04 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C383D10CC20B for ; Fri, 27 Sep 2019 17:18:03 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6D83226342 for ; Fri, 27 Sep 2019 17:18:03 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 02/11] xfs: introduce allocation cursor data structure Date: Fri, 27 Sep 2019 13:17:53 -0400 Message-Id: <20190927171802.45582-3-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Fri, 27 Sep 2019 17:18:03 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Introduce a new allocation cursor data structure to encapsulate the various states and structures used to perform an extent allocation. This structure will eventually be used to track overall allocation state across different search algorithms on both free space btrees. To start, include the three btree cursors (one for the cntbt and two for the bnobt left/right search) used by the near mode allocation algorithm and refactor the cursor setup and teardown code into helpers. This slightly changes cursor memory allocation patterns, but otherwise makes no functional changes to the allocation algorithm. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 318 +++++++++++++++++++------------------- 1 file changed, 163 insertions(+), 155 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 0ecc142c833b..e14ab480de92 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -710,8 +710,71 @@ xfs_alloc_update_counters( } /* - * Allocation group level functions. + * Block allocation algorithm and data structures. */ +struct xfs_alloc_cur { + struct xfs_btree_cur *cnt; /* btree cursors */ + struct xfs_btree_cur *bnolt; + struct xfs_btree_cur *bnogt; +}; + +/* + * Set up cursors, etc. in the extent allocation cursor. This function can be + * called multiple times to reset an initialized structure without having to + * reallocate cursors. + */ +static int +xfs_alloc_cur_setup( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + int error; + int i; + + ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + + /* + * Perform an initial cntbt lookup to check for availability of maxlen + * extents. If this fails, we'll return -ENOSPC to signal the caller to + * attempt a small allocation. + */ + if (!acur->cnt) + acur->cnt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_CNT); + error = xfs_alloc_lookup_ge(acur->cnt, 0, args->maxlen, &i); + if (error) + return error; + + /* + * Allocate the bnobt left and right search cursors. + */ + if (!acur->bnolt) + acur->bnolt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_BNO); + if (!acur->bnogt) + acur->bnogt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_BNO); + return i == 1 ? 0 : -ENOSPC; +} + +static void +xfs_alloc_cur_close( + struct xfs_alloc_cur *acur, + bool error) +{ + int cur_error = XFS_BTREE_NOERROR; + + if (error) + cur_error = XFS_BTREE_ERROR; + + if (acur->cnt) + xfs_btree_del_cursor(acur->cnt, cur_error); + if (acur->bnolt) + xfs_btree_del_cursor(acur->bnolt, cur_error); + if (acur->bnogt) + xfs_btree_del_cursor(acur->bnogt, cur_error); + acur->cnt = acur->bnolt = acur->bnogt = NULL; +} /* * Deal with the case where only small freespaces remain. Either return the @@ -1008,8 +1071,8 @@ xfs_alloc_ag_vextent_exact( STATIC int xfs_alloc_find_best_extent( struct xfs_alloc_arg *args, /* allocation argument structure */ - struct xfs_btree_cur **gcur, /* good cursor */ - struct xfs_btree_cur **scur, /* searching cursor */ + struct xfs_btree_cur *gcur, /* good cursor */ + struct xfs_btree_cur *scur, /* searching cursor */ xfs_agblock_t gdiff, /* difference for search comparison */ xfs_agblock_t *sbno, /* extent found by search */ xfs_extlen_t *slen, /* extent length */ @@ -1031,7 +1094,7 @@ xfs_alloc_find_best_extent( * Look until we find a better one, run out of space or run off the end. */ do { - error = xfs_alloc_get_rec(*scur, sbno, slen, &i); + error = xfs_alloc_get_rec(scur, sbno, slen, &i); if (error) goto error0; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); @@ -1074,21 +1137,19 @@ xfs_alloc_find_best_extent( } if (!dir) - error = xfs_btree_increment(*scur, 0, &i); + error = xfs_btree_increment(scur, 0, &i); else - error = xfs_btree_decrement(*scur, 0, &i); + error = xfs_btree_decrement(scur, 0, &i); if (error) goto error0; } while (i); out_use_good: - xfs_btree_del_cursor(*scur, XFS_BTREE_NOERROR); - *scur = NULL; + scur->bc_private.a.priv.abt.active = false; return 0; out_use_search: - xfs_btree_del_cursor(*gcur, XFS_BTREE_NOERROR); - *gcur = NULL; + gcur->bc_private.a.priv.abt.active = false; return 0; error0: @@ -1102,13 +1163,12 @@ xfs_alloc_find_best_extent( * and of the form k * prod + mod unless there's nothing that large. * Return the starting a.g. block, or NULLAGBLOCK if we can't do it. */ -STATIC int /* error */ +STATIC int xfs_alloc_ag_vextent_near( - xfs_alloc_arg_t *args) /* allocation argument structure */ + struct xfs_alloc_arg *args) { - xfs_btree_cur_t *bno_cur_gt; /* cursor for bno btree, right side */ - xfs_btree_cur_t *bno_cur_lt; /* cursor for bno btree, left side */ - xfs_btree_cur_t *cnt_cur; /* cursor for count btree */ + struct xfs_alloc_cur acur = {0,}; + struct xfs_btree_cur *bno_cur; xfs_agblock_t gtbno; /* start bno of right side entry */ xfs_agblock_t gtbnoa; /* aligned ... */ xfs_extlen_t gtdiff; /* difference to right side entry */ @@ -1148,38 +1208,29 @@ xfs_alloc_ag_vextent_near( args->agbno = args->max_agbno; restart: - bno_cur_lt = NULL; - bno_cur_gt = NULL; ltlen = 0; gtlena = 0; ltlena = 0; busy = false; /* - * Get a cursor for the by-size btree. + * Set up cursors and see if there are any free extents as big as + * maxlen. If not, pick the last entry in the tree unless the tree is + * empty. */ - cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_CNT); - - /* - * See if there are any free extents as big as maxlen. - */ - if ((error = xfs_alloc_lookup_ge(cnt_cur, 0, args->maxlen, &i))) - goto error0; - /* - * If none, then pick up the last entry in the tree unless the - * tree is empty. - */ - if (!i) { - if ((error = xfs_alloc_ag_vextent_small(args, cnt_cur, <bno, - <len, &i))) - goto error0; + error = xfs_alloc_cur_setup(args, &acur); + if (error == -ENOSPC) { + error = xfs_alloc_ag_vextent_small(args, acur.cnt, <bno, + <len, &i); + if (error) + goto out; if (i == 0 || ltlen == 0) { - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); trace_xfs_alloc_near_noentry(args); - return 0; + goto out; } ASSERT(i == 1); + } else if (error) { + goto out; } args->wasfromfl = 0; @@ -1193,7 +1244,7 @@ xfs_alloc_ag_vextent_near( * This is written as a while loop so we can break out of it, * but we never loop back to the top. */ - while (xfs_btree_islastblock(cnt_cur, 0)) { + while (xfs_btree_islastblock(acur.cnt, 0)) { xfs_extlen_t bdiff; int besti=0; xfs_extlen_t blen=0; @@ -1210,32 +1261,35 @@ xfs_alloc_ag_vextent_near( * and skip all those smaller than minlen. */ if (ltlen || args->alignment > 1) { - cnt_cur->bc_ptrs[0] = 1; + acur.cnt->bc_ptrs[0] = 1; do { - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, - <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + error = xfs_alloc_get_rec(acur.cnt, <bno, + <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); if (ltlen >= args->minlen) break; - if ((error = xfs_btree_increment(cnt_cur, 0, &i))) - goto error0; + error = xfs_btree_increment(acur.cnt, 0, &i); + if (error) + goto out; } while (i); ASSERT(ltlen >= args->minlen); if (!i) break; } - i = cnt_cur->bc_ptrs[0]; + i = acur.cnt->bc_ptrs[0]; for (j = 1, blen = 0, bdiff = 0; !error && j && (blen < args->maxlen || bdiff > 0); - error = xfs_btree_increment(cnt_cur, 0, &j)) { + error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. */ - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, <bnoa, <lena, &busy_gen); if (ltlena < args->minlen) @@ -1255,7 +1309,7 @@ xfs_alloc_ag_vextent_near( bdiff = ltdiff; bnew = ltnew; blen = args->len; - besti = cnt_cur->bc_ptrs[0]; + besti = acur.cnt->bc_ptrs[0]; } } /* @@ -1267,10 +1321,11 @@ xfs_alloc_ag_vextent_near( /* * Point at the best entry, and retrieve it again. */ - cnt_cur->bc_ptrs[0] = besti; - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + acur.cnt->bc_ptrs[0] = besti; + error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); ASSERT(ltbno + ltlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); args->len = blen; @@ -1280,23 +1335,14 @@ xfs_alloc_ag_vextent_near( args->agbno = bnew; ASSERT(bnew >= ltbno); ASSERT(bnew + blen <= ltbno + ltlen); - /* - * Set up a cursor for the by-bno tree. - */ - bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, - args->agbp, args->agno, XFS_BTNUM_BNO); - /* - * Fix up the btree entries. - */ - if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur_lt, ltbno, - ltlen, bnew, blen, XFSA_FIXUP_CNT_OK))) - goto error0; - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_NOERROR); - + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, ltbno, + ltlen, bnew, blen, XFSA_FIXUP_CNT_OK); + if (error) + goto out; trace_xfs_alloc_near_first(args); - return 0; + goto out; } + /* * Second algorithm. * Search in the by-bno tree to the left and to the right @@ -1309,86 +1355,57 @@ xfs_alloc_ag_vextent_near( * level algorithm that picks allocation groups for allocations * is not supposed to do this. */ - /* - * Allocate and initialize the cursor for the leftward search. - */ - bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_BNO); - /* - * Lookup <= bno to find the leftward search's starting point. - */ - if ((error = xfs_alloc_lookup_le(bno_cur_lt, args->agbno, args->maxlen, &i))) - goto error0; - if (!i) { - /* - * Didn't find anything; use this cursor for the rightward - * search. - */ - bno_cur_gt = bno_cur_lt; - bno_cur_lt = NULL; - } - /* - * Found something. Duplicate the cursor for the rightward search. - */ - else if ((error = xfs_btree_dup_cursor(bno_cur_lt, &bno_cur_gt))) - goto error0; - /* - * Increment the cursor, so we will point at the entry just right - * of the leftward entry if any, or to the leftmost entry. - */ - if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) - goto error0; - if (!i) { - /* - * It failed, there are no rightward entries. - */ - xfs_btree_del_cursor(bno_cur_gt, XFS_BTREE_NOERROR); - bno_cur_gt = NULL; - } + error = xfs_alloc_lookup_le(acur.bnolt, args->agbno, 0, &i); + if (error) + goto out; + error = xfs_alloc_lookup_ge(acur.bnogt, args->agbno, 0, &i); + if (error) + goto out; + /* * Loop going left with the leftward cursor, right with the * rightward cursor, until either both directions give up or * we find an entry at least as big as minlen. */ do { - if (bno_cur_lt) { - if ((error = xfs_alloc_get_rec(bno_cur_lt, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + if (xfs_alloc_cur_active(acur.bnolt)) { + error = xfs_alloc_get_rec(acur.bnolt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy |= xfs_alloc_compute_aligned(args, ltbno, ltlen, <bnoa, <lena, &busy_gen); if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) break; - if ((error = xfs_btree_decrement(bno_cur_lt, 0, &i))) - goto error0; - if (!i || ltbnoa < args->min_agbno) { - xfs_btree_del_cursor(bno_cur_lt, - XFS_BTREE_NOERROR); - bno_cur_lt = NULL; - } + error = xfs_btree_decrement(acur.bnolt, 0, &i); + if (error) + goto out; + if (!i || ltbnoa < args->min_agbno) + acur.bnolt->bc_private.a.priv.abt.active = false; } - if (bno_cur_gt) { - if ((error = xfs_alloc_get_rec(bno_cur_gt, >bno, >len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + if (xfs_alloc_cur_active(acur.bnogt)) { + error = xfs_alloc_get_rec(acur.bnogt, >bno, >len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy |= xfs_alloc_compute_aligned(args, gtbno, gtlen, >bnoa, >lena, &busy_gen); if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) break; - if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) - goto error0; - if (!i || gtbnoa > args->max_agbno) { - xfs_btree_del_cursor(bno_cur_gt, - XFS_BTREE_NOERROR); - bno_cur_gt = NULL; - } + error = xfs_btree_increment(acur.bnogt, 0, &i); + if (error) + goto out; + if (!i || gtbnoa > args->max_agbno) + acur.bnogt->bc_private.a.priv.abt.active = false; } - } while (bno_cur_lt || bno_cur_gt); + } while (xfs_alloc_cur_active(acur.bnolt) || + xfs_alloc_cur_active(acur.bnogt)); /* * Got both cursors still active, need to find better entry. */ - if (bno_cur_lt && bno_cur_gt) { + if (xfs_alloc_cur_active(acur.bnolt) && + xfs_alloc_cur_active(acur.bnogt)) { if (ltlena >= args->minlen) { /* * Left side is good, look for a right side entry. @@ -1400,7 +1417,7 @@ xfs_alloc_ag_vextent_near( ltlena, <new); error = xfs_alloc_find_best_extent(args, - &bno_cur_lt, &bno_cur_gt, + acur.bnolt, acur.bnogt, ltdiff, >bno, >len, >bnoa, >lena, 0 /* search right */); @@ -1417,22 +1434,21 @@ xfs_alloc_ag_vextent_near( gtlena, >new); error = xfs_alloc_find_best_extent(args, - &bno_cur_gt, &bno_cur_lt, + acur.bnogt, acur.bnolt, gtdiff, <bno, <len, <bnoa, <lena, 1 /* search left */); } if (error) - goto error0; + goto out; } /* * If we couldn't get anything, give up. */ - if (bno_cur_lt == NULL && bno_cur_gt == NULL) { - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - + if (!xfs_alloc_cur_active(acur.bnolt) && + !xfs_alloc_cur_active(acur.bnogt)) { if (busy) { trace_xfs_alloc_near_busy(args); xfs_extent_busy_flush(args->mp, args->pag, busy_gen); @@ -1440,7 +1456,7 @@ xfs_alloc_ag_vextent_near( } trace_xfs_alloc_size_neither(args); args->agbno = NULLAGBLOCK; - return 0; + goto out; } /* @@ -1449,16 +1465,17 @@ xfs_alloc_ag_vextent_near( * useful variables to the "left" set so we only have one * copy of this code. */ - if (bno_cur_gt) { - bno_cur_lt = bno_cur_gt; - bno_cur_gt = NULL; + if (xfs_alloc_cur_active(acur.bnogt)) { + bno_cur = acur.bnogt; ltbno = gtbno; ltbnoa = gtbnoa; ltlen = gtlen; ltlena = gtlena; j = 1; - } else + } else { + bno_cur = acur.bnolt; j = 0; + } /* * Fix up the length and compute the useful address. @@ -1474,27 +1491,18 @@ xfs_alloc_ag_vextent_near( ASSERT(ltnew >= args->min_agbno && ltnew <= args->max_agbno); args->agbno = ltnew; - if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur_lt, ltbno, ltlen, - ltnew, rlen, XFSA_FIXUP_BNO_OK))) - goto error0; + error = xfs_alloc_fixup_trees(acur.cnt, bno_cur, ltbno, ltlen, ltnew, + rlen, XFSA_FIXUP_BNO_OK); + if (error) + goto out; if (j) trace_xfs_alloc_near_greater(args); else trace_xfs_alloc_near_lesser(args); - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_NOERROR); - return 0; - - error0: - trace_xfs_alloc_near_error(args); - if (cnt_cur != NULL) - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_ERROR); - if (bno_cur_lt != NULL) - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_ERROR); - if (bno_cur_gt != NULL) - xfs_btree_del_cursor(bno_cur_gt, XFS_BTREE_ERROR); +out: + xfs_alloc_cur_close(&acur, error); return error; } From patchwork Fri Sep 27 17:17:54 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164867 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 667461747 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4D0C120872 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726033AbfI0RSE (ORCPT ); Fri, 27 Sep 2019 13:18:04 -0400 Received: from mx1.redhat.com ([209.132.183.28]:33836 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726394AbfI0RSE (ORCPT ); Fri, 27 Sep 2019 13:18:04 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 349503086272 for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id E4C4D196B2 for ; Fri, 27 Sep 2019 17:18:03 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 03/11] xfs: track allocation busy state in allocation cursor Date: Fri, 27 Sep 2019 13:17:54 -0400 Message-Id: <20190927171802.45582-4-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Extend the allocation cursor to track extent busy state for an allocation attempt. No functional changes. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index e14ab480de92..aa71a01a93b1 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,8 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + unsigned int busy_gen;/* busy state */ + bool busy; }; /* @@ -733,6 +735,9 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->busy = false; + acur->busy_gen = 0; + /* * Perform an initial cntbt lookup to check for availability of maxlen * extents. If this fails, we'll return -ENOSPC to signal the caller to @@ -1185,8 +1190,6 @@ xfs_alloc_ag_vextent_near( xfs_extlen_t ltlena; /* aligned ... */ xfs_agblock_t ltnew; /* useful start bno of left side */ xfs_extlen_t rlen; /* length of returned extent */ - bool busy; - unsigned busy_gen; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1211,7 +1214,6 @@ xfs_alloc_ag_vextent_near( ltlen = 0; gtlena = 0; ltlena = 0; - busy = false; /* * Set up cursors and see if there are any free extents as big as @@ -1290,8 +1292,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &busy_gen); + acur.busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, + <bnoa, <lena, &acur.busy_gen); if (ltlena < args->minlen) continue; if (ltbnoa < args->min_agbno || ltbnoa > args->max_agbno) @@ -1373,8 +1375,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy |= xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &busy_gen); + acur.busy |= xfs_alloc_compute_aligned(args, ltbno, + ltlen, <bnoa, <lena, &acur.busy_gen); if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) break; error = xfs_btree_decrement(acur.bnolt, 0, &i); @@ -1388,8 +1390,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy |= xfs_alloc_compute_aligned(args, gtbno, gtlen, - >bnoa, >lena, &busy_gen); + acur.busy |= xfs_alloc_compute_aligned(args, gtbno, + gtlen, >bnoa, >lena, &acur.busy_gen); if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) break; error = xfs_btree_increment(acur.bnogt, 0, &i); @@ -1449,9 +1451,10 @@ xfs_alloc_ag_vextent_near( */ if (!xfs_alloc_cur_active(acur.bnolt) && !xfs_alloc_cur_active(acur.bnogt)) { - if (busy) { + if (acur.busy) { trace_xfs_alloc_near_busy(args); - xfs_extent_busy_flush(args->mp, args->pag, busy_gen); + xfs_extent_busy_flush(args->mp, args->pag, + acur.busy_gen); goto restart; } trace_xfs_alloc_size_neither(args); From patchwork Fri Sep 27 17:17:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164869 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B2144112B for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8E358217D9 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727061AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55488 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726673AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 97CA3180174F for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 548DE2632B for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 04/11] xfs: track best extent from cntbt lastblock scan in alloc cursor Date: Fri, 27 Sep 2019 13:17:55 -0400 Message-Id: <20190927171802.45582-5-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.62]); Fri, 27 Sep 2019 17:18:04 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org If the size lookup lands in the last block of the by-size btree, the near mode algorithm scans the entire block for the extent with best available locality. In preparation for similar best available extent tracking across both btrees, extend the allocation cursor with best extent data and lift the associated state from the cntbt last block scan code. No functional changes. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 63 ++++++++++++++++++++------------------- 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index aa71a01a93b1..4514a059486e 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,11 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + xfs_agblock_t rec_bno;/* extent startblock */ + xfs_extlen_t rec_len;/* extent length */ + xfs_agblock_t bno; /* alloc bno */ + xfs_extlen_t len; /* alloc len */ + xfs_extlen_t diff; /* diff from search bno */ unsigned int busy_gen;/* busy state */ bool busy; }; @@ -735,6 +740,11 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->rec_bno = 0; + acur->rec_len = 0; + acur->bno = 0; + acur->len = 0; + acur->diff = 0; acur->busy = false; acur->busy_gen = 0; @@ -1247,10 +1257,7 @@ xfs_alloc_ag_vextent_near( * but we never loop back to the top. */ while (xfs_btree_islastblock(acur.cnt, 0)) { - xfs_extlen_t bdiff; - int besti=0; - xfs_extlen_t blen=0; - xfs_agblock_t bnew=0; + xfs_extlen_t diff; #ifdef DEBUG if (dofirst) @@ -1281,8 +1288,8 @@ xfs_alloc_ag_vextent_near( break; } i = acur.cnt->bc_ptrs[0]; - for (j = 1, blen = 0, bdiff = 0; - !error && j && (blen < args->maxlen || bdiff > 0); + for (j = 1; + !error && j && (acur.len < args->maxlen || acur.diff > 0); error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than @@ -1301,44 +1308,40 @@ xfs_alloc_ag_vextent_near( args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); xfs_alloc_fix_len(args); ASSERT(args->len >= args->minlen); - if (args->len < blen) + if (args->len < acur.len) continue; - ltdiff = xfs_alloc_compute_diff(args->agbno, args->len, + diff = xfs_alloc_compute_diff(args->agbno, args->len, args->alignment, args->datatype, ltbnoa, ltlena, <new); if (ltnew != NULLAGBLOCK && - (args->len > blen || ltdiff < bdiff)) { - bdiff = ltdiff; - bnew = ltnew; - blen = args->len; - besti = acur.cnt->bc_ptrs[0]; + (args->len > acur.len || diff < acur.diff)) { + acur.rec_bno = ltbno; + acur.rec_len = ltlen; + acur.diff = diff; + acur.bno = ltnew; + acur.len = args->len; } } /* * It didn't work. We COULD be in a case where * there's a good record somewhere, so try again. */ - if (blen == 0) + if (acur.len == 0) break; - /* - * Point at the best entry, and retrieve it again. - */ - acur.cnt->bc_ptrs[0] = besti; - error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); - if (error) - goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - ASSERT(ltbno + ltlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - args->len = blen; /* - * We are allocating starting at bnew for blen blocks. + * Allocate at the bno/len tracked in the cursor. */ - args->agbno = bnew; - ASSERT(bnew >= ltbno); - ASSERT(bnew + blen <= ltbno + ltlen); - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, ltbno, - ltlen, bnew, blen, XFSA_FIXUP_CNT_OK); + args->agbno = acur.bno; + args->len = acur.len; + ASSERT(acur.bno >= acur.rec_bno); + ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); + ASSERT(acur.rec_bno + acur.rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); + + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, + acur.rec_bno, acur.rec_len, acur.bno, acur.len, + 0); if (error) goto out; trace_xfs_alloc_near_first(args); From patchwork Fri Sep 27 17:17:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164871 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 794CA1599 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 55F1520872 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727747AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56938 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726394AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0552210CC20B for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id B82B02632B for ; Fri, 27 Sep 2019 17:18:04 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 05/11] xfs: refactor cntbt lastblock scan best extent logic into helper Date: Fri, 27 Sep 2019 13:17:56 -0400 Message-Id: <20190927171802.45582-6-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The cntbt lastblock scan checks the size, alignment, locality, etc. of each free extent in the block and compares it with the current best candidate. This logic will be reused by the upcoming optimized cntbt algorithm, so refactor it into a separate helper. Note that acur->diff is now initialized to -1 (unsigned) instead of 0 to support the more granular comparison logic in the new helper. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 115 ++++++++++++++++++++++++++++---------- fs/xfs/xfs_trace.h | 26 +++++++++ 2 files changed, 113 insertions(+), 28 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 4514a059486e..38183953636a 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -744,7 +744,7 @@ xfs_alloc_cur_setup( acur->rec_len = 0; acur->bno = 0; acur->len = 0; - acur->diff = 0; + acur->diff = -1; acur->busy = false; acur->busy_gen = 0; @@ -791,6 +791,89 @@ xfs_alloc_cur_close( acur->cnt = acur->bnolt = acur->bnogt = NULL; } +/* + * Check an extent for allocation and track the best available candidate in the + * allocation structure. The cursor is deactivated if it has entered an out of + * range state based on allocation arguments. Optionally return the extent + * extent geometry and allocation status if requested by the caller. + */ +static int +xfs_alloc_cur_check( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + struct xfs_btree_cur *cur, + int *new) +{ + int error, i; + xfs_agblock_t bno, bnoa, bnew; + xfs_extlen_t len, lena, diff = -1; + bool busy; + unsigned busy_gen = 0; + bool deactivate = false; + + *new = 0; + + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (error) + return error; + XFS_WANT_CORRUPTED_RETURN(args->mp, i == 1); + + /* + * Check minlen and deactivate a cntbt cursor if out of acceptable size + * range (i.e., walking backwards looking for a minlen extent). + */ + if (len < args->minlen) { + deactivate = true; + goto out; + } + + busy = xfs_alloc_compute_aligned(args, bno, len, &bnoa, &lena, + &busy_gen); + acur->busy |= busy; + if (busy) + acur->busy_gen = busy_gen; + /* deactivate a bnobt cursor outside of locality range */ + if (bnoa < args->min_agbno || bnoa > args->max_agbno) + goto out; + if (lena < args->minlen) + goto out; + + args->len = XFS_EXTLEN_MIN(lena, args->maxlen); + xfs_alloc_fix_len(args); + ASSERT(args->len >= args->minlen); + if (args->len < acur->len) + goto out; + + /* + * We have an aligned record that satisfies minlen and beats or matches + * the candidate extent size. Compare locality for near allocation mode. + */ + ASSERT(args->type == XFS_ALLOCTYPE_NEAR_BNO); + diff = xfs_alloc_compute_diff(args->agbno, args->len, + args->alignment, args->datatype, + bnoa, lena, &bnew); + if (bnew == NULLAGBLOCK) + goto out; + if (diff > acur->diff) + goto out; + + ASSERT(args->len > acur->len || + (args->len == acur->len && diff <= acur->diff)); + acur->rec_bno = bno; + acur->rec_len = len; + acur->bno = bnew; + acur->len = args->len; + acur->diff = diff; + *new = 1; + +out: + if (deactivate) + cur->bc_private.a.priv.abt.active = false; + trace_xfs_alloc_cur_check(args->mp, cur->bc_btnum, bno, len, diff, + *new); + return 0; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1257,8 +1340,6 @@ xfs_alloc_ag_vextent_near( * but we never loop back to the top. */ while (xfs_btree_islastblock(acur.cnt, 0)) { - xfs_extlen_t diff; - #ifdef DEBUG if (dofirst) break; @@ -1289,38 +1370,16 @@ xfs_alloc_ag_vextent_near( } i = acur.cnt->bc_ptrs[0]; for (j = 1; - !error && j && (acur.len < args->maxlen || acur.diff > 0); + !error && j && xfs_alloc_cur_active(acur.cnt) && + (acur.len < args->maxlen || acur.diff > 0); error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. */ - error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.cnt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &acur.busy_gen); - if (ltlena < args->minlen) - continue; - if (ltbnoa < args->min_agbno || ltbnoa > args->max_agbno) - continue; - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - ASSERT(args->len >= args->minlen); - if (args->len < acur.len) - continue; - diff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, ltbnoa, - ltlena, <new); - if (ltnew != NULLAGBLOCK && - (args->len > acur.len || diff < acur.diff)) { - acur.rec_bno = ltbno; - acur.rec_len = ltlen; - acur.diff = diff; - acur.bno = ltnew; - acur.len = args->len; - } } /* * It didn't work. We COULD be in a case where diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index eaae275ed430..cf618919cacf 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1663,6 +1663,32 @@ DEFINE_ALLOC_EVENT(xfs_alloc_vextent_noagbp); DEFINE_ALLOC_EVENT(xfs_alloc_vextent_loopfailed); DEFINE_ALLOC_EVENT(xfs_alloc_vextent_allfailed); +TRACE_EVENT(xfs_alloc_cur_check, + TP_PROTO(struct xfs_mount *mp, xfs_btnum_t btnum, xfs_agblock_t bno, + xfs_extlen_t len, xfs_extlen_t diff, bool new), + TP_ARGS(mp, btnum, bno, len, diff, new), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_btnum_t, btnum) + __field(xfs_agblock_t, bno) + __field(xfs_extlen_t, len) + __field(xfs_extlen_t, diff) + __field(bool, new) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->btnum = btnum; + __entry->bno = bno; + __entry->len = len; + __entry->diff = diff; + __entry->new = new; + ), + TP_printk("dev %d:%d btree %s bno 0x%x len 0x%x diff 0x%x new %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __print_symbolic(__entry->btnum, XFS_BTNUM_STRINGS), + __entry->bno, __entry->len, __entry->diff, __entry->new) +) + DECLARE_EVENT_CLASS(xfs_da_class, TP_PROTO(struct xfs_da_args *args), TP_ARGS(args), From patchwork Fri Sep 27 17:17:57 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164873 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B5D291747 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8D8D2217D9 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726394AbfI0RSG (ORCPT ); Fri, 27 Sep 2019 13:18:06 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43908 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726673AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6ABC4B62C for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 27B27196B2 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 06/11] xfs: reuse best extent tracking logic for bnobt scan Date: Fri, 27 Sep 2019 13:17:57 -0400 Message-Id: <20190927171802.45582-7-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The near mode bnobt scan searches left and right in the bnobt looking for the closest free extent to the allocation hint that satisfies minlen. Once such an extent is found, the left/right search terminates, we search one more time in the opposite direction and finish the allocation with the best overall extent. The left/right and find best searches are currently controlled via a combination of cursor state and local variables. Clean up this code and prepare for further improvements to the near mode fallback algorithm by reusing the allocation cursor best extent tracking mechanism. Update the tracking logic to deactivate bnobt cursors when out of allocation range and replace open-coded extent checks to calls to the common helper. In doing so, rename some misnamed local variables in the top-level near mode allocation function. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 276 +++++++++++--------------------------- fs/xfs/xfs_trace.h | 4 +- 2 files changed, 79 insertions(+), 201 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 38183953636a..d412ae41676a 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -810,6 +810,7 @@ xfs_alloc_cur_check( bool busy; unsigned busy_gen = 0; bool deactivate = false; + bool isbnobt = cur->bc_btnum == XFS_BTNUM_BNO; *new = 0; @@ -823,7 +824,7 @@ xfs_alloc_cur_check( * range (i.e., walking backwards looking for a minlen extent). */ if (len < args->minlen) { - deactivate = true; + deactivate = !isbnobt; goto out; } @@ -833,8 +834,10 @@ xfs_alloc_cur_check( if (busy) acur->busy_gen = busy_gen; /* deactivate a bnobt cursor outside of locality range */ - if (bnoa < args->min_agbno || bnoa > args->max_agbno) + if (bnoa < args->min_agbno || bnoa > args->max_agbno) { + deactivate = isbnobt; goto out; + } if (lena < args->minlen) goto out; @@ -854,8 +857,14 @@ xfs_alloc_cur_check( bnoa, lena, &bnew); if (bnew == NULLAGBLOCK) goto out; - if (diff > acur->diff) + + /* + * Deactivate a bnobt cursor with worse locality than the current best. + */ + if (diff > acur->diff) { + deactivate = isbnobt; goto out; + } ASSERT(args->len > acur->len || (args->len == acur->len && diff <= acur->diff)); @@ -1163,96 +1172,44 @@ xfs_alloc_ag_vextent_exact( } /* - * Search the btree in a given direction via the search cursor and compare - * the records found against the good extent we've already found. + * Search the btree in a given direction and check the records against the good + * extent we've already found. */ STATIC int xfs_alloc_find_best_extent( - struct xfs_alloc_arg *args, /* allocation argument structure */ - struct xfs_btree_cur *gcur, /* good cursor */ - struct xfs_btree_cur *scur, /* searching cursor */ - xfs_agblock_t gdiff, /* difference for search comparison */ - xfs_agblock_t *sbno, /* extent found by search */ - xfs_extlen_t *slen, /* extent length */ - xfs_agblock_t *sbnoa, /* aligned extent found by search */ - xfs_extlen_t *slena, /* aligned extent length */ - int dir) /* 0 = search right, 1 = search left */ + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + struct xfs_btree_cur *cur, + bool increment) { - xfs_agblock_t new; - xfs_agblock_t sdiff; int error; int i; - unsigned busy_gen; - - /* The good extent is perfect, no need to search. */ - if (!gdiff) - goto out_use_good; /* - * Look until we find a better one, run out of space or run off the end. + * Search so long as the cursor is active or we find a better extent. + * The cursor is deactivated if it extends beyond the range of the + * current allocation candidate. */ - do { - error = xfs_alloc_get_rec(scur, sbno, slen, &i); + while (xfs_alloc_cur_active(cur)) { + error = xfs_alloc_cur_check(args, acur, cur, &i); if (error) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); - xfs_alloc_compute_aligned(args, *sbno, *slen, - sbnoa, slena, &busy_gen); - - /* - * The good extent is closer than this one. - */ - if (!dir) { - if (*sbnoa > args->max_agbno) - goto out_use_good; - if (*sbnoa >= args->agbno + gdiff) - goto out_use_good; - } else { - if (*sbnoa < args->min_agbno) - goto out_use_good; - if (*sbnoa <= args->agbno - gdiff) - goto out_use_good; - } - - /* - * Same distance, compare length and pick the best. - */ - if (*slena >= args->minlen) { - args->len = XFS_EXTLEN_MIN(*slena, args->maxlen); - xfs_alloc_fix_len(args); - - sdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, - args->datatype, *sbnoa, - *slena, &new); - - /* - * Choose closer size and invalidate other cursor. - */ - if (sdiff < gdiff) - goto out_use_search; - goto out_use_good; - } + return error; + if (i == 1) + break; + if (!xfs_alloc_cur_active(cur)) + break; - if (!dir) - error = xfs_btree_increment(scur, 0, &i); + if (increment) + error = xfs_btree_increment(cur, 0, &i); else - error = xfs_btree_decrement(scur, 0, &i); + error = xfs_btree_decrement(cur, 0, &i); if (error) - goto error0; - } while (i); - -out_use_good: - scur->bc_private.a.priv.abt.active = false; - return 0; + return error; + if (i == 0) + cur->bc_private.a.priv.abt.active = false; + } -out_use_search: - gcur->bc_private.a.priv.abt.active = false; return 0; - -error0: - /* caller invalidates cursors */ - return error; } /* @@ -1266,23 +1223,13 @@ xfs_alloc_ag_vextent_near( struct xfs_alloc_arg *args) { struct xfs_alloc_cur acur = {0,}; - struct xfs_btree_cur *bno_cur; - xfs_agblock_t gtbno; /* start bno of right side entry */ - xfs_agblock_t gtbnoa; /* aligned ... */ - xfs_extlen_t gtdiff; /* difference to right side entry */ - xfs_extlen_t gtlen; /* length of right side entry */ - xfs_extlen_t gtlena; /* aligned ... */ - xfs_agblock_t gtnew; /* useful start bno of right side */ - int error; /* error code */ - int i; /* result code, temporary */ - int j; /* result code, temporary */ - xfs_agblock_t ltbno; /* start bno of left side entry */ - xfs_agblock_t ltbnoa; /* aligned ... */ - xfs_extlen_t ltdiff; /* difference to left side entry */ - xfs_extlen_t ltlen; /* length of left side entry */ - xfs_extlen_t ltlena; /* aligned ... */ - xfs_agblock_t ltnew; /* useful start bno of left side */ - xfs_extlen_t rlen; /* length of returned extent */ + struct xfs_btree_cur *fbcur = NULL; + int error; /* error code */ + int i; /* result code, temporary */ + int j; /* result code, temporary */ + xfs_agblock_t bno; + xfs_extlen_t len; + bool fbinc = false; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1304,9 +1251,7 @@ xfs_alloc_ag_vextent_near( args->agbno = args->max_agbno; restart: - ltlen = 0; - gtlena = 0; - ltlena = 0; + len = 0; /* * Set up cursors and see if there are any free extents as big as @@ -1315,11 +1260,11 @@ xfs_alloc_ag_vextent_near( */ error = xfs_alloc_cur_setup(args, &acur); if (error == -ENOSPC) { - error = xfs_alloc_ag_vextent_small(args, acur.cnt, <bno, - <len, &i); + error = xfs_alloc_ag_vextent_small(args, acur.cnt, &bno, + &len, &i); if (error) goto out; - if (i == 0 || ltlen == 0) { + if (i == 0 || len == 0) { trace_xfs_alloc_near_noentry(args); goto out; } @@ -1350,21 +1295,21 @@ xfs_alloc_ag_vextent_near( * record smaller than maxlen, go to the start of this block, * and skip all those smaller than minlen. */ - if (ltlen || args->alignment > 1) { + if (len || args->alignment > 1) { acur.cnt->bc_ptrs[0] = 1; do { - error = xfs_alloc_get_rec(acur.cnt, <bno, - <len, &i); + error = xfs_alloc_get_rec(acur.cnt, &bno, &len, + &i); if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - if (ltlen >= args->minlen) + if (len >= args->minlen) break; error = xfs_btree_increment(acur.cnt, 0, &i); if (error) goto out; } while (i); - ASSERT(ltlen >= args->minlen); + ASSERT(len >= args->minlen); if (!i) break; } @@ -1433,77 +1378,43 @@ xfs_alloc_ag_vextent_near( */ do { if (xfs_alloc_cur_active(acur.bnolt)) { - error = xfs_alloc_get_rec(acur.bnolt, <bno, <len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.bnolt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy |= xfs_alloc_compute_aligned(args, ltbno, - ltlen, <bnoa, <lena, &acur.busy_gen); - if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur.bnogt; + fbinc = true; break; + } error = xfs_btree_decrement(acur.bnolt, 0, &i); if (error) goto out; - if (!i || ltbnoa < args->min_agbno) + if (!i) acur.bnolt->bc_private.a.priv.abt.active = false; } if (xfs_alloc_cur_active(acur.bnogt)) { - error = xfs_alloc_get_rec(acur.bnogt, >bno, >len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.bnogt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy |= xfs_alloc_compute_aligned(args, gtbno, - gtlen, >bnoa, >lena, &acur.busy_gen); - if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur.bnolt; + fbinc = false; break; + } error = xfs_btree_increment(acur.bnogt, 0, &i); if (error) goto out; - if (!i || gtbnoa > args->max_agbno) + if (!i) acur.bnogt->bc_private.a.priv.abt.active = false; } } while (xfs_alloc_cur_active(acur.bnolt) || xfs_alloc_cur_active(acur.bnogt)); - /* - * Got both cursors still active, need to find better entry. - */ - if (xfs_alloc_cur_active(acur.bnolt) && - xfs_alloc_cur_active(acur.bnogt)) { - if (ltlena >= args->minlen) { - /* - * Left side is good, look for a right side entry. - */ - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - ltdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, ltbnoa, - ltlena, <new); - - error = xfs_alloc_find_best_extent(args, - acur.bnolt, acur.bnogt, - ltdiff, >bno, >len, - >bnoa, >lena, - 0 /* search right */); - } else { - ASSERT(gtlena >= args->minlen); - - /* - * Right side is good, look for a left side entry. - */ - args->len = XFS_EXTLEN_MIN(gtlena, args->maxlen); - xfs_alloc_fix_len(args); - gtdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, gtbnoa, - gtlena, >new); - - error = xfs_alloc_find_best_extent(args, - acur.bnogt, acur.bnolt, - gtdiff, <bno, <len, - <bnoa, <lena, - 1 /* search left */); - } - + /* search the opposite direction for a better entry */ + if (fbcur) { + error = xfs_alloc_find_best_extent(args, &acur, fbcur, fbinc); if (error) goto out; } @@ -1511,8 +1422,7 @@ xfs_alloc_ag_vextent_near( /* * If we couldn't get anything, give up. */ - if (!xfs_alloc_cur_active(acur.bnolt) && - !xfs_alloc_cur_active(acur.bnogt)) { + if (!acur.len) { if (acur.busy) { trace_xfs_alloc_near_busy(args); xfs_extent_busy_flush(args->mp, args->pag, @@ -1524,47 +1434,15 @@ xfs_alloc_ag_vextent_near( goto out; } - /* - * At this point we have selected a freespace entry, either to the - * left or to the right. If it's on the right, copy all the - * useful variables to the "left" set so we only have one - * copy of this code. - */ - if (xfs_alloc_cur_active(acur.bnogt)) { - bno_cur = acur.bnogt; - ltbno = gtbno; - ltbnoa = gtbnoa; - ltlen = gtlen; - ltlena = gtlena; - j = 1; - } else { - bno_cur = acur.bnolt; - j = 0; - } - - /* - * Fix up the length and compute the useful address. - */ - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - rlen = args->len; - (void)xfs_alloc_compute_diff(args->agbno, rlen, args->alignment, - args->datatype, ltbnoa, ltlena, <new); - ASSERT(ltnew >= ltbno); - ASSERT(ltnew + rlen <= ltbnoa + ltlena); - ASSERT(ltnew + rlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - ASSERT(ltnew >= args->min_agbno && ltnew <= args->max_agbno); - args->agbno = ltnew; - - error = xfs_alloc_fixup_trees(acur.cnt, bno_cur, ltbno, ltlen, ltnew, - rlen, XFSA_FIXUP_BNO_OK); - if (error) - goto out; + args->agbno = acur.bno; + args->len = acur.len; + ASSERT(acur.bno >= acur.rec_bno); + ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); + ASSERT(acur.rec_bno + acur.rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - if (j) - trace_xfs_alloc_near_greater(args); - else - trace_xfs_alloc_near_lesser(args); + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, acur.rec_bno, + acur.rec_len, acur.bno, acur.len, 0); out: xfs_alloc_cur_close(&acur, error); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index cf618919cacf..ebe85e9ed7c8 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1642,8 +1642,8 @@ DEFINE_ALLOC_EVENT(xfs_alloc_exact_notfound); DEFINE_ALLOC_EVENT(xfs_alloc_exact_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_nominleft); DEFINE_ALLOC_EVENT(xfs_alloc_near_first); -DEFINE_ALLOC_EVENT(xfs_alloc_near_greater); -DEFINE_ALLOC_EVENT(xfs_alloc_near_lesser); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_noentry); DEFINE_ALLOC_EVENT(xfs_alloc_near_busy); From patchwork Fri Sep 27 17:17:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164875 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 17BD2112B for ; Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id EEC3721850 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727800AbfI0RSG (ORCPT ); Fri, 27 Sep 2019 13:18:06 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56940 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727235AbfI0RSF (ORCPT ); Fri, 27 Sep 2019 13:18:05 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CDD2010CC20B for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8B77D196B2 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 07/11] xfs: refactor allocation tree fixup code Date: Fri, 27 Sep 2019 13:17:58 -0400 Message-Id: <20190927171802.45582-8-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Fri, 27 Sep 2019 17:18:05 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Both algorithms duplicate the same btree allocation code. Eliminate the duplication and reuse the fallback algorithm codepath. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index d412ae41676a..32b378c8e16c 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1333,23 +1333,8 @@ xfs_alloc_ag_vextent_near( if (acur.len == 0) break; - /* - * Allocate at the bno/len tracked in the cursor. - */ - args->agbno = acur.bno; - args->len = acur.len; - ASSERT(acur.bno >= acur.rec_bno); - ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); - ASSERT(acur.rec_bno + acur.rec_len <= - be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, - acur.rec_bno, acur.rec_len, acur.bno, acur.len, - 0); - if (error) - goto out; trace_xfs_alloc_near_first(args); - goto out; + goto alloc; } /* @@ -1434,6 +1419,7 @@ xfs_alloc_ag_vextent_near( goto out; } +alloc: args->agbno = acur.bno; args->len = acur.len; ASSERT(acur.bno >= acur.rec_bno); From patchwork Fri Sep 27 17:17:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164879 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3ACEE1747 for ; Fri, 27 Sep 2019 17:18:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 20DD120872 for ; Fri, 27 Sep 2019 17:18:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727837AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56946 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726673AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4DAE710CC20D for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id EF19826330 for ; Fri, 27 Sep 2019 17:18:05 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 08/11] xfs: refactor and reuse best extent scanning logic Date: Fri, 27 Sep 2019 13:17:59 -0400 Message-Id: <20190927171802.45582-9-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The bnobt "find best" helper implements a simple btree walker function. This general pattern, or a subset thereof, is reused in various parts of a near mode allocation operation. For example, the bnobt left/right scans are each iterative btree walks along with the cntbt lastblock scan. Rework this function into a generic btree walker, add a couple parameters to control termination behavior from various contexts and reuse it where applicable. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 110 +++++++++++++++++++------------------- 1 file changed, 55 insertions(+), 55 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 32b378c8e16c..85e82e184ec9 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -875,6 +875,13 @@ xfs_alloc_cur_check( acur->diff = diff; *new = 1; + /* + * We're done if we found a perfect allocation. This only deactivates + * the current cursor, but this is just an optimization to terminate a + * cntbt search that otherwise runs to the edge of the tree. + */ + if (acur->diff == 0 && acur->len == args->maxlen) + deactivate = true; out: if (deactivate) cur->bc_private.a.priv.abt.active = false; @@ -1172,30 +1179,38 @@ xfs_alloc_ag_vextent_exact( } /* - * Search the btree in a given direction and check the records against the good - * extent we've already found. + * Search a given number of btree records in a given direction. Check each + * record against the good extent we've already found. */ STATIC int -xfs_alloc_find_best_extent( +xfs_alloc_walk_iter( struct xfs_alloc_arg *args, struct xfs_alloc_cur *acur, struct xfs_btree_cur *cur, - bool increment) + bool increment, + bool find_one, /* quit on first candidate */ + int count, /* rec count (-1 for infinite) */ + int *stat) { int error; int i; + *stat = 0; + /* * Search so long as the cursor is active or we find a better extent. * The cursor is deactivated if it extends beyond the range of the * current allocation candidate. */ - while (xfs_alloc_cur_active(cur)) { + while (xfs_alloc_cur_active(cur) && count) { error = xfs_alloc_cur_check(args, acur, cur, &i); if (error) return error; - if (i == 1) - break; + if (i == 1) { + *stat = 1; + if (find_one) + break; + } if (!xfs_alloc_cur_active(cur)) break; @@ -1207,6 +1222,9 @@ xfs_alloc_find_best_extent( return error; if (i == 0) cur->bc_private.a.priv.abt.active = false; + + if (count > 0) + count--; } return 0; @@ -1226,7 +1244,6 @@ xfs_alloc_ag_vextent_near( struct xfs_btree_cur *fbcur = NULL; int error; /* error code */ int i; /* result code, temporary */ - int j; /* result code, temporary */ xfs_agblock_t bno; xfs_extlen_t len; bool fbinc = false; @@ -1313,19 +1330,12 @@ xfs_alloc_ag_vextent_near( if (!i) break; } - i = acur.cnt->bc_ptrs[0]; - for (j = 1; - !error && j && xfs_alloc_cur_active(acur.cnt) && - (acur.len < args->maxlen || acur.diff > 0); - error = xfs_btree_increment(acur.cnt, 0, &j)) { - /* - * For each entry, decide if it's better than - * the previous best entry. - */ - error = xfs_alloc_cur_check(args, &acur, acur.cnt, &i); - if (error) - goto out; - } + + error = xfs_alloc_walk_iter(args, &acur, acur.cnt, true, false, + -1, &i); + if (error) + goto out; + /* * It didn't work. We COULD be in a case where * there's a good record somewhere, so try again. @@ -1357,49 +1367,39 @@ xfs_alloc_ag_vextent_near( goto out; /* - * Loop going left with the leftward cursor, right with the - * rightward cursor, until either both directions give up or - * we find an entry at least as big as minlen. + * Loop going left with the leftward cursor, right with the rightward + * cursor, until either both directions give up or we find an entry at + * least as big as minlen. */ do { - if (xfs_alloc_cur_active(acur.bnolt)) { - error = xfs_alloc_cur_check(args, &acur, acur.bnolt, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_left(args); - fbcur = acur.bnogt; - fbinc = true; - break; - } - error = xfs_btree_decrement(acur.bnolt, 0, &i); - if (error) - goto out; - if (!i) - acur.bnolt->bc_private.a.priv.abt.active = false; + error = xfs_alloc_walk_iter(args, &acur, acur.bnolt, false, + true, 1, &i); + if (error) + goto out; + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur.bnogt; + fbinc = true; + break; } - if (xfs_alloc_cur_active(acur.bnogt)) { - error = xfs_alloc_cur_check(args, &acur, acur.bnogt, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_right(args); - fbcur = acur.bnolt; - fbinc = false; - break; - } - error = xfs_btree_increment(acur.bnogt, 0, &i); - if (error) - goto out; - if (!i) - acur.bnogt->bc_private.a.priv.abt.active = false; + + error = xfs_alloc_walk_iter(args, &acur, acur.bnogt, true, true, + 1, &i); + if (error) + goto out; + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur.bnolt; + fbinc = false; + break; } } while (xfs_alloc_cur_active(acur.bnolt) || xfs_alloc_cur_active(acur.bnogt)); /* search the opposite direction for a better entry */ if (fbcur) { - error = xfs_alloc_find_best_extent(args, &acur, fbcur, fbinc); + error = xfs_alloc_walk_iter(args, &acur, fbcur, fbinc, true, -1, + &i); if (error) goto out; } From patchwork Fri Sep 27 17:18:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164877 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0AE2F112B for ; Fri, 27 Sep 2019 17:18:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E645A21850 for ; Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727824AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43910 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727235AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A00462A09D6 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5E16E26330 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 09/11] xfs: refactor near mode alloc bnobt scan into separate function Date: Fri, 27 Sep 2019 13:18:00 -0400 Message-Id: <20190927171802.45582-10-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 27 Sep 2019 17:18:06 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation to enhance the near mode allocation bnobt scan algorithm, lift it into a separate function. No functional changes. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 128 ++++++++++++++++++++++---------------- 1 file changed, 74 insertions(+), 54 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 85e82e184ec9..c1f59bfa8d09 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1230,6 +1230,78 @@ xfs_alloc_walk_iter( return 0; } +/* + * Search in the by-bno btree to the left and to the right simultaneously until + * in each case we find a large enough free extent or run into the edge of the + * tree. When we run into the edge of the tree, we deactivate that cursor. + */ +STATIC int +xfs_alloc_ag_vextent_bnobt( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + int *stat) +{ + struct xfs_btree_cur *fbcur = NULL; + int error; + int i; + bool fbinc; + + ASSERT(acur->len == 0); + ASSERT(args->type == XFS_ALLOCTYPE_NEAR_BNO); + + *stat = 0; + + error = xfs_alloc_lookup_le(acur->bnolt, args->agbno, 0, &i); + if (error) + return error; + error = xfs_alloc_lookup_ge(acur->bnogt, args->agbno, 0, &i); + if (error) + return error; + + /* + * Loop going left with the leftward cursor, right with the rightward + * cursor, until either both directions give up or we find an entry at + * least as big as minlen. + */ + while (xfs_alloc_cur_active(acur->bnolt) || + xfs_alloc_cur_active(acur->bnogt)) { + error = xfs_alloc_walk_iter(args, acur, acur->bnolt, false, + true, 1, &i); + if (error) + return error; + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur->bnogt; + fbinc = true; + break; + } + + error = xfs_alloc_walk_iter(args, acur, acur->bnogt, true, true, + 1, &i); + if (error) + return error; + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur->bnolt; + fbinc = false; + break; + } + } + + /* search the opposite direction for a better entry */ + if (fbcur) { + error = xfs_alloc_walk_iter(args, acur, fbcur, fbinc, true, -1, + &i); + if (error) + return error; + } + + if (acur->len) + *stat = 1; + + return 0; +} + /* * Allocate a variable extent near bno in the allocation group agno. * Extent's length (returned in len) will be between minlen and maxlen, @@ -1241,12 +1313,10 @@ xfs_alloc_ag_vextent_near( struct xfs_alloc_arg *args) { struct xfs_alloc_cur acur = {0,}; - struct xfs_btree_cur *fbcur = NULL; int error; /* error code */ int i; /* result code, temporary */ xfs_agblock_t bno; xfs_extlen_t len; - bool fbinc = false; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1348,62 +1418,12 @@ xfs_alloc_ag_vextent_near( } /* - * Second algorithm. - * Search in the by-bno tree to the left and to the right - * simultaneously, until in each case we find a space big enough, - * or run into the edge of the tree. When we run into the edge, - * we deallocate that cursor. - * If both searches succeed, we compare the two spaces and pick - * the better one. - * With alignment, it's possible for both to fail; the upper - * level algorithm that picks allocation groups for allocations - * is not supposed to do this. + * Second algorithm. Search the bnobt left and right. */ - error = xfs_alloc_lookup_le(acur.bnolt, args->agbno, 0, &i); - if (error) - goto out; - error = xfs_alloc_lookup_ge(acur.bnogt, args->agbno, 0, &i); + error = xfs_alloc_ag_vextent_bnobt(args, &acur, &i); if (error) goto out; - /* - * Loop going left with the leftward cursor, right with the rightward - * cursor, until either both directions give up or we find an entry at - * least as big as minlen. - */ - do { - error = xfs_alloc_walk_iter(args, &acur, acur.bnolt, false, - true, 1, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_left(args); - fbcur = acur.bnogt; - fbinc = true; - break; - } - - error = xfs_alloc_walk_iter(args, &acur, acur.bnogt, true, true, - 1, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_right(args); - fbcur = acur.bnolt; - fbinc = false; - break; - } - } while (xfs_alloc_cur_active(acur.bnolt) || - xfs_alloc_cur_active(acur.bnogt)); - - /* search the opposite direction for a better entry */ - if (fbcur) { - error = xfs_alloc_walk_iter(args, &acur, fbcur, fbinc, true, -1, - &i); - if (error) - goto out; - } - /* * If we couldn't get anything, give up. */ From patchwork Fri Sep 27 17:18:01 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164881 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A65B1599 for ; Fri, 27 Sep 2019 17:18:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4144C20872 for ; Fri, 27 Sep 2019 17:18:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726673AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:53116 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727517AbfI0RSH (ORCPT ); Fri, 27 Sep 2019 13:18:07 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 15C4F89AC5 for ; Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id C117926333 for ; Fri, 27 Sep 2019 17:18:06 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 10/11] xfs: factor out tree fixup logic into helper Date: Fri, 27 Sep 2019 13:18:01 -0400 Message-Id: <20190927171802.45582-11-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Lift the btree fixup path into a helper function. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 42 +++++++++++++++++++++++++++++---------- fs/xfs/xfs_trace.h | 1 + 2 files changed, 33 insertions(+), 10 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index c1f59bfa8d09..b16aa41507e8 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -890,6 +890,36 @@ xfs_alloc_cur_check( return 0; } +/* + * Complete an allocation of a candidate extent. Remove the extent from both + * trees and update the args structure. + */ +STATIC int +xfs_alloc_cur_finish( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + int error; + + ASSERT(acur->cnt && acur->bnolt); + ASSERT(acur->bno >= acur->rec_bno); + ASSERT(acur->bno + acur->len <= acur->rec_bno + acur->rec_len); + ASSERT(acur->rec_bno + acur->rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); + + error = xfs_alloc_fixup_trees(acur->cnt, acur->bnolt, acur->rec_bno, + acur->rec_len, acur->bno, acur->len, 0); + if (error) + return error; + + args->agbno = acur->bno; + args->len = acur->len; + args->wasfromfl = 0; + + trace_xfs_alloc_cur(args); + return 0; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1359,7 +1389,6 @@ xfs_alloc_ag_vextent_near( } else if (error) { goto out; } - args->wasfromfl = 0; /* * First algorithm. @@ -1440,15 +1469,8 @@ xfs_alloc_ag_vextent_near( } alloc: - args->agbno = acur.bno; - args->len = acur.len; - ASSERT(acur.bno >= acur.rec_bno); - ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); - ASSERT(acur.rec_bno + acur.rec_len <= - be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, acur.rec_bno, - acur.rec_len, acur.bno, acur.len, 0); + /* fix up btrees on a successful allocation */ + error = xfs_alloc_cur_finish(args, &acur); out: xfs_alloc_cur_close(&acur, error); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index ebe85e9ed7c8..21fa0e8822e3 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1642,6 +1642,7 @@ DEFINE_ALLOC_EVENT(xfs_alloc_exact_notfound); DEFINE_ALLOC_EVENT(xfs_alloc_exact_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_nominleft); DEFINE_ALLOC_EVENT(xfs_alloc_near_first); +DEFINE_ALLOC_EVENT(xfs_alloc_cur); DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); From patchwork Fri Sep 27 17:18:02 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11164883 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C13521599 for ; Fri, 27 Sep 2019 17:18:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9E9E721850 for ; Fri, 27 Sep 2019 17:18:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727517AbfI0RSJ (ORCPT ); Fri, 27 Sep 2019 13:18:09 -0400 Received: from mx1.redhat.com ([209.132.183.28]:29026 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727235AbfI0RSI (ORCPT ); Fri, 27 Sep 2019 13:18:08 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 87462191864A for ; Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 305702632F for ; Fri, 27 Sep 2019 17:18:07 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v5 11/11] xfs: optimize near mode bnobt scans with concurrent cntbt lookups Date: Fri, 27 Sep 2019 13:18:02 -0400 Message-Id: <20190927171802.45582-12-bfoster@redhat.com> In-Reply-To: <20190927171802.45582-1-bfoster@redhat.com> References: <20190927171802.45582-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.70]); Fri, 27 Sep 2019 17:18:07 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The near mode fallback algorithm consists of a left/right scan of the bnobt. This algorithm has very poor breakdown characteristics under worst case free space fragmentation conditions. If a suitable extent is far enough from the locality hint, each allocation may scan most or all of the bnobt before it completes. This causes pathological behavior and extremely high allocation latencies. While locality is important to near mode allocations, it is not so important as to incur pathological allocation latency to provide the asolute best available locality for every allocation. If the allocation is large enough or far enough away, there is a point of diminishing returns. As such, we can bound the overall operation by including an iterative cntbt lookup in the broader search. The cntbt lookup is optimized to immediately find the extent with best locality for the given size on each iteration. Since the cntbt is indexed by extent size, the lookup repeats with a variably aggressive increasing search key size until it runs off the edge of the tree. This approach provides a natural balance between the two algorithms for various situations. For example, the bnobt scan is able to satisfy smaller allocations such as for inode chunks or btree blocks more quickly where the cntbt search may have to search through a large set of extent sizes when the search key starts off small relative to the largest extent in the tree. On the other hand, the cntbt search more deterministically covers the set of suitable extents for larger data extent allocation requests that the bnobt scan may have to search the entire tree to locate. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 154 +++++++++++++++++++++++++++++++++++--- fs/xfs/xfs_trace.h | 2 + 2 files changed, 144 insertions(+), 12 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index b16aa41507e8..e9f74eb92073 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,7 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + xfs_extlen_t cur_len;/* current search length */ xfs_agblock_t rec_bno;/* extent startblock */ xfs_extlen_t rec_len;/* extent length */ xfs_agblock_t bno; /* alloc bno */ @@ -740,6 +741,7 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->cur_len = args->maxlen; acur->rec_bno = 0; acur->rec_len = 0; acur->bno = 0; @@ -920,6 +922,76 @@ xfs_alloc_cur_finish( return 0; } +/* + * Locality allocation lookup algorithm. This expects a cntbt cursor and uses + * bno optimized lookup to search for extents with ideal size and locality. + */ +STATIC int +xfs_alloc_cntbt_iter( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + struct xfs_btree_cur *cur = acur->cnt; + xfs_agblock_t bno; + xfs_extlen_t len, cur_len; + int error; + int i; + + if (!xfs_alloc_cur_active(cur)) + return 0; + + /* locality optimized lookup */ + cur_len = acur->cur_len; + error = xfs_alloc_lookup_ge(cur, args->agbno, cur_len, &i); + if (error) + return error; + if (i == 0) + return 0; + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (error) + return error; + + /* check the current record and update search length from it */ + error = xfs_alloc_cur_check(args, acur, cur, &i); + if (error) + return error; + ASSERT(len >= acur->cur_len); + acur->cur_len = len; + + /* + * We looked up the first record >= [agbno, len] above. The agbno is a + * secondary key and so the current record may lie just before or after + * agbno. If it is past agbno, check the previous record too so long as + * the length matches as it may be closer. Don't check a smaller record + * because that could deactivate our cursor. + */ + if (bno > args->agbno) { + error = xfs_btree_decrement(cur, 0, &i); + if (!error && i) { + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (!error && i && len == acur->cur_len) + error = xfs_alloc_cur_check(args, acur, cur, + &i); + } + if (error) + return error; + } + + /* + * Increment the search key until we find at least one allocation + * candidate or if the extent we found was larger. Otherwise, double the + * search key to optimize the search. Efficiency is more important here + * than absolute best locality. + */ + cur_len <<= 1; + if (!acur->len || acur->cur_len >= cur_len) + acur->cur_len++; + else + acur->cur_len = cur_len; + + return error; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1261,12 +1333,11 @@ xfs_alloc_walk_iter( } /* - * Search in the by-bno btree to the left and to the right simultaneously until - * in each case we find a large enough free extent or run into the edge of the - * tree. When we run into the edge of the tree, we deactivate that cursor. + * Search the by-bno and by-size btrees in parallel in search of an extent with + * ideal locality based on the NEAR mode ->agbno locality hint. */ STATIC int -xfs_alloc_ag_vextent_bnobt( +xfs_alloc_ag_vextent_locality( struct xfs_alloc_arg *args, struct xfs_alloc_cur *acur, int *stat) @@ -1281,6 +1352,9 @@ xfs_alloc_ag_vextent_bnobt( *stat = 0; + error = xfs_alloc_lookup_ge(acur->cnt, args->agbno, acur->cur_len, &i); + if (error) + return error; error = xfs_alloc_lookup_le(acur->bnolt, args->agbno, 0, &i); if (error) return error; @@ -1289,12 +1363,37 @@ xfs_alloc_ag_vextent_bnobt( return error; /* - * Loop going left with the leftward cursor, right with the rightward - * cursor, until either both directions give up or we find an entry at - * least as big as minlen. + * Search the bnobt and cntbt in parallel. Search the bnobt left and + * right and lookup the closest extent to the locality hint for each + * extent size key in the cntbt. The entire search terminates + * immediately on a bnobt hit because that means we've found best case + * locality. Otherwise the search continues until the cntbt cursor runs + * off the end of the tree. If no allocation candidate is found at this + * point, give up on locality, walk backwards from the end of the cntbt + * and take the first available extent. + * + * The parallel tree searches balance each other out to provide fairly + * consistent performance for various situations. The bnobt search can + * have pathological behavior in the worst case scenario of larger + * allocation requests and fragmented free space. On the other hand, the + * bnobt is able to satisfy most smaller allocation requests much more + * quickly than the cntbt. The cntbt search can sift through fragmented + * free space and sets of free extents for larger allocation requests + * more quickly than the bnobt. Since the locality hint is just a hint + * and we don't want to scan the entire bnobt for perfect locality, the + * cntbt search essentially bounds the bnobt search such that we can + * find good enough locality at reasonable performance in most cases. */ while (xfs_alloc_cur_active(acur->bnolt) || - xfs_alloc_cur_active(acur->bnogt)) { + xfs_alloc_cur_active(acur->bnogt) || + xfs_alloc_cur_active(acur->cnt)) { + + trace_xfs_alloc_cur_lookup(args); + + /* + * Search the bnobt left and right. In the case of a hit, finish + * the search in the opposite direction and we're done. + */ error = xfs_alloc_walk_iter(args, acur, acur->bnolt, false, true, 1, &i); if (error) @@ -1305,7 +1404,6 @@ xfs_alloc_ag_vextent_bnobt( fbinc = true; break; } - error = xfs_alloc_walk_iter(args, acur, acur->bnogt, true, true, 1, &i); if (error) @@ -1316,9 +1414,40 @@ xfs_alloc_ag_vextent_bnobt( fbinc = false; break; } + + /* + * Check the extent with best locality based on the current + * extent size search key and keep track of the best candidate. + */ + error = xfs_alloc_cntbt_iter(args, acur); + if (error) + return error; + if (!xfs_alloc_cur_active(acur->cnt)) { + trace_xfs_alloc_cur_lookup_done(args); + break; + } + } + + /* + * If we failed to find anything due to busy extents, return empty + * handed so the caller can flush and retry. If no busy extents were + * found, walk backwards from the end of the cntbt as a last resort. + */ + if (!xfs_alloc_cur_active(acur->cnt) && !acur->len && !acur->busy) { + error = xfs_btree_decrement(acur->cnt, 0, &i); + if (error) + return error; + if (i) { + acur->cnt->bc_private.a.priv.abt.active = true; + fbcur = acur->cnt; + fbinc = false; + } } - /* search the opposite direction for a better entry */ + /* + * Search in the opposite direction for a better entry in the case of + * a bnobt hit or walk backwards from the end of the cntbt. + */ if (fbcur) { error = xfs_alloc_walk_iter(args, acur, fbcur, fbinc, true, -1, &i); @@ -1447,9 +1576,10 @@ xfs_alloc_ag_vextent_near( } /* - * Second algorithm. Search the bnobt left and right. + * Second algorithm. Combined cntbt and bnobt search to find ideal + * locality. */ - error = xfs_alloc_ag_vextent_bnobt(args, &acur, &i); + error = xfs_alloc_ag_vextent_locality(args, &acur, &i); if (error) goto out; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 21fa0e8822e3..a07a5fe22218 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1645,6 +1645,8 @@ DEFINE_ALLOC_EVENT(xfs_alloc_near_first); DEFINE_ALLOC_EVENT(xfs_alloc_cur); DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_lookup); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_lookup_done); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_noentry); DEFINE_ALLOC_EVENT(xfs_alloc_near_busy);