From patchwork Mon Sep 16 12:16:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146927 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EAC0F1746 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D2AE4216C8 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726875AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48592 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731379AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9591F85A03 for ; Mon, 16 Sep 2019 12:16:36 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 54BD55C298 for ; Mon, 16 Sep 2019 12:16:36 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 01/11] xfs: track active state of allocation btree cursors Date: Mon, 16 Sep 2019 08:16:25 -0400 Message-Id: <20190916121635.43148-2-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 16 Sep 2019 12:16:36 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The upcoming allocation algorithm update searches multiple allocation btree cursors concurrently. As such, it requires an active state to track when a particular cursor should continue searching. While active state will be modified based on higher level logic, we can define base functionality based on the result of allocation btree lookups. Define an active flag in the private area of the btree cursor. Update it based on the result of lookups in the existing allocation btree helpers. Finally, provide a new helper to query the current state. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 24 +++++++++++++++++++++--- fs/xfs/libxfs/xfs_alloc_btree.c | 1 + fs/xfs/libxfs/xfs_btree.h | 3 +++ 3 files changed, 25 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 533b04aaf6f6..512a45888e06 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -146,9 +146,13 @@ xfs_alloc_lookup_eq( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; + cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_EQ, stat); + cur->bc_private.a.priv.abt.active = *stat == 1 ? true : false; + return error; } /* @@ -162,9 +166,13 @@ xfs_alloc_lookup_ge( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; + cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_GE, stat); + cur->bc_private.a.priv.abt.active = *stat == 1 ? true : false; + return error; } /* @@ -178,9 +186,19 @@ xfs_alloc_lookup_le( xfs_extlen_t len, /* length of extent */ int *stat) /* success/failure */ { + int error; cur->bc_rec.a.ar_startblock = bno; cur->bc_rec.a.ar_blockcount = len; - return xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + error = xfs_btree_lookup(cur, XFS_LOOKUP_LE, stat); + cur->bc_private.a.priv.abt.active = *stat == 1 ? true : false; + return error; +} + +static inline bool +xfs_alloc_cur_active( + struct xfs_btree_cur *cur) +{ + return cur && cur->bc_private.a.priv.abt.active; } /* diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c index 2a94543857a1..279694d73e4e 100644 --- a/fs/xfs/libxfs/xfs_alloc_btree.c +++ b/fs/xfs/libxfs/xfs_alloc_btree.c @@ -507,6 +507,7 @@ xfs_allocbt_init_cursor( cur->bc_private.a.agbp = agbp; cur->bc_private.a.agno = agno; + cur->bc_private.a.priv.abt.active = false; if (xfs_sb_version_hascrc(&mp->m_sb)) cur->bc_flags |= XFS_BTREE_CRC_BLOCKS; diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index ced1e65d1483..b4e3ec1d7ff9 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -183,6 +183,9 @@ union xfs_btree_cur_private { unsigned long nr_ops; /* # record updates */ int shape_changes; /* # of extent splits */ } refc; + struct { + bool active; /* allocation cursor state */ + } abt; }; /* From patchwork Mon Sep 16 12:16:26 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6180C14DB for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3697B216C8 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732341AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52018 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731514AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 17693898103 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id B80F65C1D6 for ; Mon, 16 Sep 2019 12:16:36 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 02/11] xfs: introduce allocation cursor data structure Date: Mon, 16 Sep 2019 08:16:26 -0400 Message-Id: <20190916121635.43148-3-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.67]); Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Introduce a new allocation cursor data structure to encapsulate the various states and structures used to perform an extent allocation. This structure will eventually be used to track overall allocation state across different search algorithms on both free space btrees. To start, include the three btree cursors (one for the cntbt and two for the bnobt left/right search) used by the near mode allocation algorithm and refactor the cursor setup and teardown code into helpers. This slightly changes cursor memory allocation patterns, but otherwise makes no functional changes to the allocation algorithm. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 318 +++++++++++++++++++------------------- 1 file changed, 163 insertions(+), 155 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 512a45888e06..d159377ed603 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -710,8 +710,71 @@ xfs_alloc_update_counters( } /* - * Allocation group level functions. + * Block allocation algorithm and data structures. */ +struct xfs_alloc_cur { + struct xfs_btree_cur *cnt; /* btree cursors */ + struct xfs_btree_cur *bnolt; + struct xfs_btree_cur *bnogt; +}; + +/* + * Set up cursors, etc. in the extent allocation cursor. This function can be + * called multiple times to reset an initialized structure without having to + * reallocate cursors. + */ +static int +xfs_alloc_cur_setup( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + int error; + int i; + + ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + + /* + * Perform an initial cntbt lookup to check for availability of maxlen + * extents. If this fails, we'll return -ENOSPC to signal the caller to + * attempt a small allocation. + */ + if (!acur->cnt) + acur->cnt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_CNT); + error = xfs_alloc_lookup_ge(acur->cnt, 0, args->maxlen, &i); + if (error) + return error; + + /* + * Allocate the bnobt left and right search cursors. + */ + if (!acur->bnolt) + acur->bnolt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_BNO); + if (!acur->bnogt) + acur->bnogt = xfs_allocbt_init_cursor(args->mp, args->tp, + args->agbp, args->agno, XFS_BTNUM_BNO); + return i == 1 ? 0 : -ENOSPC; +} + +static void +xfs_alloc_cur_close( + struct xfs_alloc_cur *acur, + bool error) +{ + int cur_error = XFS_BTREE_NOERROR; + + if (error) + cur_error = XFS_BTREE_ERROR; + + if (acur->cnt) + xfs_btree_del_cursor(acur->cnt, cur_error); + if (acur->bnolt) + xfs_btree_del_cursor(acur->bnolt, cur_error); + if (acur->bnogt) + xfs_btree_del_cursor(acur->bnogt, cur_error); + acur->cnt = acur->bnolt = acur->bnogt = NULL; +} /* * Deal with the case where only small freespaces remain. Either return the @@ -1008,8 +1071,8 @@ xfs_alloc_ag_vextent_exact( STATIC int xfs_alloc_find_best_extent( struct xfs_alloc_arg *args, /* allocation argument structure */ - struct xfs_btree_cur **gcur, /* good cursor */ - struct xfs_btree_cur **scur, /* searching cursor */ + struct xfs_btree_cur *gcur, /* good cursor */ + struct xfs_btree_cur *scur, /* searching cursor */ xfs_agblock_t gdiff, /* difference for search comparison */ xfs_agblock_t *sbno, /* extent found by search */ xfs_extlen_t *slen, /* extent length */ @@ -1031,7 +1094,7 @@ xfs_alloc_find_best_extent( * Look until we find a better one, run out of space or run off the end. */ do { - error = xfs_alloc_get_rec(*scur, sbno, slen, &i); + error = xfs_alloc_get_rec(scur, sbno, slen, &i); if (error) goto error0; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); @@ -1074,21 +1137,19 @@ xfs_alloc_find_best_extent( } if (!dir) - error = xfs_btree_increment(*scur, 0, &i); + error = xfs_btree_increment(scur, 0, &i); else - error = xfs_btree_decrement(*scur, 0, &i); + error = xfs_btree_decrement(scur, 0, &i); if (error) goto error0; } while (i); out_use_good: - xfs_btree_del_cursor(*scur, XFS_BTREE_NOERROR); - *scur = NULL; + scur->bc_private.a.priv.abt.active = false; return 0; out_use_search: - xfs_btree_del_cursor(*gcur, XFS_BTREE_NOERROR); - *gcur = NULL; + gcur->bc_private.a.priv.abt.active = false; return 0; error0: @@ -1102,13 +1163,12 @@ xfs_alloc_find_best_extent( * and of the form k * prod + mod unless there's nothing that large. * Return the starting a.g. block, or NULLAGBLOCK if we can't do it. */ -STATIC int /* error */ +STATIC int xfs_alloc_ag_vextent_near( - xfs_alloc_arg_t *args) /* allocation argument structure */ + struct xfs_alloc_arg *args) { - xfs_btree_cur_t *bno_cur_gt; /* cursor for bno btree, right side */ - xfs_btree_cur_t *bno_cur_lt; /* cursor for bno btree, left side */ - xfs_btree_cur_t *cnt_cur; /* cursor for count btree */ + struct xfs_alloc_cur acur = {0,}; + struct xfs_btree_cur *bno_cur; xfs_agblock_t gtbno; /* start bno of right side entry */ xfs_agblock_t gtbnoa; /* aligned ... */ xfs_extlen_t gtdiff; /* difference to right side entry */ @@ -1148,38 +1208,29 @@ xfs_alloc_ag_vextent_near( args->agbno = args->max_agbno; restart: - bno_cur_lt = NULL; - bno_cur_gt = NULL; ltlen = 0; gtlena = 0; ltlena = 0; busy = false; /* - * Get a cursor for the by-size btree. + * Set up cursors and see if there are any free extents as big as + * maxlen. If not, pick the last entry in the tree unless the tree is + * empty. */ - cnt_cur = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_CNT); - - /* - * See if there are any free extents as big as maxlen. - */ - if ((error = xfs_alloc_lookup_ge(cnt_cur, 0, args->maxlen, &i))) - goto error0; - /* - * If none, then pick up the last entry in the tree unless the - * tree is empty. - */ - if (!i) { - if ((error = xfs_alloc_ag_vextent_small(args, cnt_cur, <bno, - <len, &i))) - goto error0; + error = xfs_alloc_cur_setup(args, &acur); + if (error == -ENOSPC) { + error = xfs_alloc_ag_vextent_small(args, acur.cnt, <bno, + <len, &i); + if (error) + goto out; if (i == 0 || ltlen == 0) { - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); trace_xfs_alloc_near_noentry(args); - return 0; + goto out; } ASSERT(i == 1); + } else if (error) { + goto out; } args->wasfromfl = 0; @@ -1193,7 +1244,7 @@ xfs_alloc_ag_vextent_near( * This is written as a while loop so we can break out of it, * but we never loop back to the top. */ - while (xfs_btree_islastblock(cnt_cur, 0)) { + while (xfs_btree_islastblock(acur.cnt, 0)) { xfs_extlen_t bdiff; int besti=0; xfs_extlen_t blen=0; @@ -1210,32 +1261,35 @@ xfs_alloc_ag_vextent_near( * and skip all those smaller than minlen. */ if (ltlen || args->alignment > 1) { - cnt_cur->bc_ptrs[0] = 1; + acur.cnt->bc_ptrs[0] = 1; do { - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, - <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + error = xfs_alloc_get_rec(acur.cnt, <bno, + <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); if (ltlen >= args->minlen) break; - if ((error = xfs_btree_increment(cnt_cur, 0, &i))) - goto error0; + error = xfs_btree_increment(acur.cnt, 0, &i); + if (error) + goto out; } while (i); ASSERT(ltlen >= args->minlen); if (!i) break; } - i = cnt_cur->bc_ptrs[0]; + i = acur.cnt->bc_ptrs[0]; for (j = 1, blen = 0, bdiff = 0; !error && j && (blen < args->maxlen || bdiff > 0); - error = xfs_btree_increment(cnt_cur, 0, &j)) { + error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. */ - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, <bnoa, <lena, &busy_gen); if (ltlena < args->minlen) @@ -1255,7 +1309,7 @@ xfs_alloc_ag_vextent_near( bdiff = ltdiff; bnew = ltnew; blen = args->len; - besti = cnt_cur->bc_ptrs[0]; + besti = acur.cnt->bc_ptrs[0]; } } /* @@ -1267,10 +1321,11 @@ xfs_alloc_ag_vextent_near( /* * Point at the best entry, and retrieve it again. */ - cnt_cur->bc_ptrs[0] = besti; - if ((error = xfs_alloc_get_rec(cnt_cur, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + acur.cnt->bc_ptrs[0] = besti; + error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); ASSERT(ltbno + ltlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); args->len = blen; @@ -1280,23 +1335,14 @@ xfs_alloc_ag_vextent_near( args->agbno = bnew; ASSERT(bnew >= ltbno); ASSERT(bnew + blen <= ltbno + ltlen); - /* - * Set up a cursor for the by-bno tree. - */ - bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, - args->agbp, args->agno, XFS_BTNUM_BNO); - /* - * Fix up the btree entries. - */ - if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur_lt, ltbno, - ltlen, bnew, blen, XFSA_FIXUP_CNT_OK))) - goto error0; - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_NOERROR); - + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, ltbno, + ltlen, bnew, blen, XFSA_FIXUP_CNT_OK); + if (error) + goto out; trace_xfs_alloc_near_first(args); - return 0; + goto out; } + /* * Second algorithm. * Search in the by-bno tree to the left and to the right @@ -1309,86 +1355,57 @@ xfs_alloc_ag_vextent_near( * level algorithm that picks allocation groups for allocations * is not supposed to do this. */ - /* - * Allocate and initialize the cursor for the leftward search. - */ - bno_cur_lt = xfs_allocbt_init_cursor(args->mp, args->tp, args->agbp, - args->agno, XFS_BTNUM_BNO); - /* - * Lookup <= bno to find the leftward search's starting point. - */ - if ((error = xfs_alloc_lookup_le(bno_cur_lt, args->agbno, args->maxlen, &i))) - goto error0; - if (!i) { - /* - * Didn't find anything; use this cursor for the rightward - * search. - */ - bno_cur_gt = bno_cur_lt; - bno_cur_lt = NULL; - } - /* - * Found something. Duplicate the cursor for the rightward search. - */ - else if ((error = xfs_btree_dup_cursor(bno_cur_lt, &bno_cur_gt))) - goto error0; - /* - * Increment the cursor, so we will point at the entry just right - * of the leftward entry if any, or to the leftmost entry. - */ - if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) - goto error0; - if (!i) { - /* - * It failed, there are no rightward entries. - */ - xfs_btree_del_cursor(bno_cur_gt, XFS_BTREE_NOERROR); - bno_cur_gt = NULL; - } + error = xfs_alloc_lookup_le(acur.bnolt, args->agbno, 0, &i); + if (error) + goto out; + error = xfs_alloc_lookup_ge(acur.bnogt, args->agbno, 0, &i); + if (error) + goto out; + /* * Loop going left with the leftward cursor, right with the * rightward cursor, until either both directions give up or * we find an entry at least as big as minlen. */ do { - if (bno_cur_lt) { - if ((error = xfs_alloc_get_rec(bno_cur_lt, <bno, <len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + if (xfs_alloc_cur_active(acur.bnolt)) { + error = xfs_alloc_get_rec(acur.bnolt, <bno, <len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy |= xfs_alloc_compute_aligned(args, ltbno, ltlen, <bnoa, <lena, &busy_gen); if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) break; - if ((error = xfs_btree_decrement(bno_cur_lt, 0, &i))) - goto error0; - if (!i || ltbnoa < args->min_agbno) { - xfs_btree_del_cursor(bno_cur_lt, - XFS_BTREE_NOERROR); - bno_cur_lt = NULL; - } + error = xfs_btree_decrement(acur.bnolt, 0, &i); + if (error) + goto out; + if (!i || ltbnoa < args->min_agbno) + acur.bnolt->bc_private.a.priv.abt.active = false; } - if (bno_cur_gt) { - if ((error = xfs_alloc_get_rec(bno_cur_gt, >bno, >len, &i))) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); + if (xfs_alloc_cur_active(acur.bnogt)) { + error = xfs_alloc_get_rec(acur.bnogt, >bno, >len, &i); + if (error) + goto out; + XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); busy |= xfs_alloc_compute_aligned(args, gtbno, gtlen, >bnoa, >lena, &busy_gen); if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) break; - if ((error = xfs_btree_increment(bno_cur_gt, 0, &i))) - goto error0; - if (!i || gtbnoa > args->max_agbno) { - xfs_btree_del_cursor(bno_cur_gt, - XFS_BTREE_NOERROR); - bno_cur_gt = NULL; - } + error = xfs_btree_increment(acur.bnogt, 0, &i); + if (error) + goto out; + if (!i || gtbnoa > args->max_agbno) + acur.bnogt->bc_private.a.priv.abt.active = false; } - } while (bno_cur_lt || bno_cur_gt); + } while (xfs_alloc_cur_active(acur.bnolt) || + xfs_alloc_cur_active(acur.bnogt)); /* * Got both cursors still active, need to find better entry. */ - if (bno_cur_lt && bno_cur_gt) { + if (xfs_alloc_cur_active(acur.bnolt) && + xfs_alloc_cur_active(acur.bnogt)) { if (ltlena >= args->minlen) { /* * Left side is good, look for a right side entry. @@ -1400,7 +1417,7 @@ xfs_alloc_ag_vextent_near( ltlena, <new); error = xfs_alloc_find_best_extent(args, - &bno_cur_lt, &bno_cur_gt, + acur.bnolt, acur.bnogt, ltdiff, >bno, >len, >bnoa, >lena, 0 /* search right */); @@ -1417,22 +1434,21 @@ xfs_alloc_ag_vextent_near( gtlena, >new); error = xfs_alloc_find_best_extent(args, - &bno_cur_gt, &bno_cur_lt, + acur.bnogt, acur.bnolt, gtdiff, <bno, <len, <bnoa, <lena, 1 /* search left */); } if (error) - goto error0; + goto out; } /* * If we couldn't get anything, give up. */ - if (bno_cur_lt == NULL && bno_cur_gt == NULL) { - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - + if (!xfs_alloc_cur_active(acur.bnolt) && + !xfs_alloc_cur_active(acur.bnogt)) { if (busy) { trace_xfs_alloc_near_busy(args); xfs_extent_busy_flush(args->mp, args->pag, busy_gen); @@ -1440,7 +1456,7 @@ xfs_alloc_ag_vextent_near( } trace_xfs_alloc_size_neither(args); args->agbno = NULLAGBLOCK; - return 0; + goto out; } /* @@ -1449,16 +1465,17 @@ xfs_alloc_ag_vextent_near( * useful variables to the "left" set so we only have one * copy of this code. */ - if (bno_cur_gt) { - bno_cur_lt = bno_cur_gt; - bno_cur_gt = NULL; + if (xfs_alloc_cur_active(acur.bnogt)) { + bno_cur = acur.bnogt; ltbno = gtbno; ltbnoa = gtbnoa; ltlen = gtlen; ltlena = gtlena; j = 1; - } else + } else { + bno_cur = acur.bnolt; j = 0; + } /* * Fix up the length and compute the useful address. @@ -1474,27 +1491,18 @@ xfs_alloc_ag_vextent_near( ASSERT(ltnew >= args->min_agbno && ltnew <= args->max_agbno); args->agbno = ltnew; - if ((error = xfs_alloc_fixup_trees(cnt_cur, bno_cur_lt, ltbno, ltlen, - ltnew, rlen, XFSA_FIXUP_BNO_OK))) - goto error0; + error = xfs_alloc_fixup_trees(acur.cnt, bno_cur, ltbno, ltlen, ltnew, + rlen, XFSA_FIXUP_BNO_OK); + if (error) + goto out; if (j) trace_xfs_alloc_near_greater(args); else trace_xfs_alloc_near_lesser(args); - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR); - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_NOERROR); - return 0; - - error0: - trace_xfs_alloc_near_error(args); - if (cnt_cur != NULL) - xfs_btree_del_cursor(cnt_cur, XFS_BTREE_ERROR); - if (bno_cur_lt != NULL) - xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_ERROR); - if (bno_cur_gt != NULL) - xfs_btree_del_cursor(bno_cur_gt, XFS_BTREE_ERROR); +out: + xfs_alloc_cur_close(&acur, error); return error; } From patchwork Mon Sep 16 12:16:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 84B361745 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6BC1D21670 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732344AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47690 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731379AbfIPMQh (ORCPT ); Mon, 16 Sep 2019 08:16:37 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7C79630A76A9 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3A49E5C1D6 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 03/11] xfs: track allocation busy state in allocation cursor Date: Mon, 16 Sep 2019 08:16:27 -0400 Message-Id: <20190916121635.43148-4-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.47]); Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Extend the allocation cursor to track extent busy state for an allocation attempt. No functional changes. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index d159377ed603..5c34d4c41761 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,8 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + unsigned busy_gen;/* busy state */ + bool busy; }; /* @@ -733,6 +735,9 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->busy = false; + acur->busy_gen = 0; + /* * Perform an initial cntbt lookup to check for availability of maxlen * extents. If this fails, we'll return -ENOSPC to signal the caller to @@ -1185,8 +1190,6 @@ xfs_alloc_ag_vextent_near( xfs_extlen_t ltlena; /* aligned ... */ xfs_agblock_t ltnew; /* useful start bno of left side */ xfs_extlen_t rlen; /* length of returned extent */ - bool busy; - unsigned busy_gen; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1211,7 +1214,6 @@ xfs_alloc_ag_vextent_near( ltlen = 0; gtlena = 0; ltlena = 0; - busy = false; /* * Set up cursors and see if there are any free extents as big as @@ -1290,8 +1292,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &busy_gen); + acur.busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, + <bnoa, <lena, &acur.busy_gen); if (ltlena < args->minlen) continue; if (ltbnoa < args->min_agbno || ltbnoa > args->max_agbno) @@ -1373,8 +1375,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy |= xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &busy_gen); + acur.busy |= xfs_alloc_compute_aligned(args, ltbno, + ltlen, <bnoa, <lena, &acur.busy_gen); if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) break; error = xfs_btree_decrement(acur.bnolt, 0, &i); @@ -1388,8 +1390,8 @@ xfs_alloc_ag_vextent_near( if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - busy |= xfs_alloc_compute_aligned(args, gtbno, gtlen, - >bnoa, >lena, &busy_gen); + acur.busy |= xfs_alloc_compute_aligned(args, gtbno, + gtlen, >bnoa, >lena, &acur.busy_gen); if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) break; error = xfs_btree_increment(acur.bnogt, 0, &i); @@ -1449,9 +1451,10 @@ xfs_alloc_ag_vextent_near( */ if (!xfs_alloc_cur_active(acur.bnolt) && !xfs_alloc_cur_active(acur.bnogt)) { - if (busy) { + if (acur.busy) { trace_xfs_alloc_near_busy(args); - xfs_extent_busy_flush(args->mp, args->pag, busy_gen); + xfs_extent_busy_flush(args->mp, args->pag, + acur.busy_gen); goto restart; } trace_xfs_alloc_size_neither(args); From patchwork Mon Sep 16 12:16:28 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146933 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DC7C01746 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BB83A214DE for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732370AbfIPMQi (ORCPT ); Mon, 16 Sep 2019 08:16:38 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55242 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731514AbfIPMQi (ORCPT ); Mon, 16 Sep 2019 08:16:38 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id DFFEC3084031 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9DBB35C1D6 for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 04/11] xfs: track best extent from cntbt lastblock scan in alloc cursor Date: Mon, 16 Sep 2019 08:16:28 -0400 Message-Id: <20190916121635.43148-5-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.40]); Mon, 16 Sep 2019 12:16:37 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org If the size lookup lands in the last block of the by-size btree, the near mode algorithm scans the entire block for the extent with best available locality. In preparation for similar best available extent tracking across both btrees, extend the allocation cursor with best extent data and lift the associated state from the cntbt last block scan code. No functional changes. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 63 ++++++++++++++++++++------------------- 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 5c34d4c41761..ee46989ab723 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,11 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + xfs_agblock_t rec_bno;/* extent startblock */ + xfs_extlen_t rec_len;/* extent length */ + xfs_agblock_t bno; /* alloc bno */ + xfs_extlen_t len; /* alloc len */ + xfs_extlen_t diff; /* diff from search bno */ unsigned busy_gen;/* busy state */ bool busy; }; @@ -735,6 +740,11 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->rec_bno = 0; + acur->rec_len = 0; + acur->bno = 0; + acur->len = 0; + acur->diff = -1; acur->busy = false; acur->busy_gen = 0; @@ -1247,10 +1257,7 @@ xfs_alloc_ag_vextent_near( * but we never loop back to the top. */ while (xfs_btree_islastblock(acur.cnt, 0)) { - xfs_extlen_t bdiff; - int besti=0; - xfs_extlen_t blen=0; - xfs_agblock_t bnew=0; + xfs_extlen_t diff; #ifdef DEBUG if (dofirst) @@ -1281,8 +1288,8 @@ xfs_alloc_ag_vextent_near( break; } i = acur.cnt->bc_ptrs[0]; - for (j = 1, blen = 0, bdiff = 0; - !error && j && (blen < args->maxlen || bdiff > 0); + for (j = 1; + !error && j && (acur.len < args->maxlen || acur.diff > 0); error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than @@ -1301,44 +1308,40 @@ xfs_alloc_ag_vextent_near( args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); xfs_alloc_fix_len(args); ASSERT(args->len >= args->minlen); - if (args->len < blen) + if (args->len < acur.len) continue; - ltdiff = xfs_alloc_compute_diff(args->agbno, args->len, + diff = xfs_alloc_compute_diff(args->agbno, args->len, args->alignment, args->datatype, ltbnoa, ltlena, <new); if (ltnew != NULLAGBLOCK && - (args->len > blen || ltdiff < bdiff)) { - bdiff = ltdiff; - bnew = ltnew; - blen = args->len; - besti = acur.cnt->bc_ptrs[0]; + (args->len > acur.len || diff < acur.diff)) { + acur.rec_bno = ltbno; + acur.rec_len = ltlen; + acur.diff = diff; + acur.bno = ltnew; + acur.len = args->len; } } /* * It didn't work. We COULD be in a case where * there's a good record somewhere, so try again. */ - if (blen == 0) + if (acur.len == 0) break; - /* - * Point at the best entry, and retrieve it again. - */ - acur.cnt->bc_ptrs[0] = besti; - error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); - if (error) - goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - ASSERT(ltbno + ltlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - args->len = blen; /* - * We are allocating starting at bnew for blen blocks. + * Allocate at the bno/len tracked in the cursor. */ - args->agbno = bnew; - ASSERT(bnew >= ltbno); - ASSERT(bnew + blen <= ltbno + ltlen); - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, ltbno, - ltlen, bnew, blen, XFSA_FIXUP_CNT_OK); + args->agbno = acur.bno; + args->len = acur.len; + ASSERT(acur.bno >= acur.rec_bno); + ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); + ASSERT(acur.rec_bno + acur.rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); + + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, + acur.rec_bno, acur.rec_len, acur.bno, acur.len, + 0); if (error) goto out; trace_xfs_alloc_near_first(args); From patchwork Mon Sep 16 12:16:29 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146935 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 02E7B14DB for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id DFAF421670 for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732408AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:48596 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731379AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4EE4E89AC6 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0EB2F5C21A for ; Mon, 16 Sep 2019 12:16:37 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 05/11] xfs: refactor cntbt lastblock scan best extent logic into helper Date: Mon, 16 Sep 2019 08:16:29 -0400 Message-Id: <20190916121635.43148-6-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The cntbt lastblock scan checks the size, alignment, locality, etc. of each free extent in the block and compares it with the current best candidate. This logic will be reused by the upcoming optimized cntbt algorithm, so refactor it into a separate helper. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 113 +++++++++++++++++++++++++++++--------- fs/xfs/xfs_trace.h | 25 +++++++++ 2 files changed, 111 insertions(+), 27 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index ee46989ab723..2fa7bb6a00a8 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -791,6 +791,89 @@ xfs_alloc_cur_close( acur->cnt = acur->bnolt = acur->bnogt = NULL; } +/* + * Check an extent for allocation and track the best available candidate in the + * allocation structure. The cursor is deactivated if it has entered an out of + * range state based on allocation arguments. Optionally return the extent + * extent geometry and allocation status if requested by the caller. + */ +static int +xfs_alloc_cur_check( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + struct xfs_btree_cur *cur, + int *new) +{ + int error, i; + xfs_agblock_t bno, bnoa, bnew; + xfs_extlen_t len, lena, diff = -1; + bool busy; + unsigned busy_gen = 0; + bool deactivate = false; + + *new = 0; + + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (error) + return error; + XFS_WANT_CORRUPTED_RETURN(args->mp, i == 1); + + /* + * Check minlen and deactivate a cntbt cursor if out of acceptable size + * range (i.e., walking backwards looking for a minlen extent). + */ + if (len < args->minlen) { + deactivate = true; + goto out; + } + + busy = xfs_alloc_compute_aligned(args, bno, len, &bnoa, &lena, + &busy_gen); + acur->busy |= busy; + if (busy) + acur->busy_gen = busy_gen; + /* deactivate a bnobt cursor outside of locality range */ + if (bnoa < args->min_agbno || bnoa > args->max_agbno) + goto out; + if (lena < args->minlen) + goto out; + + args->len = XFS_EXTLEN_MIN(lena, args->maxlen); + xfs_alloc_fix_len(args); + ASSERT(args->len >= args->minlen); + if (args->len < acur->len) + goto out; + + /* + * We have an aligned record that satisfies minlen and beats or matches + * the candidate extent size. Compare locality for near allocation mode. + */ + ASSERT(args->type == XFS_ALLOCTYPE_NEAR_BNO); + diff = xfs_alloc_compute_diff(args->agbno, args->len, + args->alignment, args->datatype, + bnoa, lena, &bnew); + if (bnew == NULLAGBLOCK) + goto out; + if (diff > acur->diff) + goto out; + + ASSERT(args->len > acur->len || + (args->len == acur->len && diff <= acur->diff)); + acur->rec_bno = bno; + acur->rec_len = len; + acur->bno = bnew; + acur->len = args->len; + acur->diff = diff; + *new = 1; + +out: + if (deactivate) + cur->bc_private.a.priv.abt.active = false; + trace_xfs_alloc_cur_check(args->mp, cur->bc_btnum, bno, len, diff, + *new); + return 0; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1257,8 +1340,6 @@ xfs_alloc_ag_vextent_near( * but we never loop back to the top. */ while (xfs_btree_islastblock(acur.cnt, 0)) { - xfs_extlen_t diff; - #ifdef DEBUG if (dofirst) break; @@ -1289,38 +1370,16 @@ xfs_alloc_ag_vextent_near( } i = acur.cnt->bc_ptrs[0]; for (j = 1; - !error && j && (acur.len < args->maxlen || acur.diff > 0); + !error && j && xfs_alloc_cur_active(acur.cnt) && + (acur.len < args->maxlen || acur.diff > 0); error = xfs_btree_increment(acur.cnt, 0, &j)) { /* * For each entry, decide if it's better than * the previous best entry. */ - error = xfs_alloc_get_rec(acur.cnt, <bno, <len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.cnt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy = xfs_alloc_compute_aligned(args, ltbno, ltlen, - <bnoa, <lena, &acur.busy_gen); - if (ltlena < args->minlen) - continue; - if (ltbnoa < args->min_agbno || ltbnoa > args->max_agbno) - continue; - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - ASSERT(args->len >= args->minlen); - if (args->len < acur.len) - continue; - diff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, ltbnoa, - ltlena, <new); - if (ltnew != NULLAGBLOCK && - (args->len > acur.len || diff < acur.diff)) { - acur.rec_bno = ltbno; - acur.rec_len = ltlen; - acur.diff = diff; - acur.bno = ltnew; - acur.len = args->len; - } } /* * It didn't work. We COULD be in a case where diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index eaae275ed430..b12fad3e45cb 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1663,6 +1663,31 @@ DEFINE_ALLOC_EVENT(xfs_alloc_vextent_noagbp); DEFINE_ALLOC_EVENT(xfs_alloc_vextent_loopfailed); DEFINE_ALLOC_EVENT(xfs_alloc_vextent_allfailed); +TRACE_EVENT(xfs_alloc_cur_check, + TP_PROTO(struct xfs_mount *mp, xfs_btnum_t btnum, xfs_agblock_t bno, + xfs_extlen_t len, xfs_extlen_t diff, bool new), + TP_ARGS(mp, btnum, bno, len, diff, new), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_btnum_t, btnum) + __field(xfs_agblock_t, bno) + __field(xfs_extlen_t, len) + __field(xfs_extlen_t, diff) + __field(bool, new) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->btnum = btnum; + __entry->bno = bno; + __entry->len = len; + __entry->diff = diff; + __entry->new = new; + ), + TP_printk("dev %d:%d btnum %d bno 0x%x len 0x%x diff 0x%x new %d", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->btnum, + __entry->bno, __entry->len, __entry->diff, __entry->new) +) + DECLARE_EVENT_CLASS(xfs_da_class, TP_PROTO(struct xfs_da_args *args), TP_ARGS(args), From patchwork Mon Sep 16 12:16:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146937 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2FEA91746 for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0EB0B21848 for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731379AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:36428 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731514AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B51BD3090FD1 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7390D5C21A for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 06/11] xfs: reuse best extent tracking logic for bnobt scan Date: Mon, 16 Sep 2019 08:16:30 -0400 Message-Id: <20190916121635.43148-7-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.43]); Mon, 16 Sep 2019 12:16:38 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The near mode bnobt scan searches left and right in the bnobt looking for the closest free extent to the allocation hint that satisfies minlen. Once such an extent is found, the left/right search terminates, we search one more time in the opposite direction and finish the allocation with the best overall extent. The left/right and find best searches are currently controlled via a combination of cursor state and local variables. Clean up this code and prepare for further improvements to the near mode fallback algorithm by reusing the allocation cursor best extent tracking mechanism. Update the tracking logic to deactivate bnobt cursors when out of allocation range and replace open-coded extent checks to calls to the common helper. In doing so, rename some misnamed local variables in the top-level near mode allocation function. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 270 +++++++++++--------------------------- fs/xfs/xfs_trace.h | 4 +- 2 files changed, 76 insertions(+), 198 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 2fa7bb6a00a8..edcec975c306 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -810,6 +810,7 @@ xfs_alloc_cur_check( bool busy; unsigned busy_gen = 0; bool deactivate = false; + bool isbnobt = cur->bc_btnum == XFS_BTNUM_BNO; *new = 0; @@ -823,7 +824,7 @@ xfs_alloc_cur_check( * range (i.e., walking backwards looking for a minlen extent). */ if (len < args->minlen) { - deactivate = true; + deactivate = !isbnobt; goto out; } @@ -833,8 +834,10 @@ xfs_alloc_cur_check( if (busy) acur->busy_gen = busy_gen; /* deactivate a bnobt cursor outside of locality range */ - if (bnoa < args->min_agbno || bnoa > args->max_agbno) + if (bnoa < args->min_agbno || bnoa > args->max_agbno) { + deactivate = isbnobt; goto out; + } if (lena < args->minlen) goto out; @@ -854,8 +857,14 @@ xfs_alloc_cur_check( bnoa, lena, &bnew); if (bnew == NULLAGBLOCK) goto out; - if (diff > acur->diff) + + /* + * Deactivate a bnobt cursor with worse locality than the current best. + */ + if (diff > acur->diff) { + deactivate = isbnobt; goto out; + } ASSERT(args->len > acur->len || (args->len == acur->len && diff <= acur->diff)); @@ -1163,96 +1172,44 @@ xfs_alloc_ag_vextent_exact( } /* - * Search the btree in a given direction via the search cursor and compare - * the records found against the good extent we've already found. + * Search the btree in a given direction and check the records against the good + * extent we've already found. */ STATIC int xfs_alloc_find_best_extent( - struct xfs_alloc_arg *args, /* allocation argument structure */ - struct xfs_btree_cur *gcur, /* good cursor */ - struct xfs_btree_cur *scur, /* searching cursor */ - xfs_agblock_t gdiff, /* difference for search comparison */ - xfs_agblock_t *sbno, /* extent found by search */ - xfs_extlen_t *slen, /* extent length */ - xfs_agblock_t *sbnoa, /* aligned extent found by search */ - xfs_extlen_t *slena, /* aligned extent length */ - int dir) /* 0 = search right, 1 = search left */ + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + struct xfs_btree_cur *cur, + bool increment) { - xfs_agblock_t new; - xfs_agblock_t sdiff; int error; int i; - unsigned busy_gen; - - /* The good extent is perfect, no need to search. */ - if (!gdiff) - goto out_use_good; /* - * Look until we find a better one, run out of space or run off the end. + * Search so long as the cursor is active or we find a better extent. + * The cursor is deactivated if it extends beyond the range of the + * current allocation candidate. */ - do { - error = xfs_alloc_get_rec(scur, sbno, slen, &i); + while (xfs_alloc_cur_active(cur)) { + error = xfs_alloc_cur_check(args, acur, cur, &i); if (error) - goto error0; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0); - xfs_alloc_compute_aligned(args, *sbno, *slen, - sbnoa, slena, &busy_gen); - - /* - * The good extent is closer than this one. - */ - if (!dir) { - if (*sbnoa > args->max_agbno) - goto out_use_good; - if (*sbnoa >= args->agbno + gdiff) - goto out_use_good; - } else { - if (*sbnoa < args->min_agbno) - goto out_use_good; - if (*sbnoa <= args->agbno - gdiff) - goto out_use_good; - } - - /* - * Same distance, compare length and pick the best. - */ - if (*slena >= args->minlen) { - args->len = XFS_EXTLEN_MIN(*slena, args->maxlen); - xfs_alloc_fix_len(args); - - sdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, - args->datatype, *sbnoa, - *slena, &new); - - /* - * Choose closer size and invalidate other cursor. - */ - if (sdiff < gdiff) - goto out_use_search; - goto out_use_good; - } + return error; + if (i == 1) + break; + if (!xfs_alloc_cur_active(cur)) + break; - if (!dir) - error = xfs_btree_increment(scur, 0, &i); + if (increment) + error = xfs_btree_increment(cur, 0, &i); else - error = xfs_btree_decrement(scur, 0, &i); + error = xfs_btree_decrement(cur, 0, &i); if (error) - goto error0; - } while (i); - -out_use_good: - scur->bc_private.a.priv.abt.active = false; - return 0; + return error; + if (i == 0) + cur->bc_private.a.priv.abt.active = false; + } -out_use_search: - gcur->bc_private.a.priv.abt.active = false; return 0; - -error0: - /* caller invalidates cursors */ - return error; } /* @@ -1266,23 +1223,13 @@ xfs_alloc_ag_vextent_near( struct xfs_alloc_arg *args) { struct xfs_alloc_cur acur = {0,}; - struct xfs_btree_cur *bno_cur; - xfs_agblock_t gtbno; /* start bno of right side entry */ - xfs_agblock_t gtbnoa; /* aligned ... */ - xfs_extlen_t gtdiff; /* difference to right side entry */ - xfs_extlen_t gtlen; /* length of right side entry */ - xfs_extlen_t gtlena; /* aligned ... */ - xfs_agblock_t gtnew; /* useful start bno of right side */ + struct xfs_btree_cur *fbcur = NULL; int error; /* error code */ int i; /* result code, temporary */ int j; /* result code, temporary */ - xfs_agblock_t ltbno; /* start bno of left side entry */ - xfs_agblock_t ltbnoa; /* aligned ... */ - xfs_extlen_t ltdiff; /* difference to left side entry */ - xfs_extlen_t ltlen; /* length of left side entry */ - xfs_extlen_t ltlena; /* aligned ... */ - xfs_agblock_t ltnew; /* useful start bno of left side */ - xfs_extlen_t rlen; /* length of returned extent */ + xfs_agblock_t bno; + xfs_extlen_t len; + bool fbinc = false; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1304,9 +1251,7 @@ xfs_alloc_ag_vextent_near( args->agbno = args->max_agbno; restart: - ltlen = 0; - gtlena = 0; - ltlena = 0; + len = 0; /* * Set up cursors and see if there are any free extents as big as @@ -1315,11 +1260,11 @@ xfs_alloc_ag_vextent_near( */ error = xfs_alloc_cur_setup(args, &acur); if (error == -ENOSPC) { - error = xfs_alloc_ag_vextent_small(args, acur.cnt, <bno, - <len, &i); + error = xfs_alloc_ag_vextent_small(args, acur.cnt, &bno, + &len, &i); if (error) goto out; - if (i == 0 || ltlen == 0) { + if (i == 0 || len == 0) { trace_xfs_alloc_near_noentry(args); goto out; } @@ -1350,21 +1295,21 @@ xfs_alloc_ag_vextent_near( * record smaller than maxlen, go to the start of this block, * and skip all those smaller than minlen. */ - if (ltlen || args->alignment > 1) { + if (len || args->alignment > 1) { acur.cnt->bc_ptrs[0] = 1; do { - error = xfs_alloc_get_rec(acur.cnt, <bno, - <len, &i); + error = xfs_alloc_get_rec(acur.cnt, &bno, &len, + &i); if (error) goto out; XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - if (ltlen >= args->minlen) + if (len >= args->minlen) break; error = xfs_btree_increment(acur.cnt, 0, &i); if (error) goto out; } while (i); - ASSERT(ltlen >= args->minlen); + ASSERT(len >= args->minlen); if (!i) break; } @@ -1433,77 +1378,43 @@ xfs_alloc_ag_vextent_near( */ do { if (xfs_alloc_cur_active(acur.bnolt)) { - error = xfs_alloc_get_rec(acur.bnolt, <bno, <len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.bnolt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy |= xfs_alloc_compute_aligned(args, ltbno, - ltlen, <bnoa, <lena, &acur.busy_gen); - if (ltlena >= args->minlen && ltbnoa >= args->min_agbno) + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur.bnogt; + fbinc = true; break; + } error = xfs_btree_decrement(acur.bnolt, 0, &i); if (error) goto out; - if (!i || ltbnoa < args->min_agbno) + if (!i) acur.bnolt->bc_private.a.priv.abt.active = false; } if (xfs_alloc_cur_active(acur.bnogt)) { - error = xfs_alloc_get_rec(acur.bnogt, >bno, >len, &i); + error = xfs_alloc_cur_check(args, &acur, acur.bnogt, &i); if (error) goto out; - XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, out); - acur.busy |= xfs_alloc_compute_aligned(args, gtbno, - gtlen, >bnoa, >lena, &acur.busy_gen); - if (gtlena >= args->minlen && gtbnoa <= args->max_agbno) + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur.bnolt; + fbinc = false; break; + } error = xfs_btree_increment(acur.bnogt, 0, &i); if (error) goto out; - if (!i || gtbnoa > args->max_agbno) + if (!i) acur.bnogt->bc_private.a.priv.abt.active = false; } } while (xfs_alloc_cur_active(acur.bnolt) || xfs_alloc_cur_active(acur.bnogt)); - /* - * Got both cursors still active, need to find better entry. - */ - if (xfs_alloc_cur_active(acur.bnolt) && - xfs_alloc_cur_active(acur.bnogt)) { - if (ltlena >= args->minlen) { - /* - * Left side is good, look for a right side entry. - */ - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - ltdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, ltbnoa, - ltlena, <new); - - error = xfs_alloc_find_best_extent(args, - acur.bnolt, acur.bnogt, - ltdiff, >bno, >len, - >bnoa, >lena, - 0 /* search right */); - } else { - ASSERT(gtlena >= args->minlen); - - /* - * Right side is good, look for a left side entry. - */ - args->len = XFS_EXTLEN_MIN(gtlena, args->maxlen); - xfs_alloc_fix_len(args); - gtdiff = xfs_alloc_compute_diff(args->agbno, args->len, - args->alignment, args->datatype, gtbnoa, - gtlena, >new); - - error = xfs_alloc_find_best_extent(args, - acur.bnogt, acur.bnolt, - gtdiff, <bno, <len, - <bnoa, <lena, - 1 /* search left */); - } - + /* search the opposite direction for a better entry */ + if (fbcur) { + error = xfs_alloc_find_best_extent(args, &acur, fbcur, fbinc); if (error) goto out; } @@ -1511,8 +1422,7 @@ xfs_alloc_ag_vextent_near( /* * If we couldn't get anything, give up. */ - if (!xfs_alloc_cur_active(acur.bnolt) && - !xfs_alloc_cur_active(acur.bnogt)) { + if (!acur.len) { if (acur.busy) { trace_xfs_alloc_near_busy(args); xfs_extent_busy_flush(args->mp, args->pag, @@ -1524,47 +1434,15 @@ xfs_alloc_ag_vextent_near( goto out; } - /* - * At this point we have selected a freespace entry, either to the - * left or to the right. If it's on the right, copy all the - * useful variables to the "left" set so we only have one - * copy of this code. - */ - if (xfs_alloc_cur_active(acur.bnogt)) { - bno_cur = acur.bnogt; - ltbno = gtbno; - ltbnoa = gtbnoa; - ltlen = gtlen; - ltlena = gtlena; - j = 1; - } else { - bno_cur = acur.bnolt; - j = 0; - } - - /* - * Fix up the length and compute the useful address. - */ - args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen); - xfs_alloc_fix_len(args); - rlen = args->len; - (void)xfs_alloc_compute_diff(args->agbno, rlen, args->alignment, - args->datatype, ltbnoa, ltlena, <new); - ASSERT(ltnew >= ltbno); - ASSERT(ltnew + rlen <= ltbnoa + ltlena); - ASSERT(ltnew + rlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - ASSERT(ltnew >= args->min_agbno && ltnew <= args->max_agbno); - args->agbno = ltnew; - - error = xfs_alloc_fixup_trees(acur.cnt, bno_cur, ltbno, ltlen, ltnew, - rlen, XFSA_FIXUP_BNO_OK); - if (error) - goto out; + args->agbno = acur.bno; + args->len = acur.len; + ASSERT(acur.bno >= acur.rec_bno); + ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); + ASSERT(acur.rec_bno + acur.rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - if (j) - trace_xfs_alloc_near_greater(args); - else - trace_xfs_alloc_near_lesser(args); + error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, acur.rec_bno, + acur.rec_len, acur.bno, acur.len, 0); out: xfs_alloc_cur_close(&acur, error); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index b12fad3e45cb..2e57dc3d4230 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1642,8 +1642,8 @@ DEFINE_ALLOC_EVENT(xfs_alloc_exact_notfound); DEFINE_ALLOC_EVENT(xfs_alloc_exact_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_nominleft); DEFINE_ALLOC_EVENT(xfs_alloc_near_first); -DEFINE_ALLOC_EVENT(xfs_alloc_near_greater); -DEFINE_ALLOC_EVENT(xfs_alloc_near_lesser); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_noentry); DEFINE_ALLOC_EVENT(xfs_alloc_near_busy); From patchwork Mon Sep 16 12:16:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146939 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5ED7D184E for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 469CF216C8 for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731514AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50722 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732374AbfIPMQj (ORCPT ); Mon, 16 Sep 2019 08:16:39 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 25B3910C093A for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id D6F565C258 for ; Mon, 16 Sep 2019 12:16:38 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 07/11] xfs: refactor allocation tree fixup code Date: Mon, 16 Sep 2019 08:16:31 -0400 Message-Id: <20190916121635.43148-8-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.66]); Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Both algorithms duplicate the same btree allocation code. Eliminate the duplication and reuse the fallback algorithm codepath. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 18 ++---------------- 1 file changed, 2 insertions(+), 16 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index edcec975c306..3eacc799c4cb 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1333,23 +1333,8 @@ xfs_alloc_ag_vextent_near( if (acur.len == 0) break; - /* - * Allocate at the bno/len tracked in the cursor. - */ - args->agbno = acur.bno; - args->len = acur.len; - ASSERT(acur.bno >= acur.rec_bno); - ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); - ASSERT(acur.rec_bno + acur.rec_len <= - be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, - acur.rec_bno, acur.rec_len, acur.bno, acur.len, - 0); - if (error) - goto out; trace_xfs_alloc_near_first(args); - goto out; + goto alloc; } /* @@ -1434,6 +1419,7 @@ xfs_alloc_ag_vextent_near( goto out; } +alloc: args->agbno = acur.bno; args->len = acur.len; ASSERT(acur.bno >= acur.rec_bno); From patchwork Mon Sep 16 12:16:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146941 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A4FB414DB for ; Mon, 16 Sep 2019 12:16:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8E67E21670 for ; Mon, 16 Sep 2019 12:16:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732418AbfIPMQl (ORCPT ); Mon, 16 Sep 2019 08:16:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42124 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728413AbfIPMQk (ORCPT ); Mon, 16 Sep 2019 08:16:40 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 89B51300CA4B for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 46BE65C21A for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 08/11] xfs: refactor and reuse best extent scanning logic Date: Mon, 16 Sep 2019 08:16:32 -0400 Message-Id: <20190916121635.43148-9-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The bnobt "find best" helper implements a simple btree walker function. This general pattern, or a subset thereof, is reused in various parts of a near mode allocation operation. For example, the bnobt left/right scans are each iterative btree walks along with the cntbt lastblock scan. Rework this function into a generic btree walker, add a couple parameters to control termination behavior from various contexts and reuse it where applicable. XXX: This slightly changes the cntbt lastblock scan logic to scan the entire block and use the best candidate as opposed to the current logic of checking until we locate a perfect match. Fix up cur_check() to terminate on perfect match. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 103 ++++++++++++++++++-------------------- 1 file changed, 48 insertions(+), 55 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 3eacc799c4cb..ab494fd50dd7 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1172,30 +1172,38 @@ xfs_alloc_ag_vextent_exact( } /* - * Search the btree in a given direction and check the records against the good - * extent we've already found. + * Search a given number of btree records in a given direction. Check each + * record against the good extent we've already found. */ STATIC int -xfs_alloc_find_best_extent( +xfs_alloc_walk_iter( struct xfs_alloc_arg *args, struct xfs_alloc_cur *acur, struct xfs_btree_cur *cur, - bool increment) + bool increment, + bool find_one,/* quit on first candidate */ + int count, /* rec count (-1 for infinite) */ + int *stat) { int error; int i; + *stat = 0; + /* * Search so long as the cursor is active or we find a better extent. * The cursor is deactivated if it extends beyond the range of the * current allocation candidate. */ - while (xfs_alloc_cur_active(cur)) { + while (xfs_alloc_cur_active(cur) && count) { error = xfs_alloc_cur_check(args, acur, cur, &i); if (error) return error; - if (i == 1) - break; + if (i == 1) { + *stat = 1; + if (find_one) + break; + } if (!xfs_alloc_cur_active(cur)) break; @@ -1207,6 +1215,9 @@ xfs_alloc_find_best_extent( return error; if (i == 0) cur->bc_private.a.priv.abt.active = false; + + if (count > 0) + count--; } return 0; @@ -1226,7 +1237,6 @@ xfs_alloc_ag_vextent_near( struct xfs_btree_cur *fbcur = NULL; int error; /* error code */ int i; /* result code, temporary */ - int j; /* result code, temporary */ xfs_agblock_t bno; xfs_extlen_t len; bool fbinc = false; @@ -1313,19 +1323,12 @@ xfs_alloc_ag_vextent_near( if (!i) break; } - i = acur.cnt->bc_ptrs[0]; - for (j = 1; - !error && j && xfs_alloc_cur_active(acur.cnt) && - (acur.len < args->maxlen || acur.diff > 0); - error = xfs_btree_increment(acur.cnt, 0, &j)) { - /* - * For each entry, decide if it's better than - * the previous best entry. - */ - error = xfs_alloc_cur_check(args, &acur, acur.cnt, &i); - if (error) - goto out; - } + + error = xfs_alloc_walk_iter(args, &acur, acur.cnt, true, false, + -1, &i); + if (error) + goto out; + /* * It didn't work. We COULD be in a case where * there's a good record somewhere, so try again. @@ -1357,49 +1360,39 @@ xfs_alloc_ag_vextent_near( goto out; /* - * Loop going left with the leftward cursor, right with the - * rightward cursor, until either both directions give up or - * we find an entry at least as big as minlen. + * Loop going left with the leftward cursor, right with the rightward + * cursor, until either both directions give up or we find an entry at + * least as big as minlen. */ do { - if (xfs_alloc_cur_active(acur.bnolt)) { - error = xfs_alloc_cur_check(args, &acur, acur.bnolt, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_left(args); - fbcur = acur.bnogt; - fbinc = true; - break; - } - error = xfs_btree_decrement(acur.bnolt, 0, &i); - if (error) - goto out; - if (!i) - acur.bnolt->bc_private.a.priv.abt.active = false; + error = xfs_alloc_walk_iter(args, &acur, acur.bnolt, false, + true, 1, &i); + if (error) + goto out; + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur.bnogt; + fbinc = true; + break; } - if (xfs_alloc_cur_active(acur.bnogt)) { - error = xfs_alloc_cur_check(args, &acur, acur.bnogt, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_right(args); - fbcur = acur.bnolt; - fbinc = false; - break; - } - error = xfs_btree_increment(acur.bnogt, 0, &i); - if (error) - goto out; - if (!i) - acur.bnogt->bc_private.a.priv.abt.active = false; + + error = xfs_alloc_walk_iter(args, &acur, acur.bnogt, true, true, + 1, &i); + if (error) + goto out; + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur.bnolt; + fbinc = false; + break; } } while (xfs_alloc_cur_active(acur.bnolt) || xfs_alloc_cur_active(acur.bnogt)); /* search the opposite direction for a better entry */ if (fbcur) { - error = xfs_alloc_find_best_extent(args, &acur, fbcur, fbinc); + error = xfs_alloc_walk_iter(args, &acur, fbcur, fbinc, true, -1, + &i); if (error) goto out; } From patchwork Mon Sep 16 12:16:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146943 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 103FD1746 for ; Mon, 16 Sep 2019 12:16:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id ED075214DE for ; Mon, 16 Sep 2019 12:16:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728413AbfIPMQl (ORCPT ); Mon, 16 Sep 2019 08:16:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:42130 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732374AbfIPMQk (ORCPT ); Mon, 16 Sep 2019 08:16:40 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E9FB83082145 for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id A9AFF5C28F for ; Mon, 16 Sep 2019 12:16:39 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 09/11] xfs: refactor near mode alloc bnobt scan into separate function Date: Mon, 16 Sep 2019 08:16:33 -0400 Message-Id: <20190916121635.43148-10-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Mon, 16 Sep 2019 12:16:39 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org In preparation to enhance the near mode allocation bnobt scan algorithm, lift it into a separate function. No functional changes. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 128 ++++++++++++++++++++++---------------- 1 file changed, 74 insertions(+), 54 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index ab494fd50dd7..6f7e4666250c 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1223,6 +1223,78 @@ xfs_alloc_walk_iter( return 0; } +/* + * Search in the by-bno btree to the left and to the right simultaneously until + * in each case we find a large enough free extent or run into the edge of the + * tree. When we run into the edge of the tree, we deactivate that cursor. + */ +STATIC int +xfs_alloc_ag_vextent_bnobt( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur, + int *stat) +{ + struct xfs_btree_cur *fbcur = NULL; + int error; + int i; + bool fbinc; + + ASSERT(acur->len == 0); + ASSERT(args->type == XFS_ALLOCTYPE_NEAR_BNO); + + *stat = 0; + + error = xfs_alloc_lookup_le(acur->bnolt, args->agbno, 0, &i); + if (error) + return error; + error = xfs_alloc_lookup_ge(acur->bnogt, args->agbno, 0, &i); + if (error) + return error; + + /* + * Loop going left with the leftward cursor, right with the rightward + * cursor, until either both directions give up or we find an entry at + * least as big as minlen. + */ + while (xfs_alloc_cur_active(acur->bnolt) || + xfs_alloc_cur_active(acur->bnogt)) { + error = xfs_alloc_walk_iter(args, acur, acur->bnolt, false, + true, 1, &i); + if (error) + return error; + if (i == 1) { + trace_xfs_alloc_cur_left(args); + fbcur = acur->bnogt; + fbinc = true; + break; + } + + error = xfs_alloc_walk_iter(args, acur, acur->bnogt, true, true, + 1, &i); + if (error) + return error; + if (i == 1) { + trace_xfs_alloc_cur_right(args); + fbcur = acur->bnolt; + fbinc = false; + break; + } + } + + /* search the opposite direction for a better entry */ + if (fbcur) { + error = xfs_alloc_walk_iter(args, acur, fbcur, fbinc, true, -1, + &i); + if (error) + return error; + } + + if (acur->len) + *stat = 1; + + return 0; +} + /* * Allocate a variable extent near bno in the allocation group agno. * Extent's length (returned in len) will be between minlen and maxlen, @@ -1234,12 +1306,10 @@ xfs_alloc_ag_vextent_near( struct xfs_alloc_arg *args) { struct xfs_alloc_cur acur = {0,}; - struct xfs_btree_cur *fbcur = NULL; int error; /* error code */ int i; /* result code, temporary */ xfs_agblock_t bno; xfs_extlen_t len; - bool fbinc = false; #ifdef DEBUG /* * Randomly don't execute the first algorithm. @@ -1341,62 +1411,12 @@ xfs_alloc_ag_vextent_near( } /* - * Second algorithm. - * Search in the by-bno tree to the left and to the right - * simultaneously, until in each case we find a space big enough, - * or run into the edge of the tree. When we run into the edge, - * we deallocate that cursor. - * If both searches succeed, we compare the two spaces and pick - * the better one. - * With alignment, it's possible for both to fail; the upper - * level algorithm that picks allocation groups for allocations - * is not supposed to do this. + * Second algorithm. Search the bnobt left and right. */ - error = xfs_alloc_lookup_le(acur.bnolt, args->agbno, 0, &i); - if (error) - goto out; - error = xfs_alloc_lookup_ge(acur.bnogt, args->agbno, 0, &i); + error = xfs_alloc_ag_vextent_bnobt(args, &acur, &i); if (error) goto out; - /* - * Loop going left with the leftward cursor, right with the rightward - * cursor, until either both directions give up or we find an entry at - * least as big as minlen. - */ - do { - error = xfs_alloc_walk_iter(args, &acur, acur.bnolt, false, - true, 1, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_left(args); - fbcur = acur.bnogt; - fbinc = true; - break; - } - - error = xfs_alloc_walk_iter(args, &acur, acur.bnogt, true, true, - 1, &i); - if (error) - goto out; - if (i == 1) { - trace_xfs_alloc_cur_right(args); - fbcur = acur.bnolt; - fbinc = false; - break; - } - } while (xfs_alloc_cur_active(acur.bnolt) || - xfs_alloc_cur_active(acur.bnogt)); - - /* search the opposite direction for a better entry */ - if (fbcur) { - error = xfs_alloc_walk_iter(args, &acur, fbcur, fbinc, true, -1, - &i); - if (error) - goto out; - } - /* * If we couldn't get anything, give up. */ From patchwork Mon Sep 16 12:16:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146945 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A18691745 for ; Mon, 16 Sep 2019 12:16:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 8AA05214DE for ; Mon, 16 Sep 2019 12:16:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732374AbfIPMQl (ORCPT ); Mon, 16 Sep 2019 08:16:41 -0400 Received: from mx1.redhat.com ([209.132.183.28]:50624 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732409AbfIPMQk (ORCPT ); Mon, 16 Sep 2019 08:16:40 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 5957D10CC1FB for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 189ED5C22C for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 10/11] xfs: factor out tree fixup logic into helper Date: Mon, 16 Sep 2019 08:16:34 -0400 Message-Id: <20190916121635.43148-11-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (mx1.redhat.com [10.5.110.65]); Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org Lift the btree fixup path into a helper function. Signed-off-by: Brian Foster Reviewed-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_alloc.c | 42 +++++++++++++++++++++++++++++---------- fs/xfs/xfs_trace.h | 1 + 2 files changed, 33 insertions(+), 10 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 6f7e4666250c..381a08257aaf 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -883,6 +883,36 @@ xfs_alloc_cur_check( return 0; } +/* + * Complete an allocation of a candidate extent. Remove the extent from both + * trees and update the args structure. + */ +STATIC int +xfs_alloc_cur_finish( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + int error; + + ASSERT(acur->cnt && acur->bnolt); + ASSERT(acur->bno >= acur->rec_bno); + ASSERT(acur->bno + acur->len <= acur->rec_bno + acur->rec_len); + ASSERT(acur->rec_bno + acur->rec_len <= + be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); + + error = xfs_alloc_fixup_trees(acur->cnt, acur->bnolt, acur->rec_bno, + acur->rec_len, acur->bno, acur->len, 0); + if (error) + return error; + + args->agbno = acur->bno; + args->len = acur->len; + args->wasfromfl = 0; + + trace_xfs_alloc_cur(args); + return 0; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1352,7 +1382,6 @@ xfs_alloc_ag_vextent_near( } else if (error) { goto out; } - args->wasfromfl = 0; /* * First algorithm. @@ -1433,15 +1462,8 @@ xfs_alloc_ag_vextent_near( } alloc: - args->agbno = acur.bno; - args->len = acur.len; - ASSERT(acur.bno >= acur.rec_bno); - ASSERT(acur.bno + acur.len <= acur.rec_bno + acur.rec_len); - ASSERT(acur.rec_bno + acur.rec_len <= - be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length)); - - error = xfs_alloc_fixup_trees(acur.cnt, acur.bnolt, acur.rec_bno, - acur.rec_len, acur.bno, acur.len, 0); + /* fix up btrees on a successful allocation */ + error = xfs_alloc_cur_finish(args, &acur); out: xfs_alloc_cur_close(&acur, error); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 2e57dc3d4230..b8b93068efe7 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1642,6 +1642,7 @@ DEFINE_ALLOC_EVENT(xfs_alloc_exact_notfound); DEFINE_ALLOC_EVENT(xfs_alloc_exact_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_nominleft); DEFINE_ALLOC_EVENT(xfs_alloc_near_first); +DEFINE_ALLOC_EVENT(xfs_alloc_cur); DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); From patchwork Mon Sep 16 12:16:35 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 11146947 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C51E31746 for ; Mon, 16 Sep 2019 12:16:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AE01B21848 for ; Mon, 16 Sep 2019 12:16:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732409AbfIPMQm (ORCPT ); Mon, 16 Sep 2019 08:16:42 -0400 Received: from mx1.redhat.com ([209.132.183.28]:47810 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1732417AbfIPMQl (ORCPT ); Mon, 16 Sep 2019 08:16:41 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BEEB211A2A for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Received: from bfoster.bos.redhat.com (dhcp-41-2.bos.redhat.com [10.18.41.2]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7CECE5C28F for ; Mon, 16 Sep 2019 12:16:40 +0000 (UTC) From: Brian Foster To: linux-xfs@vger.kernel.org Subject: [PATCH v4 11/11] xfs: optimize near mode bnobt scans with concurrent cntbt lookups Date: Mon, 16 Sep 2019 08:16:35 -0400 Message-Id: <20190916121635.43148-12-bfoster@redhat.com> In-Reply-To: <20190916121635.43148-1-bfoster@redhat.com> References: <20190916121635.43148-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Mon, 16 Sep 2019 12:16:40 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The near mode fallback algorithm consists of a left/right scan of the bnobt. This algorithm has very poor breakdown characteristics under worst case free space fragmentation conditions. If a suitable extent is far enough from the locality hint, each allocation may scan most or all of the bnobt before it completes. This causes pathological behavior and extremely high allocation latencies. While locality is important to near mode allocations, it is not so important as to incur pathological allocation latency to provide the asolute best available locality for every allocation. If the allocation is large enough or far enough away, there is a point of diminishing returns. As such, we can bound the overall operation by including an iterative cntbt lookup in the broader search. The cntbt lookup is optimized to immediately find the extent with best locality for the given size on each iteration. Since the cntbt is indexed by extent size, the lookup repeats with a variably aggressive increasing search key size until it runs off the edge of the tree. This approach provides a natural balance between the two algorithms for various situations. For example, the bnobt scan is able to satisfy smaller allocations such as for inode chunks or btree blocks more quickly where the cntbt search may have to search through a large set of extent sizes when the search key starts off small relative to the largest extent in the tree. On the other hand, the cntbt search more deterministically covers the set of suitable extents for larger data extent allocation requests that the bnobt scan may have to search the entire tree to locate. Signed-off-by: Brian Foster --- fs/xfs/libxfs/xfs_alloc.c | 153 +++++++++++++++++++++++++++++++++++--- fs/xfs/xfs_trace.h | 2 + 2 files changed, 143 insertions(+), 12 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 381a08257aaf..4ec22040e516 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -716,6 +716,7 @@ struct xfs_alloc_cur { struct xfs_btree_cur *cnt; /* btree cursors */ struct xfs_btree_cur *bnolt; struct xfs_btree_cur *bnogt; + xfs_extlen_t cur_len;/* current search length */ xfs_agblock_t rec_bno;/* extent startblock */ xfs_extlen_t rec_len;/* extent length */ xfs_agblock_t bno; /* alloc bno */ @@ -740,6 +741,7 @@ xfs_alloc_cur_setup( ASSERT(args->alignment == 1 || args->type != XFS_ALLOCTYPE_THIS_BNO); + acur->cur_len = args->maxlen; acur->rec_bno = 0; acur->rec_len = 0; acur->bno = 0; @@ -913,6 +915,76 @@ xfs_alloc_cur_finish( return 0; } +/* + * Locality allocation lookup algorithm. This expects a cntbt cursor and uses + * bno optimized lookup to search for extents with ideal size and locality. + */ +STATIC int +xfs_alloc_cntbt_iter( + struct xfs_alloc_arg *args, + struct xfs_alloc_cur *acur) +{ + struct xfs_btree_cur *cur = acur->cnt; + xfs_agblock_t bno; + xfs_extlen_t len, cur_len; + int error; + int i; + + if (!xfs_alloc_cur_active(cur)) + return 0; + + /* locality optimized lookup */ + cur_len = acur->cur_len; + error = xfs_alloc_lookup_ge(cur, args->agbno, cur_len, &i); + if (error) + return error; + if (i == 0) + return 0; + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (error) + return error; + + /* check the current record and update search length from it */ + error = xfs_alloc_cur_check(args, acur, cur, &i); + if (error) + return error; + ASSERT(len >= acur->cur_len); + acur->cur_len = len; + + /* + * We looked up the first record >= [agbno, len] above. The agbno is a + * secondary key and so the current record may lie just before or after + * agbno. If it is past agbno, check the previous record too so long as + * the length matches as it may be closer. Don't check a smaller record + * because that could deactivate our cursor. + */ + if (bno > args->agbno) { + error = xfs_btree_decrement(cur, 0, &i); + if (!error && i) { + error = xfs_alloc_get_rec(cur, &bno, &len, &i); + if (!error && i && len == acur->cur_len) + error = xfs_alloc_cur_check(args, acur, cur, + &i); + } + if (error) + return error; + } + + /* + * Increment the search key until we find at least one allocation + * candidate or if the extent we found was larger. Otherwise, double the + * search key to optimize the search. Efficiency is more important here + * than absolute best locality. + */ + cur_len <<= 1; + if (!acur->len || acur->cur_len >= cur_len) + acur->cur_len++; + else + acur->cur_len = cur_len; + + return error; +} + /* * Deal with the case where only small freespaces remain. Either return the * contents of the last freespace record, or allocate space from the freelist if @@ -1254,12 +1326,11 @@ xfs_alloc_walk_iter( } /* - * Search in the by-bno btree to the left and to the right simultaneously until - * in each case we find a large enough free extent or run into the edge of the - * tree. When we run into the edge of the tree, we deactivate that cursor. + * Search the by-bno and by-size btrees in parallel in search of an extent with + * ideal locality based on the NEAR mode ->agbno locality hint. */ STATIC int -xfs_alloc_ag_vextent_bnobt( +xfs_alloc_ag_vextent_locality( struct xfs_alloc_arg *args, struct xfs_alloc_cur *acur, int *stat) @@ -1274,6 +1345,9 @@ xfs_alloc_ag_vextent_bnobt( *stat = 0; + error = xfs_alloc_lookup_ge(acur->cnt, args->agbno, acur->cur_len, &i); + if (error) + return error; error = xfs_alloc_lookup_le(acur->bnolt, args->agbno, 0, &i); if (error) return error; @@ -1282,12 +1356,37 @@ xfs_alloc_ag_vextent_bnobt( return error; /* - * Loop going left with the leftward cursor, right with the rightward - * cursor, until either both directions give up or we find an entry at - * least as big as minlen. + * Search the bnobt and cntbt in parallel. Search the bnobt left and + * right and lookup the closest extent to the locality hint for each + * extent size key in the cntbt. The entire search terminates + * immediately on a bnobt hit because that means we've found best case + * locality. Otherwise the search continues until the cntbt cursor runs + * off the end of the tree. If no allocation candidate is found at this + * point, give up on locality, walk backwards from the end of the cntbt + * and take the first available extent. + * + * The parallel tree searches balance each other out to provide fairly + * consistent performance for various situations. The bnobt search can + * have pathological behavior in the worst case scenario of larger + * allocation requests and fragmented free space. On the other hand, the + * bnobt is able to satisfy most smaller allocation requests much more + * quickly than the cntbt. The cntbt search can sift through fragmented + * free space and sets of free extents for larger allocation requests + * more quickly than the bnobt. Since the locality hint is just a hint + * and we don't want to scan the entire bnobt for perfect locality, the + * cntbt search essentially bounds the bnobt search such that we can + * find good enough locality at reasonable performance in most cases. */ while (xfs_alloc_cur_active(acur->bnolt) || - xfs_alloc_cur_active(acur->bnogt)) { + xfs_alloc_cur_active(acur->bnogt) || + xfs_alloc_cur_active(acur->cnt)) { + + trace_xfs_alloc_cur_lookup(args); + + /* + * Search the bnobt left and right. In the case of a hit, finish + * the search in the opposite direction and we're done. + */ error = xfs_alloc_walk_iter(args, acur, acur->bnolt, false, true, 1, &i); if (error) @@ -1298,7 +1397,6 @@ xfs_alloc_ag_vextent_bnobt( fbinc = true; break; } - error = xfs_alloc_walk_iter(args, acur, acur->bnogt, true, true, 1, &i); if (error) @@ -1309,9 +1407,39 @@ xfs_alloc_ag_vextent_bnobt( fbinc = false; break; } + + /* + * Check the extent with best locality based on the current + * extent size search key and keep track of the best candidate. + * If we fail to find anything due to busy extents, return + * empty handed so the caller can flush and retry the search. If + * no busy extents were found, walk backwards from the end of + * the cntbt as a last resort. + */ + error = xfs_alloc_cntbt_iter(args, acur); + if (error) + return error; + if (!xfs_alloc_cur_active(acur->cnt)) { + trace_xfs_alloc_cur_lookup_done(args); + if (!acur->len && !acur->busy) { + error = xfs_btree_decrement(acur->cnt, 0, &i); + if (error) + return error; + if (i) { + acur->cnt->bc_private.a.priv.abt.active = true; + fbcur = acur->cnt; + fbinc = false; + } + } + break; + } + } - /* search the opposite direction for a better entry */ + /* + * Search in the opposite direction for a better entry in the case of + * a bnobt hit or walk backwards from the end of the cntbt. + */ if (fbcur) { error = xfs_alloc_walk_iter(args, acur, fbcur, fbinc, true, -1, &i); @@ -1440,9 +1568,10 @@ xfs_alloc_ag_vextent_near( } /* - * Second algorithm. Search the bnobt left and right. + * Second algorithm. Combined cntbt and bnobt search to find ideal + * locality. */ - error = xfs_alloc_ag_vextent_bnobt(args, &acur, &i); + error = xfs_alloc_ag_vextent_locality(args, &acur, &i); if (error) goto out; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index b8b93068efe7..0c9dfeac4e75 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -1645,6 +1645,8 @@ DEFINE_ALLOC_EVENT(xfs_alloc_near_first); DEFINE_ALLOC_EVENT(xfs_alloc_cur); DEFINE_ALLOC_EVENT(xfs_alloc_cur_right); DEFINE_ALLOC_EVENT(xfs_alloc_cur_left); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_lookup); +DEFINE_ALLOC_EVENT(xfs_alloc_cur_lookup_done); DEFINE_ALLOC_EVENT(xfs_alloc_near_error); DEFINE_ALLOC_EVENT(xfs_alloc_near_noentry); DEFINE_ALLOC_EVENT(xfs_alloc_near_busy);