From patchwork Wed Oct 24 22:57:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 10655195 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 60AC414BB for ; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 51A682ABD6 for ; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 45EEC2ACA4; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CA2CD2ABD6 for ; Wed, 24 Oct 2018 22:57:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726332AbeJYH10 (ORCPT ); Thu, 25 Oct 2018 03:27:26 -0400 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:27343 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725829AbeJYH10 (ORCPT ); Thu, 25 Oct 2018 03:27:26 -0400 Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail06.adl2.internode.on.net with ESMTP; 25 Oct 2018 09:27:20 +1030 Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1gFS5W-0006J4-V6 for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 Received: from dave by discord.disaster.area with local (Exim 4.91) (envelope-from ) id 1gFS5W-0005D2-U0 for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:18 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Subject: [PATCH 1/5] xfs: move xfs_dir2_addname() Date: Thu, 25 Oct 2018 09:57:12 +1100 Message-Id: <20181024225716.19459-2-david@fromorbit.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181024225716.19459-1-david@fromorbit.com> References: <20181024225716.19459-1-david@fromorbit.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner This gets rid of the need for a forward declaration of the static function xfs_dir2_addname_int() and readies the code for factoring of xfs_dir2_addname_int(). Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_dir2_node.c | 140 +++++++++++++++++----------------- 1 file changed, 69 insertions(+), 71 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index f1bb3434f51c..3661988906b0 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -34,8 +34,6 @@ static void xfs_dir2_leafn_rebalance(xfs_da_state_t *state, static int xfs_dir2_leafn_remove(xfs_da_args_t *args, struct xfs_buf *bp, int index, xfs_da_state_blk_t *dblk, int *rval); -static int xfs_dir2_node_addname_int(xfs_da_args_t *args, - xfs_da_state_blk_t *fblk); /* * Check internal consistency of a leafn block. @@ -1613,75 +1611,6 @@ xfs_dir2_leafn_unbalance( xfs_dir3_leaf_check(dp, drop_blk->bp); } -/* - * Top-level node form directory addname routine. - */ -int /* error */ -xfs_dir2_node_addname( - xfs_da_args_t *args) /* operation arguments */ -{ - xfs_da_state_blk_t *blk; /* leaf block for insert */ - int error; /* error return value */ - int rval; /* sub-return value */ - xfs_da_state_t *state; /* btree cursor */ - - trace_xfs_dir2_node_addname(args); - - /* - * Allocate and initialize the state (btree cursor). - */ - state = xfs_da_state_alloc(); - state->args = args; - state->mp = args->dp->i_mount; - /* - * Look up the name. We're not supposed to find it, but - * this gives us the insertion point. - */ - error = xfs_da3_node_lookup_int(state, &rval); - if (error) - rval = error; - if (rval != -ENOENT) { - goto done; - } - /* - * Add the data entry to a data block. - * Extravalid is set to a freeblock found by lookup. - */ - rval = xfs_dir2_node_addname_int(args, - state->extravalid ? &state->extrablk : NULL); - if (rval) { - goto done; - } - blk = &state->path.blk[state->path.active - 1]; - ASSERT(blk->magic == XFS_DIR2_LEAFN_MAGIC); - /* - * Add the new leaf entry. - */ - rval = xfs_dir2_leafn_add(blk->bp, args, blk->index); - if (rval == 0) { - /* - * It worked, fix the hash values up the btree. - */ - if (!(args->op_flags & XFS_DA_OP_JUSTCHECK)) - xfs_da3_fixhashpath(state, &state->path); - } else { - /* - * It didn't work, we need to split the leaf block. - */ - if (args->total == 0) { - ASSERT(rval == -ENOSPC); - goto done; - } - /* - * Split the leaf block and insert the new entry. - */ - rval = xfs_da3_split(state); - } -done: - xfs_da_state_free(state); - return rval; -} - /* * Add the data entry for a node-format directory name addition. * The leaf entry is added in xfs_dir2_leafn_add. @@ -2059,6 +1988,75 @@ xfs_dir2_node_addname_int( return 0; } +/* + * Top-level node form directory addname routine. + */ +int /* error */ +xfs_dir2_node_addname( + xfs_da_args_t *args) /* operation arguments */ +{ + xfs_da_state_blk_t *blk; /* leaf block for insert */ + int error; /* error return value */ + int rval; /* sub-return value */ + xfs_da_state_t *state; /* btree cursor */ + + trace_xfs_dir2_node_addname(args); + + /* + * Allocate and initialize the state (btree cursor). + */ + state = xfs_da_state_alloc(); + state->args = args; + state->mp = args->dp->i_mount; + /* + * Look up the name. We're not supposed to find it, but + * this gives us the insertion point. + */ + error = xfs_da3_node_lookup_int(state, &rval); + if (error) + rval = error; + if (rval != -ENOENT) { + goto done; + } + /* + * Add the data entry to a data block. + * Extravalid is set to a freeblock found by lookup. + */ + rval = xfs_dir2_node_addname_int(args, + state->extravalid ? &state->extrablk : NULL); + if (rval) { + goto done; + } + blk = &state->path.blk[state->path.active - 1]; + ASSERT(blk->magic == XFS_DIR2_LEAFN_MAGIC); + /* + * Add the new leaf entry. + */ + rval = xfs_dir2_leafn_add(blk->bp, args, blk->index); + if (rval == 0) { + /* + * It worked, fix the hash values up the btree. + */ + if (!(args->op_flags & XFS_DA_OP_JUSTCHECK)) + xfs_da3_fixhashpath(state, &state->path); + } else { + /* + * It didn't work, we need to split the leaf block. + */ + if (args->total == 0) { + ASSERT(rval == -ENOSPC); + goto done; + } + /* + * Split the leaf block and insert the new entry. + */ + rval = xfs_da3_split(state); + } +done: + xfs_da_state_free(state); + return rval; +} + /* * Lookup an entry in a node-format directory. * All the real work happens in xfs_da3_node_lookup_int. From patchwork Wed Oct 24 22:57:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 10655199 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 38A8B13A9 for ; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 28CBB2AC32 for ; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1D6522B042; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 36D752ABD6 for ; Wed, 24 Oct 2018 22:57:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726238AbeJYH13 (ORCPT ); Thu, 25 Oct 2018 03:27:29 -0400 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:27343 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725829AbeJYH13 (ORCPT ); Thu, 25 Oct 2018 03:27:29 -0400 Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail06.adl2.internode.on.net with ESMTP; 25 Oct 2018 09:27:20 +1030 Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1gFS5W-0006J5-W2 for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 Received: from dave by discord.disaster.area with local (Exim 4.91) (envelope-from ) id 1gFS5W-0005D5-Um for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:18 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Subject: [PATCH 2/5] xfs: factor data block addition from xfs_dir2_node_addname_int() Date: Thu, 25 Oct 2018 09:57:13 +1100 Message-Id: <20181024225716.19459-3-david@fromorbit.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181024225716.19459-1-david@fromorbit.com> References: <20181024225716.19459-1-david@fromorbit.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner Factor out the code that adds a data block to a directory from xfs_dir2_node_addname_int(). This makes the code flow cleaner and more obvious and provides clear isolation of upcoming optimsations. Signed-Off-By: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_dir2_node.c | 328 +++++++++++++++++----------------- 1 file changed, 162 insertions(+), 166 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 3661988906b0..7c674c55e004 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -1611,6 +1611,130 @@ xfs_dir2_leafn_unbalance( xfs_dir3_leaf_check(dp, drop_blk->bp); } +/* + * Add a new data block to the directory at the free space index that the caller + * has specified. + */ +static int +xfs_dir2_node_add_datablk( + struct xfs_da_args *args, + struct xfs_da_state_blk *fblk, + xfs_dir2_db_t *dbno, + struct xfs_buf **dbpp, + struct xfs_buf **fbpp, + int *findex) +{ + struct xfs_inode *dp = args->dp; + struct xfs_trans *tp = args->trans; + struct xfs_mount *mp = dp->i_mount; + struct xfs_dir3_icfree_hdr freehdr; + struct xfs_dir2_data_free *bf; + struct xfs_dir2_data_hdr *hdr; + struct xfs_dir2_free *free = NULL; + xfs_dir2_db_t fbno; + struct xfs_buf *fbp; + struct xfs_buf *dbp; + __be16 *bests = NULL; + int error; + + /* Not allowed to allocate, return failure. */ + if ((args->op_flags & XFS_DA_OP_JUSTCHECK) || args->total == 0) + return -ENOSPC; + + /* Allocate and initialize the new data block. */ + error = xfs_dir2_grow_inode(args, XFS_DIR2_DATA_SPACE, dbno); + if (error) + return error; + error = xfs_dir3_data_init(args, *dbno, &dbp); + if (error) + return error; + + /* + * Get the freespace block corresponding to the data block + * that was just allocated. + */ + fbno = dp->d_ops->db_to_fdb(args->geo, *dbno); + error = xfs_dir2_free_try_read(tp, dp, + xfs_dir2_db_to_da(args->geo, fbno), &fbp); + if (error) + return error; + + /* + * If there wasn't a freespace block, the read will + * return a NULL fbp. Allocate and initialize a new one. + */ + if (!fbp) { + error = xfs_dir2_grow_inode(args, XFS_DIR2_FREE_SPACE, &fbno); + if (error) + return error; + + if (dp->d_ops->db_to_fdb(args->geo, *dbno) != fbno) { + xfs_alert(mp, +"%s: dir ino %llu needed freesp block %lld for data block %lld, got %lld", + __func__, (unsigned long long)dp->i_ino, + (long long)dp->d_ops->db_to_fdb(args->geo, *dbno), + (long long)*dbno, (long long)fbno); + if (fblk) { + xfs_alert(mp, + " fblk "PTR_FMT" blkno %llu index %d magic 0x%x", + fblk, (unsigned long long)fblk->blkno, + fblk->index, fblk->magic); + } else { + xfs_alert(mp, " ... fblk is NULL"); + } + XFS_ERROR_REPORT("xfs_dir2_node_addname_int", + XFS_ERRLEVEL_LOW, mp); + return -EFSCORRUPTED; + } + + /* Get a buffer for the new block. */ + error = xfs_dir3_free_get_buf(args, fbno, &fbp); + if (error) + return error; + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); + + /* Remember the first slot as our empty slot. */ + freehdr.firstdb = (fbno - xfs_dir2_byte_to_db(args->geo, + XFS_DIR2_FREE_OFFSET)) * + dp->d_ops->free_max_bests(args->geo); + } else { + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); + } + + /* Set the freespace block index from the data block number. */ + *findex = dp->d_ops->db_to_fdindex(args->geo, *dbno); + + /* Extend the freespace table if the new data block is off the end. */ + if (*findex >= freehdr.nvalid) { + ASSERT(*findex < dp->d_ops->free_max_bests(args->geo)); + freehdr.nvalid = *findex + 1; + bests[*findex] = cpu_to_be16(NULLDATAOFF); + } + + /* + * If this entry was for an empty data block (this should always be + * true) then update the header. + */ + if (bests[*findex] == cpu_to_be16(NULLDATAOFF)) { + freehdr.nused++; + dp->d_ops->free_hdr_to_disk(fbp->b_addr, &freehdr); + xfs_dir2_free_log_header(args, fbp); + } + + /* Update the freespace value for the new block in the table. */ + hdr = dbp->b_addr; + bf = dp->d_ops->data_bestfree_p(hdr); + bests[*findex] = bf[0].length; + + *dbpp = dbp; + *fbpp = fbp; + return 0; +} + /* * Add the data entry for a node-format directory name addition. * The leaf entry is added in xfs_dir2_leafn_add. @@ -1635,10 +1759,9 @@ xfs_dir2_node_addname_int( xfs_dir2_db_t ifbno; /* initial freespace block no */ xfs_dir2_db_t lastfbno=0; /* highest freespace block no */ int length; /* length of the new entry */ - int logfree; /* need to log free entry */ - xfs_mount_t *mp; /* filesystem mount point */ - int needlog; /* need to log data header */ - int needscan; /* need to rescan data frees */ + int logfree = 0; /* need to log free entry */ + int needlog = 0; /* need to log data header */ + int needscan = 0; /* need to rescan data frees */ __be16 *tagp; /* data entry tag pointer */ xfs_trans_t *tp; /* transaction pointer */ __be16 *bests; @@ -1647,7 +1770,6 @@ xfs_dir2_node_addname_int( xfs_dir2_data_aoff_t aoff; dp = args->dp; - mp = dp->i_mount; tp = args->trans; length = dp->d_ops->data_entsize(args->namelen); /* @@ -1676,6 +1798,7 @@ xfs_dir2_node_addname_int( ASSERT(be16_to_cpu(bests[findex]) != NULLDATAOFF); ASSERT(be16_to_cpu(bests[findex]) >= length); dbno = freehdr.firstdb + findex; + goto found_block; } else { /* * The data block looked at didn't have enough room. @@ -1777,168 +1900,50 @@ xfs_dir2_node_addname_int( } } } + /* * If we don't have a data block, we need to allocate one and make * the freespace entries refer to it. */ - if (unlikely(dbno == -1)) { - /* - * Not allowed to allocate, return failure. - */ - if ((args->op_flags & XFS_DA_OP_JUSTCHECK) || args->total == 0) - return -ENOSPC; - - /* - * Allocate and initialize the new data block. - */ - if (unlikely((error = xfs_dir2_grow_inode(args, - XFS_DIR2_DATA_SPACE, - &dbno)) || - (error = xfs_dir3_data_init(args, dbno, &dbp)))) - return error; - - /* - * If (somehow) we have a freespace block, get rid of it. - */ - if (fbp) - xfs_trans_brelse(tp, fbp); - if (fblk && fblk->bp) - fblk->bp = NULL; - - /* - * Get the freespace block corresponding to the data block - * that was just allocated. - */ - fbno = dp->d_ops->db_to_fdb(args->geo, dbno); - error = xfs_dir2_free_try_read(tp, dp, - xfs_dir2_db_to_da(args->geo, fbno), - &fbp); + if (dbno == -1) { + error = xfs_dir2_node_add_datablk(args, fblk, &dbno, &dbp, &fbp, + &findex); if (error) return error; - /* - * If there wasn't a freespace block, the read will - * return a NULL fbp. Allocate and initialize a new one. - */ - if (!fbp) { - error = xfs_dir2_grow_inode(args, XFS_DIR2_FREE_SPACE, - &fbno); - if (error) - return error; - - if (dp->d_ops->db_to_fdb(args->geo, dbno) != fbno) { - xfs_alert(mp, -"%s: dir ino %llu needed freesp block %lld for data block %lld, got %lld ifbno %llu lastfbno %d", - __func__, (unsigned long long)dp->i_ino, - (long long)dp->d_ops->db_to_fdb( - args->geo, dbno), - (long long)dbno, (long long)fbno, - (unsigned long long)ifbno, lastfbno); - if (fblk) { - xfs_alert(mp, - " fblk "PTR_FMT" blkno %llu index %d magic 0x%x", - fblk, - (unsigned long long)fblk->blkno, - fblk->index, - fblk->magic); - } else { - xfs_alert(mp, " ... fblk is NULL"); - } - XFS_ERROR_REPORT("xfs_dir2_node_addname_int", - XFS_ERRLEVEL_LOW, mp); - return -EFSCORRUPTED; - } - - /* - * Get a buffer for the new block. - */ - error = xfs_dir3_free_get_buf(args, fbno, &fbp); - if (error) - return error; - free = fbp->b_addr; - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - - /* - * Remember the first slot as our empty slot. - */ - freehdr.firstdb = - (fbno - xfs_dir2_byte_to_db(args->geo, - XFS_DIR2_FREE_OFFSET)) * - dp->d_ops->free_max_bests(args->geo); - } else { - free = fbp->b_addr; - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - } + /* setup current free block buffer */ + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); - /* - * Set the freespace block index from the data block number. - */ - findex = dp->d_ops->db_to_fdindex(args->geo, dbno); - /* - * If it's after the end of the current entries in the - * freespace block, extend that table. - */ - if (findex >= freehdr.nvalid) { - ASSERT(findex < dp->d_ops->free_max_bests(args->geo)); - freehdr.nvalid = findex + 1; - /* - * Tag new entry so nused will go up. - */ - bests[findex] = cpu_to_be16(NULLDATAOFF); - } - /* - * If this entry was for an empty data block - * (this should always be true) then update the header. - */ - if (bests[findex] == cpu_to_be16(NULLDATAOFF)) { - freehdr.nused++; - dp->d_ops->free_hdr_to_disk(fbp->b_addr, &freehdr); - xfs_dir2_free_log_header(args, fbp); - } - /* - * Update the real value in the table. - * We haven't allocated the data entry yet so this will - * change again. - */ - hdr = dbp->b_addr; - bf = dp->d_ops->data_bestfree_p(hdr); - bests[findex] = bf[0].length; + /* we're going to have to log the free block index later */ logfree = 1; - } - /* - * We had a data block so we don't have to make a new one. - */ - else { - /* - * If just checking, we succeeded. - */ + } else { +found_block: + /* If just checking, we succeeded. */ if (args->op_flags & XFS_DA_OP_JUSTCHECK) return 0; - /* - * Read the data block in. - */ + /* Read the data block in. */ error = xfs_dir3_data_read(tp, dp, xfs_dir2_db_to_da(args->geo, dbno), -1, &dbp); if (error) return error; - hdr = dbp->b_addr; - bf = dp->d_ops->data_bestfree_p(hdr); - logfree = 0; + + /* suppress gcc maybe-used-initialised warning */ + bests = dp->d_ops->free_bests_p(free); } + + /* setup for data block up now */ + hdr = dbp->b_addr; + bf = dp->d_ops->data_bestfree_p(hdr); ASSERT(be16_to_cpu(bf[0].length) >= length); - /* - * Point to the existing unused space. - */ + + /* Point to the existing unused space. */ dup = (xfs_dir2_data_unused_t *) ((char *)hdr + be16_to_cpu(bf[0].offset)); - needscan = needlog = 0; - /* - * Mark the first part of the unused space, inuse for us. - */ + + /* Mark the first part of the unused space, inuse for us. */ aoff = (xfs_dir2_data_aoff_t)((char *)dup - (char *)hdr); error = xfs_dir2_data_use_free(args, dbp, dup, aoff, length, &needlog, &needscan); @@ -1946,9 +1951,8 @@ xfs_dir2_node_addname_int( xfs_trans_brelse(tp, dbp); return error; } - /* - * Fill in the new entry and log it. - */ + + /* Fill in the new entry and log it. */ dep = (xfs_dir2_data_entry_t *)dup; dep->inumber = cpu_to_be64(args->inumber); dep->namelen = args->namelen; @@ -1957,32 +1961,24 @@ xfs_dir2_node_addname_int( tagp = dp->d_ops->data_entry_tag_p(dep); *tagp = cpu_to_be16((char *)dep - (char *)hdr); xfs_dir2_data_log_entry(args, dbp, dep); - /* - * Rescan the block for bestfree if needed. - */ + + /* Rescan the freespace and log the data block if needed. */ if (needscan) xfs_dir2_data_freescan(dp, hdr, &needlog); - /* - * Log the data block header if needed. - */ if (needlog) xfs_dir2_data_log_header(args, dbp); - /* - * If the freespace entry is now wrong, update it. - */ - bests = dp->d_ops->free_bests_p(free); /* gcc is so stupid */ - if (be16_to_cpu(bests[findex]) != be16_to_cpu(bf[0].length)) { + + /* If the freespace block entry is now wrong, update it. */ + if (bests[findex] != bf[0].length) { bests[findex] = bf[0].length; logfree = 1; } - /* - * Log the freespace entry if needed. - */ + + /* Log the freespace entry if needed. */ if (logfree) xfs_dir2_free_log_bests(args, fbp, findex, findex); - /* - * Return the data block and offset in args, then drop the data block. - */ + + /* Return the data block and offset in args. */ args->blkno = (xfs_dablk_t)dbno; args->index = be16_to_cpu(*tagp); return 0; From patchwork Wed Oct 24 22:57:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 10655201 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4D62117F3 for ; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E43A2ABD6 for ; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 32AF52B150; Wed, 24 Oct 2018 22:57:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6C7F52AC1A for ; Wed, 24 Oct 2018 22:57:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725829AbeJYH1a (ORCPT ); Thu, 25 Oct 2018 03:27:30 -0400 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:12937 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726308AbeJYH1a (ORCPT ); Thu, 25 Oct 2018 03:27:30 -0400 Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail06.adl2.internode.on.net with ESMTP; 25 Oct 2018 09:27:20 +1030 Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1gFS5X-0006J6-0b for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 Received: from dave by discord.disaster.area with local (Exim 4.91) (envelope-from ) id 1gFS5W-0005D8-Vj for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:18 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Subject: [PATCH 3/5] xfs: factor free block index lookup from xfs_dir2_node_addname_int() Date: Thu, 25 Oct 2018 09:57:14 +1100 Message-Id: <20181024225716.19459-4-david@fromorbit.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181024225716.19459-1-david@fromorbit.com> References: <20181024225716.19459-1-david@fromorbit.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner Simplify the logic in xfs_dir2_node_addname_int() by factoring out the free block index lookup code that finds a block with enough free space for the entry to be added. The code that is moved gets a major cleanup at the same time, but there is no algorithm change here. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_dir2_node.c | 195 ++++++++++++++++++---------------- 1 file changed, 105 insertions(+), 90 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 7c674c55e004..1ecf41e380fb 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -1638,7 +1638,7 @@ xfs_dir2_node_add_datablk( int error; /* Not allowed to allocate, return failure. */ - if ((args->op_flags & XFS_DA_OP_JUSTCHECK) || args->total == 0) + if (args->total == 0) return -ENOSPC; /* Allocate and initialize the new data block. */ @@ -1735,43 +1735,29 @@ xfs_dir2_node_add_datablk( return 0; } -/* - * Add the data entry for a node-format directory name addition. - * The leaf entry is added in xfs_dir2_leafn_add. - * We may enter with a freespace block that the lookup found. - */ -static int /* error */ -xfs_dir2_node_addname_int( - xfs_da_args_t *args, /* operation arguments */ - xfs_da_state_blk_t *fblk) /* optional freespace block */ +static int +xfs_dir2_node_find_freeblk( + struct xfs_da_args *args, + struct xfs_da_state_blk *fblk, + xfs_dir2_db_t *dbnop, + struct xfs_buf **fbpp, + int *findexp, + int length) { - xfs_dir2_data_hdr_t *hdr; /* data block header */ - xfs_dir2_db_t dbno; /* data block number */ - struct xfs_buf *dbp; /* data block buffer */ - xfs_dir2_data_entry_t *dep; /* data entry pointer */ - xfs_inode_t *dp; /* incore directory inode */ - xfs_dir2_data_unused_t *dup; /* data unused entry pointer */ - int error; /* error return value */ - xfs_dir2_db_t fbno; /* freespace block number */ - struct xfs_buf *fbp; /* freespace buffer */ - int findex; /* freespace entry index */ - xfs_dir2_free_t *free=NULL; /* freespace block structure */ - xfs_dir2_db_t ifbno; /* initial freespace block no */ - xfs_dir2_db_t lastfbno=0; /* highest freespace block no */ - int length; /* length of the new entry */ - int logfree = 0; /* need to log free entry */ - int needlog = 0; /* need to log data header */ - int needscan = 0; /* need to rescan data frees */ - __be16 *tagp; /* data entry tag pointer */ - xfs_trans_t *tp; /* transaction pointer */ - __be16 *bests; + struct xfs_inode *dp = args->dp; + struct xfs_trans *tp = args->trans; + struct xfs_buf *fbp = NULL; + int findex; + xfs_dir2_db_t lastfbno; + xfs_dir2_db_t ifbno = -1; + xfs_dir2_db_t dbno = -1; + xfs_dir2_db_t fbno = -1; + xfs_dir2_free_t *free = NULL; struct xfs_dir3_icfree_hdr freehdr; - struct xfs_dir2_data_free *bf; - xfs_dir2_data_aoff_t aoff; + __be16 *bests; + xfs_fileoff_t fo; + int error; - dp = args->dp; - tp = args->trans; - length = dp->d_ops->data_entsize(args->namelen); /* * If we came in with a freespace block that means that lookup * found an entry with our hash value. This is the freespace @@ -1779,56 +1765,44 @@ xfs_dir2_node_addname_int( */ if (fblk) { fbp = fblk->bp; - /* - * Remember initial freespace block number. - */ - ifbno = fblk->blkno; free = fbp->b_addr; findex = fblk->index; - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - - /* - * This means the free entry showed that the data block had - * space for our entry, so we remembered it. - * Use that data block. - */ if (findex >= 0) { + /* caller already found the freespace for us. */ + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); + ASSERT(findex < freehdr.nvalid); ASSERT(be16_to_cpu(bests[findex]) != NULLDATAOFF); ASSERT(be16_to_cpu(bests[findex]) >= length); dbno = freehdr.firstdb + findex; - goto found_block; - } else { - /* - * The data block looked at didn't have enough room. - * We'll start at the beginning of the freespace entries. - */ - dbno = -1; - findex = 0; + goto out; } - } else { + /* - * Didn't come in with a freespace block, so no data block. + * The data block looked at didn't have enough room. + * We'll start at the beginning of the freespace entries. */ - ifbno = dbno = -1; - fbp = NULL; - findex = 0; + ifbno = fblk->blkno; + fbno = ifbno; } + ASSERT(dbno == -1); + findex = 0; /* - * If we don't have a data block yet, we're going to scan the - * freespace blocks looking for one. Figure out what the - * highest freespace block number is. + * If we don't have a data block yet, we're going to scan the freespace + * blocks looking for one. Figure out what the highest freespace block + * number is. */ - if (dbno == -1) { - xfs_fileoff_t fo; /* freespace block number */ + error = xfs_bmap_last_offset(dp, &fo, XFS_DATA_FORK); + if (error) + return error; + lastfbno = xfs_dir2_da_to_db(args->geo, (xfs_dablk_t)fo); + + /* If we haven't get a search start block, set it now */ + if (fbno == -1) + fbno = xfs_dir2_byte_to_db(args->geo, XFS_DIR2_FREE_OFFSET); - if ((error = xfs_bmap_last_offset(dp, &fo, XFS_DATA_FORK))) - return error; - lastfbno = xfs_dir2_da_to_db(args->geo, (xfs_dablk_t)fo); - fbno = ifbno; - } /* * While we haven't identified a data block, search the freeblock * data for a good data block. If we find a null freeblock entry, @@ -1839,17 +1813,10 @@ xfs_dir2_node_addname_int( * If we don't have a freeblock in hand, get the next one. */ if (fbp == NULL) { - /* - * Happens the first time through unless lookup gave - * us a freespace block to start with. - */ - if (++fbno == 0) - fbno = xfs_dir2_byte_to_db(args->geo, - XFS_DIR2_FREE_OFFSET); /* * If it's ifbno we already looked at it. */ - if (fbno == ifbno) + if (++fbno == ifbno) fbno++; /* * If it's off the end we're done. @@ -1900,6 +1867,62 @@ xfs_dir2_node_addname_int( } } } +out: + *dbnop = dbno; + *fbpp = fbp; + *findexp = findex; + return 0; +} + + +/* + * Add the data entry for a node-format directory name addition. + * The leaf entry is added in xfs_dir2_leafn_add. + * We may enter with a freespace block that the lookup found. + */ +static int /* error */ +xfs_dir2_node_addname_int( + xfs_da_args_t *args, /* operation arguments */ + xfs_da_state_blk_t *fblk) /* optional freespace block */ +{ + xfs_dir2_data_hdr_t *hdr; /* data block header */ + xfs_dir2_db_t dbno; /* data block number */ + struct xfs_buf *dbp; /* data block buffer */ + xfs_dir2_data_entry_t *dep; /* data entry pointer */ + xfs_inode_t *dp; /* incore directory inode */ + xfs_dir2_data_unused_t *dup; /* data unused entry pointer */ + int error; /* error return value */ + struct xfs_buf *fbp; /* freespace buffer */ + int findex; /* freespace entry index */ + xfs_dir2_free_t *free=NULL; /* freespace block structure */ + int length; /* length of the new entry */ + int logfree = 0; /* need to log free entry */ + int needlog = 0; /* need to log data header */ + int needscan = 0; /* need to rescan data frees */ + __be16 *tagp; /* data entry tag pointer */ + xfs_trans_t *tp; /* transaction pointer */ + __be16 *bests; + struct xfs_dir2_data_free *bf; + xfs_dir2_data_aoff_t aoff; + + dp = args->dp; + tp = args->trans; + length = dp->d_ops->data_entsize(args->namelen); + + error = xfs_dir2_node_find_freeblk(args, fblk, &dbno, &fbp, &findex, + length); + if (error) + return error; + + /* + * Now we know if we must allocate blocks, so if we are checking whether + * we can insert without allocation then we can return now. + */ + if (args->op_flags & XFS_DA_OP_JUSTCHECK) { + if (dbno != -1) + return 0; + return -ENOSPC; + } /* * If we don't have a data block, we need to allocate one and make @@ -1911,29 +1934,21 @@ xfs_dir2_node_addname_int( if (error) return error; - /* setup current free block buffer */ - free = fbp->b_addr; - bests = dp->d_ops->free_bests_p(free); - /* we're going to have to log the free block index later */ logfree = 1; } else { -found_block: - /* If just checking, we succeeded. */ - if (args->op_flags & XFS_DA_OP_JUSTCHECK) - return 0; - /* Read the data block in. */ error = xfs_dir3_data_read(tp, dp, xfs_dir2_db_to_da(args->geo, dbno), -1, &dbp); if (error) return error; - - /* suppress gcc maybe-used-initialised warning */ - bests = dp->d_ops->free_bests_p(free); } + /* setup current free block buffer */ + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); + /* setup for data block up now */ hdr = dbp->b_addr; bf = dp->d_ops->data_bestfree_p(hdr); From patchwork Wed Oct 24 22:57:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 10655193 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 659D613A9 for ; Wed, 24 Oct 2018 22:57:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 564202ABD6 for ; Wed, 24 Oct 2018 22:57:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4A84D2AC32; Wed, 24 Oct 2018 22:57:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C58452ABD6 for ; Wed, 24 Oct 2018 22:57:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726293AbeJYH1Y (ORCPT ); Thu, 25 Oct 2018 03:27:24 -0400 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:27343 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725829AbeJYH1Y (ORCPT ); Thu, 25 Oct 2018 03:27:24 -0400 Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail06.adl2.internode.on.net with ESMTP; 25 Oct 2018 09:27:20 +1030 Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1gFS5X-0006J7-1h for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 Received: from dave by discord.disaster.area with local (Exim 4.91) (envelope-from ) id 1gFS5X-0005DB-0H for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Subject: [PATCH 4/5] xfs: speed up directory bestfree block scanning Date: Thu, 25 Oct 2018 09:57:15 +1100 Message-Id: <20181024225716.19459-5-david@fromorbit.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181024225716.19459-1-david@fromorbit.com> References: <20181024225716.19459-1-david@fromorbit.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner When running a "create millions inodes in a directory" test recently, I noticed we were spending a huge amount of time converting freespace block headers from disk format to in-memory format: 31.47% [kernel] [k] xfs_dir2_node_addname 17.86% [kernel] [k] xfs_dir3_free_hdr_from_disk 3.55% [kernel] [k] xfs_dir3_free_bests_p We shouldn't be hitting the best free block scanning code so hard when doing sequential directory creates, and it turns out there's a highly suboptimal loop searching the the best free array in the freespace block - it decodes the block header before checking each entry inside a loop, instead of decoding the header once before running the entry search loop. This makes a massive difference to create rates. Profile now looks like this: 13.15% [kernel] [k] xfs_dir2_node_addname 3.52% [kernel] [k] xfs_dir3_leaf_check_int 3.11% [kernel] [k] xfs_log_commit_cil And the wall time/average file create rate differences are just as stark: create time(sec) / rate (files/s) File count vanilla patched 10k 0.54 / 18.5k 0.53 / 18.9k 20k 1.10 / 18.1k 1.05 / 19.0k 100k 4.21 / 23.8k 3.91 / 25.6k 200k 9.66 / 20,7k 7.37 / 27.1k 1M 86.61 / 11.5k 48.26 / 20.7k 2M 206.13 / 9.7k 129.71 / 15.4k The larger the directory, the bigger the performance improvement. Signed-Off-By: Dave Chinner --- fs/xfs/libxfs/xfs_dir2_node.c | 83 ++++++++++++++--------------------- 1 file changed, 32 insertions(+), 51 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 1ecf41e380fb..621f63de075c 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -1754,7 +1754,7 @@ xfs_dir2_node_find_freeblk( xfs_dir2_db_t fbno = -1; xfs_dir2_free_t *free = NULL; struct xfs_dir3_icfree_hdr freehdr; - __be16 *bests; + __be16 *bests = NULL; xfs_fileoff_t fo; int error; @@ -1767,11 +1767,10 @@ xfs_dir2_node_find_freeblk( fbp = fblk->bp; free = fbp->b_addr; findex = fblk->index; + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); if (findex >= 0) { /* caller already found the freespace for us. */ - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - ASSERT(findex < freehdr.nvalid); ASSERT(be16_to_cpu(bests[findex]) != NULLDATAOFF); ASSERT(be16_to_cpu(bests[findex]) >= length); @@ -1805,29 +1804,19 @@ xfs_dir2_node_find_freeblk( /* * While we haven't identified a data block, search the freeblock - * data for a good data block. If we find a null freeblock entry, - * indicating a hole in the data blocks, remember that. + * data for a data block with enough free space in it. */ - while (dbno == -1) { - /* - * If we don't have a freeblock in hand, get the next one. - */ + for ( ; fbno < lastfbno; fbno++) { + /* If we don't have a freeblock in hand, get the next one. */ if (fbp == NULL) { + /* If it's ifbno we already looked at it. */ + if (fbno == ifbno) + continue; + /* - * If it's ifbno we already looked at it. - */ - if (++fbno == ifbno) - fbno++; - /* - * If it's off the end we're done. - */ - if (fbno >= lastfbno) - break; - /* - * Read the block. There can be holes in the - * freespace blocks, so this might not succeed. - * This should be really rare, so there's no reason - * to avoid it. + * Read the block. There can be holes in the freespace + * blocks, so this might not succeed. This should be + * really rare, so there's no reason to avoid it. */ error = xfs_dir2_free_try_read(tp, dp, xfs_dir2_db_to_da(args->geo, fbno), @@ -1836,37 +1825,29 @@ xfs_dir2_node_find_freeblk( return error; if (!fbp) continue; - free = fbp->b_addr; + findex = 0; + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); } - /* - * Look at the current free entry. Is it good enough? - * - * The bests initialisation should be where the bufer is read in - * the above branch. But gcc is too stupid to realise that bests - * and the freehdr are actually initialised if they are placed - * there, so we have to do it here to avoid warnings. Blech. - */ - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - if (be16_to_cpu(bests[findex]) != NULLDATAOFF && - be16_to_cpu(bests[findex]) >= length) - dbno = freehdr.firstdb + findex; - else { - /* - * Are we done with the freeblock? - */ - if (++findex == freehdr.nvalid) { - /* - * Drop the block. - */ - xfs_trans_brelse(tp, fbp); - fbp = NULL; - if (fblk && fblk->bp) - fblk->bp = NULL; + + /* Scan the free entry array for a large enough free space. */ + do { + if (be16_to_cpu(bests[findex]) != NULLDATAOFF && + be16_to_cpu(bests[findex]) >= length) { + dbno = freehdr.firstdb + findex; + goto out; } - } + } while (++findex < freehdr.nvalid); + + /* Didn't find free space, go on to next free block */ + xfs_trans_brelse(tp, fbp); + fbp = NULL; + if (fblk) + fblk->bp = NULL; } + out: *dbnop = dbno; *fbpp = fbp; From patchwork Wed Oct 24 22:57:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Chinner X-Patchwork-Id: 10655197 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D113B17F3 for ; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C25E12ABD6 for ; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B68F12AC32; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3FA6B2AC1A for ; Wed, 24 Oct 2018 22:57:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726443AbeJYH10 (ORCPT ); Thu, 25 Oct 2018 03:27:26 -0400 Received: from ipmail06.adl2.internode.on.net ([150.101.137.129]:65492 "EHLO ipmail06.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726238AbeJYH10 (ORCPT ); Thu, 25 Oct 2018 03:27:26 -0400 Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail06.adl2.internode.on.net with ESMTP; 25 Oct 2018 09:27:20 +1030 Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1gFS5X-0006JC-33 for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 Received: from dave by discord.disaster.area with local (Exim 4.91) (envelope-from ) id 1gFS5X-0005DE-1S for linux-xfs@vger.kernel.org; Thu, 25 Oct 2018 09:57:19 +1100 From: Dave Chinner To: linux-xfs@vger.kernel.org Subject: [PATCH 5/5] xfs: reverse search directory freespace indexes Date: Thu, 25 Oct 2018 09:57:16 +1100 Message-Id: <20181024225716.19459-6-david@fromorbit.com> X-Mailer: git-send-email 2.19.1 In-Reply-To: <20181024225716.19459-1-david@fromorbit.com> References: <20181024225716.19459-1-david@fromorbit.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Dave Chinner When a directory is growing rapidly, new blocks tend to get added at the end of the directory. These end up at the end of the freespace index, and when the directory gets large finding these new freespaces gets expensive. The code does a linear search across the frespace index from the first block in the directory to the last, hence meaning the newly added space is the last index searched. Instead, do a reverse order index search, starting from the last block and index in the freespace index. This makes most lookups for free space on rapidly growing directories O(1) instead of O(N), but should not have any impact on random insert workloads because the average search length is the same regardless of which end of the array we start at. The result is a major improvement in large directory grow rates: create time(sec) / rate (files/s) File count vanilla patched 10k 0.54 / 18.5k 0.52 / 19.3k 20k 1.10 / 18.1k 1.00 / 20.0k 100k 4.21 / 23.8k 3.58 / 27.9k 200k 9.66 / 20,7k 7.08 / 28.3k 1M 86.61 / 11.5k 38.33 / 26.1k 2M 206.13 / 9.7k 82.20 / 24.3k 10M 2843.57 / 3.5k 591.78 / 16.9k Signed-Off-By: Dave Chinner --- fs/xfs/libxfs/xfs_dir2_node.c | 69 +++++++++++++++++------------------ 1 file changed, 34 insertions(+), 35 deletions(-) diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c index 621f63de075c..aa52dc740305 100644 --- a/fs/xfs/libxfs/xfs_dir2_node.c +++ b/fs/xfs/libxfs/xfs_dir2_node.c @@ -1747,7 +1747,8 @@ xfs_dir2_node_find_freeblk( struct xfs_inode *dp = args->dp; struct xfs_trans *tp = args->trans; struct xfs_buf *fbp = NULL; - int findex; + int findex = 0; + xfs_dir2_db_t firstfbno; xfs_dir2_db_t lastfbno; xfs_dir2_db_t ifbno = -1; xfs_dir2_db_t dbno = -1; @@ -1775,7 +1776,7 @@ xfs_dir2_node_find_freeblk( ASSERT(be16_to_cpu(bests[findex]) != NULLDATAOFF); ASSERT(be16_to_cpu(bests[findex]) >= length); dbno = freehdr.firstdb + findex; - goto out; + goto found_block; } /* @@ -1784,9 +1785,10 @@ xfs_dir2_node_find_freeblk( */ ifbno = fblk->blkno; fbno = ifbno; + xfs_trans_brelse(tp, fbp); + fbp = NULL; + fblk->bp = NULL; } - ASSERT(dbno == -1); - findex = 0; /* * If we don't have a data block yet, we're going to scan the freespace @@ -1797,6 +1799,7 @@ xfs_dir2_node_find_freeblk( if (error) return error; lastfbno = xfs_dir2_da_to_db(args->geo, (xfs_dablk_t)fo); + firstfbno = xfs_dir2_byte_to_db(args->geo, XFS_DIR2_FREE_OFFSET); /* If we haven't get a search start block, set it now */ if (fbno == -1) @@ -1804,51 +1807,47 @@ xfs_dir2_node_find_freeblk( /* * While we haven't identified a data block, search the freeblock - * data for a data block with enough free space in it. + * data for a good data block. Do a reverse order search, as growing + * directories will put new blocks with free space at the end of the + * free space index. */ - for ( ; fbno < lastfbno; fbno++) { - /* If we don't have a freeblock in hand, get the next one. */ - if (fbp == NULL) { - /* If it's ifbno we already looked at it. */ - if (fbno == ifbno) - continue; + for (fbno = lastfbno - 1; fbno >= firstfbno; fbno--) { + /* If it's ifbno we already looked at it. */ + if (fbno == ifbno) + continue; - /* - * Read the block. There can be holes in the freespace - * blocks, so this might not succeed. This should be - * really rare, so there's no reason to avoid it. - */ - error = xfs_dir2_free_try_read(tp, dp, - xfs_dir2_db_to_da(args->geo, fbno), - &fbp); - if (error) - return error; - if (!fbp) - continue; + /* + * Read the block. There can be holes in the freespace + * blocks, so this might not succeed. This should be + * really rare, so there's no reason to avoid it. + */ + error = xfs_dir2_free_try_read(tp, dp, + xfs_dir2_db_to_da(args->geo, fbno), + &fbp); + if (error) + return error; + if (!fbp) + continue; - findex = 0; - free = fbp->b_addr; - bests = dp->d_ops->free_bests_p(free); - dp->d_ops->free_hdr_from_disk(&freehdr, free); - } + findex = 0; + free = fbp->b_addr; + bests = dp->d_ops->free_bests_p(free); + dp->d_ops->free_hdr_from_disk(&freehdr, free); /* Scan the free entry array for a large enough free space. */ - do { + for (findex = freehdr.nvalid - 1; findex >= 0; findex--) { if (be16_to_cpu(bests[findex]) != NULLDATAOFF && be16_to_cpu(bests[findex]) >= length) { dbno = freehdr.firstdb + findex; - goto out; + goto found_block; } - } while (++findex < freehdr.nvalid); + } /* Didn't find free space, go on to next free block */ xfs_trans_brelse(tp, fbp); - fbp = NULL; - if (fblk) - fblk->bp = NULL; } -out: +found_block: *dbnop = dbno; *fbpp = fbp; *findexp = findex;