From patchwork Tue Jun 4 21:49:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976183 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1854E2D49 for ; Tue, 4 Jun 2019 21:51:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F20C6286CF for ; Tue, 4 Jun 2019 21:51:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DFB07287FA; Tue, 4 Jun 2019 21:51:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A35AD286CF for ; Tue, 4 Jun 2019 21:51:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726317AbfFDVvm (ORCPT ); Tue, 4 Jun 2019 17:51:42 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:59904 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726312AbfFDVvm (ORCPT ); Tue, 4 Jun 2019 17:51:42 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnWYI053067 for ; Tue, 4 Jun 2019 21:51:38 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=BZTZ1beOP+Cco50+8aLuOJ3fB92u2oCcyh+kxVA6aco=; b=PzytYtWdYHwI7I19s2wagIJFDU2x86EJb0Lr+4LfJwHwNU4VetYMb5708x/ZpgYgwBXT 3N1HGLQU2A3rlcqFs0xvB1lv2UMFAOa8sXbURVT1FXoidqIv/aIufBeTx38W1B48dUGv nmMgP9yOKqDIS0td5a19wIUnplVqgBCMNuqNPpecp0Mxmb2N5vnkKPN9T1zQwVax41Gs RxREDfPFrtzcY5DnqBkU93ywU82AufvNge928g8qIY8hvP/+1saUXkU7s2vG93QKIU1w 7zTHyo47cVBtBEQAPnPXn0TeNSd8imLWLG4oyhwVG9Glso5h+VhDXkJ1WfuKV6nXbwL4 Pw== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2130.oracle.com with ESMTP id 2sugstfpeb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:51:37 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LmblL009283 for ; Tue, 4 Jun 2019 21:49:36 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3030.oracle.com with ESMTP id 2swnhbug2g-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:49:36 +0000 Received: from abhmp0019.oracle.com (abhmp0019.oracle.com [141.146.116.25]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x54LnZ6B008705 for ; Tue, 4 Jun 2019 21:49:35 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:49:35 -0700 Subject: [PATCH 01/10] xfs: create simplified inode walk function From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:49:34 -0700 Message-ID: <155968497450.1657646.15305138327955918345.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a new iterator function to simplify walking inodes in an XFS filesystem. This new iterator will replace the existing open-coded walking that goes on in various places. Signed-off-by: Darrick J. Wong --- fs/xfs/Makefile | 1 fs/xfs/libxfs/xfs_ialloc_btree.c | 31 +++ fs/xfs/libxfs/xfs_ialloc_btree.h | 3 fs/xfs/xfs_itable.c | 5 fs/xfs/xfs_itable.h | 8 + fs/xfs/xfs_iwalk.c | 400 ++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_iwalk.h | 18 ++ fs/xfs/xfs_trace.h | 40 ++++ 8 files changed, 502 insertions(+), 4 deletions(-) create mode 100644 fs/xfs/xfs_iwalk.c create mode 100644 fs/xfs/xfs_iwalk.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 91831975363b..74d30ef0dbce 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -80,6 +80,7 @@ xfs-y += xfs_aops.o \ xfs_iops.o \ xfs_inode.o \ xfs_itable.o \ + xfs_iwalk.o \ xfs_message.o \ xfs_mount.o \ xfs_mru_cache.o \ diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index ac4b65da4c2b..cb7eac2f51c0 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -564,6 +564,34 @@ xfs_inobt_max_size( XFS_INODES_PER_CHUNK); } +/* Read AGI and create inobt cursor. */ +int +xfs_inobt_cur( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_agnumber_t agno, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp) +{ + struct xfs_btree_cur *cur; + int error; + + ASSERT(*agi_bpp == NULL); + + error = xfs_ialloc_read_agi(mp, tp, agno, agi_bpp); + if (error) + return error; + + cur = xfs_inobt_init_cursor(mp, tp, *agi_bpp, agno, XFS_BTNUM_INO); + if (!cur) { + xfs_trans_brelse(tp, *agi_bpp); + *agi_bpp = NULL; + return -ENOMEM; + } + *curpp = cur; + return 0; +} + static int xfs_inobt_count_blocks( struct xfs_mount *mp, @@ -576,11 +604,10 @@ xfs_inobt_count_blocks( struct xfs_btree_cur *cur; int error; - error = xfs_ialloc_read_agi(mp, tp, agno, &agbp); + error = xfs_inobt_cur(mp, tp, agno, &cur, &agbp); if (error) return error; - cur = xfs_inobt_init_cursor(mp, tp, agbp, agno, btnum); error = xfs_btree_count_blocks(cur, tree_blocks); xfs_btree_del_cursor(cur, error); xfs_trans_brelse(tp, agbp); diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.h b/fs/xfs/libxfs/xfs_ialloc_btree.h index ebdd0c6b8766..1bc44b4a2b6c 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.h +++ b/fs/xfs/libxfs/xfs_ialloc_btree.h @@ -64,5 +64,8 @@ int xfs_finobt_calc_reserves(struct xfs_mount *mp, struct xfs_trans *tp, xfs_agnumber_t agno, xfs_extlen_t *ask, xfs_extlen_t *used); extern xfs_extlen_t xfs_iallocbt_calc_size(struct xfs_mount *mp, unsigned long long len); +int xfs_inobt_cur(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_agnumber_t agno, struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp); #endif /* __XFS_IALLOC_BTREE_H__ */ diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index eef307cf90a7..3ca1c454afe6 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -19,6 +19,7 @@ #include "xfs_trace.h" #include "xfs_icache.h" #include "xfs_health.h" +#include "xfs_iwalk.h" /* * Return stat information for one inode. @@ -161,7 +162,7 @@ xfs_bulkstat_one( * Loop over all clusters in a chunk for a given incore inode allocation btree * record. Do a readahead if there are any allocated inodes in that cluster. */ -STATIC void +void xfs_bulkstat_ichunk_ra( struct xfs_mount *mp, xfs_agnumber_t agno, @@ -195,7 +196,7 @@ xfs_bulkstat_ichunk_ra( * are some left allocated, update the data for the pointed-to record as well as * return the count of grabbed inodes. */ -STATIC int +int xfs_bulkstat_grab_ichunk( struct xfs_btree_cur *cur, /* btree cursor */ xfs_agino_t agino, /* starting inode of chunk */ diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 8a822285b671..369e3f159d4e 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -84,4 +84,12 @@ xfs_inumbers( void __user *buffer, /* buffer with inode info */ inumbers_fmt_pf formatter); +/* Temporarily needed while we refactor functions. */ +struct xfs_btree_cur; +struct xfs_inobt_rec_incore; +void xfs_bulkstat_ichunk_ra(struct xfs_mount *mp, xfs_agnumber_t agno, + struct xfs_inobt_rec_incore *irec); +int xfs_bulkstat_grab_ichunk(struct xfs_btree_cur *cur, xfs_agino_t agino, + int *icount, struct xfs_inobt_rec_incore *irec); + #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c new file mode 100644 index 000000000000..3e6c06e69c75 --- /dev/null +++ b/fs/xfs/xfs_iwalk.c @@ -0,0 +1,400 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_inode.h" +#include "xfs_btree.h" +#include "xfs_ialloc.h" +#include "xfs_ialloc_btree.h" +#include "xfs_itable.h" +#include "xfs_error.h" +#include "xfs_trace.h" +#include "xfs_icache.h" +#include "xfs_health.h" +#include "xfs_trans.h" +#include "xfs_iwalk.h" + +/* + * Walking Inodes in the Filesystem + * ================================ + * + * This iterator function walks a subset of filesystem inodes in increasing + * order from @startino until there are no more inodes. For each allocated + * inode it finds, it calls a walk function with the relevant inode number and + * a pointer to caller-provided data. The walk function can return the usual + * negative error code to stop the iteration; 0 to continue the iteration; or + * XFS_IWALK_ABORT to stop the iteration. This return value is returned to the + * caller. + * + * Internally, we allow the walk function to do anything, which means that we + * cannot maintain the inobt cursor or our lock on the AGI buffer. We + * therefore cache the inobt records in kernel memory and only call the walk + * function when our memory buffer is full. @nr_recs is the number of records + * that we've cached, and @sz_recs is the size of our cache. + * + * It is the responsibility of the walk function to ensure it accesses + * allocated inodes, as the inobt records may be stale by the time they are + * acted upon. + */ + +struct xfs_iwalk_ag { + struct xfs_mount *mp; + struct xfs_trans *tp; + + /* Where do we start the traversal? */ + xfs_ino_t startino; + + /* Array of inobt records we cache. */ + struct xfs_inobt_rec_incore *recs; + + /* Number of entries allocated for the @recs array. */ + unsigned int sz_recs; + + /* Number of entries in the @recs array that are in use. */ + unsigned int nr_recs; + + /* Inode walk function and data pointer. */ + xfs_iwalk_fn iwalk_fn; + void *data; +}; + +/* Allocate memory for a walk. */ +STATIC int +xfs_iwalk_alloc( + struct xfs_iwalk_ag *iwag) +{ + size_t size; + + ASSERT(iwag->recs == NULL); + iwag->nr_recs = 0; + + /* Allocate a prefetch buffer for inobt records. */ + size = iwag->sz_recs * sizeof(struct xfs_inobt_rec_incore); + iwag->recs = kmem_alloc(size, KM_MAYFAIL); + if (iwag->recs == NULL) + return -ENOMEM; + + return 0; +} + +/* Free memory we allocated for a walk. */ +STATIC void +xfs_iwalk_free( + struct xfs_iwalk_ag *iwag) +{ + kmem_free(iwag->recs); +} + +/* For each inuse inode in each cached inobt record, call our function. */ +STATIC int +xfs_iwalk_ag_recs( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + xfs_ino_t ino; + unsigned int i, j; + xfs_agnumber_t agno; + int error; + + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + for (i = 0; i < iwag->nr_recs; i++) { + struct xfs_inobt_rec_incore *irec = &iwag->recs[i]; + + trace_xfs_iwalk_ag_rec(mp, agno, irec); + + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { + /* Skip if this inode is free */ + if (XFS_INOBT_MASK(j) & irec->ir_free) + continue; + + /* Otherwise call our function. */ + ino = XFS_AGINO_TO_INO(mp, agno, irec->ir_startino + j); + error = iwag->iwalk_fn(mp, tp, ino, iwag->data); + if (error) + return error; + } + } + + return 0; +} + +/* Delete cursor and let go of AGI. */ +static inline void +xfs_iwalk_del_inobt( + struct xfs_trans *tp, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int error) +{ + if (*curpp) { + xfs_btree_del_cursor(*curpp, error); + *curpp = NULL; + } + if (*agi_bpp) { + xfs_trans_brelse(tp, *agi_bpp); + *agi_bpp = NULL; + } +} + +/* + * Set ourselves up for walking inobt records starting from a given point in + * the filesystem. + * + * If caller passed in a nonzero start inode number, load the record from the + * inobt and make the record look like all the inodes before agino are free so + * that we skip them, and then move the cursor to the next inobt record. This + * is how we support starting an iwalk in the middle of an inode chunk. + * + * If the caller passed in a start number of zero, move the cursor to the first + * inobt record. + * + * The caller is responsible for cleaning up the cursor and buffer pointer + * regardless of the error status. + */ +STATIC int +xfs_iwalk_ag_start( + struct xfs_iwalk_ag *iwag, + xfs_agnumber_t agno, + xfs_agino_t agino, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int *has_more) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + int icount; + int error; + + /* Set up a fresh cursor and empty the inobt cache. */ + iwag->nr_recs = 0; + error = xfs_inobt_cur(mp, tp, agno, curpp, agi_bpp); + if (error) + return error; + + /* Starting at the beginning of the AG? That's easy! */ + if (agino == 0) + return xfs_inobt_lookup(*curpp, 0, XFS_LOOKUP_GE, has_more); + + /* + * Otherwise, we have to grab the inobt record where we left off, stuff + * the record into our cache, and then see if there are more records. + * We require a lookup cache of at least two elements so that we don't + * have to deal with tearing down the cursor to walk the records. + */ + error = xfs_bulkstat_grab_ichunk(*curpp, agino - 1, &icount, + &iwag->recs[iwag->nr_recs]); + if (error) + return error; + if (icount) + iwag->nr_recs++; + + /* + * set_prefetch is supposed to give us a large enough inobt record + * cache that grab_ichunk can stage a partial first record and the loop + * body can cache a record without having to check for cache space + * until after it reads an inobt record. + */ + ASSERT(iwag->nr_recs < iwag->sz_recs); + + return xfs_btree_increment(*curpp, 0, has_more); +} + +typedef int (*xfs_iwalk_ag_recs_fn)(struct xfs_iwalk_ag *iwag); + +/* + * The inobt record cache is full, so preserve the inobt cursor state and + * run callbacks on the cached inobt records. When we're done, restore the + * cursor state to wherever the cursor would have been had the cache not been + * full (and therefore we could've just incremented the cursor). + */ +STATIC int +xfs_iwalk_run_callbacks( + struct xfs_iwalk_ag *iwag, + xfs_iwalk_ag_recs_fn walk_ag_recs_fn, + xfs_agnumber_t agno, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int *has_more) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_inobt_rec_incore *irec; + xfs_agino_t restart; + int error; + + ASSERT(iwag->nr_recs == iwag->sz_recs); + + /* Delete cursor but remember the last record we cached... */ + xfs_iwalk_del_inobt(tp, curpp, agi_bpp, 0); + irec = &iwag->recs[iwag->nr_recs - 1]; + restart = irec->ir_startino + XFS_INODES_PER_CHUNK - 1; + + error = walk_ag_recs_fn(iwag); + if (error) + return error; + + /* ...empty the cache... */ + iwag->nr_recs = 0; + + /* ...and recreate the cursor just past where we left off. */ + error = xfs_inobt_cur(mp, tp, agno, curpp, agi_bpp); + if (error) + return error; + + return xfs_inobt_lookup(*curpp, restart, XFS_LOOKUP_GE, has_more); +} + +/* Walk all inodes in a single AG, from @iwag->startino to the end of the AG. */ +STATIC int +xfs_iwalk_ag( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_buf *agi_bp = NULL; + struct xfs_btree_cur *cur = NULL; + xfs_agnumber_t agno; + xfs_agino_t agino; + int has_more; + int error = 0; + + /* Set up our cursor at the right place in the inode btree. */ + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + agino = XFS_INO_TO_AGINO(mp, iwag->startino); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more); + + while (!error && has_more) { + struct xfs_inobt_rec_incore *irec; + + cond_resched(); + + /* Fetch the inobt record. */ + irec = &iwag->recs[iwag->nr_recs]; + error = xfs_inobt_get_rec(cur, irec, &has_more); + if (error || !has_more) + break; + + /* No allocated inodes in this chunk; skip it. */ + if (irec->ir_freecount == irec->ir_count) { + error = xfs_btree_increment(cur, 0, &has_more); + if (error) + break; + continue; + } + + /* + * Start readahead for this inode chunk in anticipation of + * walking the inodes. + */ + xfs_bulkstat_ichunk_ra(mp, agno, irec); + + /* + * If there's space in the buffer for more records, increment + * the btree cursor and grab more. + */ + if (++iwag->nr_recs < iwag->sz_recs) { + error = xfs_btree_increment(cur, 0, &has_more); + if (error || !has_more) + break; + continue; + } + + /* + * Otherwise, we need to save cursor state and run the callback + * function on the cached records. The run_callbacks function + * is supposed to return a cursor pointing to the record where + * we would be if we had been able to increment like above. + */ + error = xfs_iwalk_run_callbacks(iwag, xfs_iwalk_ag_recs, agno, + &cur, &agi_bp, &has_more); + } + + xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); + + /* Walk any records left behind in the cache. */ + if (iwag->nr_recs == 0 || error) + return error; + + return xfs_iwalk_ag_recs(iwag); +} + +/* + * Given the number of inodes to prefetch, set the number of inobt records that + * we cache in memory, which controls the number of inodes we try to read + * ahead. + * + * If no max prefetch was given, default to 4096 bytes' worth of inobt records; + * this should be plenty of inodes to read ahead. This number (256 inobt + * records) was chosen so that the cache is never more than a single memory + * page. + */ +static inline void +xfs_iwalk_set_prefetch( + struct xfs_iwalk_ag *iwag, + unsigned int max_prefetch) +{ + if (max_prefetch) + iwag->sz_recs = round_up(max_prefetch, XFS_INODES_PER_CHUNK) / + XFS_INODES_PER_CHUNK; + else + iwag->sz_recs = 4096 / sizeof(struct xfs_inobt_rec_incore); + + /* + * Allocate enough space to prefetch at least two records so that we + * can cache both the inobt record where the iwalk started and the next + * record. This simplifies the AG inode walk loop setup code. + */ + iwag->sz_recs = max_t(unsigned int, iwag->sz_recs, 2); +} + +/* + * Walk all inodes in the filesystem starting from @startino. The @iwalk_fn + * will be called for each allocated inode, being passed the inode's number and + * @data. @max_prefetch controls how many inobt records' worth of inodes we + * try to readahead. + */ +int +xfs_iwalk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_iwalk_ag iwag = { + .mp = mp, + .tp = tp, + .iwalk_fn = iwalk_fn, + .data = data, + .startino = startino, + }; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + xfs_iwalk_set_prefetch(&iwag, max_prefetch); + error = xfs_iwalk_alloc(&iwag); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + error = xfs_iwalk_ag(&iwag); + if (error) + break; + iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + xfs_iwalk_free(&iwag); + return error; +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h new file mode 100644 index 000000000000..45b1baabcd2d --- /dev/null +++ b/fs/xfs/xfs_iwalk.h @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_IWALK_H__ +#define __XFS_IWALK_H__ + +/* Walk all inodes in the filesystem starting from @startino. */ +typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_ino_t ino, void *data); +/* Return value (for xfs_iwalk_fn) that aborts the walk immediately. */ +#define XFS_IWALK_ABORT (1) + +int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); + +#endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 2464ea351f83..f9bb1d50bc0e 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3516,6 +3516,46 @@ DEFINE_EVENT(xfs_inode_corrupt_class, name, \ DEFINE_INODE_CORRUPT_EVENT(xfs_inode_mark_sick); DEFINE_INODE_CORRUPT_EVENT(xfs_inode_mark_healthy); +TRACE_EVENT(xfs_iwalk_ag, + TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, + xfs_agino_t startino), + TP_ARGS(mp, agno, startino), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_agnumber_t, agno) + __field(xfs_agino_t, startino) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->agno = agno; + __entry->startino = startino; + ), + TP_printk("dev %d:%d agno %d startino %u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->agno, + __entry->startino) +) + +TRACE_EVENT(xfs_iwalk_ag_rec, + TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, + struct xfs_inobt_rec_incore *irec), + TP_ARGS(mp, agno, irec), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_agnumber_t, agno) + __field(xfs_agino_t, startino) + __field(uint64_t, freemask) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->agno = agno; + __entry->startino = irec->ir_startino; + __entry->freemask = irec->ir_free; + ), + TP_printk("dev %d:%d agno %d startino %u freemask 0x%llx", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->agno, + __entry->startino, __entry->freemask) +) + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH From patchwork Tue Jun 4 21:49:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976167 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 27BAE14E5 for ; Tue, 4 Jun 2019 21:49:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1390C205C0 for ; Tue, 4 Jun 2019 21:49:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 06770287A5; Tue, 4 Jun 2019 21:49:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 849B8205C0 for ; Tue, 4 Jun 2019 21:49:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726527AbfFDVty (ORCPT ); Tue, 4 Jun 2019 17:49:54 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:58436 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726490AbfFDVtx (ORCPT ); Tue, 4 Jun 2019 17:49:53 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnaSM053362; Tue, 4 Jun 2019 21:49:44 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=1b7diM5WfN04GNY/lXdXD0MvTPr0mMD5odIP7dz4HGY=; b=Cq0qYJdxtbQZ+UsyGqXDcOtIEMn7gopcySC34MTFD1fuXPe4bf2OrEqMM8UkZ4yhjEBp MlDUIB5FY+Mxk7SAIttbPkK/GvvggXco7xgJNvr2X+prxUwr69BoIXZeH8zb/oEB9fH+ LcfgV0sAfi4TH5URc8QXwPfU6P2BXK936h8mruvjwswGQKd9ZTNPu2fCoMThdQNkEYDQ GVSS6VBnaBSEd0daXSuX4361c9Q7+ZAFcjgjn2cpYdVqsD8UgnqldZvvejLr3fgahiXI LOlTUxjYBePnM+bXzaBE6Lsj04B0Hgm8BpPwNK6EXu8lZbNuaN5+kVhoQ7IK5uqjxZCH Hw== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2130.oracle.com with ESMTP id 2sugstfp7e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 04 Jun 2019 21:49:44 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lluqi041491; Tue, 4 Jun 2019 21:49:43 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserp3020.oracle.com with ESMTP id 2swnghkjg2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 04 Jun 2019 21:49:43 +0000 Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x54Lngno024788; Tue, 4 Jun 2019 21:49:42 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:49:41 -0700 Subject: [PATCH 02/10] xfs: convert quotacheck to use the new iwalk functions From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, Dave Chinner Date: Tue, 04 Jun 2019 14:49:40 -0700 Message-ID: <155968498085.1657646.3518168545540841602.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Convert quotacheck to use the new iwalk iterator to dig through the inodes. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Brian Foster --- fs/xfs/xfs_qm.c | 62 ++++++++++++++++++------------------------------------- 1 file changed, 20 insertions(+), 42 deletions(-) diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index aa6b6db3db0e..a5b2260406a8 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -26,6 +26,7 @@ #include "xfs_trace.h" #include "xfs_icache.h" #include "xfs_cksum.h" +#include "xfs_iwalk.h" /* * The global quota manager. There is only one of these for the entire @@ -1118,17 +1119,15 @@ xfs_qm_quotacheck_dqadjust( /* ARGSUSED */ STATIC int xfs_qm_dqusage_adjust( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* not used */ - int ubsize, /* not used */ - int *ubused, /* not used */ - int *res) /* result code value */ + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { - xfs_inode_t *ip; - xfs_qcnt_t nblks; - xfs_filblks_t rtblks = 0; /* total rt blks */ - int error; + struct xfs_inode *ip; + xfs_qcnt_t nblks; + xfs_filblks_t rtblks = 0; /* total rt blks */ + int error; ASSERT(XFS_IS_QUOTA_RUNNING(mp)); @@ -1136,20 +1135,18 @@ xfs_qm_dqusage_adjust( * rootino must have its resources accounted for, not so with the quota * inodes. */ - if (xfs_is_quota_inode(&mp->m_sb, ino)) { - *res = BULKSTAT_RV_NOTHING; - return -EINVAL; - } + if (xfs_is_quota_inode(&mp->m_sb, ino)) + return 0; /* * We don't _need_ to take the ilock EXCL here because quotacheck runs * at mount time and therefore nobody will be racing chown/chproj. */ - error = xfs_iget(mp, NULL, ino, XFS_IGET_DONTCACHE, 0, &ip); - if (error) { - *res = BULKSTAT_RV_NOTHING; + error = xfs_iget(mp, tp, ino, XFS_IGET_DONTCACHE, 0, &ip); + if (error == -EINVAL || error == -ENOENT) + return 0; + if (error) return error; - } ASSERT(ip->i_delayed_blks == 0); @@ -1157,7 +1154,7 @@ xfs_qm_dqusage_adjust( struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK); if (!(ifp->if_flags & XFS_IFEXTENTS)) { - error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK); + error = xfs_iread_extents(tp, ip, XFS_DATA_FORK); if (error) goto error0; } @@ -1200,13 +1197,8 @@ xfs_qm_dqusage_adjust( goto error0; } - xfs_irele(ip); - *res = BULKSTAT_RV_DIDONE; - return 0; - error0: xfs_irele(ip); - *res = BULKSTAT_RV_GIVEUP; return error; } @@ -1270,18 +1262,13 @@ STATIC int xfs_qm_quotacheck( xfs_mount_t *mp) { - int done, count, error, error2; - xfs_ino_t lastino; - size_t structsz; + int error, error2; uint flags; LIST_HEAD (buffer_list); struct xfs_inode *uip = mp->m_quotainfo->qi_uquotaip; struct xfs_inode *gip = mp->m_quotainfo->qi_gquotaip; struct xfs_inode *pip = mp->m_quotainfo->qi_pquotaip; - count = INT_MAX; - structsz = 1; - lastino = 0; flags = 0; ASSERT(uip || gip || pip); @@ -1318,18 +1305,9 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - do { - /* - * Iterate thru all the inodes in the file system, - * adjusting the corresponding dquot counters in core. - */ - error = xfs_bulkstat(mp, &lastino, &count, - xfs_qm_dqusage_adjust, - structsz, NULL, &done); - if (error) - break; - - } while (!done); + error = xfs_iwalk(mp, NULL, 0, xfs_qm_dqusage_adjust, 0, NULL); + if (error) + goto error_return; /* * We've made all the changes that we need to make incore. Flush them From patchwork Tue Jun 4 21:49:47 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976165 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C775E14E5 for ; Tue, 4 Jun 2019 21:49:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B5387205C0 for ; Tue, 4 Jun 2019 21:49:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A991A287A5; Tue, 4 Jun 2019 21:49:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2681228704 for ; Tue, 4 Jun 2019 21:49:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726488AbfFDVtu (ORCPT ); Tue, 4 Jun 2019 17:49:50 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:36800 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726293AbfFDVtu (ORCPT ); Tue, 4 Jun 2019 17:49:50 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lndhd060600 for ; Tue, 4 Jun 2019 21:49:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=GWYDnFE03RVp43hatnjw4oNZGgYS4y4yhfqhC20MhCI=; b=ekS0q0AAmmMF1FlbvPaRrC6QDkNEVlMyPi1LJ7Z/ElzhzLaOWXjS0nK1mnf4/KufCGt5 mdbKT87yXEvDY8/K6QCkl+ShtwTHAJ9VSupq2aoH8O8/tkhQakTb3i6rAe6ylfyVFWL/ K2RxJBxmtQrS+23KEa+mSK020e/uynDdH8gR2S1ViPGbCL3m2gWqc+nktkX9HM1YSQvP uHGpohe1lndyiqjfFgh/6Qy/0enqrgPWa68YYEDP57IxW2fm09/t5n1S8Q6LEIvM7XHd HQSyg8fXK1LsEPkEOXK9rUvc3Pbia68NQRil74Yu+peU4VHFJl7EhNom9FhqTt39OVYn ZA== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2120.oracle.com with ESMTP id 2suj0qfjua-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:49:49 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LmpPG172125 for ; Tue, 4 Jun 2019 21:49:48 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3030.oracle.com with ESMTP id 2swngkkhwe-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:49:48 +0000 Received: from abhmp0011.oracle.com (abhmp0011.oracle.com [141.146.116.17]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x54LnmW6024903 for ; Tue, 4 Jun 2019 21:49:48 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:49:48 -0700 Subject: [PATCH 03/10] xfs: bulkstat should copy lastip whenever userspace supplies one From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:49:47 -0700 Message-ID: <155968498711.1657646.16552958514650953190.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=904 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=936 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong When userspace passes in a @lastip pointer we should copy the results back, even if the @ocount pointer is NULL. Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster --- fs/xfs/xfs_ioctl.c | 13 ++++++------- fs/xfs/xfs_ioctl32.c | 13 ++++++------- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index d7dfc13f30f5..5ffbdcff3dba 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -768,14 +768,13 @@ xfs_ioc_bulkstat( if (error) return error; - if (bulkreq.ocount != NULL) { - if (copy_to_user(bulkreq.lastip, &inlast, - sizeof(xfs_ino_t))) - return -EFAULT; + if (bulkreq.lastip != NULL && + copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + return -EFAULT; - if (copy_to_user(bulkreq.ocount, &count, sizeof(count))) - return -EFAULT; - } + if (bulkreq.ocount != NULL && + copy_to_user(bulkreq.ocount, &count, sizeof(count))) + return -EFAULT; return 0; } diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index 614fc6886d24..814ffe6fbab7 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -310,14 +310,13 @@ xfs_compat_ioc_bulkstat( if (error) return error; - if (bulkreq.ocount != NULL) { - if (copy_to_user(bulkreq.lastip, &inlast, - sizeof(xfs_ino_t))) - return -EFAULT; + if (bulkreq.lastip != NULL && + copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + return -EFAULT; - if (copy_to_user(bulkreq.ocount, &count, sizeof(count))) - return -EFAULT; - } + if (bulkreq.ocount != NULL && + copy_to_user(bulkreq.ocount, &count, sizeof(count))) + return -EFAULT; return 0; } From patchwork Tue Jun 4 21:49:53 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976169 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 417B714E5 for ; Tue, 4 Jun 2019 21:50:02 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 270C5205C0 for ; Tue, 4 Jun 2019 21:50:02 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 17998287A5; Tue, 4 Jun 2019 21:50:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 81B9F205C0 for ; Tue, 4 Jun 2019 21:50:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726490AbfFDVuA (ORCPT ); Tue, 4 Jun 2019 17:50:00 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:36920 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726293AbfFDVt7 (ORCPT ); Tue, 4 Jun 2019 17:49:59 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnQaq059836 for ; Tue, 4 Jun 2019 21:49:56 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=WDBOhUbx59TB1peKdjMd3t6IkFu8XwZ9szwHyxH+qfs=; b=Ii6dZHdzs+8CDK0yLjx+YgOf0JVFYoIz3vNPslhkXPq6oxiLig79VCauosvwY5CfZ3fv 6urZC93+FMIm3/TT++9gGtKWt2z2S9IQgkztFfpIBtwl9Udeka+0L1SnzcMdmUWLr6NQ 3mushJM50eBSpMsUB+qMNnGSuzNBaZamLLMoGpMDVOJVee5kQmpGkAhlIlmxake1V1jv cDSyv6RqY7L0txqybh+L1gztUnstBHSkLN1uIcCYfWh1a9KUbUiqeOVOw+v4OI8OCsqL 6SvXZda5js4kJUX6anS18+65erIMW9OOUfcM6GXFul7PMUde3GC6mxipfscCpgRNNOlR Rg== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by userp2120.oracle.com with ESMTP id 2suj0qfjuq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:49:55 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lma5Y009273 for ; Tue, 4 Jun 2019 21:49:55 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3030.oracle.com with ESMTP id 2swnhbug5f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:49:55 +0000 Received: from abhmp0018.oracle.com (abhmp0018.oracle.com [141.146.116.24]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x54LnsSx008907 for ; Tue, 4 Jun 2019 21:49:54 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:49:54 -0700 Subject: [PATCH 04/10] xfs: convert bulkstat to new iwalk infrastructure From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:49:53 -0700 Message-ID: <155968499323.1657646.9567491795149169699.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a new ibulk structure incore to help us deal with bulk inode stat state tracking and then convert the bulkstat code to use the new iwalk iterator. This disentangles inode walking from bulk stat control for simpler code and enables us to isolate the formatter functions to the ioctl handling code. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_ioctl.c | 65 ++++++-- fs/xfs/xfs_ioctl.h | 5 + fs/xfs/xfs_ioctl32.c | 88 +++++------ fs/xfs/xfs_itable.c | 407 ++++++++++++++------------------------------------ fs/xfs/xfs_itable.h | 79 ++++------ 5 files changed, 245 insertions(+), 399 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 5ffbdcff3dba..43734901aeb9 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -721,16 +721,28 @@ xfs_ioc_space( return error; } +/* Return 0 on success or positive error */ +int +xfs_bulkstat_one_fmt( + struct xfs_ibulk *breq, + const struct xfs_bstat *bstat) +{ + if (copy_to_user(breq->ubuffer, bstat, sizeof(*bstat))) + return -EFAULT; + return xfs_ibulk_advance(breq, sizeof(struct xfs_bstat)); +} + STATIC int xfs_ioc_bulkstat( xfs_mount_t *mp, unsigned int cmd, void __user *arg) { - xfs_fsop_bulkreq_t bulkreq; - int count; /* # of records returned */ - xfs_ino_t inlast; /* last inode number */ - int done; + struct xfs_fsop_bulkreq bulkreq; + struct xfs_ibulk breq = { + .mp = mp, + }; + xfs_ino_t lastino; int error; /* done = 1 if there are more stats to get and if bulkstat */ @@ -745,35 +757,54 @@ xfs_ioc_bulkstat( if (copy_from_user(&bulkreq, arg, sizeof(xfs_fsop_bulkreq_t))) return -EFAULT; - if (copy_from_user(&inlast, bulkreq.lastip, sizeof(__s64))) + if (copy_from_user(&lastino, bulkreq.lastip, sizeof(__s64))) return -EFAULT; - if ((count = bulkreq.icount) <= 0) + if (bulkreq.icount <= 0) return -EINVAL; if (bulkreq.ubuffer == NULL) return -EINVAL; - if (cmd == XFS_IOC_FSINUMBERS) - error = xfs_inumbers(mp, &inlast, &count, + breq.ubuffer = bulkreq.ubuffer; + breq.icount = bulkreq.icount; + + /* + * FSBULKSTAT_SINGLE expects that *lastip contains the inode number + * that we want to stat. However, FSINUMBERS and FSBULKSTAT expect + * that *lastip contains either zero or the number of the last inode to + * be examined by the previous call and return results starting with + * the next inode after that. The new bulk request functions take the + * inode to start with, so we have to adjust the lastino/startino + * parameter to maintain correct function. + */ + if (cmd == XFS_IOC_FSINUMBERS) { + int count = breq.icount; + + breq.startino = lastino; + error = xfs_inumbers(mp, &breq.startino, &count, bulkreq.ubuffer, xfs_inumbers_fmt); - else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) - error = xfs_bulkstat_one(mp, inlast, bulkreq.ubuffer, - sizeof(xfs_bstat_t), NULL, &done); - else /* XFS_IOC_FSBULKSTAT */ - error = xfs_bulkstat(mp, &inlast, &count, xfs_bulkstat_one, - sizeof(xfs_bstat_t), bulkreq.ubuffer, - &done); + breq.ocount = count; + lastino = breq.startino; + } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) { + breq.startino = lastino; + error = xfs_bulkstat_one(&breq, xfs_bulkstat_one_fmt); + lastino = breq.startino; + } else { /* XFS_IOC_FSBULKSTAT */ + breq.startino = lastino + 1; + error = xfs_bulkstat(&breq, xfs_bulkstat_one_fmt); + lastino = breq.startino - 1; + } if (error) return error; if (bulkreq.lastip != NULL && - copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + copy_to_user(bulkreq.lastip, &lastino, sizeof(xfs_ino_t))) return -EFAULT; if (bulkreq.ocount != NULL && - copy_to_user(bulkreq.ocount, &count, sizeof(count))) + copy_to_user(bulkreq.ocount, &breq.ocount, sizeof(__s32))) return -EFAULT; return 0; diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h index 4b17f67c888a..f32c8aadfeba 100644 --- a/fs/xfs/xfs_ioctl.h +++ b/fs/xfs/xfs_ioctl.h @@ -77,4 +77,9 @@ xfs_set_dmattrs( uint evmask, uint16_t state); +struct xfs_ibulk; +struct xfs_bstat; + +int xfs_bulkstat_one_fmt(struct xfs_ibulk *breq, const struct xfs_bstat *bstat); + #endif diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index 814ffe6fbab7..add15819daf3 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -172,15 +172,10 @@ xfs_bstime_store_compat( /* Return 0 on success or positive error (to xfs_bulkstat()) */ STATIC int xfs_bulkstat_one_fmt_compat( - void __user *ubuffer, - int ubsize, - int *ubused, - const xfs_bstat_t *buffer) + struct xfs_ibulk *breq, + const struct xfs_bstat *buffer) { - compat_xfs_bstat_t __user *p32 = ubuffer; - - if (ubsize < sizeof(*p32)) - return -ENOMEM; + struct compat_xfs_bstat __user *p32 = breq->ubuffer; if (put_user(buffer->bs_ino, &p32->bs_ino) || put_user(buffer->bs_mode, &p32->bs_mode) || @@ -205,23 +200,8 @@ xfs_bulkstat_one_fmt_compat( put_user(buffer->bs_dmstate, &p32->bs_dmstate) || put_user(buffer->bs_aextents, &p32->bs_aextents)) return -EFAULT; - if (ubused) - *ubused = sizeof(*p32); - return 0; -} -STATIC int -xfs_bulkstat_one_compat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... */ -{ - return xfs_bulkstat_one_int(mp, ino, buffer, ubsize, - xfs_bulkstat_one_fmt_compat, - ubused, stat); + return xfs_ibulk_advance(breq, sizeof(struct compat_xfs_bstat)); } /* copied from xfs_ioctl.c */ @@ -232,10 +212,11 @@ xfs_compat_ioc_bulkstat( compat_xfs_fsop_bulkreq_t __user *p32) { u32 addr; - xfs_fsop_bulkreq_t bulkreq; - int count; /* # of records returned */ - xfs_ino_t inlast; /* last inode number */ - int done; + struct xfs_fsop_bulkreq bulkreq; + struct xfs_ibulk breq = { + .mp = mp, + }; + xfs_ino_t lastino; int error; /* @@ -245,8 +226,7 @@ xfs_compat_ioc_bulkstat( * functions and structure size are the correct ones to use ... */ inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; - bulkstat_one_pf bs_one_func = xfs_bulkstat_one_compat; - size_t bs_one_size = sizeof(struct compat_xfs_bstat); + bulkstat_one_fmt_pf bs_one_func = xfs_bulkstat_one_fmt_compat; #ifdef CONFIG_X86_X32 if (in_x32_syscall()) { @@ -259,8 +239,7 @@ xfs_compat_ioc_bulkstat( * x32 userspace expects. */ inumbers_func = xfs_inumbers_fmt; - bs_one_func = xfs_bulkstat_one; - bs_one_size = sizeof(struct xfs_bstat); + bs_one_func = xfs_bulkstat_one_fmt; } #endif @@ -284,38 +263,57 @@ xfs_compat_ioc_bulkstat( return -EFAULT; bulkreq.ocount = compat_ptr(addr); - if (copy_from_user(&inlast, bulkreq.lastip, sizeof(__s64))) + if (copy_from_user(&lastino, bulkreq.lastip, sizeof(__s64))) return -EFAULT; + breq.startino = lastino + 1; - if ((count = bulkreq.icount) <= 0) + if (bulkreq.icount <= 0) return -EINVAL; if (bulkreq.ubuffer == NULL) return -EINVAL; + breq.ubuffer = bulkreq.ubuffer; + breq.icount = bulkreq.icount; + + /* + * FSBULKSTAT_SINGLE expects that *lastip contains the inode number + * that we want to stat. However, FSINUMBERS and FSBULKSTAT expect + * that *lastip contains either zero or the number of the last inode to + * be examined by the previous call and return results starting with + * the next inode after that. The new bulk request functions take the + * inode to start with, so we have to adjust the lastino/startino + * parameter to maintain correct function. + */ if (cmd == XFS_IOC_FSINUMBERS_32) { - error = xfs_inumbers(mp, &inlast, &count, + int count = breq.icount; + + breq.startino = lastino; + error = xfs_inumbers(mp, &breq.startino, &count, bulkreq.ubuffer, inumbers_func); + breq.ocount = count; + lastino = breq.startino; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE_32) { - int res; - - error = bs_one_func(mp, inlast, bulkreq.ubuffer, - bs_one_size, NULL, &res); + breq.startino = lastino; + error = xfs_bulkstat_one(&breq, bs_one_func); + lastino = breq.startino; } else if (cmd == XFS_IOC_FSBULKSTAT_32) { - error = xfs_bulkstat(mp, &inlast, &count, - bs_one_func, bs_one_size, - bulkreq.ubuffer, &done); - } else + breq.startino = lastino + 1; + error = xfs_bulkstat(&breq, bs_one_func); + lastino = breq.startino - 1; + } else { error = -EINVAL; + } if (error) return error; + lastino = breq.startino - 1; if (bulkreq.lastip != NULL && - copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + copy_to_user(bulkreq.lastip, &lastino, sizeof(xfs_ino_t))) return -EFAULT; if (bulkreq.ocount != NULL && - copy_to_user(bulkreq.ocount, &count, sizeof(count))) + copy_to_user(bulkreq.ocount, &breq.ocount, sizeof(__s32))) return -EFAULT; return 0; diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 3ca1c454afe6..87c597ea1df7 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -22,37 +22,63 @@ #include "xfs_iwalk.h" /* - * Return stat information for one inode. - * Return 0 if ok, else errno. + * Bulk Stat + * ========= + * + * Use the inode walking functions to fill out struct xfs_bstat for every + * allocated inode, then pass the stat information to some externally provided + * iteration function. */ -int + +struct xfs_bstat_chunk { + bulkstat_one_fmt_pf formatter; + struct xfs_ibulk *breq; +}; + +/* + * Fill out the bulkstat info for a single inode and report it somewhere. + * + * bc->breq->lastino is effectively the inode cursor as we walk through the + * filesystem. Therefore, we update it any time we need to move the cursor + * forward, regardless of whether or not we're sending any bstat information + * back to userspace. If the inode is internal metadata or, has been freed + * out from under us, we just simply keep going. + * + * However, if any other type of error happens we want to stop right where we + * are so that userspace will call back with exact number of the bad inode and + * we can send back an error code. + * + * Note that if the formatter tells us there's no space left in the buffer we + * move the cursor forward and abort the walk. + */ +STATIC int xfs_bulkstat_one_int( - struct xfs_mount *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - bulkstat_one_fmt_pf formatter, /* formatter, copy to user */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... */ + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { + struct xfs_bstat_chunk *bc = data; struct xfs_icdinode *dic; /* dinode core info pointer */ struct xfs_inode *ip; /* incore inode pointer */ struct inode *inode; struct xfs_bstat *buf; /* return buffer */ int error = 0; /* error value */ - *stat = BULKSTAT_RV_NOTHING; - - if (!buffer || xfs_internal_inum(mp, ino)) + if (xfs_internal_inum(mp, ino)) { + bc->breq->startino = ino + 1; return -EINVAL; + } buf = kmem_zalloc(sizeof(*buf), KM_SLEEP | KM_MAYFAIL); if (!buf) return -ENOMEM; - error = xfs_iget(mp, NULL, ino, + error = xfs_iget(mp, tp, ino, (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED), XFS_ILOCK_SHARED, &ip); + if (error == -ENOENT || error == -EINVAL) + bc->breq->startino = ino + 1; if (error) goto out_free; @@ -119,43 +145,45 @@ xfs_bulkstat_one_int( xfs_iunlock(ip, XFS_ILOCK_SHARED); xfs_irele(ip); - error = formatter(buffer, ubsize, ubused, buf); - if (!error) - *stat = BULKSTAT_RV_DIDONE; - - out_free: + error = bc->formatter(bc->breq, buf); + switch (error) { + case XFS_IBULK_BUFFER_FULL: + error = XFS_IWALK_ABORT; + /* fall through */ + case 0: + bc->breq->startino = ino + 1; + break; + } +out_free: kmem_free(buf); return error; } -/* Return 0 on success or positive error */ -STATIC int -xfs_bulkstat_one_fmt( - void __user *ubuffer, - int ubsize, - int *ubused, - const xfs_bstat_t *buffer) -{ - if (ubsize < sizeof(*buffer)) - return -ENOMEM; - if (copy_to_user(ubuffer, buffer, sizeof(*buffer))) - return -EFAULT; - if (ubused) - *ubused = sizeof(*buffer); - return 0; -} - +/* Bulkstat a single inode. */ int xfs_bulkstat_one( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... */ + struct xfs_ibulk *breq, + bulkstat_one_fmt_pf formatter) { - return xfs_bulkstat_one_int(mp, ino, buffer, ubsize, - xfs_bulkstat_one_fmt, ubused, stat); + struct xfs_bstat_chunk bc = { + .formatter = formatter, + .breq = breq, + }; + int error; + + breq->icount = 1; + breq->ocount = 0; + + error = xfs_bulkstat_one_int(breq->mp, NULL, breq->startino, &bc); + + /* + * If we reported one inode to userspace then we abort because we hit + * the end of the buffer. Don't leak that back to userspace. + */ + if (error == XFS_IWALK_ABORT) + error = 0; + + return error; } /* @@ -251,256 +279,65 @@ xfs_bulkstat_grab_ichunk( #define XFS_BULKSTAT_UBLEFT(ubleft) ((ubleft) >= statstruct_size) -struct xfs_bulkstat_agichunk { - char __user **ac_ubuffer;/* pointer into user's buffer */ - int ac_ubleft; /* bytes left in user's buffer */ - int ac_ubelem; /* spaces used in user's buffer */ -}; - -/* - * Process inodes in chunk with a pointer to a formatter function - * that will iget the inode and fill in the appropriate structure. - */ static int -xfs_bulkstat_ag_ichunk( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irbp, - bulkstat_one_pf formatter, - size_t statstruct_size, - struct xfs_bulkstat_agichunk *acp, - xfs_agino_t *last_agino) +xfs_bulkstat_iwalk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { - char __user **ubufp = acp->ac_ubuffer; - int chunkidx; - int error = 0; - xfs_agino_t agino = irbp->ir_startino; - - for (chunkidx = 0; chunkidx < XFS_INODES_PER_CHUNK; - chunkidx++, agino++) { - int fmterror; - int ubused; - - /* inode won't fit in buffer, we are done */ - if (acp->ac_ubleft < statstruct_size) - break; - - /* Skip if this inode is free */ - if (XFS_INOBT_MASK(chunkidx) & irbp->ir_free) - continue; - - /* Get the inode and fill in a single buffer */ - ubused = statstruct_size; - error = formatter(mp, XFS_AGINO_TO_INO(mp, agno, agino), - *ubufp, acp->ac_ubleft, &ubused, &fmterror); - - if (fmterror == BULKSTAT_RV_GIVEUP || - (error && error != -ENOENT && error != -EINVAL)) { - acp->ac_ubleft = 0; - ASSERT(error); - break; - } - - /* be careful not to leak error if at end of chunk */ - if (fmterror == BULKSTAT_RV_NOTHING || error) { - error = 0; - continue; - } - - *ubufp += ubused; - acp->ac_ubleft -= ubused; - acp->ac_ubelem++; - } - - /* - * Post-update *last_agino. At this point, agino will always point one - * inode past the last inode we processed successfully. Hence we - * substract that inode when setting the *last_agino cursor so that we - * return the correct cookie to userspace. On the next bulkstat call, - * the inode under the lastino cookie will be skipped as we have already - * processed it here. - */ - *last_agino = agino - 1; + int error; + error = xfs_bulkstat_one_int(mp, tp, ino, data); + /* bulkstat just skips over missing inodes */ + if (error == -ENOENT || error == -EINVAL) + return 0; return error; } /* - * Return stat information in bulk (by-inode) for the filesystem. + * Check the incoming lastino parameter. + * + * We allow any inode value that could map to physical space inside the + * filesystem because if there are no inodes there, bulkstat moves on to the + * next chunk. In other words, the magic agino value of zero takes us to the + * first chunk in the AG, and an agino value past the end of the AG takes us to + * the first chunk in the next AG. + * + * Therefore we can end early if the requested inode is beyond the end of the + * filesystem or doesn't map properly. */ -int /* error status */ -xfs_bulkstat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *lastinop, /* last inode returned */ - int *ubcountp, /* size of buffer/count returned */ - bulkstat_one_pf formatter, /* func that'd fill a single buf */ - size_t statstruct_size, /* sizeof struct filling */ - char __user *ubuffer, /* buffer with inode stats */ - int *done) /* 1 if there are more stats to get */ +static inline bool +xfs_bulkstat_already_done( + struct xfs_mount *mp, + xfs_ino_t startino) { - xfs_buf_t *agbp; /* agi header buffer */ - xfs_agino_t agino; /* inode # in allocation group */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_btree_cur_t *cur; /* btree cursor for ialloc btree */ - xfs_inobt_rec_incore_t *irbuf; /* start of irec buffer */ - int nirbuf; /* size of irbuf */ - int ubcount; /* size of user's buffer */ - struct xfs_bulkstat_agichunk ac; - int error = 0; - - /* - * Get the last inode value, see if there's nothing to do. - */ - agno = XFS_INO_TO_AGNO(mp, *lastinop); - agino = XFS_INO_TO_AGINO(mp, *lastinop); - if (agno >= mp->m_sb.sb_agcount || - *lastinop != XFS_AGINO_TO_INO(mp, agno, agino)) { - *done = 1; - *ubcountp = 0; - return 0; - } - - ubcount = *ubcountp; /* statstruct's */ - ac.ac_ubuffer = &ubuffer; - ac.ac_ubleft = ubcount * statstruct_size; /* bytes */; - ac.ac_ubelem = 0; - - *ubcountp = 0; - *done = 0; - - irbuf = kmem_zalloc_large(PAGE_SIZE * 4, KM_SLEEP); - if (!irbuf) - return -ENOMEM; - nirbuf = (PAGE_SIZE * 4) / sizeof(*irbuf); - - /* - * Loop over the allocation groups, starting from the last - * inode returned; 0 means start of the allocation group. - */ - while (agno < mp->m_sb.sb_agcount) { - struct xfs_inobt_rec_incore *irbp = irbuf; - struct xfs_inobt_rec_incore *irbufend = irbuf + nirbuf; - bool end_of_ag = false; - int icount = 0; - int stat; - - error = xfs_ialloc_read_agi(mp, NULL, agno, &agbp); - if (error) - break; - /* - * Allocate and initialize a btree cursor for ialloc btree. - */ - cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno, - XFS_BTNUM_INO); - if (agino > 0) { - /* - * In the middle of an allocation group, we need to get - * the remainder of the chunk we're in. - */ - struct xfs_inobt_rec_incore r; - - error = xfs_bulkstat_grab_ichunk(cur, agino, &icount, &r); - if (error) - goto del_cursor; - if (icount) { - irbp->ir_startino = r.ir_startino; - irbp->ir_holemask = r.ir_holemask; - irbp->ir_count = r.ir_count; - irbp->ir_freecount = r.ir_freecount; - irbp->ir_free = r.ir_free; - irbp++; - } - /* Increment to the next record */ - error = xfs_btree_increment(cur, 0, &stat); - } else { - /* Start of ag. Lookup the first inode chunk */ - error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE, &stat); - } - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + xfs_agino_t agino = XFS_INO_TO_AGINO(mp, startino); - /* - * Loop through inode btree records in this ag, - * until we run out of inodes or space in the buffer. - */ - while (irbp < irbufend && icount < ubcount) { - struct xfs_inobt_rec_incore r; - - error = xfs_inobt_get_rec(cur, &r, &stat); - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } - - /* - * If this chunk has any allocated inodes, save it. - * Also start read-ahead now for this chunk. - */ - if (r.ir_freecount < r.ir_count) { - xfs_bulkstat_ichunk_ra(mp, agno, &r); - irbp->ir_startino = r.ir_startino; - irbp->ir_holemask = r.ir_holemask; - irbp->ir_count = r.ir_count; - irbp->ir_freecount = r.ir_freecount; - irbp->ir_free = r.ir_free; - irbp++; - icount += r.ir_count - r.ir_freecount; - } - error = xfs_btree_increment(cur, 0, &stat); - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } - cond_resched(); - } + return agno >= mp->m_sb.sb_agcount || + startino != XFS_AGINO_TO_INO(mp, agno, agino); +} - /* - * Drop the btree buffers and the agi buffer as we can't hold any - * of the locks these represent when calling iget. If there is a - * pending error, then we are done. - */ -del_cursor: - xfs_btree_del_cursor(cur, error); - xfs_buf_relse(agbp); - if (error) - break; - /* - * Now format all the good inodes into the user's buffer. The - * call to xfs_bulkstat_ag_ichunk() sets up the agino pointer - * for the next loop iteration. - */ - irbufend = irbp; - for (irbp = irbuf; - irbp < irbufend && ac.ac_ubleft >= statstruct_size; - irbp++) { - error = xfs_bulkstat_ag_ichunk(mp, agno, irbp, - formatter, statstruct_size, &ac, - &agino); - if (error) - break; +/* Return stat information in bulk (by-inode) for the filesystem. */ +int +xfs_bulkstat( + struct xfs_ibulk *breq, + bulkstat_one_fmt_pf formatter) +{ + struct xfs_bstat_chunk bc = { + .formatter = formatter, + .breq = breq, + }; + int error; - cond_resched(); - } + breq->ocount = 0; - /* - * If we've run out of space or had a formatting error, we - * are now done - */ - if (ac.ac_ubleft < statstruct_size || error) - break; + if (xfs_bulkstat_already_done(breq->mp, breq->startino)) + return 0; - if (end_of_ag) { - agno++; - agino = 0; - } - } - /* - * Done, we're either out of filesystem or space to put the data. - */ - kmem_free(irbuf); - *ubcountp = ac.ac_ubelem; + error = xfs_iwalk(breq->mp, NULL, breq->startino, xfs_bulkstat_iwalk, + breq->icount, &bc); /* * We found some inodes, so clear the error status and return them. @@ -509,17 +346,9 @@ xfs_bulkstat( * triggered again and propagated to userspace as there will be no * formatted inodes in the buffer. */ - if (ac.ac_ubelem) + if (breq->ocount > 0) error = 0; - /* - * If we ran out of filesystem, lastino will point off the end of - * the filesystem so the next call will return immediately. - */ - *lastinop = XFS_AGINO_TO_INO(mp, agno, agino); - if (agno >= mp->m_sb.sb_agcount) - *done = 1; - return error; } diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 369e3f159d4e..366d391eb11f 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -5,63 +5,46 @@ #ifndef __XFS_ITABLE_H__ #define __XFS_ITABLE_H__ -/* - * xfs_bulkstat() is used to fill in xfs_bstat structures as well as dm_stat - * structures (by the dmi library). This is a pointer to a formatter function - * that will iget the inode and fill in the appropriate structure. - * see xfs_bulkstat_one() and xfs_dm_bulkstat_one() in dmapi_xfs.c - */ -typedef int (*bulkstat_one_pf)(struct xfs_mount *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - int *ubused, - int *stat); +/* In-memory representation of a userspace request for batch inode data. */ +struct xfs_ibulk { + struct xfs_mount *mp; + void __user *ubuffer; /* user output buffer */ + xfs_ino_t startino; /* start with this inode */ + unsigned int icount; /* number of elements in ubuffer */ + unsigned int ocount; /* number of records returned */ +}; + +/* Return value that means we want to abort the walk. */ +#define XFS_IBULK_ABORT (XFS_IWALK_ABORT) + +/* Return value that means the formatting buffer is now full. */ +#define XFS_IBULK_BUFFER_FULL (2) /* - * Values for stat return value. + * Advance the user buffer pointer by one record of the given size. If the + * buffer is now full, return the appropriate error code. */ -#define BULKSTAT_RV_NOTHING 0 -#define BULKSTAT_RV_DIDONE 1 -#define BULKSTAT_RV_GIVEUP 2 +static inline int +xfs_ibulk_advance( + struct xfs_ibulk *breq, + size_t bytes) +{ + char __user *b = breq->ubuffer; + + breq->ubuffer = b + bytes; + breq->ocount++; + return breq->ocount == breq->icount ? XFS_IBULK_BUFFER_FULL : 0; +} /* * Return stat information in bulk (by-inode) for the filesystem. */ -int /* error status */ -xfs_bulkstat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *lastino, /* last inode returned */ - int *count, /* size of buffer/count returned */ - bulkstat_one_pf formatter, /* func that'd fill a single buf */ - size_t statstruct_size,/* sizeof struct that we're filling */ - char __user *ubuffer,/* buffer with inode stats */ - int *done); /* 1 if there are more stats to get */ -typedef int (*bulkstat_one_fmt_pf)( /* used size in bytes or negative error */ - void __user *ubuffer, /* buffer to write to */ - int ubsize, /* remaining user buffer sz */ - int *ubused, /* bytes used by formatter */ - const xfs_bstat_t *buffer); /* buffer to read from */ +typedef int (*bulkstat_one_fmt_pf)(struct xfs_ibulk *breq, + const struct xfs_bstat *bstat); -int -xfs_bulkstat_one_int( - xfs_mount_t *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - bulkstat_one_fmt_pf formatter, - int *ubused, - int *stat); - -int -xfs_bulkstat_one( - xfs_mount_t *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - int *ubused, - int *stat); +int xfs_bulkstat_one(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); +int xfs_bulkstat(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); typedef int (*inumbers_fmt_pf)( void __user *ubuffer, /* buffer to write to */ From patchwork Tue Jun 4 21:49:59 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976171 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D3F6014C0 for ; Tue, 4 Jun 2019 21:50:09 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BFE88205C0 for ; Tue, 4 Jun 2019 21:50:09 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B16B8287A5; Tue, 4 Jun 2019 21:50:09 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 05883205C0 for ; Tue, 4 Jun 2019 21:50:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726554AbfFDVuE (ORCPT ); Tue, 4 Jun 2019 17:50:04 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:38552 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726293AbfFDVuE (ORCPT ); Tue, 4 Jun 2019 17:50:04 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnkRu073488 for ; Tue, 4 Jun 2019 21:50:02 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=sj/KFmJUZRvKuZ0T5S5lh54CDFBzKq4V9ZZf1tmpVSY=; b=b7jf89ir3719kW11ULWsMYpGrnbdkDpmTGRYkTvwzPH4TcFtCK2YzCWOJh3tkruVJ9aI TH7CdJU011VA9YBJk7vYv/72uAFQjtTzXybxTMyLj+98HerrRq39S7egXi1vQ4qSdIim tmhzl6vgeTybB+BRSPnDXG2LGRiQXdC65r1FDPmjpzhECSgNfDEEb7+2B7TsAQ0ce7jn V/iI4AWFU182wzaNAnBwR5FShAdlIhRFvDRZ6LVMwXTGpBko0FkTfdYgepPJUQiRyhHG zdre1Mdyzp7ryt0xrlQ2jehRj+71ZAbPgfI1lOG9G9QczGIVNBqIBD094/m0BDrFKlkB Ug== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2130.oracle.com with ESMTP id 2suevdfw90-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:02 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lnqac044847 for ; Tue, 4 Jun 2019 21:50:02 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2swnghkjjv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:02 +0000 Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x54Lo1f2011694 for ; Tue, 4 Jun 2019 21:50:01 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:00 -0700 Subject: [PATCH 05/10] xfs: move bulkstat ichunk helpers to iwalk code From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:49:59 -0700 Message-ID: <155968499971.1657646.988325107267126890.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Now that we've reworked the bulkstat code to use iwalk, we can move the old bulkstat ichunk helpers to xfs_iwalk.c. No functional changes here. Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster --- fs/xfs/xfs_itable.c | 93 -------------------------------------------------- fs/xfs/xfs_itable.h | 8 ---- fs/xfs/xfs_iwalk.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 93 insertions(+), 103 deletions(-) diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 87c597ea1df7..06abe5c9c0ee 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -186,99 +186,6 @@ xfs_bulkstat_one( return error; } -/* - * Loop over all clusters in a chunk for a given incore inode allocation btree - * record. Do a readahead if there are any allocated inodes in that cluster. - */ -void -xfs_bulkstat_ichunk_ra( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irec) -{ - struct xfs_ino_geometry *igeo = M_IGEO(mp); - xfs_agblock_t agbno; - struct blk_plug plug; - int i; /* inode chunk index */ - - agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); - - blk_start_plug(&plug); - for (i = 0; - i < XFS_INODES_PER_CHUNK; - i += igeo->inodes_per_cluster, - agbno += igeo->blocks_per_cluster) { - if (xfs_inobt_maskn(i, igeo->inodes_per_cluster) & - ~irec->ir_free) { - xfs_btree_reada_bufs(mp, agno, agbno, - igeo->blocks_per_cluster, - &xfs_inode_buf_ops); - } - } - blk_finish_plug(&plug); -} - -/* - * Lookup the inode chunk that the given inode lives in and then get the record - * if we found the chunk. If the inode was not the last in the chunk and there - * are some left allocated, update the data for the pointed-to record as well as - * return the count of grabbed inodes. - */ -int -xfs_bulkstat_grab_ichunk( - struct xfs_btree_cur *cur, /* btree cursor */ - xfs_agino_t agino, /* starting inode of chunk */ - int *icount,/* return # of inodes grabbed */ - struct xfs_inobt_rec_incore *irec) /* btree record */ -{ - int idx; /* index into inode chunk */ - int stat; - int error = 0; - - /* Lookup the inode chunk that this inode lives in */ - error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &stat); - if (error) - return error; - if (!stat) { - *icount = 0; - return error; - } - - /* Get the record, should always work */ - error = xfs_inobt_get_rec(cur, irec, &stat); - if (error) - return error; - XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1); - - /* Check if the record contains the inode in request */ - if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino) { - *icount = 0; - return 0; - } - - idx = agino - irec->ir_startino + 1; - if (idx < XFS_INODES_PER_CHUNK && - (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { - int i; - - /* We got a right chunk with some left inodes allocated at it. - * Grab the chunk record. Mark all the uninteresting inodes - * free -- because they're before our start point. - */ - for (i = 0; i < idx; i++) { - if (XFS_INOBT_MASK(i) & ~irec->ir_free) - irec->ir_freecount++; - } - - irec->ir_free |= xfs_inobt_maskn(0, idx); - *icount = irec->ir_count - irec->ir_freecount; - } - - return 0; -} - -#define XFS_BULKSTAT_UBLEFT(ubleft) ((ubleft) >= statstruct_size) - static int xfs_bulkstat_iwalk( struct xfs_mount *mp, diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 366d391eb11f..a2562fe8d282 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -67,12 +67,4 @@ xfs_inumbers( void __user *buffer, /* buffer with inode info */ inumbers_fmt_pf formatter); -/* Temporarily needed while we refactor functions. */ -struct xfs_btree_cur; -struct xfs_inobt_rec_incore; -void xfs_bulkstat_ichunk_ra(struct xfs_mount *mp, xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irec); -int xfs_bulkstat_grab_ichunk(struct xfs_btree_cur *cur, xfs_agino_t agino, - int *icount, struct xfs_inobt_rec_incore *irec); - #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 3e6c06e69c75..bef0c4907781 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -66,6 +66,97 @@ struct xfs_iwalk_ag { void *data; }; +/* + * Loop over all clusters in a chunk for a given incore inode allocation btree + * record. Do a readahead if there are any allocated inodes in that cluster. + */ +STATIC void +xfs_iwalk_ichunk_ra( + struct xfs_mount *mp, + xfs_agnumber_t agno, + struct xfs_inobt_rec_incore *irec) +{ + struct xfs_ino_geometry *igeo = M_IGEO(mp); + xfs_agblock_t agbno; + struct blk_plug plug; + int i; /* inode chunk index */ + + agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); + + blk_start_plug(&plug); + for (i = 0; + i < XFS_INODES_PER_CHUNK; + i += igeo->inodes_per_cluster, + agbno += igeo->blocks_per_cluster) { + if (xfs_inobt_maskn(i, igeo->inodes_per_cluster) & + ~irec->ir_free) { + xfs_btree_reada_bufs(mp, agno, agbno, + igeo->blocks_per_cluster, + &xfs_inode_buf_ops); + } + } + blk_finish_plug(&plug); +} + +/* + * Lookup the inode chunk that the given inode lives in and then get the record + * if we found the chunk. If the inode was not the last in the chunk and there + * are some left allocated, update the data for the pointed-to record as well as + * return the count of grabbed inodes. + */ +STATIC int +xfs_iwalk_grab_ichunk( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t agino, /* starting inode of chunk */ + int *icount,/* return # of inodes grabbed */ + struct xfs_inobt_rec_incore *irec) /* btree record */ +{ + int idx; /* index into inode chunk */ + int stat; + int error = 0; + + /* Lookup the inode chunk that this inode lives in */ + error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &stat); + if (error) + return error; + if (!stat) { + *icount = 0; + return error; + } + + /* Get the record, should always work */ + error = xfs_inobt_get_rec(cur, irec, &stat); + if (error) + return error; + XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1); + + /* Check if the record contains the inode in request */ + if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino) { + *icount = 0; + return 0; + } + + idx = agino - irec->ir_startino + 1; + if (idx < XFS_INODES_PER_CHUNK && + (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { + int i; + + /* We got a right chunk with some left inodes allocated at it. + * Grab the chunk record. Mark all the uninteresting inodes + * free -- because they're before our start point. + */ + for (i = 0; i < idx; i++) { + if (XFS_INOBT_MASK(i) & ~irec->ir_free) + irec->ir_freecount++; + } + + irec->ir_free |= xfs_inobt_maskn(0, idx); + *icount = irec->ir_count - irec->ir_freecount; + } + + return 0; +} + /* Allocate memory for a walk. */ STATIC int xfs_iwalk_alloc( @@ -190,7 +281,7 @@ xfs_iwalk_ag_start( * We require a lookup cache of at least two elements so that we don't * have to deal with tearing down the cursor to walk the records. */ - error = xfs_bulkstat_grab_ichunk(*curpp, agino - 1, &icount, + error = xfs_iwalk_grab_ichunk(*curpp, agino - 1, &icount, &iwag->recs[iwag->nr_recs]); if (error) return error; @@ -295,7 +386,7 @@ xfs_iwalk_ag( * Start readahead for this inode chunk in anticipation of * walking the inodes. */ - xfs_bulkstat_ichunk_ra(mp, agno, irec); + xfs_iwalk_ichunk_ra(mp, agno, irec); /* * If there's space in the buffer for more records, increment From patchwork Tue Jun 4 21:50:06 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976173 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7A10514C0 for ; Tue, 4 Jun 2019 21:50:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 687D728704 for ; Tue, 4 Jun 2019 21:50:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5CC7B287BF; Tue, 4 Jun 2019 21:50:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EBEDE28704 for ; Tue, 4 Jun 2019 21:50:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726354AbfFDVuK (ORCPT ); Tue, 4 Jun 2019 17:50:10 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:58706 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726537AbfFDVuK (ORCPT ); Tue, 4 Jun 2019 17:50:10 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lnffi053384 for ; Tue, 4 Jun 2019 21:50:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=YE+fohGzOYS+9YVXLNmYfCNyVoEP7dNAK2jNm1OXtT0=; b=algACDxR396I0EP8Vun2cV9xnwZS0bGiYHLH/qU37E+pO/Ho7dqqNg59r16jgLaJ3Nl2 G8iOkrPs3D5hAuMb60E4xFO2QSoiQ6hyOF90O9+xxOHXeNMFqpSM5D9GGVhXSgg1x9wx t9fVDSsVrFCCD8m+fcsfzEKS8mcN72YTQRV25Q3GOTsc2jtdzvv9FGbXtoRnxl08mxOC HlreFPFQl/n3g3PNBJXDTX7RS2zOfyitMMdSlpOKbCqhIewxH/pE/voArenDJZESmSp9 GxKyKadAGBbUGZgmnSyUsMGSD9Dl/+Pa33DFnsbLBKXyiuvZ9kSLd46QzQiqMZA+ecWy VA== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2130.oracle.com with ESMTP id 2sugstfp8x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:08 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lo0pI044987 for ; Tue, 4 Jun 2019 21:50:08 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3020.oracle.com with ESMTP id 2swnghkjne-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:08 +0000 Received: from abhmp0016.oracle.com (abhmp0016.oracle.com [141.146.116.22]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x54Lo7YJ030975 for ; Tue, 4 Jun 2019 21:50:07 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:07 -0700 Subject: [PATCH 06/10] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:50:06 -0700 Message-ID: <155968500594.1657646.11152617991338213789.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=759 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=799 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Now that the inode chunk grabbing function is a static function in the iwalk code, change its behavior so that @agino is the inode where we want to /start/ the iteration. This reduces cognitive friction with the callers and simplifes the code. Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster --- fs/xfs/xfs_iwalk.c | 37 +++++++++++++++++-------------------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index bef0c4907781..9ad017ddbae7 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -99,10 +99,10 @@ xfs_iwalk_ichunk_ra( } /* - * Lookup the inode chunk that the given inode lives in and then get the record - * if we found the chunk. If the inode was not the last in the chunk and there - * are some left allocated, update the data for the pointed-to record as well as - * return the count of grabbed inodes. + * Lookup the inode chunk that the given @agino lives in and then get the + * record if we found the chunk. Set the bits in @irec's free mask that + * correspond to the inodes before @agino so that we skip them. This is how we + * restart an inode walk that was interrupted in the middle of an inode record. */ STATIC int xfs_iwalk_grab_ichunk( @@ -113,6 +113,7 @@ xfs_iwalk_grab_ichunk( { int idx; /* index into inode chunk */ int stat; + int i; int error = 0; /* Lookup the inode chunk that this inode lives in */ @@ -136,24 +137,20 @@ xfs_iwalk_grab_ichunk( return 0; } - idx = agino - irec->ir_startino + 1; - if (idx < XFS_INODES_PER_CHUNK && - (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { - int i; + idx = agino - irec->ir_startino; - /* We got a right chunk with some left inodes allocated at it. - * Grab the chunk record. Mark all the uninteresting inodes - * free -- because they're before our start point. - */ - for (i = 0; i < idx; i++) { - if (XFS_INOBT_MASK(i) & ~irec->ir_free) - irec->ir_freecount++; - } - - irec->ir_free |= xfs_inobt_maskn(0, idx); - *icount = irec->ir_count - irec->ir_freecount; + /* + * We got a right chunk with some left inodes allocated at it. Grab + * the chunk record. Mark all the uninteresting inodes free because + * they're before our start point. + */ + for (i = 0; i < idx; i++) { + if (XFS_INOBT_MASK(i) & ~irec->ir_free) + irec->ir_freecount++; } + irec->ir_free |= xfs_inobt_maskn(0, idx); + *icount = irec->ir_count - irec->ir_freecount; return 0; } @@ -281,7 +278,7 @@ xfs_iwalk_ag_start( * We require a lookup cache of at least two elements so that we don't * have to deal with tearing down the cursor to walk the records. */ - error = xfs_iwalk_grab_ichunk(*curpp, agino - 1, &icount, + error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount, &iwag->recs[iwag->nr_recs]); if (error) return error; From patchwork Tue Jun 4 21:50:12 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976175 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1372E14C0 for ; Tue, 4 Jun 2019 21:50:19 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E9E5F205F8 for ; Tue, 4 Jun 2019 21:50:18 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DB82426B41; Tue, 4 Jun 2019 21:50:18 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 854E9205F8 for ; Tue, 4 Jun 2019 21:50:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726604AbfFDVuS (ORCPT ); Tue, 4 Jun 2019 17:50:18 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:58800 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726537AbfFDVuS (ORCPT ); Tue, 4 Jun 2019 17:50:18 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnbfE053365 for ; Tue, 4 Jun 2019 21:50:16 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=/zrCm5COkDAl9zjEojBy7CVrc9eQWFnBPe3DfUjVHIQ=; b=kevTvVECHFPavR5GZd4GY0xduTUDSpjNCsJnmRmZ5+X5/17T554JaPjldB9T+RH1xywG bJ30WeyeKYlwUhUMaXOsy8NsSvyTOMvWTToGHr6QHf5003+ExkyDbJw7tyh/Bg92N1Ll tQI2DuMt8pApwBeqh1jijRAvWtbloE/Fxd/ti7Mbv1DPCk0Z8MBkR+g5wB1V91VJTV/j Fyoo+MXuO2N+KXCL27kkrFB7VC2tf65T0f4fijesYL6+K2tVx6lTl7C2c3ssvNYpwPFm GBiAQbhEaAX05gDdf0NBPO0YTxE4u6+T+GPtUN8SWQ1gh0imoRHwxDPS6IuwaU90PByj PQ== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2130.oracle.com with ESMTP id 2sugstfp9f-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:16 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lmm6E172031 for ; Tue, 4 Jun 2019 21:50:16 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3030.oracle.com with ESMTP id 2swngkkj3t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:16 +0000 Received: from abhmp0012.oracle.com (abhmp0012.oracle.com [141.146.116.18]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x54LoF3P025149 for ; Tue, 4 Jun 2019 21:50:15 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:15 -0700 Subject: [PATCH 07/10] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:50:12 -0700 Message-ID: <155968501261.1657646.16272249682611768545.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=851 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=882 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Refactor xfs_iwalk_ichunk_ra to avoid long conditionals. Signed-off-by: Darrick J. Wong Reviewed-by: Brian Foster --- fs/xfs/xfs_iwalk.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 9ad017ddbae7..8595258b5001 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -84,16 +84,16 @@ xfs_iwalk_ichunk_ra( agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); blk_start_plug(&plug); - for (i = 0; - i < XFS_INODES_PER_CHUNK; - i += igeo->inodes_per_cluster, - agbno += igeo->blocks_per_cluster) { - if (xfs_inobt_maskn(i, igeo->inodes_per_cluster) & - ~irec->ir_free) { + for (i = 0; i < XFS_INODES_PER_CHUNK; i += igeo->inodes_per_cluster) { + xfs_inofree_t imask; + + imask = xfs_inobt_maskn(i, igeo->inodes_per_cluster); + if (imask & ~irec->ir_free) { xfs_btree_reada_bufs(mp, agno, agbno, igeo->blocks_per_cluster, &xfs_inode_buf_ops); } + agbno += igeo->blocks_per_cluster; } blk_finish_plug(&plug); } From patchwork Tue Jun 4 21:50:20 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976177 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF6CD14C0 for ; Tue, 4 Jun 2019 21:50:26 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CB1EE205F8 for ; Tue, 4 Jun 2019 21:50:26 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BF71B26B41; Tue, 4 Jun 2019 21:50:26 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C485A205F8 for ; Tue, 4 Jun 2019 21:50:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726416AbfFDVuZ (ORCPT ); Tue, 4 Jun 2019 17:50:25 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:37330 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726373AbfFDVuZ (ORCPT ); Tue, 4 Jun 2019 17:50:25 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnPp5059831 for ; Tue, 4 Jun 2019 21:50:23 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=h2ecRESwHPlabpjcDJ2AUAyN30A/rTYte+st9045fQE=; b=An5nOWI2Sb5QOUd/INmyqO+W8fIb7N72xWI2WUm01+OpB4HDkEs2+84ytmEevv3spVN/ QDgjWy+z+sk6PllcOBdtbZ79gWkrB1rVsSCM6Irw1RNjR3HHqCGjCylaP599iBi+PFGW whyJE+TT5zR6gIoBxFOumofAUT9p6Ys1S9/vFu8Q2xWUO57w5rtgeeffagTbKk/4gDOF wUOyyylcjzHk1baZNwLlSRSz6yRkrQczsrfNNAcO9Hw2e+VAO1fD/ZoGJuzC8vwCpvoG 0yFrKp/AgXmpOxVL6IMtrM82t91A4FocVTaeb4n/EBEFYdOR+PUxFZvrl9DbZXskcO7o mQ== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2120.oracle.com with ESMTP id 2suj0qfjx4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:23 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Lo2tF045125 for ; Tue, 4 Jun 2019 21:50:22 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3020.oracle.com with ESMTP id 2swnghkjs8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:22 +0000 Received: from abhmp0002.oracle.com (abhmp0002.oracle.com [141.146.116.8]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x54LoMTV009124 for ; Tue, 4 Jun 2019 21:50:22 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:21 -0700 Subject: [PATCH 08/10] xfs: multithreaded iwalk implementation From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:50:20 -0700 Message-ID: <155968502066.1657646.3694276570612406995.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a parallel iwalk implementation and switch quotacheck to use it. Signed-off-by: Darrick J. Wong --- fs/xfs/Makefile | 1 fs/xfs/xfs_globals.c | 3 + fs/xfs/xfs_iwalk.c | 76 ++++++++++++++++++++++++++++++- fs/xfs/xfs_iwalk.h | 2 + fs/xfs/xfs_pwork.c | 122 ++++++++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_pwork.h | 50 ++++++++++++++++++++ fs/xfs/xfs_qm.c | 2 - fs/xfs/xfs_sysctl.h | 6 ++ fs/xfs/xfs_sysfs.c | 40 ++++++++++++++++ fs/xfs/xfs_trace.h | 18 +++++++ 10 files changed, 317 insertions(+), 3 deletions(-) create mode 100644 fs/xfs/xfs_pwork.c create mode 100644 fs/xfs/xfs_pwork.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 74d30ef0dbce..48940a27d4aa 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -84,6 +84,7 @@ xfs-y += xfs_aops.o \ xfs_message.o \ xfs_mount.o \ xfs_mru_cache.o \ + xfs_pwork.o \ xfs_reflink.o \ xfs_stats.o \ xfs_super.o \ diff --git a/fs/xfs/xfs_globals.c b/fs/xfs/xfs_globals.c index d0d377384120..4f93f2c4dc38 100644 --- a/fs/xfs/xfs_globals.c +++ b/fs/xfs/xfs_globals.c @@ -31,6 +31,9 @@ xfs_param_t xfs_params = { .fstrm_timer = { 1, 30*100, 3600*100}, .eofb_timer = { 1, 300, 3600*24}, .cowb_timer = { 1, 1800, 3600*24}, +#ifdef DEBUG + .pwork_threads = { 0, 0, NR_CPUS }, +#endif }; struct xfs_globals xfs_globals = { diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 8595258b5001..71ee1628aa70 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -21,6 +21,7 @@ #include "xfs_health.h" #include "xfs_trans.h" #include "xfs_iwalk.h" +#include "xfs_pwork.h" /* * Walking Inodes in the Filesystem @@ -46,6 +47,9 @@ */ struct xfs_iwalk_ag { + /* parallel work control data; will be null if single threaded */ + struct xfs_pwork pwork; + struct xfs_mount *mp; struct xfs_trans *tp; @@ -200,6 +204,9 @@ xfs_iwalk_ag_recs( trace_xfs_iwalk_ag_rec(mp, agno, irec); for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { + if (xfs_pwork_want_abort(&iwag->pwork)) + return 0; + /* Skip if this inode is free */ if (XFS_INOBT_MASK(j) & irec->ir_free) continue; @@ -360,7 +367,7 @@ xfs_iwalk_ag( agino = XFS_INO_TO_AGINO(mp, iwag->startino); error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more); - while (!error && has_more) { + while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) { struct xfs_inobt_rec_incore *irec; cond_resched(); @@ -409,7 +416,7 @@ xfs_iwalk_ag( xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); /* Walk any records left behind in the cache. */ - if (iwag->nr_recs == 0 || error) + if (iwag->nr_recs == 0 || error || xfs_pwork_want_abort(&iwag->pwork)) return error; return xfs_iwalk_ag_recs(iwag); @@ -465,6 +472,7 @@ xfs_iwalk( .iwalk_fn = iwalk_fn, .data = data, .startino = startino, + .pwork = XFS_PWORK_SINGLE_THREADED, }; xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); int error; @@ -486,3 +494,67 @@ xfs_iwalk( xfs_iwalk_free(&iwag); return error; } + +/* Run per-thread iwalk work. */ +static int +xfs_iwalk_ag_work( + struct xfs_mount *mp, + struct xfs_pwork *pwork) +{ + struct xfs_iwalk_ag *iwag; + int error; + + iwag = container_of(pwork, struct xfs_iwalk_ag, pwork); + error = xfs_iwalk_alloc(iwag); + if (error) + goto out; + + error = xfs_iwalk_ag(iwag); + xfs_iwalk_free(iwag); +out: + kmem_free(iwag); + return error; +} + +/* + * Walk all the inodes in the filesystem using multiple threads to process each + * AG. + */ +int +xfs_iwalk_threaded( + struct xfs_mount *mp, + xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_pwork_ctl pctl; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + unsigned int nr_threads; + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + nr_threads = xfs_pwork_guess_datadev_parallelism(mp); + error = xfs_pwork_init(mp, &pctl, xfs_iwalk_ag_work, "xfs_iwalk", + nr_threads); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + struct xfs_iwalk_ag *iwag; + + iwag = kmem_alloc(sizeof(struct xfs_iwalk_ag), KM_SLEEP); + iwag->mp = mp; + iwag->tp = NULL; + iwag->iwalk_fn = iwalk_fn; + iwag->data = data; + iwag->startino = startino; + iwag->recs = NULL; + xfs_iwalk_set_prefetch(iwag, max_prefetch); + xfs_pwork_queue(&pctl, &iwag->pwork); + startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + return xfs_pwork_destroy(&pctl); +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 45b1baabcd2d..40233a05a766 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -14,5 +14,7 @@ typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); +int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); #endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c new file mode 100644 index 000000000000..19605a3a2482 --- /dev/null +++ b/fs/xfs/xfs_pwork.c @@ -0,0 +1,122 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_trace.h" +#include "xfs_sysctl.h" +#include "xfs_pwork.h" + +/* + * Parallel Work Queue + * =================== + * + * Abstract away the details of running a large and "obviously" parallelizable + * task across multiple CPUs. Callers initialize the pwork control object with + * a desired level of parallelization and a work function. Next, they embed + * struct xfs_pwork in whatever structure they use to pass work context to a + * worker thread and queue that pwork. The work function will be passed the + * pwork item when it is run (from process context) and any returned error will + * cause all threads to abort. + * + * This is the rough equivalent of the xfsprogs workqueue code, though we can't + * reuse that name here. + */ + +/* Invoke our caller's function. */ +static void +xfs_pwork_work( + struct work_struct *work) +{ + struct xfs_pwork *pwork; + struct xfs_pwork_ctl *pctl; + int error; + + pwork = container_of(work, struct xfs_pwork, work); + pctl = pwork->pctl; + error = pctl->work_fn(pctl->mp, pwork); + if (error && !pctl->error) + pctl->error = error; +} + +/* + * Set up control data for parallel work. @work_fn is the function that will + * be called. @tag will be written into the kernel threads. @nr_threads is + * the level of parallelism desired, or 0 for no limit. + */ +int +xfs_pwork_init( + struct xfs_mount *mp, + struct xfs_pwork_ctl *pctl, + xfs_pwork_work_fn work_fn, + const char *tag, + unsigned int nr_threads) +{ +#ifdef DEBUG + if (xfs_globals.pwork_threads > 0) + nr_threads = xfs_globals.pwork_threads; +#endif + trace_xfs_pwork_init(mp, nr_threads, current->pid); + + pctl->wq = alloc_workqueue("%s-%d", WQ_FREEZABLE, nr_threads, tag, + current->pid); + if (!pctl->wq) + return -ENOMEM; + pctl->work_fn = work_fn; + pctl->error = 0; + pctl->mp = mp; + + return 0; +} + +/* Queue some parallel work. */ +void +xfs_pwork_queue( + struct xfs_pwork_ctl *pctl, + struct xfs_pwork *pwork) +{ + INIT_WORK(&pwork->work, xfs_pwork_work); + pwork->pctl = pctl; + queue_work(pctl->wq, &pwork->work); +} + +/* Wait for the work to finish and tear down the control structure. */ +int +xfs_pwork_destroy( + struct xfs_pwork_ctl *pctl) +{ + destroy_workqueue(pctl->wq); + pctl->wq = NULL; + return pctl->error; +} + +/* + * Return the amount of parallelism that the data device can handle, or 0 for + * no limit. + */ +unsigned int +xfs_pwork_guess_datadev_parallelism( + struct xfs_mount *mp) +{ + struct xfs_buftarg *btp = mp->m_ddev_targp; + int iomin; + int ioopt; + + if (blk_queue_nonrot(btp->bt_bdev->bd_queue)) + return num_online_cpus(); + if (mp->m_sb.sb_width && mp->m_sb.sb_unit) + return mp->m_sb.sb_width / mp->m_sb.sb_unit; + iomin = bdev_io_min(btp->bt_bdev); + ioopt = bdev_io_opt(btp->bt_bdev); + if (iomin && ioopt) + return ioopt / iomin; + + return 1; +} diff --git a/fs/xfs/xfs_pwork.h b/fs/xfs/xfs_pwork.h new file mode 100644 index 000000000000..e0c1354a2d8c --- /dev/null +++ b/fs/xfs/xfs_pwork.h @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_PWORK_H__ +#define __XFS_PWORK_H__ + +struct xfs_pwork; +struct xfs_mount; + +typedef int (*xfs_pwork_work_fn)(struct xfs_mount *mp, struct xfs_pwork *pwork); + +/* + * Parallel work coordination structure. + */ +struct xfs_pwork_ctl { + struct workqueue_struct *wq; + struct xfs_mount *mp; + xfs_pwork_work_fn work_fn; + int error; +}; + +/* + * Embed this parallel work control item inside your own work structure, + * then queue work with it. + */ +struct xfs_pwork { + struct work_struct work; + struct xfs_pwork_ctl *pctl; +}; + +#define XFS_PWORK_SINGLE_THREADED { .pctl = NULL } + +/* Have we been told to abort? */ +static inline bool +xfs_pwork_want_abort( + struct xfs_pwork *pwork) +{ + return pwork->pctl && pwork->pctl->error; +} + +int xfs_pwork_init(struct xfs_mount *mp, struct xfs_pwork_ctl *pctl, + xfs_pwork_work_fn work_fn, const char *tag, + unsigned int nr_threads); +void xfs_pwork_queue(struct xfs_pwork_ctl *pctl, struct xfs_pwork *pwork); +int xfs_pwork_destroy(struct xfs_pwork_ctl *pctl); +unsigned int xfs_pwork_guess_datadev_parallelism(struct xfs_mount *mp); + +#endif /* __XFS_PWORK_H__ */ diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index a5b2260406a8..e4f3785f7a64 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1305,7 +1305,7 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - error = xfs_iwalk(mp, NULL, 0, xfs_qm_dqusage_adjust, 0, NULL); + error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, NULL); if (error) goto error_return; diff --git a/fs/xfs/xfs_sysctl.h b/fs/xfs/xfs_sysctl.h index ad7f9be13087..b555e045e2f4 100644 --- a/fs/xfs/xfs_sysctl.h +++ b/fs/xfs/xfs_sysctl.h @@ -37,6 +37,9 @@ typedef struct xfs_param { xfs_sysctl_val_t fstrm_timer; /* Filestream dir-AG assoc'n timeout. */ xfs_sysctl_val_t eofb_timer; /* Interval between eofb scan wakeups */ xfs_sysctl_val_t cowb_timer; /* Interval between cowb scan wakeups */ +#ifdef DEBUG + xfs_sysctl_val_t pwork_threads; /* Parallel workqueue thread count */ +#endif } xfs_param_t; /* @@ -82,6 +85,9 @@ enum { extern xfs_param_t xfs_params; struct xfs_globals { +#ifdef DEBUG + int pwork_threads; /* parallel workqueue threads */ +#endif int log_recovery_delay; /* log recovery delay (secs) */ int mount_delay; /* mount setup delay (secs) */ bool bug_on_assert; /* BUG() the kernel on assert failure */ diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c index cabda13f3c64..910e6b9cb1a7 100644 --- a/fs/xfs/xfs_sysfs.c +++ b/fs/xfs/xfs_sysfs.c @@ -206,11 +206,51 @@ always_cow_show( } XFS_SYSFS_ATTR_RW(always_cow); +#ifdef DEBUG +/* + * Override how many threads the parallel work queue is allowed to create. + * This has to be a debug-only global (instead of an errortag) because one of + * the main users of parallel workqueues is mount time quotacheck. + */ +STATIC ssize_t +pwork_threads_store( + struct kobject *kobject, + const char *buf, + size_t count) +{ + int ret; + int val; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (val < 0 || val > NR_CPUS) + return -EINVAL; + + xfs_globals.pwork_threads = val; + + return count; +} + +STATIC ssize_t +pwork_threads_show( + struct kobject *kobject, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.pwork_threads); +} +XFS_SYSFS_ATTR_RW(pwork_threads); +#endif /* DEBUG */ + static struct attribute *xfs_dbg_attrs[] = { ATTR_LIST(bug_on_assert), ATTR_LIST(log_recovery_delay), ATTR_LIST(mount_delay), ATTR_LIST(always_cow), +#ifdef DEBUG + ATTR_LIST(pwork_threads), +#endif NULL, }; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index f9bb1d50bc0e..658cbade1998 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3556,6 +3556,24 @@ TRACE_EVENT(xfs_iwalk_ag_rec, __entry->startino, __entry->freemask) ) +TRACE_EVENT(xfs_pwork_init, + TP_PROTO(struct xfs_mount *mp, unsigned int nr_threads, pid_t pid), + TP_ARGS(mp, nr_threads, pid), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(unsigned int, nr_threads) + __field(pid_t, pid) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->nr_threads = nr_threads; + __entry->pid = pid; + ), + TP_printk("dev %d:%d nr_threads %u pid %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->nr_threads, __entry->pid) +) + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH From patchwork Tue Jun 4 21:50:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976179 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 390C114E5 for ; Tue, 4 Jun 2019 21:50:33 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 25C06205F8 for ; Tue, 4 Jun 2019 21:50:33 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 176B126B41; Tue, 4 Jun 2019 21:50:33 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0046E205F8 for ; Tue, 4 Jun 2019 21:50:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726317AbfFDVub (ORCPT ); Tue, 4 Jun 2019 17:50:31 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:59034 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726373AbfFDVub (ORCPT ); Tue, 4 Jun 2019 17:50:31 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LnuaD053576 for ; Tue, 4 Jun 2019 21:50:30 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=6QXxTAHirJ30VqBHOCc7/TQtPJpF5rytDrpxpvML3g8=; b=oTN8KPEjE8qRvpFCJ81UpdKvTEQFgPB9YlRbnWbcrbj8fvR7TtDbMmoliSncWBEhpUqz O4BkZUVYejO8TbSWg5zMIrQx8GCCyFjPxzHxbFC0I5gy5OEwOWoYubnQCjPwVdIUJzn3 6ozeXBNzkKeshDZWrhcCeVM7oNrjzdz77233Qdv3Brchxo2aQJ456yZKxazn/JfHAbcz Z29F7xqozEQet8XoMNFVQoCf8khQr1naN12YElWIgMpAAhVBDs88mUjz0pVcNkP/MET9 Z7TR34kptVBjzyBKg2VODQEI3tPgpm23rSfFig4/vkIzEoLAIkGiYtTMTaYYIH8krq50 Vg== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2130.oracle.com with ESMTP id 2sugstfpan-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:29 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LmmTQ172010 for ; Tue, 4 Jun 2019 21:50:28 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userp3030.oracle.com with ESMTP id 2swngkkj6k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:28 +0000 Received: from abhmp0007.oracle.com (abhmp0007.oracle.com [141.146.116.13]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x54LoS6K025239 for ; Tue, 4 Jun 2019 21:50:28 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:28 -0700 Subject: [PATCH 09/10] xfs: poll waiting for quotacheck From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:50:27 -0700 Message-ID: <155968502703.1657646.17911228005649046316.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a pwork destroy function that uses polling instead of uninterruptible sleep to wait for work items to finish so that we can touch the softlockup watchdog. IOWs, gross hack. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_iwalk.c | 3 +++ fs/xfs/xfs_iwalk.h | 3 ++- fs/xfs/xfs_pwork.c | 21 +++++++++++++++++++++ fs/xfs/xfs_pwork.h | 2 ++ fs/xfs/xfs_qm.c | 2 +- 5 files changed, 29 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 71ee1628aa70..c4a9c4c246b7 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -526,6 +526,7 @@ xfs_iwalk_threaded( xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, + bool polled, void *data) { struct xfs_pwork_ctl pctl; @@ -556,5 +557,7 @@ xfs_iwalk_threaded( startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); } + if (polled) + return xfs_pwork_destroy_poll(&pctl); return xfs_pwork_destroy(&pctl); } diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 40233a05a766..76d8f87a39ef 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -15,6 +15,7 @@ typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, - xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll, + void *data); #endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c index 19605a3a2482..3b885e0b52ac 100644 --- a/fs/xfs/xfs_pwork.c +++ b/fs/xfs/xfs_pwork.c @@ -13,6 +13,7 @@ #include "xfs_trace.h" #include "xfs_sysctl.h" #include "xfs_pwork.h" +#include /* * Parallel Work Queue @@ -44,6 +45,7 @@ xfs_pwork_work( error = pctl->work_fn(pctl->mp, pwork); if (error && !pctl->error) pctl->error = error; + atomic_dec(&pctl->nr_work); } /* @@ -72,6 +74,7 @@ xfs_pwork_init( pctl->work_fn = work_fn; pctl->error = 0; pctl->mp = mp; + atomic_set(&pctl->nr_work, 0); return 0; } @@ -84,6 +87,7 @@ xfs_pwork_queue( { INIT_WORK(&pwork->work, xfs_pwork_work); pwork->pctl = pctl; + atomic_inc(&pctl->nr_work); queue_work(pctl->wq, &pwork->work); } @@ -97,6 +101,23 @@ xfs_pwork_destroy( return pctl->error; } +/* + * Wait for the work to finish and tear down the control structure. + * Continually poll completion status and touch the soft lockup watchdog. + * This is for things like mount that hold locks. + */ +int +xfs_pwork_destroy_poll( + struct xfs_pwork_ctl *pctl) +{ + while (atomic_read(&pctl->nr_work) > 0) { + msleep(1); + touch_softlockup_watchdog(); + } + + return xfs_pwork_destroy(pctl); +} + /* * Return the amount of parallelism that the data device can handle, or 0 for * no limit. diff --git a/fs/xfs/xfs_pwork.h b/fs/xfs/xfs_pwork.h index e0c1354a2d8c..08da723a8dc9 100644 --- a/fs/xfs/xfs_pwork.h +++ b/fs/xfs/xfs_pwork.h @@ -18,6 +18,7 @@ struct xfs_pwork_ctl { struct workqueue_struct *wq; struct xfs_mount *mp; xfs_pwork_work_fn work_fn; + atomic_t nr_work; int error; }; @@ -45,6 +46,7 @@ int xfs_pwork_init(struct xfs_mount *mp, struct xfs_pwork_ctl *pctl, unsigned int nr_threads); void xfs_pwork_queue(struct xfs_pwork_ctl *pctl, struct xfs_pwork *pwork); int xfs_pwork_destroy(struct xfs_pwork_ctl *pctl); +int xfs_pwork_destroy_poll(struct xfs_pwork_ctl *pctl); unsigned int xfs_pwork_guess_datadev_parallelism(struct xfs_mount *mp); #endif /* __XFS_PWORK_H__ */ diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index e4f3785f7a64..de6a623ada02 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1305,7 +1305,7 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, NULL); + error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, true, NULL); if (error) goto error_return; From patchwork Tue Jun 4 21:50:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10976181 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6E4A514E5 for ; Tue, 4 Jun 2019 21:50:41 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5A337205F8 for ; Tue, 4 Jun 2019 21:50:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4E57D26B41; Tue, 4 Jun 2019 21:50:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 26226205F8 for ; Tue, 4 Jun 2019 21:50:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726427AbfFDVuj (ORCPT ); Tue, 4 Jun 2019 17:50:39 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:39100 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726373AbfFDVuj (ORCPT ); Tue, 4 Jun 2019 17:50:39 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54LoI5o074193 for ; Tue, 4 Jun 2019 21:50:37 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=tF8iDbuZO7zFMMaf7phKcnERgnD1uA2e4n+yn8a7LUg=; b=kVwIvuef8L/m6In5+l5kci1SxrD5kLZ9S5x+4Hs5zfieqveFLnOYqlCHzl35YwnfItqf jVN22qgZQV/LvsA4D9vtOc9d0yHjHKyxTHtSIpYTDKIbwQdmDkgKNSAndRE0FumphPMi rDTBuDPEIA3o84pw48+nWhT/ml6niIYFq7aQBTuZozOZuyEAqEpt0fboKyPtKiXUGLcU +Mw2v5/g+HEXuVGUMOrY/3G8G+slb3WocJnI7GxPS55UMrM+gSfKSadzZx4B3MpVlQmE swWttUq4BKKXs2+crdTrwHiiXWOiP2swt8csRaCuHpE+kr22GpQfzVDXllbi55dhFJfz xg== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by aserp2130.oracle.com with ESMTP id 2suevdfwbc-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:36 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x54Loa6V013024 for ; Tue, 4 Jun 2019 21:50:36 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3030.oracle.com with ESMTP id 2swnhbugec-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Tue, 04 Jun 2019 21:50:36 +0000 Received: from abhmp0020.oracle.com (abhmp0020.oracle.com [141.146.116.26]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x54LoY5X012060 for ; Tue, 4 Jun 2019 21:50:34 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 04 Jun 2019 14:50:34 -0700 Subject: [PATCH 10/10] xfs: refactor INUMBERS to use iwalk functions From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Tue, 04 Jun 2019 14:50:33 -0700 Message-ID: <155968503328.1657646.15035810252397604734.stgit@magnolia> In-Reply-To: <155968496814.1657646.13743491598480818627.stgit@magnolia> References: <155968496814.1657646.13743491598480818627.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9278 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1906040138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Now that we have generic functions to walk inode records, refactor the INUMBERS implementation to use it. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_ioctl.c | 20 ++++-- fs/xfs/xfs_ioctl.h | 2 + fs/xfs/xfs_ioctl32.c | 35 ++++------ fs/xfs/xfs_itable.c | 168 ++++++++++++++++++++------------------------------ fs/xfs/xfs_itable.h | 22 +------ fs/xfs/xfs_iwalk.c | 161 ++++++++++++++++++++++++++++++++++++++++++++++-- fs/xfs/xfs_iwalk.h | 12 ++++ 7 files changed, 262 insertions(+), 158 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 43734901aeb9..4fa9a2c8b029 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -732,6 +732,16 @@ xfs_bulkstat_one_fmt( return xfs_ibulk_advance(breq, sizeof(struct xfs_bstat)); } +int +xfs_inumbers_fmt( + struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp) +{ + if (copy_to_user(breq->ubuffer, igrp, sizeof(*igrp))) + return -EFAULT; + return xfs_ibulk_advance(breq, sizeof(struct xfs_inogrp)); +} + STATIC int xfs_ioc_bulkstat( xfs_mount_t *mp, @@ -779,13 +789,9 @@ xfs_ioc_bulkstat( * parameter to maintain correct function. */ if (cmd == XFS_IOC_FSINUMBERS) { - int count = breq.icount; - - breq.startino = lastino; - error = xfs_inumbers(mp, &breq.startino, &count, - bulkreq.ubuffer, xfs_inumbers_fmt); - breq.ocount = count; - lastino = breq.startino; + breq.startino = lastino + 1; + error = xfs_inumbers(&breq, xfs_inumbers_fmt); + lastino = breq.startino - 1; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) { breq.startino = lastino; error = xfs_bulkstat_one(&breq, xfs_bulkstat_one_fmt); diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h index f32c8aadfeba..fb303eaa8863 100644 --- a/fs/xfs/xfs_ioctl.h +++ b/fs/xfs/xfs_ioctl.h @@ -79,7 +79,9 @@ xfs_set_dmattrs( struct xfs_ibulk; struct xfs_bstat; +struct xfs_inogrp; int xfs_bulkstat_one_fmt(struct xfs_ibulk *breq, const struct xfs_bstat *bstat); +int xfs_inumbers_fmt(struct xfs_ibulk *breq, const struct xfs_inogrp *igrp); #endif diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index add15819daf3..dd53a9692e68 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -85,22 +85,17 @@ xfs_compat_growfs_rt_copyin( STATIC int xfs_inumbers_fmt_compat( - void __user *ubuffer, - const struct xfs_inogrp *buffer, - long count, - long *written) + struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp) { - compat_xfs_inogrp_t __user *p32 = ubuffer; - long i; + struct compat_xfs_inogrp __user *p32 = breq->ubuffer; - for (i = 0; i < count; i++) { - if (put_user(buffer[i].xi_startino, &p32[i].xi_startino) || - put_user(buffer[i].xi_alloccount, &p32[i].xi_alloccount) || - put_user(buffer[i].xi_allocmask, &p32[i].xi_allocmask)) - return -EFAULT; - } - *written = count * sizeof(*p32); - return 0; + if (put_user(igrp->xi_startino, &p32->xi_startino) || + put_user(igrp->xi_alloccount, &p32->xi_alloccount) || + put_user(igrp->xi_allocmask, &p32->xi_allocmask)) + return -EFAULT; + + return xfs_ibulk_advance(breq, sizeof(struct compat_xfs_inogrp)); } #else @@ -225,7 +220,7 @@ xfs_compat_ioc_bulkstat( * to userpace memory via bulkreq.ubuffer. Normally the compat * functions and structure size are the correct ones to use ... */ - inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; + inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; bulkstat_one_fmt_pf bs_one_func = xfs_bulkstat_one_fmt_compat; #ifdef CONFIG_X86_X32 @@ -286,13 +281,9 @@ xfs_compat_ioc_bulkstat( * parameter to maintain correct function. */ if (cmd == XFS_IOC_FSINUMBERS_32) { - int count = breq.icount; - - breq.startino = lastino; - error = xfs_inumbers(mp, &breq.startino, &count, - bulkreq.ubuffer, inumbers_func); - breq.ocount = count; - lastino = breq.startino; + breq.startino = lastino + 1; + error = xfs_inumbers(&breq, inumbers_func); + lastino = breq.startino - 1; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE_32) { breq.startino = lastino; error = xfs_bulkstat_one(&breq, bs_one_func); diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 06abe5c9c0ee..bade54d6ac64 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -259,121 +259,85 @@ xfs_bulkstat( return error; } -int -xfs_inumbers_fmt( - void __user *ubuffer, /* buffer to write to */ - const struct xfs_inogrp *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written) /* # of bytes written */ +struct xfs_inumbers_chunk { + inumbers_fmt_pf formatter; + struct xfs_ibulk *breq; +}; + +/* + * INUMBERS + * ======== + * This is how we export inode btree records to userspace, so that XFS tools + * can figure out where inodes are allocated. + */ + +/* + * Format the inode group structure and report it somewhere. + * + * Similar to xfs_bulkstat_one_int, lastino is the inode cursor as we walk + * through the filesystem so we move it forward unless there was a runtime + * error. If the formatter tells us the buffer is now full we also move the + * cursor forward and abort the walk. + */ +STATIC int +xfs_inumbers_walk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_agnumber_t agno, + const struct xfs_inobt_rec_incore *irec, + void *data) { - if (copy_to_user(ubuffer, buffer, count * sizeof(*buffer))) - return -EFAULT; - *written = count * sizeof(*buffer); - return 0; + struct xfs_inogrp inogrp = { + .xi_startino = XFS_AGINO_TO_INO(mp, agno, irec->ir_startino), + .xi_alloccount = irec->ir_count - irec->ir_freecount, + .xi_allocmask = ~irec->ir_free, + }; + struct xfs_inumbers_chunk *ic = data; + xfs_agino_t agino; + int error; + + error = ic->formatter(ic->breq, &inogrp); + if (error && error != XFS_IBULK_BUFFER_FULL) + return error; + if (error == XFS_IBULK_BUFFER_FULL) + error = XFS_INOBT_WALK_ABORT; + + agino = irec->ir_startino + XFS_INODES_PER_CHUNK; + ic->breq->startino = XFS_AGINO_TO_INO(mp, agno, agino); + return error; } /* * Return inode number table for the filesystem. */ -int /* error status */ +int xfs_inumbers( - struct xfs_mount *mp,/* mount point for filesystem */ - xfs_ino_t *lastino,/* last inode returned */ - int *count,/* size of buffer/count returned */ - void __user *ubuffer,/* buffer with inode descriptions */ + struct xfs_ibulk *breq, inumbers_fmt_pf formatter) { - xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, *lastino); - xfs_agino_t agino = XFS_INO_TO_AGINO(mp, *lastino); - struct xfs_btree_cur *cur = NULL; - struct xfs_buf *agbp = NULL; - struct xfs_inogrp *buffer; - int bcount; - int left = *count; - int bufidx = 0; + struct xfs_inumbers_chunk ic = { + .formatter = formatter, + .breq = breq, + }; int error = 0; - *count = 0; - if (agno >= mp->m_sb.sb_agcount || - *lastino != XFS_AGINO_TO_INO(mp, agno, agino)) - return error; + breq->ocount = 0; - bcount = min(left, (int)(PAGE_SIZE / sizeof(*buffer))); - buffer = kmem_zalloc(bcount * sizeof(*buffer), KM_SLEEP); - do { - struct xfs_inobt_rec_incore r; - int stat; - - if (!agbp) { - error = xfs_ialloc_read_agi(mp, NULL, agno, &agbp); - if (error) - break; - - cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno, - XFS_BTNUM_INO); - error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_GE, - &stat); - if (error) - break; - if (!stat) - goto next_ag; - } - - error = xfs_inobt_get_rec(cur, &r, &stat); - if (error) - break; - if (!stat) - goto next_ag; - - agino = r.ir_startino + XFS_INODES_PER_CHUNK - 1; - buffer[bufidx].xi_startino = - XFS_AGINO_TO_INO(mp, agno, r.ir_startino); - buffer[bufidx].xi_alloccount = r.ir_count - r.ir_freecount; - buffer[bufidx].xi_allocmask = ~r.ir_free; - if (++bufidx == bcount) { - long written; - - error = formatter(ubuffer, buffer, bufidx, &written); - if (error) - break; - ubuffer += written; - *count += bufidx; - bufidx = 0; - } - if (!--left) - break; - - error = xfs_btree_increment(cur, 0, &stat); - if (error) - break; - if (stat) - continue; - -next_ag: - xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); - cur = NULL; - xfs_buf_relse(agbp); - agbp = NULL; - agino = 0; - agno++; - } while (agno < mp->m_sb.sb_agcount); - - if (!error) { - if (bufidx) { - long written; - - error = formatter(ubuffer, buffer, bufidx, &written); - if (!error) - *count += bufidx; - } - *lastino = XFS_AGINO_TO_INO(mp, agno, agino); - } + if (xfs_bulkstat_already_done(breq->mp, breq->startino)) + return 0; + + error = xfs_inobt_walk(breq->mp, NULL, breq->startino, + xfs_inumbers_walk, breq->icount, &ic); - kmem_free(buffer); - if (cur) - xfs_btree_del_cursor(cur, error); - if (agbp) - xfs_buf_relse(agbp); + /* + * We found some inode groups, so clear the error status and return + * them. The lastino pointer will point directly at the inode that + * triggered any error that occurred, so on the next call the error + * will be triggered again and propagated to userspace as there will be + * no formatted inode groups in the buffer. + */ + if (breq->ocount > 0) + error = 0; return error; } diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index a2562fe8d282..b4c89454e27a 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -46,25 +46,9 @@ typedef int (*bulkstat_one_fmt_pf)(struct xfs_ibulk *breq, int xfs_bulkstat_one(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); int xfs_bulkstat(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); -typedef int (*inumbers_fmt_pf)( - void __user *ubuffer, /* buffer to write to */ - const xfs_inogrp_t *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written); /* # of bytes written */ +typedef int (*inumbers_fmt_pf)(struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp); -int -xfs_inumbers_fmt( - void __user *ubuffer, /* buffer to write to */ - const xfs_inogrp_t *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written); /* # of bytes written */ - -int /* error status */ -xfs_inumbers( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *last, /* last inode returned */ - int *count, /* size of buffer/count returned */ - void __user *buffer, /* buffer with inode info */ - inumbers_fmt_pf formatter); +int xfs_inumbers(struct xfs_ibulk *breq, inumbers_fmt_pf formatter); #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index c4a9c4c246b7..3a35d1cf7e14 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -66,7 +66,10 @@ struct xfs_iwalk_ag { unsigned int nr_recs; /* Inode walk function and data pointer. */ - xfs_iwalk_fn iwalk_fn; + union { + xfs_iwalk_fn iwalk_fn; + xfs_inobt_walk_fn inobt_walk_fn; + }; void *data; }; @@ -104,16 +107,18 @@ xfs_iwalk_ichunk_ra( /* * Lookup the inode chunk that the given @agino lives in and then get the - * record if we found the chunk. Set the bits in @irec's free mask that - * correspond to the inodes before @agino so that we skip them. This is how we - * restart an inode walk that was interrupted in the middle of an inode record. + * record if we found the chunk. If @trim is set, set the bits in @irec's free + * mask that correspond to the inodes before @agino so that we skip them. + * This is how we restart an inode walk that was interrupted in the middle of + * an inode record. */ STATIC int xfs_iwalk_grab_ichunk( struct xfs_btree_cur *cur, /* btree cursor */ xfs_agino_t agino, /* starting inode of chunk */ int *icount,/* return # of inodes grabbed */ - struct xfs_inobt_rec_incore *irec) /* btree record */ + struct xfs_inobt_rec_incore *irec, /* btree record */ + bool trim) { int idx; /* index into inode chunk */ int stat; @@ -141,6 +146,12 @@ xfs_iwalk_grab_ichunk( return 0; } + /* Return the entire record if the caller wants the whole thing. */ + if (!trim) { + *icount = irec->ir_count; + return 0; + } + idx = agino - irec->ir_startino; /* @@ -262,7 +273,8 @@ xfs_iwalk_ag_start( xfs_agino_t agino, struct xfs_btree_cur **curpp, struct xfs_buf **agi_bpp, - int *has_more) + int *has_more, + bool trim) { struct xfs_mount *mp = iwag->mp; struct xfs_trans *tp = iwag->tp; @@ -286,7 +298,7 @@ xfs_iwalk_ag_start( * have to deal with tearing down the cursor to walk the records. */ error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount, - &iwag->recs[iwag->nr_recs]); + &iwag->recs[iwag->nr_recs], trim); if (error) return error; if (icount) @@ -365,7 +377,8 @@ xfs_iwalk_ag( /* Set up our cursor at the right place in the inode btree. */ agno = XFS_INO_TO_AGNO(mp, iwag->startino); agino = XFS_INO_TO_AGINO(mp, iwag->startino); - error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more, + true); while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) { struct xfs_inobt_rec_incore *irec; @@ -561,3 +574,135 @@ xfs_iwalk_threaded( return xfs_pwork_destroy_poll(&pctl); return xfs_pwork_destroy(&pctl); } + +/* For each inuse inode in each cached inobt record, call our function. */ +STATIC int +xfs_inobt_walk_ag_recs( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_inobt_rec_incore *irec; + unsigned int i; + xfs_agnumber_t agno; + int error; + + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + for (i = 0, irec = iwag->recs; i < iwag->nr_recs; i++, irec++) { + trace_xfs_iwalk_ag_rec(mp, agno, irec); + error = iwag->inobt_walk_fn(mp, tp, agno, irec, iwag->data); + if (error) + return error; + } + + iwag->nr_recs = 0; + return 0; +} + +/* + * Walk all inode btree records in a single AG, from @iwag->startino to the end + * of the AG. + */ +STATIC int +xfs_inobt_walk_ag( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_buf *agi_bp = NULL; + struct xfs_btree_cur *cur = NULL; + xfs_agnumber_t agno; + xfs_agino_t agino; + int has_more; + int error = 0; + + /* Set up our cursor at the right place in the inode btree. */ + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + agino = XFS_INO_TO_AGINO(mp, iwag->startino); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more, + false); + + while (!error && has_more && !xfs_pwork_want_abort(&iwag->pwork)) { + struct xfs_inobt_rec_incore *irec; + + cond_resched(); + + /* Fetch the inobt record. */ + irec = &iwag->recs[iwag->nr_recs]; + error = xfs_inobt_get_rec(cur, irec, &has_more); + if (error || !has_more) + break; + + /* + * If there's space in the buffer for more records, increment + * the btree cursor and grab more. + */ + if (++iwag->nr_recs < iwag->sz_recs) { + error = xfs_btree_increment(cur, 0, &has_more); + if (error || !has_more) + break; + continue; + } + + /* + * Otherwise, we need to save cursor state and run the callback + * function on the cached records. The run_callbacks function + * is supposed to return a cursor pointing to the record where + * we would be if we had been able to increment like above. + */ + error = xfs_iwalk_run_callbacks(iwag, xfs_inobt_walk_ag_recs, + agno, &cur, &agi_bp, &has_more); + } + + xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); + + /* Walk any records left behind in the cache. */ + if (iwag->nr_recs == 0 || error || xfs_pwork_want_abort(&iwag->pwork)) + return error; + + return xfs_inobt_walk_ag_recs(iwag); +} + +/* + * Walk all inode btree records in the filesystem starting from @startino. The + * @inobt_walk_fn will be called for each btree record, being passed the incore + * record and @data. @max_prefetch controls how many inobt records we try to + * cache ahead of time. + */ +int +xfs_inobt_walk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t startino, + xfs_inobt_walk_fn inobt_walk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_iwalk_ag iwag = { + .mp = mp, + .tp = tp, + .inobt_walk_fn = inobt_walk_fn, + .data = data, + .startino = startino, + .pwork = XFS_PWORK_SINGLE_THREADED, + }; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + xfs_iwalk_set_prefetch(&iwag, max_prefetch * XFS_INODES_PER_CHUNK); + error = xfs_iwalk_alloc(&iwag); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + error = xfs_inobt_walk_ag(&iwag); + if (error) + break; + iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + xfs_iwalk_free(&iwag); + return error; +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 76d8f87a39ef..20bee93d4676 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -18,4 +18,16 @@ int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll, void *data); +/* Walk all inode btree records in the filesystem starting from @startino. */ +typedef int (*xfs_inobt_walk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_agnumber_t agno, + const struct xfs_inobt_rec_incore *irec, + void *data); +/* Return value (for xfs_inobt_walk_fn) that aborts the walk immediately. */ +#define XFS_INOBT_WALK_ABORT (XFS_IWALK_ABORT) + +int xfs_inobt_walk(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_ino_t startino, xfs_inobt_walk_fn inobt_walk_fn, + unsigned int max_prefetch, void *data); + #endif /* __XFS_IWALK_H__ */