From patchwork Wed May 29 22:26:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 10967843
Subject: [PATCH 01/11] xfs: separate inode geometry
From: "Darrick J. Wong"
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Wed, 29 May 2019 15:26:20 -0700
Message-ID: <155916878065.757870.5464843093825782642.stgit@magnolia>
In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia>
References: <155916877311.757870.11060347556535201032.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0

From: Darrick J. Wong

Separate the inode geometry information into a distinct structure.

Signed-off-by: Darrick J. Wong
---
 fs/xfs/libxfs/xfs_format.h | 33 +++++++++++-
 fs/xfs/libxfs/xfs_ialloc.c | 109 ++++++++++++++++++++------------------
 fs/xfs/libxfs/xfs_ialloc.h | 6 +-
 fs/xfs/libxfs/xfs_ialloc_btree.c | 15 +++--
 fs/xfs/libxfs/xfs_inode_buf.c | 2 -
 fs/xfs/libxfs/xfs_sb.c | 24 +++++---
 fs/xfs/libxfs/xfs_trans_resv.c | 17 +++---
 fs/xfs/libxfs/xfs_trans_space.h | 7 +-
 fs/xfs/libxfs/xfs_types.c | 4 +
 fs/xfs/scrub/ialloc.c | 22 ++++----
 fs/xfs/scrub/quota.c | 2 -
 fs/xfs/xfs_fsops.c | 4 +
 fs/xfs/xfs_inode.c | 17 +++---
 fs/xfs/xfs_itable.c | 11 ++--
 fs/xfs/xfs_log_recover.c | 23 ++++----
 fs/xfs/xfs_mount.c | 49 +++++++++--------
 fs/xfs/xfs_mount.h | 17 ------
 fs/xfs/xfs_super.c | 6 +-
 18 files changed, 205 insertions(+), 163 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 9bb3c48843ec..66f527b1c461 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -1071,7 +1071,7 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev) #define XFS_INO_MASK(k) (uint32_t)((1ULL << (k)) - 1) #define XFS_INO_OFFSET_BITS(mp) (mp)->m_sb.sb_inopblog #define XFS_INO_AGBNO_BITS(mp) (mp)->m_sb.sb_agblklog -#define XFS_INO_AGINO_BITS(mp) (mp)->m_agino_log +#define XFS_INO_AGINO_BITS(mp) ((mp)->m_ino_geo.ig_agino_log) #define XFS_INO_AGNO_BITS(mp) (mp)->m_agno_log #define XFS_INO_BITS(mp) \ XFS_INO_AGNO_BITS(mp) + XFS_INO_AGINO_BITS(mp) @@ -1694,4 +1694,35 @@ struct xfs_acl { #define SGI_ACL_FILE_SIZE (sizeof(SGI_ACL_FILE)-1) #define SGI_ACL_DEFAULT_SIZE (sizeof(SGI_ACL_DEFAULT)-1) +struct xfs_ino_geometry { + /* Maximum inode count in this filesystem. */ + uint64_t ig_maxicount; + + /* Minimum inode buffer size, in bytes. */ + unsigned int ig_min_cluster_size; + + /* Inode cluster sizes, adjusted to be at least 1 fsb. */ + unsigned int ig_inodes_per_cluster; + unsigned int ig_blocks_per_cluster; + + /* Inode cluster alignment. */ + unsigned int ig_cluster_align; + unsigned int ig_cluster_align_inodes; + + unsigned int ig_inobt_mxr[2]; /* max inobt btree records */ + unsigned int ig_inobt_mnr[2]; /* min inobt btree records */ + unsigned int ig_in_maxlevels; /* max inobt btree levels. */ + + /* Minimum inode allocation size */ + unsigned int ig_ialloc_inos; + unsigned int ig_ialloc_blks; + + /* Minimum inode blocks for a sparse allocation. */ + unsigned int ig_ialloc_min_blks; + + unsigned int ig_inoalign_mask;/* mask sb_inoalignmt if used */ + unsigned int ig_agino_log; /* #bits for agino in inum */ + unsigned int ig_sinoalign; /* stripe unit inode alignment */ +}; + #endif /* __XFS_FORMAT_H__ */
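As a back-of-the-envelope illustration of the cluster fields collected above, here is a small standalone C sketch (not part of the patch; the literal geometry values are assumptions for a filesystem with 4096-byte blocks, 512-byte inodes, and an 8192-byte minimum inode cluster):

#include <stdio.h>

int main(void)
{
	unsigned int sb_blocksize = 4096;	/* assumed fs block size */
	unsigned int sb_inodesize = 512;	/* assumed inode size */
	unsigned int min_cluster_size = 8192;	/* ig_min_cluster_size */
	unsigned int inopblock = sb_blocksize / sb_inodesize;

	/*
	 * What xfs_icluster_size_fsb() computes with shifts: the inode
	 * cluster length in filesystem blocks, at least 1 fsb.
	 */
	unsigned int blocks_per_cluster = sb_blocksize >= min_cluster_size ?
			1 : min_cluster_size / sb_blocksize;

	/* XFS_FSB_TO_INO() applied to that length. */
	unsigned int inodes_per_cluster = blocks_per_cluster * inopblock;

	/* Prints "2 blocks/cluster, 16 inodes/cluster" for these values. */
	printf("%u blocks/cluster, %u inodes/cluster\n",
			blocks_per_cluster, inodes_per_cluster);
	return 0;
}

The patch does not change any of this math; it only moves the precomputed results, formerly loose struct xfs_mount fields such as m_blocks_per_cluster and m_inodes_per_cluster, into the single mp->m_ino_geo structure so callers can reach all of the inode geometry through one pointer.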
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index fe9898875097..c881e0521331 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -299,7 +299,7 @@ xfs_ialloc_inode_init( * sizes, manipulate the inodes in buffers which are multiples of the * blocks size. */ - nbufs = length / mp->m_blocks_per_cluster; + nbufs = length / mp->m_ino_geo.ig_blocks_per_cluster; /* * Figure out what version number to use in the inodes we create. If @@ -343,9 +343,10 @@ xfs_ialloc_inode_init( * Get the block. */ d = XFS_AGB_TO_DADDR(mp, agno, agbno + - (j * mp->m_blocks_per_cluster)); + (j * mp->m_ino_geo.ig_blocks_per_cluster)); fbuf = xfs_trans_get_buf(tp, mp->m_ddev_targp, d, - mp->m_bsize * mp->m_blocks_per_cluster, + mp->m_bsize * + mp->m_ino_geo.ig_blocks_per_cluster, XBF_UNMAPPED); if (!fbuf) return -ENOMEM; @@ -353,7 +354,7 @@ xfs_ialloc_inode_init( /* Initialize the inode buffers and log them appropriately. */ fbuf->b_ops = &xfs_inode_buf_ops; xfs_buf_zero(fbuf, 0, BBTOB(fbuf->b_length)); - for (i = 0; i < mp->m_inodes_per_cluster; i++) { + for (i = 0; i < mp->m_ino_geo.ig_inodes_per_cluster; i++) { int ioffset = i << mp->m_sb.sb_inodelog; uint isize = xfs_dinode_size(version); @@ -616,24 +617,26 @@ xfs_inobt_insert_sprec( * Allocate new inodes in the allocation group specified by agbp. * Return 0 for success, else error code. */ -STATIC int +xfs_ialloc_ag_alloc( - xfs_trans_t *tp, /* transaction pointer */ - xfs_buf_t *agbp, /* alloc group buffer */ - int *alloc) + struct xfs_trans *tp, + struct xfs_buf *agbp, + int *alloc) { - xfs_agi_t *agi; /* allocation group header */ - xfs_alloc_arg_t args; /* allocation argument structure */ - xfs_agnumber_t agno; - int error; - xfs_agino_t newino; /* new first inode's number */ - xfs_agino_t newlen; /* new number of inodes */ - int isaligned = 0; /* inode allocation at stripe unit */ - /* boundary */ - uint16_t allocmask = (uint16_t) -1; /* init. to full chunk */ + struct xfs_agi *agi; + struct xfs_alloc_arg args; + xfs_agnumber_t agno; + int error; + xfs_agino_t newino; /* new first inode's number */ + xfs_agino_t newlen; /* new number of inodes */ + int isaligned = 0; /* inode allocation at stripe */ + /* unit boundary */ + /* init.
to full chunk */ + uint16_t allocmask = (uint16_t) -1; struct xfs_inobt_rec_incore rec; - struct xfs_perag *pag; - int do_sparse = 0; + struct xfs_perag *pag; + struct xfs_ino_geometry *igeo = &tp->t_mountp->m_ino_geo; + int do_sparse = 0; memset(&args, 0, sizeof(args)); args.tp = tp; @@ -644,7 +647,7 @@ xfs_ialloc_ag_alloc( #ifdef DEBUG /* randomly do sparse inode allocations */ if (xfs_sb_version_hassparseinodes(&tp->t_mountp->m_sb) && - args.mp->m_ialloc_min_blks < args.mp->m_ialloc_blks) + igeo->ig_ialloc_min_blks < igeo->ig_ialloc_blks) do_sparse = prandom_u32() & 1; #endif @@ -652,12 +655,12 @@ xfs_ialloc_ag_alloc( * Locking will ensure that we don't have two callers in here * at one time. */ - newlen = args.mp->m_ialloc_inos; - if (args.mp->m_maxicount && + newlen = igeo->ig_ialloc_inos; + if (igeo->ig_maxicount && percpu_counter_read_positive(&args.mp->m_icount) + newlen > - args.mp->m_maxicount) + igeo->ig_maxicount) return -ENOSPC; - args.minlen = args.maxlen = args.mp->m_ialloc_blks; + args.minlen = args.maxlen = igeo->ig_ialloc_blks; /* * First try to allocate inodes contiguous with the last-allocated * chunk of inodes. If the filesystem is striped, this will fill @@ -667,7 +670,7 @@ xfs_ialloc_ag_alloc( newino = be32_to_cpu(agi->agi_newino); agno = be32_to_cpu(agi->agi_seqno); args.agbno = XFS_AGINO_TO_AGBNO(args.mp, newino) + - args.mp->m_ialloc_blks; + igeo->ig_ialloc_blks; if (do_sparse) goto sparse_alloc; if (likely(newino != NULLAGINO && @@ -690,10 +693,10 @@ xfs_ialloc_ag_alloc( * but not to use them in the actual exact allocation. */ args.alignment = 1; - args.minalignslop = args.mp->m_cluster_align - 1; + args.minalignslop = args.mp->m_ino_geo.ig_cluster_align - 1; /* Allow space for the inode btree to split. */ - args.minleft = args.mp->m_in_maxlevels - 1; + args.minleft = igeo->ig_in_maxlevels - 1; if ((error = xfs_alloc_vextent(&args))) return error; @@ -720,12 +723,12 @@ xfs_ialloc_ag_alloc( * pieces, so don't need alignment anyway. */ isaligned = 0; - if (args.mp->m_sinoalign) { + if (igeo->ig_sinoalign) { ASSERT(!(args.mp->m_flags & XFS_MOUNT_NOALIGN)); args.alignment = args.mp->m_dalign; isaligned = 1; } else - args.alignment = args.mp->m_cluster_align; + args.alignment = args.mp->m_ino_geo.ig_cluster_align; /* * Need to figure out where to allocate the inode blocks. * Ideally they should be spaced out through the a.g. @@ -741,7 +744,7 @@ xfs_ialloc_ag_alloc( /* * Allow space for the inode btree to split. */ - args.minleft = args.mp->m_in_maxlevels - 1; + args.minleft = igeo->ig_in_maxlevels - 1; if ((error = xfs_alloc_vextent(&args))) return error; } @@ -754,7 +757,7 @@ xfs_ialloc_ag_alloc( args.type = XFS_ALLOCTYPE_NEAR_BNO; args.agbno = be32_to_cpu(agi->agi_root); args.fsbno = XFS_AGB_TO_FSB(args.mp, agno, args.agbno); - args.alignment = args.mp->m_cluster_align; + args.alignment = args.mp->m_ino_geo.ig_cluster_align; if ((error = xfs_alloc_vextent(&args))) return error; } @@ -764,7 +767,7 @@ xfs_ialloc_ag_alloc( * the sparse allocation length is smaller than a full chunk. 
*/ if (xfs_sb_version_hassparseinodes(&args.mp->m_sb) && - args.mp->m_ialloc_min_blks < args.mp->m_ialloc_blks && + igeo->ig_ialloc_min_blks < igeo->ig_ialloc_blks && args.fsbno == NULLFSBLOCK) { sparse_alloc: args.type = XFS_ALLOCTYPE_NEAR_BNO; @@ -773,7 +776,7 @@ xfs_ialloc_ag_alloc( args.alignment = args.mp->m_sb.sb_spino_align; args.prod = 1; - args.minlen = args.mp->m_ialloc_min_blks; + args.minlen = igeo->ig_ialloc_min_blks; args.maxlen = args.minlen; /* @@ -789,7 +792,7 @@ xfs_ialloc_ag_alloc( args.min_agbno = args.mp->m_sb.sb_inoalignmt; args.max_agbno = round_down(args.mp->m_sb.sb_agblocks, args.mp->m_sb.sb_inoalignmt) - - args.mp->m_ialloc_blks; + igeo->ig_ialloc_blks; error = xfs_alloc_vextent(&args); if (error) @@ -1006,8 +1009,8 @@ xfs_ialloc_ag_select( * space needed for alignment of inode chunks when checking the * longest contiguous free space in the AG - this prevents us * from getting ENOSPC because we have free space larger than - * m_ialloc_blks but alignment constraints prevent us from using - * it. + * ig_ialloc_blks but alignment constraints prevent us from + * using it. * * If we can't find an AG with space for full alignment slack to * be taken into account, we must be near ENOSPC in all AGs. @@ -1015,9 +1018,9 @@ xfs_ialloc_ag_select( * if we fail allocation due to alignment issues then it is most * likely a real ENOSPC condition. */ - ineed = mp->m_ialloc_min_blks; + ineed = mp->m_ino_geo.ig_ialloc_min_blks; if (flags && ineed > 1) - ineed += mp->m_cluster_align; + ineed += mp->m_ino_geo.ig_cluster_align; longest = pag->pagf_longest; if (!longest) longest = pag->pagf_flcount > 0; @@ -1703,6 +1706,7 @@ xfs_dialloc( int noroom = 0; xfs_agnumber_t start_agno; struct xfs_perag *pag; + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; int okalloc = 1; if (*IO_agbp) { @@ -1733,9 +1737,9 @@ xfs_dialloc( * Read rough value of mp->m_icount by percpu_counter_read_positive, * which will sacrifice the preciseness but improve the performance. */ - if (mp->m_maxicount && - percpu_counter_read_positive(&mp->m_icount) + mp->m_ialloc_inos - > mp->m_maxicount) { + if (mp->m_ino_geo.ig_maxicount && + percpu_counter_read_positive(&mp->m_icount) + igeo->ig_ialloc_inos + > igeo->ig_maxicount) { noroom = 1; okalloc = 0; } @@ -1852,7 +1856,8 @@ xfs_difree_inode_chunk( if (!xfs_inobt_issparse(rec->ir_holemask)) { /* not sparse, calculate extent info directly */ xfs_bmap_add_free(tp, XFS_AGB_TO_FSB(mp, agno, sagbno), - mp->m_ialloc_blks, &XFS_RMAP_OINFO_INODES); + mp->m_ino_geo.ig_ialloc_blks, + &XFS_RMAP_OINFO_INODES); return; } @@ -2261,7 +2266,7 @@ xfs_imap_lookup( /* check that the returned record contains the required inode */ if (rec.ir_startino > agino || - rec.ir_startino + mp->m_ialloc_inos <= agino) + rec.ir_startino + mp->m_ino_geo.ig_ialloc_inos <= agino) return -EINVAL; /* for untrusted inodes check it is allocated first */ @@ -2352,7 +2357,7 @@ xfs_imap( * If the inode cluster size is the same as the blocksize or * smaller we get to the buffer by simple arithmetics. */ - if (mp->m_blocks_per_cluster == 1) { + if (mp->m_ino_geo.ig_blocks_per_cluster == 1) { offset = XFS_INO_TO_OFFSET(mp, ino); ASSERT(offset < mp->m_sb.sb_inopblock); @@ -2368,8 +2373,8 @@ xfs_imap( * find the location. Otherwise we have to do a btree * lookup to find the location. 
*/ - if (mp->m_inoalign_mask) { - offset_agbno = agbno & mp->m_inoalign_mask; + if (mp->m_ino_geo.ig_inoalign_mask) { + offset_agbno = agbno & mp->m_ino_geo.ig_inoalign_mask; chunk_agbno = agbno - offset_agbno; } else { error = xfs_imap_lookup(mp, tp, agno, agino, agbno, @@ -2381,13 +2386,13 @@ xfs_imap( out_map: ASSERT(agbno >= chunk_agbno); cluster_agbno = chunk_agbno + - ((offset_agbno / mp->m_blocks_per_cluster) * - mp->m_blocks_per_cluster); + ((offset_agbno / mp->m_ino_geo.ig_blocks_per_cluster) * + mp->m_ino_geo.ig_blocks_per_cluster); offset = ((agbno - cluster_agbno) * mp->m_sb.sb_inopblock) + XFS_INO_TO_OFFSET(mp, ino); imap->im_blkno = XFS_AGB_TO_DADDR(mp, agno, cluster_agbno); - imap->im_len = XFS_FSB_TO_BB(mp, mp->m_blocks_per_cluster); + imap->im_len = XFS_FSB_TO_BB(mp, mp->m_ino_geo.ig_blocks_per_cluster); imap->im_boffset = (unsigned short)(offset << mp->m_sb.sb_inodelog); /* @@ -2409,7 +2414,7 @@ xfs_imap( } /* - * Compute and fill in value of m_in_maxlevels. + * Compute and fill in value of m_ino_geo.ig_in_maxlevels. */ void xfs_ialloc_compute_maxlevels( @@ -2418,8 +2423,8 @@ xfs_ialloc_compute_maxlevels( uint inodes; inodes = (1LL << XFS_INO_AGINO_BITS(mp)) >> XFS_INODES_PER_CHUNK_LOG; - mp->m_in_maxlevels = xfs_btree_compute_maxlevels(mp->m_inobt_mnr, - inodes); + mp->m_ino_geo.ig_in_maxlevels = xfs_btree_compute_maxlevels( + mp->m_ino_geo.ig_inobt_mnr, inodes); } /* diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h index e936b7cc9389..b74fa2addd51 100644 --- a/fs/xfs/libxfs/xfs_ialloc.h +++ b/fs/xfs/libxfs/xfs_ialloc.h @@ -28,9 +28,9 @@ static inline int xfs_icluster_size_fsb( struct xfs_mount *mp) { - if (mp->m_sb.sb_blocksize >= mp->m_inode_cluster_size) + if (mp->m_sb.sb_blocksize >= mp->m_ino_geo.ig_min_cluster_size) return 1; - return mp->m_inode_cluster_size >> mp->m_sb.sb_blocklog; + return mp->m_ino_geo.ig_min_cluster_size >> mp->m_sb.sb_blocklog; } /* @@ -96,7 +96,7 @@ xfs_imap( uint flags); /* flags for inode btree lookup */ /* - * Compute and fill in value of m_in_maxlevels. + * Compute and fill in value of m_ino_geo.ig_in_maxlevels. */ void xfs_ialloc_compute_maxlevels( diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index bc2dfacd2f4a..79cc5cf21e1b 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -28,7 +28,7 @@ xfs_inobt_get_minrecs( struct xfs_btree_cur *cur, int level) { - return cur->bc_mp->m_inobt_mnr[level != 0]; + return cur->bc_mp->m_ino_geo.ig_inobt_mnr[level != 0]; } STATIC struct xfs_btree_cur * @@ -164,7 +164,7 @@ xfs_inobt_get_maxrecs( struct xfs_btree_cur *cur, int level) { - return cur->bc_mp->m_inobt_mxr[level != 0]; + return cur->bc_mp->m_ino_geo.ig_inobt_mxr[level != 0]; } STATIC void @@ -281,10 +281,11 @@ xfs_inobt_verify( /* level verification */ level = be16_to_cpu(block->bb_level); - if (level >= mp->m_in_maxlevels) + if (level >= mp->m_ino_geo.ig_in_maxlevels) return __this_address; - return xfs_btree_sblock_verify(bp, mp->m_inobt_mxr[level != 0]); + return xfs_btree_sblock_verify(bp, + mp->m_ino_geo.ig_inobt_mxr[level != 0]); } static void @@ -546,7 +547,7 @@ xfs_inobt_max_size( xfs_agblock_t agblocks = xfs_ag_block_count(mp, agno); /* Bail out if we're uninitialized, which can happen in mkfs. 
*/ - if (mp->m_inobt_mxr[0] == 0) + if (mp->m_ino_geo.ig_inobt_mxr[0] == 0) return 0; /* @@ -558,7 +559,7 @@ xfs_inobt_max_size( XFS_FSB_TO_AGNO(mp, mp->m_sb.sb_logstart) == agno) agblocks -= mp->m_sb.sb_logblocks; - return xfs_btree_calc_size(mp->m_inobt_mnr, + return xfs_btree_calc_size(mp->m_ino_geo.ig_inobt_mnr, (uint64_t)agblocks * mp->m_sb.sb_inopblock / XFS_INODES_PER_CHUNK); } @@ -619,5 +620,5 @@ xfs_iallocbt_calc_size( struct xfs_mount *mp, unsigned long long len) { - return xfs_btree_calc_size(mp->m_inobt_mnr, len); + return xfs_btree_calc_size(mp->m_ino_geo.ig_inobt_mnr, len); } diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index e021d5133ccb..641aa1c2f1ae 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ b/fs/xfs/libxfs/xfs_inode_buf.c @@ -36,7 +36,7 @@ xfs_inobp_check( int j; xfs_dinode_t *dip; - j = mp->m_inode_cluster_size >> mp->m_sb.sb_inodelog; + j = mp->m_ino_geo.ig_min_cluster_size >> mp->m_sb.sb_inodelog; for (i = 0; i < j; i++) { dip = xfs_buf_offset(bp, i * mp->m_sb.sb_inodesize); diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c index e76a3e5d28d7..9416fc741788 100644 --- a/fs/xfs/libxfs/xfs_sb.c +++ b/fs/xfs/libxfs/xfs_sb.c @@ -804,16 +804,18 @@ const struct xfs_buf_ops xfs_sb_quiet_buf_ops = { */ void xfs_sb_mount_common( - struct xfs_mount *mp, - struct xfs_sb *sbp) + struct xfs_mount *mp, + struct xfs_sb *sbp) { + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; + mp->m_agfrotor = mp->m_agirotor = 0; mp->m_maxagi = mp->m_sb.sb_agcount; mp->m_blkbit_log = sbp->sb_blocklog + XFS_NBBYLOG; mp->m_blkbb_log = sbp->sb_blocklog - BBSHIFT; mp->m_sectbb_log = sbp->sb_sectlog - BBSHIFT; mp->m_agno_log = xfs_highbit32(sbp->sb_agcount - 1) + 1; - mp->m_agino_log = sbp->sb_inopblog + sbp->sb_agblklog; + mp->m_ino_geo.ig_agino_log = sbp->sb_inopblog + sbp->sb_agblklog; mp->m_blockmask = sbp->sb_blocksize - 1; mp->m_blockwsize = sbp->sb_blocksize >> XFS_WORDLOG; mp->m_blockwmask = mp->m_blockwsize - 1; @@ -823,10 +825,10 @@ xfs_sb_mount_common( mp->m_alloc_mnr[0] = mp->m_alloc_mxr[0] / 2; mp->m_alloc_mnr[1] = mp->m_alloc_mxr[1] / 2; - mp->m_inobt_mxr[0] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 1); - mp->m_inobt_mxr[1] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 0); - mp->m_inobt_mnr[0] = mp->m_inobt_mxr[0] / 2; - mp->m_inobt_mnr[1] = mp->m_inobt_mxr[1] / 2; + igeo->ig_inobt_mxr[0] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 1); + igeo->ig_inobt_mxr[1] = xfs_inobt_maxrecs(mp, sbp->sb_blocksize, 0); + igeo->ig_inobt_mnr[0] = igeo->ig_inobt_mxr[0] / 2; + igeo->ig_inobt_mnr[1] = igeo->ig_inobt_mxr[1] / 2; mp->m_bmap_dmxr[0] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, 1); mp->m_bmap_dmxr[1] = xfs_bmbt_maxrecs(mp, sbp->sb_blocksize, 0); @@ -844,14 +846,14 @@ xfs_sb_mount_common( mp->m_refc_mnr[1] = mp->m_refc_mxr[1] / 2; mp->m_bsize = XFS_FSB_TO_BB(mp, 1); - mp->m_ialloc_inos = max_t(uint16_t, XFS_INODES_PER_CHUNK, + igeo->ig_ialloc_inos = max_t(uint16_t, XFS_INODES_PER_CHUNK, sbp->sb_inopblock); - mp->m_ialloc_blks = mp->m_ialloc_inos >> sbp->sb_inopblog; + igeo->ig_ialloc_blks = igeo->ig_ialloc_inos >> sbp->sb_inopblog; if (sbp->sb_spino_align) - mp->m_ialloc_min_blks = sbp->sb_spino_align; + igeo->ig_ialloc_min_blks = sbp->sb_spino_align; else - mp->m_ialloc_min_blks = mp->m_ialloc_blks; + igeo->ig_ialloc_min_blks = igeo->ig_ialloc_blks; mp->m_alloc_set_aside = xfs_alloc_set_aside(mp); mp->m_ag_max_usable = xfs_alloc_ag_max_usable(mp); } diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 83f4ee2afc49..0d0d24729606 
100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -136,9 +136,10 @@ STATIC uint xfs_calc_inobt_res( struct xfs_mount *mp) { - return xfs_calc_buf_res(mp->m_in_maxlevels, XFS_FSB_TO_B(mp, 1)) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), - XFS_FSB_TO_B(mp, 1)); + return xfs_calc_buf_res(mp->m_ino_geo.ig_in_maxlevels, + XFS_FSB_TO_B(mp, 1)) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + XFS_FSB_TO_B(mp, 1)); } /* @@ -167,7 +168,7 @@ xfs_calc_finobt_res( * includes: * * the allocation btrees: 2 trees * (max depth - 1) * block size - * the inode chunk: m_ialloc_blks * N + * the inode chunk: m_ino_geo.ig_ialloc_blks * N * * The size N of the inode chunk reservation depends on whether it is for * allocation or free and which type of create transaction is in use. An inode @@ -193,7 +194,7 @@ xfs_calc_inode_chunk_res( size = XFS_FSB_TO_B(mp, 1); } - res += xfs_calc_buf_res(mp->m_ialloc_blks, size); + res += xfs_calc_buf_res(mp->m_ino_geo.ig_ialloc_blks, size); return res; } @@ -307,7 +308,8 @@ xfs_calc_iunlink_remove_reservation( struct xfs_mount *mp) { return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + - 2 * max_t(uint, XFS_FSB_TO_B(mp, 1), mp->m_inode_cluster_size); + 2 * max_t(uint, XFS_FSB_TO_B(mp, 1), + mp->m_ino_geo.ig_min_cluster_size); } /* @@ -345,7 +347,8 @@ STATIC uint xfs_calc_iunlink_add_reservation(xfs_mount_t *mp) { return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + - max_t(uint, XFS_FSB_TO_B(mp, 1), mp->m_inode_cluster_size); + max_t(uint, XFS_FSB_TO_B(mp, 1), + mp->m_ino_geo.ig_min_cluster_size); } /* diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h index a62fb950bef1..65ecc600ef44 100644 --- a/fs/xfs/libxfs/xfs_trans_space.h +++ b/fs/xfs/libxfs/xfs_trans_space.h @@ -56,9 +56,9 @@ #define XFS_DIRREMOVE_SPACE_RES(mp) \ XFS_DAREMOVE_SPACE_RES(mp, XFS_DATA_FORK) #define XFS_IALLOC_SPACE_RES(mp) \ - ((mp)->m_ialloc_blks + \ + ((mp)->m_ino_geo.ig_ialloc_blks + \ (xfs_sb_version_hasfinobt(&mp->m_sb) ? 2 : 1 * \ - ((mp)->m_in_maxlevels - 1))) + ((mp)->m_ino_geo.ig_in_maxlevels - 1))) /* * Space reservation values for various transactions. @@ -94,7 +94,8 @@ #define XFS_SYMLINK_SPACE_RES(mp,nl,b) \ (XFS_IALLOC_SPACE_RES(mp) + XFS_DIRENTER_SPACE_RES(mp,nl) + (b)) #define XFS_IFREE_SPACE_RES(mp) \ - (xfs_sb_version_hasfinobt(&mp->m_sb) ? (mp)->m_in_maxlevels : 0) + (xfs_sb_version_hasfinobt(&mp->m_sb) ? \ + (mp)->m_ino_geo.ig_in_maxlevels : 0) #endif /* __XFS_TRANS_SPACE_H__ */ diff --git a/fs/xfs/libxfs/xfs_types.c b/fs/xfs/libxfs/xfs_types.c index d51acc95bc00..cfa3f407eacd 100644 --- a/fs/xfs/libxfs/xfs_types.c +++ b/fs/xfs/libxfs/xfs_types.c @@ -87,14 +87,14 @@ xfs_agino_range( * Calculate the first inode, which will be in the first * cluster-aligned block after the AGFL. */ - bno = round_up(XFS_AGFL_BLOCK(mp) + 1, mp->m_cluster_align); + bno = round_up(XFS_AGFL_BLOCK(mp) + 1, mp->m_ino_geo.ig_cluster_align); *first = XFS_AGB_TO_AGINO(mp, bno); /* * Calculate the last inode, which will be at the end of the * last (aligned) cluster that can be allocated in the AG. 
*/ - bno = round_down(eoag, mp->m_cluster_align); + bno = round_down(eoag, mp->m_ino_geo.ig_cluster_align); *last = XFS_AGB_TO_AGINO(mp, bno) - 1; } diff --git a/fs/xfs/scrub/ialloc.c b/fs/xfs/scrub/ialloc.c index 9b47117180cb..fa7386bf76e9 100644 --- a/fs/xfs/scrub/ialloc.c +++ b/fs/xfs/scrub/ialloc.c @@ -230,7 +230,7 @@ xchk_iallocbt_check_cluster( int error = 0; nr_inodes = min_t(unsigned int, XFS_INODES_PER_CHUNK, - mp->m_inodes_per_cluster); + mp->m_ino_geo.ig_inodes_per_cluster); /* Map this inode cluster */ agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino + cluster_base); @@ -251,7 +251,7 @@ xchk_iallocbt_check_cluster( */ ir_holemask = (irec->ir_holemask & cluster_mask); imap.im_blkno = XFS_AGB_TO_DADDR(mp, agno, agbno); - imap.im_len = XFS_FSB_TO_BB(mp, mp->m_blocks_per_cluster); + imap.im_len = XFS_FSB_TO_BB(mp, mp->m_ino_geo.ig_blocks_per_cluster); imap.im_boffset = XFS_INO_TO_OFFSET(mp, irec->ir_startino) << mp->m_sb.sb_inodelog; @@ -276,12 +276,13 @@ xchk_iallocbt_check_cluster( /* If any part of this is a hole, skip it. */ if (ir_holemask) { xchk_xref_is_not_owned_by(bs->sc, agbno, - mp->m_blocks_per_cluster, + mp->m_ino_geo.ig_blocks_per_cluster, &XFS_RMAP_OINFO_INODES); return 0; } - xchk_xref_is_owned_by(bs->sc, agbno, mp->m_blocks_per_cluster, + xchk_xref_is_owned_by(bs->sc, agbno, + mp->m_ino_geo.ig_blocks_per_cluster, &XFS_RMAP_OINFO_INODES); /* Grab the inode cluster buffer. */ @@ -333,7 +334,7 @@ xchk_iallocbt_check_clusters( */ for (cluster_base = 0; cluster_base < XFS_INODES_PER_CHUNK; - cluster_base += bs->sc->mp->m_inodes_per_cluster) { + cluster_base += bs->sc->mp->m_ino_geo.ig_inodes_per_cluster) { error = xchk_iallocbt_check_cluster(bs, irec, cluster_base); if (error) break; @@ -355,6 +356,7 @@ xchk_iallocbt_rec_alignment( { struct xfs_mount *mp = bs->sc->mp; struct xchk_iallocbt *iabt = bs->private; + struct xfs_ino_geometry *ig = &mp->m_ino_geo; /* * finobt records have different positioning requirements than inobt @@ -372,7 +374,7 @@ xchk_iallocbt_rec_alignment( unsigned int imask; imask = min_t(unsigned int, XFS_INODES_PER_CHUNK, - mp->m_cluster_align_inodes) - 1; + ig->ig_cluster_align_inodes) - 1; if (irec->ir_startino & imask) xchk_btree_set_corrupt(bs->sc, bs->cur, 0); return; @@ -400,17 +402,17 @@ xchk_iallocbt_rec_alignment( } /* inobt records must be aligned to cluster and inoalignmnt size. */ - if (irec->ir_startino & (mp->m_cluster_align_inodes - 1)) { + if (irec->ir_startino & (ig->ig_cluster_align_inodes - 1)) { xchk_btree_set_corrupt(bs->sc, bs->cur, 0); return; } - if (irec->ir_startino & (mp->m_inodes_per_cluster - 1)) { + if (irec->ir_startino & (ig->ig_inodes_per_cluster - 1)) { xchk_btree_set_corrupt(bs->sc, bs->cur, 0); return; } - if (mp->m_inodes_per_cluster <= XFS_INODES_PER_CHUNK) + if (ig->ig_inodes_per_cluster <= XFS_INODES_PER_CHUNK) return; /* @@ -419,7 +421,7 @@ xchk_iallocbt_rec_alignment( * after this one. */ iabt->next_startino = irec->ir_startino + XFS_INODES_PER_CHUNK; - iabt->next_cluster_ino = irec->ir_startino + mp->m_inodes_per_cluster; + iabt->next_cluster_ino = irec->ir_startino + ig->ig_inodes_per_cluster; } /* Scrub an inobt/finobt record. 
*/ diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c index 5dfe2b5924db..bf4c4630e1df 100644 --- a/fs/xfs/scrub/quota.c +++ b/fs/xfs/scrub/quota.c @@ -144,7 +144,7 @@ xchk_quota_item( if (bsoft > bhard) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); - if (ihard > mp->m_maxicount) + if (ihard > mp->m_ino_geo.ig_maxicount) xchk_fblock_set_warning(sc, XFS_DATA_FORK, offset); if (isoft > ihard) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, offset); diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index 3d0e0570e3aa..564bb64d51d3 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -251,9 +251,9 @@ xfs_growfs_data( if (mp->m_sb.sb_imax_pct) { uint64_t icount = mp->m_sb.sb_dblocks * mp->m_sb.sb_imax_pct; do_div(icount, 100); - mp->m_maxicount = XFS_FSB_TO_INO(mp, icount); + mp->m_ino_geo.ig_maxicount = XFS_FSB_TO_INO(mp, icount); } else - mp->m_maxicount = 0; + mp->m_ino_geo.ig_maxicount = 0; /* Update secondary superblocks now the physical grow has completed */ error = xfs_update_secondary_sbs(mp); diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 71d216cf6f87..28ad467607cf 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2537,13 +2537,14 @@ xfs_ifree_cluster( xfs_inode_log_item_t *iip; struct xfs_log_item *lip; struct xfs_perag *pag; + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; xfs_ino_t inum; inum = xic->first_ino; pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, inum)); - nbufs = mp->m_ialloc_blks / mp->m_blocks_per_cluster; + nbufs = igeo->ig_ialloc_blks / igeo->ig_blocks_per_cluster; - for (j = 0; j < nbufs; j++, inum += mp->m_inodes_per_cluster) { + for (j = 0; j < nbufs; j++, inum += igeo->ig_inodes_per_cluster) { /* * The allocation bitmap tells us which inodes of the chunk were * physically allocated. Skip the cluster if an inode falls into @@ -2551,7 +2552,7 @@ xfs_ifree_cluster( */ ioffset = inum - xic->first_ino; if ((xic->alloc & XFS_INOBT_MASK(ioffset)) == 0) { - ASSERT(ioffset % mp->m_inodes_per_cluster == 0); + ASSERT(ioffset % igeo->ig_inodes_per_cluster == 0); continue; } @@ -2567,7 +2568,8 @@ xfs_ifree_cluster( * to mark all the active inodes on the buffer stale. */ bp = xfs_trans_get_buf(tp, mp->m_ddev_targp, blkno, - mp->m_bsize * mp->m_blocks_per_cluster, + mp->m_bsize * + igeo->ig_blocks_per_cluster, XBF_UNMAPPED); if (!bp) @@ -2614,7 +2616,7 @@ xfs_ifree_cluster( * transaction stale above, which means there is no point in * even trying to lock them. 
*/ - for (i = 0; i < mp->m_inodes_per_cluster; i++) { + for (i = 0; i < igeo->ig_inodes_per_cluster; i++) { retry: rcu_read_lock(); ip = radix_tree_lookup(&pag->pag_ici_root, @@ -3476,19 +3478,20 @@ xfs_iflush_cluster( int cilist_size; struct xfs_inode **cilist; struct xfs_inode *cip; + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; int nr_found; int clcount = 0; int i; pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); - inodes_per_cluster = mp->m_inode_cluster_size >> mp->m_sb.sb_inodelog; + inodes_per_cluster = igeo->ig_min_cluster_size >> mp->m_sb.sb_inodelog; cilist_size = inodes_per_cluster * sizeof(xfs_inode_t *); cilist = kmem_alloc(cilist_size, KM_MAYFAIL|KM_NOFS); if (!cilist) goto out_put; - mask = ~(((mp->m_inode_cluster_size >> mp->m_sb.sb_inodelog)) - 1); + mask = ~(((igeo->ig_min_cluster_size >> mp->m_sb.sb_inodelog)) - 1); first_index = XFS_INO_TO_AGINO(mp, ip->i_ino) & mask; rcu_read_lock(); /* really need a gang lookup range call here */ diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 1e1a0af1dd34..cff28ee73deb 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -167,6 +167,7 @@ xfs_bulkstat_ichunk_ra( xfs_agnumber_t agno, struct xfs_inobt_rec_incore *irec) { + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; xfs_agblock_t agbno; struct blk_plug plug; int i; /* inode chunk index */ @@ -174,12 +175,14 @@ xfs_bulkstat_ichunk_ra( agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); blk_start_plug(&plug); - for (i = 0; i < XFS_INODES_PER_CHUNK; - i += mp->m_inodes_per_cluster, agbno += mp->m_blocks_per_cluster) { - if (xfs_inobt_maskn(i, mp->m_inodes_per_cluster) & + for (i = 0; + i < XFS_INODES_PER_CHUNK; + i += igeo->ig_inodes_per_cluster, + agbno += igeo->ig_blocks_per_cluster) { + if (xfs_inobt_maskn(i, igeo->ig_inodes_per_cluster) & ~irec->ir_free) { xfs_btree_reada_bufs(mp, agno, agbno, - mp->m_blocks_per_cluster, + igeo->ig_blocks_per_cluster, &xfs_inode_buf_ops); } } diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 9329f5adbfbe..15118e531184 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2882,19 +2882,19 @@ xlog_recover_buffer_pass2( * * Also make sure that only inode buffers with good sizes stay in * the buffer cache. The kernel moves inodes in buffers of 1 block - * or mp->m_inode_cluster_size bytes, whichever is bigger. The inode + * or ig_min_cluster_size bytes, whichever is bigger. The inode * buffers in the log can be a different size if the log was generated * by an older kernel using unclustered inode buffers or a newer kernel * running with a different inode cluster size. Regardless, if the - * the inode buffer size isn't max(blocksize, mp->m_inode_cluster_size) - * for *our* value of mp->m_inode_cluster_size, then we need to keep + * the inode buffer size isn't max(blocksize, ig_min_cluster_size) + * for *our* value of ig_min_cluster_size, then we need to keep * the buffer out of the buffer cache so that the buffer won't * overlap with future reads of those inodes. 
*/ if (XFS_DINODE_MAGIC == be16_to_cpu(*((__be16 *)xfs_buf_offset(bp, 0))) && (BBTOB(bp->b_io_length) != max(log->l_mp->m_sb.sb_blocksize, - (uint32_t)log->l_mp->m_inode_cluster_size))) { + (uint32_t)log->l_mp->m_ino_geo.ig_min_cluster_size))) { xfs_buf_stale(bp); error = xfs_bwrite(bp); } else { @@ -3849,6 +3849,7 @@ xlog_recover_do_icreate_pass2( { struct xfs_mount *mp = log->l_mp; struct xfs_icreate_log *icl; + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; xfs_agnumber_t agno; xfs_agblock_t agbno; unsigned int count; @@ -3898,10 +3899,10 @@ xlog_recover_do_icreate_pass2( /* * The inode chunk is either full or sparse and we only support - * m_ialloc_min_blks sized sparse allocations at this time. + * m_ino_geo.ig_ialloc_min_blks sized sparse allocations at this time. */ - if (length != mp->m_ialloc_blks && - length != mp->m_ialloc_min_blks) { + if (length != igeo->ig_ialloc_blks && + length != igeo->ig_ialloc_min_blks) { xfs_warn(log->l_mp, "%s: unsupported chunk length", __FUNCTION__); return -EINVAL; @@ -3921,13 +3922,13 @@ xlog_recover_do_icreate_pass2( * buffers for cancellation so we don't overwrite anything written after * a cancellation. */ - bb_per_cluster = XFS_FSB_TO_BB(mp, mp->m_blocks_per_cluster); - nbufs = length / mp->m_blocks_per_cluster; + bb_per_cluster = XFS_FSB_TO_BB(mp, igeo->ig_blocks_per_cluster); + nbufs = length / igeo->ig_blocks_per_cluster; for (i = 0, cancel_count = 0; i < nbufs; i++) { xfs_daddr_t daddr; - daddr = XFS_AGB_TO_DADDR(mp, agno, - agbno + i * mp->m_blocks_per_cluster); + daddr = XFS_AGB_TO_DADDR(mp, agno, agbno + + i * igeo->ig_blocks_per_cluster); if (xlog_check_buffer_cancelled(log, daddr, bb_per_cluster, 0)) cancel_count++; } diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 6b2bfe81dc51..17c47682609b 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -433,10 +433,12 @@ xfs_update_alignment(xfs_mount_t *mp) * Set the maximum inode count for this filesystem */ STATIC void -xfs_set_maxicount(xfs_mount_t *mp) +xfs_set_maxicount( + struct xfs_mount *mp) { - xfs_sb_t *sbp = &(mp->m_sb); - uint64_t icount; + struct xfs_sb *sbp = &(mp->m_sb); + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; + uint64_t icount; if (sbp->sb_imax_pct) { /* @@ -445,11 +447,11 @@ xfs_set_maxicount(xfs_mount_t *mp) */ icount = sbp->sb_dblocks * sbp->sb_imax_pct; do_div(icount, 100); - do_div(icount, mp->m_ialloc_blks); - mp->m_maxicount = (icount * mp->m_ialloc_blks) << - sbp->sb_inopblog; + do_div(icount, igeo->ig_ialloc_blks); + igeo->ig_maxicount = XFS_FSB_TO_INO(mp, + icount * igeo->ig_ialloc_blks); } else { - mp->m_maxicount = 0; + igeo->ig_maxicount = 0; } } @@ -518,18 +520,18 @@ xfs_set_inoalignment(xfs_mount_t *mp) { if (xfs_sb_version_hasalign(&mp->m_sb) && mp->m_sb.sb_inoalignmt >= xfs_icluster_size_fsb(mp)) - mp->m_inoalign_mask = mp->m_sb.sb_inoalignmt - 1; + mp->m_ino_geo.ig_inoalign_mask = mp->m_sb.sb_inoalignmt - 1; else - mp->m_inoalign_mask = 0; + mp->m_ino_geo.ig_inoalign_mask = 0; /* * If we are using stripe alignment, check whether * the stripe unit is a multiple of the inode alignment */ - if (mp->m_dalign && mp->m_inoalign_mask && - !(mp->m_dalign & mp->m_inoalign_mask)) - mp->m_sinoalign = mp->m_dalign; + if (mp->m_dalign && mp->m_ino_geo.ig_inoalign_mask && + !(mp->m_dalign & mp->m_ino_geo.ig_inoalign_mask)) + mp->m_ino_geo.ig_sinoalign = mp->m_dalign; else - mp->m_sinoalign = 0; + mp->m_ino_geo.ig_sinoalign = 0; } /* @@ -683,6 +685,7 @@ xfs_mountfs( { struct xfs_sb *sbp = &(mp->m_sb); struct xfs_inode *rip; + struct 
xfs_ino_geometry *igeo = &mp->m_ino_geo; uint64_t resblks; uint quotamount = 0; uint quotaflags = 0; @@ -797,18 +800,20 @@ xfs_mountfs( * has set the inode alignment value appropriately for larger cluster * sizes. */ - mp->m_inode_cluster_size = XFS_INODE_BIG_CLUSTER_SIZE; + igeo->ig_min_cluster_size = XFS_INODE_BIG_CLUSTER_SIZE; if (xfs_sb_version_hascrc(&mp->m_sb)) { - int new_size = mp->m_inode_cluster_size; + int new_size = igeo->ig_min_cluster_size; new_size *= mp->m_sb.sb_inodesize / XFS_DINODE_MIN_SIZE; if (mp->m_sb.sb_inoalignmt >= XFS_B_TO_FSBT(mp, new_size)) - mp->m_inode_cluster_size = new_size; + igeo->ig_min_cluster_size = new_size; } - mp->m_blocks_per_cluster = xfs_icluster_size_fsb(mp); - mp->m_inodes_per_cluster = XFS_FSB_TO_INO(mp, mp->m_blocks_per_cluster); - mp->m_cluster_align = xfs_ialloc_cluster_alignment(mp); - mp->m_cluster_align_inodes = XFS_FSB_TO_INO(mp, mp->m_cluster_align); + igeo->ig_blocks_per_cluster = xfs_icluster_size_fsb(mp); + igeo->ig_inodes_per_cluster = XFS_FSB_TO_INO(mp, + igeo->ig_blocks_per_cluster); + igeo->ig_cluster_align = xfs_ialloc_cluster_alignment(mp); + igeo->ig_cluster_align_inodes = XFS_FSB_TO_INO(mp, + igeo->ig_cluster_align); /* * If enabled, sparse inode chunk alignment is expected to match the @@ -817,11 +822,11 @@ xfs_mountfs( */ if (xfs_sb_version_hassparseinodes(&mp->m_sb) && mp->m_sb.sb_spino_align != - XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)) { + XFS_B_TO_FSBT(mp, igeo->ig_min_cluster_size)) { xfs_warn(mp, "Sparse inode block alignment (%u) must match cluster size (%llu).", mp->m_sb.sb_spino_align, - XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size)); + XFS_B_TO_FSBT(mp, igeo->ig_min_cluster_size)); error = -EINVAL; goto out_remove_uuid; } diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index c81a5cd7c228..89cbb1268b63 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -105,6 +105,7 @@ typedef struct xfs_mount { struct xfs_da_geometry *m_dir_geo; /* directory block geometry */ struct xfs_da_geometry *m_attr_geo; /* attribute block geometry */ struct xlog *m_log; /* log specific stuff */ + struct xfs_ino_geometry m_ino_geo; /* inode geometry */ int m_logbufs; /* number of log buffers */ int m_logbsize; /* size of each log buffer */ uint m_rsumlevels; /* rt summary levels */ @@ -126,12 +127,6 @@ typedef struct xfs_mount { uint8_t m_blkbit_log; /* blocklog + NBBY */ uint8_t m_blkbb_log; /* blocklog - BBSHIFT */ uint8_t m_agno_log; /* log #ag's */ - uint8_t m_agino_log; /* #bits for agino in inum */ - uint m_inode_cluster_size;/* min inode buf size */ - unsigned int m_inodes_per_cluster; - unsigned int m_blocks_per_cluster; - unsigned int m_cluster_align; - unsigned int m_cluster_align_inodes; uint m_blockmask; /* sb_blocksize-1 */ uint m_blockwsize; /* sb_blocksize in words */ uint m_blockwmask; /* blockwsize-1 */ @@ -139,15 +134,12 @@ typedef struct xfs_mount { uint m_alloc_mnr[2]; /* min alloc btree records */ uint m_bmap_dmxr[2]; /* max bmap btree records */ uint m_bmap_dmnr[2]; /* min bmap btree records */ - uint m_inobt_mxr[2]; /* max inobt btree records */ - uint m_inobt_mnr[2]; /* min inobt btree records */ uint m_rmap_mxr[2]; /* max rmap btree records */ uint m_rmap_mnr[2]; /* min rmap btree records */ uint m_refc_mxr[2]; /* max refc btree records */ uint m_refc_mnr[2]; /* min refc btree records */ uint m_ag_maxlevels; /* XFS_AG_MAXLEVELS */ uint m_bm_maxlevels[2]; /* XFS_BM_MAXLEVELS */ - uint m_in_maxlevels; /* max inobt btree levels. 
*/ uint m_rmap_maxlevels; /* max rmap btree levels */ uint m_refc_maxlevels; /* max refcount btree level */ xfs_extlen_t m_ag_prealloc_blocks; /* reserved ag blocks */ @@ -159,20 +151,13 @@ typedef struct xfs_mount { int m_fixedfsid[2]; /* unchanged for life of FS */ uint64_t m_flags; /* global mount flags */ bool m_finobt_nores; /* no per-AG finobt resv. */ - int m_ialloc_inos; /* inodes in inode allocation */ - int m_ialloc_blks; /* blocks in inode allocation */ - int m_ialloc_min_blks;/* min blocks in sparse inode - * allocation */ - int m_inoalign_mask;/* mask sb_inoalignmt if used */ uint m_qflags; /* quota status flags */ struct xfs_trans_resv m_resv; /* precomputed res values */ - uint64_t m_maxicount; /* maximum inode count */ uint64_t m_resblks; /* total reserved blocks */ uint64_t m_resblks_avail;/* available reserved blocks */ uint64_t m_resblks_save; /* reserved blks @ remount,ro */ int m_dalign; /* stripe unit */ int m_swidth; /* stripe width */ - int m_sinoalign; /* stripe unit inode alignment */ uint8_t m_sectbb_log; /* sectlog - BBSHIFT */ const struct xfs_nameops *m_dirnameops; /* vector of dir name ops */ const struct xfs_dir_ops *m_dir_inode_ops; /* vector of dir inode ops */ diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index a14d11d78bd8..4b44c55e3022 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -582,7 +582,7 @@ xfs_set_inode_alloc( * Calculate how much should be reserved for inodes to meet * the max inode percentage. Used only for inode32. */ - if (mp->m_maxicount) { + if (mp->m_ino_geo.ig_maxicount) { uint64_t icount; icount = sbp->sb_dblocks * sbp->sb_imax_pct; @@ -1131,10 +1131,10 @@ xfs_fs_statfs( fakeinos = XFS_FSB_TO_INO(mp, statp->f_bfree); statp->f_files = min(icount + fakeinos, (uint64_t)XFS_MAXINUMBER); - if (mp->m_maxicount) + if (mp->m_ino_geo.ig_maxicount) statp->f_files = min_t(typeof(statp->f_files), statp->f_files, - mp->m_maxicount); + mp->m_ino_geo.ig_maxicount); /* If sb_icount overshot maxicount, report actual allocation */ statp->f_files = max_t(typeof(statp->f_files), From patchwork Wed May 29 22:26:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 10967845
Subject: [PATCH 02/11] xfs: create simplified inode walk function
From: "Darrick J. Wong"
To: darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Date: Wed, 29 May 2019 15:26:27 -0700
Message-ID: <155916878781.757870.16686660432312494436.stgit@magnolia>
In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia>
References: <155916877311.757870.11060347556535201032.stgit@magnolia>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0

From: Darrick J. Wong

Signed-off-by: Darrick J. Wong
---
 fs/xfs/Makefile | 1
 fs/xfs/xfs_itable.c | 5 -
 fs/xfs/xfs_itable.h | 8 +
 fs/xfs/xfs_iwalk.c | 402 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_iwalk.h | 18 ++
 fs/xfs/xfs_trace.h | 40 +++++
 6 files changed, 472 insertions(+), 2 deletions(-)
 create mode 100644 fs/xfs/xfs_iwalk.c
 create mode 100644 fs/xfs/xfs_iwalk.h

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 91831975363b..74d30ef0dbce 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -80,6 +80,7 @@ xfs-y += xfs_aops.o \ xfs_iops.o \ xfs_inode.o \ xfs_itable.o \ + xfs_iwalk.o \ xfs_message.o \ xfs_mount.o \ xfs_mru_cache.o \ diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index cff28ee73deb..96590d9f917c 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -19,6 +19,7 @@ #include "xfs_trace.h" #include "xfs_icache.h" #include "xfs_health.h" +#include "xfs_iwalk.h" /* * Return stat information for one inode. @@ -161,7 +162,7 @@ xfs_bulkstat_one( * Loop over all clusters in a chunk for a given incore inode allocation btree * record. Do a readahead if there are any allocated inodes in that cluster. */ -STATIC void +void xfs_bulkstat_ichunk_ra( struct xfs_mount *mp, xfs_agnumber_t agno, @@ -195,7 +196,7 @@ xfs_bulkstat_ichunk_ra( * are some left allocated, update the data for the pointed-to record as well as * return the count of grabbed inodes. */ -STATIC int +int xfs_bulkstat_grab_ichunk( struct xfs_btree_cur *cur, /* btree cursor */ xfs_agino_t agino, /* starting inode of chunk */ diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 8a822285b671..369e3f159d4e 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -84,4 +84,12 @@ xfs_inumbers( void __user *buffer, /* buffer with inode info */ inumbers_fmt_pf formatter); +/* Temporarily needed while we refactor functions.
*/ +struct xfs_btree_cur; +struct xfs_inobt_rec_incore; +void xfs_bulkstat_ichunk_ra(struct xfs_mount *mp, xfs_agnumber_t agno, + struct xfs_inobt_rec_incore *irec); +int xfs_bulkstat_grab_ichunk(struct xfs_btree_cur *cur, xfs_agino_t agino, + int *icount, struct xfs_inobt_rec_incore *irec); + #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c new file mode 100644 index 000000000000..0ce3baa159ba --- /dev/null +++ b/fs/xfs/xfs_iwalk.c @@ -0,0 +1,402 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_inode.h" +#include "xfs_btree.h" +#include "xfs_ialloc.h" +#include "xfs_ialloc_btree.h" +#include "xfs_itable.h" +#include "xfs_error.h" +#include "xfs_trace.h" +#include "xfs_icache.h" +#include "xfs_health.h" +#include "xfs_trans.h" +#include "xfs_iwalk.h" + +/* + * Walking All the Inodes in the Filesystem + * ======================================== + * Starting at some @startino, call a walk function on every allocated inode in + * the system. The walk function is called with the relevant inode number and + * a pointer to caller-provided data. The walk function can return the usual + * negative error code, 0, or XFS_IWALK_ABORT to stop the iteration. This + * return value is returned to the caller. + * + * Internally, we allow the walk function to do anything, which means that we + * cannot maintain the inobt cursor or our lock on the AGI buffer. We + * therefore build up a batch of inobt records in kernel memory and only call + * the walk function when our memory buffer is full. + */ + +struct xfs_iwalk_ag { + struct xfs_mount *mp; + struct xfs_trans *tp; + + /* Where do we start the traversal? */ + xfs_ino_t startino; + + /* Array of inobt records we cache. */ + struct xfs_inobt_rec_incore *recs; + unsigned int sz_recs; + unsigned int nr_recs; + + /* Inode walk function and data pointer. */ + xfs_iwalk_fn iwalk_fn; + void *data; +}; + +/* Allocate memory for a walk. */ +STATIC int +xfs_iwalk_allocbuf( + struct xfs_iwalk_ag *iwag) +{ + size_t size; + + ASSERT(iwag->recs == NULL); + iwag->nr_recs = 0; + + /* Allocate a prefetch buffer for inobt records. */ + size = iwag->sz_recs * sizeof(struct xfs_inobt_rec_incore); + iwag->recs = kmem_alloc(size, KM_SLEEP); + if (iwag->recs == NULL) + return -ENOMEM; + + return 0; +} + +/* Free memory we allocated for a walk. */ +STATIC void +xfs_iwalk_freebuf( + struct xfs_iwalk_ag *iwag) +{ + ASSERT(iwag->recs != NULL); + kmem_free(iwag->recs); +} + +/* For each inuse inode in each cached inobt record, call our function. */ +STATIC int +xfs_iwalk_ag_recs( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_inobt_rec_incore *irec; + xfs_ino_t ino; + unsigned int i, j; + xfs_agnumber_t agno; + int error; + + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + for (i = 0, irec = iwag->recs; i < iwag->nr_recs; i++, irec++) { + trace_xfs_iwalk_ag_rec(mp, agno, irec->ir_startino, + irec->ir_free); + for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { + /* Skip if this inode is free */ + if (XFS_INOBT_MASK(j) & irec->ir_free) + continue; + + /* Otherwise call our function. 
*/ + ino = XFS_AGINO_TO_INO(mp, agno, irec->ir_startino + j); + error = iwag->iwalk_fn(mp, tp, ino, iwag->data); + if (error) + return error; + } + } + + iwag->nr_recs = 0; + return 0; +} + +/* Read AGI and create inobt cursor. */ +static inline int +xfs_iwalk_inobt_cur( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_agnumber_t agno, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp) +{ + struct xfs_btree_cur *cur; + int error; + + ASSERT(*agi_bpp == NULL); + + error = xfs_ialloc_read_agi(mp, tp, agno, agi_bpp); + if (error) + return error; + + cur = xfs_inobt_init_cursor(mp, tp, *agi_bpp, agno, XFS_BTNUM_INO); + if (!cur) + return -ENOMEM; + *curpp = cur; + return 0; +} + +/* Delete cursor and let go of AGI. */ +static inline void +xfs_iwalk_del_inobt( + struct xfs_trans *tp, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int error) +{ + if (*curpp) { + xfs_btree_del_cursor(*curpp, error); + *curpp = NULL; + } + if (*agi_bpp) { + xfs_trans_brelse(tp, *agi_bpp); + *agi_bpp = NULL; + } +} + +/* + * Set ourselves up for walking inobt records starting from a given point in + * the filesystem. + * + * If caller passed in a nonzero start inode number, load the record from the + * inobt and make the record look like all the inodes before agino are free so + * that we skip them, and then move the cursor to the next inobt record. This + * is how we support starting an iwalk in the middle of an inode chunk. + * + * If the caller passed in a start number of zero, move the cursor to the first + * inobt record. + * + * The caller is responsible for cleaning up the cursor and buffer pointer + * regardless of the error status. + */ +STATIC int +xfs_iwalk_ag_start( + struct xfs_iwalk_ag *iwag, + xfs_agnumber_t agno, + xfs_agino_t agino, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int *has_more) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + int icount; + int error; + + /* Set up a fresh cursor and empty the inobt cache. */ + iwag->nr_recs = 0; + error = xfs_iwalk_inobt_cur(mp, tp, agno, curpp, agi_bpp); + if (error) + return error; + + /* Starting at the beginning of the AG? That's easy! */ + if (agino == 0) + return xfs_inobt_lookup(*curpp, 0, XFS_LOOKUP_GE, has_more); + + /* + * Otherwise, we have to grab the inobt record where we left off, stuff + * the record into our cache, and then see if there are more records. + * We require a lookup cache of at least two elements so that we don't + * have to deal with tearing down the cursor to walk the records. + */ + error = xfs_bulkstat_grab_ichunk(*curpp, agino - 1, &icount, + &iwag->recs[iwag->nr_recs]); + if (error) + return error; + if (icount) + iwag->nr_recs++; + + ASSERT(iwag->nr_recs < iwag->sz_recs); + return xfs_btree_increment(*curpp, 0, has_more); +} + +typedef int (*xfs_iwalk_ag_recs_fn)(struct xfs_iwalk_ag *iwag); + +/* + * Acknowledge that we added an inobt record to the cache. Flush the inobt + * record cache if the buffer is full, and position the cursor wherever it + * needs to be so that we can keep going. + */ +STATIC int +xfs_iwalk_ag_increment( + struct xfs_iwalk_ag *iwag, + xfs_iwalk_ag_recs_fn walk_ag_recs_fn, + xfs_agnumber_t agno, + struct xfs_btree_cur **curpp, + struct xfs_buf **agi_bpp, + int *has_more) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_inobt_rec_incore *irec; + xfs_agino_t restart; + int error; + + iwag->nr_recs++; + + /* If there's space, just increment and look for more records. 
*/ + if (iwag->nr_recs < iwag->sz_recs) + return xfs_btree_increment(*curpp, 0, has_more); + + /* + * Otherwise the record cache is full; delete the cursor and walk the + * records... + */ + xfs_iwalk_del_inobt(tp, curpp, agi_bpp, 0); + irec = &iwag->recs[iwag->nr_recs - 1]; + restart = irec->ir_startino + XFS_INODES_PER_CHUNK - 1; + + error = walk_ag_recs_fn(iwag); + if (error) + return error; + + /* ...and recreate cursor where we left off. */ + error = xfs_iwalk_inobt_cur(mp, tp, agno, curpp, agi_bpp); + if (error) + return error; + + return xfs_inobt_lookup(*curpp, restart, XFS_LOOKUP_GE, has_more); +} + +/* Walk all inodes in a single AG, from @iwag->startino to the end of the AG. */ +STATIC int +xfs_iwalk_ag( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_buf *agi_bp = NULL; + struct xfs_btree_cur *cur = NULL; + xfs_agnumber_t agno; + xfs_agino_t agino; + int has_more; + int error = 0; + + /* Set up our cursor at the right place in the inode btree. */ + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + agino = XFS_INO_TO_AGINO(mp, iwag->startino); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more); + if (error) + goto out_cur; + + while (has_more) { + struct xfs_inobt_rec_incore *irec; + + /* Fetch the inobt record. */ + irec = &iwag->recs[iwag->nr_recs]; + error = xfs_inobt_get_rec(cur, irec, &has_more); + if (error) + goto out_cur; + if (!has_more) + break; + + /* No allocated inodes in this chunk; skip it. */ + if (irec->ir_freecount == irec->ir_count) { + error = xfs_btree_increment(cur, 0, &has_more); + goto next_loop; + } + + /* + * Start readahead for this inode chunk in anticipation of + * walking the inodes. + */ + xfs_bulkstat_ichunk_ra(mp, agno, irec); + + /* + * Add this inobt record to our cache, flush the cache if + * needed, and move on to the next record. + */ + error = xfs_iwalk_ag_increment(iwag, xfs_iwalk_ag_recs, agno, + &cur, &agi_bp, &has_more); +next_loop: + if (error) + goto out_cur; + cond_resched(); + } + + /* Walk any records left behind in the cache. */ + if (iwag->nr_recs) { + xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); + return xfs_iwalk_ag_recs(iwag); + } + +out_cur: + xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); + return error; +} + +/* + * Given the number of inodes to prefetch, set the number of inobt records that + * we cache in memory, which controls the number of inodes we try to read + * ahead. + * + * If no max prefetch was given, default to one page's worth of inobt records; + * this should be plenty of inodes to read ahead. + */ +static inline void +xfs_iwalk_set_prefetch( + struct xfs_iwalk_ag *iwag, + unsigned int max_prefetch) +{ + if (max_prefetch) + iwag->sz_recs = round_up(max_prefetch, XFS_INODES_PER_CHUNK) / + XFS_INODES_PER_CHUNK; + else + iwag->sz_recs = PAGE_SIZE / sizeof(struct xfs_inobt_rec_incore); + + /* + * Allocate enough space to prefetch at least two records so that we + * can cache both the inobt record where the iwalk started and the next + * record. This simplifies the AG inode walk loop setup code. + */ + if (iwag->sz_recs < 2) + iwag->sz_recs = 2; +} + +/* + * Walk all inodes in the filesystem starting from @startino. The @iwalk_fn + * will be called for each allocated inode, being passed the inode's number and + * @data. @max_prefetch controls how many inobt records' worth of inodes we + * try to readahead. 
+ */ +int +xfs_iwalk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_iwalk_ag iwag = { + .mp = mp, + .tp = tp, + .iwalk_fn = iwalk_fn, + .data = data, + .startino = startino, + }; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + xfs_iwalk_set_prefetch(&iwag, max_prefetch); + error = xfs_iwalk_allocbuf(&iwag); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + error = xfs_iwalk_ag(&iwag); + if (error) + break; + iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + xfs_iwalk_freebuf(&iwag); + return error; +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h new file mode 100644 index 000000000000..45b1baabcd2d --- /dev/null +++ b/fs/xfs/xfs_iwalk.h @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_IWALK_H__ +#define __XFS_IWALK_H__ + +/* Walk all inodes in the filesystem starting from @startino. */ +typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_ino_t ino, void *data); +/* Return value (for xfs_iwalk_fn) that aborts the walk immediately. */ +#define XFS_IWALK_ABORT (1) + +int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); + +#endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 2464ea351f83..a2881659f776 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3516,6 +3516,46 @@ DEFINE_EVENT(xfs_inode_corrupt_class, name, \ DEFINE_INODE_CORRUPT_EVENT(xfs_inode_mark_sick); DEFINE_INODE_CORRUPT_EVENT(xfs_inode_mark_healthy); +TRACE_EVENT(xfs_iwalk_ag, + TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, + xfs_agino_t startino), + TP_ARGS(mp, agno, startino), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_agnumber_t, agno) + __field(xfs_agino_t, startino) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->agno = agno; + __entry->startino = startino; + ), + TP_printk("dev %d:%d agno %d startino %u", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->agno, + __entry->startino) +) + +TRACE_EVENT(xfs_iwalk_ag_rec, + TP_PROTO(struct xfs_mount *mp, xfs_agnumber_t agno, + xfs_agino_t startino, uint64_t freemask), + TP_ARGS(mp, agno, startino, freemask), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_agnumber_t, agno) + __field(xfs_agino_t, startino) + __field(uint64_t, freemask) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->agno = agno; + __entry->startino = startino; + __entry->freemask = freemask; + ), + TP_printk("dev %d:%d agno %d startino %u freemask 0x%llx", + MAJOR(__entry->dev), MINOR(__entry->dev), __entry->agno, + __entry->startino, __entry->freemask) +) + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH From patchwork Wed May 29 22:26:34 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
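Wong"

Before moving on to the callers, a usage sketch of the new API may help. The
function and payload names below are hypothetical; only xfs_iwalk(), the
xfs_iwalk_fn signature, and XFS_IWALK_ABORT are defined by the patch above:

/*
 * Hypothetical walk function: count allocated inodes and stop early once an
 * arbitrary limit is reached. Returning 0 continues the walk; returning
 * XFS_IWALK_ABORT (or a negative errno) stops it.
 */
STATIC int
example_count_inodes(
	struct xfs_mount	*mp,
	struct xfs_trans	*tp,
	xfs_ino_t		ino,
	void			*data)
{
	uint64_t		*count = data;

	(*count)++;
	if (*count >= 1000000)
		return XFS_IWALK_ABORT;
	return 0;
}

/* Walk every allocated inode, starting from the first one. */
STATIC int
example_walk(
	struct xfs_mount	*mp)
{
	uint64_t		count = 0;

	/* startino 0, default prefetch; &count is passed to the callback. */
	return xfs_iwalk(mp, NULL, 0, example_count_inodes, 0, &count);
}

Note that xfs_iwalk() hands XFS_IWALK_ABORT back to its caller, so a real
caller would filter that value out before treating the result as an error, as
the bulkstat conversion later in this series does.
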
Wong" X-Patchwork-Id: 10967847 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 786381398 for ; Wed, 29 May 2019 22:26:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6803228A37 for ; Wed, 29 May 2019 22:26:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5C2FB28A3C; Wed, 29 May 2019 22:26:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E2BD528A37 for ; Wed, 29 May 2019 22:26:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726706AbfE2W0h (ORCPT ); Wed, 29 May 2019 18:26:37 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:42254 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726535AbfE2W0h (ORCPT ); Wed, 29 May 2019 18:26:37 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM3jQD022704 for ; Wed, 29 May 2019 22:26:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=c9qNndP8IJzpiACrJHRed5Gk9Hetpv+3t+1Xs/wR1b8=; b=sX/M+ykmm1lth1u4a0epjGxW0NOUvwa+wWy111X61uRSnIwwaOEhn7oOfKYiug2paF4Y MKJHzWRjIlwDM53pqwKUzo0zjlpCBVQ/49c/nfhoib4EXO2zf2FTNnsy0gUqZ/ETuMbC RQS0Mn/mBARwJYVEZ3UTWeyrk9fSy0Sl5JqHrbxs1JMbWRX37Nq6qx5BJ3JwZFol3liP jpF4AHEnC4MZuZFjSb8yHtOFQ9AxtpOHcrbh6ZXTaQTftuM8EOihsoeVlm0R4FtXtkfI xaWCLqUjJXtucEj5wB8rYbA+AjCuKq5izhyYbP7mnGggpbU21PWVj9t/CLGRJ7l5tZbP ow== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2130.oracle.com with ESMTP id 2spw4tmtan-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:36 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMPSrH096507 for ; Wed, 29 May 2019 22:26:36 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3030.oracle.com with ESMTP id 2ss1fnqqqd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:35 +0000 Received: from abhmp0007.oracle.com (abhmp0007.oracle.com [141.146.116.13]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x4TMQZhB022678 for ; Wed, 29 May 2019 22:26:35 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:26:34 -0700 Subject: [PATCH 03/11] xfs: convert quotacheck to use the new iwalk functions From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:26:34 -0700 Message-ID: <155916879404.757870.10976871618930441712.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Convert quotacheck to use the new iwalk iterator to dig through the inodes. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner --- fs/xfs/xfs_qm.c | 62 ++++++++++++++++++------------------------------------- 1 file changed, 20 insertions(+), 42 deletions(-) diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index aa6b6db3db0e..a5b2260406a8 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -26,6 +26,7 @@ #include "xfs_trace.h" #include "xfs_icache.h" #include "xfs_cksum.h" +#include "xfs_iwalk.h" /* * The global quota manager. There is only one of these for the entire @@ -1118,17 +1119,15 @@ xfs_qm_quotacheck_dqadjust( /* ARGSUSED */ STATIC int xfs_qm_dqusage_adjust( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* not used */ - int ubsize, /* not used */ - int *ubused, /* not used */ - int *res) /* result code value */ + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { - xfs_inode_t *ip; - xfs_qcnt_t nblks; - xfs_filblks_t rtblks = 0; /* total rt blks */ - int error; + struct xfs_inode *ip; + xfs_qcnt_t nblks; + xfs_filblks_t rtblks = 0; /* total rt blks */ + int error; ASSERT(XFS_IS_QUOTA_RUNNING(mp)); @@ -1136,20 +1135,18 @@ xfs_qm_dqusage_adjust( * rootino must have its resources accounted for, not so with the quota * inodes. */ - if (xfs_is_quota_inode(&mp->m_sb, ino)) { - *res = BULKSTAT_RV_NOTHING; - return -EINVAL; - } + if (xfs_is_quota_inode(&mp->m_sb, ino)) + return 0; /* * We don't _need_ to take the ilock EXCL here because quotacheck runs * at mount time and therefore nobody will be racing chown/chproj. 
*/ - error = xfs_iget(mp, NULL, ino, XFS_IGET_DONTCACHE, 0, &ip); - if (error) { - *res = BULKSTAT_RV_NOTHING; + error = xfs_iget(mp, tp, ino, XFS_IGET_DONTCACHE, 0, &ip); + if (error == -EINVAL || error == -ENOENT) + return 0; + if (error) return error; - } ASSERT(ip->i_delayed_blks == 0); @@ -1157,7 +1154,7 @@ xfs_qm_dqusage_adjust( struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK); if (!(ifp->if_flags & XFS_IFEXTENTS)) { - error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK); + error = xfs_iread_extents(tp, ip, XFS_DATA_FORK); if (error) goto error0; } @@ -1200,13 +1197,8 @@ xfs_qm_dqusage_adjust( goto error0; } - xfs_irele(ip); - *res = BULKSTAT_RV_DIDONE; - return 0; - error0: xfs_irele(ip); - *res = BULKSTAT_RV_GIVEUP; return error; } @@ -1270,18 +1262,13 @@ STATIC int xfs_qm_quotacheck( xfs_mount_t *mp) { - int done, count, error, error2; - xfs_ino_t lastino; - size_t structsz; + int error, error2; uint flags; LIST_HEAD (buffer_list); struct xfs_inode *uip = mp->m_quotainfo->qi_uquotaip; struct xfs_inode *gip = mp->m_quotainfo->qi_gquotaip; struct xfs_inode *pip = mp->m_quotainfo->qi_pquotaip; - count = INT_MAX; - structsz = 1; - lastino = 0; flags = 0; ASSERT(uip || gip || pip); @@ -1318,18 +1305,9 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - do { - /* - * Iterate thru all the inodes in the file system, - * adjusting the corresponding dquot counters in core. - */ - error = xfs_bulkstat(mp, &lastino, &count, - xfs_qm_dqusage_adjust, - structsz, NULL, &done); - if (error) - break; - - } while (!done); + error = xfs_iwalk(mp, NULL, 0, xfs_qm_dqusage_adjust, 0, NULL); + if (error) + goto error_return; /* * We've made all the changes that we need to make incore. Flush them From patchwork Wed May 29 22:26:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 10967849 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D77171398 for ; Wed, 29 May 2019 22:26:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C778428A35 for ; Wed, 29 May 2019 22:26:44 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id B947B28A38; Wed, 29 May 2019 22:26:44 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 663C628A35 for ; Wed, 29 May 2019 22:26:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726535AbfE2W0o (ORCPT ); Wed, 29 May 2019 18:26:44 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:55162 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726532AbfE2W0n (ORCPT ); Wed, 29 May 2019 18:26:43 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM4CG0030647 for ; Wed, 29 May 2019 22:26:42 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=GWYDnFE03RVp43hatnjw4oNZGgYS4y4yhfqhC20MhCI=; b=ZWJs1LouK4WRSNi89wj7LgCoDMiWRy9oZsnIFBVfkAL9rGMIDgaOzU1c0limtvKxuLsE kN3+Q+y4kqNOBXdzUWk4xkhtl3vo3iIV3XZiPciamDfKs4FteDc+Bcgdasw3xjrblZcG 4HfJGQ0/tYCZY26zwLuN9Mc9thspXW00PzeSdM6YsMg7isUW71gNz5ftGFQiFYmw57q8 W6u8aWaSp2J7EhhmqLX5mFugG1G7Swa0NUjBEMDonKzCJ43aSszqE5NA5xWMJLBJpjtq BqNO9VuxNGZ6uxiI92j1hM90CVIuEPwaGSe1OamhGOzKQAtNotk3dkIpl9EKeIpb0cOJ fQ== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by userp2120.oracle.com with ESMTP id 2spxbqcp4e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:42 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMQ1qC164418 for ; Wed, 29 May 2019 22:26:42 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3020.oracle.com with ESMTP id 2sr31vh8pp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:41 +0000 Received: from abhmp0019.oracle.com (abhmp0019.oracle.com [141.146.116.25]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x4TMQfxT022707 for ; Wed, 29 May 2019 22:26:41 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:26:40 -0700 Subject: [PATCH 04/11] xfs: bulkstat should copy lastip whenever userspace supplies one From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:26:40 -0700 Message-ID: <155916880004.757870.14054258865473950566.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=916 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=1 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=949 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong When userspace passes in a @lastip pointer we should copy the results back, even if the @ocount pointer is NULL. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_ioctl.c | 13 ++++++------- fs/xfs/xfs_ioctl32.c | 13 ++++++------- 2 files changed, 12 insertions(+), 14 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index d7dfc13f30f5..5ffbdcff3dba 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -768,14 +768,13 @@ xfs_ioc_bulkstat( if (error) return error; - if (bulkreq.ocount != NULL) { - if (copy_to_user(bulkreq.lastip, &inlast, - sizeof(xfs_ino_t))) - return -EFAULT; + if (bulkreq.lastip != NULL && + copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + return -EFAULT; - if (copy_to_user(bulkreq.ocount, &count, sizeof(count))) - return -EFAULT; - } + if (bulkreq.ocount != NULL && + copy_to_user(bulkreq.ocount, &count, sizeof(count))) + return -EFAULT; return 0; } diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index 614fc6886d24..814ffe6fbab7 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -310,14 +310,13 @@ xfs_compat_ioc_bulkstat( if (error) return error; - if (bulkreq.ocount != NULL) { - if (copy_to_user(bulkreq.lastip, &inlast, - sizeof(xfs_ino_t))) - return -EFAULT; + if (bulkreq.lastip != NULL && + copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + return -EFAULT; - if (copy_to_user(bulkreq.ocount, &count, sizeof(count))) - return -EFAULT; - } + if (bulkreq.ocount != NULL && + copy_to_user(bulkreq.ocount, &count, sizeof(count))) + return -EFAULT; return 0; } From patchwork Wed May 29 22:26:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 10967851 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D4321398 for ; Wed, 29 May 2019 22:26:56 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5904728A35 for ; Wed, 29 May 2019 22:26:56 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4C3DF28A38; Wed, 29 May 2019 22:26:56 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AD6B728A35 for ; Wed, 29 May 2019 22:26:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726546AbfE2W0y (ORCPT ); Wed, 29 May 2019 18:26:54 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:36186 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726532AbfE2W0y (ORCPT ); Wed, 29 May 2019 18:26:54 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM4IsL042027 for ; Wed, 29 May 2019 22:26:49 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=rsHCtCjg4xqHf+cgUNwxUcTldRrsL4r/qvTgP6PoVCA=; b=qBvqKyLUl4Feu5cOG5B9u8q3Gspe/3zCYZLya+gfhVrouxdGeoLVflc67fS1hfSLvEDt XM9pg2U9pgnDwoXPLIVAFiQRwCRboXrosKKjBn2z3FY+3mV/cGlJFXTSWMeZcMSnOXoJ Htv6aczUnEyezIRHQSJOlnWf0Ua9MHa+t07blWRlLiqfDJL3jlIArFSWnU42ehnPDYcx ckqydA3XvVYCrMJNCppkjTxYBBI1BnvFPU5raPXP3BH7osTZoMYp34aXNXFtrmbiMsmk slmJRIuMOyhrTCPigqNz7cap+vDH2uw6JqF40A7U8RjSQcoxvBRcpvEbkoFlcpZq1s5y TA== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2130.oracle.com with ESMTP id 2spu7dn14n-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:48 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMQOmH172756 for ; Wed, 29 May 2019 22:26:48 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2sqh73yktv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:48 +0000 Received: from abhmp0001.oracle.com (abhmp0001.oracle.com [141.146.116.7]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x4TMQl5L003499 for ; Wed, 29 May 2019 22:26:47 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:26:46 -0700 Subject: [PATCH 05/11] xfs: convert bulkstat to new iwalk infrastructure From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:26:46 -0700 Message-ID: <155916880601.757870.13087821258923949678.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a new ibulk structure incore to help us deal with bulk inode stat state tracking and then convert the bulkstat code to use the new iwalk iterator. This disentangles inode walking from bulk stat control for simpler code and enables us to isolate the formatter functions to the ioctl handling code. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_ioctl.c | 65 ++++++-- fs/xfs/xfs_ioctl.h | 5 + fs/xfs/xfs_ioctl32.c | 88 +++++------ fs/xfs/xfs_itable.c | 407 ++++++++++++++------------------------------------ fs/xfs/xfs_itable.h | 79 ++++------ 5 files changed, 245 insertions(+), 399 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 5ffbdcff3dba..43734901aeb9 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -721,16 +721,28 @@ xfs_ioc_space( return error; } +/* Return 0 on success or positive error */ +int +xfs_bulkstat_one_fmt( + struct xfs_ibulk *breq, + const struct xfs_bstat *bstat) +{ + if (copy_to_user(breq->ubuffer, bstat, sizeof(*bstat))) + return -EFAULT; + return xfs_ibulk_advance(breq, sizeof(struct xfs_bstat)); +} + STATIC int xfs_ioc_bulkstat( xfs_mount_t *mp, unsigned int cmd, void __user *arg) { - xfs_fsop_bulkreq_t bulkreq; - int count; /* # of records returned */ - xfs_ino_t inlast; /* last inode number */ - int done; + struct xfs_fsop_bulkreq bulkreq; + struct xfs_ibulk breq = { + .mp = mp, + }; + xfs_ino_t lastino; int error; /* done = 1 if there are more stats to get and if bulkstat */ @@ -745,35 +757,54 @@ xfs_ioc_bulkstat( if (copy_from_user(&bulkreq, arg, sizeof(xfs_fsop_bulkreq_t))) return -EFAULT; - if (copy_from_user(&inlast, bulkreq.lastip, sizeof(__s64))) + if (copy_from_user(&lastino, bulkreq.lastip, sizeof(__s64))) return -EFAULT; - if ((count = bulkreq.icount) <= 0) + if (bulkreq.icount <= 0) return -EINVAL; if (bulkreq.ubuffer == NULL) return -EINVAL; - if (cmd == XFS_IOC_FSINUMBERS) - error = xfs_inumbers(mp, &inlast, &count, + breq.ubuffer = bulkreq.ubuffer; + breq.icount = bulkreq.icount; + + /* + * FSBULKSTAT_SINGLE expects that *lastip contains the inode number + * that we want to stat. 
However, FSINUMBERS and FSBULKSTAT expect + * that *lastip contains either zero or the number of the last inode to + * be examined by the previous call and return results starting with + * the next inode after that. The new bulk request functions take the + * inode to start with, so we have to adjust the lastino/startino + * parameter to maintain correct function. + */ + if (cmd == XFS_IOC_FSINUMBERS) { + int count = breq.icount; + + breq.startino = lastino; + error = xfs_inumbers(mp, &breq.startino, &count, bulkreq.ubuffer, xfs_inumbers_fmt); - else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) - error = xfs_bulkstat_one(mp, inlast, bulkreq.ubuffer, - sizeof(xfs_bstat_t), NULL, &done); - else /* XFS_IOC_FSBULKSTAT */ - error = xfs_bulkstat(mp, &inlast, &count, xfs_bulkstat_one, - sizeof(xfs_bstat_t), bulkreq.ubuffer, - &done); + breq.ocount = count; + lastino = breq.startino; + } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) { + breq.startino = lastino; + error = xfs_bulkstat_one(&breq, xfs_bulkstat_one_fmt); + lastino = breq.startino; + } else { /* XFS_IOC_FSBULKSTAT */ + breq.startino = lastino + 1; + error = xfs_bulkstat(&breq, xfs_bulkstat_one_fmt); + lastino = breq.startino - 1; + } if (error) return error; if (bulkreq.lastip != NULL && - copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + copy_to_user(bulkreq.lastip, &lastino, sizeof(xfs_ino_t))) return -EFAULT; if (bulkreq.ocount != NULL && - copy_to_user(bulkreq.ocount, &count, sizeof(count))) + copy_to_user(bulkreq.ocount, &breq.ocount, sizeof(__s32))) return -EFAULT; return 0; diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h index 4b17f67c888a..f32c8aadfeba 100644 --- a/fs/xfs/xfs_ioctl.h +++ b/fs/xfs/xfs_ioctl.h @@ -77,4 +77,9 @@ xfs_set_dmattrs( uint evmask, uint16_t state); +struct xfs_ibulk; +struct xfs_bstat; + +int xfs_bulkstat_one_fmt(struct xfs_ibulk *breq, const struct xfs_bstat *bstat); + #endif diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index 814ffe6fbab7..add15819daf3 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -172,15 +172,10 @@ xfs_bstime_store_compat( /* Return 0 on success or positive error (to xfs_bulkstat()) */ STATIC int xfs_bulkstat_one_fmt_compat( - void __user *ubuffer, - int ubsize, - int *ubused, - const xfs_bstat_t *buffer) + struct xfs_ibulk *breq, + const struct xfs_bstat *buffer) { - compat_xfs_bstat_t __user *p32 = ubuffer; - - if (ubsize < sizeof(*p32)) - return -ENOMEM; + struct compat_xfs_bstat __user *p32 = breq->ubuffer; if (put_user(buffer->bs_ino, &p32->bs_ino) || put_user(buffer->bs_mode, &p32->bs_mode) || @@ -205,23 +200,8 @@ xfs_bulkstat_one_fmt_compat( put_user(buffer->bs_dmstate, &p32->bs_dmstate) || put_user(buffer->bs_aextents, &p32->bs_aextents)) return -EFAULT; - if (ubused) - *ubused = sizeof(*p32); - return 0; -} -STATIC int -xfs_bulkstat_one_compat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... 
*/ -{ - return xfs_bulkstat_one_int(mp, ino, buffer, ubsize, - xfs_bulkstat_one_fmt_compat, - ubused, stat); + return xfs_ibulk_advance(breq, sizeof(struct compat_xfs_bstat)); } /* copied from xfs_ioctl.c */ @@ -232,10 +212,11 @@ xfs_compat_ioc_bulkstat( compat_xfs_fsop_bulkreq_t __user *p32) { u32 addr; - xfs_fsop_bulkreq_t bulkreq; - int count; /* # of records returned */ - xfs_ino_t inlast; /* last inode number */ - int done; + struct xfs_fsop_bulkreq bulkreq; + struct xfs_ibulk breq = { + .mp = mp, + }; + xfs_ino_t lastino; int error; /* @@ -245,8 +226,7 @@ xfs_compat_ioc_bulkstat( * functions and structure size are the correct ones to use ... */ inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; - bulkstat_one_pf bs_one_func = xfs_bulkstat_one_compat; - size_t bs_one_size = sizeof(struct compat_xfs_bstat); + bulkstat_one_fmt_pf bs_one_func = xfs_bulkstat_one_fmt_compat; #ifdef CONFIG_X86_X32 if (in_x32_syscall()) { @@ -259,8 +239,7 @@ xfs_compat_ioc_bulkstat( * x32 userspace expects. */ inumbers_func = xfs_inumbers_fmt; - bs_one_func = xfs_bulkstat_one; - bs_one_size = sizeof(struct xfs_bstat); + bs_one_func = xfs_bulkstat_one_fmt; } #endif @@ -284,38 +263,57 @@ xfs_compat_ioc_bulkstat( return -EFAULT; bulkreq.ocount = compat_ptr(addr); - if (copy_from_user(&inlast, bulkreq.lastip, sizeof(__s64))) + if (copy_from_user(&lastino, bulkreq.lastip, sizeof(__s64))) return -EFAULT; + breq.startino = lastino + 1; - if ((count = bulkreq.icount) <= 0) + if (bulkreq.icount <= 0) return -EINVAL; if (bulkreq.ubuffer == NULL) return -EINVAL; + breq.ubuffer = bulkreq.ubuffer; + breq.icount = bulkreq.icount; + + /* + * FSBULKSTAT_SINGLE expects that *lastip contains the inode number + * that we want to stat. However, FSINUMBERS and FSBULKSTAT expect + * that *lastip contains either zero or the number of the last inode to + * be examined by the previous call and return results starting with + * the next inode after that. The new bulk request functions take the + * inode to start with, so we have to adjust the lastino/startino + * parameter to maintain correct function. + */ if (cmd == XFS_IOC_FSINUMBERS_32) { - error = xfs_inumbers(mp, &inlast, &count, + int count = breq.icount; + + breq.startino = lastino; + error = xfs_inumbers(mp, &breq.startino, &count, bulkreq.ubuffer, inumbers_func); + breq.ocount = count; + lastino = breq.startino; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE_32) { - int res; - - error = bs_one_func(mp, inlast, bulkreq.ubuffer, - bs_one_size, NULL, &res); + breq.startino = lastino; + error = xfs_bulkstat_one(&breq, bs_one_func); + lastino = breq.startino; } else if (cmd == XFS_IOC_FSBULKSTAT_32) { - error = xfs_bulkstat(mp, &inlast, &count, - bs_one_func, bs_one_size, - bulkreq.ubuffer, &done); - } else + breq.startino = lastino + 1; + error = xfs_bulkstat(&breq, bs_one_func); + lastino = breq.startino - 1; + } else { error = -EINVAL; + } if (error) return error; + lastino = breq.startino - 1; if (bulkreq.lastip != NULL && - copy_to_user(bulkreq.lastip, &inlast, sizeof(xfs_ino_t))) + copy_to_user(bulkreq.lastip, &lastino, sizeof(xfs_ino_t))) return -EFAULT; if (bulkreq.ocount != NULL && - copy_to_user(bulkreq.ocount, &count, sizeof(count))) + copy_to_user(bulkreq.ocount, &breq.ocount, sizeof(__s32))) return -EFAULT; return 0; diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 96590d9f917c..454bc992bf93 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -22,37 +22,63 @@ #include "xfs_iwalk.h" /* - * Return stat information for one inode. 
- * Return 0 if ok, else errno. + * Bulk Stat + * ========= + * + * Use the inode walking functions to fill out struct xfs_bstat for every + * allocated inode, then pass the stat information to some externally provided + * iteration function. */ -int + +struct xfs_bstat_chunk { + bulkstat_one_fmt_pf formatter; + struct xfs_ibulk *breq; +}; + +/* + * Fill out the bulkstat info for a single inode and report it somewhere. + * + * bc->breq->lastino is effectively the inode cursor as we walk through the + * filesystem. Therefore, we update it any time we need to move the cursor + * forward, regardless of whether or not we're sending any bstat information + * back to userspace. If the inode is internal metadata or, has been freed + * out from under us, we just simply keep going. + * + * However, if any other type of error happens we want to stop right where we + * are so that userspace will call back with exact number of the bad inode and + * we can send back an error code. + * + * Note that if the formatter tells us there's no space left in the buffer we + * move the cursor forward and abort the walk. + */ +STATIC int xfs_bulkstat_one_int( - struct xfs_mount *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - bulkstat_one_fmt_pf formatter, /* formatter, copy to user */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... */ + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { + struct xfs_bstat_chunk *bc = data; struct xfs_icdinode *dic; /* dinode core info pointer */ struct xfs_inode *ip; /* incore inode pointer */ struct inode *inode; struct xfs_bstat *buf; /* return buffer */ int error = 0; /* error value */ - *stat = BULKSTAT_RV_NOTHING; - - if (!buffer || xfs_internal_inum(mp, ino)) + if (xfs_internal_inum(mp, ino)) { + bc->breq->startino = ino + 1; return -EINVAL; + } buf = kmem_zalloc(sizeof(*buf), KM_SLEEP | KM_MAYFAIL); if (!buf) return -ENOMEM; - error = xfs_iget(mp, NULL, ino, + error = xfs_iget(mp, tp, ino, (XFS_IGET_DONTCACHE | XFS_IGET_UNTRUSTED), XFS_ILOCK_SHARED, &ip); + if (error == -ENOENT || error == -EINVAL) + bc->breq->startino = ino + 1; if (error) goto out_free; @@ -119,43 +145,45 @@ xfs_bulkstat_one_int( xfs_iunlock(ip, XFS_ILOCK_SHARED); xfs_irele(ip); - error = formatter(buffer, ubsize, ubused, buf); - if (!error) - *stat = BULKSTAT_RV_DIDONE; - - out_free: + error = bc->formatter(bc->breq, buf); + switch (error) { + case XFS_IBULK_BUFFER_FULL: + error = XFS_IWALK_ABORT; + /* fall through */ + case 0: + bc->breq->startino = ino + 1; + break; + } +out_free: kmem_free(buf); return error; } -/* Return 0 on success or positive error */ -STATIC int -xfs_bulkstat_one_fmt( - void __user *ubuffer, - int ubsize, - int *ubused, - const xfs_bstat_t *buffer) -{ - if (ubsize < sizeof(*buffer)) - return -ENOMEM; - if (copy_to_user(ubuffer, buffer, sizeof(*buffer))) - return -EFAULT; - if (ubused) - *ubused = sizeof(*buffer); - return 0; -} - +/* Bulkstat a single inode. */ int xfs_bulkstat_one( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t ino, /* inode number to get data for */ - void __user *buffer, /* buffer to place output in */ - int ubsize, /* size of buffer */ - int *ubused, /* bytes used by me */ - int *stat) /* BULKSTAT_RV_... 
*/ + struct xfs_ibulk *breq, + bulkstat_one_fmt_pf formatter) { - return xfs_bulkstat_one_int(mp, ino, buffer, ubsize, - xfs_bulkstat_one_fmt, ubused, stat); + struct xfs_bstat_chunk bc = { + .formatter = formatter, + .breq = breq, + }; + int error; + + breq->icount = 1; + breq->ocount = 0; + + error = xfs_bulkstat_one_int(breq->mp, NULL, breq->startino, &bc); + + /* + * If we reported one inode to userspace then we abort because we hit + * the end of the buffer. Don't leak that back to userspace. + */ + if (error == XFS_IWALK_ABORT) + error = 0; + + return error; } /* @@ -251,256 +279,65 @@ xfs_bulkstat_grab_ichunk( #define XFS_BULKSTAT_UBLEFT(ubleft) ((ubleft) >= statstruct_size) -struct xfs_bulkstat_agichunk { - char __user **ac_ubuffer;/* pointer into user's buffer */ - int ac_ubleft; /* bytes left in user's buffer */ - int ac_ubelem; /* spaces used in user's buffer */ -}; - -/* - * Process inodes in chunk with a pointer to a formatter function - * that will iget the inode and fill in the appropriate structure. - */ static int -xfs_bulkstat_ag_ichunk( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irbp, - bulkstat_one_pf formatter, - size_t statstruct_size, - struct xfs_bulkstat_agichunk *acp, - xfs_agino_t *last_agino) +xfs_bulkstat_iwalk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t ino, + void *data) { - char __user **ubufp = acp->ac_ubuffer; - int chunkidx; - int error = 0; - xfs_agino_t agino = irbp->ir_startino; - - for (chunkidx = 0; chunkidx < XFS_INODES_PER_CHUNK; - chunkidx++, agino++) { - int fmterror; - int ubused; - - /* inode won't fit in buffer, we are done */ - if (acp->ac_ubleft < statstruct_size) - break; - - /* Skip if this inode is free */ - if (XFS_INOBT_MASK(chunkidx) & irbp->ir_free) - continue; - - /* Get the inode and fill in a single buffer */ - ubused = statstruct_size; - error = formatter(mp, XFS_AGINO_TO_INO(mp, agno, agino), - *ubufp, acp->ac_ubleft, &ubused, &fmterror); - - if (fmterror == BULKSTAT_RV_GIVEUP || - (error && error != -ENOENT && error != -EINVAL)) { - acp->ac_ubleft = 0; - ASSERT(error); - break; - } - - /* be careful not to leak error if at end of chunk */ - if (fmterror == BULKSTAT_RV_NOTHING || error) { - error = 0; - continue; - } - - *ubufp += ubused; - acp->ac_ubleft -= ubused; - acp->ac_ubelem++; - } - - /* - * Post-update *last_agino. At this point, agino will always point one - * inode past the last inode we processed successfully. Hence we - * substract that inode when setting the *last_agino cursor so that we - * return the correct cookie to userspace. On the next bulkstat call, - * the inode under the lastino cookie will be skipped as we have already - * processed it here. - */ - *last_agino = agino - 1; + int error; + error = xfs_bulkstat_one_int(mp, tp, ino, data); + /* bulkstat just skips over missing inodes */ + if (error == -ENOENT || error == -EINVAL) + return 0; return error; } /* - * Return stat information in bulk (by-inode) for the filesystem. + * Check the incoming lastino parameter. + * + * We allow any inode value that could map to physical space inside the + * filesystem because if there are no inodes there, bulkstat moves on to the + * next chunk. In other words, the magic agino value of zero takes us to the + * first chunk in the AG, and an agino value past the end of the AG takes us to + * the first chunk in the next AG. + * + * Therefore we can end early if the requested inode is beyond the end of the + * filesystem or doesn't map properly. 
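*/

To make the round-trip check concrete, the logic can be mirrored by a small
helper like the hypothetical one below (XFS_INO_TO_AGNO, XFS_INO_TO_AGINO, and
XFS_AGINO_TO_INO are the real conversion macros; the wrapper exists only for
illustration):

static inline bool
example_startino_is_mappable(
	struct xfs_mount	*mp,
	xfs_ino_t		startino)
{
	xfs_agnumber_t		agno = XFS_INO_TO_AGNO(mp, startino);
	xfs_agino_t		agino = XFS_INO_TO_AGINO(mp, startino);

	/*
	 * Split the inode number into (AG number, inode within AG) and
	 * recombine. If the AG is past the end of the filesystem, or
	 * recombining fails to reproduce @startino (stray high bits), the
	 * cursor cannot point into real inode space.
	 */
	return agno < mp->m_sb.sb_agcount &&
	       startino == XFS_AGINO_TO_INO(mp, agno, agino);
}

/* The patch resumes here; xfs_bulkstat_already_done() below is the inverse of the sketch above.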
*/ -int /* error status */ -xfs_bulkstat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *lastinop, /* last inode returned */ - int *ubcountp, /* size of buffer/count returned */ - bulkstat_one_pf formatter, /* func that'd fill a single buf */ - size_t statstruct_size, /* sizeof struct filling */ - char __user *ubuffer, /* buffer with inode stats */ - int *done) /* 1 if there are more stats to get */ +static inline bool +xfs_bulkstat_already_done( + struct xfs_mount *mp, + xfs_ino_t startino) { - xfs_buf_t *agbp; /* agi header buffer */ - xfs_agino_t agino; /* inode # in allocation group */ - xfs_agnumber_t agno; /* allocation group number */ - xfs_btree_cur_t *cur; /* btree cursor for ialloc btree */ - xfs_inobt_rec_incore_t *irbuf; /* start of irec buffer */ - int nirbuf; /* size of irbuf */ - int ubcount; /* size of user's buffer */ - struct xfs_bulkstat_agichunk ac; - int error = 0; - - /* - * Get the last inode value, see if there's nothing to do. - */ - agno = XFS_INO_TO_AGNO(mp, *lastinop); - agino = XFS_INO_TO_AGINO(mp, *lastinop); - if (agno >= mp->m_sb.sb_agcount || - *lastinop != XFS_AGINO_TO_INO(mp, agno, agino)) { - *done = 1; - *ubcountp = 0; - return 0; - } - - ubcount = *ubcountp; /* statstruct's */ - ac.ac_ubuffer = &ubuffer; - ac.ac_ubleft = ubcount * statstruct_size; /* bytes */; - ac.ac_ubelem = 0; - - *ubcountp = 0; - *done = 0; - - irbuf = kmem_zalloc_large(PAGE_SIZE * 4, KM_SLEEP); - if (!irbuf) - return -ENOMEM; - nirbuf = (PAGE_SIZE * 4) / sizeof(*irbuf); - - /* - * Loop over the allocation groups, starting from the last - * inode returned; 0 means start of the allocation group. - */ - while (agno < mp->m_sb.sb_agcount) { - struct xfs_inobt_rec_incore *irbp = irbuf; - struct xfs_inobt_rec_incore *irbufend = irbuf + nirbuf; - bool end_of_ag = false; - int icount = 0; - int stat; - - error = xfs_ialloc_read_agi(mp, NULL, agno, &agbp); - if (error) - break; - /* - * Allocate and initialize a btree cursor for ialloc btree. - */ - cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno, - XFS_BTNUM_INO); - if (agino > 0) { - /* - * In the middle of an allocation group, we need to get - * the remainder of the chunk we're in. - */ - struct xfs_inobt_rec_incore r; - - error = xfs_bulkstat_grab_ichunk(cur, agino, &icount, &r); - if (error) - goto del_cursor; - if (icount) { - irbp->ir_startino = r.ir_startino; - irbp->ir_holemask = r.ir_holemask; - irbp->ir_count = r.ir_count; - irbp->ir_freecount = r.ir_freecount; - irbp->ir_free = r.ir_free; - irbp++; - } - /* Increment to the next record */ - error = xfs_btree_increment(cur, 0, &stat); - } else { - /* Start of ag. Lookup the first inode chunk */ - error = xfs_inobt_lookup(cur, 0, XFS_LOOKUP_GE, &stat); - } - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + xfs_agino_t agino = XFS_INO_TO_AGINO(mp, startino); - /* - * Loop through inode btree records in this ag, - * until we run out of inodes or space in the buffer. - */ - while (irbp < irbufend && icount < ubcount) { - struct xfs_inobt_rec_incore r; - - error = xfs_inobt_get_rec(cur, &r, &stat); - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } - - /* - * If this chunk has any allocated inodes, save it. - * Also start read-ahead now for this chunk. 
- */ - if (r.ir_freecount < r.ir_count) { - xfs_bulkstat_ichunk_ra(mp, agno, &r); - irbp->ir_startino = r.ir_startino; - irbp->ir_holemask = r.ir_holemask; - irbp->ir_count = r.ir_count; - irbp->ir_freecount = r.ir_freecount; - irbp->ir_free = r.ir_free; - irbp++; - icount += r.ir_count - r.ir_freecount; - } - error = xfs_btree_increment(cur, 0, &stat); - if (error || stat == 0) { - end_of_ag = true; - goto del_cursor; - } - cond_resched(); - } + return agno >= mp->m_sb.sb_agcount || + startino != XFS_AGINO_TO_INO(mp, agno, agino); +} - /* - * Drop the btree buffers and the agi buffer as we can't hold any - * of the locks these represent when calling iget. If there is a - * pending error, then we are done. - */ -del_cursor: - xfs_btree_del_cursor(cur, error); - xfs_buf_relse(agbp); - if (error) - break; - /* - * Now format all the good inodes into the user's buffer. The - * call to xfs_bulkstat_ag_ichunk() sets up the agino pointer - * for the next loop iteration. - */ - irbufend = irbp; - for (irbp = irbuf; - irbp < irbufend && ac.ac_ubleft >= statstruct_size; - irbp++) { - error = xfs_bulkstat_ag_ichunk(mp, agno, irbp, - formatter, statstruct_size, &ac, - &agino); - if (error) - break; +/* Return stat information in bulk (by-inode) for the filesystem. */ +int +xfs_bulkstat( + struct xfs_ibulk *breq, + bulkstat_one_fmt_pf formatter) +{ + struct xfs_bstat_chunk bc = { + .formatter = formatter, + .breq = breq, + }; + int error; - cond_resched(); - } + breq->ocount = 0; - /* - * If we've run out of space or had a formatting error, we - * are now done - */ - if (ac.ac_ubleft < statstruct_size || error) - break; + if (xfs_bulkstat_already_done(breq->mp, breq->startino)) + return 0; - if (end_of_ag) { - agno++; - agino = 0; - } - } - /* - * Done, we're either out of filesystem or space to put the data. - */ - kmem_free(irbuf); - *ubcountp = ac.ac_ubelem; + error = xfs_iwalk(breq->mp, NULL, breq->startino, xfs_bulkstat_iwalk, + breq->icount, &bc); /* * We found some inodes, so clear the error status and return them. @@ -509,17 +346,9 @@ xfs_bulkstat( * triggered again and propagated to userspace as there will be no * formatted inodes in the buffer. */ - if (ac.ac_ubelem) + if (breq->ocount > 0) error = 0; - /* - * If we ran out of filesystem, lastino will point off the end of - * the filesystem so the next call will return immediately. - */ - *lastinop = XFS_AGINO_TO_INO(mp, agno, agino); - if (agno >= mp->m_sb.sb_agcount) - *done = 1; - return error; } diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 369e3f159d4e..366d391eb11f 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -5,63 +5,46 @@ #ifndef __XFS_ITABLE_H__ #define __XFS_ITABLE_H__ -/* - * xfs_bulkstat() is used to fill in xfs_bstat structures as well as dm_stat - * structures (by the dmi library). This is a pointer to a formatter function - * that will iget the inode and fill in the appropriate structure. - * see xfs_bulkstat_one() and xfs_dm_bulkstat_one() in dmapi_xfs.c - */ -typedef int (*bulkstat_one_pf)(struct xfs_mount *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - int *ubused, - int *stat); +/* In-memory representation of a userspace request for batch inode data. */ +struct xfs_ibulk { + struct xfs_mount *mp; + void __user *ubuffer; /* user output buffer */ + xfs_ino_t startino; /* start with this inode */ + unsigned int icount; /* number of elements in ubuffer */ + unsigned int ocount; /* number of records returned */ +}; + +/* Return value that means we want to abort the walk. 
*/ +#define XFS_IBULK_ABORT (XFS_IWALK_ABORT) + +/* Return value that means the formatting buffer is now full. */ +#define XFS_IBULK_BUFFER_FULL (2) /* - * Values for stat return value. + * Advance the user buffer pointer by one record of the given size. If the + * buffer is now full, return the appropriate error code. */ -#define BULKSTAT_RV_NOTHING 0 -#define BULKSTAT_RV_DIDONE 1 -#define BULKSTAT_RV_GIVEUP 2 +static inline int +xfs_ibulk_advance( + struct xfs_ibulk *breq, + size_t bytes) +{ + char __user *b = breq->ubuffer; + + breq->ubuffer = b + bytes; + breq->ocount++; + return breq->ocount == breq->icount ? XFS_IBULK_BUFFER_FULL : 0; +} /* * Return stat information in bulk (by-inode) for the filesystem. */ -int /* error status */ -xfs_bulkstat( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *lastino, /* last inode returned */ - int *count, /* size of buffer/count returned */ - bulkstat_one_pf formatter, /* func that'd fill a single buf */ - size_t statstruct_size,/* sizeof struct that we're filling */ - char __user *ubuffer,/* buffer with inode stats */ - int *done); /* 1 if there are more stats to get */ -typedef int (*bulkstat_one_fmt_pf)( /* used size in bytes or negative error */ - void __user *ubuffer, /* buffer to write to */ - int ubsize, /* remaining user buffer sz */ - int *ubused, /* bytes used by formatter */ - const xfs_bstat_t *buffer); /* buffer to read from */ +typedef int (*bulkstat_one_fmt_pf)(struct xfs_ibulk *breq, + const struct xfs_bstat *bstat); -int -xfs_bulkstat_one_int( - xfs_mount_t *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - bulkstat_one_fmt_pf formatter, - int *ubused, - int *stat); - -int -xfs_bulkstat_one( - xfs_mount_t *mp, - xfs_ino_t ino, - void __user *buffer, - int ubsize, - int *ubused, - int *stat); +int xfs_bulkstat_one(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); +int xfs_bulkstat(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); typedef int (*inumbers_fmt_pf)( void __user *ubuffer, /* buffer to write to */ From patchwork Wed May 29 22:26:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
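Wong"

As an aside on the formatter contract just introduced: a formatter copies one
record to userspace and then calls xfs_ibulk_advance(), which bumps the cursor
and reports XFS_IBULK_BUFFER_FULL once @ocount reaches @icount. A hypothetical
formatter that reported bare inode numbers instead of full bstat records would
look like this (only struct xfs_ibulk, bulkstat_one_fmt_pf, and
xfs_ibulk_advance() come from the patch above):

STATIC int
example_inum_only_fmt(
	struct xfs_ibulk	*breq,
	const struct xfs_bstat	*bstat)
{
	if (copy_to_user(breq->ubuffer, &bstat->bs_ino,
			sizeof(bstat->bs_ino)))
		return -EFAULT;
	return xfs_ibulk_advance(breq, sizeof(bstat->bs_ino));
}

Passing it as xfs_bulkstat(breq, example_inum_only_fmt) would then fill the
user buffer with one __u64 per allocated inode.
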
Wong" X-Patchwork-Id: 10967853 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9F4A51398 for ; Wed, 29 May 2019 22:26:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8DDFE28A35 for ; Wed, 29 May 2019 22:26:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8232F28A38; Wed, 29 May 2019 22:26:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D642E28A35 for ; Wed, 29 May 2019 22:26:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726708AbfE2W05 (ORCPT ); Wed, 29 May 2019 18:26:57 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:36306 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726668AbfE2W05 (ORCPT ); Wed, 29 May 2019 18:26:57 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM41tb041565 for ; Wed, 29 May 2019 22:26:55 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=12HWU1rjGxZuy7mDK0ePpdp7OpvPjxjlmJtLz4h6HlI=; b=qZBCSePx6J+M+HV7+UxWjT+XJC0oVJ5oOgfuAZQeZpCCRISZpEMRje1r5lJkhyZknRQH +SHx+JxssHafKLV6xY52qcXfGdywo8O+RvirIYh56GMlZz/vTzJK10CnjjTyToRC3bnl sjV3Zq6hp71qSeOW62/n2voptqbxVOs6trCeuDTvSgEr4ufexaLVkU5UjIjslirUeRkC 1Sn3Y0vqBmm8EdrBExdm+u2acS+HmwYZcofNwzmYxgeXsuIqTplaMYPA1dl2/NIVi6RB C9XwjbJmS3XNpKIxyDLvuk3RmGeUQcRYuHiaDiihMZSPOi2XxakzRqX0x6s3J7t20q6A jA== Received: from userp3020.oracle.com (userp3020.oracle.com [156.151.31.79]) by aserp2130.oracle.com with ESMTP id 2spu7dn15a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:55 +0000 Received: from pps.filterd (userp3020.oracle.com [127.0.0.1]) by userp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMQ2J2164480 for ; Wed, 29 May 2019 22:26:54 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by userp3020.oracle.com with ESMTP id 2sr31vh8s8-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:26:54 +0000 Received: from abhmp0020.oracle.com (abhmp0020.oracle.com [141.146.116.26]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x4TMQrk0021281 for ; Wed, 29 May 2019 22:26:53 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:26:53 -0700 Subject: [PATCH 06/11] xfs: move bulkstat ichunk helpers to iwalk code From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:26:52 -0700 Message-ID: <155916881228.757870.16878993812159363561.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Now that we've reworked the bulkstat code to use iwalk, we can move the old bulkstat ichunk helpers to xfs_iwalk.c. No functional changes here. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_itable.c | 93 -------------------------------------------------- fs/xfs/xfs_itable.h | 8 ---- fs/xfs/xfs_iwalk.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 93 insertions(+), 103 deletions(-) diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 454bc992bf93..06abe5c9c0ee 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -186,99 +186,6 @@ xfs_bulkstat_one( return error; } -/* - * Loop over all clusters in a chunk for a given incore inode allocation btree - * record. Do a readahead if there are any allocated inodes in that cluster. - */ -void -xfs_bulkstat_ichunk_ra( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irec) -{ - struct xfs_ino_geometry *igeo = &mp->m_ino_geo; - xfs_agblock_t agbno; - struct blk_plug plug; - int i; /* inode chunk index */ - - agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); - - blk_start_plug(&plug); - for (i = 0; - i < XFS_INODES_PER_CHUNK; - i += igeo->ig_inodes_per_cluster, - agbno += igeo->ig_blocks_per_cluster) { - if (xfs_inobt_maskn(i, igeo->ig_inodes_per_cluster) & - ~irec->ir_free) { - xfs_btree_reada_bufs(mp, agno, agbno, - igeo->ig_blocks_per_cluster, - &xfs_inode_buf_ops); - } - } - blk_finish_plug(&plug); -} - -/* - * Lookup the inode chunk that the given inode lives in and then get the record - * if we found the chunk. If the inode was not the last in the chunk and there - * are some left allocated, update the data for the pointed-to record as well as - * return the count of grabbed inodes. 
- */ -int -xfs_bulkstat_grab_ichunk( - struct xfs_btree_cur *cur, /* btree cursor */ - xfs_agino_t agino, /* starting inode of chunk */ - int *icount,/* return # of inodes grabbed */ - struct xfs_inobt_rec_incore *irec) /* btree record */ -{ - int idx; /* index into inode chunk */ - int stat; - int error = 0; - - /* Lookup the inode chunk that this inode lives in */ - error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &stat); - if (error) - return error; - if (!stat) { - *icount = 0; - return error; - } - - /* Get the record, should always work */ - error = xfs_inobt_get_rec(cur, irec, &stat); - if (error) - return error; - XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1); - - /* Check if the record contains the inode in request */ - if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino) { - *icount = 0; - return 0; - } - - idx = agino - irec->ir_startino + 1; - if (idx < XFS_INODES_PER_CHUNK && - (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { - int i; - - /* We got a right chunk with some left inodes allocated at it. - * Grab the chunk record. Mark all the uninteresting inodes - * free -- because they're before our start point. - */ - for (i = 0; i < idx; i++) { - if (XFS_INOBT_MASK(i) & ~irec->ir_free) - irec->ir_freecount++; - } - - irec->ir_free |= xfs_inobt_maskn(0, idx); - *icount = irec->ir_count - irec->ir_freecount; - } - - return 0; -} - -#define XFS_BULKSTAT_UBLEFT(ubleft) ((ubleft) >= statstruct_size) - static int xfs_bulkstat_iwalk( struct xfs_mount *mp, diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index 366d391eb11f..a2562fe8d282 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -67,12 +67,4 @@ xfs_inumbers( void __user *buffer, /* buffer with inode info */ inumbers_fmt_pf formatter); -/* Temporarily needed while we refactor functions. */ -struct xfs_btree_cur; -struct xfs_inobt_rec_incore; -void xfs_bulkstat_ichunk_ra(struct xfs_mount *mp, xfs_agnumber_t agno, - struct xfs_inobt_rec_incore *irec); -int xfs_bulkstat_grab_ichunk(struct xfs_btree_cur *cur, xfs_agino_t agino, - int *icount, struct xfs_inobt_rec_incore *irec); - #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 0ce3baa159ba..77320c297284 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -54,6 +54,97 @@ struct xfs_iwalk_ag { void *data; }; +/* + * Loop over all clusters in a chunk for a given incore inode allocation btree + * record. Do a readahead if there are any allocated inodes in that cluster. + */ +STATIC void +xfs_iwalk_ichunk_ra( + struct xfs_mount *mp, + xfs_agnumber_t agno, + struct xfs_inobt_rec_incore *irec) +{ + struct xfs_ino_geometry *igeo = &mp->m_ino_geo; + xfs_agblock_t agbno; + struct blk_plug plug; + int i; /* inode chunk index */ + + agbno = XFS_AGINO_TO_AGBNO(mp, irec->ir_startino); + + blk_start_plug(&plug); + for (i = 0; + i < XFS_INODES_PER_CHUNK; + i += igeo->ig_inodes_per_cluster, + agbno += igeo->ig_blocks_per_cluster) { + if (xfs_inobt_maskn(i, igeo->ig_inodes_per_cluster) & + ~irec->ir_free) { + xfs_btree_reada_bufs(mp, agno, agbno, + igeo->ig_blocks_per_cluster, + &xfs_inode_buf_ops); + } + } + blk_finish_plug(&plug); +} + +/* + * Lookup the inode chunk that the given inode lives in and then get the record + * if we found the chunk. If the inode was not the last in the chunk and there + * are some left allocated, update the data for the pointed-to record as well as + * return the count of grabbed inodes. 
+ */ +STATIC int +xfs_iwalk_grab_ichunk( + struct xfs_btree_cur *cur, /* btree cursor */ + xfs_agino_t agino, /* starting inode of chunk */ + int *icount,/* return # of inodes grabbed */ + struct xfs_inobt_rec_incore *irec) /* btree record */ +{ + int idx; /* index into inode chunk */ + int stat; + int error = 0; + + /* Lookup the inode chunk that this inode lives in */ + error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &stat); + if (error) + return error; + if (!stat) { + *icount = 0; + return error; + } + + /* Get the record, should always work */ + error = xfs_inobt_get_rec(cur, irec, &stat); + if (error) + return error; + XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1); + + /* Check if the record contains the inode in request */ + if (irec->ir_startino + XFS_INODES_PER_CHUNK <= agino) { + *icount = 0; + return 0; + } + + idx = agino - irec->ir_startino + 1; + if (idx < XFS_INODES_PER_CHUNK && + (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { + int i; + + /* We got a right chunk with some left inodes allocated at it. + * Grab the chunk record. Mark all the uninteresting inodes + * free -- because they're before our start point. + */ + for (i = 0; i < idx; i++) { + if (XFS_INOBT_MASK(i) & ~irec->ir_free) + irec->ir_freecount++; + } + + irec->ir_free |= xfs_inobt_maskn(0, idx); + *icount = irec->ir_count - irec->ir_freecount; + } + + return 0; +} + /* Allocate memory for a walk. */ STATIC int xfs_iwalk_allocbuf( @@ -204,7 +295,7 @@ xfs_iwalk_ag_start( * We require a lookup cache of at least two elements so that we don't * have to deal with tearing down the cursor to walk the records. */ - error = xfs_bulkstat_grab_ichunk(*curpp, agino - 1, &icount, + error = xfs_iwalk_grab_ichunk(*curpp, agino - 1, &icount, &iwag->recs[iwag->nr_recs]); if (error) return error; @@ -305,7 +396,7 @@ xfs_iwalk_ag( * Start readahead for this inode chunk in anticipation of * walking the inodes. */ - xfs_bulkstat_ichunk_ra(mp, agno, irec); + xfs_iwalk_ichunk_ra(mp, agno, irec); /* * Add this inobt record to our cache, flush the cache if From patchwork Wed May 29 22:26:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 10967855 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3CF7976 for ; Wed, 29 May 2019 22:27:04 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 297C628A35 for ; Wed, 29 May 2019 22:27:04 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1848228A38; Wed, 29 May 2019 22:27:04 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A759C28A35 for ; Wed, 29 May 2019 22:27:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726532AbfE2W1D (ORCPT ); Wed, 29 May 2019 18:27:03 -0400 Received: from userp2130.oracle.com ([156.151.31.86]:42680 "EHLO userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726428AbfE2W1C (ORCPT ); Wed, 29 May 2019 18:27:02 -0400 Received: from pps.filterd (userp2130.oracle.com [127.0.0.1]) by userp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM4jYo024135 for ; Wed, 29 May 2019 22:27:01 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=Y5F7q5sBr8wQt965IrO12rpid2fTyorOrY8mJctsLF4=; b=QjC8mmiGZbXITiwoHbNDUjks9fwJ/oB+qnQW9G9UUy3iYLFTSOUx7YEiaL13DCeA8jNE N4rbk/7oH09poIMNWG45Jgl5k715Z6vmm1ttfGnzskd3BNDhIEob6nR0DNWUGboaRhDp OgT/1Pl8MjR9Tv9Y9Yxxssd2Dr5OiwWgq1EzffliMEESD/FC/xE1MaS0GU1GImjx1sPb LaVXXnoZKIzhc0R2ZeI8DCNgaY/AKIsfoiT19HWioPhZSzBrVwro3p+aBs0x+QrkMg4H oaW4sjCPOKPtyC9UVwD0YQEyllxRRLoNa/l5YqbVTZRPWIp444OSGvVI1TTFAe7ix3hJ pA== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by userp2130.oracle.com with ESMTP id 2spw4tmtcu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:27:01 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMQcgD173038 for ; Wed, 29 May 2019 22:27:00 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 2sqh73ykwb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:27:00 +0000 Received: from abhmp0004.oracle.com (abhmp0004.oracle.com [141.146.116.10]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x4TMQx83003564 for ; Wed, 29 May 2019 22:26:59 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:26:59 -0700 Subject: [PATCH 07/11] xfs: change xfs_iwalk_grab_ichunk to use startino, not lastino From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:26:58 -0700 Message-ID: <155916881851.757870.17079240504908852762.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=783 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=824 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Now that the inode chunk grabbing function is a static function in the iwalk code, change its behavior so that @agino is the inode where we want to /start/ the iteration. This reduces cognitive friction with the callers and simplifes the code. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_iwalk.c | 37 +++++++++++++++++-------------------- 1 file changed, 17 insertions(+), 20 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 77320c297284..8e7881e95674 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -87,10 +87,10 @@ xfs_iwalk_ichunk_ra( } /* - * Lookup the inode chunk that the given inode lives in and then get the record - * if we found the chunk. If the inode was not the last in the chunk and there - * are some left allocated, update the data for the pointed-to record as well as - * return the count of grabbed inodes. + * Lookup the inode chunk that the given @agino lives in and then get the + * record if we found the chunk. Set the bits in @irec's free mask that + * correspond to the inodes before @agino so that we skip them. This is how we + * restart an inode walk that was interrupted in the middle of an inode record. */ STATIC int xfs_iwalk_grab_ichunk( @@ -101,6 +101,7 @@ xfs_iwalk_grab_ichunk( { int idx; /* index into inode chunk */ int stat; + int i; int error = 0; /* Lookup the inode chunk that this inode lives in */ @@ -124,24 +125,20 @@ xfs_iwalk_grab_ichunk( return 0; } - idx = agino - irec->ir_startino + 1; - if (idx < XFS_INODES_PER_CHUNK && - (xfs_inobt_maskn(idx, XFS_INODES_PER_CHUNK - idx) & ~irec->ir_free)) { - int i; + idx = agino - irec->ir_startino; - /* We got a right chunk with some left inodes allocated at it. - * Grab the chunk record. Mark all the uninteresting inodes - * free -- because they're before our start point. - */ - for (i = 0; i < idx; i++) { - if (XFS_INOBT_MASK(i) & ~irec->ir_free) - irec->ir_freecount++; - } - - irec->ir_free |= xfs_inobt_maskn(0, idx); - *icount = irec->ir_count - irec->ir_freecount; + /* + * We got a right chunk with some left inodes allocated at it. Grab + * the chunk record. Mark all the uninteresting inodes free because + * they're before our start point. 
+ */ + for (i = 0; i < idx; i++) { + if (XFS_INOBT_MASK(i) & ~irec->ir_free) + irec->ir_freecount++; } + irec->ir_free |= xfs_inobt_maskn(0, idx); + *icount = irec->ir_count - irec->ir_freecount; return 0; } @@ -295,7 +292,7 @@ xfs_iwalk_ag_start( * We require a lookup cache of at least two elements so that we don't * have to deal with tearing down the cursor to walk the records. */ - error = xfs_iwalk_grab_ichunk(*curpp, agino - 1, &icount, + error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount, &iwag->recs[iwag->nr_recs]); if (error) return error;
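The restart trick above is worth seeing in isolation: to resume a walk in the middle of a 64-inode chunk, mark every inode below the resume point as if it were free, and the ordinary skip-the-free-inodes loop does the rest. A minimal userspace sketch of that idea, assuming a 64-inode chunk and a hypothetical maskn() helper standing in for xfs_inobt_maskn():

#include <stdint.h>
#include <stdio.h>

/* Bitmask covering @len bits starting at bit @start, like xfs_inobt_maskn. */
static uint64_t maskn(int start, int len)
{
	uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);
	return mask << start;
}

int main(void)
{
	uint64_t free_mask = 0x5;	/* inodes 0 and 2 happen to be free */
	int startidx = 3;		/* restart the walk at inode 3 */
	int i;

	/* Everything below startidx is now "free" and thus gets skipped. */
	free_mask |= maskn(0, startidx);

	for (i = 0; i < 64; i++)
		if (!(free_mask & (1ULL << i)))
			printf("visit inode %d\n", i);
	return 0;
}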
From patchwork Wed May 29 22:27:04 2019 X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10967857 Subject: [PATCH 08/11] xfs: clean up long conditionals in xfs_iwalk_ichunk_ra From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:27:04 -0700 Message-ID: <155916882448.757870.7872177617055446316.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> From: Darrick J. Wong Refactor xfs_iwalk_ichunk_ra to avoid long conditionals. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_iwalk.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 8e7881e95674..3c523afdcfa0 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -76,8 +76,10 @@ xfs_iwalk_ichunk_ra( i < XFS_INODES_PER_CHUNK; i += igeo->ig_inodes_per_cluster, agbno += igeo->ig_blocks_per_cluster) { - if (xfs_inobt_maskn(i, igeo->ig_inodes_per_cluster) & - ~irec->ir_free) { + xfs_inofree_t imask; + + imask = xfs_inobt_maskn(i, igeo->ig_inodes_per_cluster); + if (imask & ~irec->ir_free) { xfs_btree_reada_bufs(mp, agno, agbno, igeo->ig_blocks_per_cluster, &xfs_inode_buf_ops);
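The loop being tidied here strides across the chunk one inode cluster at a time and issues readahead only for clusters that contain at least one allocated inode. A rough standalone sketch of that stride-and-test shape, assuming a 16-inode cluster and reusing the same hypothetical maskn() helper as the previous sketch:

#include <stdint.h>
#include <stdio.h>

#define INODES_PER_CHUNK	64

static uint64_t maskn(int start, int len)
{
	uint64_t mask = (len == 64) ? ~0ULL : ((1ULL << len) - 1);
	return mask << start;
}

int main(void)
{
	int inodes_per_cluster = 16;		/* assumed geometry */
	uint64_t free_mask = maskn(16, 16);	/* second cluster wholly free */
	int i;

	for (i = 0; i < INODES_PER_CHUNK; i += inodes_per_cluster) {
		uint64_t imask = maskn(i, inodes_per_cluster);

		if (imask & ~free_mask)	/* any inode in this cluster in use? */
			printf("readahead cluster at inode %d\n", i);
	}
	return 0;
}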
Wong" X-Patchwork-Id: 10967859 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B87C476 for ; Wed, 29 May 2019 22:27:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A6C4528A35 for ; Wed, 29 May 2019 22:27:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9ACFC28A38; Wed, 29 May 2019 22:27:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A0E0628A35 for ; Wed, 29 May 2019 22:27:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726189AbfE2W1V (ORCPT ); Wed, 29 May 2019 18:27:21 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:36664 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726054AbfE2W1V (ORCPT ); Wed, 29 May 2019 18:27:21 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TM3f9Y041443 for ; Wed, 29 May 2019 22:27:18 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=V7qmOr/TeSEffMwQJwOF8Mq07C85/vSfs4PAQS6Eyhs=; b=AYEcry+GwFcxaLg7tFbtY7qsC3OOuxwT8XG7qd4ue75PeSrCMSwETEbvBUsXTpoAUOGz vnVzaD6klJT+1TBPHGxp7pCu3npHuxdOiLr7woSf9U+jr1LNDYmvBgCHDDjKey5zoJ4g 2xmcNwwq9PZVlwfmqlyzQNQz6UmEgQ2wbcs5vRVOQMU2ht79NXws+RqnZxEbl5Rl9zCD v0+gbFEQYlguTdieYpCl36BKjQlk7532W4wDTvsNsZ+gaMiV1U4lUat92HHcKxHtfQ9k 9PoCT9fau8DWiI4Sx5XHykprc910Dtpj5fG5tTUpwqVULEOp9LSXSrFSu6m/w6+gQbmI TQ== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by aserp2130.oracle.com with ESMTP id 2spu7dn178-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:27:18 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x4TMPrFq141964 for ; Wed, 29 May 2019 22:27:18 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by aserp3030.oracle.com with ESMTP id 2srbdxne5p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK) for ; Wed, 29 May 2019 22:27:18 +0000 Received: from abhmp0005.oracle.com (abhmp0005.oracle.com [141.146.116.11]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id x4TMRHcK023047 for ; Wed, 29 May 2019 22:27:17 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Wed, 29 May 2019 15:27:17 -0700 Subject: [PATCH 09/11] xfs: multithreaded iwalk implementation From: "Darrick J. 
Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:27:10 -0700 Message-ID: <155916883058.757870.14626550548698723956.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=3 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9272 signatures=668687 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=3 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1905290138 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Darrick J. Wong Create a parallel iwalk implementation and switch quotacheck to use it. Signed-off-by: Darrick J. Wong --- fs/xfs/Makefile | 1 fs/xfs/xfs_globals.c | 3 + fs/xfs/xfs_iwalk.c | 76 +++++++++++++++++++++++++++++++- fs/xfs/xfs_iwalk.h | 2 + fs/xfs/xfs_pwork.c | 118 ++++++++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/xfs_pwork.h | 50 +++++++++++++++++++++ fs/xfs/xfs_qm.c | 2 - fs/xfs/xfs_sysctl.h | 6 +++ fs/xfs/xfs_sysfs.c | 40 +++++++++++++++++ fs/xfs/xfs_trace.h | 18 ++++++++ 10 files changed, 313 insertions(+), 3 deletions(-) create mode 100644 fs/xfs/xfs_pwork.c create mode 100644 fs/xfs/xfs_pwork.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 74d30ef0dbce..48940a27d4aa 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -84,6 +84,7 @@ xfs-y += xfs_aops.o \ xfs_message.o \ xfs_mount.o \ xfs_mru_cache.o \ + xfs_pwork.o \ xfs_reflink.o \ xfs_stats.o \ xfs_super.o \ diff --git a/fs/xfs/xfs_globals.c b/fs/xfs/xfs_globals.c index d0d377384120..4f93f2c4dc38 100644 --- a/fs/xfs/xfs_globals.c +++ b/fs/xfs/xfs_globals.c @@ -31,6 +31,9 @@ xfs_param_t xfs_params = { .fstrm_timer = { 1, 30*100, 3600*100}, .eofb_timer = { 1, 300, 3600*24}, .cowb_timer = { 1, 1800, 3600*24}, +#ifdef DEBUG + .pwork_threads = { 0, 0, NR_CPUS }, +#endif }; struct xfs_globals xfs_globals = { diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 3c523afdcfa0..453790bae194 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -21,6 +21,7 @@ #include "xfs_health.h" #include "xfs_trans.h" #include "xfs_iwalk.h" +#include "xfs_pwork.h" /* * Walking All the Inodes in the Filesystem @@ -38,6 +39,9 @@ */ struct xfs_iwalk_ag { + /* parallel work control data; will be null if single threaded */ + struct xfs_pwork pwork; + struct xfs_mount *mp; struct xfs_trans *tp; @@ -190,6 +194,9 @@ xfs_iwalk_ag_recs( trace_xfs_iwalk_ag_rec(mp, agno, irec->ir_startino, irec->ir_free); for (j = 0; j < XFS_INODES_PER_CHUNK; j++) { + if (xfs_pwork_want_abort(&iwag->pwork)) + return 0; + /* Skip if this inode is free */ if (XFS_INOBT_MASK(j) & irec->ir_free) continue; @@ -374,7 +381,7 @@ xfs_iwalk_ag( if (error) goto out_cur; - while (has_more) { + while (has_more && !xfs_pwork_want_abort(&iwag->pwork)) { struct xfs_inobt_rec_incore *irec; /* 
Fetch the inobt record. */ @@ -410,7 +417,7 @@ xfs_iwalk_ag( } /* Walk any records left behind in the cache. */ - if (iwag->nr_recs) { + if (iwag->nr_recs && !xfs_pwork_want_abort(&iwag->pwork)) { xfs_iwalk_del_inobt(tp, &cur, &agi_bp, error); return xfs_iwalk_ag_recs(iwag); } @@ -469,6 +476,7 @@ xfs_iwalk( .iwalk_fn = iwalk_fn, .data = data, .startino = startino, + .pwork = XFS_PWORK_SINGLE_THREADED, }; xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); int error; @@ -490,3 +498,67 @@ xfs_iwalk( xfs_iwalk_freebuf(&iwag); return error; } + +/* Run per-thread iwalk work. */ +static int +xfs_iwalk_ag_work( + struct xfs_mount *mp, + struct xfs_pwork *pwork) +{ + struct xfs_iwalk_ag *iwag; + int error; + + iwag = container_of(pwork, struct xfs_iwalk_ag, pwork); + error = xfs_iwalk_allocbuf(iwag); + if (error) + goto out; + + error = xfs_iwalk_ag(iwag); + xfs_iwalk_freebuf(iwag); +out: + kmem_free(iwag); + return error; +} + +/* + * Walk all the inodes in the filesystem using multiple threads to process each + * AG. + */ +int +xfs_iwalk_threaded( + struct xfs_mount *mp, + xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_pwork_ctl pctl; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + unsigned int nr_threads; + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + nr_threads = xfs_pwork_guess_datadev_parallelism(mp); + error = xfs_pwork_init(mp, &pctl, xfs_iwalk_ag_work, "xfs_iwalk", + nr_threads); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + struct xfs_iwalk_ag *iwag; + + iwag = kmem_alloc(sizeof(struct xfs_iwalk_ag), KM_SLEEP); + iwag->mp = mp; + iwag->tp = NULL; + iwag->iwalk_fn = iwalk_fn; + iwag->data = data; + iwag->startino = startino; + iwag->recs = NULL; + xfs_iwalk_set_prefetch(iwag, max_prefetch); + xfs_pwork_queue(&pctl, &iwag->pwork); + startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + return xfs_pwork_destroy(&pctl); +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 45b1baabcd2d..40233a05a766 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -14,5 +14,7 @@ typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); +int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); #endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c new file mode 100644 index 000000000000..c1419558b089 --- /dev/null +++ b/fs/xfs/xfs_pwork.c @@ -0,0 +1,118 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_log_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_trace.h" +#include "xfs_sysctl.h" +#include "xfs_pwork.h" + +/* + * Parallel Work Queue + * =================== + * Abstract away the details of running a large and "obviously" parallelizable + * task across multiple CPUs. Callers initialize the pwork control object with + * a desired level of parallelization and a work function. Next, they embed + * struct xfs_pwork in whatever structure they use to pass work context to a + * worker thread and queue that pwork. 
The work function will be passed the + * pwork item when it is run (from process context) and any returned error will + * cause all threads to abort. + */ + +/* Invoke our caller's function. */ +static void +xfs_pwork_work( + struct work_struct *work) +{ + struct xfs_pwork *pwork; + struct xfs_pwork_ctl *pctl; + int error; + + pwork = container_of(work, struct xfs_pwork, work); + pctl = pwork->pctl; + error = pctl->work_fn(pctl->mp, pwork); + if (error && !pctl->error) + pctl->error = error; +} + +/* + * Set up control data for parallel work. @work_fn is the function that will + * be called. @tag will be written into the kernel threads. @nr_threads is + * the level of parallelism desired, or 0 for no limit. + */ +int +xfs_pwork_init( + struct xfs_mount *mp, + struct xfs_pwork_ctl *pctl, + xfs_pwork_work_fn work_fn, + const char *tag, + unsigned int nr_threads) +{ +#ifdef DEBUG + if (xfs_globals.pwork_threads > 0) + nr_threads = xfs_globals.pwork_threads; +#endif + trace_xfs_pwork_init(mp, nr_threads, current->pid); + + pctl->wq = alloc_workqueue("%s-%d", WQ_FREEZABLE, nr_threads, tag, + current->pid); + if (!pctl->wq) + return -ENOMEM; + pctl->work_fn = work_fn; + pctl->error = 0; + pctl->mp = mp; + + return 0; +} + +/* Queue some parallel work. */ +void +xfs_pwork_queue( + struct xfs_pwork_ctl *pctl, + struct xfs_pwork *pwork) +{ + INIT_WORK(&pwork->work, xfs_pwork_work); + pwork->pctl = pctl; + queue_work(pctl->wq, &pwork->work); +} + +/* Wait for the work to finish and tear down the control structure. */ +int +xfs_pwork_destroy( + struct xfs_pwork_ctl *pctl) +{ + destroy_workqueue(pctl->wq); + pctl->wq = NULL; + return pctl->error; +} + +/* + * Return the amount of parallelism that the data device can handle, or 0 for + * no limit. + */ +unsigned int +xfs_pwork_guess_datadev_parallelism( + struct xfs_mount *mp) +{ + struct xfs_buftarg *btp = mp->m_ddev_targp; + int iomin; + int ioopt; + + if (blk_queue_nonrot(btp->bt_bdev->bd_queue)) + return num_online_cpus(); + if (mp->m_sb.sb_width && mp->m_sb.sb_unit) + return mp->m_sb.sb_width / mp->m_sb.sb_unit; + iomin = bdev_io_min(btp->bt_bdev); + ioopt = bdev_io_opt(btp->bt_bdev); + if (iomin && ioopt) + return ioopt / iomin; + + return 1; +} diff --git a/fs/xfs/xfs_pwork.h b/fs/xfs/xfs_pwork.h new file mode 100644 index 000000000000..e0c1354a2d8c --- /dev/null +++ b/fs/xfs/xfs_pwork.h @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * Copyright (C) 2019 Oracle. All Rights Reserved. + * Author: Darrick J. Wong + */ +#ifndef __XFS_PWORK_H__ +#define __XFS_PWORK_H__ + +struct xfs_pwork; +struct xfs_mount; + +typedef int (*xfs_pwork_work_fn)(struct xfs_mount *mp, struct xfs_pwork *pwork); + +/* + * Parallel work coordination structure. + */ +struct xfs_pwork_ctl { + struct workqueue_struct *wq; + struct xfs_mount *mp; + xfs_pwork_work_fn work_fn; + int error; +}; + +/* + * Embed this parallel work control item inside your own work structure, + * then queue work with it. + */ +struct xfs_pwork { + struct work_struct work; + struct xfs_pwork_ctl *pctl; +}; + +#define XFS_PWORK_SINGLE_THREADED { .pctl = NULL } + +/* Have we been told to abort? 
*/ +static inline bool +xfs_pwork_want_abort( + struct xfs_pwork *pwork) +{ + return pwork->pctl && pwork->pctl->error; +} + +int xfs_pwork_init(struct xfs_mount *mp, struct xfs_pwork_ctl *pctl, + xfs_pwork_work_fn work_fn, const char *tag, + unsigned int nr_threads); +void xfs_pwork_queue(struct xfs_pwork_ctl *pctl, struct xfs_pwork *pwork); +int xfs_pwork_destroy(struct xfs_pwork_ctl *pctl); +unsigned int xfs_pwork_guess_datadev_parallelism(struct xfs_mount *mp); + +#endif /* __XFS_PWORK_H__ */ diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index a5b2260406a8..e4f3785f7a64 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1305,7 +1305,7 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - error = xfs_iwalk(mp, NULL, 0, xfs_qm_dqusage_adjust, 0, NULL); + error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, NULL); if (error) goto error_return; diff --git a/fs/xfs/xfs_sysctl.h b/fs/xfs/xfs_sysctl.h index ad7f9be13087..b555e045e2f4 100644 --- a/fs/xfs/xfs_sysctl.h +++ b/fs/xfs/xfs_sysctl.h @@ -37,6 +37,9 @@ typedef struct xfs_param { xfs_sysctl_val_t fstrm_timer; /* Filestream dir-AG assoc'n timeout. */ xfs_sysctl_val_t eofb_timer; /* Interval between eofb scan wakeups */ xfs_sysctl_val_t cowb_timer; /* Interval between cowb scan wakeups */ +#ifdef DEBUG + xfs_sysctl_val_t pwork_threads; /* Parallel workqueue thread count */ +#endif } xfs_param_t; /* @@ -82,6 +85,9 @@ enum { extern xfs_param_t xfs_params; struct xfs_globals { +#ifdef DEBUG + int pwork_threads; /* parallel workqueue threads */ +#endif int log_recovery_delay; /* log recovery delay (secs) */ int mount_delay; /* mount setup delay (secs) */ bool bug_on_assert; /* BUG() the kernel on assert failure */ diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c index cabda13f3c64..910e6b9cb1a7 100644 --- a/fs/xfs/xfs_sysfs.c +++ b/fs/xfs/xfs_sysfs.c @@ -206,11 +206,51 @@ always_cow_show( } XFS_SYSFS_ATTR_RW(always_cow); +#ifdef DEBUG +/* + * Override how many threads the parallel work queue is allowed to create. + * This has to be a debug-only global (instead of an errortag) because one of + * the main users of parallel workqueues is mount time quotacheck. 
+ */ +STATIC ssize_t +pwork_threads_store( + struct kobject *kobject, + const char *buf, + size_t count) +{ + int ret; + int val; + + ret = kstrtoint(buf, 0, &val); + if (ret) + return ret; + + if (val < 0 || val > NR_CPUS) + return -EINVAL; + + xfs_globals.pwork_threads = val; + + return count; +} + +STATIC ssize_t +pwork_threads_show( + struct kobject *kobject, + char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", xfs_globals.pwork_threads); +} +XFS_SYSFS_ATTR_RW(pwork_threads); +#endif /* DEBUG */ + static struct attribute *xfs_dbg_attrs[] = { ATTR_LIST(bug_on_assert), ATTR_LIST(log_recovery_delay), ATTR_LIST(mount_delay), ATTR_LIST(always_cow), +#ifdef DEBUG + ATTR_LIST(pwork_threads), +#endif NULL, }; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index a2881659f776..5c0eea7bfcb0 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3556,6 +3556,24 @@ TRACE_EVENT(xfs_iwalk_ag_rec, __entry->startino, __entry->freemask) ) +TRACE_EVENT(xfs_pwork_init, + TP_PROTO(struct xfs_mount *mp, unsigned int nr_threads, pid_t pid), + TP_ARGS(mp, nr_threads, pid), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(unsigned int, nr_threads) + __field(pid_t, pid) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->nr_threads = nr_threads; + __entry->pid = pid; + ), + TP_printk("dev %d:%d nr_threads %u pid %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->nr_threads, __entry->pid) +) + #endif /* _TRACE_XFS_H */ #undef TRACE_INCLUDE_PATH
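The pwork design hinges on callers embedding struct xfs_pwork inside their own work item and recovering the outer structure with container_of() when the work function runs. A minimal userspace sketch of that embed-and-recover pattern, with hypothetical names and a direct call standing in for the workqueue dispatch:

#include <stddef.h>
#include <stdio.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct pwork {
	int (*work_fn)(struct pwork *pwork);
};

struct iwalk_job {
	struct pwork	pwork;	/* embedded, never pointed-to separately */
	int		agno;	/* caller-specific context */
};

static int iwalk_job_run(struct pwork *pwork)
{
	/* Recover the outer work item from the embedded member. */
	struct iwalk_job *job = container_of(pwork, struct iwalk_job, pwork);

	printf("walking AG %d\n", job->agno);
	return 0;
}

int main(void)
{
	struct iwalk_job job = { .pwork.work_fn = iwalk_job_run, .agno = 2 };

	/* A workqueue worker thread would make this call in the kernel. */
	return job.pwork.work_fn(&job.pwork);
}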
From patchwork Wed May 29 22:27:22 2019 X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10967861 Subject: [PATCH 10/11] xfs: poll waiting for quotacheck From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:27:22 -0700 Message-ID: <155916884274.757870.3080038991334062615.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> From: Darrick J. Wong Create a pwork destroy function that uses polling instead of uninterruptible sleep to wait for work items to finish so that we can touch the softlockup watchdog. IOWs, gross hack. Signed-off-by: Darrick J.
Wong --- fs/xfs/xfs_iwalk.c | 3 +++ fs/xfs/xfs_iwalk.h | 3 ++- fs/xfs/xfs_pwork.c | 21 +++++++++++++++++++++ fs/xfs/xfs_pwork.h | 2 ++ fs/xfs/xfs_qm.c | 2 +- 5 files changed, 29 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 453790bae194..7f40d0633651 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -530,6 +530,7 @@ xfs_iwalk_threaded( xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, + bool polled, void *data) { struct xfs_pwork_ctl pctl; @@ -560,5 +561,7 @@ xfs_iwalk_threaded( startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); } + if (polled) + return xfs_pwork_destroy_poll(&pctl); return xfs_pwork_destroy(&pctl); } diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 40233a05a766..76d8f87a39ef 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -15,6 +15,7 @@ typedef int (*xfs_iwalk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, int xfs_iwalk(struct xfs_mount *mp, struct xfs_trans *tp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, - xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, void *data); + xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll, + void *data); #endif /* __XFS_IWALK_H__ */ diff --git a/fs/xfs/xfs_pwork.c b/fs/xfs/xfs_pwork.c index c1419558b089..a9c39615233a 100644 --- a/fs/xfs/xfs_pwork.c +++ b/fs/xfs/xfs_pwork.c @@ -13,6 +13,7 @@ #include "xfs_trace.h" #include "xfs_sysctl.h" #include "xfs_pwork.h" +#include <linux/delay.h> /* * Parallel Work Queue @@ -40,6 +41,7 @@ xfs_pwork_work( error = pctl->work_fn(pctl->mp, pwork); if (error && !pctl->error) pctl->error = error; + atomic_dec(&pctl->nr_work); } /* @@ -68,6 +70,7 @@ xfs_pwork_init( pctl->work_fn = work_fn; pctl->error = 0; pctl->mp = mp; + atomic_set(&pctl->nr_work, 0); return 0; } @@ -80,6 +83,7 @@ xfs_pwork_queue( { INIT_WORK(&pwork->work, xfs_pwork_work); pwork->pctl = pctl; + atomic_inc(&pctl->nr_work); queue_work(pctl->wq, &pwork->work); } @@ -93,6 +97,23 @@ xfs_pwork_destroy( return pctl->error; } +/* + * Wait for the work to finish and tear down the control structure. + * Continually poll completion status and touch the soft lockup watchdog. + * This is for things like mount that hold locks. + */ +int +xfs_pwork_destroy_poll( + struct xfs_pwork_ctl *pctl) +{ + while (atomic_read(&pctl->nr_work) > 0) { + msleep(1); + touch_softlockup_watchdog(); + } + + return xfs_pwork_destroy(pctl); +} + /* * Return the amount of parallelism that the data device can handle, or 0 for * no limit.
diff --git a/fs/xfs/xfs_pwork.h b/fs/xfs/xfs_pwork.h index e0c1354a2d8c..08da723a8dc9 100644 --- a/fs/xfs/xfs_pwork.h +++ b/fs/xfs/xfs_pwork.h @@ -18,6 +18,7 @@ struct xfs_pwork_ctl { struct workqueue_struct *wq; struct xfs_mount *mp; xfs_pwork_work_fn work_fn; + atomic_t nr_work; int error; }; @@ -45,6 +46,7 @@ int xfs_pwork_init(struct xfs_mount *mp, struct xfs_pwork_ctl *pctl, unsigned int nr_threads); void xfs_pwork_queue(struct xfs_pwork_ctl *pctl, struct xfs_pwork *pwork); int xfs_pwork_destroy(struct xfs_pwork_ctl *pctl); +int xfs_pwork_destroy_poll(struct xfs_pwork_ctl *pctl); unsigned int xfs_pwork_guess_datadev_parallelism(struct xfs_mount *mp); #endif /* __XFS_PWORK_H__ */ diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index e4f3785f7a64..de6a623ada02 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1305,7 +1305,7 @@ xfs_qm_quotacheck( flags |= XFS_PQUOTA_CHKD; } - error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, NULL); + error = xfs_iwalk_threaded(mp, 0, xfs_qm_dqusage_adjust, 0, true, NULL); if (error) goto error_return;
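The polled teardown added here is a simple pattern: each worker decrements a counter as it finishes, and the waiter spins on that counter with short sleeps so it can also pet the watchdog between checks. A userspace analogue, with hypothetical names and usleep() standing in for msleep() plus touch_softlockup_watchdog():

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static atomic_int nr_work;

static void *worker(void *arg)
{
	usleep(5000);			/* pretend to walk an AG */
	atomic_fetch_sub(&nr_work, 1);	/* like atomic_dec(&pctl->nr_work) */
	return NULL;
}

int main(void)
{
	pthread_t threads[4];
	int i;

	atomic_store(&nr_work, 4);
	for (i = 0; i < 4; i++)
		pthread_create(&threads[i], NULL, worker, NULL);

	/* Poll for completion; the kernel version touches the watchdog here. */
	while (atomic_load(&nr_work) > 0)
		usleep(1000);

	for (i = 0; i < 4; i++)
		pthread_join(threads[i], NULL);
	printf("all workers done\n");
	return 0;
}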
From patchwork Wed May 29 22:27:28 2019 X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 10967865 Subject: [PATCH 11/11] xfs: refactor INUMBERS to use iwalk functions From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org Date: Wed, 29 May 2019 15:27:28 -0700 Message-ID: <155916884881.757870.5045246991167103120.stgit@magnolia> In-Reply-To: <155916877311.757870.11060347556535201032.stgit@magnolia> References: <155916877311.757870.11060347556535201032.stgit@magnolia> From: Darrick J. Wong Now that we have generic functions to walk inode records, refactor the INUMBERS implementation to use them. Signed-off-by: Darrick J. Wong --- fs/xfs/xfs_ioctl.c | 20 ++++-- fs/xfs/xfs_ioctl.h | 2 + fs/xfs/xfs_ioctl32.c | 35 ++++------ fs/xfs/xfs_itable.c | 168 ++++++++++++++++++++------------------------ fs/xfs/xfs_itable.h | 22 +------ fs/xfs/xfs_iwalk.c | 155 ++++++++++++++++++++++++++++++++++++++++++++-- fs/xfs/xfs_iwalk.h | 12 ++++ 7 files changed, 256 insertions(+), 158 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 43734901aeb9..4fa9a2c8b029 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -732,6 +732,16 @@ xfs_bulkstat_one_fmt( return xfs_ibulk_advance(breq, sizeof(struct xfs_bstat)); } +int +xfs_inumbers_fmt( + struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp) +{ + if (copy_to_user(breq->ubuffer, igrp, sizeof(*igrp))) + return -EFAULT; + return xfs_ibulk_advance(breq, sizeof(struct xfs_inogrp)); +} + STATIC int xfs_ioc_bulkstat( xfs_mount_t *mp, @@ -779,13 +789,9 @@ xfs_ioc_bulkstat( * parameter to maintain correct function.
*/ if (cmd == XFS_IOC_FSINUMBERS) { - int count = breq.icount; - - breq.startino = lastino; - error = xfs_inumbers(mp, &breq.startino, &count, - bulkreq.ubuffer, xfs_inumbers_fmt); - breq.ocount = count; - lastino = breq.startino; + breq.startino = lastino + 1; + error = xfs_inumbers(&breq, xfs_inumbers_fmt); + lastino = breq.startino - 1; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE) { breq.startino = lastino; error = xfs_bulkstat_one(&breq, xfs_bulkstat_one_fmt); diff --git a/fs/xfs/xfs_ioctl.h b/fs/xfs/xfs_ioctl.h index f32c8aadfeba..fb303eaa8863 100644 --- a/fs/xfs/xfs_ioctl.h +++ b/fs/xfs/xfs_ioctl.h @@ -79,7 +79,9 @@ xfs_set_dmattrs( struct xfs_ibulk; struct xfs_bstat; +struct xfs_inogrp; int xfs_bulkstat_one_fmt(struct xfs_ibulk *breq, const struct xfs_bstat *bstat); +int xfs_inumbers_fmt(struct xfs_ibulk *breq, const struct xfs_inogrp *igrp); #endif diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c index add15819daf3..dd53a9692e68 100644 --- a/fs/xfs/xfs_ioctl32.c +++ b/fs/xfs/xfs_ioctl32.c @@ -85,22 +85,17 @@ xfs_compat_growfs_rt_copyin( STATIC int xfs_inumbers_fmt_compat( - void __user *ubuffer, - const struct xfs_inogrp *buffer, - long count, - long *written) + struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp) { - compat_xfs_inogrp_t __user *p32 = ubuffer; - long i; + struct compat_xfs_inogrp __user *p32 = breq->ubuffer; - for (i = 0; i < count; i++) { - if (put_user(buffer[i].xi_startino, &p32[i].xi_startino) || - put_user(buffer[i].xi_alloccount, &p32[i].xi_alloccount) || - put_user(buffer[i].xi_allocmask, &p32[i].xi_allocmask)) - return -EFAULT; - } - *written = count * sizeof(*p32); - return 0; + if (put_user(igrp->xi_startino, &p32->xi_startino) || + put_user(igrp->xi_alloccount, &p32->xi_alloccount) || + put_user(igrp->xi_allocmask, &p32->xi_allocmask)) + return -EFAULT; + + return xfs_ibulk_advance(breq, sizeof(struct compat_xfs_inogrp)); } #else @@ -225,7 +220,7 @@ xfs_compat_ioc_bulkstat( * to userpace memory via bulkreq.ubuffer. Normally the compat * functions and structure size are the correct ones to use ... */ - inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; + inumbers_fmt_pf inumbers_func = xfs_inumbers_fmt_compat; bulkstat_one_fmt_pf bs_one_func = xfs_bulkstat_one_fmt_compat; #ifdef CONFIG_X86_X32 @@ -286,13 +281,9 @@ xfs_compat_ioc_bulkstat( * parameter to maintain correct function. */ if (cmd == XFS_IOC_FSINUMBERS_32) { - int count = breq.icount; - - breq.startino = lastino; - error = xfs_inumbers(mp, &breq.startino, &count, - bulkreq.ubuffer, inumbers_func); - breq.ocount = count; - lastino = breq.startino; + breq.startino = lastino + 1; + error = xfs_inumbers(&breq, inumbers_func); + lastino = breq.startino - 1; } else if (cmd == XFS_IOC_FSBULKSTAT_SINGLE_32) { breq.startino = lastino; error = xfs_bulkstat_one(&breq, bs_one_func); diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c index 06abe5c9c0ee..bade54d6ac64 100644 --- a/fs/xfs/xfs_itable.c +++ b/fs/xfs/xfs_itable.c @@ -259,121 +259,85 @@ xfs_bulkstat( return error; } -int -xfs_inumbers_fmt( - void __user *ubuffer, /* buffer to write to */ - const struct xfs_inogrp *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written) /* # of bytes written */ +struct xfs_inumbers_chunk { + inumbers_fmt_pf formatter; + struct xfs_ibulk *breq; +}; + +/* + * INUMBERS + * ======== + * This is how we export inode btree records to userspace, so that XFS tools + * can figure out where inodes are allocated. 
+ */ + +/* + * Format the inode group structure and report it somewhere. + * + * Similar to xfs_bulkstat_one_int, lastino is the inode cursor as we walk + * through the filesystem so we move it forward unless there was a runtime + * error. If the formatter tells us the buffer is now full we also move the + * cursor forward and abort the walk. + */ +STATIC int +xfs_inumbers_walk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_agnumber_t agno, + const struct xfs_inobt_rec_incore *irec, + void *data) { - if (copy_to_user(ubuffer, buffer, count * sizeof(*buffer))) - return -EFAULT; - *written = count * sizeof(*buffer); - return 0; + struct xfs_inogrp inogrp = { + .xi_startino = XFS_AGINO_TO_INO(mp, agno, irec->ir_startino), + .xi_alloccount = irec->ir_count - irec->ir_freecount, + .xi_allocmask = ~irec->ir_free, + }; + struct xfs_inumbers_chunk *ic = data; + xfs_agino_t agino; + int error; + + error = ic->formatter(ic->breq, &inogrp); + if (error && error != XFS_IBULK_BUFFER_FULL) + return error; + if (error == XFS_IBULK_BUFFER_FULL) + error = XFS_INOBT_WALK_ABORT; + + agino = irec->ir_startino + XFS_INODES_PER_CHUNK; + ic->breq->startino = XFS_AGINO_TO_INO(mp, agno, agino); + return error; } /* * Return inode number table for the filesystem. */ -int /* error status */ +int xfs_inumbers( - struct xfs_mount *mp,/* mount point for filesystem */ - xfs_ino_t *lastino,/* last inode returned */ - int *count,/* size of buffer/count returned */ - void __user *ubuffer,/* buffer with inode descriptions */ + struct xfs_ibulk *breq, inumbers_fmt_pf formatter) { - xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, *lastino); - xfs_agino_t agino = XFS_INO_TO_AGINO(mp, *lastino); - struct xfs_btree_cur *cur = NULL; - struct xfs_buf *agbp = NULL; - struct xfs_inogrp *buffer; - int bcount; - int left = *count; - int bufidx = 0; + struct xfs_inumbers_chunk ic = { + .formatter = formatter, + .breq = breq, + }; int error = 0; - *count = 0; - if (agno >= mp->m_sb.sb_agcount || - *lastino != XFS_AGINO_TO_INO(mp, agno, agino)) - return error; + breq->ocount = 0; - bcount = min(left, (int)(PAGE_SIZE / sizeof(*buffer))); - buffer = kmem_zalloc(bcount * sizeof(*buffer), KM_SLEEP); - do { - struct xfs_inobt_rec_incore r; - int stat; - - if (!agbp) { - error = xfs_ialloc_read_agi(mp, NULL, agno, &agbp); - if (error) - break; - - cur = xfs_inobt_init_cursor(mp, NULL, agbp, agno, - XFS_BTNUM_INO); - error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_GE, - &stat); - if (error) - break; - if (!stat) - goto next_ag; - } - - error = xfs_inobt_get_rec(cur, &r, &stat); - if (error) - break; - if (!stat) - goto next_ag; - - agino = r.ir_startino + XFS_INODES_PER_CHUNK - 1; - buffer[bufidx].xi_startino = - XFS_AGINO_TO_INO(mp, agno, r.ir_startino); - buffer[bufidx].xi_alloccount = r.ir_count - r.ir_freecount; - buffer[bufidx].xi_allocmask = ~r.ir_free; - if (++bufidx == bcount) { - long written; - - error = formatter(ubuffer, buffer, bufidx, &written); - if (error) - break; - ubuffer += written; - *count += bufidx; - bufidx = 0; - } - if (!--left) - break; - - error = xfs_btree_increment(cur, 0, &stat); - if (error) - break; - if (stat) - continue; - -next_ag: - xfs_btree_del_cursor(cur, XFS_BTREE_ERROR); - cur = NULL; - xfs_buf_relse(agbp); - agbp = NULL; - agino = 0; - agno++; - } while (agno < mp->m_sb.sb_agcount); - - if (!error) { - if (bufidx) { - long written; - - error = formatter(ubuffer, buffer, bufidx, &written); - if (!error) - *count += bufidx; - } - *lastino = XFS_AGINO_TO_INO(mp, agno, agino); - } + if 
(xfs_bulkstat_already_done(breq->mp, breq->startino)) + return 0; + + error = xfs_inobt_walk(breq->mp, NULL, breq->startino, + xfs_inumbers_walk, breq->icount, &ic); - kmem_free(buffer); - if (cur) - xfs_btree_del_cursor(cur, error); - if (agbp) - xfs_buf_relse(agbp); + /* + * We found some inode groups, so clear the error status and return + * them. The lastino pointer will point directly at the inode that + * triggered any error that occurred, so on the next call the error + * will be triggered again and propagated to userspace as there will be + * no formatted inode groups in the buffer. + */ + if (breq->ocount > 0) + error = 0; return error; } diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h index a2562fe8d282..b4c89454e27a 100644 --- a/fs/xfs/xfs_itable.h +++ b/fs/xfs/xfs_itable.h @@ -46,25 +46,9 @@ typedef int (*bulkstat_one_fmt_pf)(struct xfs_ibulk *breq, int xfs_bulkstat_one(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); int xfs_bulkstat(struct xfs_ibulk *breq, bulkstat_one_fmt_pf formatter); -typedef int (*inumbers_fmt_pf)( - void __user *ubuffer, /* buffer to write to */ - const xfs_inogrp_t *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written); /* # of bytes written */ +typedef int (*inumbers_fmt_pf)(struct xfs_ibulk *breq, + const struct xfs_inogrp *igrp); -int -xfs_inumbers_fmt( - void __user *ubuffer, /* buffer to write to */ - const xfs_inogrp_t *buffer, /* buffer to read from */ - long count, /* # of elements to read */ - long *written); /* # of bytes written */ - -int /* error status */ -xfs_inumbers( - xfs_mount_t *mp, /* mount point for filesystem */ - xfs_ino_t *last, /* last inode returned */ - int *count, /* size of buffer/count returned */ - void __user *buffer, /* buffer with inode info */ - inumbers_fmt_pf formatter); +int xfs_inumbers(struct xfs_ibulk *breq, inumbers_fmt_pf formatter); #endif /* __XFS_ITABLE_H__ */ diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 7f40d0633651..afa4b22ffb3d 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -54,7 +54,10 @@ struct xfs_iwalk_ag { unsigned int nr_recs; /* Inode walk function and data pointer. */ - xfs_iwalk_fn iwalk_fn; + union { + xfs_iwalk_fn iwalk_fn; + xfs_inobt_walk_fn inobt_walk_fn; + }; void *data; }; @@ -94,16 +97,18 @@ xfs_iwalk_ichunk_ra( /* * Lookup the inode chunk that the given @agino lives in and then get the - * record if we found the chunk. Set the bits in @irec's free mask that - * correspond to the inodes before @agino so that we skip them. This is how we - * restart an inode walk that was interrupted in the middle of an inode record. + * record if we found the chunk. If @trim is set, set the bits in @irec's free + * mask that correspond to the inodes before @agino so that we skip them. + * This is how we restart an inode walk that was interrupted in the middle of + * an inode record. */ STATIC int xfs_iwalk_grab_ichunk( struct xfs_btree_cur *cur, /* btree cursor */ xfs_agino_t agino, /* starting inode of chunk */ int *icount,/* return # of inodes grabbed */ - struct xfs_inobt_rec_incore *irec) /* btree record */ + struct xfs_inobt_rec_incore *irec, /* btree record */ + bool trim) { int idx; /* index into inode chunk */ int stat; @@ -131,6 +136,12 @@ xfs_iwalk_grab_ichunk( return 0; } + /* Return the entire record if the caller wants the whole thing. 
*/ + if (!trim) { + *icount = irec->ir_count; + return 0; + } + idx = agino - irec->ir_startino; /* @@ -278,7 +289,8 @@ xfs_iwalk_ag_start( xfs_agino_t agino, struct xfs_btree_cur **curpp, struct xfs_buf **agi_bpp, - int *has_more) + int *has_more, + bool trim) { struct xfs_mount *mp = iwag->mp; struct xfs_trans *tp = iwag->tp; @@ -302,7 +314,7 @@ xfs_iwalk_ag_start( * have to deal with tearing down the cursor to walk the records. */ error = xfs_iwalk_grab_ichunk(*curpp, agino, &icount, - &iwag->recs[iwag->nr_recs]); + &iwag->recs[iwag->nr_recs], trim); if (error) return error; if (icount) @@ -377,7 +389,8 @@ xfs_iwalk_ag( /* Set up our cursor at the right place in the inode btree. */ agno = XFS_INO_TO_AGNO(mp, iwag->startino); agino = XFS_INO_TO_AGINO(mp, iwag->startino); - error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more, + true); if (error) goto out_cur; @@ -565,3 +578,129 @@ xfs_iwalk_threaded( return xfs_pwork_destroy_poll(&pctl); return xfs_pwork_destroy(&pctl); } + +/* For each inuse inode in each cached inobt record, call our function. */ +STATIC int +xfs_inobt_walk_ag_recs( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_trans *tp = iwag->tp; + struct xfs_inobt_rec_incore *irec; + unsigned int i; + xfs_agnumber_t agno; + int error; + + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + for (i = 0, irec = iwag->recs; i < iwag->nr_recs; i++, irec++) { + trace_xfs_iwalk_ag_rec(mp, agno, irec->ir_startino, + irec->ir_free); + error = iwag->inobt_walk_fn(mp, tp, agno, irec, iwag->data); + if (error) + return error; + } + + iwag->nr_recs = 0; + return 0; +} + +/* + * Walk all inode btree records in a single AG, from @iwag->startino to the end + * of the AG. + */ +STATIC int +xfs_inobt_walk_ag( + struct xfs_iwalk_ag *iwag) +{ + struct xfs_mount *mp = iwag->mp; + struct xfs_buf *agi_bp = NULL; + struct xfs_btree_cur *cur = NULL; + xfs_agnumber_t agno; + xfs_agino_t agino; + int has_more; + int error = 0; + + /* Set up our cursor at the right place in the inode btree. */ + agno = XFS_INO_TO_AGNO(mp, iwag->startino); + agino = XFS_INO_TO_AGINO(mp, iwag->startino); + error = xfs_iwalk_ag_start(iwag, agno, agino, &cur, &agi_bp, &has_more, + false); + if (error) + goto out_cur; + + while (has_more && !xfs_pwork_want_abort(&iwag->pwork)) { + struct xfs_inobt_rec_incore *irec; + + /* Fetch the inobt record. */ + irec = &iwag->recs[iwag->nr_recs]; + error = xfs_inobt_get_rec(cur, irec, &has_more); + if (error) + goto out_cur; + if (!has_more) + break; + + /* + * Add this inobt record to our cache, flush the cache if + * needed, and move on to the next record. + */ + error = xfs_iwalk_ag_increment(iwag, xfs_inobt_walk_ag_recs, + agno, &cur, &agi_bp, &has_more); + if (error) + goto out_cur; + cond_resched(); + } + + /* Walk any records left behind in the cache. */ + if (iwag->nr_recs && !xfs_pwork_want_abort(&iwag->pwork)) { + xfs_iwalk_del_inobt(iwag->tp, &cur, &agi_bp, error); + return xfs_inobt_walk_ag_recs(iwag); + } + +out_cur: + xfs_iwalk_del_inobt(iwag->tp, &cur, &agi_bp, error); + return error; +} + +/* + * Walk all inode btree records in the filesystem starting from @startino. The + * @inobt_walk_fn will be called for each btree record, being passed the incore + * record and @data. @max_prefetch controls how many inobt records we try to + * cache ahead of time. 
+ */ +int +xfs_inobt_walk( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_ino_t startino, + xfs_inobt_walk_fn inobt_walk_fn, + unsigned int max_prefetch, + void *data) +{ + struct xfs_iwalk_ag iwag = { + .mp = mp, + .tp = tp, + .inobt_walk_fn = inobt_walk_fn, + .data = data, + .startino = startino, + .pwork = XFS_PWORK_SINGLE_THREADED, + }; + xfs_agnumber_t agno = XFS_INO_TO_AGNO(mp, startino); + int error; + + ASSERT(agno < mp->m_sb.sb_agcount); + + xfs_iwalk_set_prefetch(&iwag, max_prefetch * XFS_INODES_PER_CHUNK); + error = xfs_iwalk_allocbuf(&iwag); + if (error) + return error; + + for (; agno < mp->m_sb.sb_agcount; agno++) { + error = xfs_inobt_walk_ag(&iwag); + if (error) + break; + iwag.startino = XFS_AGINO_TO_INO(mp, agno + 1, 0); + } + + xfs_iwalk_freebuf(&iwag); + return error; +} diff --git a/fs/xfs/xfs_iwalk.h b/fs/xfs/xfs_iwalk.h index 76d8f87a39ef..20bee93d4676 100644 --- a/fs/xfs/xfs_iwalk.h +++ b/fs/xfs/xfs_iwalk.h @@ -18,4 +18,16 @@ int xfs_iwalk_threaded(struct xfs_mount *mp, xfs_ino_t startino, xfs_iwalk_fn iwalk_fn, unsigned int max_prefetch, bool poll, void *data); +/* Walk all inode btree records in the filesystem starting from @startino. */ +typedef int (*xfs_inobt_walk_fn)(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_agnumber_t agno, + const struct xfs_inobt_rec_incore *irec, + void *data); +/* Return value (for xfs_inobt_walk_fn) that aborts the walk immediately. */ +#define XFS_INOBT_WALK_ABORT (XFS_IWALK_ABORT) + +int xfs_inobt_walk(struct xfs_mount *mp, struct xfs_trans *tp, + xfs_ino_t startino, xfs_inobt_walk_fn inobt_walk_fn, + unsigned int max_prefetch, void *data); + #endif /* __XFS_IWALK_H__ */
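For reference, the translation that xfs_inumbers_walk performs from an incore inobt record to the userspace inode group is mechanical: the allocated count is ir_count minus ir_freecount, and the allocation mask is the complement of the free mask. A standalone sketch with simplified types (field names mirror the kernel's, but the layout here is illustrative only):

#include <stdint.h>
#include <stdio.h>

/* Simplified shapes of the incore inobt record and the INUMBERS group. */
struct inobt_rec {
	uint32_t ir_startino;	/* first inode in the 64-inode chunk */
	uint32_t ir_count;	/* inodes tracked by this record */
	uint32_t ir_freecount;	/* how many of those are free */
	uint64_t ir_free;	/* bitmap, set bit == free inode */
};

struct inogrp {
	uint64_t xi_startino;
	uint32_t xi_alloccount;
	uint64_t xi_allocmask;
};

static void rec_to_inogrp(const struct inobt_rec *irec, struct inogrp *ig)
{
	ig->xi_startino = irec->ir_startino;	/* plus the AG offset in XFS */
	ig->xi_alloccount = irec->ir_count - irec->ir_freecount;
	ig->xi_allocmask = ~irec->ir_free;	/* set bit == allocated */
}

int main(void)
{
	struct inobt_rec irec = {
		.ir_startino = 128, .ir_count = 64,
		.ir_freecount = 2, .ir_free = 0x3,	/* first two free */
	};
	struct inogrp ig;

	rec_to_inogrp(&irec, &ig);
	printf("start %llu alloc %u mask %#llx\n",
	       (unsigned long long)ig.xi_startino, (unsigned)ig.xi_alloccount,
	       (unsigned long long)ig.xi_allocmask);
	return 0;
}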