From patchwork Sat Jun 6 08:27:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chandan Babu R
X-Patchwork-Id: 11591091
From: Chandan Babu R
To: linux-xfs@vger.kernel.org
Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org, Dave Chinner
Subject: [PATCH 1/7] xfs: Fix log reservation calculation for xattr insert operation
Date: Sat, 6 Jun 2020 13:57:39 +0530
Message-Id: <20200606082745.15174-2-chandanrlinux@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com>
References: <20200606082745.15174-1-chandanrlinux@gmail.com>
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

Log space reservation for xattr insert operation is divided into two parts,
1. Mount time
   - Inode
   - Superblock for accounting space allocations
   - AGF for accounting space used by count, block number, rmap and refcnt
     btrees.

2.
The remaining log space can only be calculated at run time because,
   - A local xattr can be large enough to cause a double split of the da
     btree.
   - The value of the xattr can be large enough to be stored in remote
     blocks. The contents of the remote blocks are not logged.

   The log space reservation could be,
   - (XFS_DA_NODE_MAXDEPTH + 1) number of blocks. The "+ 1" is required in
     case xattr is large enough to cause another split of the da btree path.
   - BMBT blocks for storing (XFS_DA_NODE_MAXDEPTH + 1) record entries.
   - Space for logging blocks of count, block number, rmap and refcnt btrees.

At present, mount time log reservation includes block count required for a
single split of the dabtree. The dabtree block count is also taken into
account by xfs_attr_calc_size().

Also, AGF log space reservation isn't accounted for.

Due to the reasons mentioned above, log reservation calculation for xattr
insert operation gives an incorrect value.

Apart from the above, xfs_log_calc_max_attrsetm_res() passes byte count as
an argument to XFS_NEXTENTADD_SPACE_RES() instead of block count.

The above mentioned inconsistencies were discovered when mounting a
modified XFS filesystem which uses a 32-bit value as the xattr extent
counter. The mount caused the following warning messages to be printed on
the console,
XFS (loop0): Mounting V4 Filesystem
XFS (loop0): Log size 2560 blocks too small, minimum size is 4035 blocks
XFS (loop0): Log size out of supported range.
XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
XFS (loop0): Ending clean mount

To fix the inconsistencies described above, this commit replaces 'mount'
and 'runtime' components with just one static reservation. The new
reservation calculates the log space for the worst case possible i.e. it
considers,
1. Double split of the da btree.
   This happens for large local xattrs.
2. Bmbt blocks required for mapping the contents of a maximum sized (i.e.
XATTR_SIZE_MAX bytes in size) remote attribute. Suggested-by: Dave Chinner Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_attr.c | 6 +--- fs/xfs/libxfs/xfs_log_rlimit.c | 29 ------------------ fs/xfs/libxfs/xfs_trans_resv.c | 54 +++++++++++++++------------------ fs/xfs/libxfs/xfs_trans_resv.h | 5 +-- fs/xfs/libxfs/xfs_trans_space.h | 7 ++++- 5 files changed, 32 insertions(+), 69 deletions(-) diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index 3b1bd6e112f8..a4b23edf887e 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -337,11 +337,7 @@ xfs_attr_set( return error; } - tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres + - M_RES(mp)->tr_attrsetrt.tr_logres * - args->total; - tres.tr_logcount = XFS_ATTRSET_LOG_COUNT; - tres.tr_logflags = XFS_TRANS_PERM_LOG_RES; + tres = M_RES(mp)->tr_attrset; total = args->total; } else { XFS_STATS_INC(mp, xs_attr_remove); diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c index 7f55eb3f3653..7aa9e6684ecd 100644 --- a/fs/xfs/libxfs/xfs_log_rlimit.c +++ b/fs/xfs/libxfs/xfs_log_rlimit.c @@ -15,27 +15,6 @@ #include "xfs_da_btree.h" #include "xfs_bmap_btree.h" -/* - * Calculate the maximum length in bytes that would be required for a local - * attribute value as large attributes out of line are not logged. 
- */ -STATIC int -xfs_log_calc_max_attrsetm_res( - struct xfs_mount *mp) -{ - int size; - int nblks; - - size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) - - MAXNAMELEN - 1; - nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK); - nblks += XFS_B_TO_FSB(mp, size); - nblks += XFS_NEXTENTADD_SPACE_RES(mp, size, XFS_ATTR_FORK); - - return M_RES(mp)->tr_attrsetm.tr_logres + - M_RES(mp)->tr_attrsetrt.tr_logres * nblks; -} - /* * Iterate over the log space reservation table to figure out and return * the maximum one in terms of the pre-calculated values which were done @@ -49,9 +28,6 @@ xfs_log_get_max_trans_res( struct xfs_trans_res *resp; struct xfs_trans_res *end_resp; int log_space = 0; - int attr_space; - - attr_space = xfs_log_calc_max_attrsetm_res(mp); resp = (struct xfs_trans_res *)M_RES(mp); end_resp = (struct xfs_trans_res *)(M_RES(mp) + 1); @@ -64,11 +40,6 @@ xfs_log_get_max_trans_res( *max_resp = *resp; /* struct copy */ } } - - if (attr_space > log_space) { - *max_resp = M_RES(mp)->tr_attrsetm; /* struct copy */ - max_resp->tr_logres = attr_space; - } } /* diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index d1a0848cb52e..b44b521c605c 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -19,6 +19,7 @@ #include "xfs_trans.h" #include "xfs_qm.h" #include "xfs_trans_space.h" +#include "xfs_attr_remote.h" #define _ALLOC true #define _FREE false @@ -698,42 +699,36 @@ xfs_calc_attrinval_reservation( } /* - * Setting an attribute at mount time. + * Setting an attribute. * the inode getting the attribute * the superblock for allocations - * the agfs extents are allocated from + * the agf extents are allocated from * the attribute btree * max depth - * the inode allocation btree - * Since attribute transaction space is dependent on the size of the attribute, - * the calculation is done partially at mount time and partially at runtime(see - * below). 
+ * the bmbt entries for da btree blocks + * the bmbt entries for remote blocks (if any) + * the allocation btrees. */ STATIC uint -xfs_calc_attrsetm_reservation( +xfs_calc_attrset_reservation( struct xfs_mount *mp) { + int max_rmt_blks; + int da_blks; + int bmbt_blks; + + da_blks = XFS_DAENTER_BLOCKS(mp, XFS_ATTR_FORK); + bmbt_blks = XFS_DAENTER_BMAPS(mp, XFS_ATTR_FORK); + + max_rmt_blks = xfs_attr3_rmt_blocks(mp, XATTR_SIZE_MAX); + bmbt_blks += XFS_NEXTENTADD_SPACE_RES(mp, max_rmt_blks, XFS_ATTR_FORK); + return XFS_DQUOT_LOGRES(mp) + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH, XFS_FSB_TO_B(mp, 1)); -} - -/* - * Setting an attribute at runtime, transaction space unit per block. - * the superblock for allocations: sector size - * the inode bmap btree could join or split: max depth * block size - * Since the runtime attribute transaction space is dependent on the total - * blocks needed for the 1st bmap, here we calculate out the space unit for - * one block so that the caller could figure out the total space according - * to the attibute extent length in blocks by: - * ext * M_RES(mp)->tr_attrsetrt.tr_logres - */ -STATIC uint -xfs_calc_attrsetrt_reservation( - struct xfs_mount *mp) -{ - return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK), + xfs_calc_buf_res(1, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(da_blks, XFS_FSB_TO_B(mp, 1)) + + xfs_calc_buf_res(bmbt_blks, XFS_FSB_TO_B(mp, 1)) + + xfs_calc_buf_res(xfs_allocfree_log_count(mp, da_blks), XFS_FSB_TO_B(mp, 1)); } @@ -897,9 +892,9 @@ xfs_trans_resv_calc( resp->tr_attrinval.tr_logcount = XFS_ATTRINVAL_LOG_COUNT; resp->tr_attrinval.tr_logflags |= XFS_TRANS_PERM_LOG_RES; - resp->tr_attrsetm.tr_logres = xfs_calc_attrsetm_reservation(mp); - resp->tr_attrsetm.tr_logcount = XFS_ATTRSET_LOG_COUNT; - resp->tr_attrsetm.tr_logflags |= XFS_TRANS_PERM_LOG_RES; + resp->tr_attrset.tr_logres = 
xfs_calc_attrset_reservation(mp); + resp->tr_attrset.tr_logcount = XFS_ATTRSET_LOG_COUNT; + resp->tr_attrset.tr_logflags |= XFS_TRANS_PERM_LOG_RES; resp->tr_attrrm.tr_logres = xfs_calc_attrrm_reservation(mp); resp->tr_attrrm.tr_logcount = XFS_ATTRRM_LOG_COUNT; @@ -942,7 +937,6 @@ xfs_trans_resv_calc( resp->tr_ichange.tr_logres = xfs_calc_ichange_reservation(mp); resp->tr_fsyncts.tr_logres = xfs_calc_swrite_reservation(mp); resp->tr_writeid.tr_logres = xfs_calc_writeid_reservation(mp); - resp->tr_attrsetrt.tr_logres = xfs_calc_attrsetrt_reservation(mp); resp->tr_clearagi.tr_logres = xfs_calc_clear_agi_bucket_reservation(mp); resp->tr_growrtzero.tr_logres = xfs_calc_growrtzero_reservation(mp); resp->tr_growrtfree.tr_logres = xfs_calc_growrtfree_reservation(mp); diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h index 7241ab28cf84..f50996ae18e6 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.h +++ b/fs/xfs/libxfs/xfs_trans_resv.h @@ -35,10 +35,7 @@ struct xfs_trans_resv { struct xfs_trans_res tr_writeid; /* write setuid/setgid file */ struct xfs_trans_res tr_attrinval; /* attr fork buffer * invalidation */ - struct xfs_trans_res tr_attrsetm; /* set/create an attribute at - * mount time */ - struct xfs_trans_res tr_attrsetrt; /* set/create an attribute at - * runtime */ + struct xfs_trans_res tr_attrset; /* set/create an attribute */ struct xfs_trans_res tr_attrrm; /* remove an attribute */ struct xfs_trans_res tr_clearagi; /* clear agi unlinked bucket */ struct xfs_trans_res tr_growrtalloc; /* grow realtime allocations */ diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h index 88221c7a04cc..b559af70cf51 100644 --- a/fs/xfs/libxfs/xfs_trans_space.h +++ b/fs/xfs/libxfs/xfs_trans_space.h @@ -38,8 +38,13 @@ #define XFS_DAENTER_1B(mp,w) \ ((w) == XFS_DATA_FORK ? (mp)->m_dir_geo->fsbcount : 1) +/* + * xattr set operation can cause the da btree to split once from the + * root to leaf and also allocate an extra leaf node. 
The '1' in the + * macro below accounts for the extra leaf node. + */ #define XFS_DAENTER_DBS(mp,w) \ - (XFS_DA_NODE_MAXDEPTH + (((w) == XFS_DATA_FORK) ? 2 : 0)) + (XFS_DA_NODE_MAXDEPTH + (((w) == XFS_DATA_FORK) ? 2 : 1)) #define XFS_DAENTER_BLOCKS(mp,w) \ (XFS_DAENTER_1B(mp,w) * XFS_DAENTER_DBS(mp,w)) #define XFS_DAENTER_BMAP1B(mp,w) \

From patchwork Sat Jun 6 08:27:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chandan Babu R
X-Patchwork-Id: 11591093
From: Chandan Babu R
To: linux-xfs@vger.kernel.org
Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org
Subject: [PATCH 2/7] xfs: Check for per-inode extent count overflow
Date: Sat, 6 Jun 2020 13:57:40 +0530
Message-Id: <20200606082745.15174-3-chandanrlinux@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com>
References: <20200606082745.15174-1-chandanrlinux@gmail.com>
MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org The following error message was noticed when a workload added one million xattrs, deleted 50% of them and then inserted 400,000 new xattrs. XFS (loop0): xfs_iflush_int: detected corrupt incore inode 131, total extents = -19916, nblocks = 102937, ptr ffff9ce33b098c00 The error message was printed during unmounting the filesystem. The value printed under "total extents" indicates that we overflowed the per-inode signed 16-bit xattr extent counter. Instead of letting this silent corruption occur, this patch checks for extent counter (both data and xattr) overflow before we assign the new value to the corresponding in-memory extent counter. Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_bmap.c | 92 +++++++++++++++++++++++++++------- fs/xfs/libxfs/xfs_inode_fork.c | 29 +++++++++++ fs/xfs/libxfs/xfs_inode_fork.h | 1 + 3 files changed, 104 insertions(+), 18 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index edc63dba007f..798fca5c52af 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -906,7 +906,10 @@ xfs_bmap_local_to_extents( xfs_iext_first(ifp, &icur); xfs_iext_insert(ip, &icur, &rec, 0); - ifp->if_nextents = 1; + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto done; + ip->i_d.di_nblocks = 1; xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, 1L); @@ -1594,7 +1597,10 @@ xfs_bmap_add_extent_delay_real( xfs_iext_remove(bma->ip, &bma->icur, state); xfs_iext_prev(ifp, &bma->icur); xfs_iext_update_extent(bma->ip, state, &bma->icur, &LEFT); - ifp->if_nextents--; + + error = xfs_next_set(bma->ip, whichfork, -1); + if (error) + goto done; if (bma->cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -1698,7 +1704,10 @@ xfs_bmap_add_extent_delay_real( PREV.br_startblock = new->br_startblock; PREV.br_state = new->br_state; xfs_iext_update_extent(bma->ip, state, &bma->icur, &PREV); 
- ifp->if_nextents++; + + error = xfs_next_set(bma->ip, whichfork, 1); + if (error) + goto done; if (bma->cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -1764,7 +1773,10 @@ xfs_bmap_add_extent_delay_real( * The left neighbor is not contiguous. */ xfs_iext_update_extent(bma->ip, state, &bma->icur, new); - ifp->if_nextents++; + + error = xfs_next_set(bma->ip, whichfork, 1); + if (error) + goto done; if (bma->cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -1851,7 +1863,10 @@ xfs_bmap_add_extent_delay_real( * The right neighbor is not contiguous. */ xfs_iext_update_extent(bma->ip, state, &bma->icur, new); - ifp->if_nextents++; + + error = xfs_next_set(bma->ip, whichfork, 1); + if (error) + goto done; if (bma->cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -1937,7 +1952,10 @@ xfs_bmap_add_extent_delay_real( xfs_iext_next(ifp, &bma->icur); xfs_iext_insert(bma->ip, &bma->icur, &RIGHT, state); xfs_iext_insert(bma->ip, &bma->icur, &LEFT, state); - ifp->if_nextents++; + + error = xfs_next_set(bma->ip, whichfork, 1); + if (error) + goto done; if (bma->cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -2141,7 +2159,11 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &LEFT); - ifp->if_nextents -= 2; + + error = xfs_next_set(ip, whichfork, -2); + if (error) + goto done; + if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; else { @@ -2193,7 +2215,11 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &LEFT); - ifp->if_nextents--; + + error = xfs_next_set(ip, whichfork, -1); + if (error) + goto done; + if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; else { @@ -2235,7 +2261,10 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &PREV); - ifp->if_nextents--; + + error = xfs_next_set(ip, 
whichfork, -1); + if (error) + goto done; if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -2343,7 +2372,10 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_update_extent(ip, state, icur, &PREV); xfs_iext_insert(ip, icur, new, state); - ifp->if_nextents++; + + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto done; if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -2419,7 +2451,10 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_update_extent(ip, state, icur, &PREV); xfs_iext_next(ifp, icur); xfs_iext_insert(ip, icur, new, state); - ifp->if_nextents++; + + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto done; if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -2471,7 +2506,10 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_next(ifp, icur); xfs_iext_insert(ip, icur, &r[1], state); xfs_iext_insert(ip, icur, &r[0], state); - ifp->if_nextents += 2; + + error = xfs_next_set(ip, whichfork, 2); + if (error) + goto done; if (cur == NULL) rval = XFS_ILOG_CORE | XFS_ILOG_DEXT; @@ -2787,7 +2825,10 @@ xfs_bmap_add_extent_hole_real( xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &left); - ifp->if_nextents--; + + error = xfs_next_set(ip, whichfork, -1); + if (error) + goto done; if (cur == NULL) { rval = XFS_ILOG_CORE | xfs_ilog_fext(whichfork); @@ -2886,7 +2927,10 @@ xfs_bmap_add_extent_hole_real( * Insert a new entry. 
*/ xfs_iext_insert(ip, icur, new, state); - ifp->if_nextents++; + + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto done; if (cur == NULL) { rval = XFS_ILOG_CORE | xfs_ilog_fext(whichfork); @@ -5083,7 +5127,10 @@ xfs_bmap_del_extent_real( */ xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); - ifp->if_nextents--; + + error = xfs_next_set(ip, whichfork, -1); + if (error) + goto done; flags |= XFS_ILOG_CORE; if (!cur) { @@ -5193,7 +5240,10 @@ xfs_bmap_del_extent_real( } else flags |= xfs_ilog_fext(whichfork); - ifp->if_nextents++; + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto done; + xfs_iext_next(ifp, icur); xfs_iext_insert(ip, icur, &new, state); break; @@ -5660,7 +5710,10 @@ xfs_bmse_merge( * Update the on-disk extent count, the btree if necessary and log the * inode. */ - ifp->if_nextents--; + error = xfs_next_set(ip, whichfork, -1); + if (error) + goto done; + *logflags |= XFS_ILOG_CORE; if (!cur) { *logflags |= XFS_ILOG_DEXT; @@ -6047,7 +6100,10 @@ xfs_bmap_split_extent( /* Add new extent */ xfs_iext_next(ifp, &icur); xfs_iext_insert(ip, &icur, &new, 0); - ifp->if_nextents++; + + error = xfs_next_set(ip, whichfork, 1); + if (error) + goto del_cursor; if (cur) { error = xfs_bmbt_lookup_eq(cur, &new, &i); diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 28b366275ae0..3bf5a2c391bd 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -728,3 +728,32 @@ xfs_ifork_verify_local_attr( return 0; } + +int +xfs_next_set( + struct xfs_inode *ip, + int whichfork, + int delta) +{ + struct xfs_ifork *ifp; + int64_t nr_exts; + int64_t max_exts; + + ifp = XFS_IFORK_PTR(ip, whichfork); + + if (whichfork == XFS_DATA_FORK || whichfork == XFS_COW_FORK) + max_exts = MAXEXTNUM; + else if (whichfork == XFS_ATTR_FORK) + max_exts = MAXAEXTNUM; + else + ASSERT(0); + + nr_exts = ifp->if_nextents + delta; + if ((delta > 0 && nr_exts > max_exts) + || (delta < 0 && nr_exts < 0)) + return 
-EOVERFLOW; + + ifp->if_nextents = nr_exts; + + return 0; +} diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h index a4953e95c4f3..a84ae42ace79 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.h +++ b/fs/xfs/libxfs/xfs_inode_fork.h @@ -173,4 +173,5 @@ extern void xfs_ifork_init_cow(struct xfs_inode *ip); int xfs_ifork_verify_local_data(struct xfs_inode *ip); int xfs_ifork_verify_local_attr(struct xfs_inode *ip); +int xfs_next_set(struct xfs_inode *ip, int whichfork, int delta); #endif /* __XFS_INODE_FORK_H__ */

From patchwork Sat Jun 6 08:27:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chandan Babu R
X-Patchwork-Id: 11591095
From: Chandan Babu R
To: linux-xfs@vger.kernel.org
Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org
Subject: [PATCH 3/7] xfs: Compute maximum height of directory BMBT separately
Date: Sat, 6 Jun 2020 13:57:41 +0530
Message-Id: <20200606082745.15174-4-chandanrlinux@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com>
References: <20200606082745.15174-1-chandanrlinux@gmail.com>
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

xfs/306 causes the following call trace when using a data fork with a
maximum extent count of 2^47,

 XFS (loop0): Mounting V5 Filesystem
 XFS (loop0): Log size 8906 blocks too small, minimum size is 9075 blocks
 XFS (loop0): AAIEEE! Log failed size checks. Abort!
 XFS: Assertion failed: 0, file: fs/xfs/xfs_log.c, line: 711
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 12821 at fs/xfs/xfs_message.c:112 assfail+0x25/0x28
 Modules linked in:
 CPU: 0 PID: 12821 Comm: mount Tainted: G W 5.6.0-rc6-next-20200320-chandan-00003-g071c2af3f4de #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
 RIP: 0010:assfail+0x25/0x28
 Code: ff ff 0f 0b c3 0f 1f 44 00 00 41 89 c8 48 89 d1 48 89 f2 48 c7 c6 40 b7 4b b3 e8 82 f9 ff ff 80 3d 83 d6 64 01 00 74 02 0f $
 RSP: 0018:ffffb05b414cbd78 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff9d9d501d5000 RCX: 0000000000000000
 RDX: 00000000ffffffc0 RSI: 000000000000000a RDI: ffffffffb346dc65
 RBP: ffff9da444b49a80 R08: 0000000000000000 R09: 0000000000000000
 R10: 000000000000000a R11: f000000000000000 R12: 00000000ffffffea
 R13: 000000000000000e R14: 0000000000004594 R15: ffff9d9d501d5628
 FS: 00007fd6c5d17c80(0000) GS:ffff9da44d800000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000002 CR3: 00000008a48c0000 CR4: 00000000000006f0
 Call Trace:
  xfs_log_mount+0xf8/0x300
  xfs_mountfs+0x46e/0x950
  xfs_fc_fill_super+0x318/0x510
  ? xfs_mount_free+0x30/0x30
  get_tree_bdev+0x15c/0x250
  vfs_get_tree+0x25/0xb0
  do_mount+0x740/0x9b0
  ? memdup_user+0x41/0x80
  __x64_sys_mount+0x8e/0xd0
  do_syscall_64+0x48/0x110
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fd6c5f2ccda
 Code: 48 8b 0d b9 e1 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f $
 RSP: 002b:00007ffe00dfb9f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
 RAX: ffffffffffffffda RBX: 0000560c1aaa92c0 RCX: 00007fd6c5f2ccda
 RDX: 0000560c1aaae110 RSI: 0000560c1aaad040 RDI: 0000560c1aaa94d0
 RBP: 00007fd6c607d204 R08: 0000000000000000 R09: 0000560c1aaadde0
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000560c1aaa94d0 R15: 0000560c1aaae110
 ---[ end trace 6436391b468bc652 ]---
 XFS (loop0): log mount failed

The corresponding filesystem was created using mkfs options
"-m rmapbt=1,reflink=1 -b size=1k -d size=20m -n size=64k". That is, we
have a filesystem of size 20MiB, data block size of 1KiB and directory
block size of 64KiB. Filesystems of size < 1GiB can have less than 10MiB
on-disk log (please refer to calculate_log_size() in xfsprogs).

The largest reservation space was contributed by the rename operation. The
corresponding calculation is done inside xfs_calc_rename_reservation(). In
this case, the value returned by this function is,

xfs_calc_inode_res(mp, 4)
+ xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))

xfs_calc_inode_res(mp, 4) returns a constant value of 3040 bytes regardless
of the maximum data fork extent count.

The largest contribution to the rename operation was by "2 *
XFS_DIROP_LOG_COUNT(mp)" and it is a function of maximum height of a
directory's BMBT tree.

XFS_DIROP_LOG_COUNT() is a sum of,

1. The maximum number of dabtree blocks that need to be logged
   i.e. XFS_DAENTER_BLOCKS() = XFS_DAENTER_1B(mp,w) * XFS_DAENTER_DBS(mp,w).
   For directories, this evaluates to (64 * (XFS_DA_NODE_MAXDEPTH + 2)) =
   (64 * (5 + 2)) = 448.

2. The corresponding maximum number of BMBT blocks that need to be logged
   i.e.
   XFS_DAENTER_BMAPS() = XFS_DAENTER_DBS(mp,w) * XFS_DAENTER_BMAP1B(mp,w)

   XFS_DAENTER_DBS(mp,w) = XFS_DA_NODE_MAXDEPTH + 2 = 7

   XFS_DAENTER_BMAP1B(mp,w)
     = XFS_NEXTENTADD_SPACE_RES(mp, XFS_DAENTER_1B(mp, w), w)
     = XFS_NEXTENTADD_SPACE_RES(mp, 64, w)
     = ((64 + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) /
        XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) *
       XFS_EXTENTADD_SPACE_RES(mp, w)

   XFS_MAX_CONTIG_EXTENTS_PER_BLOCK()
     = mp->m_alloc_mxr[0] - mp->m_alloc_mnr[0]
     = 121 - 60
     = 61

   XFS_DAENTER_BMAP1B(mp,w)
     = ((64 + 61 - 1) / 61) * XFS_EXTENTADD_SPACE_RES(mp, w)
     = 2 * XFS_EXTENTADD_SPACE_RES(mp, w)
     = 2 * (XFS_BM_MAXLEVELS(mp,w) - 1)
     = 2 * (8 - 1)
     = 14

With 2^32 as the maximum extent count, the maximum height of the bmap btree
was 7. Now, with a 2^47 maximum extent count, the height has increased to 8.
Therefore,

  XFS_DAENTER_BMAPS() = 7 * 14 = 98
  XFS_DIROP_LOG_COUNT() = 448 + 98 + 1 = 547
  2 * XFS_DIROP_LOG_COUNT() = 2 * 547 = 1094

The "+ 1" is the extra block that the XFS_DIROP_LOG_COUNT() macro adds on
top of the two terms listed above.

With a 2^32 maximum extent count, XFS_DIROP_LOG_COUNT() evaluates to
448 + 84 + 1 = 533. Hence 2 * XFS_DIROP_LOG_COUNT() = 2 * 533 = 1066. This
small difference of 1094 - 1066 = 28 fs blocks is sufficient to trip the
minimum log size check.

A future commit in this series will use 2^27 as the maximum directory extent
count. This will result in a shorter directory BMBT. Log reservation
calculations that are applicable only to directories (e.g.
XFS_DIROP_LOG_COUNT()) can then use this height instead of the non-dir data
fork BMBT height.

This commit introduces a new member in 'struct xfs_mount' to hold the
maximum BMBT height of a directory. At present, the maximum height of a
directory BMBT is the same as the maximum height of a non-directory BMBT. A
future commit will change the parameters used as input for computing the
maximum height of a directory BMBT.
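As a rough cross-check of the arithmetic above, the following sketch
recomputes XFS_DIROP_LOG_COUNT() for a given BMBT height. It is not kernel
code: dirop_log_count() and the three constants are hypothetical names,
hard-coded from the example geometry (1KiB fs blocks, 64KiB directory
blocks), and the result includes the trailing "+ 1" from the
XFS_DIROP_LOG_COUNT() definition in xfs_trans_resv.h.

```c
#include <assert.h>

/* Assumed values, taken from the example geometry in this message. */
#define DA_NODE_MAXDEPTH	5	/* XFS_DA_NODE_MAXDEPTH */
#define DIR_FSBCOUNT		64	/* XFS_DAENTER_1B() for a directory */
#define CONTIG_EXT_PER_BLOCK	61	/* m_alloc_mxr[0] - m_alloc_mnr[0] */

/*
 * Hypothetical helper: mirrors XFS_DIROP_LOG_COUNT() for a directory whose
 * BMBT can be at most bm_maxlevels levels high.
 */
static unsigned int dirop_log_count(unsigned int bm_maxlevels)
{
	unsigned int dbs, blocks, bmap1b, bmaps;

	assert(bm_maxlevels >= 1);

	/* XFS_DAENTER_DBS() for the data fork */
	dbs = DA_NODE_MAXDEPTH + 2;
	/* XFS_DAENTER_BLOCKS() */
	blocks = DIR_FSBCOUNT * dbs;
	/* XFS_DAENTER_BMAP1B(): round up, then scale by (height - 1) */
	bmap1b = ((DIR_FSBCOUNT + CONTIG_EXT_PER_BLOCK - 1) /
		  CONTIG_EXT_PER_BLOCK) * (bm_maxlevels - 1);
	/* XFS_DAENTER_BMAPS() */
	bmaps = dbs * bmap1b;

	return blocks + bmaps + 1;
}
```

Under these assumptions dirop_log_count(7) gives 533 and dirop_log_count(8)
gives 547, so the doubled count used by the rename reservation grows by 28
fs blocks.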
Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_bmap.c | 17 ++++++++++++++--- fs/xfs/libxfs/xfs_bmap.h | 3 ++- fs/xfs/xfs_mount.c | 5 +++-- fs/xfs/xfs_mount.h | 1 + 4 files changed, 20 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 798fca5c52af..01e2b543b139 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -50,7 +50,8 @@ kmem_zone_t *xfs_bmap_free_item_zone; void xfs_bmap_compute_maxlevels( xfs_mount_t *mp, /* file system mount structure */ - int whichfork) /* data or attr fork */ + int whichfork, /* data or attr fork */ + int dir_bmbt) /* Dir or non-dir data fork */ { int level; /* btree level */ uint maxblocks; /* max blocks at this level */ @@ -60,6 +61,9 @@ xfs_bmap_compute_maxlevels( int minnoderecs; /* min records in node block */ int sz; /* root block size */ + if (whichfork == XFS_ATTR_FORK) + ASSERT(dir_bmbt == 0); + /* * The maximum number of extents in a file, hence the maximum number of * leaf entries, is controlled by the size of the on-disk extent count, @@ -75,8 +79,11 @@ xfs_bmap_compute_maxlevels( * of a minimum size available. 
*/ if (whichfork == XFS_DATA_FORK) { - maxleafents = MAXEXTNUM; sz = XFS_BMDR_SPACE_CALC(MINDBTPTRS); + if (dir_bmbt) + maxleafents = MAXEXTNUM; + else + maxleafents = MAXEXTNUM; } else { maxleafents = MAXAEXTNUM; sz = XFS_BMDR_SPACE_CALC(MINABTPTRS); @@ -91,7 +98,11 @@ xfs_bmap_compute_maxlevels( else maxblocks = (maxblocks + minnoderecs - 1) / minnoderecs; } - mp->m_bm_maxlevels[whichfork] = level; + + if (whichfork == XFS_DATA_FORK && dir_bmbt) + mp->m_bm_dir_maxlevel = level; + else + mp->m_bm_maxlevels[whichfork] = level; } STATIC int /* error */ diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h index 6028a3c825ba..4250c9ab4b75 100644 --- a/fs/xfs/libxfs/xfs_bmap.h +++ b/fs/xfs/libxfs/xfs_bmap.h @@ -187,7 +187,8 @@ void xfs_bmap_local_to_extents_empty(struct xfs_trans *tp, void __xfs_bmap_add_free(struct xfs_trans *tp, xfs_fsblock_t bno, xfs_filblks_t len, const struct xfs_owner_info *oinfo, bool skip_discard); -void xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork); +void xfs_bmap_compute_maxlevels(struct xfs_mount *mp, int whichfork, + int dir_bmbt); int xfs_bmap_first_unused(struct xfs_trans *tp, struct xfs_inode *ip, xfs_extlen_t len, xfs_fileoff_t *unused, int whichfork); int xfs_bmap_last_before(struct xfs_trans *tp, struct xfs_inode *ip, diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index bb91f04266b9..d8ebfc67bb63 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -711,8 +711,9 @@ xfs_mountfs( goto out; xfs_alloc_compute_maxlevels(mp); - xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK); - xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK); + xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK, 0); + xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK, 1); + xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK, 0); xfs_ialloc_setup_geometry(mp); xfs_rmapbt_compute_maxlevels(mp); xfs_refcountbt_compute_maxlevels(mp); diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index aba5a1579279..9dbf036ddace 100644 --- a/fs/xfs/xfs_mount.h +++ 
b/fs/xfs/xfs_mount.h @@ -133,6 +133,7 @@ typedef struct xfs_mount { uint m_refc_mnr[2]; /* min refc btree records */ uint m_ag_maxlevels; /* XFS_AG_MAXLEVELS */ uint m_bm_maxlevels[2]; /* XFS_BM_MAXLEVELS */ + uint m_bm_dir_maxlevel; uint m_rmap_maxlevels; /* max rmap btree levels */ uint m_refc_maxlevels; /* max refcount btree level */ xfs_extlen_t m_ag_prealloc_blocks; /* reserved ag blocks */ From patchwork Sat Jun 6 08:27:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Babu R X-Patchwork-Id: 11591097 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BBFC4138C for ; Sat, 6 Jun 2020 08:28:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9D170207F9 for ; Sat, 6 Jun 2020 08:28:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kUMEY0Vf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728628AbgFFI23 (ORCPT ); Sat, 6 Jun 2020 04:28:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725283AbgFFI2Z (ORCPT ); Sat, 6 Jun 2020 04:28:25 -0400 Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85C3FC08C5C2 for ; Sat, 6 Jun 2020 01:28:24 -0700 (PDT) Received: by mail-pl1-x641.google.com with SMTP id v24so4638135plo.6 for ; Sat, 06 Jun 2020 01:28:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9JcFcO7SozB9TEjPs/+E22bMQiEI+K7EyLJdu9Kdm+c=; 
b=kUMEY0VfTdKHtOZfRZ7xZUji6cgpOOuNHsLSY/D5lJWQnAOO/sav2m+HJvp9GjtXEj L2+Z/f/rGYn/UTGieRJQoyRJz0ZnZU09tDiAM2lW6n0jOFOywwIEHeFeOVz94ytAN+v9 T52iPug8kGxDL6BxuCbnqIoO6A1qrqGY98XRY8vzoLxQ2RaVxnu8CsNH0OuPvymTypk8 A/7cAxEIooj1adidoGue67GkQaese2hOVmep2i2YwbQZn7/v8QFpHipb3FjchAdWK699 SujxScHetpwH5z5hiySBhIlPL8DmUI7azn22G/vhP707IV9/LF3P3QO3MxStWvKSPxdj QZ/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9JcFcO7SozB9TEjPs/+E22bMQiEI+K7EyLJdu9Kdm+c=; b=oGcVy3chUFKT3SwBYT4F7uq9qke2EusFJKkJJifP8eh7u5x1qypzOPB0isygWsER6z z1sTVcJxa/pG7s6d05vxGWPlmSgo2IdefvkwHmu4LKux4DUk6BeF4jhUZxQnnOLHsd3v yMvjAvYbEv8Zv7Jpf54B2ya+9c+/Niz64uOYtxCT4KSnrH08ovBRMFNvctuBhNKZ0VIQ JV+kbL2IGPRVUW6DOloxKr7/HfdoGbaLoByYXMfKZgUySsyjuASBL2Q4dlhzBP+Xb2ON 4qYukClDAGij224wbf4sstLZuMBztt9xpPVzM8czyjWcIyZt7CVYuTq6zjsh+IAzdZ/t 4AjQ== X-Gm-Message-State: AOAM531GpgnfhoaLSLJ0nPCVczZl/O9VpqF21RbbEne0KQEIiYc3m8jO ZoPyh3jqJzsBSbxZUUoLE+La4AtE X-Google-Smtp-Source: ABdhPJygiU7fLbR62YCblv/E4C2RK1ja1FCknz+GXO47uKHDLO3/uhPj7WFF+d4te7RPWoB/usbmPQ== X-Received: by 2002:a17:90b:3690:: with SMTP id mj16mr7174550pjb.104.1591432103660; Sat, 06 Jun 2020 01:28:23 -0700 (PDT) Received: from localhost.localdomain ([122.167.144.243]) by smtp.gmail.com with ESMTPSA id j3sm1678130pfh.87.2020.06.06.01.28.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 06 Jun 2020 01:28:23 -0700 (PDT) From: Chandan Babu R To: linux-xfs@vger.kernel.org Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org Subject: [PATCH 4/7] xfs: Add "Use Dir BMBT height" argument to XFS_BM_MAXLEVELS() Date: Sat, 6 Jun 2020 13:57:42 +0530 Message-Id: <20200606082745.15174-5-chandanrlinux@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com> References: 
<20200606082745.15174-1-chandanrlinux@gmail.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org XFS_BM_MAXLEVELS() returns the maximum possible height of BMBT tree for either data or attribute fork. For data forks, this commit adds a new argument to XFS_BM_MAXLEVELS() to let the users choose between the maximum heights of dir and non-dir BMBTs. As of this commit, both dir and non-dir BMBTs have the same maximum height. A future commit in this series will use 2^27 extent count as the input to compute the maximum height of a directory BMBT which will in turn cause the maximum heights of dir and non-dir BMBTs to differ. Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_attr.c | 5 ++-- fs/xfs/libxfs/xfs_bmap.c | 5 ++-- fs/xfs/libxfs/xfs_bmap_btree.h | 4 +++- fs/xfs/libxfs/xfs_trans_resv.c | 25 +++++++++++--------- fs/xfs/libxfs/xfs_trans_resv.h | 4 ++-- fs/xfs/libxfs/xfs_trans_space.h | 41 +++++++++++++++++---------------- fs/xfs/xfs_bmap_item.c | 3 ++- fs/xfs/xfs_reflink.c | 4 ++-- 8 files changed, 50 insertions(+), 41 deletions(-) diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c index a4b23edf887e..357e29a5a167 100644 --- a/fs/xfs/libxfs/xfs_attr.c +++ b/fs/xfs/libxfs/xfs_attr.c @@ -150,7 +150,7 @@ xfs_attr_calc_size( * "local" or "remote" (note: local != inline). 
*/ size = xfs_attr_leaf_newentsize(args, local); - nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK); + nblks = XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK, 0); if (*local) { if (size > (args->geo->blksize / 2)) { /* Double split possible */ @@ -163,7 +163,8 @@ xfs_attr_calc_size( */ uint dblocks = xfs_attr3_rmt_blocks(mp, args->valuelen); nblks += dblocks; - nblks += XFS_NEXTENTADD_SPACE_RES(mp, dblocks, XFS_ATTR_FORK); + nblks += XFS_NEXTENTADD_SPACE_RES(mp, dblocks, + XFS_ATTR_FORK, 0); } return nblks; diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 01e2b543b139..8b0029b3cecf 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -182,13 +182,14 @@ xfs_bmap_worst_indlen( mp = ip->i_mount; maxrecs = mp->m_bmap_dmxr[0]; for (level = 0, rval = 0; - level < XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK); + level < XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0); level++) { len += maxrecs - 1; do_div(len, maxrecs); rval += len; if (len == 1) - return rval + XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK) - + return rval + + XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0) - level - 1; if (level == 0) maxrecs = mp->m_bmap_dmxr[1]; diff --git a/fs/xfs/libxfs/xfs_bmap_btree.h b/fs/xfs/libxfs/xfs_bmap_btree.h index 72bf74c79fb9..a047be5883d1 100644 --- a/fs/xfs/libxfs/xfs_bmap_btree.h +++ b/fs/xfs/libxfs/xfs_bmap_btree.h @@ -79,7 +79,9 @@ struct xfs_trans; /* * Maximum number of bmap btree levels. */ -#define XFS_BM_MAXLEVELS(mp,w) ((mp)->m_bm_maxlevels[(w)]) +#define XFS_BM_MAXLEVELS(mp,w,use_dir_bmbt) \ + ((!(use_dir_bmbt)) ? \ + (mp)->m_bm_maxlevels[(w)] : (mp)->m_bm_dir_maxlevel) /* * Prototypes for xfs_bmap.c to call. 
diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index b44b521c605c..39cfca1b71b6 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -265,14 +265,14 @@ xfs_calc_write_reservation( unsigned int blksz = XFS_FSB_TO_B(mp, 1); t1 = xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), blksz) + + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0), blksz) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); if (xfs_sb_version_hasrealtime(&mp->m_sb)) { t2 = xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), - blksz) + + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0), + blksz) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_rtalloc_log_count(mp, 1), blksz) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), blksz); @@ -313,7 +313,8 @@ xfs_calc_itruncate_reservation( unsigned int blksz = XFS_FSB_TO_B(mp, 1); t1 = xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK) + 1, blksz); + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0) + 1, + blksz); t2 = xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), blksz); @@ -592,7 +593,7 @@ xfs_calc_growrtalloc_reservation( struct xfs_mount *mp) { return xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0), XFS_FSB_TO_B(mp, 1)) + xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), @@ -669,7 +670,7 @@ xfs_calc_addafork_reservation( xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(2, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(1, mp->m_dir_geo->blksize) + - xfs_calc_buf_res(XFS_DAENTER_BMAP1B(mp, XFS_DATA_FORK) + 1, + xfs_calc_buf_res(XFS_DAENTER_BMAP1B(mp, XFS_DATA_FORK, 0) + 1, XFS_FSB_TO_B(mp, 1)) + 
xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), XFS_FSB_TO_B(mp, 1)); @@ -691,7 +692,7 @@ xfs_calc_attrinval_reservation( struct xfs_mount *mp) { return max((xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK), + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK, 0), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), @@ -717,10 +718,11 @@ xfs_calc_attrset_reservation( int bmbt_blks; da_blks = XFS_DAENTER_BLOCKS(mp, XFS_ATTR_FORK); - bmbt_blks = XFS_DAENTER_BMAPS(mp, XFS_ATTR_FORK); + bmbt_blks = XFS_DAENTER_BMAPS(mp, XFS_ATTR_FORK, 0); max_rmt_blks = xfs_attr3_rmt_blocks(mp, XATTR_SIZE_MAX); - bmbt_blks += XFS_NEXTENTADD_SPACE_RES(mp, max_rmt_blks, XFS_ATTR_FORK); + bmbt_blks += XFS_NEXTENTADD_SPACE_RES(mp, max_rmt_blks, + XFS_ATTR_FORK, 0); return XFS_DQUOT_LOGRES(mp) + xfs_calc_inode_res(mp, 1) + @@ -752,8 +754,9 @@ xfs_calc_attrrm_reservation( xfs_calc_buf_res(XFS_DA_NODE_MAXDEPTH, XFS_FSB_TO_B(mp, 1)) + (uint)XFS_FSB_TO_B(mp, - XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK)) + - xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), 0)), + XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK, 0)) + + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK, 0), + 0)), (xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), XFS_FSB_TO_B(mp, 1)))); diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h index f50996ae18e6..d64989eeebd7 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.h +++ b/fs/xfs/libxfs/xfs_trans_resv.h @@ -61,10 +61,10 @@ struct xfs_trans_resv { */ #define XFS_DIROP_LOG_RES(mp) \ (XFS_FSB_TO_B(mp, XFS_DAENTER_BLOCKS(mp, XFS_DATA_FORK)) + \ - (XFS_FSB_TO_B(mp, XFS_DAENTER_BMAPS(mp, XFS_DATA_FORK) + 1))) + (XFS_FSB_TO_B(mp, XFS_DAENTER_BMAPS(mp, XFS_DATA_FORK, 1) + 1))) #define XFS_DIROP_LOG_COUNT(mp) \ (XFS_DAENTER_BLOCKS(mp, XFS_DATA_FORK) + \ - XFS_DAENTER_BMAPS(mp, XFS_DATA_FORK) + 1) + XFS_DAENTER_BMAPS(mp, XFS_DATA_FORK, 1) 
+ 1) /* * Various log count values. diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h index b559af70cf51..c51d809a16b1 100644 --- a/fs/xfs/libxfs/xfs_trans_space.h +++ b/fs/xfs/libxfs/xfs_trans_space.h @@ -25,15 +25,16 @@ #define XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) \ (((mp)->m_alloc_mxr[0]) - ((mp)->m_alloc_mnr[0])) -#define XFS_EXTENTADD_SPACE_RES(mp,w) (XFS_BM_MAXLEVELS(mp,w) - 1) -#define XFS_NEXTENTADD_SPACE_RES(mp,b,w)\ +#define XFS_EXTENTADD_SPACE_RES(mp,w,dbmbt) \ + (XFS_BM_MAXLEVELS(mp,w,dbmbt) - 1) +#define XFS_NEXTENTADD_SPACE_RES(mp,b,w,dbmbt) \ (((b + XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp) - 1) / \ XFS_MAX_CONTIG_EXTENTS_PER_BLOCK(mp)) * \ - XFS_EXTENTADD_SPACE_RES(mp,w)) + XFS_EXTENTADD_SPACE_RES(mp,w,dbmbt)) /* Blocks we might need to add "b" mappings & rmappings to a file. */ -#define XFS_SWAP_RMAP_SPACE_RES(mp,b,w)\ - (XFS_NEXTENTADD_SPACE_RES((mp), (b), (w)) + \ +#define XFS_SWAP_RMAP_SPACE_RES(mp,b,w) \ + (XFS_NEXTENTADD_SPACE_RES((mp), (b), (w), 0) + \ XFS_NRMAPADD_SPACE_RES((mp), (b))) #define XFS_DAENTER_1B(mp,w) \ @@ -47,19 +48,19 @@ (XFS_DA_NODE_MAXDEPTH + (((w) == XFS_DATA_FORK) ? 
2 : 1)) #define XFS_DAENTER_BLOCKS(mp,w) \ (XFS_DAENTER_1B(mp,w) * XFS_DAENTER_DBS(mp,w)) -#define XFS_DAENTER_BMAP1B(mp,w) \ - XFS_NEXTENTADD_SPACE_RES(mp, XFS_DAENTER_1B(mp, w), w) -#define XFS_DAENTER_BMAPS(mp,w) \ - (XFS_DAENTER_DBS(mp,w) * XFS_DAENTER_BMAP1B(mp,w)) -#define XFS_DAENTER_SPACE_RES(mp,w) \ - (XFS_DAENTER_BLOCKS(mp,w) + XFS_DAENTER_BMAPS(mp,w)) -#define XFS_DAREMOVE_SPACE_RES(mp,w) XFS_DAENTER_BMAPS(mp,w) +#define XFS_DAENTER_BMAP1B(mp,w,dbmbt) \ + XFS_NEXTENTADD_SPACE_RES(mp, XFS_DAENTER_1B(mp, w), w, dbmbt) +#define XFS_DAENTER_BMAPS(mp,w,dbmbt) \ + (XFS_DAENTER_DBS(mp,w) * XFS_DAENTER_BMAP1B(mp,w,dbmbt)) +#define XFS_DAENTER_SPACE_RES(mp,w,dbmbt) \ + (XFS_DAENTER_BLOCKS(mp,w) + XFS_DAENTER_BMAPS(mp,w,dbmbt)) +#define XFS_DAREMOVE_SPACE_RES(mp,w,dbmbt) XFS_DAENTER_BMAPS(mp,w,dbmbt) #define XFS_DIRENTER_MAX_SPLIT(mp,nl) 1 #define XFS_DIRENTER_SPACE_RES(mp,nl) \ - (XFS_DAENTER_SPACE_RES(mp, XFS_DATA_FORK) * \ + (XFS_DAENTER_SPACE_RES(mp, XFS_DATA_FORK, 1) * \ XFS_DIRENTER_MAX_SPLIT(mp,nl)) #define XFS_DIRREMOVE_SPACE_RES(mp) \ - XFS_DAREMOVE_SPACE_RES(mp, XFS_DATA_FORK) + XFS_DAREMOVE_SPACE_RES(mp, XFS_DATA_FORK, 1) #define XFS_IALLOC_SPACE_RES(mp) \ (M_IGEO(mp)->ialloc_blks + \ (xfs_sb_version_hasfinobt(&mp->m_sb) ? 2 : 1 * \ @@ -69,26 +70,26 @@ * Space reservation values for various transactions. 
*/ #define XFS_ADDAFORK_SPACE_RES(mp) \ - ((mp)->m_dir_geo->fsbcount + XFS_DAENTER_BMAP1B(mp, XFS_DATA_FORK)) + ((mp)->m_dir_geo->fsbcount + XFS_DAENTER_BMAP1B(mp, XFS_DATA_FORK, 0)) #define XFS_ATTRRM_SPACE_RES(mp) \ - XFS_DAREMOVE_SPACE_RES(mp, XFS_ATTR_FORK) + XFS_DAREMOVE_SPACE_RES(mp, XFS_ATTR_FORK, 0) /* This macro is not used - see inline code in xfs_attr_set */ #define XFS_ATTRSET_SPACE_RES(mp, v) \ - (XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK) + XFS_B_TO_FSB(mp, v)) + (XFS_DAENTER_SPACE_RES(mp, XFS_ATTR_FORK, 0) + XFS_B_TO_FSB(mp, v)) #define XFS_CREATE_SPACE_RES(mp,nl) \ (XFS_IALLOC_SPACE_RES(mp) + XFS_DIRENTER_SPACE_RES(mp,nl)) #define XFS_DIOSTRAT_SPACE_RES(mp, v) \ - (XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK) + (v)) + (XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK, 0) + (v)) #define XFS_GROWFS_SPACE_RES(mp) \ (2 * (mp)->m_ag_maxlevels) #define XFS_GROWFSRT_SPACE_RES(mp,b) \ - ((b) + XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK)) + ((b) + XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK, 0)) #define XFS_LINK_SPACE_RES(mp,nl) \ XFS_DIRENTER_SPACE_RES(mp,nl) #define XFS_MKDIR_SPACE_RES(mp,nl) \ (XFS_IALLOC_SPACE_RES(mp) + XFS_DIRENTER_SPACE_RES(mp,nl)) #define XFS_QM_DQALLOC_SPACE_RES(mp) \ - (XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK) + \ + (XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK, 0) + \ XFS_DQUOT_CLUSTER_SIZE_FSB) #define XFS_QM_QINOCREATE_SPACE_RES(mp) \ XFS_IALLOC_SPACE_RES(mp) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 6736c5ab188f..0a8a8377a150 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -482,7 +482,8 @@ xfs_bui_item_recover( } error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, - XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), 0, 0, &tp); + XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK, 0), 0, + 0, &tp); if (error) return error; /* diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 107bf2a2f344..fd35a0bf2c47 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -614,7 +614,7 @@ 
xfs_reflink_end_cow_extent( return 0; } - resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK); + resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK, 0); error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, XFS_TRANS_RESERVE, &tp); if (error) @@ -1017,7 +1017,7 @@ xfs_reflink_remap_extent( } /* Start a rolling transaction to switch the mappings */ - resblks = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK); + resblks = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK, 0); error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp); if (error) goto out; From patchwork Sat Jun 6 08:27:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Babu R X-Patchwork-Id: 11591099 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4E6FA13B1 for ; Sat, 6 Jun 2020 08:28:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3760B20810 for ; Sat, 6 Jun 2020 08:28:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SEr3sNJ6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725283AbgFFI2a (ORCPT ); Sat, 6 Jun 2020 04:28:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49450 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728637AbgFFI21 (ORCPT ); Sat, 6 Jun 2020 04:28:27 -0400 Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6CBA6C08C5C4 for ; Sat, 6 Jun 2020 01:28:27 -0700 (PDT) Received: by mail-pl1-x642.google.com with SMTP id bh7so4630941plb.11 for ; Sat, 06 Jun 2020 01:28:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jlFEglChqr2AMUDkDA53KRhdaxEKoTEMphYryBokfGU=; b=SEr3sNJ6fEJyv5Rv839NwLGFo2LI/nHJRQzHpGdNZxOo5lAkuBpByujjyzX7HI9vZW vFXtd71T5FnwoaIAR8CsW9771yXvJszx1IIVIJGVnE3roLSxDplfYVTNTsjyrLZmdfvZ lCcBO5LFZuQwbrZ7Zxhs0ysW6rq7STOpDZDsmGjYqt7wF6+EVazBsLTkDYqu4YkurN26 mCNAHx0RJQlvibKxmHnX3Cx+sVX/ZnpyVDyLTkovcbpBfPjWBA/FxBgzONEy/Cn3oLzH /dVzUG/Pmr3QTgmCFFCRsrabzNkBW1d7dYXXHgaUXrt7jGRpPFWCUPe8lKYnEr5aXj8A 4IBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jlFEglChqr2AMUDkDA53KRhdaxEKoTEMphYryBokfGU=; b=c6M9bv/fGBerhhsXEeOYvUyB1nH8tPHdJv73D89jTWnQq+F1JZIIFE4lW8dG6K2DhN Ec0KX1B7x6JfpjBcwbqiXPfY33gCn7Do7vHHhCJt9cxS+4WdW/k9u8lR0e9miL6+P35e fucSt0iZlSvE5jzO4JsvgLyiA0lWDlBn21HyA5PStKH2dlmx5gb+obFlwf3tWHlhjZBt /w9YCynsxatJZ3YpKbVr6K6Ow7KeLyPDOkHxKo92LT3jX4K6AFHFnSIlp1fzlzXP8Nd3 OUzRY/ky/2xTrIKHH8oBxSOLtHcU6H8diKRUxPNRMm45CUuFOini3apBbUdagdZCNjFT EqbQ== X-Gm-Message-State: AOAM532hy2xFGxCz+ZBJHti4TZUXjngoBH4pheiDkjoGB3srlVipyBTY fl7fkzgSY/2ESMCYvwBNs1f6h/2v X-Google-Smtp-Source: ABdhPJw5ghUcU3xGkRfQYxv6o6P7F2cufz4OBw7FFX1oiHVH4vFkKNaAVzHRmcVgF92xN4BGlGGtLA== X-Received: by 2002:a17:90b:949:: with SMTP id dw9mr6422319pjb.101.1591432106813; Sat, 06 Jun 2020 01:28:26 -0700 (PDT) Received: from localhost.localdomain ([122.167.144.243]) by smtp.gmail.com with ESMTPSA id j3sm1678130pfh.87.2020.06.06.01.28.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 06 Jun 2020 01:28:26 -0700 (PDT) From: Chandan Babu R To: linux-xfs@vger.kernel.org Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org Subject: [PATCH 5/7] xfs: Use 2^27 as the maximum number of directory extents Date: Sat, 6 Jun 2020 13:57:43 +0530 Message-Id: 
<20200606082745.15174-6-chandanrlinux@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com> References: <20200606082745.15174-1-chandanrlinux@gmail.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org

The maximum number of extents that can be used by a directory can be
calculated as shown below. (The FS block size is assumed to be 512 bytes,
since the smallest allowed block size yields a BMBT of the maximum possible
height.)

Maximum number of extents in data space
  = XFS_DIR2_SPACE_SIZE / 2^9
  = 32GiB / 2^9
  = 2^26

Maximum number (theoretically) of extents in leaf space
  = 32GiB / 2^9
  = 2^26

Maximum number of entries in a free space index block
  = (512 - sizeof(struct xfs_dir3_free_hdr)) / sizeof(xfs_dir2_data_off_t)
  = (512 - 64) / 2
  = 224

Maximum number of extents in free space index
  = (Maximum number of extents in data space) / 224
  = 2^26 / 224
  = ~2^18

Maximum number of extents in a directory
  = Maximum number of extents in data space +
    Maximum number of extents in leaf space +
    Maximum number of extents in free space index
  = 2^26 + 2^26 + 2^18
  = ~2^27

This commit defines the macro MAXDIREXTNUM to have the value 2^27, which in
turn is used to calculate the maximum height of a directory BMBT.
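The estimate above can be reproduced with a small sketch. max_dir_extents()
is a hypothetical helper, not an XFS function; it assumes a 32GiB address
space for each of the data and leaf segments, 2-byte free index entries, one
extent per fs block in the worst case, and it rounds the free space index
block count up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: worst-case number of extents in a directory for the
 * given fs block size and free space index header size (both in bytes).
 */
static uint64_t max_dir_extents(uint64_t blocksize, uint64_t free_hdr_size)
{
	const uint64_t space_size = 1ULL << 35;	/* 32GiB per segment */
	uint64_t data, leaf, per_free, free_blks;

	assert(blocksize > free_hdr_size);

	/* One extent per block in the data and leaf segments. */
	data = space_size / blocksize;
	leaf = space_size / blocksize;

	/* Each free index entry is 2 bytes and covers one data block. */
	per_free = (blocksize - free_hdr_size) / 2;
	assert(per_free > 0);
	free_blks = (data + per_free - 1) / per_free;

	return data + leaf + free_blks;
}
```

For a 512 byte block size and a 64 byte free index header this returns
134517322, which is marginally above 2^27 (134217728) and consistent with
the ~2^27 estimate.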
Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_bmap.c | 2 +- fs/xfs/libxfs/xfs_types.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 8b0029b3cecf..f75b70ae7b1f 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -81,7 +81,7 @@ xfs_bmap_compute_maxlevels( if (whichfork == XFS_DATA_FORK) { sz = XFS_BMDR_SPACE_CALC(MINDBTPTRS); if (dir_bmbt) - maxleafents = MAXEXTNUM; + maxleafents = MAXDIREXTNUM; else maxleafents = MAXEXTNUM; } else { diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h index 397d94775440..0a3041ad5bec 100644 --- a/fs/xfs/libxfs/xfs_types.h +++ b/fs/xfs/libxfs/xfs_types.h @@ -60,6 +60,7 @@ typedef void * xfs_failaddr_t; */ #define MAXEXTLEN ((xfs_extlen_t)0x001fffff) /* 21 bits */ #define MAXEXTNUM ((xfs_extnum_t)0x7fffffff) /* signed int */ +#define MAXDIREXTNUM ((xfs_extnum_t)0x7ffffff) /* 27 bits */ #define MAXAEXTNUM ((xfs_aextnum_t)0x7fff) /* signed short */ /* From patchwork Sat Jun 6 08:27:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Babu R X-Patchwork-Id: 11591101 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6C967912 for ; Sat, 6 Jun 2020 08:28:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4924D20810 for ; Sat, 6 Jun 2020 08:28:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qRcpZLcV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728643AbgFFI2g (ORCPT ); Sat, 6 Jun 2020 04:28:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49474 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1728637AbgFFI2f (ORCPT ); Sat, 6 Jun 2020 04:28:35 -0400 Received: from mail-pf1-x442.google.com (mail-pf1-x442.google.com [IPv6:2607:f8b0:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E3371C08C5C2 for ; Sat, 6 Jun 2020 01:28:34 -0700 (PDT) Received: by mail-pf1-x442.google.com with SMTP id h185so6099564pfg.2 for ; Sat, 06 Jun 2020 01:28:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TDtbnYDu7WRQzvF8HvnR1pSPe/HHIhNKzNarkrnywF0=; b=qRcpZLcVrZ9M0pfkBvcUct6N3mBQhtJQVV6TcRWOQNZbkn7TAgokNTYpgphx5S7UG1 ioY3kxVvP0anqlw4piTr2l7gkRZ28fNFXv7nAtL3eS+0h6ae1Vz+kFBnjUnr9w8GDZ22 FLsyAJMUDp98fnQPQe6u8t8X/essWY0DRMqQ1vGbK/zryvTkD+Q6HzRtsJCynAPxTJeL to3njXWFXn/7cfEbrZIY5pAW4bMzIbSdAlR7SSJwujxfaBQ6FE3Pm1E/bguCZqGafaNs Y7mZIPIiN9NbmC51qg7yws3vmFJrlUfz++tKjnWKJQrnD/ZXosje17P3Yd0vzhfGMWU8 GRtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TDtbnYDu7WRQzvF8HvnR1pSPe/HHIhNKzNarkrnywF0=; b=IhObWeejD/InkmbC9VM2idWeNsuvcVjB5VoLbLlDCCCoT1YQR/SsIQooP8SdefQdeC DReEi7lFbzYmAEg7Yq+90c/W+FqCY04gpuz0u4tRDScS3DSJnY+1FzZzckXe/HryOxAW sswo42rQ15NDnUZ0Chnp6KkKITdsEl8T5pyE3xKPrpvnLgrwzSD7JL7F9H9bacA7WSNZ RI5TDOwOt/uQg92gm2oITlTvBWHWdFw15cFc82Ghd838rGR+JRTJpIzQX1fZbRrvHf/G FKx5lrgPeyyBAhkvA/faNXjC9q8XudK4yyh1f5RtuDRq5qzr23KLJUUvF2c05fQKNOHC t/9Q== X-Gm-Message-State: AOAM532+X0r5zQWp4Mk4XiNZReX07dKvnsfA+SsPE8ReHYpjxceqdgAW fFLfaHqMiK4bovqy63auzYLHr2QY X-Google-Smtp-Source: ABdhPJzuZJO9tuwPPMMJlKL+8rn6frulxIfBy4bJWDj5KfAdqybV0KTobQtvnguBRDKJWTJwEnscuw== X-Received: by 2002:a63:4f09:: with SMTP id d9mr12503604pgb.10.1591432113497; Sat, 06 Jun 2020 01:28:33 -0700 (PDT) Received: from localhost.localdomain ([122.167.144.243]) by smtp.gmail.com with ESMTPSA id 
j3sm1678130pfh.87.2020.06.06.01.28.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 06 Jun 2020 01:28:33 -0700 (PDT) From: Chandan Babu R To: linux-xfs@vger.kernel.org Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org Subject: [PATCH 6/7] xfs: Extend data extent counter to 47 bits Date: Sat, 6 Jun 2020 13:57:44 +0530 Message-Id: <20200606082745.15174-7-chandanrlinux@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com> References: <20200606082745.15174-1-chandanrlinux@gmail.com> MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org

This commit extends the per-inode data extent counter to 47 bits. The length
of 47 bits was chosen because,

Maximum file size = 2^63.
Maximum extent count when using 64k block size = 2^63 / 2^16 = 2^47.

The following changes are made to accomplish this,

1. A new ro-compat superblock flag to prevent older kernels from mounting
   the filesystem in read-write mode. This flag is set for the first time
   when an inode would end up having more than 2^31 extents.
2. Carve out a new 32-bit field from xfs_dinode->di_pad2[]. This field holds
   the most significant 15 bits of the data extent counter.
3. A new inode->di_flags2 flag to indicate that the newly added field
   contains valid data. This flag is set when one of the following two
   conditions is met,
   - When the inode is about to have more than 2^31 extents.
   - When flushing the incore inode (see xfs_iflush_int()), if the
     superblock ro-compat flag is already set.
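A minimal sketch of how a 47-bit counter could be split across the existing
32-bit counter and the newly carved out field is shown below. The struct and
function names are hypothetical and are not the ones used by this patch; the
sketch assumes the existing field keeps the low 32 bits, the new field
carries the high 15 bits, and it leaves out the on-disk big-endian
conversion.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_EXTCNT_BITS	47
#define DATA_EXTCNT_MAX		((1ULL << DATA_EXTCNT_BITS) - 1)

/* Hypothetical in-memory view of the two on-disk fields. */
struct ext_counter_parts {
	uint32_t lo;	/* low 32 bits, existing counter field */
	uint32_t hi;	/* high 15 bits, stored in the new 32-bit field */
};

/* Split a 47-bit extent count into its two parts. */
static struct ext_counter_parts split_extent_count(uint64_t nextents)
{
	struct ext_counter_parts p;

	assert(nextents <= DATA_EXTCNT_MAX);
	p.lo = (uint32_t)(nextents & 0xffffffffULL);
	p.hi = (uint32_t)(nextents >> 32);
	return p;
}

/* Recombine the two parts into the full extent count. */
static uint64_t join_extent_count(struct ext_counter_parts p)
{
	/* Only 15 bits of the new field may be in use. */
	assert(p.hi < (1U << (DATA_EXTCNT_BITS - 32)));
	return ((uint64_t)p.hi << 32) | p.lo;
}
```

Keeping the low 32 bits in the existing field means that a count which fits
in 32 bits splits into the same low value and a zero high part, which fits
the idea of only flagging the new field as valid when it is needed.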
Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_bmap.c | 40 ++++++++-------- fs/xfs/libxfs/xfs_format.h | 30 ++++++++---- fs/xfs/libxfs/xfs_inode_buf.c | 46 +++++++++++++++--- fs/xfs/libxfs/xfs_inode_buf.h | 2 + fs/xfs/libxfs/xfs_inode_fork.c | 84 ++++++++++++++++++++++++++------- fs/xfs/libxfs/xfs_inode_fork.h | 3 +- fs/xfs/libxfs/xfs_log_format.h | 5 +- fs/xfs/libxfs/xfs_types.h | 5 +- fs/xfs/scrub/inode.c | 9 ++-- fs/xfs/xfs_inode.c | 6 ++- fs/xfs/xfs_inode_item.c | 5 +- fs/xfs/xfs_inode_item_recover.c | 16 +++++-- 12 files changed, 184 insertions(+), 67 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index f75b70ae7b1f..73e552678adc 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -53,9 +53,9 @@ xfs_bmap_compute_maxlevels( int whichfork, /* data or attr fork */ int dir_bmbt) /* Dir or non-dir data fork */ { + uint64_t maxleafents; /* max leaf entries possible */ int level; /* btree level */ uint maxblocks; /* max blocks at this level */ - uint maxleafents; /* max leaf entries possible */ int maxrootrecs; /* max records in root block */ int minleafrecs; /* min records in leaf block */ int minnoderecs; /* min records in node block */ @@ -477,7 +477,7 @@ xfs_bmap_check_leaf_extents( if (bp_release) xfs_trans_brelse(NULL, bp); error_norelse: - xfs_warn(mp, "%s: BAD after btree leaves for %d extents", + xfs_warn(mp, "%s: BAD after btree leaves for %llu extents", __func__, i); xfs_err(mp, "%s: CORRUPTED BTREE OR SOMETHING", __func__); xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); @@ -918,7 +918,7 @@ xfs_bmap_local_to_extents( xfs_iext_first(ifp, &icur); xfs_iext_insert(ip, &icur, &rec, 0); - error = xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto done; @@ -1610,7 +1610,7 @@ xfs_bmap_add_extent_delay_real( xfs_iext_prev(ifp, &bma->icur); xfs_iext_update_extent(bma->ip, state, &bma->icur, &LEFT); - error = xfs_next_set(bma->ip, whichfork, -1); + error = 
xfs_next_set(bma->tp, bma->ip, whichfork, -1); if (error) goto done; @@ -1717,7 +1717,7 @@ xfs_bmap_add_extent_delay_real( PREV.br_state = new->br_state; xfs_iext_update_extent(bma->ip, state, &bma->icur, &PREV); - error = xfs_next_set(bma->ip, whichfork, 1); + error = xfs_next_set(bma->tp, bma->ip, whichfork, 1); if (error) goto done; @@ -1786,7 +1786,7 @@ xfs_bmap_add_extent_delay_real( */ xfs_iext_update_extent(bma->ip, state, &bma->icur, new); - error = xfs_next_set(bma->ip, whichfork, 1); + error = xfs_next_set(bma->tp, bma->ip, whichfork, 1); if (error) goto done; @@ -1876,7 +1876,7 @@ xfs_bmap_add_extent_delay_real( */ xfs_iext_update_extent(bma->ip, state, &bma->icur, new); - error = xfs_next_set(bma->ip, whichfork, 1); + error = xfs_next_set(bma->tp, bma->ip, whichfork, 1); if (error) goto done; @@ -1965,7 +1965,7 @@ xfs_bmap_add_extent_delay_real( xfs_iext_insert(bma->ip, &bma->icur, &RIGHT, state); xfs_iext_insert(bma->ip, &bma->icur, &LEFT, state); - error = xfs_next_set(bma->ip, whichfork, 1); + error = xfs_next_set(bma->tp, bma->ip, whichfork, 1); if (error) goto done; @@ -2172,7 +2172,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &LEFT); - error = xfs_next_set(ip, whichfork, -2); + error = xfs_next_set(tp, ip, whichfork, -2); if (error) goto done; @@ -2228,7 +2228,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &LEFT); - error = xfs_next_set(ip, whichfork, -1); + error = xfs_next_set(tp, ip, whichfork, -1); if (error) goto done; @@ -2274,7 +2274,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &PREV); - error = xfs_next_set(ip, whichfork, -1); + error = xfs_next_set(tp, ip, whichfork, -1); if (error) goto done; @@ -2385,7 +2385,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_update_extent(ip, state, icur, &PREV); xfs_iext_insert(ip, icur, new, state); - error = 
xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto done; @@ -2464,7 +2464,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_next(ifp, icur); xfs_iext_insert(ip, icur, new, state); - error = xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto done; @@ -2519,7 +2519,7 @@ xfs_bmap_add_extent_unwritten_real( xfs_iext_insert(ip, icur, &r[1], state); xfs_iext_insert(ip, icur, &r[0], state); - error = xfs_next_set(ip, whichfork, 2); + error = xfs_next_set(tp, ip, whichfork, 2); if (error) goto done; @@ -2838,7 +2838,7 @@ xfs_bmap_add_extent_hole_real( xfs_iext_prev(ifp, icur); xfs_iext_update_extent(ip, state, icur, &left); - error = xfs_next_set(ip, whichfork, -1); + error = xfs_next_set(tp, ip, whichfork, -1); if (error) goto done; @@ -2940,7 +2940,7 @@ xfs_bmap_add_extent_hole_real( */ xfs_iext_insert(ip, icur, new, state); - error = xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto done; @@ -5140,7 +5140,7 @@ xfs_bmap_del_extent_real( xfs_iext_remove(ip, icur, state); xfs_iext_prev(ifp, icur); - error = xfs_next_set(ip, whichfork, -1); + error = xfs_next_set(tp, ip, whichfork, -1); if (error) goto done; @@ -5252,7 +5252,7 @@ xfs_bmap_del_extent_real( } else flags |= xfs_ilog_fext(whichfork); - error = xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto done; @@ -5722,7 +5722,7 @@ xfs_bmse_merge( * Update the on-disk extent count, the btree if necessary and log the * inode. 
*/ - error = xfs_next_set(ip, whichfork, -1); + error = xfs_next_set(tp, ip, whichfork, -1); if (error) goto done; @@ -6113,7 +6113,7 @@ xfs_bmap_split_extent( xfs_iext_next(ifp, &icur); xfs_iext_insert(ip, &icur, &new, 0); - error = xfs_next_set(ip, whichfork, 1); + error = xfs_next_set(tp, ip, whichfork, 1); if (error) goto del_cursor; diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index b42a52bfa1e9..91bee33aa988 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -449,10 +449,12 @@ xfs_sb_has_compat_feature( #define XFS_SB_FEAT_RO_COMPAT_FINOBT (1 << 0) /* free inode btree */ #define XFS_SB_FEAT_RO_COMPAT_RMAPBT (1 << 1) /* reverse map btree */ #define XFS_SB_FEAT_RO_COMPAT_REFLINK (1 << 2) /* reflinked files */ +#define XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR (1 << 3) /* 47bit data extents */ #define XFS_SB_FEAT_RO_COMPAT_ALL \ (XFS_SB_FEAT_RO_COMPAT_FINOBT | \ XFS_SB_FEAT_RO_COMPAT_RMAPBT | \ - XFS_SB_FEAT_RO_COMPAT_REFLINK) + XFS_SB_FEAT_RO_COMPAT_REFLINK | \ + XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR) #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN ~XFS_SB_FEAT_RO_COMPAT_ALL static inline bool xfs_sb_has_ro_compat_feature( @@ -563,6 +565,18 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp) (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK); } +static inline bool xfs_sb_version_has47bitext(struct xfs_sb *sbp) +{ + return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 && + (sbp->sb_features_ro_compat & + XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR); +} + +static inline void xfs_sb_version_add47bitext(struct xfs_sb *sbp) +{ + sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR; +} + /* * end of superblock version macros */ @@ -873,7 +887,7 @@ typedef struct xfs_dinode { __be64 di_size; /* number of bytes in file */ __be64 di_nblocks; /* # of direct & btree blocks used */ __be32 di_extsize; /* basic/minimum extent size for file */ - __be32 di_nextents; /* number of extents in data fork */ + 
__be32 di_nextents_lo; /* number of extents in data fork */ __be16 di_anextents; /* number of extents in attribute fork*/ __u8 di_forkoff; /* attr fork offs, <<3 for 64b align */ __s8 di_aformat; /* format of attr fork's data */ @@ -891,7 +905,8 @@ typedef struct xfs_dinode { __be64 di_lsn; /* flush sequence */ __be64 di_flags2; /* more random flags */ __be32 di_cowextsize; /* basic cow extent size for file */ - __u8 di_pad2[12]; /* more padding for future expansion */ + __be32 di_nextents_hi; + __u8 di_pad2[8]; /* more padding for future expansion */ /* fields only written to during inode creation */ xfs_timestamp_t di_crtime; /* time created */ @@ -992,10 +1007,6 @@ enum xfs_dinode_fmt { ((w) == XFS_DATA_FORK ? \ (dip)->di_format : \ (dip)->di_aformat) -#define XFS_DFORK_NEXTENTS(dip,w) \ - ((w) == XFS_DATA_FORK ? \ - be32_to_cpu((dip)->di_nextents) : \ - be16_to_cpu((dip)->di_anextents)) /* * For block and character special files the 32bit dev_t is stored at the @@ -1061,12 +1072,15 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev) #define XFS_DIFLAG2_DAX_BIT 0 /* use DAX for this inode */ #define XFS_DIFLAG2_REFLINK_BIT 1 /* file's blocks may be shared */ #define XFS_DIFLAG2_COWEXTSIZE_BIT 2 /* copy on write extent size hint */ +#define XFS_DIFLAG2_47BIT_NEXTENTS_BIT 3 /* Uses di_nextents_hi field */ #define XFS_DIFLAG2_DAX (1 << XFS_DIFLAG2_DAX_BIT) #define XFS_DIFLAG2_REFLINK (1 << XFS_DIFLAG2_REFLINK_BIT) #define XFS_DIFLAG2_COWEXTSIZE (1 << XFS_DIFLAG2_COWEXTSIZE_BIT) +#define XFS_DIFLAG2_47BIT_NEXTENTS (1 << XFS_DIFLAG2_47BIT_NEXTENTS_BIT) #define XFS_DIFLAG2_ANY \ - (XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE) + (XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE | \ + XFS_DIFLAG2_47BIT_NEXTENTS) /* * Inode number format: diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c index 6f84ea85fdd8..8b89fe080f70 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.c +++ 
b/fs/xfs/libxfs/xfs_inode_buf.c @@ -307,7 +307,8 @@ xfs_inode_to_disk( to->di_size = cpu_to_be64(from->di_size); to->di_nblocks = cpu_to_be64(from->di_nblocks); to->di_extsize = cpu_to_be32(from->di_extsize); - to->di_nextents = cpu_to_be32(xfs_ifork_nextents(&ip->i_df)); + to->di_nextents_lo = cpu_to_be32(xfs_ifork_nextents(&ip->i_df) & + 0xffffffffU); to->di_anextents = cpu_to_be16(xfs_ifork_nextents(ip->i_afp)); to->di_forkoff = from->di_forkoff; to->di_aformat = xfs_ifork_format(ip->i_afp); @@ -322,6 +323,10 @@ xfs_inode_to_disk( to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.tv_nsec); to->di_flags2 = cpu_to_be64(from->di_flags2); to->di_cowextsize = cpu_to_be32(from->di_cowextsize); + if (from->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + to->di_nextents_hi + = cpu_to_be32(xfs_ifork_nextents(&ip->i_df) + >> 32); to->di_ino = cpu_to_be64(ip->i_ino); to->di_lsn = cpu_to_be64(lsn); memset(to->di_pad2, 0, sizeof(to->di_pad2)); @@ -360,7 +365,7 @@ xfs_log_dinode_to_disk( to->di_size = cpu_to_be64(from->di_size); to->di_nblocks = cpu_to_be64(from->di_nblocks); to->di_extsize = cpu_to_be32(from->di_extsize); - to->di_nextents = cpu_to_be32(from->di_nextents); + to->di_nextents_lo = cpu_to_be32(from->di_nextents_lo); to->di_anextents = cpu_to_be16(from->di_anextents); to->di_forkoff = from->di_forkoff; to->di_aformat = from->di_aformat; @@ -375,6 +380,9 @@ xfs_log_dinode_to_disk( to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec); to->di_flags2 = cpu_to_be64(from->di_flags2); to->di_cowextsize = cpu_to_be32(from->di_cowextsize); + if (from->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + to->di_nextents_hi = + cpu_to_be32(from->di_nextents_hi); to->di_ino = cpu_to_be64(from->di_ino); to->di_lsn = cpu_to_be64(from->di_lsn); memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2)); @@ -391,7 +399,9 @@ xfs_dinode_verify_fork( struct xfs_mount *mp, int whichfork) { - uint32_t di_nextents = XFS_DFORK_NEXTENTS(dip, whichfork); + xfs_extnum_t di_nextents; + + di_nextents = 
xfs_dfork_nextents(&mp->m_sb, dip, whichfork); switch (XFS_DFORK_FORMAT(dip, whichfork)) { case XFS_DINODE_FMT_LOCAL: @@ -462,6 +472,8 @@ xfs_dinode_verify( uint16_t flags; uint64_t flags2; uint64_t di_size; + xfs_extnum_t nextents; + int64_t nblocks; if (dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC)) return __this_address; @@ -492,10 +504,12 @@ xfs_dinode_verify( if ((S_ISLNK(mode) || S_ISDIR(mode)) && di_size == 0) return __this_address; + nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK); + nextents += xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK); + nblocks = be64_to_cpu(dip->di_nblocks); + /* Fork checks carried over from xfs_iformat_fork */ - if (mode && - be32_to_cpu(dip->di_nextents) + be16_to_cpu(dip->di_anextents) > - be64_to_cpu(dip->di_nblocks)) + if (mode && nextents > nblocks) return __this_address; if (mode && XFS_DFORK_BOFF(dip) > mp->m_sb.sb_inodesize) @@ -716,3 +730,23 @@ xfs_inode_validate_cowextsize( return NULL; } + +xfs_extnum_t +xfs_dfork_nextents( + struct xfs_sb *sbp, + struct xfs_dinode *dip, + int whichfork) +{ + xfs_extnum_t nextents; + + if (whichfork == XFS_DATA_FORK) { + nextents = be32_to_cpu(dip->di_nextents_lo); + if (xfs_sb_version_has_v3inode(sbp) + && (dip->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS)) + nextents |= (u64)(be32_to_cpu(dip->di_nextents_hi)) + << 32; + return nextents; + } else { + return be16_to_cpu(dip->di_anextents); + } +} diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h index 865ac493c72a..4583db53b933 100644 --- a/fs/xfs/libxfs/xfs_inode_buf.h +++ b/fs/xfs/libxfs/xfs_inode_buf.h @@ -65,5 +65,7 @@ xfs_failaddr_t xfs_inode_validate_extsize(struct xfs_mount *mp, xfs_failaddr_t xfs_inode_validate_cowextsize(struct xfs_mount *mp, uint32_t cowextsize, uint16_t mode, uint16_t flags, uint64_t flags2); +xfs_extnum_t xfs_dfork_nextents(struct xfs_sb *sbp, struct xfs_dinode *dip, + int whichfork); #endif /* __XFS_INODE_BUF_H__ */ diff --git a/fs/xfs/libxfs/xfs_inode_fork.c 
b/fs/xfs/libxfs/xfs_inode_fork.c index 3bf5a2c391bd..ec682e2d5bcb 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -10,6 +10,7 @@ #include "xfs_format.h" #include "xfs_log_format.h" #include "xfs_trans_resv.h" +#include "xfs_sb.h" #include "xfs_mount.h" #include "xfs_inode.h" #include "xfs_trans.h" @@ -103,21 +104,22 @@ xfs_iformat_extents( int whichfork) { struct xfs_mount *mp = ip->i_mount; + struct xfs_sb *sb = &mp->m_sb; struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, whichfork); + xfs_extnum_t nex = xfs_dfork_nextents(sb, dip, whichfork); int state = xfs_bmap_fork_to_state(whichfork); - int nex = XFS_DFORK_NEXTENTS(dip, whichfork); int size = nex * sizeof(xfs_bmbt_rec_t); struct xfs_iext_cursor icur; struct xfs_bmbt_rec *dp; struct xfs_bmbt_irec new; - int i; + xfs_extnum_t i; /* * If the number of extents is unreasonable, then something is wrong and * we just bail out rather than crash in kmem_alloc() or memcpy() below. */ if (unlikely(size < 0 || size > XFS_DFORK_SIZE(dip, mp, whichfork))) { - xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %d).", + xfs_warn(ip->i_mount, "corrupt inode %Lu ((a)extents = %llu).", (unsigned long long) ip->i_ino, nex); xfs_inode_verifier_error(ip, -EFSCORRUPTED, "xfs_iformat_extents(1)", dip, sizeof(*dip), @@ -233,7 +235,11 @@ xfs_iformat_data_fork( * depend on it. 
*/ ip->i_df.if_format = dip->di_format; - ip->i_df.if_nextents = be32_to_cpu(dip->di_nextents); + ip->i_df.if_nextents = be32_to_cpu(dip->di_nextents_lo); + if (ip->i_d.di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + ip->i_df.if_nextents |= + ((u64)(be32_to_cpu(dip->di_nextents_hi)) << 32); + switch (inode->i_mode & S_IFMT) { case S_IFIFO: @@ -729,31 +735,73 @@ xfs_ifork_verify_local_attr( return 0; } +static int +xfs_next_set_data( + struct xfs_trans *tp, + struct xfs_inode *ip, + struct xfs_ifork *ifp, + int delta) +{ + struct xfs_mount *mp = ip->i_mount; + xfs_extnum_t nr_exts; + + nr_exts = ifp->if_nextents + delta; + + if ((delta > 0 && nr_exts > MAXEXTNUM) + || (delta < 0 && nr_exts > ifp->if_nextents)) + return -EOVERFLOW; + + if (ifp->if_nextents <= MAXEXTNUM31BIT && + nr_exts > MAXEXTNUM31BIT && + !(ip->i_d.di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) && + xfs_sb_version_has_v3inode(&mp->m_sb)) { + if (!xfs_sb_version_has47bitext(&mp->m_sb)) { + bool log_sb = false; + + spin_lock(&mp->m_sb_lock); + if (!xfs_sb_version_has47bitext(&mp->m_sb)) { + xfs_sb_version_add47bitext(&mp->m_sb); + log_sb = true; + } + spin_unlock(&mp->m_sb_lock); + + if (log_sb) + xfs_log_sb(tp); + } + + ip->i_d.di_flags2 |= XFS_DIFLAG2_47BIT_NEXTENTS; + } + + ifp->if_nextents = nr_exts; + + return 0; +} + int xfs_next_set( + struct xfs_trans *tp, struct xfs_inode *ip, int whichfork, int delta) { struct xfs_ifork *ifp; int64_t nr_exts; - int64_t max_exts; + int error = 0; ifp = XFS_IFORK_PTR(ip, whichfork); - if (whichfork == XFS_DATA_FORK || whichfork == XFS_COW_FORK) - max_exts = MAXEXTNUM; - else if (whichfork == XFS_ATTR_FORK) - max_exts = MAXAEXTNUM; - else - ASSERT(0); - - nr_exts = ifp->if_nextents + delta; - if ((delta > 0 && nr_exts > max_exts) - || (delta < 0 && nr_exts < 0)) - return -EOVERFLOW; + if (whichfork == XFS_DATA_FORK || whichfork == XFS_COW_FORK) { + error = xfs_next_set_data(tp, ip, ifp, delta); + } else if (whichfork == XFS_ATTR_FORK) { + nr_exts = ifp->if_nextents + 
delta; + if ((delta > 0 && nr_exts > MAXAEXTNUM) + || (delta < 0 && nr_exts < 0)) + return -EOVERFLOW; - ifp->if_nextents = nr_exts; + ifp->if_nextents = nr_exts; + } else { + ASSERT(0); + } - return 0; + return error; } diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h index a84ae42ace79..c74fa6371cc8 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.h +++ b/fs/xfs/libxfs/xfs_inode_fork.h @@ -173,5 +173,6 @@ extern void xfs_ifork_init_cow(struct xfs_inode *ip); int xfs_ifork_verify_local_data(struct xfs_inode *ip); int xfs_ifork_verify_local_attr(struct xfs_inode *ip); -int xfs_next_set(struct xfs_inode *ip, int whichfork, int delta); +int xfs_next_set(struct xfs_trans *tp, struct xfs_inode *ip, int whichfork, + int delta); #endif /* __XFS_INODE_FORK_H__ */ diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h index e3400c9c71cd..879aadff7692 100644 --- a/fs/xfs/libxfs/xfs_log_format.h +++ b/fs/xfs/libxfs/xfs_log_format.h @@ -396,7 +396,7 @@ struct xfs_log_dinode { xfs_fsize_t di_size; /* number of bytes in file */ xfs_rfsblock_t di_nblocks; /* # of direct & btree blocks used */ xfs_extlen_t di_extsize; /* basic/minimum extent size for file */ - xfs_extnum_t di_nextents; /* number of extents in data fork */ + uint32_t di_nextents_lo; /* number of extents in data fork */ xfs_aextnum_t di_anextents; /* number of extents in attribute fork*/ uint8_t di_forkoff; /* attr fork offs, <<3 for 64b align */ int8_t di_aformat; /* format of attr fork's data */ @@ -414,7 +414,8 @@ struct xfs_log_dinode { xfs_lsn_t di_lsn; /* flush sequence */ uint64_t di_flags2; /* more random flags */ uint32_t di_cowextsize; /* basic cow extent size for file */ - uint8_t di_pad2[12]; /* more padding for future expansion */ + uint32_t di_nextents_hi; + uint8_t di_pad2[8]; /* more padding for future expansion */ /* fields only written to during inode creation */ xfs_ictimestamp_t di_crtime; /* time created */ diff --git a/fs/xfs/libxfs/xfs_types.h 
b/fs/xfs/libxfs/xfs_types.h index 0a3041ad5bec..c68ff2178976 100644 --- a/fs/xfs/libxfs/xfs_types.h +++ b/fs/xfs/libxfs/xfs_types.h @@ -12,7 +12,7 @@ typedef uint32_t xfs_agblock_t; /* blockno in alloc. group */ typedef uint32_t xfs_agino_t; /* inode # within allocation grp */ typedef uint32_t xfs_extlen_t; /* extent length in blocks */ typedef uint32_t xfs_agnumber_t; /* allocation group number */ -typedef int32_t xfs_extnum_t; /* # of extents in a file */ +typedef uint64_t xfs_extnum_t; /* # of extents in a file */ typedef int16_t xfs_aextnum_t; /* # extents in an attribute fork */ typedef int64_t xfs_fsize_t; /* bytes in a file */ typedef uint64_t xfs_ufsize_t; /* unsigned bytes in a file */ @@ -59,7 +59,8 @@ typedef void * xfs_failaddr_t; * Max values for extlen, extnum, aextnum. */ #define MAXEXTLEN ((xfs_extlen_t)0x001fffff) /* 21 bits */ -#define MAXEXTNUM ((xfs_extnum_t)0x7fffffff) /* signed int */ +#define MAXEXTNUM31BIT ((xfs_extnum_t)0x7fffffff) /* 31 bits */ +#define MAXEXTNUM ((xfs_extnum_t)0x7fffffffffff) /* 47 bits */ #define MAXDIREXTNUM ((xfs_extnum_t)0x7ffffff) /* 27 bits */ #define MAXAEXTNUM ((xfs_aextnum_t)0x7fff) /* signed short */ diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c index 6d483ab29e63..be41fd242ff2 100644 --- a/fs/xfs/scrub/inode.c +++ b/fs/xfs/scrub/inode.c @@ -205,8 +205,8 @@ xchk_dinode( struct xfs_mount *mp = sc->mp; size_t fork_recs; unsigned long long isize; + xfs_extnum_t nextents; uint64_t flags2; - uint32_t nextents; uint16_t flags; uint16_t mode; @@ -354,7 +354,7 @@ xchk_dinode( xchk_inode_extsize(sc, dip, ino, mode, flags); /* di_nextents */ - nextents = be32_to_cpu(dip->di_nextents); + nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK); fork_recs = XFS_DFORK_DSIZE(dip, mp) / sizeof(struct xfs_bmbt_rec); switch (dip->di_format) { case XFS_DINODE_FMT_EXTENTS: @@ -464,6 +464,7 @@ xchk_inode_xref_bmap( struct xfs_scrub *sc, struct xfs_dinode *dip) { + xfs_mount_t *mp = sc->mp; xfs_extnum_t nextents; 
xfs_filblks_t count; xfs_filblks_t acount; @@ -477,14 +478,14 @@ xchk_inode_xref_bmap( &nextents, &count); if (!xchk_should_check_xref(sc, &error, NULL)) return; - if (nextents < be32_to_cpu(dip->di_nextents)) + if (nextents < xfs_dfork_nextents(&mp->m_sb, dip, XFS_DATA_FORK)) xchk_ino_xref_set_corrupt(sc, sc->ip->i_ino); error = xfs_bmap_count_blocks(sc->tp, sc->ip, XFS_ATTR_FORK, &nextents, &acount); if (!xchk_should_check_xref(sc, &error, NULL)) return; - if (nextents != be16_to_cpu(dip->di_anextents)) + if (nextents != xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK)) xchk_ino_xref_set_corrupt(sc, sc->ip->i_ino); /* Check nblocks against the inode. */ diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 64f5f9a440ae..4418a66cf6d6 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -3748,7 +3748,7 @@ xfs_iflush_int( ip->i_d.di_nblocks, mp, XFS_ERRTAG_IFLUSH_5)) { xfs_alert_tag(mp, XFS_PTAG_IFLUSH, "%s: detected corrupt incore inode %Lu, " - "total extents = %d, nblocks = %Ld, ptr "PTR_FMT, + "total extents = %llu, nblocks = %Ld, ptr "PTR_FMT, __func__, ip->i_ino, ip->i_df.if_nextents + xfs_ifork_nextents(ip->i_afp), ip->i_d.di_nblocks, ip); @@ -3785,6 +3785,10 @@ xfs_iflush_int( xfs_ifork_verify_local_attr(ip)) goto flush_out; + if (!(ip->i_d.di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + && xfs_sb_version_has47bitext(&mp->m_sb)) + ip->i_d.di_flags2 |= XFS_DIFLAG2_47BIT_NEXTENTS; + /* * Copy the dirty parts of the inode into the on-disk inode. 
We always * copy out the core of the inode, because if the inode is dirty at all diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c index ba47bf65b772..6f27ac7c8631 100644 --- a/fs/xfs/xfs_inode_item.c +++ b/fs/xfs/xfs_inode_item.c @@ -326,7 +326,7 @@ xfs_inode_to_log_dinode( to->di_size = from->di_size; to->di_nblocks = from->di_nblocks; to->di_extsize = from->di_extsize; - to->di_nextents = xfs_ifork_nextents(&ip->i_df); + to->di_nextents_lo = xfs_ifork_nextents(&ip->i_df) & 0xffffffffU; to->di_anextents = xfs_ifork_nextents(ip->i_afp); to->di_forkoff = from->di_forkoff; to->di_aformat = xfs_ifork_format(ip->i_afp); @@ -344,6 +344,9 @@ xfs_inode_to_log_dinode( to->di_crtime.t_nsec = from->di_crtime.tv_nsec; to->di_flags2 = from->di_flags2; to->di_cowextsize = from->di_cowextsize; + if (from->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + to->di_nextents_hi = + xfs_ifork_nextents(&ip->i_df) >> 32; to->di_ino = ip->i_ino; to->di_lsn = lsn; memset(to->di_pad2, 0, sizeof(to->di_pad2)); diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c index 10ef5ddf5429..8d64b861fb66 100644 --- a/fs/xfs/xfs_inode_item_recover.c +++ b/fs/xfs/xfs_inode_item_recover.c @@ -134,6 +134,7 @@ xlog_recover_inode_commit_pass2( struct xfs_log_dinode *ldip; uint isize; int need_free = 0; + xfs_extnum_t nextents; if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) { in_f = item->ri_buf[0].i_addr; @@ -255,16 +256,23 @@ xlog_recover_inode_commit_pass2( goto out_release; } } - if (unlikely(ldip->di_nextents + ldip->di_anextents > ldip->di_nblocks)){ + + nextents = ldip->di_nextents_lo; + if (xfs_sb_version_has_v3inode(&mp->m_sb) && + ldip->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS) + nextents |= ((u64)(ldip->di_nextents_hi) << 32); + + nextents += ldip->di_anextents; + + if (unlikely(nextents > ldip->di_nblocks)) { XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)", XFS_ERRLEVEL_LOW, mp, ldip, sizeof(*ldip)); xfs_alert(mp, "%s: Bad inode log record, rec 
ptr "PTR_FMT", dino ptr "PTR_FMT", " - "dino bp "PTR_FMT", ino %Ld, total extents = %d, nblocks = %Ld", + "dino bp "PTR_FMT", ino %Ld, total extents = %llu, nblocks = %Ld", __func__, item, dip, bp, in_f->ilf_ino, - ldip->di_nextents + ldip->di_anextents, - ldip->di_nblocks); + nextents, ldip->di_nblocks); error = -EFSCORRUPTED; goto out_release; } From patchwork Sat Jun 6 08:27:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Babu R X-Patchwork-Id: 11591103 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2721B912 for ; Sat, 6 Jun 2020 08:28:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0362C20810 for ; Sat, 6 Jun 2020 08:28:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="OAeiEX76" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728647AbgFFI2o (ORCPT ); Sat, 6 Jun 2020 04:28:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49500 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728637AbgFFI2n (ORCPT ); Sat, 6 Jun 2020 04:28:43 -0400 Received: from mail-pj1-x1041.google.com (mail-pj1-x1041.google.com [IPv6:2607:f8b0:4864:20::1041]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22EEEC08C5C2 for ; Sat, 6 Jun 2020 01:28:43 -0700 (PDT) Received: by mail-pj1-x1041.google.com with SMTP id m2so3814074pjv.2 for ; Sat, 06 Jun 2020 01:28:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1FfzURtyHK1nUeHXeUuUFQ84XlezpTNwgY45BD7Rqqg=; 
b=OAeiEX76OfjxEY4r0SGI8NQ1sJvji4IExJdWi/NXNwhgUVySxWPUmBZQLdRB0mgsUH MrtbiXGwHFWrMgNfUlvrPOMEuFTUCtshs8D8NBAOTLC4PHWreK+SH0pngh6bsF5I/pHY duFNY7EzDnNP9xAQmFst0fzap3Qp13nc7+DvM8K/a041R4mj68JaBCzYsnAyd7WEkTPt QKLplIwpT7ud/Rse0MMMEnuVZ17FS1epWnL11Bgpv1Jkj10JLjy9dlOWrUHu6IkLpzYm C9GEhizGS+AltMPIxN2Ag36kWQWwmCcPlJQo33qKJEJi2WAtFDcc2RNkoszD+TP9i+qm pC+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1FfzURtyHK1nUeHXeUuUFQ84XlezpTNwgY45BD7Rqqg=; b=pqVPe7IqGdB4wAuBEi4bGZq1tzOpWcWnWBkqJMuxDqrTN0Wr+uYqNFYC1LvUMaseUR H1Pk4LSm+SmW98PuGhAAaZwAuX97pFMqKonVqzbf+Q5fIp5pt6Fwk6OJTRpiyy30EW/E QVlP6G6uY3O+cpP0oqzq7nZlrJPh9p0Q+wx4/Zo2HbB2EVhwj3YlK5W1neHEMmN+Rmn3 8KqRc9Oy47HuPf4N+RBkVKjAhrTxUnCNh+fSdlHm6V3VSnlL0pvxAakSr+1azWZimbnU AjuinLCPbQVCDuGBz4vqyxKwqs427jcStf3lqAa9JDEjl+3yzA9TbU0hmr1jDLN9Sk8X KocQ== X-Gm-Message-State: AOAM532rN2M9FKtTL6AscgiIFB8tHMvDJY6AA9LvUSBvV2ntOWclKm2I I+j4wjrnm6IpfugMrg/4zSPeCjlC X-Google-Smtp-Source: ABdhPJwMKfh/6uuhRHUx/Xg8oPRs+EgMIOM/1o7OLvHElALqye4IOhfuhFSDK+RwZDmpjSKVU0XHIg== X-Received: by 2002:a17:90a:218c:: with SMTP id q12mr6530728pjc.116.1591432122392; Sat, 06 Jun 2020 01:28:42 -0700 (PDT) Received: from localhost.localdomain ([122.167.144.243]) by smtp.gmail.com with ESMTPSA id j3sm1678130pfh.87.2020.06.06.01.28.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 06 Jun 2020 01:28:42 -0700 (PDT) From: Chandan Babu R To: linux-xfs@vger.kernel.org Cc: Chandan Babu R , david@fromorbit.com, darrick.wong@oracle.com, bfoster@redhat.com, hch@infradead.org Subject: [PATCH 7/7] xfs: Extend attr extent counter to 32 bits Date: Sat, 6 Jun 2020 13:57:45 +0530 Message-Id: <20200606082745.15174-8-chandanrlinux@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200606082745.15174-1-chandanrlinux@gmail.com> References: <20200606082745.15174-1-chandanrlinux@gmail.com> 
MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org

This commit extends the per-inode attr extent counter to 32 bits.

The following changes are made to accomplish this:

1. A new ro-compat superblock flag prevents older kernels from mounting
   the filesystem in read-write mode. This flag is set for the first
   time when an inode would end up having more than 2^15 extents.
2. A new 16-bit field is carved out of xfs_dinode->di_pad2[]. This field
   holds the most significant 16 bits of the attr extent counter.
3. A new inode->di_flags2 flag indicates that the newly added field
   contains valid data. This flag is set when either of the following
   two conditions is met:
   - The inode is about to have more than 2^15 extents.
   - The incore inode is being flushed (see xfs_iflush_int()) and the
     superblock ro-compat flag is already set.

Signed-off-by: Chandan Babu R --- fs/xfs/libxfs/xfs_format.h | 25 ++++++++++--- fs/xfs/libxfs/xfs_inode_buf.c | 23 +++++++++--- fs/xfs/libxfs/xfs_inode_fork.c | 62 ++++++++++++++++++++++++++------- fs/xfs/libxfs/xfs_log_format.h | 5 +-- fs/xfs/libxfs/xfs_types.h | 5 +-- fs/xfs/scrub/inode.c | 5 +-- fs/xfs/xfs_inode.c | 4 +++ fs/xfs/xfs_inode_item.c | 5 ++- fs/xfs/xfs_inode_item_recover.c | 8 ++++- 9 files changed, 113 insertions(+), 29 deletions(-) diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h index 91bee33aa988..2e37d887fd35 100644 --- a/fs/xfs/libxfs/xfs_format.h +++ b/fs/xfs/libxfs/xfs_format.h @@ -450,11 +450,13 @@ xfs_sb_has_compat_feature( #define XFS_SB_FEAT_RO_COMPAT_RMAPBT (1 << 1) /* reverse map btree */ #define XFS_SB_FEAT_RO_COMPAT_REFLINK (1 << 2) /* reflinked files */ #define XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR (1 << 3) /* 47bit data extents */ +#define XFS_SB_FEAT_RO_COMPAT_32BIT_AEXT_CNTR (1 << 4) /* 32bit attr extents */ #define XFS_SB_FEAT_RO_COMPAT_ALL \ (XFS_SB_FEAT_RO_COMPAT_FINOBT | \ XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
 		 XFS_SB_FEAT_RO_COMPAT_REFLINK | \
-		 XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR)
+		 XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR | \
+		 XFS_SB_FEAT_RO_COMPAT_32BIT_AEXT_CNTR)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN ~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(
@@ -577,6 +579,18 @@ static inline void xfs_sb_version_add47bitext(struct xfs_sb *sbp)
 	sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_47BIT_DEXT_CNTR;
 }
 
+static inline bool xfs_sb_version_has32bitaext(struct xfs_sb *sbp)
+{
+	return XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
+		(sbp->sb_features_ro_compat &
+			XFS_SB_FEAT_RO_COMPAT_32BIT_AEXT_CNTR);
+}
+
+static inline void xfs_sb_version_add32bitaext(struct xfs_sb *sbp)
+{
+	sbp->sb_features_ro_compat |= XFS_SB_FEAT_RO_COMPAT_32BIT_AEXT_CNTR;
+}
+
 /*
  * end of superblock version macros
  */
@@ -888,7 +902,7 @@ typedef struct xfs_dinode {
 	__be64 di_nblocks; /* # of direct & btree blocks used */
 	__be32 di_extsize; /* basic/minimum extent size for file */
 	__be32 di_nextents_lo; /* number of extents in data fork */
-	__be16 di_anextents; /* number of extents in attribute fork*/
+	__be16 di_anextents_lo;/* lower part of xattr extent count */
 	__u8 di_forkoff; /* attr fork offs, <<3 for 64b align */
 	__s8 di_aformat; /* format of attr fork's data */
 	__be32 di_dmevmask; /* DMIG event mask */
@@ -906,7 +920,8 @@ typedef struct xfs_dinode {
 	__be64 di_flags2; /* more random flags */
 	__be32 di_cowextsize; /* basic cow extent size for file */
 	__be32 di_nextents_hi;
-	__u8 di_pad2[8]; /* more padding for future expansion */
+	__be16 di_anextents_hi;/* higher part of xattr extent count */
+	__u8 di_pad2[6]; /* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_timestamp_t di_crtime; /* time created */
@@ -1073,14 +1088,16 @@ static inline void xfs_dinode_put_rdev(struct xfs_dinode *dip, xfs_dev_t rdev)
 #define XFS_DIFLAG2_REFLINK_BIT 1 /* file's blocks may be shared */
 #define XFS_DIFLAG2_COWEXTSIZE_BIT 2 /* copy on write extent size hint */
 #define XFS_DIFLAG2_47BIT_NEXTENTS_BIT 3 /* Uses di_nextents_hi field */
+#define XFS_DIFLAG2_32BIT_ANEXTENTS_BIT 4 /* Uses di_anextents_hi field */
 #define XFS_DIFLAG2_DAX (1 << XFS_DIFLAG2_DAX_BIT)
 #define XFS_DIFLAG2_REFLINK (1 << XFS_DIFLAG2_REFLINK_BIT)
 #define XFS_DIFLAG2_COWEXTSIZE (1 << XFS_DIFLAG2_COWEXTSIZE_BIT)
 #define XFS_DIFLAG2_47BIT_NEXTENTS (1 << XFS_DIFLAG2_47BIT_NEXTENTS_BIT)
+#define XFS_DIFLAG2_32BIT_ANEXTENTS (1 << XFS_DIFLAG2_32BIT_ANEXTENTS_BIT)
 
 #define XFS_DIFLAG2_ANY \
 	(XFS_DIFLAG2_DAX | XFS_DIFLAG2_REFLINK | XFS_DIFLAG2_COWEXTSIZE | \
-	 XFS_DIFLAG2_47BIT_NEXTENTS)
+	 XFS_DIFLAG2_47BIT_NEXTENTS | XFS_DIFLAG2_32BIT_ANEXTENTS)
 
 /*
  * Inode number format:
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 8b89fe080f70..285cbce0cd10 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -309,7 +309,8 @@ xfs_inode_to_disk(
 	to->di_extsize = cpu_to_be32(from->di_extsize);
 	to->di_nextents_lo = cpu_to_be32(xfs_ifork_nextents(&ip->i_df) &
 					0xffffffffU);
-	to->di_anextents = cpu_to_be16(xfs_ifork_nextents(ip->i_afp));
+	to->di_anextents_lo = cpu_to_be16(xfs_ifork_nextents(ip->i_afp) &
+					0xffffU);
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat = xfs_ifork_format(ip->i_afp);
 	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
@@ -327,6 +328,10 @@ xfs_inode_to_disk(
 			to->di_nextents_hi
 				= cpu_to_be32(xfs_ifork_nextents(&ip->i_df)
 					>> 32);
+		if (from->di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+			to->di_anextents_hi
+				= cpu_to_be16(xfs_ifork_nextents(ip->i_afp)
+					>> 16);
 		to->di_ino = cpu_to_be64(ip->i_ino);
 		to->di_lsn = cpu_to_be64(lsn);
 		memset(to->di_pad2, 0, sizeof(to->di_pad2));
@@ -366,7 +371,7 @@ xfs_log_dinode_to_disk(
 	to->di_nblocks = cpu_to_be64(from->di_nblocks);
 	to->di_extsize = cpu_to_be32(from->di_extsize);
 	to->di_nextents_lo = cpu_to_be32(from->di_nextents_lo);
-	to->di_anextents = cpu_to_be16(from->di_anextents);
+	to->di_anextents_lo = cpu_to_be16(from->di_anextents_lo);
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat = from->di_aformat;
 	to->di_dmevmask = cpu_to_be32(from->di_dmevmask);
@@ -383,6 +388,9 @@ xfs_log_dinode_to_disk(
 		if (from->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS)
 			to->di_nextents_hi =
 				cpu_to_be32(from->di_nextents_hi);
+		if (from->di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+			to->di_anextents_hi =
+				cpu_to_be16(from->di_anextents_hi);
 		to->di_ino = cpu_to_be64(from->di_ino);
 		to->di_lsn = cpu_to_be64(from->di_lsn);
 		memcpy(to->di_pad2, from->di_pad2, sizeof(to->di_pad2));
@@ -566,7 +574,7 @@ xfs_dinode_verify(
 		default:
 			return __this_address;
 		}
-		if (dip->di_anextents)
+		if (xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK))
 			return __this_address;
 	}
 
@@ -745,8 +753,13 @@ xfs_dfork_nextents(
 			&& (dip->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS))
 			nextents |= (u64)(be32_to_cpu(dip->di_nextents_hi))
 				<< 32;
-		return nextents;
 	} else {
-		return be16_to_cpu(dip->di_anextents);
+		nextents = be16_to_cpu(dip->di_anextents_lo);
+		if (xfs_sb_version_has_v3inode(sbp)
+			&& (dip->di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS))
+			nextents |= (u32)(be16_to_cpu(dip->di_anextents_hi))
+				<< 16;
 	}
+
+	return nextents;
 }
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index ec682e2d5bcb..169e16947ece 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -301,7 +301,10 @@ xfs_iformat_attr_fork(
 	ip->i_afp->if_format = dip->di_aformat;
 	if (unlikely(ip->i_afp->if_format == 0)) /* pre IRIX 6.2 file system */
 		ip->i_afp->if_format = XFS_DINODE_FMT_EXTENTS;
-	ip->i_afp->if_nextents = be16_to_cpu(dip->di_anextents);
+	ip->i_afp->if_nextents = be16_to_cpu(dip->di_anextents_lo);
+	if (ip->i_d.di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+		ip->i_afp->if_nextents |=
+			(u32)(be16_to_cpu(dip->di_anextents_hi)) << 16;
 
 	switch (ip->i_afp->if_format) {
 	case XFS_DINODE_FMT_LOCAL:
@@ -777,6 +780,48 @@ xfs_next_set_data(
 	return 0;
 }
 
+static int
+xfs_next_set_attr(
+	struct xfs_trans *tp,
+	struct xfs_inode *ip,
+	struct xfs_ifork *ifp,
+	int delta)
+{
+	struct xfs_mount *mp = ip->i_mount;
+	xfs_aextnum_t nr_exts;
+
+	nr_exts = ifp->if_nextents + delta;
+
+	if ((delta > 0 && nr_exts < ifp->if_nextents) ||
+		(delta < 0 && nr_exts > ifp->if_nextents))
+		return -EOVERFLOW;
+
+	if (ifp->if_nextents <= MAXAEXTNUM15BIT &&
+		nr_exts > MAXAEXTNUM15BIT &&
+		!(ip->i_d.di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS) &&
+		xfs_sb_version_has_v3inode(&mp->m_sb)) {
+		if (!xfs_sb_version_has32bitaext(&mp->m_sb)) {
+			bool log_sb = false;
+
+			spin_lock(&mp->m_sb_lock);
+			if (!xfs_sb_version_has32bitaext(&mp->m_sb)) {
+				xfs_sb_version_add32bitaext(&mp->m_sb);
+				log_sb = true;
+			}
+			spin_unlock(&mp->m_sb_lock);
+
+			if (log_sb)
+				xfs_log_sb(tp);
+		}
+
+		ip->i_d.di_flags2 |= XFS_DIFLAG2_32BIT_ANEXTENTS;
+	}
+
+	ifp->if_nextents = nr_exts;
+
+	return 0;
+}
+
 int
 xfs_next_set(
 	struct xfs_trans *tp,
@@ -785,23 +830,16 @@ xfs_next_set(
 	int delta)
 {
 	struct xfs_ifork *ifp;
-	int64_t nr_exts;
 	int error = 0;
 
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 
-	if (whichfork == XFS_DATA_FORK || whichfork == XFS_COW_FORK) {
+	if (whichfork == XFS_DATA_FORK || whichfork == XFS_COW_FORK)
 		error = xfs_next_set_data(tp, ip, ifp, delta);
-	} else if (whichfork == XFS_ATTR_FORK) {
-		nr_exts = ifp->if_nextents + delta;
-		if ((delta > 0 && nr_exts > MAXAEXTNUM)
-			|| (delta < 0 && nr_exts < 0))
-			return -EOVERFLOW;
-
-		ifp->if_nextents = nr_exts;
-	} else {
+	else if (whichfork == XFS_ATTR_FORK)
+		error = xfs_next_set_attr(tp, ip, ifp, delta);
+	else
 		ASSERT(0);
-	}
 
 	return error;
 }
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 879aadff7692..db419fc862bc 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -397,7 +397,7 @@ struct xfs_log_dinode {
 	xfs_rfsblock_t di_nblocks; /* # of direct & btree blocks used */
 	xfs_extlen_t di_extsize; /* basic/minimum extent size for file */
 	uint32_t di_nextents_lo; /* number of extents in data fork */
-	xfs_aextnum_t di_anextents; /* number of extents in attribute fork*/
+	uint16_t di_anextents_lo;/* lower part of xattr extent count */
 	uint8_t di_forkoff; /* attr fork offs, <<3 for 64b align */
 	int8_t di_aformat; /* format of attr fork's data */
 	uint32_t di_dmevmask; /* DMIG event mask */
@@ -415,7 +415,8 @@ struct xfs_log_dinode {
 	uint64_t di_flags2; /* more random flags */
 	uint32_t di_cowextsize; /* basic cow extent size for file */
 	uint32_t di_nextents_hi;
-	uint8_t di_pad2[8]; /* more padding for future expansion */
+	uint16_t di_anextents_hi;/* higher part of xattr extent count */
+	uint8_t di_pad2[6]; /* more padding for future expansion */
 
 	/* fields only written to during inode creation */
 	xfs_ictimestamp_t di_crtime; /* time created */
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index c68ff2178976..974737a9e9c1 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -13,7 +13,7 @@ typedef uint32_t xfs_agino_t; /* inode # within allocation grp */
 typedef uint32_t xfs_extlen_t; /* extent length in blocks */
 typedef uint32_t xfs_agnumber_t; /* allocation group number */
 typedef uint64_t xfs_extnum_t; /* # of extents in a file */
-typedef int16_t xfs_aextnum_t; /* # extents in an attribute fork */
+typedef uint32_t xfs_aextnum_t; /* # extents in an attribute fork */
 typedef int64_t xfs_fsize_t; /* bytes in a file */
 typedef uint64_t xfs_ufsize_t; /* unsigned bytes in a file */
 
@@ -62,7 +62,8 @@ typedef void * xfs_failaddr_t;
 #define MAXEXTNUM31BIT ((xfs_extnum_t)0x7fffffff) /* 31 bits */
 #define MAXEXTNUM ((xfs_extnum_t)0x7fffffffffff) /* 47 bits */
 #define MAXDIREXTNUM ((xfs_extnum_t)0x7ffffff) /* 27 bits */
-#define MAXAEXTNUM ((xfs_aextnum_t)0x7fff) /* signed short */
+#define MAXAEXTNUM15BIT ((xfs_aextnum_t)0x7fff) /* 15 bits */
+#define MAXAEXTNUM ((xfs_aextnum_t)0xffffffff) /* 32 bits */
 
 /*
  * Minimum and maximum blocksize and sectorsize.
diff --git a/fs/xfs/scrub/inode.c b/fs/xfs/scrub/inode.c
index be41fd242ff2..01e60c78a3a3 100644
--- a/fs/xfs/scrub/inode.c
+++ b/fs/xfs/scrub/inode.c
@@ -371,10 +371,12 @@ xchk_dinode(
 		break;
 	}
 
+	nextents = xfs_dfork_nextents(&mp->m_sb, dip, XFS_ATTR_FORK);
+
 	/* di_forkoff */
 	if (XFS_DFORK_APTR(dip) >= (char *)dip + mp->m_sb.sb_inodesize)
 		xchk_ino_set_corrupt(sc, ino);
-	if (dip->di_anextents != 0 && dip->di_forkoff == 0)
+	if (nextents != 0 && dip->di_forkoff == 0)
 		xchk_ino_set_corrupt(sc, ino);
 	if (dip->di_forkoff == 0 && dip->di_aformat != XFS_DINODE_FMT_EXTENTS)
 		xchk_ino_set_corrupt(sc, ino);
@@ -386,7 +388,6 @@ xchk_dinode(
 		xchk_ino_set_corrupt(sc, ino);
 
 	/* di_anextents */
-	nextents = be16_to_cpu(dip->di_anextents);
 	fork_recs = XFS_DFORK_ASIZE(dip, mp) / sizeof(struct xfs_bmbt_rec);
 	switch (dip->di_aformat) {
 	case XFS_DINODE_FMT_EXTENTS:
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 4418a66cf6d6..6ec34e069344 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3789,6 +3789,10 @@ xfs_iflush_int(
 		&& xfs_sb_version_has47bitext(&mp->m_sb))
 		ip->i_d.di_flags2 |= XFS_DIFLAG2_47BIT_NEXTENTS;
 
+	if (!(ip->i_d.di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+		&& xfs_sb_version_has32bitaext(&mp->m_sb))
+		ip->i_d.di_flags2 |= XFS_DIFLAG2_32BIT_ANEXTENTS;
+
 	/*
 	 * Copy the dirty parts of the inode into the on-disk inode. We always
 	 * copy out the core of the inode, because if the inode is dirty at all
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 6f27ac7c8631..40f0a19d1c07 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -327,7 +327,7 @@ xfs_inode_to_log_dinode(
 	to->di_nblocks = from->di_nblocks;
 	to->di_extsize = from->di_extsize;
 	to->di_nextents_lo = xfs_ifork_nextents(&ip->i_df) & 0xffffffffU;
-	to->di_anextents = xfs_ifork_nextents(ip->i_afp);
+	to->di_anextents_lo = xfs_ifork_nextents(ip->i_afp) & 0xffffU;
 	to->di_forkoff = from->di_forkoff;
 	to->di_aformat = xfs_ifork_format(ip->i_afp);
 	to->di_dmevmask = from->di_dmevmask;
@@ -347,6 +347,9 @@ xfs_inode_to_log_dinode(
 		if (from->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS)
 			to->di_nextents_hi =
 				xfs_ifork_nextents(&ip->i_df) >> 32;
+		if (from->di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+			to->di_anextents_hi =
+				xfs_ifork_nextents(ip->i_afp) >> 16;
 		to->di_ino = ip->i_ino;
 		to->di_lsn = lsn;
 		memset(to->di_pad2, 0, sizeof(to->di_pad2));
diff --git a/fs/xfs/xfs_inode_item_recover.c b/fs/xfs/xfs_inode_item_recover.c
index 8d64b861fb66..c8b5fbba848b 100644
--- a/fs/xfs/xfs_inode_item_recover.c
+++ b/fs/xfs/xfs_inode_item_recover.c
@@ -135,6 +135,7 @@ xlog_recover_inode_commit_pass2(
 	uint isize;
 	int need_free = 0;
 	xfs_extnum_t nextents;
+	xfs_aextnum_t anextents;
 
 	if (item->ri_buf[0].i_len == sizeof(struct xfs_inode_log_format)) {
 		in_f = item->ri_buf[0].i_addr;
@@ -262,7 +263,12 @@ xlog_recover_inode_commit_pass2(
 		ldip->di_flags2 & XFS_DIFLAG2_47BIT_NEXTENTS)
 		nextents |= ((u64)(ldip->di_nextents_hi) << 32);
 
-	nextents += ldip->di_anextents;
+	anextents = ldip->di_anextents_lo;
+	if (xfs_sb_version_has_v3inode(&mp->m_sb) &&
+		ldip->di_flags2 & XFS_DIFLAG2_32BIT_ANEXTENTS)
+		anextents |= ((u32)(ldip->di_anextents_hi) << 16);
+
+	nextents += anextents;
 	if (unlikely(nextents > ldip->di_nblocks)) {
 		XFS_CORRUPTION_ERROR("xlog_recover_inode_pass2(5)",