From patchwork Tue Sep 29 17:43:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11806425 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B2D2618 for ; Tue, 29 Sep 2020 17:44:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D7F4E208B8 for ; Tue, 29 Sep 2020 17:44:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="FsnnoQiK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729389AbgI2RoC (ORCPT ); Tue, 29 Sep 2020 13:44:02 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:55050 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728897AbgI2RoC (ORCPT ); Tue, 29 Sep 2020 13:44:02 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THeMOd061464; Tue, 29 Sep 2020 17:43:56 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2020-01-29; bh=mAZikhXhPaOuZwHxX1b0xxaKKiPkttpGNFfMm3piz8A=; b=FsnnoQiKigV0I4zIJMwiio4Ouo7VSavJI8xcrX7OrIYqfEhrcyV5ZlNFkZyQba+EoxAR dZju9LBMT7/TGk9gmzFn3tq8wOCUD65xuXMgQ3dd7A1zTM2yqFWB/zCE8gYAkA+sFhOw lomPK71BbDo1hhYvy6nrN7Rhu14+yD8KaOrEUlCfkIztkTVV5Eef23l5+y1KCkGsrPhJ p4FZvkax9YPMFoKX7x5q9G8gWPym0BTnedSeXtB5daLhq3vD7luvuq/cGOgdReBJMzbm E/STHzZitEm/mbnL9Q1sdZIvkFmCMhRbExXsNHAOcGWb+zCLA/89yJanKuTDz1Cthld0 vw== Received: from aserp3030.oracle.com (aserp3030.oracle.com [141.146.126.71]) by aserp2130.oracle.com with ESMTP id 33su5avcb9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 29 Sep 2020 17:43:55 +0000 Received: from pps.filterd (aserp3030.oracle.com [127.0.0.1]) by aserp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THer97146942; Tue, 29 Sep 2020 17:43:55 GMT Received: from aserv0121.oracle.com (aserv0121.oracle.com [141.146.126.235]) by aserp3030.oracle.com with ESMTP id 33uv2e7vy9-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 29 Sep 2020 17:43:55 +0000 Received: from abhmp0007.oracle.com (abhmp0007.oracle.com [141.146.116.13]) by aserv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 08THht5o028120; Tue, 29 Sep 2020 17:43:55 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 29 Sep 2020 10:43:54 -0700 Subject: [PATCH 1/3] xfs: clean up bmap intent item recovery checking From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: Dave Chinner , Christoph Hellwig , linux-xfs@vger.kernel.org, david@fromorbit.com, hch@lst.de Date: Tue, 29 Sep 2020 10:43:53 -0700 Message-ID: <160140143350.830434.10160569591329741062.stgit@magnolia> In-Reply-To: <160140142711.830434.5161910313856677767.stgit@magnolia> References: <160140142711.830434.5161910313856677767.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 phishscore=0 spamscore=0 adultscore=0 suspectscore=1 malwarescore=0 bulkscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=1 lowpriorityscore=0 spamscore=0 clxscore=1015 mlxscore=0 impostorscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong The bmap intent item checking code in xfs_bui_item_recover is spread all over the function. We should check the recovered log item at the top before we allocate any resources or do anything else, so do that. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_bmap_item.c | 38 ++++++++++++-------------------------- 1 file changed, 12 insertions(+), 26 deletions(-) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 126df48dae5f..c1f2cc3c42cb 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -437,8 +437,6 @@ xfs_bui_item_recover( xfs_fsblock_t inode_fsb; xfs_filblks_t count; xfs_exntst_t state; - enum xfs_bmap_intent_type type; - bool op_ok; unsigned int bui_type; int whichfork; int error = 0; @@ -456,16 +454,19 @@ xfs_bui_item_recover( XFS_FSB_TO_DADDR(mp, bmap->me_startblock)); inode_fsb = XFS_BB_TO_FSB(mp, XFS_FSB_TO_DADDR(mp, XFS_INO_TO_FSB(mp, bmap->me_owner))); - switch (bmap->me_flags & XFS_BMAP_EXTENT_TYPE_MASK) { + state = (bmap->me_flags & XFS_BMAP_EXTENT_UNWRITTEN) ? + XFS_EXT_UNWRITTEN : XFS_EXT_NORM; + whichfork = (bmap->me_flags & XFS_BMAP_EXTENT_ATTR_FORK) ? + XFS_ATTR_FORK : XFS_DATA_FORK; + bui_type = bmap->me_flags & XFS_BMAP_EXTENT_TYPE_MASK; + switch (bui_type) { case XFS_BMAP_MAP: case XFS_BMAP_UNMAP: - op_ok = true; break; default: - op_ok = false; - break; + return -EFSCORRUPTED; } - if (!op_ok || startblock_fsb == 0 || + if (startblock_fsb == 0 || bmap->me_len == 0 || inode_fsb == 0 || startblock_fsb >= mp->m_sb.sb_dblocks || @@ -493,32 +494,17 @@ xfs_bui_item_recover( if (VFS_I(ip)->i_nlink == 0) xfs_iflags_set(ip, XFS_IRECOVERY); - /* Process deferred bmap item. */ - state = (bmap->me_flags & XFS_BMAP_EXTENT_UNWRITTEN) ? - XFS_EXT_UNWRITTEN : XFS_EXT_NORM; - whichfork = (bmap->me_flags & XFS_BMAP_EXTENT_ATTR_FORK) ? - XFS_ATTR_FORK : XFS_DATA_FORK; - bui_type = bmap->me_flags & XFS_BMAP_EXTENT_TYPE_MASK; - switch (bui_type) { - case XFS_BMAP_MAP: - case XFS_BMAP_UNMAP: - type = bui_type; - break; - default: - XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, mp); - error = -EFSCORRUPTED; - goto err_inode; - } xfs_trans_ijoin(tp, ip, 0); count = bmap->me_len; - error = xfs_trans_log_finish_bmap_update(tp, budp, type, ip, whichfork, - bmap->me_startoff, bmap->me_startblock, &count, state); + error = xfs_trans_log_finish_bmap_update(tp, budp, bui_type, ip, + whichfork, bmap->me_startoff, bmap->me_startblock, + &count, state); if (error) goto err_inode; if (count > 0) { - ASSERT(type == XFS_BMAP_UNMAP); + ASSERT(bui_type == XFS_BMAP_UNMAP); irec.br_startblock = bmap->me_startblock; irec.br_blockcount = count; irec.br_startoff = bmap->me_startoff; From patchwork Tue Sep 29 17:44:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11806427 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6A82618 for ; Tue, 29 Sep 2020 17:44:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9E39720C09 for ; Tue, 29 Sep 2020 17:44:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="eJ+VoSfQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729917AbgI2RoO (ORCPT ); Tue, 29 Sep 2020 13:44:14 -0400 Received: from userp2120.oracle.com ([156.151.31.85]:38210 "EHLO userp2120.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728897AbgI2RoK (ORCPT ); Tue, 29 Sep 2020 13:44:10 -0400 Received: from pps.filterd (userp2120.oracle.com [127.0.0.1]) by userp2120.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THd87d189333; Tue, 29 Sep 2020 17:44:03 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2020-01-29; bh=DtOJSOkqFqIzvOrcxT0fSw21gmD6RIP/2aEdMkyI/N8=; b=eJ+VoSfQHL3Cf3mBqiBSMDbtvIVPoRwHOZlPcR9PBrl2XvGjfO4aXWPAQK/2/lF3DVxU 3JLOODDBc29jvrEdZBIROolWafiV0YqflOmKx66vOSMR+LQ0ri/6y0XzGtRnwOWryGCY WCX4UE2X2ehmgbUcWFxc1J3FVCOBsIya9vEH0kJ26PkwmqbRpuGnDCsbNiIFIHb2E2OH NVIsEYrrm5HPgDqRXOIOaKdfGgYZdduY3FDZJ4PPzur9NYlIIb3eT1+HTpg5SRBuy7wF vP6hea3dus952N0ElY2Z1exnNBY90f1dZgGU3zHO0QoVVGtwAV+vdzqAwzLDG7a0XcJT xw== Received: from userp3030.oracle.com (userp3030.oracle.com [156.151.31.80]) by userp2120.oracle.com with ESMTP id 33sx9n47t7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 29 Sep 2020 17:44:03 +0000 Received: from pps.filterd (userp3030.oracle.com [127.0.0.1]) by userp3030.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THdko4099138; Tue, 29 Sep 2020 17:44:02 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userp3030.oracle.com with ESMTP id 33tfjx5xdq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 29 Sep 2020 17:44:02 +0000 Received: from abhmp0004.oracle.com (abhmp0004.oracle.com [141.146.116.10]) by aserv0122.oracle.com (8.14.4/8.14.4) with ESMTP id 08THi1Vk018965; Tue, 29 Sep 2020 17:44:01 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 29 Sep 2020 10:44:01 -0700 Subject: [PATCH 2/3] xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: Dave Chinner , Christoph Hellwig , linux-xfs@vger.kernel.org, david@fromorbit.com, hch@lst.de Date: Tue, 29 Sep 2020 10:44:00 -0700 Message-ID: <160140144017.830434.9012644788797432565.stgit@magnolia> In-Reply-To: <160140142711.830434.5161910313856677767.stgit@magnolia> References: <160140142711.830434.5161910313856677767.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 malwarescore=0 spamscore=0 mlxscore=0 phishscore=0 adultscore=0 bulkscore=0 mlxlogscore=999 suspectscore=1 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=1 phishscore=0 mlxscore=0 lowpriorityscore=0 adultscore=0 clxscore=1015 spamscore=0 impostorscore=0 malwarescore=0 bulkscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong In most places in XFS, we have a specific order in which we gather resources: grab the inode, allocate a transaction, then lock the inode. xfs_bui_item_recover doesn't do it in that order, so fix it to be more consistent. This also makes the error bailout code a bit less weird. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_bmap_item.c | 42 ++++++++++++++++++++++-------------------- 1 file changed, 22 insertions(+), 20 deletions(-) diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index c1f2cc3c42cb..1c9cb5a04bb5 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -475,25 +475,26 @@ xfs_bui_item_recover( (bmap->me_flags & ~XFS_BMAP_EXTENT_FLAGS)) return -EFSCORRUPTED; - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, - XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), 0, 0, &tp); - if (error) - return error; - - budp = xfs_trans_get_bud(tp, buip); - /* Grab the inode. */ - error = xfs_iget(mp, tp, bmap->me_owner, 0, XFS_ILOCK_EXCL, &ip); + error = xfs_iget(mp, NULL, bmap->me_owner, 0, 0, &ip); if (error) - goto err_inode; + return error; - error = xfs_qm_dqattach_locked(ip, false); + error = xfs_qm_dqattach(ip); if (error) - goto err_inode; + goto err_rele; if (VFS_I(ip)->i_nlink == 0) xfs_iflags_set(ip, XFS_IRECOVERY); + /* Allocate transaction and do the work. */ + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, + XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK), 0, 0, &tp); + if (error) + goto err_rele; + + budp = xfs_trans_get_bud(tp, buip); + xfs_ilock(ip, XFS_ILOCK_EXCL); xfs_trans_ijoin(tp, ip, 0); count = bmap->me_len; @@ -501,7 +502,7 @@ xfs_bui_item_recover( whichfork, bmap->me_startoff, bmap->me_startblock, &count, state); if (error) - goto err_inode; + goto err_cancel; if (count > 0) { ASSERT(bui_type == XFS_BMAP_UNMAP); @@ -512,18 +513,19 @@ xfs_bui_item_recover( xfs_bmap_unmap_extent(tp, ip, &irec); } + /* Commit transaction, which frees tp. */ error = xfs_defer_ops_capture_and_commit(tp, capture_list); + if (error) + goto err_unlock; + return 0; + +err_cancel: + xfs_trans_cancel(tp); +err_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); +err_rele: xfs_irele(ip); return error; - -err_inode: - xfs_trans_cancel(tp); - if (ip) { - xfs_iunlock(ip, XFS_ILOCK_EXCL); - xfs_irele(ip); - } - return error; } STATIC bool From patchwork Tue Sep 29 17:44:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11806429 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4006D6CB for ; Tue, 29 Sep 2020 17:44:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 152412076B for ; Tue, 29 Sep 2020 17:44:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="inYOirPf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729933AbgI2RoO (ORCPT ); Tue, 29 Sep 2020 13:44:14 -0400 Received: from aserp2130.oracle.com ([141.146.126.79]:55240 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729909AbgI2RoN (ORCPT ); Tue, 29 Sep 2020 13:44:13 -0400 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THdwOC060962; Tue, 29 Sep 2020 17:44:09 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : date : message-id : in-reply-to : references : mime-version : content-type : content-transfer-encoding; s=corp-2020-01-29; bh=o1PS46rSrhHYEK2QVll4+dguVd7newe8cRyNaEPujKo=; b=inYOirPfmph77qq62PRnu6JOKhqjAcrrSkZqquhwo83HNjKXlSsbyemcEtbQ9Gllwu24 afks0yDb5Q0F6EPu+BQTxMWlmBbz9TgL/tExJizWZvyuhaqAZGLSMnBn+WTKyqlB6xBP dUhNc7+5NuUYmkwZS1uriwa8IYZ8JHDoeL6sSqmcb50wLo1PxEzWbDuszHILpgwLGiKr iI7/c9o/YnjKmhDikIH5JYFWkBJ3AIguTFKlj/P2wE85siUp1Oc1rbmxLiLp1c36VWZ5 Dh6piKwxP0CfSurk8udDvdAR2M/mhi7ithi4ojwgPNP9RBbbkzDsqNCB3Zj2s27BPD4+ QA== Received: from aserp3020.oracle.com (aserp3020.oracle.com [141.146.126.70]) by aserp2130.oracle.com with ESMTP id 33su5avcd5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 29 Sep 2020 17:44:09 +0000 Received: from pps.filterd (aserp3020.oracle.com [127.0.0.1]) by aserp3020.oracle.com (8.16.0.42/8.16.0.42) with SMTP id 08THeR05167305; Tue, 29 Sep 2020 17:44:09 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserp3020.oracle.com with ESMTP id 33tfhxyxau-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 29 Sep 2020 17:44:09 +0000 Received: from abhmp0002.oracle.com (abhmp0002.oracle.com [141.146.116.8]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id 08THi8vS017901; Tue, 29 Sep 2020 17:44:08 GMT Received: from localhost (/67.169.218.210) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 29 Sep 2020 10:44:07 -0700 Subject: [PATCH 3/3] xfs: fix an incore inode UAF in xfs_bui_recover From: "Darrick J. Wong" To: darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, david@fromorbit.com, hch@lst.de Date: Tue, 29 Sep 2020 10:44:06 -0700 Message-ID: <160140144660.830434.10498291551366134327.stgit@magnolia> In-Reply-To: <160140142711.830434.5161910313856677767.stgit@magnolia> References: <160140142711.830434.5161910313856677767.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 mlxlogscore=999 bulkscore=0 phishscore=0 malwarescore=0 adultscore=0 suspectscore=3 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 X-Proofpoint-Virus-Version: vendor=nai engine=6000 definitions=9759 signatures=668680 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 mlxlogscore=999 suspectscore=3 lowpriorityscore=0 spamscore=0 clxscore=1015 mlxscore=0 impostorscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2009290149 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong In xfs_bui_item_recover, there exists a use-after-free bug with regards to the inode that is involved in the bmap replay operation. If the mapping operation does not complete, we call xfs_bmap_unmap_extent to create a deferred op to finish the unmapping work, and we retain a pointer to the incore inode. Unfortunately, the very next thing we do is commit the transaction and drop the inode. If reclaim tears down the inode before we try to finish the defer ops, we dereference garbage and blow up. Therefore, create a way to join inodes to the defer ops freezer so that we can maintain the xfs_inode reference until we're done with the inode. Note: This imposes the requirement that there be enough memory to keep every incore inode in memory throughout recovery. Signed-off-by: Darrick J. Wong --- fs/xfs/libxfs/xfs_defer.c | 55 ++++++++++++++++++++++++++++++++++++++------ fs/xfs/libxfs/xfs_defer.h | 11 +++++++-- fs/xfs/xfs_bmap_item.c | 8 ++---- fs/xfs/xfs_extfree_item.c | 2 +- fs/xfs/xfs_log_recover.c | 7 +++++- fs/xfs/xfs_refcount_item.c | 2 +- fs/xfs/xfs_rmap_item.c | 2 +- 7 files changed, 67 insertions(+), 20 deletions(-) diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c index 4caaf5527403..c466a3177acc 100644 --- a/fs/xfs/libxfs/xfs_defer.c +++ b/fs/xfs/libxfs/xfs_defer.c @@ -16,6 +16,7 @@ #include "xfs_inode.h" #include "xfs_inode_item.h" #include "xfs_trace.h" +#include "xfs_icache.h" /* * Deferred Operations in XFS @@ -553,10 +554,14 @@ xfs_defer_move( * deferred ops state is transferred to the capture structure and the * transaction is then ready for the caller to commit it. If there are no * intent items to capture, this function returns NULL. + * + * If inodes are passed in and this function returns a capture structure, the + * inodes are now owned by the capture structure. */ static struct xfs_defer_capture * xfs_defer_ops_capture( - struct xfs_trans *tp) + struct xfs_trans *tp, + struct xfs_inode *ip) { struct xfs_defer_capture *dfc; @@ -588,6 +593,12 @@ xfs_defer_ops_capture( dfc->dfc_tres.tr_logcount = 1; dfc->dfc_tres.tr_logflags = XFS_TRANS_PERM_LOG_RES; + /* + * Transfer responsibility for unlocking and releasing the inodes to + * the capture structure. + */ + dfc->dfc_ip = ip; + return dfc; } @@ -598,29 +609,49 @@ xfs_defer_ops_release( struct xfs_defer_capture *dfc) { xfs_defer_cancel_list(mp, &dfc->dfc_dfops); + if (dfc->dfc_ip) + xfs_irele(dfc->dfc_ip); kmem_free(dfc); } /* * Capture any deferred ops and commit the transaction. This is the last step - * needed to finish a log intent item that we recovered from the log. + * needed to finish a log intent item that we recovered from the log. If any + * of the deferred ops operate on an inode, the caller must pass in that inode + * so that the reference can be transferred to the capture structure. The + * caller must hold ILOCK_EXCL on the inode, and must not touch the inode after + * this call returns. */ int xfs_defer_ops_capture_and_commit( struct xfs_trans *tp, + struct xfs_inode *ip, struct list_head *capture_list) { struct xfs_mount *mp = tp->t_mountp; struct xfs_defer_capture *dfc; int error; + ASSERT(ip == NULL || xfs_isilocked(ip, XFS_ILOCK_EXCL)); + /* If we don't capture anything, commit transaction and exit. */ - dfc = xfs_defer_ops_capture(tp); - if (!dfc) - return xfs_trans_commit(tp); + dfc = xfs_defer_ops_capture(tp, ip); + if (!dfc) { + error = xfs_trans_commit(tp); + if (ip) { + xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_irele(ip); + } + return error; + } - /* Commit the transaction and add the capture structure to the list. */ + /* + * Commit the transaction and add the capture structure to the list. + * Once that's done, we can unlock the inode. + */ error = xfs_trans_commit(tp); + if (ip) + xfs_iunlock(ip, XFS_ILOCK_EXCL); if (error) { xfs_defer_ops_release(mp, dfc); return error; @@ -632,16 +663,24 @@ xfs_defer_ops_capture_and_commit( /* * Attach a chain of captured deferred ops to a new transaction and free the - * capture structure. + * capture structure. A captured inode will be passed back to the caller. */ void xfs_defer_ops_continue( struct xfs_defer_capture *dfc, - struct xfs_trans *tp) + struct xfs_trans *tp, + struct xfs_inode **ipp) { ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES); ASSERT(!(tp->t_flags & XFS_TRANS_DIRTY)); + /* Lock and join the captured inode to the new transaction. */ + if (dfc->dfc_ip) { + xfs_ilock(dfc->dfc_ip, XFS_ILOCK_EXCL); + xfs_trans_ijoin(tp, dfc->dfc_ip, 0); + } + *ipp = dfc->dfc_ip; + /* Move captured dfops chain and state to the transaction. */ list_splice_init(&dfc->dfc_dfops, &tp->t_dfops); tp->t_flags |= dfc->dfc_tpflags; diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h index c447c79bbe74..3aaf702d4445 100644 --- a/fs/xfs/libxfs/xfs_defer.h +++ b/fs/xfs/libxfs/xfs_defer.h @@ -77,15 +77,22 @@ struct xfs_defer_capture { unsigned int dfc_tpflags; unsigned int dfc_blkres; struct xfs_trans_res dfc_tres; + + /* + * An inode reference that must be maintained to complete the deferred + * work. + */ + struct xfs_inode *dfc_ip; }; /* * Functions to capture a chain of deferred operations and continue them later. * This doesn't normally happen except log recovery. */ -int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp, +int xfs_defer_ops_capture_and_commit(struct xfs_trans *tp, struct xfs_inode *ip, struct list_head *capture_list); -void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp); +void xfs_defer_ops_continue(struct xfs_defer_capture *d, struct xfs_trans *tp, + struct xfs_inode **ipp); void xfs_defer_ops_release(struct xfs_mount *mp, struct xfs_defer_capture *d); #endif /* __XFS_DEFER_H__ */ diff --git a/fs/xfs/xfs_bmap_item.c b/fs/xfs/xfs_bmap_item.c index 1c9cb5a04bb5..0ffbc75bafe1 100644 --- a/fs/xfs/xfs_bmap_item.c +++ b/fs/xfs/xfs_bmap_item.c @@ -513,15 +513,11 @@ xfs_bui_item_recover( xfs_bmap_unmap_extent(tp, ip, &irec); } - /* Commit transaction, which frees tp. */ - error = xfs_defer_ops_capture_and_commit(tp, capture_list); - if (error) - goto err_unlock; - return 0; + /* Commit transaction, which frees the transaction and the inode. */ + return xfs_defer_ops_capture_and_commit(tp, ip, capture_list); err_cancel: xfs_trans_cancel(tp); -err_unlock: xfs_iunlock(ip, XFS_ILOCK_EXCL); err_rele: xfs_irele(ip); diff --git a/fs/xfs/xfs_extfree_item.c b/fs/xfs/xfs_extfree_item.c index 17d36fe5cfd0..3920542f5736 100644 --- a/fs/xfs/xfs_extfree_item.c +++ b/fs/xfs/xfs_extfree_item.c @@ -627,7 +627,7 @@ xfs_efi_item_recover( } - return xfs_defer_ops_capture_and_commit(tp, capture_list); + return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list); abort_error: xfs_trans_cancel(tp); diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 46e750279634..90ad2bfdfa48 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2439,6 +2439,7 @@ xlog_finish_defer_ops( { struct xfs_defer_capture *dfc, *next; struct xfs_trans *tp; + struct xfs_inode *ip; int error = 0; list_for_each_entry_safe(dfc, next, capture_list, dfc_list) { @@ -2449,9 +2450,13 @@ xlog_finish_defer_ops( /* Transfer all collected dfops to this transaction. */ list_del_init(&dfc->dfc_list); - xfs_defer_ops_continue(dfc, tp); + xfs_defer_ops_continue(dfc, tp, &ip); error = xfs_trans_commit(tp); + if (ip) { + xfs_iunlock(ip, XFS_ILOCK_EXCL); + xfs_irele(ip); + } if (error) return error; } diff --git a/fs/xfs/xfs_refcount_item.c b/fs/xfs/xfs_refcount_item.c index 0478374add64..ad895b48f365 100644 --- a/fs/xfs/xfs_refcount_item.c +++ b/fs/xfs/xfs_refcount_item.c @@ -544,7 +544,7 @@ xfs_cui_item_recover( } xfs_refcount_finish_one_cleanup(tp, rcur, error); - return xfs_defer_ops_capture_and_commit(tp, capture_list); + return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list); abort_error: xfs_refcount_finish_one_cleanup(tp, rcur, error); diff --git a/fs/xfs/xfs_rmap_item.c b/fs/xfs/xfs_rmap_item.c index 0d8fa707f079..1163f32c3e62 100644 --- a/fs/xfs/xfs_rmap_item.c +++ b/fs/xfs/xfs_rmap_item.c @@ -567,7 +567,7 @@ xfs_rui_item_recover( } xfs_rmap_finish_one_cleanup(tp, rcur, error); - return xfs_defer_ops_capture_and_commit(tp, capture_list); + return xfs_defer_ops_capture_and_commit(tp, NULL, capture_list); abort_error: xfs_rmap_finish_one_cleanup(tp, rcur, error);