From patchwork Tue Oct 24 14:53:39 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434703
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:39 +0200
Subject: [PATCH RFC 1/6] fs: simplify setup_bdev_super() calls
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-1-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615
X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624

There's no need to drop s_umount anymore now that we removed all sources
where s_umount is taken beneath open_mutex or bd_holder_lock.

Signed-off-by: Christian Brauner
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
---
 fs/super.c | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/fs/super.c b/fs/super.c
index b26b302f870d..4edde92d5e8f 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1613,15 +1613,7 @@ int get_tree_bdev(struct fs_context *fc,
 			return -EBUSY;
 		}
 	} else {
-		/*
-		 * We drop s_umount here because we need to open the bdev and
-		 * bdev->open_mutex ranks above s_umount (blkdev_put() ->
-		 * bdev_mark_dead()). It is safe because we have active sb
-		 * reference and SB_BORN is not set yet.
-		 */
-		super_unlock_excl(s);
 		error = setup_bdev_super(s, fc->sb_flags, fc);
-		__super_lock_excl(s);
 		if (!error)
 			error = fill_super(s, fc);
 		if (error) {
@@ -1665,15 +1657,7 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
 			return ERR_PTR(-EBUSY);
 		}
 	} else {
-		/*
-		 * We drop s_umount here because we need to open the bdev and
-		 * bdev->open_mutex ranks above s_umount (blkdev_put() ->
-		 * bdev_mark_dead()). It is safe because we have active sb
-		 * reference and SB_BORN is not set yet.
-		 */
-		super_unlock_excl(s);
 		error = setup_bdev_super(s, flags, NULL);
-		__super_lock_excl(s);
 		if (!error)
 			error = fill_super(s, data, flags & SB_SILENT ? 1 : 0);
 		if (error) {

From patchwork Tue Oct 24 14:53:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434704
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:40 +0200
Subject: [PATCH RFC 2/6] xfs: simplify device handling
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-2-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615

We removed all codepaths where s_umount is taken beneath open_mutex and
bd_holder_lock so don't make things more complicated than they need to
be and hold s_umount over block device opening.

Signed-off-by: Christian Brauner
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_super.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index f0ae07828153..84107d162e41 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -437,19 +437,13 @@ xfs_open_devices(
 	struct bdev_handle	*logdev_handle = NULL, *rtdev_handle = NULL;
 	int			error;
 
-	/*
-	 * blkdev_put() can't be called under s_umount, see the comment
-	 * in get_tree_bdev() for more details
-	 */
-	up_write(&sb->s_umount);
-
 	/*
 	 * Open real time and log devices - order is important.
 	 */
 	if (mp->m_logname) {
 		error = xfs_blkdev_get(mp, mp->m_logname, &logdev_handle);
 		if (error)
-			goto out_relock;
+			return error;
 	}
 
 	if (mp->m_rtname) {
@@ -492,10 +486,7 @@ xfs_open_devices(
 			bdev_release(logdev_handle);
 	}
 
-	error = 0;
-out_relock:
-	down_write(&sb->s_umount);
-	return error;
+	return 0;
 
 out_free_rtdev_targ:
 	if (mp->m_rtdev_targp)
@@ -508,7 +499,7 @@ xfs_open_devices(
 out_close_logdev:
 	if (logdev_handle)
 		bdev_release(logdev_handle);
-	goto out_relock;
+	return error;
 }
 
 /*
@@ -758,10 +749,6 @@ static void
 xfs_mount_free(
 	struct xfs_mount	*mp)
 {
-	/*
-	 * Free the buftargs here because blkdev_put needs to be called outside
-	 * of sb->s_umount, which is held around the call to ->put_super.
-	 */
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
 		xfs_free_buftarg(mp->m_logdev_targp);
 	if (mp->m_rtdev_targp)

From patchwork Tue Oct 24 14:53:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434705
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:41 +0200
Subject: [PATCH RFC 3/6] ext4: simplify device handling
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-3-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615

We removed all codepaths where s_umount is taken beneath open_mutex and
bd_holder_lock so don't make things more complicated than they need to
be and hold s_umount over block device opening.

Signed-off-by: Christian Brauner
Reviewed-by: Jan Kara
Reviewed-by: Christoph Hellwig
---
 fs/ext4/super.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d43f8324242a..e94df97ea440 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5855,11 +5855,8 @@ static struct bdev_handle *ext4_get_journal_blkdev(struct super_block *sb,
 	struct ext4_super_block *es;
 	int errno;
 
-	/* see get_tree_bdev why this is needed and safe */
-	up_write(&sb->s_umount);
 	bdev_handle = bdev_open_by_dev(j_dev, BLK_OPEN_READ | BLK_OPEN_WRITE,
 				       sb, &fs_holder_ops);
-	down_write(&sb->s_umount);
 	if (IS_ERR(bdev_handle)) {
 		ext4_msg(sb, KERN_ERR,
 			 "failed to open journal device unknown-block(%u,%u) %ld",

From patchwork Tue Oct 24 14:53:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434706
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:42 +0200
Subject: [PATCH RFC 4/6] bdev: simplify waiting for concurrent claimers
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-4-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615

Simplify the mechanism to wait for concurrent block device claimers and
make it possible to introduce an additional state in the following
patches.
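
For readers who want to poke at the scheme outside the kernel, here is a
minimal userspace sketch of the wait-and-retry logic. It is only a model
under stated assumptions: a pthread mutex stands in for bdev_lock, a
condition variable stands in for wait_var_event()/wake_up_var(), the
whole-device/partition distinction is ignored, and every model_* name is
hypothetical and not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Same two states as the enum added by this patch. */
enum bd_claim {
	BD_CLAIM_DEFAULT = 0,
	BD_CLAIM_ACQUIRE = 1,
};

/* Hypothetical stand-in for the fields of struct block_device used here. */
struct model_bdev {
	pthread_mutex_t lock;      /* plays the role of bdev_lock */
	pthread_cond_t claimable;  /* plays the role of the bd_claim wait var */
	enum bd_claim claim;
	void *holder;              /* current exclusive holder, NULL if none */
};

static void model_init(struct model_bdev *b)
{
	pthread_mutex_init(&b->lock, NULL);
	pthread_cond_init(&b->claimable, NULL);
	b->claim = BD_CLAIM_DEFAULT;
	b->holder = NULL;
}

/* Returns 0 once the caller owns the claim, -EBUSY if someone else holds. */
static int model_prepare_to_claim(struct model_bdev *b, void *holder)
{
	pthread_mutex_lock(&b->lock);
	for (;;) {
		if (b->holder && b->holder != holder) {
			pthread_mutex_unlock(&b->lock);
			return -EBUSY;
		}
		if (b->claim == BD_CLAIM_DEFAULT)
			break;
		/* A claim is in flight: sleep until cleared, then recheck. */
		pthread_cond_wait(&b->claimable, &b->lock);
	}
	b->claim = BD_CLAIM_ACQUIRE;
	pthread_mutex_unlock(&b->lock);
	return 0;
}

/* Caller must hold b->lock. Resets the state and wakes all waiters. */
static void model_clear_claiming(struct model_bdev *b)
{
	assert(b->claim == BD_CLAIM_ACQUIRE);
	b->claim = BD_CLAIM_DEFAULT;
	pthread_cond_broadcast(&b->claimable);
}

static void model_finish_claiming(struct model_bdev *b, void *holder)
{
	pthread_mutex_lock(&b->lock);
	b->holder = holder;
	model_clear_claiming(b);
	pthread_mutex_unlock(&b->lock);
}

static void model_abort_claiming(struct model_bdev *b)
{
	pthread_mutex_lock(&b->lock);
	model_clear_claiming(b);
	pthread_mutex_unlock(&b->lock);
}
```

A caller would pair model_prepare_to_claim() with either
model_finish_claiming() or model_abort_claiming(). Because waiters only
test for "state is back to default", adding a further state later needs
no change on the waiting side, which is the property the next patch
relies on.
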

Signed-off-by: Christian Brauner
---
 block/bdev.c              | 34 ++++++++++++++++++----------------
 include/linux/blk_types.h |  7 ++++++-
 2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 9deacd346192..7d19e04a8df8 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -482,6 +482,14 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
 	return true;
 }
 
+static bool wait_claimable(const struct block_device *bdev)
+{
+	enum bd_claim bd_claim;
+
+	bd_claim = smp_load_acquire(&bdev->bd_claim);
+	return bd_claim == BD_CLAIM_DEFAULT;
+}
+
 /**
  * bd_prepare_to_claim - claim a block device
  * @bdev: block device of interest
@@ -490,7 +498,7 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
  *
  * Claim @bdev.  This function fails if @bdev is already claimed by another
  * holder and waits if another claiming is in progress. return, the caller
- * has ownership of bd_claiming and bd_holder[s].
+ * has ownership of bd_claim and bd_holder[s].
  *
  * RETURNS:
  * 0 if @bdev can be claimed, -EBUSY otherwise.
@@ -511,31 +519,25 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 	}
 
 	/* if claiming is already in progress, wait for it to finish */
-	if (whole->bd_claiming) {
-		wait_queue_head_t *wq = bit_waitqueue(&whole->bd_claiming, 0);
-		DEFINE_WAIT(wait);
-
-		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
+	if (whole->bd_claim) {
 		mutex_unlock(&bdev_lock);
-		schedule();
-		finish_wait(wq, &wait);
+		wait_var_event(&whole->bd_claim, wait_claimable(whole));
 		goto retry;
 	}
 
 	/* yay, all mine */
-	whole->bd_claiming = holder;
+	whole->bd_claim = BD_CLAIM_ACQUIRE;
 	mutex_unlock(&bdev_lock);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(bd_prepare_to_claim); /* only for the loop driver */
 
-static void bd_clear_claiming(struct block_device *whole, void *holder)
+static void bd_clear_claiming(struct block_device *whole)
 {
 	lockdep_assert_held(&bdev_lock);
-	/* tell others that we're done */
-	BUG_ON(whole->bd_claiming != holder);
-	whole->bd_claiming = NULL;
-	wake_up_bit(&whole->bd_claiming, 0);
+	smp_store_release(&whole->bd_claim, BD_CLAIM_DEFAULT);
+	smp_mb();
+	wake_up_var(&whole->bd_claim);
 }
 
 /**
@@ -565,7 +567,7 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
 	bdev->bd_holder = holder;
 	bdev->bd_holder_ops = hops;
 	mutex_unlock(&bdev->bd_holder_lock);
-	bd_clear_claiming(whole, holder);
+	bd_clear_claiming(whole);
 	mutex_unlock(&bdev_lock);
 }
 
@@ -581,7 +583,7 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder,
 void bd_abort_claiming(struct block_device *bdev, void *holder)
 {
 	mutex_lock(&bdev_lock);
-	bd_clear_claiming(bdev_whole(bdev), holder);
+	bd_clear_claiming(bdev_whole(bdev));
 	mutex_unlock(&bdev_lock);
 }
 EXPORT_SYMBOL(bd_abort_claiming);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 749203277fee..cbef041fd868 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -37,6 +37,11 @@ struct bio_crypt_ctx;
 #define PAGE_SECTORS		(1 << PAGE_SECTORS_SHIFT)
 #define SECTOR_MASK		(PAGE_SECTORS - 1)
 
+enum bd_claim {
+	BD_CLAIM_DEFAULT = 0,
+	BD_CLAIM_ACQUIRE = 1,
+};
+
 struct block_device {
 	sector_t		bd_start_sect;
 	sector_t		bd_nr_sectors;
@@ -52,7 +57,7 @@ struct block_device {
 	atomic_t		bd_openers;
 	spinlock_t		bd_size_lock; /* for bd_inode->i_size updates */
 	struct inode *		bd_inode;	/* will die */
-	void *			bd_claiming;
+	enum bd_claim		bd_claim;
 	void *			bd_holder;
 	const struct blk_holder_ops *bd_holder_ops;
 	struct mutex		bd_holder_lock;

From patchwork Tue Oct 24 14:53:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434707
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:43 +0200
Subject: [PATCH RFC 5/6] block: mark device as about to be released
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-5-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615

Make it possible for the exclusive holder of a block device to mark the
device as about to be closed, with exclusive ownership about to be given
up. Any concurrent opener trying to claim the device with the same
holder ops can then wait until the device is free to be reclaimed.
Requiring the same holder ops makes it easy to define groups of openers
that can wait for each other.
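
The claim rules described above can be modelled as a small decision
function. This is a userspace sketch and not the kernel code: it ignores
the whole-device/partition distinction and all locking, and the
model_* and CLAIM_* names are made up for illustration. Only the three
BD_CLAIM_* states come from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three states after this patch. */
enum bd_claim {
	BD_CLAIM_DEFAULT = 0,
	BD_CLAIM_ACQUIRE = 1,
	BD_CLAIM_YIELD = 2,
};

/* Hypothetical outcome of an attempt to claim the device. */
enum claim_verdict {
	CLAIM_NOW,   /* the caller may take the claim immediately */
	CLAIM_WAIT,  /* the caller sleeps until the state is DEFAULT again */
	CLAIM_BUSY,  /* the caller gets -EBUSY */
};

/* Hypothetical stand-in for the relevant struct block_device fields. */
struct model_bdev {
	enum bd_claim claim;
	const void *holder;      /* NULL if nobody holds the device */
	const void *holder_ops;  /* holder ops registered by the holder */
};

/* Models bdev_yield(): the current holder announces the release. */
static void model_yield(struct model_bdev *b, const void *holder)
{
	assert(b->holder == holder);
	assert(b->claim == BD_CLAIM_DEFAULT);
	b->claim = BD_CLAIM_YIELD;
}

/* Models the bd_may_claim() check followed by the wait test. */
static enum claim_verdict model_claim_verdict(const struct model_bdev *b,
					      const void *holder,
					      const void *hops)
{
	if (b->holder) {
		bool same_holder = b->holder == holder &&
				   b->holder_ops == hops;
		bool yielding_peer = b->claim == BD_CLAIM_YIELD &&
				     b->holder_ops == hops;

		/* Foreign holders only get through while a peer yields. */
		if (!same_holder && !yielding_peer)
			return CLAIM_BUSY;
	}
	return b->claim == BD_CLAIM_DEFAULT ? CLAIM_NOW : CLAIM_WAIT;
}
```

In this model an opener that shares the holder ops of a yielding holder
gets CLAIM_WAIT instead of CLAIM_BUSY, while an opener with different
holder ops is still refused. That is the "group of openers" behaviour
the changelog describes.
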

Signed-off-by: Christian Brauner
---
 block/bdev.c              | 20 ++++++++++++++++++++
 include/linux/blk_types.h |  1 +
 include/linux/blkdev.h    |  1 +
 3 files changed, 22 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index 7d19e04a8df8..943c7a188bb3 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -469,6 +469,11 @@ static bool bd_may_claim(struct block_device *bdev, void *holder,
 				return false;
 			return true;
 		}
+
+		if ((whole->bd_claim == BD_CLAIM_YIELD) &&
+		    (bdev->bd_holder_ops == hops))
+			return true;
+
 		return false;
 	}
 
@@ -608,6 +613,7 @@ static void bd_end_claim(struct block_device *bdev, void *holder)
 		mutex_unlock(&bdev->bd_holder_lock);
 		if (bdev->bd_write_holder)
 			unblock = true;
+		bd_clear_claiming(whole);
 	}
 	if (!whole->bd_holders)
 		whole->bd_holder = NULL;
@@ -954,6 +960,20 @@ void bdev_release(struct bdev_handle *handle)
 }
 EXPORT_SYMBOL(bdev_release);
 
+void bdev_yield(struct bdev_handle *handle)
+{
+	struct block_device *bdev = handle->bdev;
+	struct block_device *whole = bdev_whole(bdev);
+
+	mutex_lock(&bdev_lock);
+	WARN_ON_ONCE(bdev->bd_holders == 0);
+	WARN_ON_ONCE(bdev->bd_holder != handle->holder);
+	WARN_ON_ONCE(whole->bd_claim);
+	whole->bd_claim = BD_CLAIM_YIELD;
+	mutex_unlock(&bdev_lock);
+}
+EXPORT_SYMBOL(bdev_yield);
+
 /**
  * lookup_bdev() - Look up a struct block_device by name.
  * @pathname: Name of the block device in the filesystem.
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index cbef041fd868..54cf274a436c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -40,6 +40,7 @@ struct bio_crypt_ctx;
 enum bd_claim {
 	BD_CLAIM_DEFAULT = 0,
 	BD_CLAIM_ACQUIRE = 1,
+	BD_CLAIM_YIELD = 2,
 };
 
 struct block_device {
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index abf71cce785c..b15129afcdbe 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1513,6 +1513,7 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder,
 void bd_abort_claiming(struct block_device *bdev, void *holder);
 void blkdev_put(struct block_device *bdev, void *holder);
 void bdev_release(struct bdev_handle *handle);
+void bdev_yield(struct bdev_handle *handle);
 
 /* just for blk-cgroup, don't use elsewhere */
 struct block_device *blkdev_get_no_open(dev_t dev);

From patchwork Tue Oct 24 14:53:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christian Brauner
X-Patchwork-Id: 13434708
From: Christian Brauner
Date: Tue, 24 Oct 2023 16:53:44 +0200
Subject: [PATCH RFC 6/6] fs: add ->yield_devices()
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
Message-Id: <20231024-vfs-super-rework-v1-6-37a8aa697148@kernel.org>
References: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
In-Reply-To: <20231024-vfs-super-rework-v1-0-37a8aa697148@kernel.org>
To: Jan Kara, Christoph Hellwig
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner
X-Mailer: b4 0.13-dev-26615

Add a ->yield_devices() super operation and allow filesystems to mark
devices as about to be released, allowing concurrent openers to wait
until the device is reclaimable.

Signed-off-by: Christian Brauner
---
 fs/ext4/super.c    | 12 ++++++++++++
 fs/super.c         | 51 ++++++++++++++++++++-------------------------------
 fs/xfs/xfs_super.c | 27 +++++++++++++++++++++++++++
 include/linux/fs.h |  1 +
 4 files changed, 60 insertions(+), 31 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e94df97ea440..45f550801329 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1477,6 +1477,17 @@ static void ext4_shutdown(struct super_block *sb)
 	ext4_force_shutdown(sb, EXT4_GOING_FLAGS_NOLOGFLUSH);
 }
 
+static void ext4_yield_devices(struct super_block *sb)
+{
+	struct ext4_sb_info *sbi = sb->s_fs_info;
+	struct bdev_handle *journal_bdev_handle =
+		sbi ? sbi->s_journal_bdev_handle : NULL;
+
+	if (journal_bdev_handle)
+		bdev_yield(journal_bdev_handle);
+	bdev_yield(sb->s_bdev_handle);
+}
+
 static void init_once(void *foo)
 {
 	struct ext4_inode_info *ei = foo;
@@ -1638,6 +1649,7 @@ static const struct super_operations ext4_sops = {
 	.statfs		= ext4_statfs,
 	.show_options	= ext4_show_options,
 	.shutdown	= ext4_shutdown,
+	.yield_devices	= ext4_yield_devices,
 #ifdef CONFIG_QUOTA
 	.quota_read	= ext4_quota_read,
 	.quota_write	= ext4_quota_write,
diff --git a/fs/super.c b/fs/super.c
index 4edde92d5e8f..7e24bbd65be2 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -87,10 +87,10 @@ static inline bool wait_born(struct super_block *sb)
 
 	/*
 	 * Pairs with smp_store_release() in super_wake() and ensures
-	 * that we see SB_BORN or SB_DYING after we're woken.
+	 * that we see SB_BORN or SB_DEAD after we're woken.
 	 */
 	flags = smp_load_acquire(&sb->s_flags);
-	return flags & (SB_BORN | SB_DYING);
+	return flags & (SB_BORN | SB_DEAD);
 }
 
 /**
@@ -101,12 +101,12 @@ static inline bool wait_born(struct super_block *sb)
  * If the superblock has neither passed through vfs_get_tree() or
  * generic_shutdown_super() yet wait for it to happen. Either superblock
  * creation will succeed and SB_BORN is set by vfs_get_tree() or we're
- * woken and we'll see SB_DYING.
+ * woken and we'll see SB_DEAD.
  *
  * The caller must have acquired a temporary reference on @sb->s_count.
  *
  * Return: The function returns true if SB_BORN was set and with
- *         s_umount held. The function returns false if SB_DYING was
+ *         s_umount held. The function returns false if SB_DEAD was
  *         set and without s_umount held.
  */
 static __must_check bool super_lock(struct super_block *sb, bool excl)
@@ -122,7 +122,7 @@ static __must_check bool super_lock(struct super_block *sb, bool excl)
 	 * @sb->s_root is NULL and @sb->s_active is 0. No one needs to
 	 * grab a reference to this. Tell them so.
 	 */
-	if (sb->s_flags & SB_DYING) {
+	if (sb->s_flags & SB_DEAD) {
 		super_unlock(sb, excl);
 		return false;
 	}
@@ -137,7 +137,7 @@ static __must_check bool super_lock(struct super_block *sb, bool excl)
 	wait_var_event(&sb->s_flags, wait_born(sb));
 
 	/*
-	 * Neither SB_BORN nor SB_DYING are ever unset so we never loop.
+	 * Neither SB_BORN nor SB_DEAD are ever unset so we never loop.
 	 * Just reacquire @sb->s_umount for the caller.
 	 */
 	goto relock;
@@ -439,18 +439,17 @@ void put_super(struct super_block *sb)
 
 static void kill_super_notify(struct super_block *sb)
 {
-	lockdep_assert_not_held(&sb->s_umount);
+	const struct super_operations *sop = sb->s_op;
 
-	/* already notified earlier */
-	if (sb->s_flags & SB_DEAD)
-		return;
+	lockdep_assert_held(&sb->s_umount);
+
+	/* Allow openers to wait for the devices to be cleaned up. */
+	if (sop->yield_devices)
+		sop->yield_devices(sb);
 
 	/*
 	 * Remove it from @fs_supers so it isn't found by new
-	 * sget{_fc}() walkers anymore. Any concurrent mounter still
-	 * managing to grab a temporary reference is guaranteed to
-	 * already see SB_DYING and will wait until we notify them about
-	 * SB_DEAD.
+	 * sget{_fc}() walkers anymore.
 	 */
 	spin_lock(&sb_lock);
 	hlist_del_init(&sb->s_instances);
@@ -459,7 +458,7 @@ static void kill_super_notify(struct super_block *sb)
 	/*
 	 * Let concurrent mounts know that this thing is really dead.
 	 * We don't need @sb->s_umount here as every concurrent caller
-	 * will see SB_DYING and either discard the superblock or wait
+	 * will see SB_DEAD and either discard the superblock or wait
 	 * for SB_DEAD.
 	 */
 	super_wake(sb, SB_DEAD);
@@ -483,8 +482,6 @@ void deactivate_locked_super(struct super_block *s)
 		unregister_shrinker(&s->s_shrink);
 		fs->kill_sb(s);
 
-		kill_super_notify(s);
-
 		/*
 		 * Since list_lru_destroy() may sleep, we cannot call it from
 		 * put_super(), where we hold the sb_lock. Therefore we destroy
@@ -583,7 +580,7 @@ static bool grab_super(struct super_block *sb)
 bool super_trylock_shared(struct super_block *sb)
 {
 	if (down_read_trylock(&sb->s_umount)) {
-		if (!(sb->s_flags & SB_DYING) && sb->s_root &&
+		if (!(sb->s_flags & SB_DEAD) && sb->s_root &&
 		    (sb->s_flags & SB_BORN))
 			return true;
 		super_unlock_shared(sb);
@@ -689,16 +686,9 @@ void generic_shutdown_super(struct super_block *sb)
 			spin_unlock(&sb->s_inode_list_lock);
 		}
 	}
-	/*
-	 * Broadcast to everyone that grabbed a temporary reference to this
-	 * superblock before we removed it from @fs_supers that the superblock
-	 * is dying. Every walker of @fs_supers outside of sget{_fc}() will now
-	 * discard this superblock and treat it as dead.
-	 *
-	 * We leave the superblock on @fs_supers so it can be found by
-	 * sget{_fc}() until we passed sb->kill_sb().
-	 */
-	super_wake(sb, SB_DYING);
+
+	kill_super_notify(sb);
+
 	super_unlock_excl(sb);
 	if (sb->s_bdi != &noop_backing_dev_info) {
 		if (sb->s_iflags & SB_I_PERSB_BDI)
@@ -790,7 +780,7 @@ struct super_block *sget_fc(struct fs_context *fc,
 	/*
 	 * Make the superblock visible on @super_blocks and @fs_supers.
 	 * It's in a nascent state and users should wait on SB_BORN or
-	 * SB_DYING to be set.
*/ list_add_tail(&s->s_list, &super_blocks); hlist_add_head(&s->s_instances, &s->s_type->fs_supers); @@ -906,7 +896,7 @@ static void __iterate_supers(void (*f)(struct super_block *)) spin_lock(&sb_lock); list_for_each_entry(sb, &super_blocks, s_list) { /* Pairs with memory marrier in super_wake(). */ - if (smp_load_acquire(&sb->s_flags) & SB_DYING) + if (smp_load_acquire(&sb->s_flags) & SB_DEAD) continue; sb->s_count++; spin_unlock(&sb_lock); @@ -1248,7 +1238,6 @@ void kill_anon_super(struct super_block *sb) { dev_t dev = sb->s_dev; generic_shutdown_super(sb); - kill_super_notify(sb); free_anon_bdev(dev); } EXPORT_SYMBOL(kill_anon_super); diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 84107d162e41..f7a0cb92c7c0 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1170,6 +1170,32 @@ xfs_fs_shutdown( xfs_force_shutdown(XFS_M(sb), SHUTDOWN_DEVICE_REMOVED); } +static void xfs_fs_bdev_yield(struct bdev_handle *handle, + struct super_block *sb) +{ + if (handle != sb->s_bdev_handle) + bdev_yield(handle); +} + +static void +xfs_fs_yield_devices( + struct super_block *sb) +{ + struct xfs_mount *mp = XFS_M(sb); + + if (mp) { + if (mp->m_logdev_targp && + mp->m_logdev_targp != mp->m_ddev_targp) + xfs_fs_bdev_yield(mp->m_logdev_targp->bt_bdev_handle, sb); + if (mp->m_rtdev_targp) + xfs_fs_bdev_yield(mp->m_rtdev_targp->bt_bdev_handle, sb); + if (mp->m_ddev_targp) + xfs_fs_bdev_yield(mp->m_ddev_targp->bt_bdev_handle, sb); + } + + bdev_yield(sb->s_bdev_handle); +} + static const struct super_operations xfs_super_operations = { .alloc_inode = xfs_fs_alloc_inode, .destroy_inode = xfs_fs_destroy_inode, @@ -1184,6 +1210,7 @@ static const struct super_operations xfs_super_operations = { .nr_cached_objects = xfs_fs_nr_cached_objects, .free_cached_objects = xfs_fs_free_cached_objects, .shutdown = xfs_fs_shutdown, + .yield_devices = xfs_fs_yield_devices, }; static int diff --git a/include/linux/fs.h b/include/linux/fs.h index 5174e821d451..f0278bf4ca03 100644 --- 
a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2026,6 +2026,7 @@ struct super_operations { long (*free_cached_objects)(struct super_block *, struct shrink_control *); void (*shutdown)(struct super_block *sb); + void (*yield_devices)(struct super_block *sb); }; /*