Message ID | de2889bd0a9ea5446c3473fe7b2086fbd954b9ab.1680496851.git.anand.jain@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | [stable-5.4.y,stable-5.10.y] btrfs: scan device in non-exclusive mode |
On Mon, Apr 03, 2023 at 01:46:08PM +0800, Anand Jain wrote:
> commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 upstream.
>
> This fixes mkfs/mount/check failures due to race with systemd-udevd
> scan.
>
> During the device scan initiated by systemd-udevd, other user space
> EXCL operations such as mkfs, mount, or check may get blocked and result
> in a "Device or resource busy" error. This is because the device
> scan process opens the device with the EXCL flag in the kernel.
>
> Two reports were received:
>
>  - btrfs/179 test case, where the fsck command failed with the -EBUSY
>    error
>
>  - LTP pwritev03 test case, where mkfs.vfs failed with
>    the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
>    on the device.
>
> In both cases, fsck and mkfs (respectively) were racing with a
> systemd-udevd device scan, and systemd-udevd won, resulting in the
> -EBUSY error for fsck and mkfs.
>
> Reproducing the problem has been difficult because there is a very
> small window during which these userspace threads can race to
> acquire the exclusive device open. Even on the system where the problem
> was observed, the problem occurrences were anywhere between 10 to 400
> iterations and chances of reproducing decreases with debug printk()s.
>
> However, an exclusive device open is unnecessary for the scan process,
> as there are no write operations on the device during scan. Furthermore,
> during the mount process, the superblock is re-read in the below
> function call chain:
>
>   btrfs_mount_root
>     btrfs_open_devices
>       open_fs_devices
>         btrfs_open_one_device
>           btrfs_get_bdev_and_sb
>
> So, to fix this issue, removes the FMODE_EXCL flag from the scan
> operation, and add a comment.
>
> The case where mkfs may still write to the device and a scan is running,
> the btrfs signature is not written at that time so scan will not
> recognize such device.
>
> Reported-by: Sherry Yang <sherry.yang@oracle.com>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
> CC: stable@vger.kernel.org # 5.4+
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> Reviewed-by: David Sterba <dsterba@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>
> The upstream commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 can be
> applied without conflict to LTS stable-5.15.y and stable-6.1.y. However,
> on LTS stable-5.4.y and stable-5.15.y, a conflict fix is required since
> the zoned device support commits are not present in these versions. This
> patch resolves the conflicts.

Now queued up, thanks.

greg k-h
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 3fa972a43b5e..c5944c61317f 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1579,8 +1579,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
 	 * later supers, using BTRFS_SUPER_MIRROR_MAX instead
 	 */
 	bytenr = btrfs_sb_offset(0);
-	flags |= FMODE_EXCL;
 
+	/*
+	 * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+	 * initiate the device scan which may race with the user's mount
+	 * or mkfs command, resulting in failure.
+	 * Since the device scan is solely for reading purposes, there is
+	 * no need for FMODE_EXCL. Additionally, the devices are read again
+	 * during the mount process. It is ok to get some inconsistent
+	 * values temporarily, as the device paths of the fsid are the only
+	 * required information for assembling the volume.
+	 */
 	bdev = blkdev_get_by_path(path, flags, holder);
 	if (IS_ERR(bdev))
 		return ERR_CAST(bdev);
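
For readers who want to see why dropping the exclusive claim from the scan path removes the -EBUSY failure, here is a small userspace sketch of the race described in the commit message. It is only a toy model under stated assumptions: `struct toy_bdev`, `toy_open()`, `toy_close()` and `mkfs_during_scan()` are hypothetical names invented for illustration and are not kernel APIs, and the model assumes that an exclusive open fails whenever a different holder already has an exclusive claim, while non-exclusive opens never conflict.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block device; NOT a kernel structure. */
struct toy_bdev {
	const void *excl_holder;	/* NULL when nobody holds an exclusive claim */
	int excl_count;			/* number of exclusive opens by that holder */
};

/* Hypothetical helper: open the device, optionally claiming it exclusively. */
int toy_open(struct toy_bdev *bdev, bool excl, const void *holder)
{
	if (!excl)
		return 0;	/* shared opens never conflict in this model */
	if (bdev->excl_holder && bdev->excl_holder != holder)
		return -EBUSY;	/* someone else already claimed the device */
	bdev->excl_holder = holder;
	bdev->excl_count++;
	return 0;
}

/* Hypothetical helper: drop an open; releases the claim on the last close. */
void toy_close(struct toy_bdev *bdev, bool excl)
{
	if (!excl)
		return;
	assert(bdev->excl_count > 0);
	if (--bdev->excl_count == 0)
		bdev->excl_holder = NULL;
}

/*
 * Returns what an exclusive opener such as mkfs sees when it opens the
 * device while a scan still has it open. scan_excl selects whether the
 * scan uses an exclusive open (before the patch) or not (after it).
 */
int mkfs_during_scan(bool scan_excl)
{
	static char scan_holder, mkfs_holder;	/* distinct holder cookies */
	struct toy_bdev bdev = { NULL, 0 };
	int ret;

	toy_open(&bdev, scan_excl, &scan_holder);	/* the scan wins the race */
	ret = toy_open(&bdev, true, &mkfs_holder);	/* mkfs arrives second */
	if (ret == 0)
		toy_close(&bdev, true);
	toy_close(&bdev, scan_excl);
	return ret;
}
```

In this model, `mkfs_during_scan(true)` returns `-EBUSY`, matching the failures in the two reports, while `mkfs_during_scan(false)` returns 0 because the scan no longer holds a claim that mkfs could collide with.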