Message ID | 20240618164042.343777-1-gulam.mohamed@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V6,for-6.11/block] loop: Fix a race between loop detach and loop open | expand |
Do we need the re-addition of the open method to fix the ltp test
case? I kinda hate it, but if that is what it takes:
Reviewed-by: Christoph Hellwig <hch@lst.de>
Hi Christoph, > -----Original Message----- > From: Christoph Hellwig <hch@lst.de> > Sent: Wednesday, June 19, 2024 1:46 PM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > yukuai1@huaweicloud.com; hch@lst.de; axboe@kernel.dk > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > Do we need the re-addition of the open method to fix the ltp test case? I > kinda hate it, but if that is what it takes: > I don't think its needed but I kept it because your following comment in the suggested change says " switch the state to roundown here to prevent new openers from coming in": + * Mark the device for removing the backing device on last close. + * If we are the only opener, also switch the state to roundown here to + * prevent new openers from coming in. */ Please suggest. Regards, Gulam Mohamed. > Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed, Jun 19, 2024 at 08:21:35AM +0000, Gulam Mohamed wrote: > > To: Gulam Mohamed <gulam.mohamed@oracle.com> > > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > > yukuai1@huaweicloud.com; hch@lst.de; axboe@kernel.dk > > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > > and loop open > > > > Do we need the re-addition of the open method to fix the ltp test case? I > > kinda hate it, but if that is what it takes: > > > I don't think its needed but I kept it because your following comment in the suggested change says " switch the state to roundown here to prevent new openers from coming in": Let's keep it. I meant to say new I/O coming in, but letting a new opener come in and then fail I/O isn't really nice behavior.
> -----Original Message----- > From: Christoph Hellwig <hch@lst.de> > Sent: Wednesday, June 19, 2024 1:57 PM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: Christoph Hellwig <hch@lst.de>; linux-block@vger.kernel.org; linux- > kernel@vger.kernel.org; yukuai1@huaweicloud.com; axboe@kernel.dk > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > On Wed, Jun 19, 2024 at 08:21:35AM +0000, Gulam Mohamed wrote: > > > To: Gulam Mohamed <gulam.mohamed@oracle.com> > > > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > > > yukuai1@huaweicloud.com; hch@lst.de; axboe@kernel.dk > > > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop > > > detach and loop open > > > > > > Do we need the re-addition of the open method to fix the ltp test > > > case? I kinda hate it, but if that is what it takes: > > > > > I don't think its needed but I kept it because your following comment in the > suggested change says " switch the state to roundown here to prevent new > openers from coming in": > > Let's keep it. I meant to say new I/O coming in, but letting a new opener > come in and then fail I/O isn't really nice behavior. Thanks Christoph. Regards, Gulam Mohamed.
Hi Jens, This patch is reviewed by Chirstoph, can you please take a look and pull it if it is good to you? Regards, Gulam Mohamed. > -----Original Message----- > From: Christoph Hellwig <hch@lst.de> > Sent: Wednesday, June 19, 2024 1:46 PM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > yukuai1@huaweicloud.com; hch@lst.de; axboe@kernel.dk > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > Do we need the re-addition of the open method to fix the ltp test case? I > kinda hate it, but if that is what it takes: > > Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, 18 Jun 2024 16:40:42 +0000, Gulam Mohamed wrote: > 1. Userspace sends the command "losetup -d" which uses the open() call > to open the device > 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the > function loop_clr_fd() > 3. If LOOP_CLR_FD is the first command received at the time, then the > AUTOCLEAR flag is not set and deletion of the > loop device proceeds ahead and scans the partitions (drop/add > partitions) > > [...] Applied, thanks! [1/1] loop: Fix a race between loop detach and loop open commit: 18048c1af7836b8e31739d9eaefebc2bf76261f7 Best regards,
On 6/27/24 2:13 PM, Gulam Mohamed wrote: > Hi Jens, > > This patch is reviewed by Chirstoph, can you please take a look > and pull it if it is good to you? In the future, if you find something not being applied, do check if it still applies. Because it did not, I had to fix it up by hand.
Hello, kernel test robot noticed "ltp.ioctl_loop06.fail" on: commit: a167a9996e22ae0d108307fbc66b811d821ffbe7 ("[PATCH V6 for-6.11/block] loop: Fix a race between loop detach and loop open") url: https://github.com/intel-lab-lkp/linux/commits/Gulam-Mohamed/loop-Fix-a-race-between-loop-detach-and-loop-open/20240619-004334 base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next patch link: https://lore.kernel.org/all/20240618164042.343777-1-gulam.mohamed@oracle.com/ patch subject: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach and loop open in testcase: ltp version: ltp-x86_64-14c1f76-1_20240615 with following parameters: disk: 1HDD fs: f2fs test: syscalls-01/ioctl_loop06 compiler: gcc-13 test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz (Ivy Bridge) with 8G memory (please refer to attached dmesg/kmsg for entire log/backtrace) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@intel.com> | Closes: https://lore.kernel.org/oe-lkp/202406281350.b7298127-oliver.sang@intel.com Running tests....... <<<test_start>>> tag=ioctl_loop06 stime=1719063458 cmdline="ioctl_loop06" contacts="" analysis=exit <<<test_output>>> tst_test.c:1734: TINFO: LTP version: 20240524-41-g248223546 tst_test.c:1618: TINFO: Timeout per run is 0h 02m 30s tst_device.c:96: TINFO: Found free device 0 '/dev/loop0' ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg < 512 ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg > PAGE_SIZE ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg != power_of_2 ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size < 512 ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size > PAGE_SIZE ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size != power_of_2 ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) Summary: passed 3 failed 3 broken 0 skipped 0 warnings 0 incrementing stop <<<execution_status>>> initiation_status="ok" duration=0 termination_type=exited termination_id=1 corefile=no cutime=0 cstime=3 <<<test_end>>> INFO: ltp-pan reported some tests FAIL LTP Version: 20240524-41-g248223546 ############################################################### Done executing testcases. LTP Version: 20240524-41-g248223546 ############################################################### The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20240628/202406281350.b7298127-oliver.sang@intel.com
> -----Original Message----- > From: Jens Axboe <axboe@kernel.dk> > Sent: Friday, June 28, 2024 3:42 AM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > yukuai1@huaweicloud.com; Christoph Hellwig <hch@lst.de> > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > On 6/27/24 2:13 PM, Gulam Mohamed wrote: > > Hi Jens, > > > > This patch is reviewed by Chirstoph, can you please take a look > > and pull it if it is good to you? > > In the future, if you find something not being applied, do check if it still > applies. Because it did not, I had to fix it up by hand. > > -- > Jens Axboe Sure Jens. Thank you very much. Regards, Gulam Mohamed.
Hi Christoph, With our latest version of the patch V6, the "kernel robot test" failed in the ioctl_loop06 test (LTP tests) as in below mail. The reason for the failure is, the deferring of the "detach" loop device to release function. The test opens the loop device, sends LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in between that, it will also detach the loop device. At the end of the test, while cleanup, it will close the loop device. As we deferred the detach to last close, the detach will be at the end only but before that we are setting the lo_state to Lo_rundown. This setting of Lo_rundown we are doing in the beginning because, there was another LTP test case failed earlier due to the same reason. So, when the LOOP_CONFIGURE was sent, the loop device was still in Lo_rundown state (Lo_unbound will be set after detach in __loop_clr_fd()) due to which kernel returned the EBUSY error causing the test to fail. I have noticed that a good number of test cases are having a behaviour that it will send different loop commands and in between the detach command also, with only a single open. And close happens at the end. Due to this, I think a couple of test cases needs to be modified. Now, as per my understanding, we have two options here: 1. Continue with this kernel patch and modify few test cases to accommodate this new kernel behaviour 2. Go back to using the lo_refcnt The first option doesn't seem to be flexible as we need to modify a good number of test cases. Can you please correct if I am missing anything and suggest how to proceed? Regards, Gulam Mohamed. > -----Original Message----- > From: kernel test robot <oliver.sang@intel.com> > Sent: Friday, June 28, 2024 11:09 AM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: oe-lkp@lists.linux.dev; lkp@intel.com; linux-block@vger.kernel.org; > ltp@lists.linux.it; linux-kernel@vger.kernel.org; yukuai1@huaweicloud.com; > hch@lst.de; axboe@kernel.dk; oliver.sang@intel.com > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > > > Hello, > > kernel test robot noticed "ltp.ioctl_loop06.fail" on: > > commit: a167a9996e22ae0d108307fbc66b811d821ffbe7 ("[PATCH V6 for- > 6.11/block] loop: Fix a race between loop detach and loop open") > url: https://urldefense.com/v3/__https://github.com/intel-lab- > lkp/linux/commits/Gulam-Mohamed/loop-Fix-a-race-between-loop-detach- > and-loop-open/20240619- > 004334__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4NcO0tNoEVElMSquM5Riz > BqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8xq-V_CM$ > base: > https://urldefense.com/v3/__https://git.kernel.org/cgit/linux/kernel/git/axboe > /linux- > block.git__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4NcO0tNoEVElMSquM5Ri > zBqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8JTiEzBw$ for-next patch > link: > https://urldefense.com/v3/__https://lore.kernel.org/all/20240618164042.343 > 777-1- > gulam.mohamed@oracle.com/__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4 > NcO0tNoEVElMSquM5RizBqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8v > DJnm-A$ > patch subject: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > in testcase: ltp > version: ltp-x86_64-14c1f76-1_20240615 > with following parameters: > > disk: 1HDD > fs: f2fs > test: syscalls-01/ioctl_loop06 > > > > compiler: gcc-13 > test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz > (Ivy Bridge) with 8G memory > > (please refer to attached dmesg/kmsg for entire log/backtrace) > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > the same patch/commit), kindly add following tags > | Reported-by: kernel test robot <oliver.sang@intel.com> > | Closes: > | https://urldefense.com/v3/__https://lore.kernel.org/oe-lkp/20240628135 > | 0.b7298127- > oliver.sang@intel.com__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4N > | > cO0tNoEVElMSquM5RizBqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8aKJi > x68 > | $ > > > > Running tests....... > <<<test_start>>> > tag=ioctl_loop06 stime=1719063458 > cmdline="ioctl_loop06" > contacts="" > analysis=exit > <<<test_output>>> > tst_test.c:1734: TINFO: LTP version: 20240524-41-g248223546 > tst_test.c:1618: TINFO: Timeout per run is 0h 02m 30s > tst_device.c:96: TINFO: Found free device 0 '/dev/loop0' > ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg < 512 > ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) > ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg > PAGE_SIZE > ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) > ioctl_loop06.c:74: TINFO: Using LOOP_SET_BLOCK_SIZE with arg != > power_of_2 > ioctl_loop06.c:65: TPASS: Set block size failed as expected: EINVAL (22) > ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size < 512 > ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) > ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size > > PAGE_SIZE > ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) > ioctl_loop06.c:74: TINFO: Using LOOP_CONFIGURE with block_size != > power_of_2 > ioctl_loop06.c:67: TFAIL: Set block size failed expected EINVAL got: EBUSY (16) > > Summary: > passed 3 > failed 3 > broken 0 > skipped 0 > warnings 0 > incrementing stop > <<<execution_status>>> > initiation_status="ok" > duration=0 termination_type=exited termination_id=1 corefile=no > cutime=0 cstime=3 > <<<test_end>>> > INFO: ltp-pan reported some tests FAIL > LTP Version: 20240524-41-g248223546 > > > ############################################################### > > Done executing testcases. > LTP Version: 20240524-41-g248223546 > > ############################################################### > > > > > The kernel config and materials to reproduce are available at: > https://urldefense.com/v3/__https://download.01.org/0day- > ci/archive/20240628/202406281350.b7298127- > oliver.sang@intel.com__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4NcO0tNoE > VElMSquM5RizBqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8ZxsxyCs$ > > > > -- > 0-DAY CI Kernel Test Service > https://urldefense.com/v3/__https://github.com/intel/lkp- > tests/wiki__;!!ACWV5N9M2RV99hQ!KYjjKXzy4egkNOv4NcO0tNoEVElMSquM5 > RizBqquFexq6ScoztvIJUysVnUfltmEDSSy4LXCb1bKijp8xUKSK2Y$
Hi Gulam, On Sun, Jun 30, 2024 at 10:11:14PM +0000, Gulam Mohamed wrote: > With our latest version of the patch V6, the "kernel robot test" failed > in the ioctl_loop06 test (LTP tests) as in below mail. > the reason for the failure is, the deferring of the "detach" loop > device to release function. The test opens the loop device, sends > LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in between that, > it will also detach the loop device. At the end of the test, while > cleanup, it will close the loop device. As we deferred the detach to > last close, the detach will be at the end only but before that we are > setting the lo_state to Lo_rundown. This setting of Lo_rundown we are > doing in the beginning because, there was another LTP test case failed > earlier due to the same reason. > > So, when the LOOP_CONFIGURE was sent, the loop device was still in > Lo_rundown state (Lo_unbound will be set after detach in > __loop_clr_fd()) due to which kernel returned the EBUSY error causing > the test to fail. Before we'd end up in Lo_unbound toward the end of __loop_clr_fd if there was a single opener. > I have noticed that a good number of test cases are having a behaviour > that it will send different loop commands and in between the detach > command also, with only a single open. And close happens at the end. > Due to this, I think a couple of test cases needs to be modified. > > Now, as per my understanding, we have two options here: > > 1. Continue with this kernel patch and modify few test cases to > accommodate this new kernel behaviour That would be my preference. Any code that is doing a clear_fd and then tries to configure it again is prone to races vs other openers. It also does not seem very useful outside of test code. But if we end up breaking real code and not test cases we might have to go and bring it back.
Hi Christoph, > -----Original Message----- > From: hch@lst.de <hch@lst.de> > Sent: Tuesday, July 2, 2024 9:20 PM > To: Gulam Mohamed <gulam.mohamed@oracle.com> > Cc: hch@lst.de; linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > yukuai1@huaweicloud.com; axboe@kernel.dk > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > Hi Gulam, > > On Sun, Jun 30, 2024 at 10:11:14PM +0000, Gulam Mohamed wrote: > > With our latest version of the patch V6, the "kernel robot test" > > failed in the ioctl_loop06 test (LTP tests) as in below mail. > > the reason for the failure is, the deferring of the "detach" loop > > device to release function. The test opens the loop device, sends > > LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in between > that, > > it will also detach the loop device. At the end of the test, while > > cleanup, it will close the loop device. As we deferred the detach to > > last close, the detach will be at the end only but before that we are > > setting the lo_state to Lo_rundown. This setting of Lo_rundown we are > > doing in the beginning because, there was another LTP test case failed > > earlier due to the same reason. > > > > So, when the LOOP_CONFIGURE was sent, the loop device was still in > > Lo_rundown state (Lo_unbound will be set after detach in > > __loop_clr_fd()) due to which kernel returned the EBUSY error causing > > the test to fail. > > Before we'd end up in Lo_unbound toward the end of __loop_clr_fd if there > was a single opener. > > > I have noticed that a good number of test cases are having a behaviour > > that it will send different loop commands and in between the detach > > command also, with only a single open. And close happens at the end. > > Due to this, I think a couple of test cases needs to be modified. > > > > Now, as per my understanding, we have two options here: > > > > 1. Continue with this kernel patch and modify few test cases to > > accommodate this new kernel behaviour > > That would be my preference. Any code that is doing a clear_fd and then tries > to configure it again is prone to races vs other openers. It also does not seem > very useful outside of test code. > But if we end up breaking real code and not test cases we might have to go > and bring it back. Requested the maintainers of the LTP test cases for the modification to accomodate the new kernel behavior.
Hi Christoph, > -----Original Message----- > From: Gulam Mohamed > Sent: Saturday, July 6, 2024 1:21 AM > To: hch@lst.de > Cc: linux-block@vger.kernel.org; linux-kernel@vger.kernel.org; > yukuai1@huaweicloud.com; axboe@kernel.dk > Subject: RE: [PATCH V6 for-6.11/block] loop: Fix a race between loop detach > and loop open > > Hi Christoph, > > > -----Original Message----- > > From: hch@lst.de <hch@lst.de> > > Sent: Tuesday, July 2, 2024 9:20 PM > > To: Gulam Mohamed <gulam.mohamed@oracle.com> > > Cc: hch@lst.de; linux-block@vger.kernel.org; > > linux-kernel@vger.kernel.org; yukuai1@huaweicloud.com; > axboe@kernel.dk > > Subject: Re: [PATCH V6 for-6.11/block] loop: Fix a race between loop > > detach and loop open > > > > Hi Gulam, > > > > On Sun, Jun 30, 2024 at 10:11:14PM +0000, Gulam Mohamed wrote: > > > With our latest version of the patch V6, the "kernel robot test" > > > failed in the ioctl_loop06 test (LTP tests) as in below mail. > > > the reason for the failure is, the deferring of the "detach" loop > > > device to release function. The test opens the loop device, sends > > > LOOP_SET_BLOCK_SIZE and LOOP_CONFIGURE commands and in > between > > that, > > > it will also detach the loop device. At the end of the test, while > > > cleanup, it will close the loop device. As we deferred the detach to > > > last close, the detach will be at the end only but before that we > > > are setting the lo_state to Lo_rundown. This setting of Lo_rundown > > > we are doing in the beginning because, there was another LTP test > > > case failed earlier due to the same reason. > > > > > > So, when the LOOP_CONFIGURE was sent, the loop device was still in > > > Lo_rundown state (Lo_unbound will be set after detach in > > > __loop_clr_fd()) due to which kernel returned the EBUSY error > > > causing the test to fail. > > > > Before we'd end up in Lo_unbound toward the end of __loop_clr_fd if > > there was a single opener. > > > > > I have noticed that a good number of test cases are having a > > > behaviour that it will send different loop commands and in between > > > the detach command also, with only a single open. And close happens at > the end. > > > Due to this, I think a couple of test cases needs to be modified. > > > > > > Now, as per my understanding, we have two options here: > > > > > > 1. Continue with this kernel patch and modify few test cases to > > > accommodate this new kernel behaviour > > > > That would be my preference. Any code that is doing a clear_fd and > > then tries to configure it again is prone to races vs other openers. > > It also does not seem very useful outside of test code. > > But if we end up breaking real code and not test cases we might have > > to go and bring it back. > > Requested the maintainers of the LTP test cases for the modification to > accomodate the new kernel behavior. The LTP maintainers agreed to modify the impacted the test cases to accommodate the new kernel behavior. They are asking the kernel version/commit in which this new behavior is included. Can you please help in integrating the path into the mainline? Regards, Gulam Mohamed.
Hi Gulam, the patch has been queue up by the block maintainers for Linux 6.11.
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 93780f41646b..6fa19aa7c913 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1131,7 +1131,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, return error; } -static void __loop_clr_fd(struct loop_device *lo, bool release) +static void __loop_clr_fd(struct loop_device *lo) { struct file *filp; gfp_t gfp = lo->old_gfp_mask; @@ -1139,14 +1139,6 @@ static void __loop_clr_fd(struct loop_device *lo, bool release) if (test_bit(QUEUE_FLAG_WC, &lo->lo_queue->queue_flags)) blk_queue_write_cache(lo->lo_queue, false, false); - /* - * Freeze the request queue when unbinding on a live file descriptor and - * thus an open device. When called from ->release we are guaranteed - * that there is no I/O in progress already. - */ - if (!release) - blk_mq_freeze_queue(lo->lo_queue); - spin_lock_irq(&lo->lo_lock); filp = lo->lo_backing_file; lo->lo_backing_file = NULL; @@ -1164,8 +1156,6 @@ static void __loop_clr_fd(struct loop_device *lo, bool release) mapping_set_gfp_mask(filp->f_mapping, gfp); /* This is safe: open() is still holding a reference. */ module_put(THIS_MODULE); - if (!release) - blk_mq_unfreeze_queue(lo->lo_queue); disk_force_media_change(lo->lo_disk); @@ -1180,11 +1170,7 @@ static void __loop_clr_fd(struct loop_device *lo, bool release) * must be at least one and it can only become zero when the * current holder is released. */ - if (!release) - mutex_lock(&lo->lo_disk->open_mutex); err = bdev_disk_changed(lo->lo_disk, false); - if (!release) - mutex_unlock(&lo->lo_disk->open_mutex); if (err) pr_warn("%s: partition scan of loop%d failed (rc=%d)\n", __func__, lo->lo_number, err); @@ -1233,24 +1219,16 @@ static int loop_clr_fd(struct loop_device *lo) return -ENXIO; } /* - * If we've explicitly asked to tear down the loop device, - * and it has an elevated reference count, set it for auto-teardown when - * the last reference goes away. This stops $!~#$@ udev from - * preventing teardown because it decided that it needs to run blkid on - * the loopback device whenever they appear. xfstests is notorious for - * failing tests because blkid via udev races with a losetup - * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d - * command to fail with EBUSY. + * Mark the device for removing the backing device on last close. + * If we are the only opener, also switch the state to roundown here to + * prevent new openers from coming in. */ - if (disk_openers(lo->lo_disk) > 1) { - lo->lo_flags |= LO_FLAGS_AUTOCLEAR; - loop_global_unlock(lo, true); - return 0; - } - lo->lo_state = Lo_rundown; + + lo->lo_flags |= LO_FLAGS_AUTOCLEAR; + if (disk_openers(lo->lo_disk) == 1) + lo->lo_state = Lo_rundown; loop_global_unlock(lo, true); - __loop_clr_fd(lo, false); return 0; } @@ -1717,25 +1695,43 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode, } #endif +static int lo_open(struct gendisk *disk, blk_mode_t mode) +{ + struct loop_device *lo = disk->private_data; + int err; + + err = mutex_lock_killable(&lo->lo_mutex); + if (err) + return err; + + if (lo->lo_state == Lo_deleting || lo->lo_state == Lo_rundown) + err = -ENXIO; + mutex_unlock(&lo->lo_mutex); + return err; +} + static void lo_release(struct gendisk *disk) { struct loop_device *lo = disk->private_data; + bool need_clear = false; if (disk_openers(disk) > 0) return; + /* + * Clear the backing device information if this is the last close of + * a device that's been marked for auto clear, or on which LOOP_CLR_FD + * has been called. + */ mutex_lock(&lo->lo_mutex); - if (lo->lo_state == Lo_bound && (lo->lo_flags & LO_FLAGS_AUTOCLEAR)) { + if (lo->lo_state == Lo_bound && (lo->lo_flags & LO_FLAGS_AUTOCLEAR)) lo->lo_state = Lo_rundown; - mutex_unlock(&lo->lo_mutex); - /* - * In autoclear mode, stop the loop thread - * and remove configuration after last close. - */ - __loop_clr_fd(lo, true); - return; - } + + need_clear = (lo->lo_state == Lo_rundown); mutex_unlock(&lo->lo_mutex); + + if (need_clear) + __loop_clr_fd(lo); } static void lo_free_disk(struct gendisk *disk) @@ -1752,6 +1748,7 @@ static void lo_free_disk(struct gendisk *disk) static const struct block_device_operations lo_fops = { .owner = THIS_MODULE, + .open = lo_open, .release = lo_release, .ioctl = lo_ioctl, #ifdef CONFIG_COMPAT
1. Userspace sends the command "losetup -d" which uses the open() call to open the device 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the function loop_clr_fd() 3. If LOOP_CLR_FD is the first command received at the time, then the AUTOCLEAR flag is not set and deletion of the loop device proceeds ahead and scans the partitions (drop/add partitions) if (disk_openers(lo->lo_disk) > 1) { lo->lo_flags |= LO_FLAGS_AUTOCLEAR; loop_global_unlock(lo, true); return 0; } 4. Before scanning partitions, it will check to see if any partition of the loop device is currently opened 5. If any partition is opened, then it will return EBUSY: if (disk->open_partitions) return -EBUSY; 6. So, after receiving the "LOOP_CLR_FD" command and just before the above check for open_partitions, if any other command (like blkid) opens any partition of the loop device, then the partition scan will not proceed and EBUSY is returned as shown in above code 7. But in "__loop_clr_fd()", this EBUSY error is not propagated 8. We have noticed that this is causing the partitions of the loop to remain stale even after the loop device is detached resulting in the IO errors on the partitions Fix: Defer the detach of loop device to release function, which is called when the last close happens, by setting the lo_flags to LO_FLAGS_AUTOCLEAR at the time of detach i.e in loop_clr_fd() function. Test case involves the following two scripts: script1.sh: while [ 1 ]; do losetup -P -f /home/opt/looptest/test10.img blkid /dev/loop0p1 done script2.sh: while [ 1 ]; do losetup -d /dev/loop0 done Without fix, the following IO errors have been observed: kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16) kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0 kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0 kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page read Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com> --- v6<-v5: Set the loop state Lo_rundown only when there is a single opener of the loop device drivers/block/loop.c | 75 +++++++++++++++++++++----------------------- 1 file changed, 36 insertions(+), 39 deletions(-)