diff mbox series

[07/17] nbd: call blk_mark_disk_dead in nbd_clear_sock_ioctl

Message ID 20230811100828.1897174-8-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [01/17] FOLD: reverts part of "fs: use the super_block as holder when mounting file systems" | expand

Commit Message

Christoph Hellwig Aug. 11, 2023, 10:08 a.m. UTC
nbd_clear_sock_ioctl kills the socket and with that the block
device.  Instead of just invalidating file system buffers,
mark the device as dead, which will also invalidate the buffers
as part of the proper shutdown sequence.  This also includes
invalidating partitions if there are any.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/nbd.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Samuel Holland Sept. 20, 2023, 8:41 p.m. UTC | #1
On 2023-08-11 5:08 AM, Christoph Hellwig wrote:
> nbd_clear_sock_ioctl kills the socket and with that the block
> device.  Instead of just invalidating file system buffers,
> mark the device as dead, which will also invalidate the buffers
> as part of the proper shutdown sequence.  This also includes
> invalidating partitions if there are any.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/block/nbd.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 8576d696c7a221..42e0159bb258fa 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -1434,12 +1434,10 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd)
>  	return ret;
>  }
>  
> -static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
> -				 struct block_device *bdev)
> +static void nbd_clear_sock_ioctl(struct nbd_device *nbd)
>  {
> +	blk_mark_disk_dead(nbd->disk);
>  	nbd_clear_sock(nbd);
> -	__invalidate_device(bdev, true);
> -	nbd_bdev_reset(nbd);

This change breaks nbd-client, which calls the NBD_CLEAR_SOCK ioctl during
device setup and socket reconnection. After merging this series (bisected to
511fb5bafed1), all NBD devices are immediately dead on arrival:

[   14.605849] nbd0: detected capacity change from 0 to 4194304

[   14.606211] Buffer I/O error on dev nbd0, logical block 0, async page read
[   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read

[   14.630490]  nbd0: unable to read partition table

I wonder if disk_force_media_change() is the right thing to call here instead.

Regards,
Samuel

>  	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
>  			       &nbd->config->runtime_flags))
>  		nbd_config_put(nbd);
> @@ -1465,7 +1463,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
>  	case NBD_DISCONNECT:
>  		return nbd_disconnect(nbd);
>  	case NBD_CLEAR_SOCK:
> -		nbd_clear_sock_ioctl(nbd, bdev);
> +		nbd_clear_sock_ioctl(nbd);
>  		return 0;
>  	case NBD_SET_SOCK:
>  		return nbd_add_socket(nbd, arg, false);
Christoph Hellwig Sept. 25, 2023, 7:48 a.m. UTC | #2
On Wed, Sep 20, 2023 at 03:41:11PM -0500, Samuel Holland wrote:
> [   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read
> 
> [   14.630490]  nbd0: unable to read partition table
> 
> I wonder if disk_force_media_change() is the right thing to call here instead.

So what are the semantics of clearing the socket?

The <= 6.5 behavior of invalidating fs caches, but not actually marking
the fs shutdown is pretty broken, especially if this expects to resurrect
the device and thus the file system later on.
Wouter Verhelst Oct. 1, 2023, 5:10 p.m. UTC | #3
On Mon, Sep 25, 2023 at 09:48:38AM +0200, Christoph Hellwig wrote:
> On Wed, Sep 20, 2023 at 03:41:11PM -0500, Samuel Holland wrote:
> > [   14.619101] Buffer I/O error on dev nbd0, logical block 0, async page read
> > 
> > [   14.630490]  nbd0: unable to read partition table
> > 
> > I wonder if disk_force_media_change() is the right thing to call here instead.
> 
> So what are the semantics of clearing the socket?
> 
> The <= 6.5 behavior of invalidating fs caches, but not actually marking
> the fs shutdown is pretty broken, especially if this expects to resurrect
> the device and thus the file system later on.

nbd-client -d calls

ioctl(nbd, NBD_DISCONNECT);
ioctl(nbd, NBD_CLEAR_SOCK);

(error handling removed for clarity)

where "nbd" is the file handle to the nbd device. This expects that the
device is cleared and that then the device can be reused for a different
connection, much like "losetup -d". Expecting that the next connection
would talk to the same file system is wrong.

In netlink mode, it obviously doesn't use the ioctl()s, but instead
sends an NBD_CMD_DISCONNECT command, without any NBD_CLEAR_SOCK, for
which no equivalent message exists. At this point, obviously the same
result is expected in userspace, i.e., the device should now be
available for the next connection that may or may not be the same one.

nbd-client also has "-persist" option that used to work. This does
expect to resurrect the device and file system. It depends on semantics
where the kernel would block IO to the device until the nbd-client
process that initiated the connection exits, thus allowing it to
re-establish the connection if possible. When doing this, we don't issue
a DISCONNECT or CLEAR_SOCK message and obviously the client is expected
to re-establish a connection to the same device, thus some state should
be retained.

These semantics have however been broken at some point over the past decade
or so, but I didn't notice that at the time, so I didn't complain, and
it's therefore probably not relevant anymore. We should perhaps rethink
whether this is still a good idea given the way the netlink mode does
not have a client waiting for a return of the ioctl() call, and if so
how to implement a replacement.

Kind regards,
Christoph Hellwig Oct. 2, 2023, 6:21 a.m. UTC | #4
On Sun, Oct 01, 2023 at 07:10:53PM +0200, Wouter Verhelst wrote:
> > So what are the semantics of clearing the socket?
> > 
> > The <= 6.5 behavior of invalidating fs caches, but not actually marking
> > the fs shutdown is pretty broken, especially if this expects to resurrect
> > the device and thus the file system later on.
> 
> nbd-client -d calls
> 
> ioctl(nbd, NBD_DISCONNECT);
> ioctl(nbd, NBD_CLEAR_SOCK);
> 
> (error handling removed for clarity)
> 
> where "nbd" is the file handle to the nbd device. This expects that the
> device is cleared and that then the device can be reused for a different
> connection, much like "losetup -d". Expecting that the next connection
> would talk to the same file system is wrong.

So a fs shutdown seems like a the right thing.  So I'm a little confused
on what actualy broke here.
Samuel Holland Oct. 2, 2023, 7:15 p.m. UTC | #5
On 2023-10-02 1:21 AM, Christoph Hellwig wrote:
> On Sun, Oct 01, 2023 at 07:10:53PM +0200, Wouter Verhelst wrote:
>>> So what are the semantics of clearing the socket?
>>>
>>> The <= 6.5 behavior of invalidating fs caches, but not actually marking
>>> the fs shutdown is pretty broken, especially if this expects to resurrect
>>> the device and thus the file system later on.
>>
>> nbd-client -d calls
>>
>> ioctl(nbd, NBD_DISCONNECT);
>> ioctl(nbd, NBD_CLEAR_SOCK);
>>
>> (error handling removed for clarity)
>>
>> where "nbd" is the file handle to the nbd device. This expects that the
>> device is cleared and that then the device can be reused for a different
>> connection, much like "losetup -d". Expecting that the next connection
>> would talk to the same file system is wrong.
> 
> So a fs shutdown seems like a the right thing.  So I'm a little confused
> on what actualy broke here.

I'm not too familiar with the block subsystem, but my understanding is that
blk_mark_disk_dead(nbd->disk) is permanent -- there is no way to un-mark a disk
as dead. So this makes the device (e.g. /dev/nbd0) permanently unusable after
the call to ioctl(nbd, NBD_CLEAR_SOCK).

Like Wouter said, the semantics should be similar to a loop device, where the
same block device can be reused after being disconnected. That was why I
suggested disk_force_media_change() as called from __loop_clr_fd(). The loop
driver doesn't call blk_mark_disk_dead() anywhere.

Regards,
Samuel
diff mbox series

Patch

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 8576d696c7a221..42e0159bb258fa 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1434,12 +1434,10 @@  static int nbd_start_device_ioctl(struct nbd_device *nbd)
 	return ret;
 }
 
-static void nbd_clear_sock_ioctl(struct nbd_device *nbd,
-				 struct block_device *bdev)
+static void nbd_clear_sock_ioctl(struct nbd_device *nbd)
 {
+	blk_mark_disk_dead(nbd->disk);
 	nbd_clear_sock(nbd);
-	__invalidate_device(bdev, true);
-	nbd_bdev_reset(nbd);
 	if (test_and_clear_bit(NBD_RT_HAS_CONFIG_REF,
 			       &nbd->config->runtime_flags))
 		nbd_config_put(nbd);
@@ -1465,7 +1463,7 @@  static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
 	case NBD_DISCONNECT:
 		return nbd_disconnect(nbd);
 	case NBD_CLEAR_SOCK:
-		nbd_clear_sock_ioctl(nbd, bdev);
+		nbd_clear_sock_ioctl(nbd);
 		return 0;
 	case NBD_SET_SOCK:
 		return nbd_add_socket(nbd, arg, false);