diff mbox series

sd: skip non-removable devices in sd_check_events()

Message ID 20190116073509.131535-1-hare@suse.de (mailing list archive)
State Changes Requested
Headers show
Series sd: skip non-removable devices in sd_check_events() | expand

Commit Message

Hannes Reinecke Jan. 16, 2019, 7:35 a.m. UTC
If the device is _not_ removable we should not start the event
poller as the media will not go away. Having the event poller running
will block the open() call as it will try to flush outstanding events,
which it can't if the device is in state 'BLOCKED'. So the open() call
will be stalled until the device state changed, which might be quite
some time depending on the transport.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/sd.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Martin Wilck Jan. 16, 2019, 10:26 a.m. UTC | #1
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding
> events,
> which it can't if the device is in state 'BLOCKED'. So the open()
> call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/sd.c | 3 +++
>  1 file changed, 3 insertions(+)

Thanks - you lost patience with me, did you? :-)

I didn't submit this yet because Tejun (adding him to CC) had
explicitly removed this check in commit 2bae0093cab4, and I was still
trying to understand why.

Perhaps a follow-up discussion will clarify that.

Martin

> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index a1a44f52e0e8..521f0a384446 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct
> gendisk *disk, unsigned int clearing)
>  		return 0;
>  
>  	sdp = sdkp->device;
> +	if (!sdp->removable)
> +		return 0;
> +
>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp,
> "sd_check_events\n"));
>  
>  	/*
Hannes Reinecke Jan. 16, 2019, 10:32 a.m. UTC | #2
On 1/16/19 11:26 AM, Martin Wilck wrote:
> On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
>> If the device is _not_ removable we should not start the event
>> poller as the media will not go away. Having the event poller running
>> will block the open() call as it will try to flush outstanding
>> events,
>> which it can't if the device is in state 'BLOCKED'. So the open()
>> call
>> will be stalled until the device state changed, which might be quite
>> some time depending on the transport.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.com>
>> ---
>>   drivers/scsi/sd.c | 3 +++
>>   1 file changed, 3 insertions(+)
> 
> Thanks - you lost patience with me, did you? :-)
> 
> I didn't submit this yet because Tejun (adding him to CC) had
> explicitly removed this check in commit 2bae0093cab4, and I was still
> trying to understand why.
> 
> Perhaps a follow-up discussion will clarify that.
> 
To my understanding Tejun removed it as the check got moved into
set_media_not_present() function.
However, I'd be very much interested to hear on how a non-removable 
device can change ...
Sure, there might be non-compatible devices which do _not_ set the 
removable flag (despite them being removable) and the notorious aacraid 
which insists on claiming all devices being removable, but if the need 
arises we can always introduce a blacklist flag.

Cheers,

Hannes
Martin Wilck Jan. 16, 2019, 10:58 a.m. UTC | #3
On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:
> On 1/16/19 11:26 AM, Martin Wilck wrote:
> > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> > > If the device is _not_ removable we should not start the event
> > > poller as the media will not go away. Having the event poller
> > > running
> > > will block the open() call as it will try to flush outstanding
> > > events,
> > > which it can't if the device is in state 'BLOCKED'. So the open()
> > > call
> > > will be stalled until the device state changed, which might be
> > > quite
> > > some time depending on the transport.
> > > 
> > > Signed-off-by: Hannes Reinecke <hare@suse.com>
> > > ---
> > >   drivers/scsi/sd.c | 3 +++
> > >   1 file changed, 3 insertions(+)
> > 
> > Thanks - you lost patience with me, did you? :-)
> > 
> > I didn't submit this yet because Tejun (adding him to CC) had
> > explicitly removed this check in commit 2bae0093cab4, and I was
> > still
> > trying to understand why.
> > 
> > Perhaps a follow-up discussion will clarify that.
> > 
> To my understanding Tejun removed it as the check got moved into
> set_media_not_present() function.

Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
executed for non-removable disks in sd_check_events(), which was never
done in sd_media_changed() beforehand.

> However, I'd be very much interested to hear on how a non-removable 
> device can change ...

Me too. The specs at least don't seem to mandate that only removable
devices can return MEDIUM NOT PRESENT. But even then, it makes only
little sense to actively test such devices for a media change by
sending TURs.

> Sure, there might be non-compatible devices which do _not_ set the 
> removable flag (despite them being removable) and the notorious
> aacraid 
> which insists on claiming all devices being removable, but if the
> need 
> arises we can always introduce a blacklist flag.

I agree. Also, I suppose that if someone has a case for checking
for media change on non-removable devices, (s)he will bring it up.

Therefore:
Acked-by: Martin Wilck <mwilck@suse.com>

Regards
Martin
Douglas Gilbert Jan. 16, 2019, 4:32 p.m. UTC | #4
On 2019-01-16 5:58 a.m., Martin Wilck wrote:
> On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:
>> On 1/16/19 11:26 AM, Martin Wilck wrote:
>>> On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
>>>> If the device is _not_ removable we should not start the event
>>>> poller as the media will not go away. Having the event poller
>>>> running
>>>> will block the open() call as it will try to flush outstanding
>>>> events,
>>>> which it can't if the device is in state 'BLOCKED'. So the open()
>>>> call
>>>> will be stalled until the device state changed, which might be
>>>> quite
>>>> some time depending on the transport.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.com>
>>>> ---
>>>>    drivers/scsi/sd.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>
>>> Thanks - you lost patience with me, did you? :-)
>>>
>>> I didn't submit this yet because Tejun (adding him to CC) had
>>> explicitly removed this check in commit 2bae0093cab4, and I was
>>> still
>>> trying to understand why.
>>>
>>> Perhaps a follow-up discussion will clarify that.
>>>
>> To my understanding Tejun removed it as the check got moved into
>> set_media_not_present() function.
> 
> Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
> executed for non-removable disks in sd_check_events(), which was never
> done in sd_media_changed() beforehand.
> 
>> However, I'd be very much interested to hear on how a non-removable
>> device can change ...
> 
> Me too. The specs at least don't seem to mandate that only removable
> devices can return MEDIUM NOT PRESENT. But even then, it makes only
> little sense to actively test such devices for a media change by
> sending TURs.

So to be more precise, do you mean the LU disappearing while its
port(s) and containing target device remain online?

Otherwise a hot unplug meets the bill. Also some curious things can
happen at the transport level. For example I believe T10 have just
accepted a MODE SELECT command that will change a SAS wide port
to narrow port(s) (and vice versa). The device is taken offline for
a programmable period meant to be long enough for the scanning logic
to notice and then brought back in its new configuration.

Doug Gilbert

>> Sure, there might be non-compatible devices which do _not_ set the
>> removable flag (despite them being removable) and the notorious
>> aacraid
>> which insists on claiming all devices being removable, but if the
>> need
>> arises we can always introduce a blacklist flag.
> 
> I agree. Also, I suppose that if someone has a case for checking
> for media change on non-removable devices, (s)he will bring it up.
> 
> Therefore:
> Acked-by: Martin Wilck <mwilck@suse.com>
> 
> Regards
> Martin
> 
> 
>
Bart Van Assche Jan. 16, 2019, 5:11 p.m. UTC | #5
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding events,
> which it can't if the device is in state 'BLOCKED'. So the open() call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/sd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index a1a44f52e0e8..521f0a384446 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1539,6 +1539,9 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
>  		return 0;
>  
>  	sdp = sdkp->device;
> +	if (!sdp->removable)
> +		return 0;
> +
>  	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_check_events\n"));
>  
>  	/*

Hi Hannes,

Although this patch looks fine to me, wouldn't it be a better approach to
cause the events checker not to submit any SCSI commands to blocked devices?
That approach should improve the sd behavior for both removable and
non-removable devices.

Thanks,

Bart.
Martin Wilck Jan. 16, 2019, 9:05 p.m. UTC | #6
On Wed, 2019-01-16 at 11:32 -0500, Douglas Gilbert wrote:
> On 2019-01-16 5:58 a.m., Martin Wilck wrote:
> > On Wed, 2019-01-16 at 11:32 +0100, Hannes Reinecke wrote:
> > > On 1/16/19 11:26 AM, Martin Wilck wrote:
> > > > On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> > > > > If the device is _not_ removable we should not start the
> > > > > event
> > > > > poller as the media will not go away. Having the event poller
> > > > > running
> > > > > will block the open() call as it will try to flush
> > > > > outstanding
> > > > > events,
> > > > > which it can't if the device is in state 'BLOCKED'. So the
> > > > > open()
> > > > > call
> > > > > will be stalled until the device state changed, which might
> > > > > be
> > > > > quite
> > > > > some time depending on the transport.
> > > > > 
> > > > > Signed-off-by: Hannes Reinecke <hare@suse.com>
> > > > > ---
> > > > >    drivers/scsi/sd.c | 3 +++
> > > > >    1 file changed, 3 insertions(+)
> > > > 
> > > > Thanks - you lost patience with me, did you? :-)
> > > > 
> > > > I didn't submit this yet because Tejun (adding him to CC) had
> > > > explicitly removed this check in commit 2bae0093cab4, and I was
> > > > still
> > > > trying to understand why.
> > > > 
> > > > Perhaps a follow-up discussion will clarify that.
> > > > 
> > > To my understanding Tejun removed it as the check got moved into
> > > set_media_not_present() function.
> > 
> > Maybe, but as a matter of fact, after 2bae0093cab4, TUR could be
> > executed for non-removable disks in sd_check_events(), which was
> > never
> > done in sd_media_changed() beforehand.
> > 
> > > However, I'd be very much interested to hear on how a non-
> > > removable
> > > device can change ...
> > 
> > Me too. The specs at least don't seem to mandate that only
> > removable
> > devices can return MEDIUM NOT PRESENT. But even then, it makes only
> > little sense to actively test such devices for a media change by
> > sending TURs.
> 
> So to be more precise, do you mean the LU disappearing while its
> port(s) and containing target device remain online?
> 
> Otherwise a hot unplug meets the bill. Also some curious things can
> happen at the transport level. For example I believe T10 have just
> accepted a MODE SELECT command that will change a SAS wide port
> to narrow port(s) (and vice versa). The device is taken offline for
> a programmable period meant to be long enough for the scanning logic
> to notice and then brought back in its new configuration.

Are these really cases where the sense code would be UNIT ATTENTION /
MEDIUM NOT PRESENT or NOT READY / MEDIUM NOT PRESENT? That's all that
matters here. Other sorts of device removal would be handled much
differently.

Cheers,
Martin
Martin Wilck Jan. 16, 2019, 11:12 p.m. UTC | #7
On Wed, 2019-01-16 at 08:35 +0100, Hannes Reinecke wrote:
> If the device is _not_ removable we should not start the event
> poller as the media will not go away. Having the event poller running
> will block the open() call as it will try to flush outstanding
> events,
> which it can't if the device is in state 'BLOCKED'. So the open()
> call
> will be stalled until the device state changed, which might be quite
> some time depending on the transport.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.com>

I've pondered this some more. Meanwhile I think we should rather fix
the generic event handling code in genhd.c. The problem is that this
code schedules the check_events() function in some situations (notably,
on close(2) via blkdev_put()), even for devices which don't report any
supported events in the (struct gendisk)->events field (the sd driver
correctly sets the events field depending on the "removable" attribute,
but the that only influences the polling behavior, not the call upon
close()).

It looks weird to me that devices that don't report any supported
events would be checked for such events. The reason for this is 
7c88a168da80 and follow-ups (7eec77a1, 9fd097b1). For the sake of not
confusing userland, these commits sacrificed the information in the
generic block layer whether or not a given device actually supports
media change events.

I'm working on a patch to fix that.

Regards,
Martin
diff mbox series

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index a1a44f52e0e8..521f0a384446 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1539,6 +1539,9 @@  static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
 		return 0;
 
 	sdp = sdkp->device;
+	if (!sdp->removable)
+		return 0;
+
 	SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_check_events\n"));
 
 	/*