diff mbox series

[RFC,v3,04/41] csiostor: use reserved command for LUN reset

Message ID 20200430131904.5847-5-hare@suse.de (mailing list archive)
State Changes Requested
Headers show
Series scsi: enable reserved commands for LLDDs | expand

Commit Message

Hannes Reinecke April 30, 2020, 1:18 p.m. UTC
When issuing a LUN reset we should be using a reserved command
to avoid overwriting the original command.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/csiostor/csio_init.c |  1 +
 drivers/scsi/csiostor/csio_scsi.c | 48 +++++++++++++++++++++++----------------
 2 files changed, 30 insertions(+), 19 deletions(-)

Comments

Ming Lei April 30, 2020, 3:15 p.m. UTC | #1
On Thu, Apr 30, 2020 at 03:18:27PM +0200, Hannes Reinecke wrote:
> When issuing a LUN reset we should be using a reserved command
> to avoid overwriting the original command.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.com>
> ---
>  drivers/scsi/csiostor/csio_init.c |  1 +
>  drivers/scsi/csiostor/csio_scsi.c | 48 +++++++++++++++++++++++----------------
>  2 files changed, 30 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
> index 8dea7d53788a..5e1b0a24caf6 100644
> --- a/drivers/scsi/csiostor/csio_init.c
> +++ b/drivers/scsi/csiostor/csio_init.c
> @@ -622,6 +622,7 @@ csio_shost_init(struct csio_hw *hw, struct device *dev,
>  	ln->dev_num = (shost->host_no << 16);
>  
>  	shost->can_queue = CSIO_MAX_QUEUE;
> +	shost->nr_reserved_cmds = 1;

->can_queue isn't increased by 1 given CSIO_MAX_QUEUE isn't changed, so
setting shost->nr_reserved_cmds as 1 will cause io queue depth reduced by 1,
that is supposed to not happen.

Any conversion not increasing ->can_queue should have this same issue, please
check other conversions.



thanks,
Ming
Hannes Reinecke May 1, 2020, 1:01 p.m. UTC | #2
On 4/30/20 5:15 PM, Ming Lei wrote:
> On Thu, Apr 30, 2020 at 03:18:27PM +0200, Hannes Reinecke wrote:
>> When issuing a LUN reset we should be using a reserved command
>> to avoid overwriting the original command.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.com>
>> ---
>>   drivers/scsi/csiostor/csio_init.c |  1 +
>>   drivers/scsi/csiostor/csio_scsi.c | 48 +++++++++++++++++++++++----------------
>>   2 files changed, 30 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
>> index 8dea7d53788a..5e1b0a24caf6 100644
>> --- a/drivers/scsi/csiostor/csio_init.c
>> +++ b/drivers/scsi/csiostor/csio_init.c
>> @@ -622,6 +622,7 @@ csio_shost_init(struct csio_hw *hw, struct device *dev,
>>   	ln->dev_num = (shost->host_no << 16);
>>   
>>   	shost->can_queue = CSIO_MAX_QUEUE;
>> +	shost->nr_reserved_cmds = 1;
> 
> ->can_queue isn't increased by 1 given CSIO_MAX_QUEUE isn't changed, so
> setting shost->nr_reserved_cmds as 1 will cause io queue depth reduced by 1,
> that is supposed to not happen.
> 
We cannot increase MAX_QUEUE arbitrarily as this is a compile time 
variable, which seems to relate to a hardware setting.

But I can see to update the reserved command functionality for allowing 
to fetch commands from the normal I/O tag pool; in the case of LUN reset 
it shouldn't make much of a difference as the all I/O is quiesced anyway.

Cheers,

Hannes
Ming Lei May 1, 2020, 3:01 p.m. UTC | #3
On Fri, May 01, 2020 at 03:01:14PM +0200, Hannes Reinecke wrote:
> On 4/30/20 5:15 PM, Ming Lei wrote:
> > On Thu, Apr 30, 2020 at 03:18:27PM +0200, Hannes Reinecke wrote:
> > > When issuing a LUN reset we should be using a reserved command
> > > to avoid overwriting the original command.
> > > 
> > > Signed-off-by: Hannes Reinecke <hare@suse.com>
> > > ---
> > >   drivers/scsi/csiostor/csio_init.c |  1 +
> > >   drivers/scsi/csiostor/csio_scsi.c | 48 +++++++++++++++++++++++----------------
> > >   2 files changed, 30 insertions(+), 19 deletions(-)
> > > 
> > > diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
> > > index 8dea7d53788a..5e1b0a24caf6 100644
> > > --- a/drivers/scsi/csiostor/csio_init.c
> > > +++ b/drivers/scsi/csiostor/csio_init.c
> > > @@ -622,6 +622,7 @@ csio_shost_init(struct csio_hw *hw, struct device *dev,
> > >   	ln->dev_num = (shost->host_no << 16);
> > >   	shost->can_queue = CSIO_MAX_QUEUE;
> > > +	shost->nr_reserved_cmds = 1;
> > 
> > ->can_queue isn't increased by 1 given CSIO_MAX_QUEUE isn't changed, so
> > setting shost->nr_reserved_cmds as 1 will cause io queue depth reduced by 1,
> > that is supposed to not happen.
> > 
> We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> which seems to relate to a hardware setting.
> 
> But I can see to update the reserved command functionality for allowing to
> fetch commands from the normal I/O tag pool; in the case of LUN reset it
> shouldn't make much of a difference as the all I/O is quiesced anyway.

It isn't related with reset.

This patch reduces active IO queue depth by 1 anytime no matter there is reset
or not, and this way may cause performance regression.

Thanks, 
Ming
hch@lst.de May 1, 2020, 5:45 p.m. UTC | #4
On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
> > We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> > which seems to relate to a hardware setting.
> > 
> > But I can see to update the reserved command functionality for allowing to
> > fetch commands from the normal I/O tag pool; in the case of LUN reset it
> > shouldn't make much of a difference as the all I/O is quiesced anyway.
> 
> It isn't related with reset.
> 
> This patch reduces active IO queue depth by 1 anytime no matter there is reset
> or not, and this way may cause performance regression.

But isn't it the right thing to do?  How else do we guarantee that
there always is a tag available for the LU reset?
Ming Lei May 2, 2020, 3:11 a.m. UTC | #5
On Fri, May 01, 2020 at 07:45:05PM +0200, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
> > > We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> > > which seems to relate to a hardware setting.
> > > 
> > > But I can see to update the reserved command functionality for allowing to
> > > fetch commands from the normal I/O tag pool; in the case of LUN reset it
> > > shouldn't make much of a difference as the all I/O is quiesced anyway.
> > 
> > It isn't related with reset.
> > 
> > This patch reduces active IO queue depth by 1 anytime no matter there is reset
> > or not, and this way may cause performance regression.
> 
> But isn't it the right thing to do?  How else do we guarantee that
> there always is a tag available for the LU reset?

If that is case, some of these patches should be bug-fix, but nothing
about this kind of comment is provided. If it is true, please update
the commit log and explain the current issue in detail, such as,
what is the side-effect of 'overwriting the original command'?

And we might need to backport it to stable tree because storage error
recovery is very key function.

Even though it is true, still not sure if this patch is the correct
way to fix the issue cause IO performance drop might be caused.


Thanks,
Ming
Hannes Reinecke May 2, 2020, 8:49 a.m. UTC | #6
On 5/1/20 7:45 PM, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
>>> which seems to relate to a hardware setting.
>>>
>>> But I can see to update the reserved command functionality for allowing to
>>> fetch commands from the normal I/O tag pool; in the case of LUN reset it
>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>
>> It isn't related with reset.
>>
>> This patch reduces active IO queue depth by 1 anytime no matter there is reset
>> or not, and this way may cause performance regression.
> 
> But isn't it the right thing to do?  How else do we guarantee that
> there always is a tag available for the LU reset?
> 
Precisely. One could argue that this is an issue with the current 
driver, too; if all tags have timed-out there is no way how we can send 
a LUN reset even now. Hence we need to reserve a tag for us to reliably 
send a LUN reset.
And this was precisely the problem what sparked off this entire 
patchset; some drivers require a valid tag to send internal, non SCSI 
commands to the hardware.
And with the current design it requires some really ugly hacks to make 
this to work.

Cheers,

Hannes
Ming Lei May 2, 2020, 2:29 p.m. UTC | #7
On Sat, May 02, 2020 at 10:49:32AM +0200, Hannes Reinecke wrote:
> On 5/1/20 7:45 PM, Christoph Hellwig wrote:
> > On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
> > > > We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> > > > which seems to relate to a hardware setting.
> > > > 
> > > > But I can see to update the reserved command functionality for allowing to
> > > > fetch commands from the normal I/O tag pool; in the case of LUN reset it
> > > > shouldn't make much of a difference as the all I/O is quiesced anyway.
> > > 
> > > It isn't related with reset.
> > > 
> > > This patch reduces active IO queue depth by 1 anytime no matter there is reset
> > > or not, and this way may cause performance regression.
> > 
> > But isn't it the right thing to do?  How else do we guarantee that
> > there always is a tag available for the LU reset?
> > 
> Precisely. One could argue that this is an issue with the current driver,
> too; if all tags have timed-out there is no way how we can send a LUN reset
> even now. Hence we need to reserve a tag for us to reliably send a LUN
> reset.
> And this was precisely the problem what sparked off this entire patchset;
> some drivers require a valid tag to send internal, non SCSI commands to the
> hardware.

Could you explain a bit how you conclude that csio_scsi reset hander has to
use one unique tag? At least we don't allocate request from block layer for
ioctl(SG_SCSI_RESET), see scsi_ioctl_reset(). Also this patch doesn't
use the reserved rq->tag too.

> And with the current design it requires some really ugly hacks to make this
> to work.

You also don't explain how csio_eh_lun_reset_handler() is broken and where
the ugly hack is in csio scsi too, and how this patch fixes the issue, could
you document the exact reason in the commit log?


Thanks,
Ming
Bart Van Assche May 2, 2020, 4:10 p.m. UTC | #8
On 2020-05-02 01:49, Hannes Reinecke wrote:
> On 5/1/20 7:45 PM, Christoph Hellwig wrote:
>> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time
>>>> variable,
>>>> which seems to relate to a hardware setting.
>>>>
>>>> But I can see to update the reserved command functionality for
>>>> allowing to
>>>> fetch commands from the normal I/O tag pool; in the case of LUN
>>>> reset it
>>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>>
>>> It isn't related with reset.
>>>
>>> This patch reduces active IO queue depth by 1 anytime no matter there
>>> is reset
>>> or not, and this way may cause performance regression.
>>
>> But isn't it the right thing to do?  How else do we guarantee that
>> there always is a tag available for the LU reset?
>>
> Precisely. One could argue that this is an issue with the current
> driver, too; if all tags have timed-out there is no way how we can send
> a LUN reset even now. Hence we need to reserve a tag for us to reliably
> send a LUN reset.
> And this was precisely the problem what sparked off this entire
> patchset; some drivers require a valid tag to send internal, non SCSI
> commands to the hardware.
> And with the current design it requires some really ugly hacks to make
> this to work.

Hi Hannes,

The above explanation seems incomplete to me. The code in
drivers/scsi/scsi_error.c and several SCSI LLDs use scsi_eh_prep_cmnd()
and scsi_eh_restore_cmnd() to reset a controller without allocating a
new command. Has it been considered to use that approach in the csiostor
driver?

Thanks,

Bart.
Hannes Reinecke May 4, 2020, 6:43 a.m. UTC | #9
On 5/2/20 5:11 AM, Ming Lei wrote:
> On Fri, May 01, 2020 at 07:45:05PM +0200, Christoph Hellwig wrote:
>> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
>>>> which seems to relate to a hardware setting.
>>>>
>>>> But I can see to update the reserved command functionality for allowing to
>>>> fetch commands from the normal I/O tag pool; in the case of LUN reset it
>>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>>
>>> It isn't related with reset.
>>>
>>> This patch reduces active IO queue depth by 1 anytime no matter there is reset
>>> or not, and this way may cause performance regression.
>>
>> But isn't it the right thing to do?  How else do we guarantee that
>> there always is a tag available for the LU reset?
> 
> If that is case, some of these patches should be bug-fix, but nothing
> about this kind of comment is provided. If it is true, please update
> the commit log and explain the current issue in detail, such as,
> what is the side-effect of 'overwriting the original command'?
> 
> And we might need to backport it to stable tree because storage error
> recovery is very key function.
>  > Even though it is true, still not sure if this patch is the correct
> way to fix the issue cause IO performance drop might be caused.
> 
You can't have it both ways.

The underlying problem is this:
The csiostor driver (and several others, too) require a valid hardware 
tag to send a LU reset command. Currently it tries to allocate a tag 
from the pool of free hardware tags.
However, experience shows that the majority of cases (in my personal 
experience _all_ of the cases) where we ever entered the error handler 
are due to command timeouts. If now all commands timed out, the tag 
space is full and we cannot get a free tag to send the LU reset command.
Hence LU reset will currently fail in this case.
With the patchset we will always ensure to have at least one free tag
such that we can send the LU reset command. But, as correctly noted,
it will reduce the available tagspace and possibly reduce the performance.
But you really can have it both ways. Either you go for max performance 
and have the risk of starving the error handler, or you go for 
reliability and accept a (slightly) lower performance.
And, btw, I'm not sure if one could even measure the performance impact.
csiostor has 2048 tags per HBA with a 10G FCoE link. So it would require 
a latency of less than 8us with 4k I/O to saturate the HBA; for 
everything slower we wouldn't be seeing anything.

Cheers,

Hannes
Hannes Reinecke May 4, 2020, 6:55 a.m. UTC | #10
On 5/2/20 4:29 PM, Ming Lei wrote:
> On Sat, May 02, 2020 at 10:49:32AM +0200, Hannes Reinecke wrote:
>> On 5/1/20 7:45 PM, Christoph Hellwig wrote:
>>> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
>>>>> which seems to relate to a hardware setting.
>>>>>
>>>>> But I can see to update the reserved command functionality for allowing to
>>>>> fetch commands from the normal I/O tag pool; in the case of LUN reset it
>>>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>>>
>>>> It isn't related with reset.
>>>>
>>>> This patch reduces active IO queue depth by 1 anytime no matter there is reset
>>>> or not, and this way may cause performance regression.
>>>
>>> But isn't it the right thing to do?  How else do we guarantee that
>>> there always is a tag available for the LU reset?
>>>
>> Precisely. One could argue that this is an issue with the current driver,
>> too; if all tags have timed-out there is no way how we can send a LUN reset
>> even now. Hence we need to reserve a tag for us to reliably send a LUN
>> reset.
>> And this was precisely the problem what sparked off this entire patchset;
>> some drivers require a valid tag to send internal, non SCSI commands to the
>> hardware.
> 
> Could you explain a bit how you conclude that csio_scsi reset hander has to
> use one unique tag? At least we don't allocate request from block layer for
> ioctl(SG_SCSI_RESET), see scsi_ioctl_reset(). Also this patch doesn't
> use the reserved rq->tag too.
> 
>> And with the current design it requires some really ugly hacks to make this
>> to work.
> 
> You also don't explain how csio_eh_lun_reset_handler() is broken and where
> the ugly hack is in csio scsi too, and how this patch fixes the issue, could
> you document the exact reason in the commit log?
> 
The problem is the ioctl path.
When issuing TMF commands from the ioctl path we currently do not have a 
valid SCSI command to pass to the various SCSI EH functions.
This requires the SCSI LLDDs to check for every EH function whether the 
passed in SCSI command is valid (ie coming from SCSI EH), or a made up 
one coming from the ioctl path.
And having to code various ways to work around this issue.

With this patchset I'm implementing a standardized way how these 
functions can be coded. By using a reserved command the EH functions and 
the driver internal command handling will always have a valid command, 
so the workarounds can be removed.
And this is also the first step to my final SCSI EH rewrite, where I'm 
planning to move the SCSI EH functions from having the SCSI command as 
argument to the respective object (ie LU reset will be having a SCSI 
device as argument, host reset a SCSI host etc.)

Cheers,

Hannes
Hannes Reinecke May 4, 2020, 7:02 a.m. UTC | #11
On 5/2/20 6:10 PM, Bart Van Assche wrote:
> On 2020-05-02 01:49, Hannes Reinecke wrote:
>> On 5/1/20 7:45 PM, Christoph Hellwig wrote:
>>> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time
>>>>> variable,
>>>>> which seems to relate to a hardware setting.
>>>>>
>>>>> But I can see to update the reserved command functionality for
>>>>> allowing to
>>>>> fetch commands from the normal I/O tag pool; in the case of LUN
>>>>> reset it
>>>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>>>
>>>> It isn't related with reset.
>>>>
>>>> This patch reduces active IO queue depth by 1 anytime no matter there
>>>> is reset
>>>> or not, and this way may cause performance regression.
>>>
>>> But isn't it the right thing to do?  How else do we guarantee that
>>> there always is a tag available for the LU reset?
>>>
>> Precisely. One could argue that this is an issue with the current
>> driver, too; if all tags have timed-out there is no way how we can send
>> a LUN reset even now. Hence we need to reserve a tag for us to reliably
>> send a LUN reset.
>> And this was precisely the problem what sparked off this entire
>> patchset; some drivers require a valid tag to send internal, non SCSI
>> commands to the hardware.
>> And with the current design it requires some really ugly hacks to make
>> this to work.
> 
> Hi Hannes,
> 
> The above explanation seems incomplete to me. The code in
> drivers/scsi/scsi_error.c and several SCSI LLDs use scsi_eh_prep_cmnd()
> and scsi_eh_restore_cmnd() to reset a controller without allocating a
> new command. Has it been considered to use that approach in the csiostor
> driver?
> 
As outlined in the response to Ming, the problem is the ioctl path.
When called from ioctl we do _not_ have a valid command, hence drivers 
have to figure out if the command is valid (ie coming from SCSI EH), or 
invalid (ie coming from ioctl).
I'm trying to unify both call paths by having the SCSI EH functions to 
always allocate a (reserved) SCSI command.

I have not moved this into the caller, as using a reserved command 
requires modifications of the driver itself (at least by setting 
'nr_reserved_cmds').

Also, the above approach does not work when we run into a command 
timeout; which, from my experience, is the majority of cases.
For timeout commands we precisely can _not_ re-use the command tag, as 
the command itself is still assumed live somewhere (otherwise we could 
have aborted it, and we wouldn't have to call the EH functions...).
But that means that any associated resources (like FC oxids) are still 
active on the wire, and re-using them would actually be a violating of 
the spec.

Cheers,

Hannes
Ming Lei May 4, 2020, 8:47 a.m. UTC | #12
On Mon, May 04, 2020 at 08:55:05AM +0200, Hannes Reinecke wrote:
> On 5/2/20 4:29 PM, Ming Lei wrote:
> > On Sat, May 02, 2020 at 10:49:32AM +0200, Hannes Reinecke wrote:
> > > On 5/1/20 7:45 PM, Christoph Hellwig wrote:
> > > > On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
> > > > > > We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> > > > > > which seems to relate to a hardware setting.
> > > > > > 
> > > > > > But I can see to update the reserved command functionality for allowing to
> > > > > > fetch commands from the normal I/O tag pool; in the case of LUN reset it
> > > > > > shouldn't make much of a difference as the all I/O is quiesced anyway.
> > > > > 
> > > > > It isn't related with reset.
> > > > > 
> > > > > This patch reduces active IO queue depth by 1 anytime no matter there is reset
> > > > > or not, and this way may cause performance regression.
> > > > 
> > > > But isn't it the right thing to do?  How else do we guarantee that
> > > > there always is a tag available for the LU reset?
> > > > 
> > > Precisely. One could argue that this is an issue with the current driver,
> > > too; if all tags have timed-out there is no way how we can send a LUN reset
> > > even now. Hence we need to reserve a tag for us to reliably send a LUN
> > > reset.
> > > And this was precisely the problem what sparked off this entire patchset;
> > > some drivers require a valid tag to send internal, non SCSI commands to the
> > > hardware.
> > 
> > Could you explain a bit how you conclude that csio_scsi reset hander has to
> > use one unique tag? At least we don't allocate request from block layer for
> > ioctl(SG_SCSI_RESET), see scsi_ioctl_reset(). Also this patch doesn't
> > use the reserved rq->tag too.
> > 
> > > And with the current design it requires some really ugly hacks to make this
> > > to work.
> > 
> > You also don't explain how csio_eh_lun_reset_handler() is broken and where
> > the ugly hack is in csio scsi too, and how this patch fixes the issue, could
> > you document the exact reason in the commit log?
> > 
> The problem is the ioctl path.
> When issuing TMF commands from the ioctl path we currently do not have a
> valid SCSI command to pass to the various SCSI EH functions.
> This requires the SCSI LLDDs to check for every EH function whether the
> passed in SCSI command is valid (ie coming from SCSI EH), or a made up one
> coming from the ioctl path.

Could you point out where the check is in csio driver?

> And having to code various ways to work around this issue.
> 
> With this patchset I'm implementing a standardized way how these functions
> can be coded. By using a reserved command the EH functions and the driver
> internal command handling will always have a valid command, so the
> workarounds can be removed.

It depends if every HBA really requires unique tag for sending reset command.

If some HBAs don't require unique tag for resetting device or host, the
reserved tag is wasted.

thanks, 
Ming
Hannes Reinecke May 4, 2020, 10:24 a.m. UTC | #13
On 5/4/20 10:47 AM, Ming Lei wrote:
> On Mon, May 04, 2020 at 08:55:05AM +0200, Hannes Reinecke wrote:
>> On 5/2/20 4:29 PM, Ming Lei wrote:
>>> On Sat, May 02, 2020 at 10:49:32AM +0200, Hannes Reinecke wrote:
>>>> On 5/1/20 7:45 PM, Christoph Hellwig wrote:
>>>>> On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
>>>>>>> We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
>>>>>>> which seems to relate to a hardware setting.
>>>>>>>
>>>>>>> But I can see to update the reserved command functionality for allowing to
>>>>>>> fetch commands from the normal I/O tag pool; in the case of LUN reset it
>>>>>>> shouldn't make much of a difference as the all I/O is quiesced anyway.
>>>>>>
>>>>>> It isn't related with reset.
>>>>>>
>>>>>> This patch reduces active IO queue depth by 1 anytime no matter there is reset
>>>>>> or not, and this way may cause performance regression.
>>>>>
>>>>> But isn't it the right thing to do?  How else do we guarantee that
>>>>> there always is a tag available for the LU reset?
>>>>>
>>>> Precisely. One could argue that this is an issue with the current driver,
>>>> too; if all tags have timed-out there is no way how we can send a LUN reset
>>>> even now. Hence we need to reserve a tag for us to reliably send a LUN
>>>> reset.
>>>> And this was precisely the problem what sparked off this entire patchset;
>>>> some drivers require a valid tag to send internal, non SCSI commands to the
>>>> hardware.
>>>
>>> Could you explain a bit how you conclude that csio_scsi reset hander has to
>>> use one unique tag? At least we don't allocate request from block layer for
>>> ioctl(SG_SCSI_RESET), see scsi_ioctl_reset(). Also this patch doesn't
>>> use the reserved rq->tag too.
>>>
>>>> And with the current design it requires some really ugly hacks to make this
>>>> to work.
>>>
>>> You also don't explain how csio_eh_lun_reset_handler() is broken and where
>>> the ugly hack is in csio scsi too, and how this patch fixes the issue, could
>>> you document the exact reason in the commit log?
>>>
>> The problem is the ioctl path.
>> When issuing TMF commands from the ioctl path we currently do not have a
>> valid SCSI command to pass to the various SCSI EH functions.
>> This requires the SCSI LLDDs to check for every EH function whether the
>> passed in SCSI command is valid (ie coming from SCSI EH), or a made up one
>> coming from the ioctl path.
> 
> Could you point out where the check is in csio driver?
> 
Okok, I see your point.

Indeed the csiostor driver doesn't use the 'tag' per se for submitting 
commands; it's just using the scsi command pointer as a tag to figure 
out if a completion has been send from the hw.

I'll be giving it a bit more thought, and will be dropping it for the 
next round (which will contain only the minimal changes to get the 
'reserved_cmds' interface in).

Cheers,

Hannes
Ming Lei May 4, 2020, 10:49 a.m. UTC | #14
On Mon, May 04, 2020 at 12:24:41PM +0200, Hannes Reinecke wrote:
> On 5/4/20 10:47 AM, Ming Lei wrote:
> > On Mon, May 04, 2020 at 08:55:05AM +0200, Hannes Reinecke wrote:
> > > On 5/2/20 4:29 PM, Ming Lei wrote:
> > > > On Sat, May 02, 2020 at 10:49:32AM +0200, Hannes Reinecke wrote:
> > > > > On 5/1/20 7:45 PM, Christoph Hellwig wrote:
> > > > > > On Fri, May 01, 2020 at 11:01:29PM +0800, Ming Lei wrote:
> > > > > > > > We cannot increase MAX_QUEUE arbitrarily as this is a compile time variable,
> > > > > > > > which seems to relate to a hardware setting.
> > > > > > > > 
> > > > > > > > But I can see to update the reserved command functionality for allowing to
> > > > > > > > fetch commands from the normal I/O tag pool; in the case of LUN reset it
> > > > > > > > shouldn't make much of a difference as the all I/O is quiesced anyway.
> > > > > > > 
> > > > > > > It isn't related with reset.
> > > > > > > 
> > > > > > > This patch reduces active IO queue depth by 1 anytime no matter there is reset
> > > > > > > or not, and this way may cause performance regression.
> > > > > > 
> > > > > > But isn't it the right thing to do?  How else do we guarantee that
> > > > > > there always is a tag available for the LU reset?
> > > > > > 
> > > > > Precisely. One could argue that this is an issue with the current driver,
> > > > > too; if all tags have timed-out there is no way how we can send a LUN reset
> > > > > even now. Hence we need to reserve a tag for us to reliably send a LUN
> > > > > reset.
> > > > > And this was precisely the problem what sparked off this entire patchset;
> > > > > some drivers require a valid tag to send internal, non SCSI commands to the
> > > > > hardware.
> > > > 
> > > > Could you explain a bit how you conclude that csio_scsi reset hander has to
> > > > use one unique tag? At least we don't allocate request from block layer for
> > > > ioctl(SG_SCSI_RESET), see scsi_ioctl_reset(). Also this patch doesn't
> > > > use the reserved rq->tag too.
> > > > 
> > > > > And with the current design it requires some really ugly hacks to make this
> > > > > to work.
> > > > 
> > > > You also don't explain how csio_eh_lun_reset_handler() is broken and where
> > > > the ugly hack is in csio scsi too, and how this patch fixes the issue, could
> > > > you document the exact reason in the commit log?
> > > > 
> > > The problem is the ioctl path.
> > > When issuing TMF commands from the ioctl path we currently do not have a
> > > valid SCSI command to pass to the various SCSI EH functions.
> > > This requires the SCSI LLDDs to check for every EH function whether the
> > > passed in SCSI command is valid (ie coming from SCSI EH), or a made up one
> > > coming from the ioctl path.
> > 
> > Could you point out where the check is in csio driver?
> > 
> Okok, I see your point.
> 
> Indeed the csiostor driver doesn't use the 'tag' per se for submitting
> commands; it's just using the scsi command pointer as a tag to figure out if
> a completion has been send from the hw.
> 
> I'll be giving it a bit more thought, and will be dropping it for the next
> round (which will contain only the minimal changes to get the
> 'reserved_cmds' interface in).

IMO, the 'reserved_cmds' interface is only needed in case that RESET
command is transported from IO channel. Any HBA which has dedicated
channel for sending RESET doesn't need this interface.

Thanks,
Ming
diff mbox series

Patch

diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
index 8dea7d53788a..5e1b0a24caf6 100644
--- a/drivers/scsi/csiostor/csio_init.c
+++ b/drivers/scsi/csiostor/csio_init.c
@@ -622,6 +622,7 @@  csio_shost_init(struct csio_hw *hw, struct device *dev,
 	ln->dev_num = (shost->host_no << 16);
 
 	shost->can_queue = CSIO_MAX_QUEUE;
+	shost->nr_reserved_cmds = 1;
 	shost->this_id = -1;
 	shost->unique_id = shost->host_no;
 	shost->max_cmd_len = 16; /* Max CDB length supported */
diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c
index 00cf33573136..273a8b952e69 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -2057,10 +2057,12 @@  csio_tm_cbfn(struct csio_hw *hw, struct csio_ioreq *req)
 static int
 csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 {
-	struct csio_lnode *ln = shost_priv(cmnd->device->host);
+	struct scsi_cmnd *reset_cmnd;
+	struct scsi_device *sdev = cmnd->device;
+	struct csio_lnode *ln = shost_priv(sdev->host);
 	struct csio_hw *hw = csio_lnode_to_hw(ln);
 	struct csio_scsim *scsim = csio_hw_to_scsim(hw);
-	struct csio_rnode *rn = (struct csio_rnode *)(cmnd->device->hostdata);
+	struct csio_rnode *rn = (struct csio_rnode *)(sdev->hostdata);
 	struct csio_ioreq *ioreq = NULL;
 	struct csio_scsi_qset *sqset;
 	unsigned long flags;
@@ -2073,13 +2075,13 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 		goto fail;
 
 	csio_dbg(hw, "Request to reset LUN:%llu (ssni:0x%x tgtid:%d)\n",
-		      cmnd->device->lun, rn->flowid, rn->scsi_id);
+		      sdev->lun, rn->flowid, rn->scsi_id);
 
 	if (!csio_is_lnode_ready(ln)) {
 		csio_err(hw,
 			 "LUN reset cannot be issued on non-ready"
 			 " local node vnpi:0x%x (LUN:%llu)\n",
-			 ln->vnp_flowid, cmnd->device->lun);
+			 ln->vnp_flowid, sdev->lun);
 		goto fail;
 	}
 
@@ -2099,17 +2101,22 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 		csio_err(hw,
 			 "LUN reset cannot be issued on non-ready"
 			 " remote node ssni:0x%x (LUN:%llu)\n",
-			 rn->flowid, cmnd->device->lun);
+			 rn->flowid, sdev->lun);
 		goto fail;
 	}
 
+	reset_cmnd = scsi_get_reserved_cmd(sdev, DMA_NONE);
+	if (!reset_cmnd) {
+		csio_err(hw, "No free TMF request\n");
+		goto fail;
+	}
 	/* Get a free ioreq structure - SM is already set to uninit */
 	ioreq = csio_get_scsi_ioreq_lock(hw, scsim);
 
 	if (!ioreq) {
 		csio_err(hw, "Out of IO request elements. Active # :%d\n",
 			 scsim->stats.n_active);
-		goto fail;
+		goto fail_ret_cmnd;
 	}
 
 	sqset			= &hw->sqset[ln->portid][smp_processor_id()];
@@ -2119,11 +2126,11 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	ioreq->iq_idx		= sqset->iq_idx;
 	ioreq->eq_idx		= sqset->eq_idx;
 
-	csio_scsi_cmnd(ioreq)	= cmnd;
-	cmnd->host_scribble	= (unsigned char *)ioreq;
-	cmnd->SCp.Status	= 0;
+	csio_scsi_cmnd(ioreq)	= reset_cmnd;
+	reset_cmnd->host_scribble	= (unsigned char *)ioreq;
+	reset_cmnd->SCp.Status	= 0;
 
-	cmnd->SCp.Message	= FCP_TMF_LUN_RESET;
+	reset_cmnd->SCp.Message	= FCP_TMF_LUN_RESET;
 	ioreq->tmo		= CSIO_SCSI_LUNRST_TMO_MS / 1000;
 
 	/*
@@ -2140,7 +2147,7 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	sld.level = CSIO_LEV_LUN;
 	sld.lnode = ioreq->lnode;
 	sld.rnode = ioreq->rnode;
-	sld.oslun = cmnd->device->lun;
+	sld.oslun = sdev->lun;
 
 	spin_lock_irqsave(&hw->lock, flags);
 	/* Kick off TM SM on the ioreq */
@@ -2156,14 +2163,14 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	csio_dbg(hw, "Waiting max %d secs for LUN reset completion\n",
 		    count * (CSIO_SCSI_TM_POLL_MS / 1000));
 	/* Wait for completion */
-	while ((((struct scsi_cmnd *)csio_scsi_cmnd(ioreq)) == cmnd)
+	while ((((struct scsi_cmnd *)csio_scsi_cmnd(ioreq)) == reset_cmnd)
 								&& count--)
 		msleep(CSIO_SCSI_TM_POLL_MS);
 
 	/* LUN reset timed-out */
-	if (((struct scsi_cmnd *)csio_scsi_cmnd(ioreq)) == cmnd) {
+	if (((struct scsi_cmnd *)csio_scsi_cmnd(ioreq)) == reset_cmnd) {
 		csio_err(hw, "LUN reset (%d:%llu) timed out\n",
-			 cmnd->device->id, cmnd->device->lun);
+			 sdev->id, sdev->lun);
 
 		spin_lock_irq(&hw->lock);
 		csio_scsi_drvcleanup(ioreq);
@@ -2174,11 +2181,12 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	}
 
 	/* LUN reset returned, check cached status */
-	if (cmnd->SCp.Status != FW_SUCCESS) {
+	if (reset_cmnd->SCp.Status != FW_SUCCESS) {
 		csio_err(hw, "LUN reset failed (%d:%llu), status: %d\n",
-			 cmnd->device->id, cmnd->device->lun, cmnd->SCp.Status);
-		goto fail;
+			 sdev->id, sdev->lun, reset_cmnd->SCp.Status);
+		goto fail_ret_cmnd;
 	}
+	scsi_put_reserved_cmd(reset_cmnd);
 
 	/* LUN reset succeeded, Start aborting affected I/Os */
 	/*
@@ -2196,7 +2204,7 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	if (retval != 0) {
 		csio_err(hw,
 			 "Attempt to abort I/Os during LUN reset of %llu"
-			 " returned %d\n", cmnd->device->lun, retval);
+			 " returned %d\n", sdev->lun, retval);
 		/* Return I/Os back to active_q */
 		spin_lock_irq(&hw->lock);
 		list_splice_tail_init(&local_q, &scsim->active_q);
@@ -2207,12 +2215,14 @@  csio_eh_lun_reset_handler(struct scsi_cmnd *cmnd)
 	CSIO_INC_STATS(rn, n_lun_rst);
 
 	csio_info(hw, "LUN reset occurred (%d:%llu)\n",
-		  cmnd->device->id, cmnd->device->lun);
+		  sdev->id, sdev->lun);
 
 	return SUCCESS;
 
 fail_ret_ioreq:
 	csio_put_scsi_ioreq_lock(hw, scsim, ioreq);
+fail_ret_cmnd:
+	scsi_put_reserved_cmd(reset_cmnd);
 fail:
 	CSIO_INC_STATS(rn, n_lun_rst_fail);
 	return FAILED;