diff mbox

[3/6] IB/srp: Fail SCSI commands silently

Message ID 530BA476.7040005@acm.org (mailing list archive)
State Rejected
Headers show

Commit Message

Bart Van Assche Feb. 24, 2014, 7:58 p.m. UTC
On 02/22/14 06:41, David Dillow wrote:
> I didn't suggest that -- I'm saying add a common functionality to turn
> on/off the message printing for commands that failed due to a dead
> transport. If it is useful for SRP initiators, it is probably useful for
> SAS and iSCSI imitators as well.
> 
> At a quick look, it seems you need to add a new value to the
> rq_flag_bits enum in include/linux/blk_types.h, __REQ_FAILED_TRANSPORT,
> somewhere before __REQ_NR_BITS. And a #define in the style of those
> following it. Use that flag instead of REQ_QUIET in the patch we are
> discussing, and change the code in scsi_lib.c to check
> 
>   ((req->cmd_flags & REQ_FAILED_TRANSPORT) && $user_wants_this_knob) || !(req->cmd_flags & REQ_QUIET)
> 
> before calling scsi_print_sense(), with appropriate naming, of course.
> This gives you a central place to control it, and allows for consistency
> between initiators.
> 
> I agree it is much easier to just fix it in SRP, especially given the
> glacial rate of change for the SCSI mid-layer, but I really think a
> central control and driver-by-driver enablement would be the best
> approach. YMMV.

Hello Dave,

That makes sense to me. Do you think the (untested) patch below could
be a valid alternative to the above proposal ?

---
 drivers/scsi/scsi_lib.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

David Dillow Feb. 25, 2014, 4:36 a.m. UTC | #1
On Mon, 2014-02-24 at 20:58 +0100, Bart Van Assche wrote:
> On 02/22/14 06:41, David Dillow wrote:
> > I didn't suggest that -- I'm saying add a common functionality to turn
> > on/off the message printing for commands that failed due to a dead
> > transport. If it is useful for SRP initiators, it is probably useful for
> > SAS and iSCSI imitators as well.
> > 
> > At a quick look, it seems you need to add a new value to the
> > rq_flag_bits enum in include/linux/blk_types.h, __REQ_FAILED_TRANSPORT,
> > somewhere before __REQ_NR_BITS. And a #define in the style of those
> > following it. Use that flag instead of REQ_QUIET in the patch we are
> > discussing, and change the code in scsi_lib.c to check
> > 
> >   ((req->cmd_flags & REQ_FAILED_TRANSPORT) && $user_wants_this_knob) || !(req->cmd_flags & REQ_QUIET)
> > 
> > before calling scsi_print_sense(), with appropriate naming, of course.
> > This gives you a central place to control it, and allows for consistency
> > between initiators.
> > 
> > I agree it is much easier to just fix it in SRP, especially given the
> > glacial rate of change for the SCSI mid-layer, but I really think a
> > central control and driver-by-driver enablement would be the best
> > approach. YMMV.
> 
> Hello Dave,
> 
> That makes sense to me. Do you think the (untested) patch below could
> be a valid alternative to the above proposal ?

I'd still like to see a knob to let the user turn it off -- sometimes
you want to see those messages -- but it seems like a viable approach,
especially if it works when you test it :)

I'll leave to others about how important that knob is, though. If no one
else barks, then it should be OK.

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 7bd7f0d..124ab53 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -857,7 +857,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  		 */
>  		if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
>  			;
> -		else if (!(req->cmd_flags & REQ_QUIET))
> +		else if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
> +			 !(req->cmd_flags & REQ_QUIET))
>  			scsi_print_sense("", cmd);
>  		result = 0;
>  		/* BLOCK_PC may have set error */


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 25, 2014, 10:33 a.m. UTC | #2
On 02/25/14 05:36, David Dillow wrote:
> On Mon, 2014-02-24 at 20:58 +0100, Bart Van Assche wrote:
>> On 02/22/14 06:41, David Dillow wrote:
>>> I didn't suggest that -- I'm saying add a common functionality to turn
>>> on/off the message printing for commands that failed due to a dead
>>> transport. If it is useful for SRP initiators, it is probably useful for
>>> SAS and iSCSI imitators as well.
>>>
>>> At a quick look, it seems you need to add a new value to the
>>> rq_flag_bits enum in include/linux/blk_types.h, __REQ_FAILED_TRANSPORT,
>>> somewhere before __REQ_NR_BITS. And a #define in the style of those
>>> following it. Use that flag instead of REQ_QUIET in the patch we are
>>> discussing, and change the code in scsi_lib.c to check
>>>
>>>   ((req->cmd_flags & REQ_FAILED_TRANSPORT) && $user_wants_this_knob) || !(req->cmd_flags & REQ_QUIET)
>>>
>>> before calling scsi_print_sense(), with appropriate naming, of course.
>>> This gives you a central place to control it, and allows for consistency
>>> between initiators.
>>>
>>> I agree it is much easier to just fix it in SRP, especially given the
>>> glacial rate of change for the SCSI mid-layer, but I really think a
>>> central control and driver-by-driver enablement would be the best
>>> approach. YMMV.
>>
>> That makes sense to me. Do you think the (untested) patch below could
>> be a valid alternative to the above proposal ?
> 
> I'd still like to see a knob to let the user turn it off -- sometimes
> you want to see those messages -- but it seems like a viable approach,
> especially if it works when you test it :)
> 
> I'll leave to others about how important that knob is, though. If no one
> else barks, then it should be OK.
> 
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 7bd7f0d..124ab53 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -857,7 +857,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>>  		 */
>>  		if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
>>  			;
>> -		else if (!(req->cmd_flags & REQ_QUIET))
>> +		else if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
>> +			 !(req->cmd_flags & REQ_QUIET))
>>  			scsi_print_sense("", cmd);
>>  		result = 0;
>>  		/* BLOCK_PC may have set error */

Hello Dave,

Testing learned me that the above patch suppresses the messages reported
by the SCSI mid-layer due to a transport failure but not the messages
reported by the block layer (e.g. "end_request: recoverable transport
error, dev sdc, sector 42514" -- see also blk_update_request() in
block/blk-core.c). Do you really think it is essential to introduce a
new flag in the block layer for the purpose of suppressing transport
layer error messages and to add support for that flag in the block core
and in the SCSI mid-layer ? To me it seems a lot simpler to use the
existing REQ_QUIET flag.

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Dillow Feb. 26, 2014, 6:32 a.m. UTC | #3
On Tue, 2014-02-25 at 11:33 +0100, Bart Van Assche wrote:
> Do you really think it is essential to introduce a
> new flag in the block layer for the purpose of suppressing transport
> layer error messages and to add support for that flag in the block
> core and in the SCSI mid-layer ? To me it seems a lot simpler to use
> the existing REQ_QUIET flag.

Yes, I think you want different semantics -- one is the requestor saying
"don't complain about this if it fails", the other is the LLD saying
"this failed due to transport failure, you may or may not want to report
it, under user control". But it doesn't necessarily have to be a new
block flag -- it may be as simple as adding a stanza in
scsi_io_completion() that looks like

if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
    user_want_to_silence_transport_errors)
	req->cmd_flags |= REQ_QUIET;

if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
	;
else if (!(req->cmd_flags & REQ_QUIET))
	scsi_print_sense("", cmd);



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 26, 2014, 1:16 p.m. UTC | #4
On 02/26/14 07:32, David Dillow wrote:
> On Tue, 2014-02-25 at 11:33 +0100, Bart Van Assche wrote:
>> Do you really think it is essential to introduce a
>> new flag in the block layer for the purpose of suppressing transport
>> layer error messages and to add support for that flag in the block
>> core and in the SCSI mid-layer ? To me it seems a lot simpler to use
>> the existing REQ_QUIET flag.
> 
> Yes, I think you want different semantics -- one is the requestor saying
> "don't complain about this if it fails", the other is the LLD saying
> "this failed due to transport failure, you may or may not want to report
> it, under user control". But it doesn't necessarily have to be a new
> block flag -- it may be as simple as adding a stanza in
> scsi_io_completion() that looks like
> 
> if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
>     user_want_to_silence_transport_errors)
> 	req->cmd_flags |= REQ_QUIET;
> 
> if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
> 	;
> else if (!(req->cmd_flags & REQ_QUIET))
> 	scsi_print_sense("", cmd);

But this approach doesn't suppress the error messages printed by the
block layer due to a transport layer failure. Additionally, the block
layer is neither aware of the SCSI transport layer state nor of the ASC
and ASCQ codes in the sense data. Hence my preference to keep patch 3/6
as it has been posted at the start of this thread.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Dillow Feb. 27, 2014, 5:48 a.m. UTC | #5
On Wed, 2014-02-26 at 14:16 +0100, Bart Van Assche wrote:
> On 02/26/14 07:32, David Dillow wrote:
> > On Tue, 2014-02-25 at 11:33 +0100, Bart Van Assche wrote:
> >> Do you really think it is essential to introduce a
> >> new flag in the block layer for the purpose of suppressing transport
> >> layer error messages and to add support for that flag in the block
> >> core and in the SCSI mid-layer ? To me it seems a lot simpler to use
> >> the existing REQ_QUIET flag.
> > 
> > Yes, I think you want different semantics -- one is the requestor saying
> > "don't complain about this if it fails", the other is the LLD saying
> > "this failed due to transport failure, you may or may not want to report
> > it, under user control". But it doesn't necessarily have to be a new
> > block flag -- it may be as simple as adding a stanza in
> > scsi_io_completion() that looks like
> > 
> > if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
> >     user_want_to_silence_transport_errors)
> > 	req->cmd_flags |= REQ_QUIET;
> > 
> > if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
> > 	;
> > else if (!(req->cmd_flags & REQ_QUIET))
> > 	scsi_print_sense("", cmd);
> 
> But this approach doesn't suppress the error messages printed by the
> block layer due to a transport layer failure.

I don't see where the block level is printing messages about the failed
requests -- but maybe I'm missing it. I see scsi_done() ->
blk_complete_request() -> blk_done_softirq() -> scsi_softirq_done() ->
scsi_finish_command() -> scsi_io_completion()

I'm not seeing where the errored block requests would be printed before
scsi_io_completion(), which is where we were talking about setting the
REQ_QUIET flag above.

What messages are you talking about -- even better if you can point to
where they come from -- but mainly which ones; I'm happy to track them
down myself. It's been a while since I've done a deep dive on this code,
so my apologies if it should be obvious to me...

> Additionally, the block layer is neither aware of the SCSI transport
> layer state nor of the ASC and ASCQ codes in the sense data.

Of course it isn't, and no one is suggesting it should.

> Hence my preference to keep patch 3/6 as it has been posted at the
> start of this thread.

I agree it is the most expedient approach to solve Sebastian's immediate
problem, but I'd rather the solution be generic to the SCSI mid-layer so
it will be useful for others beyond SRP.


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche Feb. 27, 2014, 1:44 p.m. UTC | #6
On 02/27/14 06:48, David Dillow wrote:
> I don't see where the block level is printing messages about the failed
> requests -- but maybe I'm missing it. I see scsi_done() ->
> blk_complete_request() -> blk_done_softirq() -> scsi_softirq_done() ->
> scsi_finish_command() -> scsi_io_completion()
> 
> I'm not seeing where the errored block requests would be printed before
> scsi_io_completion(), which is where we were talking about setting the
> REQ_QUIET flag above.
> 
> What messages are you talking about -- even better if you can point to
> where they come from -- but mainly which ones; I'm happy to track them
> down myself. It's been a while since I've done a deep dive on this code,
> so my apologies if it should be obvious to me...

Hello Dave,

I think the call chain is as follows: scsi_io_completion() ->
scsi_end_request() -> blk_end_request() -> blk_end_bidi_request() ->
blk_update_bidi_request() -> blk_update_request(). That last function
prints an error message for failed requests if REQ_QUIET has not been
set. So this means that it's possible to set the REQ_QUIET flag from the
SCSI mid-layer before the block layer checks that flag.

> I agree it is the most expedient approach to solve Sebastian's immediate
> problem, but I'd rather the solution be generic to the SCSI mid-layer so
> it will be useful for others beyond SRP.

It would be great if this would be solved in a more generic way. But
please keep in mind that this is not just Sebastian's problem - other
users have reported this before.

Bart.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7bd7f0d..124ab53 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -857,7 +857,8 @@  void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 		 */
 		if ((sshdr.asc == 0x0) && (sshdr.ascq == 0x1d))
 			;
-		else if (!(req->cmd_flags & REQ_QUIET))
+		else if (cmd->device->sdev_state != SDEV_TRANSPORT_OFFLINE &&
+			 !(req->cmd_flags & REQ_QUIET))
 			scsi_print_sense("", cmd);
 		result = 0;
 		/* BLOCK_PC may have set error */