diff mbox

scsi_dh_alua: skip RTPG for devices only supporting active/optimized

Message ID 1512728048-84729-1-git-send-email-hare@suse.de (mailing list archive)
State Changes Requested
Headers show

Commit Message

Hannes Reinecke Dec. 8, 2017, 10:14 a.m. UTC
From: Hannes Reinecke <hare@suse.com>

For hardware only supporting active/optimized there's no point in
ever re-issuing RTPG as the only new state we can possibly read is
active/optimized.
This avoid spurious errors during path failover on such arrays.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/device_handler/scsi_dh_alua.c | 35 ++++++++++++++++++++++--------
 1 file changed, 26 insertions(+), 9 deletions(-)

Comments

Bart Van Assche Dec. 12, 2017, midnight UTC | #1
On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:
> @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)

>  	retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);

>  

>  	if (retval) {

> +		/*

> +		 * If the target only supports active/optimized there's

> +		 * not much we can do; it's not that we can switch paths

> +		 * or somesuch.

> +		 * So ignore any errors to avoid spurious failures during

> +		 * path failover.

> +		 */

> +		if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {

> +			sdev_printk(KERN_INFO, sdev,

> +				    "%s: ignoring rtpg result %d\n",

> +				    ALUA_DH_NAME, retval);

> +			kfree(buff);

> +			return SCSI_DH_OK;

> +		}


Hello Hannes,

Sorry but this change looks weird to me. If an RTPG fails, shouldn't
alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target
supports? Are you perhaps trying to implement a work-around for an array
that does not respond to RTPG during path transitions?

Thanks,

Bart.
Hannes Reinecke Dec. 12, 2017, 7:01 a.m. UTC | #2
On 12/12/2017 01:00 AM, Bart Van Assche wrote:
> On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:
>> @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>>  	retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);
>>  
>>  	if (retval) {
>> +		/*
>> +		 * If the target only supports active/optimized there's
>> +		 * not much we can do; it's not that we can switch paths
>> +		 * or somesuch.
>> +		 * So ignore any errors to avoid spurious failures during
>> +		 * path failover.
>> +		 */
>> +		if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
>> +			sdev_printk(KERN_INFO, sdev,
>> +				    "%s: ignoring rtpg result %d\n",
>> +				    ALUA_DH_NAME, retval);
>> +			kfree(buff);
>> +			return SCSI_DH_OK;
>> +		}
> 
> Hello Hannes,
> 
> Sorry but this change looks weird to me. If an RTPG fails, shouldn't
> alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target
> supports? Are you perhaps trying to implement a work-around for an array
> that does not respond to RTPG during path transitions?
> 
Yes, precisely.

Thing is: if an array is only supporting active/optimized the entire
device-handler stuff is basically a no-op as we can't switch paths anyway.
So it doesn't really matter if the RTPG fails; in fact, we could just
short-circuit the entire logic. I did not do that to allow for a state
modification (ie arrays _might_ decide to announce additional states
eventually, and only starting off with active/optimized as an initial
state set).

But if we return SCSI_DH_IO here the multipath logic will not attempt to
switch paths, and failover will not work.

Cheers,

Hannes
Bart Van Assche Dec. 12, 2017, 6:25 p.m. UTC | #3
On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote:
> On 12/12/2017 01:00 AM, Bart Van Assche wrote:

> > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:

> > > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)

> > >  	retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);

> > >  

> > >  	if (retval) {

> > > +		/*

> > > +		 * If the target only supports active/optimized there's

> > > +		 * not much we can do; it's not that we can switch paths

> > > +		 * or somesuch.

> > > +		 * So ignore any errors to avoid spurious failures during

> > > +		 * path failover.

> > > +		 */

> > > +		if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {

> > > +			sdev_printk(KERN_INFO, sdev,

> > > +				    "%s: ignoring rtpg result %d\n",

> > > +				    ALUA_DH_NAME, retval);

> > > +			kfree(buff);

> > > +			return SCSI_DH_OK;

> > > +		}

> > 

> > Hello Hannes,

> > 

> > Sorry but this change looks weird to me. If an RTPG fails, shouldn't

> > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target

> > supports? Are you perhaps trying to implement a work-around for an array

> > that does not respond to RTPG during path transitions?

> > 

> 

> Yes, precisely.

> 

> Thing is: if an array is only supporting active/optimized the entire

> device-handler stuff is basically a no-op as we can't switch paths anyway.

> So it doesn't really matter if the RTPG fails; in fact, we could just

> short-circuit the entire logic. I did not do that to allow for a state

> modification (ie arrays _might_ decide to announce additional states

> eventually, and only starting off with active/optimized as an initial

> state set).

> 

> But if we return SCSI_DH_IO here the multipath logic will not attempt to

> switch paths, and failover will not work.


Hello Hannes,

I would appreciate it if it would be mentioned more clearly in the comment
above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code
is a workaround for non-standard array behavior. I'm afraid that otherwise
people who will read that code will be puzzled about why that code has been
added.

Thanks,

Bart.
Hannes Reinecke Dec. 13, 2017, 9:14 a.m. UTC | #4
On 12/12/2017 07:25 PM, Bart Van Assche wrote:
> On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote:
>> On 12/12/2017 01:00 AM, Bart Van Assche wrote:
>>> On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote:
>>>> @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
>>>>  	retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);
>>>>  
>>>>  	if (retval) {
>>>> +		/*
>>>> +		 * If the target only supports active/optimized there's
>>>> +		 * not much we can do; it's not that we can switch paths
>>>> +		 * or somesuch.
>>>> +		 * So ignore any errors to avoid spurious failures during
>>>> +		 * path failover.
>>>> +		 */
>>>> +		if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
>>>> +			sdev_printk(KERN_INFO, sdev,
>>>> +				    "%s: ignoring rtpg result %d\n",
>>>> +				    ALUA_DH_NAME, retval);
>>>> +			kfree(buff);
>>>> +			return SCSI_DH_OK;
>>>> +		}
>>>
>>> Hello Hannes,
>>>
>>> Sorry but this change looks weird to me. If an RTPG fails, shouldn't
>>> alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target
>>> supports? Are you perhaps trying to implement a work-around for an array
>>> that does not respond to RTPG during path transitions?
>>>
>>
>> Yes, precisely.
>>
>> Thing is: if an array is only supporting active/optimized the entire
>> device-handler stuff is basically a no-op as we can't switch paths anyway.
>> So it doesn't really matter if the RTPG fails; in fact, we could just
>> short-circuit the entire logic. I did not do that to allow for a state
>> modification (ie arrays _might_ decide to announce additional states
>> eventually, and only starting off with active/optimized as an initial
>> state set).
>>
>> But if we return SCSI_DH_IO here the multipath logic will not attempt to
>> switch paths, and failover will not work.
> 
> Hello Hannes,
> 
> I would appreciate it if it would be mentioned more clearly in the comment
> above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code
> is a workaround for non-standard array behavior. I'm afraid that otherwise
> people who will read that code will be puzzled about why that code has been
> added.
> 
Okay, not a problem.
Will be sending out a v2.

Cheers,

Hannes
diff mbox

Patch

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index fd22dc6..b09c12b 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -40,6 +40,7 @@ 
 #define TPGS_SUPPORT_LBA_DEPENDENT	0x10
 #define TPGS_SUPPORT_OFFLINE		0x40
 #define TPGS_SUPPORT_TRANSITION		0x80
+#define TPGS_SUPPORT_ALL		0xdf
 
 #define RTPG_FMT_MASK			0x70
 #define RTPG_FMT_EXT_HDR		0x10
@@ -81,6 +82,7 @@  struct alua_port_group {
 	int			tpgs;
 	int			state;
 	int			pref;
+	int			valid_states;
 	unsigned		flags; /* used for optimizing STPG */
 	unsigned char		transition_tmo;
 	unsigned long		expiry;
@@ -243,6 +245,7 @@  static struct alua_port_group *alua_alloc_pg(struct scsi_device *sdev,
 	pg->group_id = group_id;
 	pg->tpgs = tpgs;
 	pg->state = SCSI_ACCESS_STATE_OPTIMAL;
+	pg->valid_states = TPGS_SUPPORT_ALL;
 	if (optimize_stpg)
 		pg->flags |= ALUA_OPTIMIZE_STPG;
 	kref_init(&pg->kref);
@@ -516,7 +519,7 @@  static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 {
 	struct scsi_sense_hdr sense_hdr;
 	struct alua_port_group *tmp_pg;
-	int len, k, off, valid_states = 0, bufflen = ALUA_RTPG_SIZE;
+	int len, k, off, bufflen = ALUA_RTPG_SIZE;
 	unsigned char *desc, *buff;
 	unsigned err, retval;
 	unsigned int tpg_desc_tbl_off;
@@ -541,6 +544,20 @@  static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 	retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags);
 
 	if (retval) {
+		/*
+		 * If the target only supports active/optimized there's
+		 * not much we can do; it's not that we can switch paths
+		 * or somesuch.
+		 * So ignore any errors to avoid spurious failures during
+		 * path failover.
+		 */
+		if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) {
+			sdev_printk(KERN_INFO, sdev,
+				    "%s: ignoring rtpg result %d\n",
+				    ALUA_DH_NAME, retval);
+			kfree(buff);
+			return SCSI_DH_OK;
+		}
 		if (!scsi_sense_valid(&sense_hdr)) {
 			sdev_printk(KERN_INFO, sdev,
 				    "%s: rtpg failed, result %d\n",
@@ -652,7 +669,7 @@  static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 					rcu_read_unlock();
 				}
 				if (tmp_pg == pg)
-					valid_states = desc[1];
+					tmp_pg->valid_states = desc[1];
 				spin_unlock_irqrestore(&tmp_pg->lock, flags);
 			}
 			kref_put(&tmp_pg->kref, release_port_group);
@@ -665,13 +682,13 @@  static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 		    "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n",
 		    ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state),
 		    pg->pref ? "preferred" : "non-preferred",
-		    valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
-		    valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
-		    valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
-		    valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
-		    valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
-		    valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
-		    valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
+		    pg->valid_states&TPGS_SUPPORT_TRANSITION?'T':'t',
+		    pg->valid_states&TPGS_SUPPORT_OFFLINE?'O':'o',
+		    pg->valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l',
+		    pg->valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u',
+		    pg->valid_states&TPGS_SUPPORT_STANDBY?'S':'s',
+		    pg->valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n',
+		    pg->valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a');
 
 	switch (pg->state) {
 	case SCSI_ACCESS_STATE_TRANSITIONING: