[Bug 111441] New: iscsi fails to attach to targets

Message ID 56ABBCFF.9060003@cs.wisc.edu (mailing list archive)
State Not Applicable, archived

Commit Message

Mike Christie Jan. 29, 2016, 7:26 p.m. UTC
On 01/29/2016 01:11 PM, Serguei Bezverkhi (sbezverk) wrote:
> If you send me the diff for your patch, I will build new kernel myself.
> 

Bugzilla must be messing something up. I attached it to one of the previous
mails. Attaching it here again.

Email me offlist and without bugzilla if you do not get it here.

The patch will fix the sysfs bug ons you are hitting.

I am not sure if it will fix the genhd one. We can deal with that one
next if it is a different issue.


> Serguei
> 
> -----Original Message-----
> From: Michael Christie [mailto:michaelc@cs.wisc.edu] 
> Sent: Friday, January 29, 2016 2:09 PM
> To: Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com>
> Cc: bugzilla-daemon@bugzilla.kernel.org; linux-scsi@vger.kernel.org
> Subject: Re: [Bug 111441] New: iscsi fails to attach to targets
> 
> 
>> On Jan 29, 2016, at 6:04 AM, Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com> wrote:
>>
>> Actually this server uses both cases: local targets (since it is an OpenStack server) and remote targets, as it tries to mount 4 remote file systems.
>>
>> You are correct, I always use the same box I just change the kernel it is using to boot. No other changes to the environment. I do not mind to load a test kernel without that suspected patch, just get me the RPM.
>>
> 
> I do not know what you mean. I think the patch I sent will fix the sysfs errors caused due to alua not being setup properly on your system and scsi_dh_alua failing to attach. That patch should be applied to the 4.4 upstream kernel. Are you saying you want me to make you a kernel rpm?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Comments

Serguei Bezverkhi (sbezverk) Jan. 29, 2016, 10:21 p.m. UTC | #1
Hi Mike,

I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.


Here is the dmesg output:

[   26.103812] scsi 3:0:0:2: Direct-Access     LIO-ORG  san-disk-2       4.0  PQ: 0 ANSI: 5
[   26.104338] sd 3:0:0:2: alua: supports implicit and explicit TPGS
[   26.104549] sd 3:0:0:2: alua: No target port descriptors found
[   26.104552] sd 3:0:0:2: alua: Attach failed (-22)
[   26.104554] sd 3:0:0:2: failed to add device handler: -22
[   26.104578] sd 3:0:0:2: [sdc] 20507809792 512-byte logical blocks: (10.4 TB/9.54 TiB)
[   26.104905] sd 3:0:0:2: [sdc] Write Protect is off
[   26.104908] sd 3:0:0:2: [sdc] Mode Sense: 43 00 10 08
[   26.105036] sd 3:0:0:2: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   26.112294] scsi host6: iSCSI Initiator over TCP/IP
[   26.113279] scsi 4:0:0:3: Direct-Access     LIO-ORG  san-disk-3       4.0  PQ: 0 ANSI: 5
[   26.113690] sd 4:0:0:3: alua: supports implicit and explicit TPGS
[   26.113877] sd 4:0:0:3: [sdd] 9765625856 512-byte logical blocks: (5.00 TB/4.54 TiB)
[   26.113948] sd 4:0:0:3: alua: No target port descriptors found
[   26.113951] sd 4:0:0:3: alua: Attach failed (-22)
[   26.113953] sd 4:0:0:3: failed to add device handler: -22
[   26.114292] sd 4:0:0:3: [sdd] Write Protect is off
[   26.114295] sd 4:0:0:3: [sdd] Mode Sense: 43 00 10 08
[   26.114503] sd 4:0:0:3: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   26.123875] scsi 5:0:0:1: Direct-Access     LIO-ORG  san-disk-1       4.0  PQ: 0 ANSI: 5
[   26.123911] scsi 6:0:0:4: Direct-Access     LIO-ORG  san-disk-4       4.0  PQ: 0 ANSI: 5
[   26.124452] sd 6:0:0:4: alua: supports implicit and explicit TPGS
[   26.124453] sd 5:0:0:1: alua: supports implicit and explicit TPGS
[   26.124724] sd 5:0:0:1: alua: No target port descriptors found
[   26.124727] sd 5:0:0:1: alua: Attach failed (-22)
[   26.124728] sd 5:0:0:1: failed to add device handler: -22
[   26.124736] sd 6:0:0:4: [sde] 10742171648 512-byte logical blocks: (5.49 TB/5.00 TiB)
[   26.124773] sd 5:0:0:1: [sdf] 7812499389 512-byte logical blocks: (3.99 TB/3.63 TiB)
[   26.124777] sd 6:0:0:4: alua: No target port descriptors found
[   26.124779] sd 6:0:0:4: alua: Attach failed (-22)
[   26.124780] sd 6:0:0:4: failed to add device handler: -22
[   26.125182] sd 5:0:0:1: [sdf] Write Protect is off
[   26.125184] sd 5:0:0:1: [sdf] Mode Sense: 43 00 10 08
[   26.125217] sd 6:0:0:4: [sde] Write Protect is off
[   26.125220] sd 6:0:0:4: [sde] Mode Sense: 43 00 10 08
[   26.125306] sd 5:0:0:1: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   26.125512] sd 6:0:0:4: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   26.129633]  sdf: sdf1
[   26.130637] sd 5:0:0:1: [sdf] Attached SCSI disk
[   26.144377] ixgbe 0000:04:00.0: registered PHC device on enp4s0f0
[   26.149072]  sdc: sdc1
[   26.150434] sd 3:0:0:2: [sdc] Attached SCSI disk
[   26.190709]  sdd: sdd1 sdd2
[   26.193348] sd 4:0:0:3: [sdd] Attached SCSI disk
[   26.230515]  sde: sde1
[   26.231674] sd 6:0:0:4: [sde] Attached SCSI disk
[   26.231987] sd 6:0:0:4: [sde] Synchronizing SCSI cache
[   26.232021] sd 5:0:0:1: [sdf] Synchronizing SCSI cache
[   26.233212] sd 3:0:0:2: [sdc] Synchronizing SCSI cache
[   26.233440] sd 4:0:0:3: [sdd] Synchronizing SCSI cache
[   26.236755] Buffer I/O error on dev sdc, logical block 2563476132, async page read
[   26.238897] Buffer I/O error on dev sdd, logical block 1220703182, async page read
[   26.245773] ixgbe 0000:04:00.1: SR-IOV enabled with 8 VFs
[   26.245775] ixgbe 0000:04:00.1: configure port vlans to keep your VFs secure
[   26.274544] scsi 6:0:0:0: Unexpected response from lun 4 while scanning, scan aborted
[   26.283173] scsi 3:0:0:0: Unexpected response from lun 2 while scanning, scan aborted
[   26.288571] scsi 4:0:0:0: Unexpected response from lun 3 while scanning, scan aborted
[   26.288618] scsi 5:0:0:0: Unexpected response from lun 1 while scanning, scan aborted


The second traceback is gone too, but I still have no luck attaching local iSCSI targets either.


[  639.148875] TARGET_CORE[iSCSI]: Expected Transfer Length: 264 does not match SCSI CDB Length: 8 for SAM Opcode: 0x12
[  639.148911] sd 7:0:0:0: [sdc] 115343360 512-byte logical blocks: (59.0 GB/55.0 GiB)
[  639.148925] sd 7:0:0:0: alua: No target port descriptors found
[  639.148928] sd 7:0:0:0: alua: Attach failed (-22)
[  639.149186] sd 7:0:0:0: [sdc] Write Protect is off
[  639.149188] sd 7:0:0:0: [sdc] Mode Sense: 43 00 10 08
[  639.149279] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[  639.149298] iSCSI/iqn.1994-05.com.redhat:cf7f1fafca4b: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[  639.149530] sd 7:0:0:0: failed to add device handler: -22
[  639.154762] sd 7:0:0:0: [sdc] Attached SCSI disk
[  639.154857] sd 7:0:0:0: [sdc] Synchronizing SCSI cache
[  655.279047] scsi 7:0:0:0: Direct-Access     LIO-ORG  IBLOCK           4.0  PQ: 0 ANSI: 5
[  655.279397] sd 7:0:0:0: alua: supports implicit and explicit TPGS
[  655.279503] TARGET_CORE[iSCSI]: Expected Transfer Length: 264 does not match SCSI CDB Length: 8 for SAM Opcode: 0x12
[  655.279533] sd 7:0:0:0: alua: No target port descriptors found
[  655.279535] sd 7:0:0:0: alua: Attach failed (-22)
[  655.279587] sd 7:0:0:0: [sdc] 115343360 512-byte logical blocks: (59.0 GB/55.0 GiB)
[  655.279848] sd 7:0:0:0: [sdc] Write Protect is off
[  655.279849] sd 7:0:0:0: [sdc] Mode Sense: 43 00 10 08
[  655.279981] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
[  655.280034] iSCSI/iqn.1994-05.com.redhat:cf7f1fafca4b: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[  655.280171] sd 7:0:0:0: failed to add device handler: -22
[  655.286008] sd 7:0:0:0: [sdc] Attached SCSI disk
[  655.286132] sd 7:0:0:0: [sdc] Synchronizing SCSI cache


Mike Christie Jan. 29, 2016, 11:32 p.m. UTC | #2
On 01/29/2016 04:21 PM, Serguei Bezverkhi (sbezverk) wrote:
> Hi Mike,
> 
> I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.
> 

That is sort of expected. Your target is not set up for ALUA properly. It
says it supports ALUA, but when scsi_dh_alua asks about the ports it is
reporting there are none. CCing the people that made the patch that
introduced the issue and own the code.

Hey Christoph and Hannes,

The dh/alua changes that added this:

        error = scsi_dh_add_device(sdev);
        if (error) {
                sdev_printk(KERN_INFO, sdev,
                                "failed to add device handler: %d\n",
                                error);
                return error;
        }

to scsi_sysfs_add_sdev are adding a regression.

1. If that fails, then we forget to do device_del before doing the
return. My patch in this thread added that back, so we do not see the
sysfs oopses anymore. But.....

2. It looks like in older kernels, we would allow misconfigured targets
like this one to still setup devices. Do we want that old behavior back?
Should we just ignore the return value from scsi_dh_add_device above?
Note that in this case, it is LIO so it can be easily fixed on the
target side by just setting it up properly. I do not think other targets
would hit this type of issue.
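
Point 1 above can be pictured with a small user-space model. This is only a sketch under stated assumptions: `toy_sdev`, `toy_device_add`, `toy_device_del`, `toy_dh_add_device` and `toy_sysfs_add_sdev` are hypothetical stand-ins invented for illustration, not the kernel functions, and the actual patch from this thread is not reproduced here. The model only shows the idea of undoing the registration before returning the handler error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a SCSI device; not the kernel's struct. */
struct toy_sdev {
    bool registered;      /* true between toy_device_add() and toy_device_del() */
    int  handler_result;  /* value the simulated handler attach returns */
};

/* Simulated registration and removal of the device. */
static int toy_device_add(struct toy_sdev *sdev)
{
    sdev->registered = true;
    return 0;
}

static void toy_device_del(struct toy_sdev *sdev)
{
    sdev->registered = false;
}

/* Simulated device-handler attach; returns 0 or a negative errno value. */
static int toy_dh_add_device(const struct toy_sdev *sdev)
{
    return sdev->handler_result;
}

/* Strict policy with cleanup: fail the add, but undo the registration first. */
static int toy_sysfs_add_sdev(struct toy_sdev *sdev)
{
    int error;

    assert(sdev != NULL);

    error = toy_device_add(sdev);
    if (error)
        return error;

    error = toy_dh_add_device(sdev);
    if (error) {
        fprintf(stderr, "failed to add device handler: %d\n", error);
        toy_device_del(sdev);   /* the step point 1 describes as forgotten */
        return error;
    }
    return 0;
}
```

In this model a handler result of -22 leaves the device unregistered after the call, so a later add of the same device does not find stale state left behind.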





Nicholas A. Bellinger Jan. 30, 2016, 7:38 a.m. UTC | #3
On Fri, 2016-01-29 at 17:32 -0600, Mike Christie wrote:
> On 01/29/2016 04:21 PM, Serguei Bezverkhi (sbezverk) wrote:
> > Hi Mike,
> > 
> > I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.
> > 
> 
> That is sort of expected. Your target is not setup for ALUA properly. It
> says it supports ALUA, but when scsi_dh_alua asks about the ports it is
> reporting there are none. Ccing the people that made the patch that
> added the issue and own the code.
> 
> Hey Christoph and Hannes,
> 
> The dh/alua changes that added this:
> 
>         error = scsi_dh_add_device(sdev);
>         if (error) {
>                 sdev_printk(KERN_INFO, sdev,
>                                 "failed to add device handler: %d\n",
>                                 error);
>                 return error;
>         }
> 
> to scsi_sysfs_add_sdev are adding a regression.
> 
> 1. If that fails, then we forget to do device_del before doing the
> return. My patch in this thread added that back, so we do not see the
> sysfs oopses anymore. But.....
> 
> 2. It looks like in older kernels, we would allow misconfigured targets
> like this one to still setup devices. Do we want that old behavior back?
> Should we just ignore the return value from scsi_dh_add_device above?
> Note that in this case, it is LIO so it can be easily fixed on the
> target side by just setting it up properly. I do not think other targets
> would hit this type of issue.
> 

Btw, what does misconfigured mean here wrt target ALUA..?

Mike Christie Feb. 1, 2016, 4:55 p.m. UTC | #4
On 01/30/2016 01:38 AM, Nicholas A. Bellinger wrote:
> Btw, what does misconfigured mean here wrt target ALUA..?

[   25.833195] sd 6:0:0:4: alua: supports implicit and explicit TPGS
[   25.833360] sd 6:0:0:4: alua: No target port descriptors found
[   25.833363] sd 6:0:0:4: alua: Attach failed (-22)
[   25.833365] sd 6:0:0:4: failed to add device handler: -22

He has LIO configured to report it supports implicit/explicit ALUA, but
the ports do not seem to be configured.
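
For readers decoding the log lines above: the -22 is a negated errno value, and on Linux errno 22 is EINVAL ("Invalid argument"). The small helper below is a hedged illustration of that mapping; `errno_name` and `ret_is` are hypothetical names made up for this sketch and only cover a few codes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: name a kernel-style return code (0 or negative errno). */
static const char *errno_name(int ret)
{
    assert(ret <= 0);   /* kernel-style codes are zero or negative */

    switch (-ret) {
    case 0:      return "success";
    case EINVAL: return "EINVAL";
    case ENOMEM: return "ENOMEM";
    case ENODEV: return "ENODEV";
    default:     return "unknown";
    }
}

/* Hypothetical convenience wrapper: does ret decode to the given name? */
static bool ret_is(int ret, const char *name)
{
    return strcmp(errno_name(ret), name) == 0;
}
```

With this sketch, `ret_is(-22, "EINVAL")` holds on Linux, which matches the reading that the handler rejected the target's response as invalid rather than running out of memory or finding no device.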

For the LIO config side, are his LUNs just not in the default_lu_gp or
any other group?
Serguei Bezverkhi (sbezverk) Feb. 2, 2016, 4:41 p.m. UTC | #5
Hello,

Any chance we could move forward with this investigation? I still cannot attach to any remote iSCSI targets with either 4.4.0 or 4.4.1 kernels.

Thank you

Serguei


 

Christoph Hellwig Feb. 2, 2016, 6:09 p.m. UTC | #6
On Fri, Jan 29, 2016 at 05:32:54PM -0600, Mike Christie wrote:
> Hey Christoph and Hannes,
> 
> The dh/alua changes that added this:
> 
>         error = scsi_dh_add_device(sdev);
>         if (error) {
>                 sdev_printk(KERN_INFO, sdev,
>                                 "failed to add device handler: %d\n",
>                                 error);
>                 return error;
>         }
> 
> to scsi_sysfs_add_sdev are adding a regression.
> 
> 1. If that fails, then we forget to do device_del before doing the
> return. My patch in this thread added that back, so we do not see the
> sysfs oopses anymore. But.....

Ok.

> 2. It looks like in older kernels, we would allow misconfigured targets
> like this one to still setup devices. Do we want that old behavior back?
> Should we just ignore the return value from scsi_dh_add_device above?
> Note that in this case, it is LIO so it can be easily fixed on the
> target side by just setting it up properly. I do not think other targets
> would hit this type of issue.

Be liberal in what you accept.. I guess we need to continue allowing
connections to these broken targets, but a warning would be useful.

Can you send a patch?
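
The warn-and-continue behavior suggested here can be sketched as a user-space model. This is an illustration only: `model_dev` and `model_add_dev_lenient` are hypothetical names, not kernel code, and the sketch does not claim to match whatever patch was eventually sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a device being added; not a kernel structure. */
struct model_dev {
    bool visible;        /* device was registered and stays usable */
    bool has_handler;    /* a device handler (e.g. ALUA) is attached */
    int  attach_result;  /* what the simulated handler attach returns */
};

/* Lenient policy: a failed handler attach is logged, not propagated. */
static int model_add_dev_lenient(struct model_dev *dev)
{
    assert(dev != NULL);

    dev->visible = true;    /* stands in for registering the device */

    if (dev->attach_result != 0) {
        fprintf(stderr, "warning: failed to add device handler: %d\n",
                dev->attach_result);
        dev->has_handler = false;   /* carry on without a handler */
    } else {
        dev->has_handler = true;
    }
    return 0;               /* the device itself is still set up */
}
```

In this model a target whose handler attach returns -22 still ends up visible, just without a handler, which is the older behavior the thread describes.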
Nicholas A. Bellinger Feb. 2, 2016, 10:56 p.m. UTC | #7
On Mon, 2016-02-01 at 10:55 -0600, Mike Christie wrote:
> On 01/30/2016 01:38 AM, Nicholas A. Bellinger wrote:
> > On Fri, 2016-01-29 at 17:32 -0600, Mike Christie wrote:
> >> On 01/29/2016 04:21 PM, Serguei Bezverkhi (sbezverk) wrote:
> >>> HI Mike,
> >>>
> >>> I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.
> >>>
> >>
> >> That is sort of expected. Your target is not setup for ALUA properly. It
> >> says it supports ALUA, but when scsi_dh_alua asks about the ports it is
> >> reporting there are none. Ccing the people that made the patch that
> >> added the issue and own the code.
> >>
> >> Hey Christoph and Hannes,
> >>
> >> The dh/alua changes that added this:
> >>
> >>         error = scsi_dh_add_device(sdev);
> >>         if (error) {
> >>                 sdev_printk(KERN_INFO, sdev,
> >>                                 "failed to add device handler: %d\n",
> >>                                 error);
> >>                 return error;
> >>         }
> >>
> >> to scsi_sysfs_add_sdev are adding a regression.
> >>
> >> 1. If that fails, then we forget to do device_del before doing the
> >> return. My patch in this thread added that back, so we do not see the
> >> sysfs oopses anymore. But.....
> >>
> >> 2. It looks like in older kernels, we would allow misconfigured targets
> >> like this one to still setup devices. Do we want that old behavior back?
> >> Should we just ignore the return value from scsi_dh_add_device above?
> >> Note that in this case, it is LIO so it can be easily fixed on the
> >> target side by just setting it up properly. I do not think other targets
> >> would hit this type of issue.
> >>
> > 
> > Btw, what does misconfigured mean here wrt target ALUA..?
> 
> [   25.833195] sd 6:0:0:4: alua: supports implicit and explicit TPGS
> [   25.833360] sd 6:0:0:4: alua: No target port descriptors found
> [   25.833363] sd 6:0:0:4: alua: Attach failed (-22)
> [   25.833365] sd 6:0:0:4: failed to add device handler: -22
> 

Strange, this hasn't changed in forever on the target side..

> He has LIO configured to report it supports implicit/explicit ALUA, but
> the ports do not seem to be configured.
> 
> For the LIO config side, are his LUNs just not in the default_lu_gp or
> any other group?

So every non-PSCSI backend device becomes part of default_lu_gp +
default_tg_pt_gp and automatically shows up in EVPD=0x83, without the
user needing to do any additional configuration.

Here's what the output looks like:

root@haakon3:/usr/src/target-pending.git# sg_inq -Hi /dev/sdb
VPD INQUIRY: Device Identification page
  <SNIP>
  Designation descriptor number 3, descriptor length: 8
    transport: Serial Attached SCSI Protocol (SPL-2)
    designator_type: Relative target port,  code_set: Binary
    associated with the target port
    designator header(hex): 61 94 00 04
    designator:
 00     00 00 00 02                                         ....
  Designation descriptor number 4, descriptor length: 8
    transport: Serial Attached SCSI Protocol (SPL-2)
    designator_type: Target port group,  code_set: Binary
    associated with the target port
    designator header(hex): 61 95 00 04
    designator:
 00     00 00 00 00                                         ....
  Designation descriptor number 5, descriptor length: 8
    designator_type: Logical unit group,  code_set: Binary
    associated with the addressed logical unit
    designator header(hex): 01 06 00 04
    designator:
 00     00 00 00 00                                         ....
 <SNIP>

So AFAICT, the relative target port, target port group, and logical unit
group being returned from the target on v4.5-rc1 code look correct.

Serguei, can you confirm with 'sg_inq -Hi /dev/sdX' output on your side
with the v3.10 based target..?

AFAICT the parsing in scsi_vpd_tpg_id() from commit a8aa3978 looks
correct too.

Hannes, any ideas..?
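
[Editorial sketch] To make the `designator header(hex)` lines in the sg_inq output easier to read, here is a minimal decoder for the 4-byte designation descriptor header, assuming the usual SPC layout of the Device Identification VPD page (0x83). It is not the kernel's scsi_vpd_tpg_id(); `struct desig_hdr`, `decode_desig_hdr()` and `desig_id16()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of one designation descriptor header (first 4 bytes). */
struct desig_hdr {
	unsigned proto_id;    /* byte 0, bits 7-4; only meaningful when piv is set */
	unsigned code_set;    /* byte 0, bits 3-0: 1 = binary, 2 = ASCII, 3 = UTF-8 */
	unsigned piv;         /* byte 1, bit 7 */
	unsigned association; /* byte 1, bits 5-4: 0 = LU, 1 = target port, 2 = target device */
	unsigned type;        /* byte 1, bits 3-0: 4 = relative target port,
	                         5 = target port group, 6 = logical unit group */
	unsigned length;      /* byte 3: designator length in bytes */
};

/* Split the 4 header bytes into their fields. */
static struct desig_hdr decode_desig_hdr(const uint8_t *h)
{
	struct desig_hdr d;

	assert(h != NULL);
	d.proto_id    = (h[0] >> 4) & 0x0f;
	d.code_set    = h[0] & 0x0f;
	d.piv         = (h[1] >> 7) & 0x01;
	d.association = (h[1] >> 4) & 0x03;
	d.type        = h[1] & 0x0f;
	d.length      = h[3];
	return d;
}

/*
 * Relative target port, target port group and logical unit group
 * designators are 4 bytes long; the identifier is the big-endian
 * 16-bit value in bytes 2-3.
 */
static unsigned desig_id16(const uint8_t *designator)
{
	assert(designator != NULL);
	return ((unsigned)designator[2] << 8) | designator[3];
}
```

With the header `61 95 00 04` this yields protocol 6, binary code set, association 1 (target port) and type 5 (target port group); the designator `00 00 00 00` that follows decodes to group id 0, and a relative target port designator of `00 00 00 02` decodes to port 2.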

Serguei Bezverkhi (sbezverk) Feb. 2, 2016, 11:21 p.m. UTC | #8
Here you go, same output to compare for 4.4.1 and 3.10.0

Linux 4.4.1 #1 SMP Tue Feb 2 16:15:36 EST 2016

[root@sbezverk-osp-3 ~(keystone_admin)]#  sg_inq -Hi /dev/sdc
VPD INQUIRY: Device Identification page
  Designation descriptor number 1, descriptor length: 20
    designator_type: NAA,  code_set: Binary
    associated with the addressed logical unit
    designator header(hex): 01 03 00 10
    designator:
 00     60 01 40 56 5f e9 25 e7  94 a4 20 69 2c 0c b3 c6    `.@V_.%... i,...
  Designation descriptor number 2, descriptor length: 60
    designator_type: T10 vendor identification,  code_set: ASCII
    associated with the addressed logical unit
    designator header(hex): 02 01 00 38
    designator:
 00     4c 49 4f 2d 4f 52 47 00  73 61 6e 2d 64 69 73 6b    LIO-ORG.san-disk
 10     2d 32 3a 36 35 66 65 39  32 35 65 2d 37 39 34 61    -2:65fe925e-794a
 20     2d 34 32 30 36 2d 39 32  63 30 2d 63 62 33 63 36    -4206-92c0-cb3c6
 30     39 61 35 33 34 39 61 00                             9a5349a.
  Designation descriptor number 3, descriptor length: 8
    transport: Internet SCSI (iSCSI)
    designator_type: Relative target port,  code_set: Binary
    associated with the target port
    designator header(hex): 51 94 00 04
    designator:
 00     00 00 00 01                                         ....
  Designation descriptor number 4, descriptor length: 8
    transport: Internet SCSI (iSCSI)
    designator_type: Target port group,  code_set: Binary
    associated with the target port
    designator header(hex): 51 95 00 04
    designator:
 00     00 00 00 00                                         ....
  Designation descriptor number 5, descriptor length: 8
    designator_type: Logical unit group,  code_set: Binary
    associated with the addressed logical unit
    designator header(hex): 01 06 00 04
    designator:
 00     00 00 00 00                                         ....
  Designation descriptor number 6, descriptor length: 80
    transport: Internet SCSI (iSCSI)
    designator_type: SCSI name string,  code_set: UTF-8
    associated with the target port
    designator header(hex): 53 98 00 4c
    designator:
 00     69 71 6e 2e 32 30 30 33  2d 30 31 2e 6f 72 67 2e    iqn.2003-01.org.
 10     6c 69 6e 75 78 2d 69 73  63 73 69 2e 73 62 65 7a    linux-iscsi.sbez
 20     76 65 72 6b 2d 73 61 6e  2d 31 2e 78 38 36 36 34    verk-san-1.x8664
 30     3a 73 6e 2e 33 64 66 63  66 66 62 64 66 66 34 33    :sn.3dfcffbdff43
 40     2c 74 2c 30 78 30 30 30  31 00 00 00                ,t,0x0001...
  Designation descriptor number 7, descriptor length: 72
    transport: Internet SCSI (iSCSI)
    designator_type: SCSI name string,  code_set: UTF-8
    associated with the target device that contains addressed lu
    designator header(hex): 53 a8 00 44
    designator:
 00     69 71 6e 2e 32 30 30 33  2d 30 31 2e 6f 72 67 2e    iqn.2003-01.org.
 10     6c 69 6e 75 78 2d 69 73  63 73 69 2e 73 62 65 7a    linux-iscsi.sbez
 20     76 65 72 6b 2d 73 61 6e  2d 31 2e 78 38 36 36 34    verk-san-1.x8664
 30     3a 73 6e 2e 33 64 66 63  66 66 62 64 66 66 34 33    :sn.3dfcffbdff43
 40     00 00 00 00                                         ....

Linux 3.10.0-327.4.5.el7.x86_64 #1 SMP Thu Jan 21 04:10:29 EST 2016

 [root@sbezverk-osp-3 ~(keystone_admin)]# sg_inq -Hi /dev/sdc

VPD INQUIRY: Device Identification page
  Designation descriptor number 1, descriptor length: 20
    designator_type: NAA,  code_set: Binary
    associated with the addressed logical unit
    designator header(hex): 01 03 00 10
    designator:
 00     60 01 40 56 5f e9 25 e7  94 a4 20 69 2c 0c b3 c6    `.@V_.%... i,...
  Designation descriptor number 2, descriptor length: 60
    designator_type: T10 vendor identification,  code_set: ASCII
    associated with the addressed logical unit
    designator header(hex): 02 01 00 38
    designator:
 00     4c 49 4f 2d 4f 52 47 00  73 61 6e 2d 64 69 73 6b    LIO-ORG.san-disk
 10     2d 32 3a 36 35 66 65 39  32 35 65 2d 37 39 34 61    -2:65fe925e-794a
 20     2d 34 32 30 36 2d 39 32  63 30 2d 63 62 33 63 36    -4206-92c0-cb3c6
 30     39 61 35 33 34 39 61 00                             9a5349a.
  Designation descriptor number 3, descriptor length: 8
    transport: Internet SCSI (iSCSI)
    designator_type: Relative target port,  code_set: Binary
    associated with the target port
    designator header(hex): 51 94 00 04
    designator:
 00     00 00 00 01                                         ....
  Designation descriptor number 4, descriptor length: 8
    transport: Internet SCSI (iSCSI)
    designator_type: Target port group,  code_set: Binary
    associated with the target port
    designator header(hex): 51 95 00 04
    designator:
 00     00 00 00 00                                         ....
  Designation descriptor number 5, descriptor length: 8
    designator_type: Logical unit group,  code_set: Binary
    associated with the addressed logical unit
    designator header(hex): 01 06 00 04
    designator:
 00     00 00 00 00                                         ....
  Designation descriptor number 6, descriptor length: 80
    transport: Internet SCSI (iSCSI)
    designator_type: SCSI name string,  code_set: UTF-8
    associated with the target port
    designator header(hex): 53 98 00 4c
    designator:
 00     69 71 6e 2e 32 30 30 33  2d 30 31 2e 6f 72 67 2e    iqn.2003-01.org.
 10     6c 69 6e 75 78 2d 69 73  63 73 69 2e 73 62 65 7a    linux-iscsi.sbez
 20     76 65 72 6b 2d 73 61 6e  2d 31 2e 78 38 36 36 34    verk-san-1.x8664
 30     3a 73 6e 2e 33 64 66 63  66 66 62 64 66 66 34 33    :sn.3dfcffbdff43
 40     2c 74 2c 30 78 30 30 30  31 00 00 00                ,t,0x0001...
  Designation descriptor number 7, descriptor length: 72
    transport: Internet SCSI (iSCSI)
    designator_type: SCSI name string,  code_set: UTF-8
    associated with the target device that contains addressed lu
    designator header(hex): 53 a8 00 44
    designator:
 00     69 71 6e 2e 32 30 30 33  2d 30 31 2e 6f 72 67 2e    iqn.2003-01.org.
 10     6c 69 6e 75 78 2d 69 73  63 73 69 2e 73 62 65 7a    linux-iscsi.sbez
 20     76 65 72 6b 2d 73 61 6e  2d 31 2e 78 38 36 36 34    verk-san-1.x8664
 30     3a 73 6e 2e 33 64 66 63  66 66 62 64 66 66 34 33    :sn.3dfcffbdff43
 40     00 00 00 00                                         ....


Let me know if you need any additional info.

Thank you

Serguei


Nicholas A. Bellinger Feb. 8, 2016, 8:01 a.m. UTC | #9
On Tue, 2016-02-02 at 14:56 -0800, Nicholas A. Bellinger wrote:
> <SNIP>
> Hannes, any ideas..?

Ping.

Hannes Reinecke Feb. 16, 2016, 7:08 p.m. UTC | #10
On 02/08/2016 09:01 AM, Nicholas A. Bellinger wrote:
> On Tue, 2016-02-02 at 14:56 -0800, Nicholas A. Bellinger wrote:
>> <SNIP>
>> Hannes, any ideas..?
>
> Ping.
>
Please try with my latest scsi_dh_alua patchset posted to linux-scsi.
That should solve the error attaching devices.

Cheers,

Hannes
Serguei Bezverkhi (sbezverk) Feb. 22, 2016, 12:45 a.m. UTC | #11
Hi Mike,

I just wanted to follow up with you to see if the patch got committed to an upstream kernel. If yes, please let me know which version it went into.

Thank you

Serguei



-----Original Message-----
From: Mike Christie [mailto:michaelc@cs.wisc.edu] 
Sent: Friday, January 29, 2016 6:33 PM
To: Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com>
Cc: bugzilla-daemon@bugzilla.kernel.org; linux-scsi@vger.kernel.org; Christoph Hellwig <hch@infradead.org>; Hannes Reinecke <hare@suse.de>
Subject: Re: [Bug 111441] New: iscsi fails to attach to targets

On 01/29/2016 04:21 PM, Serguei Bezverkhi (sbezverk) wrote:
> HI Mike,
> 
> I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.
> 

That is sort of expected. Your target is not setup for ALUA properly. It says it supports ALUA, but when scsi_dh_alua asks about the ports it is reporting there are none. Ccing the people that made the patch that added the issue and own the code.

Hey Christoph and Hannes,

The dh/alua changes that added this:

        error = scsi_dh_add_device(sdev);
        if (error) {
                sdev_printk(KERN_INFO, sdev,
                                "failed to add device handler: %d\n", error);
                return error;
        }

to scsi_sysfs_add_sdev are adding a regression.

1. If that fails, then we forget to do device_del before doing the return. My patch in this thread added that back, so we do not see the sysfs oopses anymore. But.....

2. It looks like in older kernels, we would allow misconfigured targets like this one to still setup devices. Do we want that old behavior back?
Should we just ignore the return value from scsi_dh_add_device above?
Note that in this case, it is LIO so it can be easily fixed on the target side by just setting it up properly. I do not think other targets would hit this type of issue.





> 
> Here is dmesg
> 
> [   26.103812] scsi 3:0:0:2: Direct-Access     LIO-ORG  san-disk-2       4.0  PQ: 0 ANSI: 5
> [   26.104338] sd 3:0:0:2: alua: supports implicit and explicit TPGS
> [   26.104549] sd 3:0:0:2: alua: No target port descriptors found
> [   26.104552] sd 3:0:0:2: alua: Attach failed (-22)
> [   26.104554] sd 3:0:0:2: failed to add device handler: -22
> [   26.104578] sd 3:0:0:2: [sdc] 20507809792 512-byte logical blocks: (10.4 TB/9.54 TiB)
> [   26.104905] sd 3:0:0:2: [sdc] Write Protect is off
> [   26.104908] sd 3:0:0:2: [sdc] Mode Sense: 43 00 10 08
> [   26.105036] sd 3:0:0:2: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [   26.112294] scsi host6: iSCSI Initiator over TCP/IP
> [   26.113279] scsi 4:0:0:3: Direct-Access     LIO-ORG  san-disk-3       4.0  PQ: 0 ANSI: 5
> [   26.113690] sd 4:0:0:3: alua: supports implicit and explicit TPGS
> [   26.113877] sd 4:0:0:3: [sdd] 9765625856 512-byte logical blocks: (5.00 TB/4.54 TiB)
> [   26.113948] sd 4:0:0:3: alua: No target port descriptors found
> [   26.113951] sd 4:0:0:3: alua: Attach failed (-22)
> [   26.113953] sd 4:0:0:3: failed to add device handler: -22
> [   26.114292] sd 4:0:0:3: [sdd] Write Protect is off
> [   26.114295] sd 4:0:0:3: [sdd] Mode Sense: 43 00 10 08
> [   26.114503] sd 4:0:0:3: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [   26.123875] scsi 5:0:0:1: Direct-Access     LIO-ORG  san-disk-1       4.0  PQ: 0 ANSI: 5
> [   26.123911] scsi 6:0:0:4: Direct-Access     LIO-ORG  san-disk-4       4.0  PQ: 0 ANSI: 5
> [   26.124452] sd 6:0:0:4: alua: supports implicit and explicit TPGS
> [   26.124453] sd 5:0:0:1: alua: supports implicit and explicit TPGS
> [   26.124724] sd 5:0:0:1: alua: No target port descriptors found
> [   26.124727] sd 5:0:0:1: alua: Attach failed (-22)
> [   26.124728] sd 5:0:0:1: failed to add device handler: -22
> [   26.124736] sd 6:0:0:4: [sde] 10742171648 512-byte logical blocks: (5.49 TB/5.00 TiB)
> [   26.124773] sd 5:0:0:1: [sdf] 7812499389 512-byte logical blocks: (3.99 TB/3.63 TiB)
> [   26.124777] sd 6:0:0:4: alua: No target port descriptors found
> [   26.124779] sd 6:0:0:4: alua: Attach failed (-22)
> [   26.124780] sd 6:0:0:4: failed to add device handler: -22
> [   26.125182] sd 5:0:0:1: [sdf] Write Protect is off
> [   26.125184] sd 5:0:0:1: [sdf] Mode Sense: 43 00 10 08
> [   26.125217] sd 6:0:0:4: [sde] Write Protect is off
> [   26.125220] sd 6:0:0:4: [sde] Mode Sense: 43 00 10 08
> [   26.125306] sd 5:0:0:1: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [   26.125512] sd 6:0:0:4: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [   26.129633]  sdf: sdf1
> [   26.130637] sd 5:0:0:1: [sdf] Attached SCSI disk
> [   26.144377] ixgbe 0000:04:00.0: registered PHC device on enp4s0f0
> [   26.149072]  sdc: sdc1
> [   26.150434] sd 3:0:0:2: [sdc] Attached SCSI disk
> [   26.190709]  sdd: sdd1 sdd2
> [   26.193348] sd 4:0:0:3: [sdd] Attached SCSI disk
> [   26.230515]  sde: sde1
> [   26.231674] sd 6:0:0:4: [sde] Attached SCSI disk
> [   26.231987] sd 6:0:0:4: [sde] Synchronizing SCSI cache
> [   26.232021] sd 5:0:0:1: [sdf] Synchronizing SCSI cache
> [   26.233212] sd 3:0:0:2: [sdc] Synchronizing SCSI cache
> [   26.233440] sd 4:0:0:3: [sdd] Synchronizing SCSI cache
> [   26.236755] Buffer I/O error on dev sdc, logical block 2563476132, async page read
> [   26.238897] Buffer I/O error on dev sdd, logical block 1220703182, async page read
> [   26.245773] ixgbe 0000:04:00.1: SR-IOV enabled with 8 VFs
> [   26.245775] ixgbe 0000:04:00.1: configure port vlans to keep your VFs secure
> [   26.274544] scsi 6:0:0:0: Unexpected response from lun 4 while scanning, scan aborted
> [   26.283173] scsi 3:0:0:0: Unexpected response from lun 2 while scanning, scan aborted
> [   26.288571] scsi 4:0:0:0: Unexpected response from lun 3 while scanning, scan aborted
> [   26.288618] scsi 5:0:0:0: Unexpected response from lun 1 while scanning, scan aborted
> 
> 
> Second traceback is gone too, but still no luck attaching local iscsi targets either.
> 
> 
> [  639.148875] TARGET_CORE[iSCSI]: Expected Transfer Length: 264 does not match SCSI CDB Length: 8 for SAM Opcode: 0x12
> [  639.148911] sd 7:0:0:0: [sdc] 115343360 512-byte logical blocks: (59.0 GB/55.0 GiB)
> [  639.148925] sd 7:0:0:0: alua: No target port descriptors found
> [  639.148928] sd 7:0:0:0: alua: Attach failed (-22)
> [  639.149186] sd 7:0:0:0: [sdc] Write Protect is off
> [  639.149188] sd 7:0:0:0: [sdc] Mode Sense: 43 00 10 08
> [  639.149279] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [  639.149298] iSCSI/iqn.1994-05.com.redhat:cf7f1fafca4b: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
> [  639.149530] sd 7:0:0:0: failed to add device handler: -22
> [  639.154762] sd 7:0:0:0: [sdc] Attached SCSI disk
> [  639.154857] sd 7:0:0:0: [sdc] Synchronizing SCSI cache
> [  655.279047] scsi 7:0:0:0: Direct-Access     LIO-ORG  IBLOCK           4.0  PQ: 0 ANSI: 5
> [  655.279397] sd 7:0:0:0: alua: supports implicit and explicit TPGS
> [  655.279503] TARGET_CORE[iSCSI]: Expected Transfer Length: 264 does not match SCSI CDB Length: 8 for SAM Opcode: 0x12
> [  655.279533] sd 7:0:0:0: alua: No target port descriptors found
> [  655.279535] sd 7:0:0:0: alua: Attach failed (-22)
> [  655.279587] sd 7:0:0:0: [sdc] 115343360 512-byte logical blocks: (59.0 GB/55.0 GiB)
> [  655.279848] sd 7:0:0:0: [sdc] Write Protect is off
> [  655.279849] sd 7:0:0:0: [sdc] Mode Sense: 43 00 10 08
> [  655.279981] sd 7:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> [  655.280034] iSCSI/iqn.1994-05.com.redhat:cf7f1fafca4b: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
> [  655.280171] sd 7:0:0:0: failed to add device handler: -22
> [  655.286008] sd 7:0:0:0: [sdc] Attached SCSI disk
> [  655.286132] sd 7:0:0:0: [sdc] Synchronizing SCSI cache
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Mike Christie [mailto:michaelc@cs.wisc.edu]
> Sent: Friday, January 29, 2016 2:27 PM
> To: Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com>
> Cc: bugzilla-daemon@bugzilla.kernel.org; linux-scsi@vger.kernel.org
> Subject: Re: [Bug 111441] New: iscsi fails to attach to targets
> 
> 
> 
> On 01/29/2016 01:11 PM, Serguei Bezverkhi (sbezverk) wrote:
>> If you send me the diff for your patch, I will build new kernel myself.
>>
> 
> Bugzilla must be messing something up. I attached to one of the previous mails. Attaching it here again.
> 
> Email me offlist and without bugzilla if you do not get it here.
> 
> The patch will fix the sysfs bug ons you are hitting.
> 
> I am not sure if it will fix the genhd one. We can deal with that one next if it is a different issue.
> 
> 
>> Serguei
>>
>>
>> -----Original Message-----
>> From: Michael Christie [mailto:michaelc@cs.wisc.edu]
>> Sent: Friday, January 29, 2016 2:09 PM
>> To: Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com>
>> Cc: bugzilla-daemon@bugzilla.kernel.org; linux-scsi@vger.kernel.org
>> Subject: Re: [Bug 111441] New: iscsi fails to attach to targets
>>
>>
>>> On Jan 29, 2016, at 6:04 AM, Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com> wrote:
>>>
>>> Actually this server uses both cases: local targets (since it is an OpenStack server) and remote targets, as it tries to mount 4 remote file systems.
>>>
>>> You are correct, I always use the same box; I just change the kernel it boots. No other changes to the environment. I do not mind loading a test kernel without that suspected patch, just get me the RPM.
>>>
>>
>> I do not know what you mean. I think the patch I sent will fix the sysfs errors caused by ALUA not being set up properly on your system and scsi_dh_alua failing to attach. That patch should be applied to the 4.4 upstream kernel. Are you saying you want me to make you a kernel rpm?

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Reinecke Feb. 22, 2016, 7:08 a.m. UTC | #12
On 02/22/2016 01:45 AM, Serguei Bezverkhi (sbezverk) wrote:
> Hi Mike,
> 
> I just wanted to follow up with you to see if the patch got committed to an upstream kernel. If yes, please let me know which version it went into.
> 
> Thank you
> 
> Serguei
> 
> 
> -----Original Message-----
> From: Mike Christie [mailto:michaelc@cs.wisc.edu] 
> Sent: Friday, January 29, 2016 6:33 PM
> To: Serguei Bezverkhi (sbezverk) <sbezverk@cisco.com>
> Cc: bugzilla-daemon@bugzilla.kernel.org; linux-scsi@vger.kernel.org; Christoph Hellwig <hch@infradead.org>; Hannes Reinecke <hare@suse.de>
> Subject: Re: [Bug 111441] New: iscsi fails to attach to targets
> 
> On 01/29/2016 04:21 PM, Serguei Bezverkhi (sbezverk) wrote:
>> Hi Mike,
>>
>> I tried your patch and it has eliminated the first traceback, but I still do not see my remote targets.
>>
> 
> That is sort of expected. Your target is not set up for ALUA properly. It says it supports ALUA, but when scsi_dh_alua asks about the ports, it reports that there are none. CCing the people who made the patch that introduced the issue and who own the code.
> 
> Hey Christoph and Hannes,
> 
> The dh/alua changes that added this:
> 
>         error = scsi_dh_add_device(sdev);
>         if (error) {
>                 sdev_printk(KERN_INFO, sdev,
>                                 "failed to add device handler: %d\n", error);
>                 return error;
>         }
> 
> to scsi_sysfs_add_sdev are adding a regression.
> 
> 1. If that fails, then we forget to do device_del before doing the return. My patch in this thread added that back, so we do not see the sysfs oopses anymore. But.....
> 
> 2. It looks like in older kernels, we would allow misconfigured targets like this one to still set up devices. Do we want that old behavior back?
> Should we just ignore the return value from scsi_dh_add_device above?
> Note that in this case, it is LIO so it can be easily fixed on the target side by just setting it up properly. I do not think other targets would hit this type of issue.
> 
> 
This has been fixed up with my patchset to update the ALUA handler, most
notably the commit 'scsi: ignore errors from scsi_dh_add_device()' which
was included in 4.5.

Cheers,

Hannes
Serguei Bezverkhi (sbezverk) Feb. 22, 2016, 11:36 a.m. UTC | #13
Hello Hannes,

Thank you for your reply. I am on the 4.4.2 kernel; is there any chance of committing it to 4.4 as well? If not, could you send me a diff for the 4.4 kernel?

Best regards

Serguei


Nicholas A. Bellinger Feb. 27, 2016, 10:15 p.m. UTC | #14
Hey Hannes,

On Tue, 2016-02-16 at 20:08 +0100, Hannes Reinecke wrote:
> On 02/08/2016 09:01 AM, Nicholas A. Bellinger wrote:
> > On Tue, 2016-02-02 at 14:56 -0800, Nicholas A. Bellinger wrote:
> >> On Mon, 2016-02-01 at 10:55 -0600, Mike Christie wrote:
> >>> On 01/30/2016 01:38 AM, Nicholas A. Bellinger wrote:
> >>>> On Fri, 2016-01-29 at 17:32 -0600, Mike Christie wrote:

<SNIP>

> >>>> Btw, what does misconfigured mean here wrt target ALUA..?
> >>>
> >>> [   25.833195] sd 6:0:0:4: alua: supports implicit and explicit TPGS
> >>> [   25.833360] sd 6:0:0:4: alua: No target port descriptors found
> >>> [   25.833363] sd 6:0:0:4: alua: Attach failed (-22)
> >>> [   25.833365] sd 6:0:0:4: failed to add device handler: -22
> >>>
> >>
> >> Strange, this hasn't changed in forever on the target side..
> >>
> >>> He has LIO configured to report it supports implicit/explicit ALUA, but
> >>> the ports do not seem to be configured.
> >>>
> >>> For the LIO config side, are his LUNs just not in the default_lu_gp or
> >>> any other group?
> >>
> >> So every non-PSCSI backend device becomes part of default_lu_gp +
> >> default_tg_pt_gp and automatically shows up in EVPD=0x83, without user
> >> needing to do any additional configuration.
> >>
> >> Here's what the output looks like:
> >>
> >> root@haakon3:/usr/src/target-pending.git# sg_inq -Hi /dev/sdb
> >> VPD INQUIRY: Device Identification page
> >>    <SNIP>
> >>    Designation descriptor number 3, descriptor length: 8
> >>      transport: Serial Attached SCSI Protocol (SPL-2)
> >>      designator_type: Relative target port,  code_set: Binary
> >>      associated with the target port
> >>      designator header(hex): 61 94 00 04
> >>      designator:
> >>   00     00 00 00 02                                         ....
> >>    Designation descriptor number 4, descriptor length: 8
> >>      transport: Serial Attached SCSI Protocol (SPL-2)
> >>      designator_type: Target port group,  code_set: Binary
> >>      associated with the target port
> >>      designator header(hex): 61 95 00 04
> >>      designator:
> >>   00     00 00 00 00                                         ....
> >>    Designation descriptor number 5, descriptor length: 8
> >>      designator_type: Logical unit group,  code_set: Binary
> >>      associated with the addressed logical unit
> >>      designator header(hex): 01 06 00 04
> >>      designator:
> >>   00     00 00 00 00                                         ....
> >>   <SNIP>
> >>
> >> So AFAICT, the relative target port, target port group, and logical unit
> >> group being returned from target on v4.5-rc1 code looks correct.
> >>
> >> Serguei, can you confirm with 'sg_inq -Hi /dev/sdX' output on your side
> >> with the v3.10 based target..?
> >>
> >> AFAICT the parsing in scsi_vpd_tpg_id() from commit a8aa3978 looks
> >> correct too.
> >>
> >> Hannes, any ideas..?
> >
> > Ping.
> >
> Please try with my latest scsi_dh_alua patchset posted to linux-scsi.
> That should solve the error attaching devices.
> 

Just to confirm, this was not a target side issue, right..?

Also, since Serguei is seeing this on v4.4 we'll still need some hack
for stable, assuming your entire patchset won't be in 4.4.y code.  ;)

Are you OK with Mike's original patch, or do you have something better
to submit to Greg-KH..?
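
For readers following the sg_inq output quoted above, here is a minimal C sketch of how the 4-byte designator headers (61 94 00 04, 61 95 00 04, 01 06 00 04) break down into fields. It assumes the usual SPC designation descriptor layout for VPD page 0x83 and a 4-byte designator whose last two bytes carry a big-endian 16-bit ID. It is not the kernel's scsi_vpd_tpg_id(); the names desig_hdr, decode_desig_hdr and desig_u16_id are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the 4-byte header that precedes each designator (assumed layout). */
struct desig_hdr {
	uint8_t proto_id;    /* byte 0, bits 7-4; only meaningful when piv is set */
	uint8_t code_set;    /* byte 0, bits 3-0; 1 = binary */
	uint8_t piv;         /* byte 1, bit 7 */
	uint8_t association; /* byte 1, bits 5-4; 0 = logical unit, 1 = target port */
	uint8_t type;        /* byte 1, bits 3-0; 4 = relative target port,
	                      * 5 = target port group, 6 = logical unit group */
	uint8_t len;         /* byte 3: designator length in bytes */
};

/* Split the header bytes into their bit fields. Hypothetical helper. */
static struct desig_hdr decode_desig_hdr(const uint8_t *h)
{
	struct desig_hdr out;

	assert(h != NULL);
	out.proto_id    = (uint8_t)(h[0] >> 4);
	out.code_set    = (uint8_t)(h[0] & 0x0f);
	out.piv         = (uint8_t)((h[1] >> 7) & 0x1);
	out.association = (uint8_t)((h[1] >> 4) & 0x3);
	out.type        = (uint8_t)(h[1] & 0x0f);
	out.len         = h[3];
	return out;
}

/*
 * Return the 16-bit ID from a descriptor of the wanted type, or -1 if the
 * buffer is too short, the type differs, or the designator is not 4 bytes.
 * d points at the header; n is the number of valid bytes from there.
 */
static int desig_u16_id(const uint8_t *d, size_t n, uint8_t want_type)
{
	struct desig_hdr h;

	if (d == NULL || n < 8)
		return -1;
	h = decode_desig_hdr(d);
	if (h.type != want_type || h.len != 4)
		return -1;
	/* Bytes 4-5 are treated as reserved; bytes 6-7 hold the ID. */
	return (d[6] << 8) | d[7];
}
```

Fed the three descriptors from the output above, this reads relative target port 2, target port group 0 and logical unit group 0, which matches what sg_inq printed for the v4.5-rc1 target.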


Patch

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 21930c9..4269cbc 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1061,6 +1061,7 @@  int scsi_sysfs_add_sdev(struct scsi_device *sdev)
 	if (error) {
 		sdev_printk(KERN_INFO, sdev,
 				"failed to add device handler: %d\n", error);
+		device_del(&sdev->sdev_gendev);
 		return error;
 	}
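
The two options discussed in the thread (undo the device add when the handler attach fails, or ignore the handler error altogether) can be modelled in user space with a short C sketch. Everything here is a hypothetical stand-in: toy_dev, toy_device_add(), toy_device_del() and toy_dh_add_device() only imitate the shape of the kernel calls and are not the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not the kernel's struct scsi_device. */
struct toy_dev {
	bool registered;   /* set by toy_device_add(), cleared by toy_device_del() */
	int handler_error; /* value the toy handler attach should return */
};

static int toy_device_add(struct toy_dev *d)
{
	assert(d != NULL);
	d->registered = true;
	return 0;
}

static void toy_device_del(struct toy_dev *d)
{
	assert(d != NULL);
	d->registered = false;
}

static int toy_dh_add_device(const struct toy_dev *d)
{
	assert(d != NULL);
	return d->handler_error;
}

/* Variant A: same shape as the patch above; undo the add, then fail. */
static int toy_add_strict(struct toy_dev *d)
{
	int error = toy_device_add(d);

	if (error)
		return error;
	error = toy_dh_add_device(d);
	if (error) {
		toy_device_del(d); /* without this the device stays half-registered */
		return error;
	}
	return 0;
}

/* Variant B: treat a failed handler attach as non-fatal and keep the device. */
static int toy_add_lenient(struct toy_dev *d)
{
	int error = toy_device_add(d);

	if (error)
		return error;
	(void)toy_dh_add_device(d); /* a real caller would probably log this */
	return 0;
}
```

With handler_error set to -EINVAL (the -22 seen in the logs), toy_add_strict() returns the error and leaves the device unregistered, while toy_add_lenient() returns 0 and leaves it registered. That is the trade-off between the cleanup fix and restoring the older, more forgiving behavior.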