diff mbox series

[v2] scsi: libsas: add lun number check in .slave_alloc callback

Message ID 20210609093631.2557822-1-yuyufen@huawei.com (mailing list archive)
State Superseded
Headers show
Series [v2] scsi: libsas: add lun number check in .slave_alloc callback | expand

Commit Message

Yufen Yu June 9, 2021, 9:36 a.m. UTC
We found that offline a ata device on hisi sas control and then
scanning the host can probe 255 not exist devices into system.
It can be reproduced easily as following:

Some ata devices on hisi sas v3 control:
  [root@localhost ~]# lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc

 1) echo "offline" > /sys/block/sdb/device/state
 2) echo "- - -" > /sys/class/scsi_host/host2/scan

Then, we can see another 255 not exist devices in system:
  [root@localhost ~]# lsscsi
  [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
  [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
  [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
  ...
  [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb

After REPORT LUN command issued to the offline device fail, it tries
to do a sequential scan and probe all devices whose lun is not 0
successfully.

To fix the problem, we try to do same things as commit 2fc62e2ac350
("[SCSI] libsas: disable scanning lun > 0 on ata devices"), which
will prevent the device whose lun number is not zero probe into system.

Reported-by: Wu Bo <wubo40@huawei.com>
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
---
 drivers/scsi/aic94xx/aic94xx_init.c    | 1 +
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 1 +
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 1 +
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 +
 drivers/scsi/isci/init.c               | 1 +
 drivers/scsi/libsas/sas_scsi_host.c    | 9 +++++++++
 drivers/scsi/mvsas/mv_init.c           | 1 +
 drivers/scsi/pm8001/pm8001_init.c      | 1 +
 8 files changed, 16 insertions(+)

Comments

Yufen Yu June 9, 2021, 12:01 p.m. UTC | #1
hi, John

On 2021/6/9 17:40, John Garry wrote:
> On 09/06/2021 10:36, Yufen Yu wrote:
>> We found that offline a ata device on hisi sas control and then
>> scanning the host can probe 255 not exist devices into system.
>> It can be reproduced easily as following:
>>
>> Some ata devices on hisi sas v3 control:
>>    [root@localhost ~]# lsscsi
>>    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
>>    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
>>    [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
>>
>>   1) echo "offline" > /sys/block/sdb/device/state
>>   2) echo "- - -" > /sys/class/scsi_host/host2/scan
>>
>> Then, we can see another 255 not exist devices in system:
>>    [root@localhost ~]# lsscsi
>>    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
>>    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
>>    [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
>>    ...
>>    [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb
>>
>> After REPORT LUN command issued to the offline device fail, it tries
>> to do a sequential scan and probe all devices whose lun is not 0
>> successfully.
>>
>> To fix the problem, we try to do same things as commit 2fc62e2ac350
>> ("[SCSI] libsas: disable scanning lun > 0 on ata devices"), which
>> will prevent the device whose lun number is not zero probe into system.
>>
>> Reported-by: Wu Bo <wubo40@huawei.com>
>> Suggested-by: John Garry <john.garry@huawei.com>
>> Signed-off-by: Yufen Yu <yuyufen@huawei.com>
>> ---
>>   drivers/scsi/aic94xx/aic94xx_init.c    | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 +
>>   drivers/scsi/isci/init.c               | 1 +
>>   drivers/scsi/libsas/sas_scsi_host.c    | 9 +++++++++
>>   drivers/scsi/mvsas/mv_init.c           | 1 +
>>   drivers/scsi/pm8001/pm8001_init.c      | 1 +
> 
> Do we also need to modify aix79xx in a similar fashion?

There is no aix79xx in directory drivers/scsi. I guess you mean
aic79xxx? But it seems not need to modify.

> 
> I just noticed that libsas.h already has a prototype for sas_slave_alloc() - any idea why?

'git log' shows that commit fa1c1e8f1ece "[SCSI] Add SATA support to libsas" have
introduced sas_slave_alloc. And commit 9508a66f898d "[SCSI] libsas: async ata scanning"
delete it while forgetting to get rid of the prototype.

Yufen,
Thanks
John Garry June 9, 2021, 12:09 p.m. UTC | #2
On 09/06/2021 13:01, Yufen Yu wrote:
>>
>> Do we also need to modify aix79xx in a similar fashion?
> 
> There is no aix79xx in directory drivers/scsi. I guess you mean
> aic79xxx? But it seems not need to modify.
> 

So if you think that this HBA does not support SATA, then it would be 
good to mention it in the commit log.

Some more comments:

On 09/06/2021 10:36, Yufen Yu wrote:
 > We found that offline a ata device on hisi sas control and then

/s/ata/SATA/

 > scanning the host can probe 255 not exist devices into system.

"can probe 255 non-existant"

 > It can be reproduced easily as following:
 >
 > Some ata devices on hisi sas v3 control:

I don't know what this means, so please drop it.

 >    [root@localhost ~]# lsscsi
 >    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
 >    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
 >    [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
 >
 >   1) echo "offline" > /sys/block/sdb/device/state
 >   2) echo "- - -" > /sys/class/scsi_host/host2/scan
 >
 > Then, we can see another 255 not exist devices in system:

use "non-existant"


 >    [root@localhost ~]# lsscsi
 >    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
 >    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
 >    [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
 >    ...
 >    [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb
 >
 > After REPORT LUN command issued to the offline device fail, it tries
 > to do a sequential scan and probe all devices whose lun is not 0
 > successfully.
 >
 > To fix the problem, we try to do same things as commit 2fc62e2ac350
 > ("[SCSI] libsas: disable scanning lun > 0 on ata devices"), which
 > will prevent the device whose lun number is not zero probe into system.
 >


Thanks,
John
Yufen Yu June 9, 2021, 12:49 p.m. UTC | #3
On 2021/6/9 20:09, John Garry wrote:
> On 09/06/2021 13:01, Yufen Yu wrote:
>>>
>>> Do we also need to modify aix79xx in a similar fashion?
>>
>> There is no aix79xx in directory drivers/scsi. I guess you mean
>> aic79xxx? But it seems not need to modify.
>>
> 
> So if you think that this HBA does not support SATA, then it would be good to mention it in the commit log.

Maybe I didn't describe it clearly. I am not mean that aic79xxx dose not support
SATA. But it is not libsas driver, so I think we don't need to modify it here.

> 
> Some more comments:
> 
> On 09/06/2021 10:36, Yufen Yu wrote:
>  > We found that offline a ata device on hisi sas control and then
> 
> /s/ata/SATA/
> 
>  > scanning the host can probe 255 not exist devices into system.
> 
> "can probe 255 non-existant"
> 
>  > It can be reproduced easily as following:
>  >
>  > Some ata devices on hisi sas v3 control:
> 
> I don't know what this means, so please drop it.
> 
>  >    [root@localhost ~]# lsscsi
>  >    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
>  >    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
>  >    [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
>  >
>  >   1) echo "offline" > /sys/block/sdb/device/state
>  >   2) echo "- - -" > /sys/class/scsi_host/host2/scan
>  >
>  > Then, we can see another 255 not exist devices in system:
> 
> use "non-existant"

Thanks to point out these error. I will fix it in next version.

Thanks,
Yufen
Jason Yan June 9, 2021, 12:54 p.m. UTC | #4
在 2021/6/9 17:40, John Garry 写道:
> On 09/06/2021 10:36, Yufen Yu wrote:
>> We found that offline a ata device on hisi sas control and then
>> scanning the host can probe 255 not exist devices into system.
>> It can be reproduced easily as following:
>>
>> Some ata devices on hisi sas v3 control:
>>    [root@localhost ~]# lsscsi
>>    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
>>    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
>>    [2:0:2:0]    disk    SEAGATE  ST600MM0006      B001  /dev/sdc
>>
>>   1) echo "offline" > /sys/block/sdb/device/state
>>   2) echo "- - -" > /sys/class/scsi_host/host2/scan
>>
>> Then, we can see another 255 not exist devices in system:
>>    [root@localhost ~]# lsscsi
>>    [2:0:0:0]    disk    ATA      Samsung SSD 860  2B6Q  /dev/sda
>>    [2:0:1:0]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdb
>>    [2:0:1:1]    disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdh
>>    ...
>>    [2:0:1:255]  disk    ATA      WDC WD2003FYYS-3 1D01  /dev/sdjb
>>
>> After REPORT LUN command issued to the offline device fail, it tries
>> to do a sequential scan and probe all devices whose lun is not 0
>> successfully.
>>
>> To fix the problem, we try to do same things as commit 2fc62e2ac350
>> ("[SCSI] libsas: disable scanning lun > 0 on ata devices"), which
>> will prevent the device whose lun number is not zero probe into system.
>>
>> Reported-by: Wu Bo <wubo40@huawei.com>
>> Suggested-by: John Garry <john.garry@huawei.com>
>> Signed-off-by: Yufen Yu <yuyufen@huawei.com>
>> ---
>>   drivers/scsi/aic94xx/aic94xx_init.c    | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 1 +
>>   drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 1 +
>>   drivers/scsi/isci/init.c               | 1 +
>>   drivers/scsi/libsas/sas_scsi_host.c    | 9 +++++++++
>>   drivers/scsi/mvsas/mv_init.c           | 1 +
>>   drivers/scsi/pm8001/pm8001_init.c      | 1 +
> 
> Do we also need to modify aix79xx in a similar fashion?
> 
> I just noticed that libsas.h already has a prototype for 
> sas_slave_alloc() - any idea why?

sas_slave_alloc() was implemented in the history and it was removed in 
this commit: 9508a66f898d. And it seems the prototype was left over 
since then.
Jason Yan June 9, 2021, 2:20 p.m. UTC | #5
在 2021/6/9 21:13, John Garry 写道:
> 
>>> I just noticed that libsas.h already has a prototype for 
>>> sas_slave_alloc() - any idea why?
>>
>> sas_slave_alloc() was implemented in the history and it was removed in 
>> this commit: 9508a66f898d. And it seems the prototype was left over 
>> since then.
>> .
> 
> ok, understood.
> 
> So how about backporting this also? I have no idea when the regression 
> was introduced and prob cannot test as it predates hisi_sas support.
> 

This function before did nothing but initialized the ata port. The
commit removed the function just moved the initialization somewhere
else.

-int sas_slave_alloc(struct scsi_device *scsi_dev)
-{
-       struct domain_device *dev = sdev_to_domain_dev(scsi_dev);
-
-       if (dev_is_sata(dev))
-               return ata_sas_port_init(dev->sata_dev.ap);
-
-       return 0;
-}


It looks like it's not related to this issue. And I guess it is
not a regression. This issue only exists when user do a manual scan and
at the same time the device is offlined. Few people will do that actually.

> BTW, we also have a dangling prototype for sas_init_ex_attr(), if 
> someone wants to delete that...
> 
> Thanks,
> John
> .
diff mbox series

Patch

diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index a195bfe9eccc..7a78606598c4 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -53,6 +53,7 @@  static struct scsi_host_template aic94xx_sht = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler	= sas_eh_device_reset_handler,
 	.eh_target_reset_handler	= sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index 3e359ac752fd..15eaac3a4eb6 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -1771,6 +1771,7 @@  static struct scsi_host_template sht_v1_hw = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_target_reset_handler = sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 46f60fc2a069..9df1639ffa65 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -3584,6 +3584,7 @@  static struct scsi_host_template sht_v2_hw = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_target_reset_handler = sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index e95408314078..36ec3901cfd4 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -3155,6 +3155,7 @@  static struct scsi_host_template sht_v3_hw = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_target_reset_handler = sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index c452849e7bb4..ffd33e5decae 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -167,6 +167,7 @@  static struct scsi_host_template isci_sht = {
 	.eh_abort_handler		= sas_eh_abort_handler,
 	.eh_device_reset_handler        = sas_eh_device_reset_handler,
 	.eh_target_reset_handler        = sas_eh_target_reset_handler,
+	.slave_alloc			= sas_slave_alloc,
 	.target_destroy			= sas_target_destroy,
 	.ioctl				= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 1bf939818c98..ee44a0d7730b 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -911,6 +911,14 @@  void sas_task_abort(struct sas_task *task)
 		blk_abort_request(sc->request);
 }
 
+int sas_slave_alloc(struct scsi_device *sdev)
+{
+	if (dev_is_sata(sdev_to_domain_dev(sdev)) && sdev->lun)
+		return -ENXIO;
+
+	return 0;
+}
+
 void sas_target_destroy(struct scsi_target *starget)
 {
 	struct domain_device *found_dev = starget->hostdata;
@@ -957,5 +965,6 @@  EXPORT_SYMBOL_GPL(sas_task_abort);
 EXPORT_SYMBOL_GPL(sas_phy_reset);
 EXPORT_SYMBOL_GPL(sas_eh_device_reset_handler);
 EXPORT_SYMBOL_GPL(sas_eh_target_reset_handler);
+EXPORT_SYMBOL_GPL(sas_slave_alloc);
 EXPORT_SYMBOL_GPL(sas_target_destroy);
 EXPORT_SYMBOL_GPL(sas_ioctl);
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 6aa2697c4a15..b03c0f35d7b0 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -46,6 +46,7 @@  static struct scsi_host_template mvs_sht = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_target_reset_handler = sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT
diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index af09bd282cb9..313248c7bab9 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -101,6 +101,7 @@  static struct scsi_host_template pm8001_sht = {
 	.max_sectors		= SCSI_DEFAULT_MAX_SECTORS,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_target_reset_handler = sas_eh_target_reset_handler,
+	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 #ifdef CONFIG_COMPAT