diff mbox series

[1/5] scsi: hisi_sas: Add host_tagset_enable module param for v3 hw

Message ID 20250329073236.2300582-2-liyihang9@huawei.com (mailing list archive)
State Superseded
Headers show
Series hisi_sas: Misc patches and cleanups | expand

Commit Message

Yihang Li March 29, 2025, 7:32 a.m. UTC
From: Xingui Yang <yangxingui@huawei.com>

After driver exposes all HW queues and application submits IO to multiple
queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
ata_qc_defer() causes non-NCQ commands to be requeued and possibly repeated
forever. Add host_tagset_enable module parameter to expose multiple queues
for v3 hw.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Yihang Li <liyihang9@huawei.com>
---
 drivers/scsi/hisi_sas/hisi_sas.h       |  1 +
 drivers/scsi/hisi_sas/hisi_sas_main.c  | 10 +++++----
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 30 ++++++++++++++++++++++++--
 3 files changed, 35 insertions(+), 6 deletions(-)

Comments

John Garry March 29, 2025, 8:50 a.m. UTC | #1
On 29/03/2025 07:32, Yihang Li wrote:

+

> From: Xingui Yang<yangxingui@huawei.com>
> 
> After driver exposes all HW queues and application submits IO to multiple
> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
> ata_qc_defer() causes non-NCQ commands to be requeued and possibly repeated
> forever.

I don't think that it is a good idea to mask out bugs with module 
parameters.

Was this the same libata/libsas issue reported some time ago?

> Add host_tagset_enable module parameter to expose multiple queues
> for v3 hw.
> 
> Signed-off-by: Xingui Yang<yangxingui@huawei.com>
> Signed-off-by: Yihang Li<liyihang9@huawei.com>
yangxingui March 29, 2025, 9:49 a.m. UTC | #2
Hi,John

On 2025/3/29 16:50, John Garry wrote:
> On 29/03/2025 07:32, Yihang Li wrote:
> 
> +
> 
>> From: Xingui Yang<yangxingui@huawei.com>
>>
>> After driver exposes all HW queues and application submits IO to multiple
>> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
>> ata_qc_defer() causes non-NCQ commands to be requeued and possibly 
>> repeated
>> forever.
> 
> I don't think that it is a good idea to mask out bugs with module 
> parameters.
> 
> Was this the same libata/libsas issue reported some time ago?

Yeah,related to this issue: 
https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/

And, Niklas tried to help fix this problem: 
https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/

Considering that there is no formal solution yet. And our users rarely 
use SATA disks and SAS disks together on a single machine. For this 
reason, they can flexibly turn off the exposure of multiple queues in 
the scenario of using only SATA disks. In addition, it is also 
convenient to conduct performance comparison tests to expose multiple 
hardware queues and single queues.

Thanks,
Xingui
John Garry March 31, 2025, 7:43 a.m. UTC | #3
On 29/03/2025 09:49, yangxingui wrote:
> Hi,John
> 
> On 2025/3/29 16:50, John Garry wrote:
>> On 29/03/2025 07:32, Yihang Li wrote:
>>
>> +
>>
>>> From: Xingui Yang<yangxingui@huawei.com>
>>>
>>> After driver exposes all HW queues and application submits IO to 
>>> multiple
>>> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
>>> ata_qc_defer() causes non-NCQ commands to be requeued and possibly 
>>> repeated
>>> forever.
>>
>> I don't think that it is a good idea to mask out bugs with module 
>> parameters.
>>
>> Was this the same libata/libsas issue reported some time ago?
> 
> Yeah,related to this issue: https://lore.kernel.org/linux-block/ 
> eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
> 
> And, Niklas tried to help fix this problem: https://lore.kernel.org/ 
> linux-scsi/ZynmfyDA9R-lrW71@ryzen/
> 
> Considering that there is no formal solution yet. And our users rarely 
> use SATA disks and SAS disks together on a single machine. For this 
> reason, they can flexibly turn off the exposure of multiple queues in 
> the scenario of using only SATA disks. In addition, it is also 
> convenient to conduct performance comparison tests to expose multiple 
> hardware queues and single queues.
> 

The change in this series does not even solve the issues, as:
- you do not guarantee no SAS/SATA mix without that module param enabled
- the driver still uses managed interrupts in both cases, so with 
disabling host_tagset you are now exposed to CPU hotplug issue of IO 
being in-flight when HW queue interrupt is shutdown

And pm8001 driver will have the same issue, so we need to find a proper fix.

Let me consider this issue more.
Niklas Cassel March 31, 2025, 8:40 a.m. UTC | #4
Hello Xingui,

On Sat, Mar 29, 2025 at 05:49:47PM +0800, yangxingui wrote:
> Hi,John
> 
> On 2025/3/29 16:50, John Garry wrote:
> > On 29/03/2025 07:32, Yihang Li wrote:
> > 
> > +
> > 
> > > From: Xingui Yang<yangxingui@huawei.com>
> > > 
> > > After driver exposes all HW queues and application submits IO to multiple
> > > queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
> > > ata_qc_defer() causes non-NCQ commands to be requeued and possibly
> > > repeated
> > > forever.
> > 
> > I don't think that it is a good idea to mask out bugs with module
> > parameters.
> > 
> > Was this the same libata/libsas issue reported some time ago?
> 
> Yeah,related to this issue: https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
> 
> And, Niklas tried to help fix this problem:
> https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/
> 
> Considering that there is no formal solution yet. And our users rarely use
> SATA disks and SAS disks together on a single machine. For this reason, they
> can flexibly turn off the exposure of multiple queues in the scenario of
> using only SATA disks. In addition, it is also convenient to conduct
> performance comparison tests to expose multiple hardware queues and single
> queues.

The solution I sent is not good since it relies on EH.

One would need to come up with a better solution to fix libsas drivers,
possibly a workqueue.

I think Damien is planning to add a workqueue submit path to libata,
if so, perhaps we could base a better solution on top of that.


Kind regards,
Niklas
Yihang Li March 31, 2025, 10:29 a.m. UTC | #5
Hi all,

Thanks for the discussion. I will remove this patch for this series and send a new version later.

Thanks,
Yihang

On 2025/3/31 16:40, Niklas Cassel wrote:
> Hello Xingui,
> 
> On Sat, Mar 29, 2025 at 05:49:47PM +0800, yangxingui wrote:
>> Hi,John
>>
>> On 2025/3/29 16:50, John Garry wrote:
>>> On 29/03/2025 07:32, Yihang Li wrote:
>>>
>>> +
>>>
>>>> From: Xingui Yang<yangxingui@huawei.com>
>>>>
>>>> After driver exposes all HW queues and application submits IO to multiple
>>>> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
>>>> ata_qc_defer() causes non-NCQ commands to be requeued and possibly
>>>> repeated
>>>> forever.
>>>
>>> I don't think that it is a good idea to mask out bugs with module
>>> parameters.
>>>
>>> Was this the same libata/libsas issue reported some time ago?
>>
>> Yeah,related to this issue: https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
>>
>> And, Niklas tried to help fix this problem:
>> https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/
>>
>> Considering that there is no formal solution yet. And our users rarely use
>> SATA disks and SAS disks together on a single machine. For this reason, they
>> can flexibly turn off the exposure of multiple queues in the scenario of
>> using only SATA disks. In addition, it is also convenient to conduct
>> performance comparison tests to expose multiple hardware queues and single
>> queues.
> 
> The solution I sent is not good since it relies on EH.
> 
> One would need to come up with a better solution to fix libsas drivers,
> possibly a workqueue.
> 
> I think Damien is planning to add a workqueue submit path to libata,
> if so, perhaps we could base a better solution on top of that.
> 
> 
> Kind regards,
> Niklas
> 
> .
>
yangxingui April 1, 2025, 1:12 a.m. UTC | #6
On 2025/3/31 15:43, John Garry wrote:
> On 29/03/2025 09:49, yangxingui wrote:
>> Hi,John
>>
>> On 2025/3/29 16:50, John Garry wrote:
>>> On 29/03/2025 07:32, Yihang Li wrote:
>>>
>>> +
>>>
>>>> From: Xingui Yang<yangxingui@huawei.com>
>>>>
>>>> After driver exposes all HW queues and application submits IO to 
>>>> multiple
>>>> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
>>>> ata_qc_defer() causes non-NCQ commands to be requeued and possibly 
>>>> repeated
>>>> forever.
>>>
>>> I don't think that it is a good idea to mask out bugs with module 
>>> parameters.
>>>
>>> Was this the same libata/libsas issue reported some time ago?
>>
>> Yeah,related to this issue: https://lore.kernel.org/linux-block/ 
>> eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
>>
>> And, Niklas tried to help fix this problem: https://lore.kernel.org/ 
>> linux-scsi/ZynmfyDA9R-lrW71@ryzen/
>>
>> Considering that there is no formal solution yet. And our users rarely 
>> use SATA disks and SAS disks together on a single machine. For this 
>> reason, they can flexibly turn off the exposure of multiple queues in 
>> the scenario of using only SATA disks. In addition, it is also 
>> convenient to conduct performance comparison tests to expose multiple 
>> hardware queues and single queues.
>>
> 
> The change in this series does not even solve the issues, as:
> - you do not guarantee no SAS/SATA mix without that module param enabled
> - the driver still uses managed interrupts in both cases, so with 
> disabling host_tagset you are now exposed to CPU hotplug issue of IO 
> being in-flight when HW queue interrupt is shutdown

Yes, there will be such problems.

> 
> And pm8001 driver will have the same issue, so we need to find a proper 
> fix.
> 
> Let me consider this issue more.

OK.

Thanks,
Xingui
yangxingui April 1, 2025, 1:18 a.m. UTC | #7
On 2025/3/31 16:40, Niklas Cassel wrote:
> Hello Xingui,
> 
> On Sat, Mar 29, 2025 at 05:49:47PM +0800, yangxingui wrote:
>> Hi,John
>>
>> On 2025/3/29 16:50, John Garry wrote:
>>> On 29/03/2025 07:32, Yihang Li wrote:
>>>
>>> +
>>>
>>>> From: Xingui Yang<yangxingui@huawei.com>
>>>>
>>>> After driver exposes all HW queues and application submits IO to multiple
>>>> queues in parallel, if NCQ and non-NCQ commands are mixed to sata disk,
>>>> ata_qc_defer() causes non-NCQ commands to be requeued and possibly
>>>> repeated
>>>> forever.
>>>
>>> I don't think that it is a good idea to mask out bugs with module
>>> parameters.
>>>
>>> Was this the same libata/libsas issue reported some time ago?
>>
>> Yeah,related to this issue: https://lore.kernel.org/linux-block/eef1e927-c9b2-c61d-7f48-92e65d8b0418@huawei.com/
>>
>> And, Niklas tried to help fix this problem:
>> https://lore.kernel.org/linux-scsi/ZynmfyDA9R-lrW71@ryzen/
>>
>> Considering that there is no formal solution yet. And our users rarely use
>> SATA disks and SAS disks together on a single machine. For this reason, they
>> can flexibly turn off the exposure of multiple queues in the scenario of
>> using only SATA disks. In addition, it is also convenient to conduct
>> performance comparison tests to expose multiple hardware queues and single
>> queues.
> 
> The solution I sent is not good since it relies on EH.
> 
> One would need to come up with a better solution to fix libsas drivers,
> possibly a workqueue.
> 
> I think Damien is planning to add a workqueue submit path to libata,
> if so, perhaps we could base a better solution on top of that.

Thank you for your solution. As you said, we may need a better solution.

Thanks,
Xingui
.
diff mbox series

Patch

diff --git a/drivers/scsi/hisi_sas/hisi_sas.h b/drivers/scsi/hisi_sas/hisi_sas.h
index e17f5d8226bf..f93eefe68ccb 100644
--- a/drivers/scsi/hisi_sas/hisi_sas.h
+++ b/drivers/scsi/hisi_sas/hisi_sas.h
@@ -456,6 +456,7 @@  struct hisi_hba {
 	u32 intr_coal_count;	/* Interrupt count to coalesce */
 
 	int cq_nvecs;
+	unsigned int *reply_map;
 
 	/* bist */
 	enum sas_linkrate debugfs_bist_linkrate;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_main.c b/drivers/scsi/hisi_sas/hisi_sas_main.c
index 7a484ad0f9ab..3d2e8ec7f110 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_main.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_main.c
@@ -489,11 +489,12 @@  static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 	struct asd_sas_port *sas_port = device->port;
 	struct hisi_sas_device *sas_dev = device->lldd_dev;
 	bool internal_abort = sas_is_internal_abort(task);
+	struct request *rq = sas_task_find_rq(task);
 	struct hisi_sas_dq *dq = NULL;
 	struct hisi_sas_port *port;
 	struct hisi_hba *hisi_hba;
 	struct hisi_sas_slot *slot;
-	struct request *rq = NULL;
+	unsigned int dq_index;
 	struct device *dev;
 	int rc;
 
@@ -548,9 +549,10 @@  static int hisi_sas_queue_command(struct sas_task *task, gfp_t gfp_flags)
 				return -ECOMM;
 		}
 
-		rq = sas_task_find_rq(task);
-		if (rq) {
-			unsigned int dq_index;
+		if (hisi_hba->reply_map) {
+			dq_index = hisi_hba->reply_map[raw_smp_processor_id()];
+			dq = &hisi_hba->dq[dq_index];
+		} else if (rq) {
 			u32 blk_tag;
 
 			blk_tag = blk_mq_unique_tag(rq);
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 6a0656f3b596..a1f0a59a8c8d 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -561,7 +561,12 @@  MODULE_PARM_DESC(prot_mask, " host protection capabilities mask, def=0x0 ");
 /* the index of iopoll queues are bigger than interrupt queues' */
 static int experimental_iopoll_q_cnt;
 module_param(experimental_iopoll_q_cnt, int, 0444);
-MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0");
+MODULE_PARM_DESC(experimental_iopoll_q_cnt, "number of queues to be used as poll mode, def=0 &\n\t\t"
+		"this parameter is effective only if host_tagset_enable=1");
+
+static bool host_tagset_enable = true;
+module_param(host_tagset_enable, bool, 0444);
+MODULE_PARM_DESC(host_tagset_enable, "shared host tagset enable (0-1), def=1");
 
 static int debugfs_snapshot_regs_v3_hw(struct hisi_hba *hisi_hba);
 
@@ -2574,6 +2579,18 @@  static int interrupt_preinit_v3_hw(struct hisi_hba *hisi_hba)
 	hisi_hba->cq_nvecs = vectors - BASE_VECTORS_V3_HW - hisi_hba->iopoll_q_cnt;
 	shost->nr_hw_queues = hisi_hba->cq_nvecs + hisi_hba->iopoll_q_cnt;
 
+	if (!host_tagset_enable) {
+		shost->host_tagset = 0;
+		shost->nr_hw_queues = 1;
+		hisi_hba->reply_map = devm_kcalloc(&pdev->dev, nr_cpu_ids,
+				sizeof(unsigned int),
+				GFP_KERNEL);
+		if (!hisi_hba->reply_map) {
+			hisi_sas_v3_free_vectors(hisi_hba);
+			return -ENOMEM;
+		}
+	}
+
 	return devm_add_action(&pdev->dev, hisi_sas_v3_free_vectors, pdev);
 }
 
@@ -2615,6 +2632,7 @@  static int interrupt_init_v3_hw(struct hisi_hba *hisi_hba)
 		int nr = hisi_sas_intr_conv ? 16 : 16 + i;
 		unsigned long irqflags = hisi_sas_intr_conv ? IRQF_SHARED :
 							      IRQF_ONESHOT;
+		int cpu;
 
 		cq->irq_no = pci_irq_vector(pdev, nr);
 		rc = devm_request_threaded_irq(dev, cq->irq_no,
@@ -2632,6 +2650,9 @@  static int interrupt_init_v3_hw(struct hisi_hba *hisi_hba)
 			dev_err(dev, "could not get cq%d irq affinity!\n", i);
 			return -ENOENT;
 		}
+		if (!host_tagset_enable)
+			for_each_cpu(cpu, cq->irq_mask)
+				hisi_hba->reply_map[cpu] = i;
 	}
 
 	return 0;
@@ -3318,6 +3339,9 @@  static void hisi_sas_map_queues(struct Scsi_Host *shost)
 	struct blk_mq_queue_map *qmap;
 	int i, qoff;
 
+	if (shost->nr_hw_queues == 1)
+		return;
+
 	for (i = 0, qoff = 0; i < shost->nr_maps; i++) {
 		qmap = &shost->tag_set.map[i];
 		if (i == HCTX_TYPE_DEFAULT) {
@@ -3435,7 +3459,9 @@  hisi_sas_shost_alloc_pci(struct pci_dev *pdev)
 	if (check_fw_info_v3_hw(hisi_hba) < 0)
 		goto err_out;
 
-	if (experimental_iopoll_q_cnt < 0 ||
+	if (!host_tagset_enable)
+		dev_info(dev, "Shared host tag set disabled, iopoll queue cnt uses default 0\n");
+	else if (experimental_iopoll_q_cnt < 0 ||
 		experimental_iopoll_q_cnt >= hisi_hba->queue_count)
 		dev_err(dev, "iopoll queue count %d cannot exceed or equal 16, using default 0\n",
 			experimental_iopoll_q_cnt);