Message ID | 20230726094027.535126-5-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | blk-mq: fix wrong queue mapping for kdump kernel |
Hi Ming,

From version 1 of the patchset, I thought we had planned to put the
min comparison right above pci_alloc_irq_vectors instead?

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 3221a934066b..20410789e8b8 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -13025,6 +13025,8 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
 		flags |= PCI_IRQ_AFFINITY;
 	}

+	vectors = min_t(unsigned int, vectors, scsi_max_nr_hw_queues());
+
 	rc = pci_alloc_irq_vectors(phba->pcidev, 1, vectors, flags);
 	if (rc < 0) {
 		lpfc_printf_log(phba, KERN_INFO, LOG_INIT,

Thanks,
Justin

On Wed, Jul 26, 2023 at 2:40 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> Take blk-mq's knowledge into account for calculating io queues.
>
> Fix wrong queue mapping in case of kdump kernel.
>
> On arm and ppc64, 'maxcpus=1' is passed to kdump kernel command line,
> see `Documentation/admin-guide/kdump/kdump.rst`, so num_possible_cpus()
> still returns all CPUs because 'maxcpus=1' just bring up one single
> cpu core during booting.
>
> blk-mq sees single queue in kdump kernel, and in driver's viewpoint
> there are still multiple queues, this inconsistency causes driver to apply
> wrong queue mapping for handling IO, and IO timeout is triggered.
>
> Meantime, single queue makes much less resource utilization, and reduce
> risk of kernel failure.
>
> Cc: Justin Tee <justintee8345@gmail.com>
> Cc: James Smart <james.smart@broadcom.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  drivers/scsi/lpfc/lpfc_init.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 3221a934066b..c546e5275108 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -13022,6 +13022,8 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
>  		cpu = cpumask_first(aff_mask);
>  		cpu_select = lpfc_next_online_cpu(aff_mask, cpu);
>  	} else {
> +		vectors = min_t(unsigned int, vectors,
> +				scsi_max_nr_hw_queues());
>  		flags |= PCI_IRQ_AFFINITY;
>  	}
>
> --
> 2.40.1
>

On Wed, Jul 26, 2023 at 03:12:16PM -0700, Justin Tee wrote:
> Hi Ming,
>
> From version 1 of the patchset, I thought we had planned to put the
> min comparison right above pci_alloc_irq_vectors instead?
>
> diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
> index 3221a934066b..20410789e8b8 100644
> --- a/drivers/scsi/lpfc/lpfc_init.c
> +++ b/drivers/scsi/lpfc/lpfc_init.c
> @@ -13025,6 +13025,8 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
>  		flags |= PCI_IRQ_AFFINITY;
>  	}
>
> +	vectors = min_t(unsigned int, vectors, scsi_max_nr_hw_queues());

Strictly speaking, the above change is better, but non-managed irq
doesn't have such issue, that is why I just apply the change on managed
irq branch.

Thanks,
Ming

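The two placements discussed in this exchange can be compared with a small userspace model. This is only a sketch under stated assumptions, not the lpfc code: `struct msix_plan`, `MODEL_IRQ_AFFINITY`, `min_uint()` and both `plan_*` functions are hypothetical names, the `managed_irq` flag stands in for the branch that sets `PCI_IRQ_AFFINITY`, and `max_hw_queues` stands in for the value returned by `scsi_max_nr_hw_queues()`.

```c
#include <stdbool.h>

/* Hypothetical stand-in for PCI_IRQ_AFFINITY; the value is arbitrary. */
#define MODEL_IRQ_AFFINITY 0x1u

/* Hypothetical summary of what would be passed to the vector allocation. */
struct msix_plan {
	unsigned int vectors;
	unsigned int flags;
};

/* Userspace equivalent of min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Placement in the posted patch: clamp only on the managed-IRQ branch. */
static struct msix_plan plan_clamp_in_managed_branch(unsigned int vectors,
						     bool managed_irq,
						     unsigned int max_hw_queues)
{
	struct msix_plan plan = { .vectors = vectors, .flags = 0 };

	if (managed_irq) {
		plan.vectors = min_uint(plan.vectors, max_hw_queues);
		plan.flags |= MODEL_IRQ_AFFINITY;
	}
	return plan;
}

/* Placement suggested in review: clamp right before the allocation call. */
static struct msix_plan plan_clamp_before_alloc(unsigned int vectors,
						bool managed_irq,
						unsigned int max_hw_queues)
{
	struct msix_plan plan = { .vectors = vectors, .flags = 0 };

	if (managed_irq)
		plan.flags |= MODEL_IRQ_AFFINITY;
	plan.vectors = min_uint(plan.vectors, max_hw_queues);
	return plan;
}
```

In this model the two placements agree whenever `managed_irq` is true and differ only on the non-managed branch, where the posted patch leaves the vector count unchanged. That is the case described above as not having the issue.
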
On Wed, Jul 26, 2023 at 6:20 PM Ming Lei <ming.lei@redhat.com> wrote:
> Strictly speaking, the above change is better, but non-managed irq
> doesn't have such issue, that is why I just apply the change on managed
> irq branch.
>
>
> Thanks,
> Ming

Sure, thanks Ming.

Reviewed-by: Justin Tee <justin.tee@broadcom.com>

Regards,
Justin

diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 3221a934066b..c546e5275108 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -13022,6 +13022,8 @@ lpfc_sli4_enable_msix(struct lpfc_hba *phba)
 		cpu = cpumask_first(aff_mask);
 		cpu_select = lpfc_next_online_cpu(aff_mask, cpu);
 	} else {
+		vectors = min_t(unsigned int, vectors,
+				scsi_max_nr_hw_queues());
 		flags |= PCI_IRQ_AFFINITY;
 	}

Take blk-mq's knowledge into account when calculating io queues.

Fix wrong queue mapping in the kdump kernel case.

On arm and ppc64, 'maxcpus=1' is passed on the kdump kernel command line
(see `Documentation/admin-guide/kdump/kdump.rst`), so num_possible_cpus()
still returns all CPUs, because 'maxcpus=1' just brings up a single
cpu core during booting.

blk-mq sees a single queue in the kdump kernel, while from the driver's
viewpoint there are still multiple queues. This inconsistency causes the
driver to apply a wrong queue mapping for handling IO, and IO timeouts are
triggered.

Meanwhile, a single queue uses far fewer resources and reduces the
risk of kernel failure.

Cc: Justin Tee <justintee8345@gmail.com>
Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/scsi/lpfc/lpfc_init.c | 2 ++
 1 file changed, 2 insertions(+)
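
The effect of the clamp described in this commit message can be illustrated with a minimal userspace sketch. It makes assumptions that are not part of the patch: `model_max_nr_hw_queues()` is a hypothetical stand-in for `scsi_max_nr_hw_queues()`, assumed to return 1 in a kdump kernel and the possible-CPU count otherwise, and `min_uint()` and `clamp_vectors()` are illustrative names.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for scsi_max_nr_hw_queues(). Assumption: a kdump
 * kernel is limited to one hardware queue, otherwise the limit is the
 * number of possible CPUs.
 */
static unsigned int model_max_nr_hw_queues(bool is_kdump,
					   unsigned int nr_possible_cpus)
{
	return is_kdump ? 1 : nr_possible_cpus;
}

/* Userspace equivalent of min_t(unsigned int, a, b). */
static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Number of vectors the driver would request after applying the clamp,
 * given how many it wanted based on the possible-CPU count.
 */
static unsigned int clamp_vectors(unsigned int vectors, bool is_kdump,
				  unsigned int nr_possible_cpus)
{
	return min_uint(vectors,
			model_max_nr_hw_queues(is_kdump, nr_possible_cpus));
}
```

For example, with 8 possible CPUs and a driver that wants 8 vectors, `clamp_vectors(8, true, 8)` yields 1 in this model, matching the single queue that blk-mq sees in the kdump kernel, while `clamp_vectors(8, false, 8)` leaves the request at 8.
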