diff mbox series

[v2,1/3] scsi: qla2xxx: Drop starvation counter on success

Message ID 20241009111654.4697-2-a.kovaleva@yadro.com (mailing list archive)
State Superseded
Series Fix bugs in qla2xxx driver

Commit Message

Anastasia Kovaleva Oct. 9, 2024, 11:16 a.m. UTC
Long-lived sessions under high load can accumulate a starvation counter,
and the current implementation does not allow this counter to be reset
during an active session.

If the HBA sends a correct ATIO IOCB, it has enough resources to process
commands, so we should not trigger ISP recovery.

Cc: stable@vger.kernel.org
Fixes: ead038556f64 ("qla2xxx: Add Dual mode support in the driver")
Signed-off-by: Anastasia Kovaleva <a.kovaleva@yadro.com>
Reviewed-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
---
 drivers/scsi/qla2xxx/qla_isr.c    | 4 ++++
 drivers/scsi/qla2xxx/qla_target.c | 6 ++++++
 2 files changed, 10 insertions(+)
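
The behaviour described in the commit message can be sketched in user space
as follows. This is a minimal illustration only, not the driver's code:
struct hba_sketch, on_rejected_cmd(), on_valid_atio() and the
STARVATION_LIMIT value are hypothetical names and numbers chosen for the
example. Only the exch_starvation field name comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold for illustration; the driver's actual limit may differ. */
#define STARVATION_LIMIT 5

/* Hypothetical stand-in for the per-adapter state touched by the patch. */
struct hba_sketch {
	uint8_t exch_starvation;	/* consecutive "no resources" events */
	unsigned int recoveries;	/* how many times recovery was requested */
};

/*
 * Called when the firmware reports a rejected command.
 * Returns true when the counter crosses the limit and recovery is requested.
 */
bool on_rejected_cmd(struct hba_sketch *s)
{
	assert(s);
	if (++s->exch_starvation > STARVATION_LIMIT) {
		s->exch_starvation = 0;
		s->recoveries++;
		return true;
	}
	return false;
}

/*
 * Called when a well-formed ATIO arrives: the HBA evidently had enough
 * resources, so earlier starvation events no longer count toward recovery.
 */
void on_valid_atio(struct hba_sketch *s)
{
	assert(s);
	s->exch_starvation = 0;
}
```

Without the reset in on_valid_atio(), rejections spread over a long-lived
session keep adding up until recovery fires, even if the HBA processed
commands successfully in between.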

Comments

kernel test robot Oct. 10, 2024, 2:57 p.m. UTC | #1
Hi Anastasia,

kernel test robot noticed the following build errors:

[auto build test ERROR on jejb-scsi/for-next]
[also build test ERROR on mkp-scsi/for-next linus/master v6.12-rc2 next-20241010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anastasia-Kovaleva/scsi-qla2xxx-Drop-starvation-counter-on-success/20241009-192031
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20241009111654.4697-2-a.kovaleva%40yadro.com
patch subject: [PATCH v2 1/3] scsi: qla2xxx: Drop starvation counter on success
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20241010/202410102244.4WCXxyGQ-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 13.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241010/202410102244.4WCXxyGQ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410102244.4WCXxyGQ-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from arch/alpha/include/asm/rwonce.h:33,
                    from include/linux/compiler.h:317,
                    from include/linux/build_bug.h:5,
                    from include/linux/container_of.h:5,
                    from include/linux/list.h:5,
                    from include/linux/module.h:12,
                    from drivers/scsi/qla2xxx/qla_target.c:17:
   drivers/scsi/qla2xxx/qla_target.c: In function 'qlt_24xx_process_atio_queue':
>> include/asm-generic/rwonce.h:55:32: error: lvalue required as unary '&' operand
      55 |         *(volatile typeof(x) *)&(x) = (val);                            \
         |                                ^
   include/asm-generic/rwonce.h:61:9: note: in expansion of macro '__WRITE_ONCE'
      61 |         __WRITE_ONCE(x, val);                                           \
         |         ^~~~~~~~~~~~
   drivers/scsi/qla2xxx/qla_target.c:6833:25: note: in expansion of macro 'WRITE_ONCE'
    6833 |                         WRITE_ONCE(&vha->hw->exch_starvation, 0);
         |                         ^~~~~~~~~~


vim +55 include/asm-generic/rwonce.h

e506ea451254ab Will Deacon 2019-10-15  52  
e506ea451254ab Will Deacon 2019-10-15  53  #define __WRITE_ONCE(x, val)						\
e506ea451254ab Will Deacon 2019-10-15  54  do {									\
e506ea451254ab Will Deacon 2019-10-15 @55  	*(volatile typeof(x) *)&(x) = (val);				\
e506ea451254ab Will Deacon 2019-10-15  56  } while (0)
e506ea451254ab Will Deacon 2019-10-15  57
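
The error above arises because __WRITE_ONCE() applies '&' to its first
argument itself, so the argument must be the lvalue to be written and not a
pointer to it. The following user-space sketch illustrates this under stated
assumptions: the macro is a simplified copy of the definition quoted above,
and struct hw_sketch and reset_starvation() are hypothetical names. The
uint8_t field type is taken from the clang diagnostic for the same line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's WRITE_ONCE(); the real macro also
 * performs a compile-time type check. __typeof__ is a GNU extension that
 * both GCC and Clang accept.
 */
#define WRITE_ONCE(x, val)					\
do {								\
	*(volatile __typeof__(x) *)&(x) = (val);		\
} while (0)

/* Hypothetical stand-in for the relevant part of struct qla_hw_data. */
struct hw_sketch {
	uint8_t exch_starvation;
};

/* Reset the counter by passing the field itself, not its address. */
void reset_starvation(struct hw_sketch *hw)
{
	assert(hw);
	WRITE_ONCE(hw->exch_starvation, 0);
	/*
	 * WRITE_ONCE(&hw->exch_starvation, 0) would not compile: the macro
	 * would expand to &(&hw->exch_starvation), which takes the address
	 * of an rvalue.
	 */
}
```

Applied to the patch, this suggests that dropping the '&' at
qla_target.c:6833 would resolve the build failure.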
kernel test robot Oct. 10, 2024, 4:39 p.m. UTC | #2
Hi Anastasia,

kernel test robot noticed the following build errors:

[auto build test ERROR on jejb-scsi/for-next]
[also build test ERROR on mkp-scsi/for-next linus/master v6.12-rc2 next-20241010]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anastasia-Kovaleva/scsi-qla2xxx-Drop-starvation-counter-on-success/20241009-192031
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
patch link:    https://lore.kernel.org/r/20241009111654.4697-2-a.kovaleva%40yadro.com
patch subject: [PATCH v2 1/3] scsi: qla2xxx: Drop starvation counter on success
config: um-allmodconfig (https://download.01.org/0day-ci/archive/20241011/202410110059.pb1whtvg-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 70e0a7e7e6a8541bcc46908c592eed561850e416)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241011/202410110059.pb1whtvg-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410110059.pb1whtvg-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from drivers/scsi/qla2xxx/qla_target.c:20:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:8:
   In file included from include/linux/cacheflush.h:5:
   In file included from arch/um/include/asm/cacheflush.h:4:
   In file included from arch/um/include/asm/tlbflush.h:9:
   In file included from include/linux/mm.h:2213:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from drivers/scsi/qla2xxx/qla_target.c:20:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     548 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     561 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
         |                                                   ^
   In file included from drivers/scsi/qla2xxx/qla_target.c:20:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     574 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
         |                                                   ^
   In file included from drivers/scsi/qla2xxx/qla_target.c:20:
   In file included from include/linux/blkdev.h:9:
   In file included from include/linux/blk_types.h:10:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:14:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     585 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     595 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     605 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     693 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     701 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     709 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     718 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     727 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     736 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> drivers/scsi/qla2xxx/qla_target.c:6833:4: error: cannot take the address of an rvalue of type 'uint8_t *' (aka 'unsigned char *')
    6833 |                         WRITE_ONCE(&vha->hw->exch_starvation, 0);
         |                         ^          ~~~~~~~~~~~~~~~~~~~~~~~~~
   include/asm-generic/rwonce.h:61:2: note: expanded from macro 'WRITE_ONCE'
      61 |         __WRITE_ONCE(x, val);                                           \
         |         ^            ~
   include/asm-generic/rwonce.h:55:25: note: expanded from macro '__WRITE_ONCE'
      55 |         *(volatile typeof(x) *)&(x) = (val);                            \
         |                                ^ ~
   13 warnings and 1 error generated.

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for MODVERSIONS
   Depends on [n]: MODULES [=y] && !COMPILE_TEST [=y]
   Selected by [y]:
   - RANDSTRUCT_FULL [=y] && (CC_HAS_RANDSTRUCT [=y] || GCC_PLUGINS [=n]) && MODULES [=y]
   WARNING: unmet direct dependencies detected for GET_FREE_REGION
   Depends on [n]: SPARSEMEM [=n]
   Selected by [m]:
   - RESOURCE_KUNIT_TEST [=m] && RUNTIME_TESTING_MENU [=y] && KUNIT [=m]


vim +6833 drivers/scsi/qla2xxx/qla_target.c

  6793	
  6794	/*
  6795	 * qlt_24xx_process_atio_queue() - Process ATIO queue entries.
  6796	 * @ha: SCSI driver HA context
  6797	 */
  6798	void
  6799	qlt_24xx_process_atio_queue(struct scsi_qla_host *vha, uint8_t ha_locked)
  6800	{
  6801		struct qla_hw_data *ha = vha->hw;
  6802		struct atio_from_isp *pkt;
  6803		int cnt, i;
  6804	
  6805		if (!ha->flags.fw_started)
  6806			return;
  6807	
  6808		while ((ha->tgt.atio_ring_ptr->signature != ATIO_PROCESSED) ||
  6809		    fcpcmd_is_corrupted(ha->tgt.atio_ring_ptr)) {
  6810			pkt = (struct atio_from_isp *)ha->tgt.atio_ring_ptr;
  6811			cnt = pkt->u.raw.entry_count;
  6812	
  6813			if (unlikely(fcpcmd_is_corrupted(ha->tgt.atio_ring_ptr))) {
  6814				/*
  6815				 * This packet is corrupted. The header + payload
  6816				 * can not be trusted. There is no point in passing
  6817				 * it further up.
  6818				 */
  6819				ql_log(ql_log_warn, vha, 0xd03c,
  6820				    "corrupted fcp frame SID[%3phN] OXID[%04x] EXCG[%x] %64phN\n",
  6821				    &pkt->u.isp24.fcp_hdr.s_id,
  6822				    be16_to_cpu(pkt->u.isp24.fcp_hdr.ox_id),
  6823				    pkt->u.isp24.exchange_addr, pkt);
  6824	
  6825				adjust_corrupted_atio(pkt);
  6826				qlt_send_term_exchange(ha->base_qpair, NULL, pkt,
  6827				    ha_locked, 0);
  6828			} else {
  6829				/*
  6830				 * If we get correct ATIO, then HBA had enough memory
  6831				 * to proceed without reset.
  6832				 */
> 6833				WRITE_ONCE(&vha->hw->exch_starvation, 0);
  6834	
  6835				qlt_24xx_atio_pkt_all_vps(vha,
  6836				    (struct atio_from_isp *)pkt, ha_locked);
  6837			}
  6838	
  6839			for (i = 0; i < cnt; i++) {
  6840				ha->tgt.atio_ring_index++;
  6841				if (ha->tgt.atio_ring_index == ha->tgt.atio_q_length) {
  6842					ha->tgt.atio_ring_index = 0;
  6843					ha->tgt.atio_ring_ptr = ha->tgt.atio_ring;
  6844				} else
  6845					ha->tgt.atio_ring_ptr++;
  6846	
  6847				pkt->u.raw.signature = cpu_to_le32(ATIO_PROCESSED);
  6848				pkt = (struct atio_from_isp *)ha->tgt.atio_ring_ptr;
  6849			}
  6850			wmb();
  6851		}
  6852	
  6853		/* Adjust ring index */
  6854		wrt_reg_dword(ISP_ATIO_Q_OUT(vha), ha->tgt.atio_ring_index);
  6855	}
  6856

Patch

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index fe98c76e9be3..5234ce0985e0 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -1959,6 +1959,10 @@  qla2x00_async_event(scsi_qla_host_t *vha, struct rsp_que *rsp, uint16_t *mb)
 		ql_dbg(ql_dbg_async, vha, 0x5091, "Transceiver Removal\n");
 		break;
 
+	case MBA_REJECTED_FCP_CMD:
+		ql_dbg(ql_dbg_async, vha, 0x5092, "LS_RJT was sent. No resources to process the ELS request.\n");
+		break;
+
 	default:
 		ql_dbg(ql_dbg_async, vha, 0x5057,
 		    "Unknown AEN:%04x %04x %04x %04x\n",
diff --git a/drivers/scsi/qla2xxx/qla_target.c b/drivers/scsi/qla2xxx/qla_target.c
index d7551b1443e4..bc6b014eb422 100644
--- a/drivers/scsi/qla2xxx/qla_target.c
+++ b/drivers/scsi/qla2xxx/qla_target.c
@@ -6826,6 +6826,12 @@  qlt_24xx_process_atio_queue(struct scsi_qla_host *vha, uint8_t ha_locked)
 			qlt_send_term_exchange(ha->base_qpair, NULL, pkt,
 			    ha_locked, 0);
 		} else {
+			/*
+			 * If we get correct ATIO, then HBA had enough memory
+			 * to proceed without reset.
+			 */
+			WRITE_ONCE(&vha->hw->exch_starvation, 0);
+
 			qlt_24xx_atio_pkt_all_vps(vha,
 			    (struct atio_from_isp *)pkt, ha_locked);
 		}