[V2,00/16] smartpqi updates

Message ID 165730597930.177165.11663580730429681919.stgit@brunhilda

Message

Don Brace July 8, 2022, 6:46 p.m. UTC
These patches are based on Martin Petersen's 5.20/scsi-queue tree
  https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
  5.20/scsi-queue

This set of changes consists of:
 * Remove a device from the OS faster by adding a check for the -ENODEV
   return code in pqi_lun_reset. This status is set in the
   io_request->status member.
   Schedule the rescan worker thread within 5 seconds to initiate the
   removal. The driver used to retry a reset without checking for a
   device's removal and initiated 3 more retries. Device resets were
   taking up to 30 seconds. We also added a check to see if the controller
   firmware is still responsive during a reset operation.
 * Add the controller firmware version to the console logs. The firmware
   version is still in sysfs firmware_version.
 * Add support for more controllers: Ramaxel, Lenovo, and Adaptec.
 * Close a few rare read/write ordering issues where a register read
   could pass a register write.
 * Add support for multi-actuator devices. Our controllers now support up
   to 256 LUNs per multi-actuator device. We added a feature bit to check
   if the controller supports multi-actuator devices and updated the
   driver to handle resets, I/O submission, and removals for
   multi-actuator devices.
 * Correct some rare system hangs that can occur during a PCI link-down
   condition (such as a cable pull). We also fail all outstanding
   requests when a link-down is detected.
 * Correct an issue with setting the DMA direction flag for RAID path
   requests. The driver has two submission paths for requests: a RAID
   path and an Accelerated I/O (AIO) path.
   Beginning with firmware version 5.0 for Gen1 controllers and 3.01.x
   for Gen2 controllers, a change was made that removed the SCSI command
   READ BLOCK LIMITS (0x05) from an internal lookup table for RAID path
   requests. As a result of this change, the firmware switched to using
   the DMA direction flag in the request IU, which the driver was setting
   incorrectly. This caused the command to hang the controller. This
   patch resolves the hang. The AIO path is unaffected by the controller
   firmware change.
 * Correct a rare device RAID map access race condition related to
   configuration changes. We do not access the RAID map until after the
   new RAID map is valid.
 * Add a module parameter 'disable_managed_interrupts' to allow
   customers to change IRQ affinity. Multi-queue still works properly.
 * Update device removal to use .slave_destroy instead of our own
   internal method.
 * Add another module parameter to adjust the amount of time the
   driver waits for a controller to become ready. The default wait time
   is 3 minutes but can be extended to 30 minutes. This change results
   from customers with large installations requesting a longer wait time.
 * Update copyright information.
 * Bump the driver version to 2.1.18-045.
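
To illustrate the reset change in the first item above, here is a minimal
user-space sketch of a retry loop that stops as soon as an attempt reports
-ENODEV. It is not the driver code: the function names, the callback type,
and the constant are hypothetical, and the retry count of 3 is only taken
from the "3 more retries" wording above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback type: performs one reset attempt and returns 0 on
 * success or a negative errno value on failure. */
typedef int (*reset_attempt_fn)(void *ctx);

/* Assumption: one initial attempt plus 3 retries. The name is illustrative. */
#define SKETCH_MAX_RESET_RETRIES 3

/*
 * Retry a reset until it succeeds or the retries are used up, but stop
 * immediately on -ENODEV, since a removed device cannot be reset.
 * If attempts_made is non-NULL it receives the number of attempts used.
 */
int sketch_lun_reset_with_retries(reset_attempt_fn attempt, void *ctx,
                                  unsigned int *attempts_made)
{
    int rc = -EINVAL;
    unsigned int tries = 0;

    if (attempt != NULL) {
        while (tries <= SKETCH_MAX_RESET_RETRIES) {
            rc = attempt(ctx);
            tries++;
            /* Success, or the device is gone: retrying gains nothing. */
            if (rc == 0 || rc == -ENODEV)
                break;
        }
    }

    if (attempts_made != NULL)
        *attempts_made = tries;
    return rc;
}

/* Stand-in for a real reset attempt: returns the value stored in *ctx. */
int sketch_fixed_result(void *ctx)
{
    return *(const int *)ctx;
}
```

With sketch_fixed_result() standing in for the reset, a context holding
-ENODEV ends the loop after one attempt, while any other error uses all
four attempts before the last error is returned. A caller would then
schedule the rescan on -ENODEV instead of waiting for the retries to run
out.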

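As a rough illustration of the controller-ready wait time described above,
the sketch below maps a requested timeout onto the 3 minute default and the
30 minute maximum. The function and macro names are hypothetical, and
treating 0 as "use the default" is an assumption of this sketch, not
something stated in this cover letter.

```c
/* Values from the cover letter: default 3 minutes, maximum 30 minutes. */
#define SKETCH_CTRL_READY_DEFAULT_SECS 180u
#define SKETCH_CTRL_READY_MAX_SECS     1800u

/*
 * Return the wait time to use, in seconds.
 * Assumption: 0 means the parameter was not set, so the default applies.
 * Requests above the maximum are capped at the maximum.
 */
unsigned int sketch_ctrl_ready_timeout_secs(unsigned int requested_secs)
{
    if (requested_secs == 0)
        return SKETCH_CTRL_READY_DEFAULT_SECS;
    if (requested_secs > SKETCH_CTRL_READY_MAX_SECS)
        return SKETCH_CTRL_READY_MAX_SECS;
    return requested_secs;
}
```

Capping rather than rejecting an out-of-range value is a design choice of
this sketch only; a driver could equally log a warning and fall back to
the default.
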
---

Don Brace (2):
      smartpqi: update copyright to current year.
      smartpqi: update version to 2.1.18-045

Gilbert Wu (1):
      smartpqi: add controller fw version to console log

Kevin Barnett (4):
      smartpqi: stop logging spurious PQI reset failures
      smartpqi: fix RAID map race condition
      smartpqi: update deleting a LUN via sysfs
      smartpqi: add ctrl ready timeout module parameter

Kumar Meiyappan (1):
      smartpqi: add driver support for multi-LUN devices

Mahesh Rajashekhara (1):
      smartpqi: fix dma direction for RAID requests

Mike McGowen (5):
      smartpqi: shorten drive visibility after removal
      smartpqi: close write read holes
      smartpqi: add PCI-ID for Adaptec SmartHBA 2100-8i
      smartpqi: add PCI-IDs for Lenovo controllers
      smartpqi: add module param to disable managed ints

Murthy Bhat (1):
      smartpqi: add PCI-IDs for ramaxel controllers

Sagar Biradar (1):
      smartpqi: fix PCI control linkdown system hang


 drivers/scsi/smartpqi/Kconfig                 |   2 +-
 drivers/scsi/smartpqi/smartpqi.h              |  27 +-
 drivers/scsi/smartpqi/smartpqi_init.c         | 405 +++++++++++++-----
 .../scsi/smartpqi/smartpqi_sas_transport.c    |   2 +-
 drivers/scsi/smartpqi/smartpqi_sis.c          |  11 +-
 drivers/scsi/smartpqi/smartpqi_sis.h          |   4 +-
 6 files changed, 339 insertions(+), 112 deletions(-)

Comments

Martin K. Petersen July 14, 2022, 3:43 a.m. UTC | #1
Don,

> These patches are based on Martin Petersen's 5.20/scsi-queue tree
>   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
>   5.20/scsi-queue

Applied to 5.20/scsi-staging, thanks!

Martin K. Petersen July 19, 2022, 3:08 a.m. UTC | #2
On Fri, 8 Jul 2022 13:46:45 -0500, Don Brace wrote:

> These patches are based on Martin Petersen's 5.20/scsi-queue tree
>   https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
>   5.20/scsi-queue
> 
> [...]

Applied to 5.20/scsi-queue, thanks!

[01/16] smartpqi: shorten drive visibility after removal
        https://git.kernel.org/mkp/scsi/c/4e7d26029ee7
[02/16] smartpqi: add controller fw version to console log
        https://git.kernel.org/mkp/scsi/c/1d393227fc76
[03/16] smartpqi: add PCI-IDs for ramaxel controllers
        https://git.kernel.org/mkp/scsi/c/dab5378485f6
[04/16] smartpqi: close write read holes
        https://git.kernel.org/mkp/scsi/c/297bdc540f0e
[05/16] smartpqi: add driver support for multi-LUN devices
        https://git.kernel.org/mkp/scsi/c/904f2bfda65e
[06/16] smartpqi: fix PCI control linkdown system hang
        https://git.kernel.org/mkp/scsi/c/331f7e998b20
[07/16] smartpqi: add PCI-ID for Adaptec SmartHBA 2100-8i
        https://git.kernel.org/mkp/scsi/c/44e68c4af5d2
[08/16] smartpqi: add PCI-IDs for Lenovo controllers
        https://git.kernel.org/mkp/scsi/c/2a9c2ba2bc47
[09/16] smartpqi: stop logging spurious PQI reset failures
        https://git.kernel.org/mkp/scsi/c/85b41834b0f4
[10/16] smartpqi: fix dma direction for RAID requests
        https://git.kernel.org/mkp/scsi/c/69695aeaa662
[11/16] smartpqi: fix RAID map race condition
        https://git.kernel.org/mkp/scsi/c/6ce3cfb365eb
[12/16] smartpqi: add module param to disable managed ints
        https://git.kernel.org/mkp/scsi/c/cf15c3e734e8
[13/16] smartpqi: update deleting a LUN via sysfs
        https://git.kernel.org/mkp/scsi/c/2d80f4054f7f
[14/16] smartpqi: add ctrl ready timeout module parameter
        https://git.kernel.org/mkp/scsi/c/6d567dfee0b7
[15/16] smartpqi: update copyright to current year.
        https://git.kernel.org/mkp/scsi/c/e4b73b3fa2b9
[16/16] smartpqi: update version to 2.1.18-045
        https://git.kernel.org/mkp/scsi/c/f54f85dfd757