mbox series

[v3,00/11] Fix shost command overloading issues

Message ID 20230327074310.1862889-1-john.g.garry@oracle.com (mailing list archive)
Headers show
Series Fix shost command overloading issues | expand

Message

John Garry March 27, 2023, 7:42 a.m. UTC
It's easy to get scsi_debug to error on throughput testing when we have
multiple shosts:

$ lsscsi
[7:0:0:0]       disk    Linux   scsi_debug      0191
[0:0:0:0]       disk    Linux   scsi_debug      0191

$ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read
--bs=4k --iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg
--eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error
jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256
...
fio-3.28
Starting 40 processes
[   27.521809] hrtimer: interrupt took 33067 ns
[   27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
[   27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
fio: io_u error [   27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00
on file /dev/sda[   27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s

The issue is related to how the driver manages submit queues and tags. A
single array of submit queues - sdebug_q_arr - with its own set of tags is
shared among all shosts. As such, for occasions when we have more than one
host it is possible to overload the submit queues and run out of tags.

Another separate issue that we may reduce the shost submit queue depth,
sdebug_max_queue, dynamically causing the shost to be overloaded. How many
IOs which the shost may be sent is fixed at can_queue at init time, which
is the same initial value for sdebug_max_queue. So reducing
sdebug_max_queue means that the shost may be sent more IOs than it is
configured to handle, causing overloading.

This series removes the scsi_debug submit queue concept and uses
pre-existing APIs to manage and examine tags, like scsi_block_requests()
and blk_mq_tagset_busy_iter(). Using standard APIs makes the driver more
maintainable and extensible in future.

A restriction is also added to allow sdebug_max_queue only be modified
when no shosts are present, i.e. we need to remove shosts, modify
sdebug_max_queue, and then re-add the shosts. 

Difference to v2:
- Make sdebug_alloc_queued_cmd() static

Differences to v1:
- Add patch to fix sdev queue full test
- Add restriction to allow sdebug_max_queue only be modified for no shosts

John Garry (11):
  scsi: scsi_debug: Fix check for sdev queue full
  scsi: scsi_debug: Don't iter all shosts in
    clear_luns_changed_on_target()
  scsi: scsi_debug: Change shost list lock to a mutex
  scsi: scsi_debug: Protect block_unblock_all_queues() with mutex
  scsi: scsi_debug: Use scsi_block_requests() to block queues
  scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd
  scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in
    sdebug_blk_mq_poll()
  scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued()
  scsi: scsi_debug: Use scsi_host_busy() in delay_store() and
    ndelay_store()
  scsi: scsi_debug: Only allow sdebug_max_queue be modified when no
    shosts
  scsi: scsi_debug: Drop sdebug_queue

 drivers/scsi/scsi_debug.c | 783 ++++++++++++++++++--------------------
 1 file changed, 361 insertions(+), 422 deletions(-)

Comments

Martin K. Petersen April 3, 2023, 2:15 a.m. UTC | #1
John,

> It's easy to get scsi_debug to error on throughput testing when we have
> multiple shosts:

Applied to 6.4/scsi-staging, thanks!
Martin K. Petersen April 12, 2023, 2:04 a.m. UTC | #2
On Mon, 27 Mar 2023 07:42:59 +0000, John Garry wrote:

> It's easy to get scsi_debug to error on throughput testing when we have
> multiple shosts:
> 
> $ lsscsi
> [7:0:0:0]       disk    Linux   scsi_debug      0191
> [0:0:0:0]       disk    Linux   scsi_debug      0191
> 
> [...]

Applied to 6.4/scsi-queue, thanks!

[01/11] scsi: scsi_debug: Fix check for sdev queue full
        https://git.kernel.org/mkp/scsi/c/6500d2045d52
[02/11] scsi: scsi_debug: Don't iter all shosts in clear_luns_changed_on_target()
        https://git.kernel.org/mkp/scsi/c/00f9d622e8b2
[03/11] scsi: scsi_debug: Change shost list lock to a mutex
        https://git.kernel.org/mkp/scsi/c/0aaa3fad4fd9
[04/11] scsi: scsi_debug: Protect block_unblock_all_queues() with mutex
        https://git.kernel.org/mkp/scsi/c/25b80b2c7582
[05/11] scsi: scsi_debug: Use scsi_block_requests() to block queues
        https://git.kernel.org/mkp/scsi/c/a0473bf31df5
[06/11] scsi: scsi_debug: Dynamically allocate sdebug_queued_cmd
        https://git.kernel.org/mkp/scsi/c/1107c7b24ee3
[07/11] scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in sdebug_blk_mq_poll()
        https://git.kernel.org/mkp/scsi/c/600d9ead3936
[08/11] scsi: scsi_debug: Use blk_mq_tagset_busy_iter() in stop_all_queued()
        https://git.kernel.org/mkp/scsi/c/9c559c9b4748
[09/11] scsi: scsi_debug: Use scsi_host_busy() in delay_store() and ndelay_store()
        https://git.kernel.org/mkp/scsi/c/12f3eef016ea
[10/11] scsi: scsi_debug: Only allow sdebug_max_queue be modified when no shosts
        https://git.kernel.org/mkp/scsi/c/57f7225a4fe2
[11/11] scsi: scsi_debug: Drop sdebug_queue
        https://git.kernel.org/mkp/scsi/c/f1437cd1e535