
[RFC,blktests,v1,0/1] Test case for 'nvme: short-circuit connection retries'

Message ID 20230621155825.20146-1-dwagner@suse.de (mailing list archive)


Daniel Wagner June 21, 2023, 3:58 p.m. UTC
We had a longer discussion on how to interpret the DNR bit on reconnect attempts
in [1]. The conclusion was (if I got this right) that we should not try to
reconnect with the same parameters when the error response has the DNR bit set.
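
As a rough illustration of that rule (a model, not the kernel code), the sketch
below counts how many connect attempts a reconnect loop makes when every attempt
fails with the same status. The function name and the retry limit are
hypothetical; the only outside fact used is that the Linux host driver reports
DNR as 0x4000 in the status value.

```shell
#!/bin/sh
# Illustrative model only; names and limits are hypothetical.
NVME_SC_DNR=$((0x4000))   # Do Not Retry bit in the reported status value

# reconnect_attempts STATUS MAX_RETRIES
# Prints how many connect attempts are made when every attempt
# fails with STATUS.
reconnect_attempts() {
    status=$1
    max_retries=$2
    attempts=0
    while [ "$attempts" -lt "$max_retries" ]; do
        attempts=$((attempts + 1))
        # DNR set: retrying with the same parameters cannot succeed,
        # so give up immediately instead of waiting for the next retry.
        if [ $((status & NVME_SC_DNR)) -ne 0 ]; then
            break
        fi
    done
    echo "$attempts"
}
```

With a status that has DNR set the loop stops after the first attempt; without
DNR it runs until the retry limit is reached.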

The FC transport already implemented this behavior with

  f25f8ef70ce2 ("nvme-fc: short-circuit reconnect retries")

Hannes also provided patches for TCP and RDMA [2]. With these patches this test
will pass.

nvme/050 implements this test case by (ab)using the queue count mechanism to
trigger a reconnect. Before the reconnect is triggered, the test sets the
allow_any_host attribute to 0, which forces the reconnect to fail.
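
The two steps can be sketched as follows. This is not the code from
tests/nvme/050: the helper name is hypothetical, and the configfs file names
attr_allow_any_host and attr_qid_max are assumptions about the nvmet
subsystem layout. The sketch also assumes that writing a new queue count limit
makes the target drop its controllers, so that the host has to reconnect.

```shell
#!/bin/sh
# Hypothetical helper, not the code from tests/nvme/050.
# CFS_ROOT defaults to the usual configfs mount point.
CFS_ROOT="${CFS_ROOT:-/sys/kernel/config}"

# force_rejected_reconnect SUBSYSNQN QID_MAX
force_rejected_reconnect() {
    subsys_dir="${CFS_ROOT}/nvmet/subsystems/$1"

    # 1. Stop accepting arbitrary hosts, so the next Connect is rejected.
    echo 0 > "${subsys_dir}/attr_allow_any_host" || return 1

    # 2. Change the queue count limit; assumed to make the target tear
    #    down its controllers, which forces the host to reconnect.
    echo "$2" > "${subsys_dir}/attr_qid_max" || return 1
}
```

The order matters: the host must already be locked out when the reconnect
starts, otherwise the first reconnect attempt simply succeeds.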

[1] https://lore.kernel.org/linux-nvme/20220927143157.3659-1-dwagner@suse.de/
[2] https://lore.kernel.org/linux-nvme/20220715063356.134124-1-hare@suse.de/


This patch is based on top of
  blktests: https://lore.kernel.org/linux-nvme/20230620132703.20648-1-dwagner@suse.de/
  linux: https://lore.kernel.org/linux-nvme/20230620133711.22840-1-dwagner@suse.de/


fc:

nvme/050 (test DNR is handled on connect attempt with invalid arguments) [passed]
    runtime  8.845s  ...  3.756s

tcp:

nvme/050 (test DNR is handled on connect attempt with invalid arguments) [failed]
    runtime  3.756s  ...  8.836s
    --- tests/nvme/050.out      2023-06-21 11:47:47.767788898 +0200
    +++ /home/wagi/work/blktests/results/nodev/nvme/050.out.bad 2023-06-21 15:19:08.368414289 +0200
    @@ -1,2 +1,3 @@
     Running nvme/050
    +controller "nvme2" not deleted within 5 seconds
     Test complete

fc:

 run blktests nvme/050 at 2023-06-21 15:11:31
 loop0: detected capacity change from 0 to 32768
 nvmet: adding nsid 1 to subsystem blktests-subsystem-1
 nvme nvme2: NVME-FC{0}: create association : host wwpn 0x20001100aa000002  rport wwpn 0x20001100aa000001: NQN "blktests-subsystem-1"
 (NULL device *): {0:0} Association created
 [7088] nvmet: ctrl 1 start keep-alive timer for 5 secs
 nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a.
 [6743] nvmet: adding queue 1 to ctrl 1.
 [6312] nvmet: adding queue 2 to ctrl 1.
 [7088] nvmet: adding queue 3 to ctrl 1.
 [6927] nvmet: adding queue 4 to ctrl 1.
 nvme nvme2: NVME-FC{0}: new ctrl: NQN "blktests-subsystem-1"
 nvme nvme2: NVME-FC{0}: io failed due to lldd error 6
 nvme nvme2: NVME-FC{0}: transport association event: transport detected io error
 nvme nvme2: NVME-FC{0}: resetting controller
 [7088] nvmet: ctrl 1 stop keep-alive
 (NULL device *): {0:0} Association deleted
 nvme nvme2: NVME-FC{0}: create association : host wwpn 0x20001100aa000002  rport wwpn 0x20001100aa000001: NQN "blktests-subsystem-1"
 (NULL device *): {0:0} Association freed
 (NULL device *): {0:0} Association created
 (NULL device *): Disconnect LS failed: No Association
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: NVME-FC{0}: reset: Reconnect attempt failed (16772)
 nvme nvme2: NVME-FC{0}: reconnect failure
 nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
 (NULL device *): {0:0} Association deleted
 (NULL device *): {0:0} Association freed
 (NULL device *): Disconnect LS failed: No Association

tcp:

 run blktests nvme/050 at 2023-06-21 15:11:36
 loop0: detected capacity change from 0 to 32768
 nvmet: adding nsid 1 to subsystem blktests-subsystem-1
 nvmet_tcp: enabling port 0 (127.0.0.1:4420)
 [62] nvmet: ctrl 1 start keep-alive timer for 5 secs
 nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a.
 nvme nvme2: creating 4 I/O queues.
 nvme nvme2: mapped 4/0/0 default/read/poll queues.
 [62] nvmet: adding queue 1 to ctrl 1.
 [214] nvmet: adding queue 2 to ctrl 1.
 [215] nvmet: adding queue 3 to ctrl 1.
 [177] nvmet: adding queue 4 to ctrl 1.
 nvme nvme2: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420
 nvme nvme2: starting error recovery
 nvme nvme2: Reconnecting in 1 seconds...
 [6743] nvmet: ctrl 1 stop keep-alive
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: failed to connect queue: 0 ret=16772
 nvme nvme2: Failed reconnect attempt 1
 nvme nvme2: Reconnecting in 1 seconds...
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: failed to connect queue: 0 ret=16772
 nvme nvme2: Failed reconnect attempt 2
 nvme nvme2: Reconnecting in 1 seconds...
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: failed to connect queue: 0 ret=16772
 nvme nvme2: Failed reconnect attempt 3
 nvme nvme2: Reconnecting in 1 seconds...
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: failed to connect queue: 0 ret=16772
 nvme nvme2: Failed reconnect attempt 4
 nvme nvme2: Reconnecting in 1 seconds...
 nvmet: connect by host nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a for subsystem blktests-subsystem-1 not allowed
 nvme_fabrics: nvmf_log_connect_error: DNR 1
 nvme nvme2: Connect for subsystem blktests-subsystem-1 is not allowed, hostnqn: nqn.2014-08.org.nvmexpress:uuid:77b49aba-06b4-431a-9af8-75e318740f1a
 nvme nvme2: failed to connect queue: 0 ret=16772
 nvme nvme2: Failed reconnect attempt 5
 nvme nvme2: Reconnecting in 1 seconds...
 nvme nvme2: Removing ctrl: NQN "blktests-subsystem-1"
 nvme nvme2: Property Set error: 880, offset 0x14
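
The value 16772 reported by both transports can be split into the DNR bit and
the status code. A minimal decode, assuming the Linux host driver's layout of
the status value (DNR is 0x4000; the 0x7ff mask for status code type plus
status code is an assumption about that layout):

```shell
#!/bin/sh
status=16772                                  # value from the logs above

# Bit 14 is the Do Not Retry flag.
dnr=$(( (status & 0x4000) >> 14 ))

# Low 11 bits: status code type and status code.
code=$(printf '0x%x' $(( status & 0x7ff )))

echo "status=$(printf '0x%x' "$status") dnr=${dnr} code=${code}"
# prints: status=0x4184 dnr=1 code=0x184
```

If include/linux/nvme.h is read correctly, 0x184 is
NVME_SC_CONNECT_INVALID_HOST, which matches the "not allowed" message and the
"DNR 1" line logged by nvmf_log_connect_error.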

Daniel Wagner (1):
  nvme/050: test DNR handling on reconnect

 tests/nvme/050     | 126 +++++++++++++++++++++++++++++++++++++++++++++
 tests/nvme/050.out |   2 +
 2 files changed, 128 insertions(+)
 create mode 100644 tests/nvme/050
 create mode 100644 tests/nvme/050.out