Message ID: 20230322002350.4038048-1-kbusch@meta.com
Series: nvme fabrics polling fixes
> I couldn't test the existing tcp
It is very simple to do; for tcp you can just operate on the loopback
interface 127.0.0.1.
See example:
$ ./target.sh 1 1
$ nvme connect-all -t tcp -a 127.0.0.1 -P 1
For rdma you can work on any interface you want; you just need the
rdma utility from iproute (which is available in recent iproute2
packages).
Setup Software RoCE:
$ rdma link add rxe0 type rxe netdev <iface>
Setup Software iWarp:
$ rdma link add siw0 type siw netdev <iface>
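You can check that the soft link actually came up with:
$ rdma link show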
Then the same as before...
$ ./target.sh 1 1
$ nvme connect-all -t rdma -a <iface-addr> -P 1
That is essentially what blktests does.
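For reference, blktests can be pointed at a transport in much the same
way; assuming a stock blktests checkout, something like the following
runs the nvme group over tcp (nvme_trtype also accepts loop and rdma):

$ nvme_trtype=tcp ./check nvme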
target.sh
--
#!/bin/bash
# args: <num_subsys> <nr_ns-per-subsystem> <nvme|file>
# examples:
# - ./target.sh 1 1 # uses null_blk
# - ./target.sh 1 1 nvme
# - ./target.sh 1 1 file
CFGFS=/sys/kernel/config/nvmet
NQN=testnqn
mount -t configfs configfs /sys/kernel/config > /dev/null 2>&1
rm -f $CFGFS/ports/*/subsystems/*
rm -f $CFGFS/subsystems/*/allowed_hosts/* > /dev/null 2>&1
rmdir $CFGFS/subsystems/*/namespaces/* > /dev/null 2>&1
rmdir $CFGFS/subsystems/* > /dev/null 2>&1
rmdir $CFGFS/ports/* > /dev/null 2>&1
rmdir $CFGFS/hosts/* > /dev/null 2>&1
modprobe -r nvmet_rdma nvmet-tcp > /dev/null 2>&1
modprobe -r ib_umad rdma_ucm ib_uverbs rdma_cm > /dev/null 2>&1
modprobe -r null_blk nvme-tcp nvme-rdma
modprobe -r nvme-loop
rm -f /tmp/nsf*
if [[ $1 == "0" ]]; then
	exit
fi
modprobe null_blk nr_devices=$2 #queue_mode=0 irqmode=0
modprobe -a nvmet_tcp nvmet_rdma
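# three listening ports: 1 = tcp/8009, 2 = rdma/4420, 3 = loop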
mkdir $CFGFS/ports/1
echo 0.0.0.0 > $CFGFS/ports/1/addr_traddr
echo 8009 > $CFGFS/ports/1/addr_trsvcid
echo "tcp" > $CFGFS/ports/1/addr_trtype
echo "ipv4" > $CFGFS/ports/1/addr_adrfam
mkdir $CFGFS/ports/2
echo 0.0.0.0 > $CFGFS/ports/2/addr_traddr
echo 4420 > $CFGFS/ports/2/addr_trsvcid
echo "rdma" > $CFGFS/ports/2/addr_trtype
echo "ipv4" > $CFGFS/ports/2/addr_adrfam
mkdir $CFGFS/ports/3
echo "loop" > $CFGFS/ports/3/addr_trtype
for j in `seq $1`
do
	mkdir $CFGFS/subsystems/$NQN$j
	echo 1 > $CFGFS/subsystems/$NQN$j/attr_allow_any_host
	k=0
	for i in `seq $2`
	do
		mkdir $CFGFS/subsystems/$NQN$j/namespaces/$i
		# back each namespace with a real nvme device, a file, or null_blk
		if [[ "$3" == "nvme" ]]; then
			echo -n "/dev/nvme0n1" > $CFGFS/subsystems/$NQN$j/namespaces/$i/device_path
		elif [[ "$3" == "file" ]]; then
			ns="/tmp/nsf$k"
			touch $ns
			truncate --size=2G $ns
			echo -n "$ns" > $CFGFS/subsystems/$NQN$j/namespaces/$i/device_path
		else
			echo -n "/dev/nullb$k" > $CFGFS/subsystems/$NQN$j/namespaces/$i/device_path
		fi
		echo 1 > $CFGFS/subsystems/$NQN$j/namespaces/$i/enable
		let "k+=1"
	done
	ln -s $CFGFS/subsystems/$NQN$j $CFGFS/ports/1/subsystems/
	ln -s $CFGFS/subsystems/$NQN$j $CFGFS/ports/2/subsystems/
	ln -s $CFGFS/subsystems/$NQN$j $CFGFS/ports/3/subsystems/
done
--
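Once the target is up, the discovery log should list the testnqn
subsystems, and calling the script with 0 subsystems tears everything
down again:

$ nvme discover -t tcp -a 127.0.0.1 -s 8009
$ ./target.sh 0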
On Tue, Mar 21, 2023 at 05:23:47PM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> I couldn't test the existing tcp or rdma options, so I had to make a
> loop poll option. The last patch fixes the polling queues when used with
> fabrics.
>
> Note, this depends on patch I sent earlier today that I should have just
> included in this series:
>
> https://lore.kernel.org/linux-block/20230321215001.2655451-1-kbusch@meta.com/T/#u

I've tested this series with

https://github.com/igaw/blktests/tree/queue-counts

and while for rdma all is good I got a lockdep warning for tcp:


======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc1+ #15 Tainted: G        W
------------------------------------------------------
kworker/6:0/54 is trying to acquire lock:
ffff888121d88030 ((work_completion)(&queue->io_work)){+.+.}-{0:0}, at: __flush_work+0xb9/0x170

but task is already holding lock:
ffff888100b0fd20 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x707/0xbc0

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
       lock_acquire+0x13a/0x310
       process_one_work+0x728/0xbc0
       worker_thread+0x97a/0x1480
       kthread+0x228/0x2b0
       ret_from_fork+0x1f/0x30

-> #1 ((wq_completion)nvmet-wq){+.+.}-{0:0}:
       lock_acquire+0x13a/0x310
       __flush_workqueue+0x185/0x14e0
       nvmet_tcp_install_queue+0x63/0x270 [nvmet_tcp]
       nvmet_install_queue+0x2b1/0x6a0 [nvmet]
       nvmet_execute_admin_connect+0x381/0x880 [nvmet]
       nvmet_tcp_io_work+0x15e8/0x8f60 [nvmet_tcp]
       process_one_work+0x756/0xbc0
       worker_thread+0x97a/0x1480
       kthread+0x228/0x2b0
       ret_from_fork+0x1f/0x30

-> #0 ((work_completion)(&queue->io_work)){+.+.}-{0:0}:
       validate_chain+0x19f1/0x6d50
       __lock_acquire+0x122d/0x1e90
       lock_acquire+0x13a/0x310
       __flush_work+0xd5/0x170
       __cancel_work_timer+0x36b/0x470
       nvmet_tcp_release_queue_work+0x25c/0x1000 [nvmet_tcp]
       process_one_work+0x756/0xbc0
       worker_thread+0x97a/0x1480
       kthread+0x228/0x2b0
       ret_from_fork+0x1f/0x30

other info that might help us debug this:

Chain exists of:
  (work_completion)(&queue->io_work) --> (wq_completion)nvmet-wq --> (work_completion)(&queue->release_work)

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((work_completion)(&queue->release_work));
                               lock((wq_completion)nvmet-wq);
                               lock((work_completion)(&queue->release_work));
  lock((work_completion)(&queue->io_work));

 *** DEADLOCK ***

2 locks held by kworker/6:0/54:
 #0: ffff888109ff6d48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x6c8/0xbc0
 #1: ffff888100b0fd20 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x707/0xbc0

stack backtrace:
CPU: 6 PID: 54 Comm: kworker/6:0 Tainted: G        W          6.3.0-rc1+ #15 f4d05de834b07d62567d33b70ec70fb0fa06f103
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: nvmet-wq nvmet_tcp_release_queue_work [nvmet_tcp]
Call Trace:
 <TASK>
 dump_stack_lvl+0x5a/0x80
 check_noncircular+0x2c8/0x390
 ? add_chain_block+0x5e0/0x5e0
 ? ret_from_fork+0x1f/0x30
 ? lockdep_lock+0xd3/0x260
 ? stack_trace_save+0x10a/0x1e0
 ? stack_trace_snprint+0x100/0x100
 ? check_noncircular+0x1a6/0x390
 validate_chain+0x19f1/0x6d50
 ? lockdep_unlock+0x9e/0x1f0
 ? validate_chain+0x15b2/0x6d50
 ? reacquire_held_locks+0x510/0x510
 ? reacquire_held_locks+0x510/0x510
 ? reacquire_held_locks+0x510/0x510
 ? add_lock_to_list+0xbf/0x2c0
 ? lockdep_unlock+0x9e/0x1f0
 ? validate_chain+0x15b2/0x6d50
 ? reacquire_held_locks+0x510/0x510
 ? reacquire_held_locks+0x510/0x510
 ? xfs_buf_find_lock+0xb0/0x430 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? reacquire_held_locks+0x510/0x510
 ? validate_chain+0x176/0x6d50
 ? trace_lock_acquired+0x7b/0x180
 ? lock_is_held_type+0x8b/0x110
 ? lock_is_held_type+0x8b/0x110
 ? rcu_read_lock_sched_held+0x34/0x70
 ? reacquire_held_locks+0x510/0x510
 ? xfs_buf_get_map+0xd72/0x11a0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? lock_is_held_type+0x8b/0x110
 ? rcu_read_lock_sched_held+0x34/0x70
 ? trace_xfs_buf_read+0x7c/0x180 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_buf_read_map+0x111/0x700 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? lock_is_held_type+0x8b/0x110
 ? lock_is_held_type+0x8b/0x110
 ? xfs_btree_read_buf_block+0x205/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? rcu_read_lock_sched_held+0x34/0x70
 ? trace_xfs_trans_read_buf+0x79/0x170 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_btree_read_buf_block+0x205/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_trans_read_buf_map+0x303/0x4f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? trace_xfs_trans_getsb+0x170/0x170 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_verify_fsbno+0x74/0x130 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_btree_ptr_to_daddr+0x19b/0x660 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_buf_set_ref+0x1d/0x50 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_btree_read_buf_block+0x233/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? mark_lock+0x8f/0x320
 ? xfs_btree_readahead+0x250/0x250 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_verify_fsbno+0x74/0x130 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_btree_ptr_to_daddr+0x19b/0x660 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? __module_address+0x86/0x1e0
 ? ret_from_fork+0x1f/0x30
 ? deref_stack_reg+0x17f/0x210
 ? ret_from_fork+0x1f/0x30
 ? unwind_next_frame+0x16b/0x2240
 ? ret_from_fork+0x1f/0x30
 ? stack_trace_save+0x1e0/0x1e0
 ? arch_stack_walk+0xb7/0xf0
 ? lock_is_held_type+0x8b/0x110
 ? find_busiest_group+0x104e/0x2480
 ? load_balance+0x2540/0x2540
 ? stack_trace_save+0x10a/0x1e0
 ? mark_lock+0x8f/0x320
 ? __lock_acquire+0x122d/0x1e90
 ? lock_is_held_type+0x8b/0x110
 ? rcu_lock_acquire+0x30/0x30
 ? xfs_buf_ioend+0x248/0x450 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? xfs_buf_ioend+0x248/0x450 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
 ? __module_address+0x86/0x1e0
 ? ret_from_fork+0x1f/0x30
 ? deref_stack_reg+0x17f/0x210
 ? ret_from_fork+0x1f/0x30
 ? unwind_next_frame+0x16b/0x2240
 ? stack_trace_save+0x10a/0x1e0
 ? deref_stack_reg+0x17f/0x210
 ? look_up_lock_class+0x65/0x130
 ? register_lock_class+0x5d/0x860
 ? mark_lock+0x8f/0x320
 __lock_acquire+0x122d/0x1e90
 lock_acquire+0x13a/0x310
 ? __flush_work+0xb9/0x170
 ? read_lock_is_recursive+0x10/0x10
 ? lock_is_held_type+0x8b/0x110
 ? rcu_lock_acquire+0x30/0x30
 __flush_work+0xd5/0x170
 ? __flush_work+0xb9/0x170
 ? flush_work+0x10/0x10
 ? lock_is_held_type+0x8b/0x110
 ? __lock_acquire+0x1e90/0x1e90
 ? do_raw_spin_unlock+0x112/0x890
 ? mark_lock+0x8f/0x320
 ? lockdep_hardirqs_on_prepare+0x2d5/0x610
 __cancel_work_timer+0x36b/0x470
 ? cancel_work_sync+0x10/0x10
 ? mark_lock+0x8f/0x320
 ? lockdep_hardirqs_on_prepare+0x2d5/0x610
 ? nvmet_tcp_release_queue_work+0x24d/0x1000 [nvmet_tcp f61749ac066e0812c28869697bc2623872f02bd4]
 ? datagram_poll+0x380/0x380
 nvmet_tcp_release_queue_work+0x25c/0x1000 [nvmet_tcp f61749ac066e0812c28869697bc2623872f02bd4]
 process_one_work+0x756/0xbc0
 ? rescuer_thread+0x13f0/0x13f0
 ? lock_acquired+0x2f2/0x930
 ? worker_thread+0xf55/0x1480
 worker_thread+0x97a/0x1480
 ? rcu_lock_release+0x20/0x20
 kthread+0x228/0x2b0
 ? rcu_lock_release+0x20/0x20
 ? kthread_blkcg+0xa0/0xa0
 ret_from_fork+0x1f/0x30
 </TASK>
>> From: Keith Busch <kbusch@kernel.org>
>>
>> I couldn't test the existing tcp or rdma options, so I had to make a
>> loop poll option. The last patch fixes the polling queues when used with
>> fabrics.
>>
>> Note, this depends on patch I sent earlier today that I should have just
>> included in this series:
>>
>> https://lore.kernel.org/linux-block/20230321215001.2655451-1-kbusch@meta.com/T/#u
>
> I've tested this series with
>
> https://github.com/igaw/blktests/tree/queue-counts
>
> and while for rdma all is good I got a lockdep warning for tcp:
>
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.3.0-rc1+ #15 Tainted: G        W
> ------------------------------------------------------
> kworker/6:0/54 is trying to acquire lock:
> ffff888121d88030 ((work_completion)(&queue->io_work)){+.+.}-{0:0}, at: __flush_work+0xb9/0x170
>
> but task is already holding lock:
> ffff888100b0fd20 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x707/0xbc0
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 ((work_completion)(&queue->release_work)){+.+.}-{0:0}:
>        lock_acquire+0x13a/0x310
>        process_one_work+0x728/0xbc0
>        worker_thread+0x97a/0x1480
>        kthread+0x228/0x2b0
>        ret_from_fork+0x1f/0x30
>
> -> #1 ((wq_completion)nvmet-wq){+.+.}-{0:0}:
>        lock_acquire+0x13a/0x310
>        __flush_workqueue+0x185/0x14e0
>        nvmet_tcp_install_queue+0x63/0x270 [nvmet_tcp]
>        nvmet_install_queue+0x2b1/0x6a0 [nvmet]
>        nvmet_execute_admin_connect+0x381/0x880 [nvmet]
>        nvmet_tcp_io_work+0x15e8/0x8f60 [nvmet_tcp]
>        process_one_work+0x756/0xbc0
>        worker_thread+0x97a/0x1480
>        kthread+0x228/0x2b0
>        ret_from_fork+0x1f/0x30
>
> -> #0 ((work_completion)(&queue->io_work)){+.+.}-{0:0}:
>        validate_chain+0x19f1/0x6d50
>        __lock_acquire+0x122d/0x1e90
>        lock_acquire+0x13a/0x310
>        __flush_work+0xd5/0x170
>        __cancel_work_timer+0x36b/0x470
>        nvmet_tcp_release_queue_work+0x25c/0x1000 [nvmet_tcp]
>        process_one_work+0x756/0xbc0
>        worker_thread+0x97a/0x1480
>        kthread+0x228/0x2b0
>        ret_from_fork+0x1f/0x30
>
> other info that might help us debug this:
>
> Chain exists of:
>   (work_completion)(&queue->io_work) --> (wq_completion)nvmet-wq --> (work_completion)(&queue->release_work)

This came up before. It's because in nvmet_tcp_install_queue we flush
the nvmet_wq, to let queue releases complete before accepting new
queues. Without it, there is no back-pressure and a host can simply
reconnect in a loop and exhaust nvmet system memory...

I don't think it's a deadlock, because the queue that flushes the
nvmet_wq cannot itself be in release; only other queues can be releasing
at that point. So I can't see how it would deadlock. But I don't know
how to teach that to lockdep. I'll try to see if there is a different
way to do this without annoying lockdep.
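To make the reported cycle easier to follow, here is a minimal sketch of
the three pieces lockdep chains together. This is illustrative
pseudo-kernel code reconstructed from the trace above, not the actual
nvmet-tcp implementation; names and signatures are abbreviated:

/*
 * Sketch only. Lockdep records:
 *   io_work --> nvmet_wq        (install_queue flushes the workqueue)
 *   nvmet_wq --> release_work   (release_work runs on nvmet_wq)
 *   release_work --> io_work    (release cancels io_work synchronously)
 * which closes the circle it warns about, even though the queue doing
 * the flush can never be the queue currently being released.
 */

/* Runs on nvmet_wq: tears the queue down. */
static void nvmet_tcp_release_queue_work(struct work_struct *w)
{
	struct nvmet_tcp_queue *queue =
		container_of(w, struct nvmet_tcp_queue, release_work);

	cancel_work_sync(&queue->io_work);	/* waits for io_work */
	/* ... free queue resources ... */
}

/* Called from io_work context while servicing the admin connect. */
static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq)
{
	/*
	 * Back-pressure: let pending release_work items finish before
	 * admitting a new queue, so a host reconnecting in a loop
	 * cannot exhaust target memory.
	 */
	flush_workqueue(nvmet_wq);	/* waits for everything on nvmet_wq */
	return 0;
}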
>
> Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock((work_completion)(&queue->release_work));
>                                lock((wq_completion)nvmet-wq);
>                                lock((work_completion)(&queue->release_work));
>   lock((work_completion)(&queue->io_work));
>
>  *** DEADLOCK ***
>
> 2 locks held by kworker/6:0/54:
>  #0: ffff888109ff6d48 ((wq_completion)nvmet-wq){+.+.}-{0:0}, at: process_one_work+0x6c8/0xbc0
>  #1: ffff888100b0fd20 ((work_completion)(&queue->release_work)){+.+.}-{0:0}, at: process_one_work+0x707/0xbc0
>
> stack backtrace:
> CPU: 6 PID: 54 Comm: kworker/6:0 Tainted: G        W          6.3.0-rc1+ #15 f4d05de834b07d62567d33b70ec70fb0fa06f103
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> Workqueue: nvmet-wq nvmet_tcp_release_queue_work [nvmet_tcp]
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x5a/0x80
>  check_noncircular+0x2c8/0x390
>  ? add_chain_block+0x5e0/0x5e0
>  ? ret_from_fork+0x1f/0x30
>  ? lockdep_lock+0xd3/0x260
>  ? stack_trace_save+0x10a/0x1e0
>  ? stack_trace_snprint+0x100/0x100
>  ? check_noncircular+0x1a6/0x390
>  validate_chain+0x19f1/0x6d50
>  ? lockdep_unlock+0x9e/0x1f0
>  ? validate_chain+0x15b2/0x6d50
>  ? reacquire_held_locks+0x510/0x510
>  ? reacquire_held_locks+0x510/0x510
>  ? reacquire_held_locks+0x510/0x510
>  ? add_lock_to_list+0xbf/0x2c0
>  ? lockdep_unlock+0x9e/0x1f0
>  ? validate_chain+0x15b2/0x6d50
>  ? reacquire_held_locks+0x510/0x510
>  ? reacquire_held_locks+0x510/0x510
>  ? xfs_buf_find_lock+0xb0/0x430 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? reacquire_held_locks+0x510/0x510
>  ? validate_chain+0x176/0x6d50
>  ? trace_lock_acquired+0x7b/0x180
>  ? lock_is_held_type+0x8b/0x110
>  ? lock_is_held_type+0x8b/0x110
>  ? rcu_read_lock_sched_held+0x34/0x70
>  ? reacquire_held_locks+0x510/0x510
>  ? xfs_buf_get_map+0xd72/0x11a0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? lock_is_held_type+0x8b/0x110
>  ? rcu_read_lock_sched_held+0x34/0x70
>  ? trace_xfs_buf_read+0x7c/0x180 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_buf_read_map+0x111/0x700 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? lock_is_held_type+0x8b/0x110
>  ? lock_is_held_type+0x8b/0x110
>  ? xfs_btree_read_buf_block+0x205/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? rcu_read_lock_sched_held+0x34/0x70
>  ? trace_xfs_trans_read_buf+0x79/0x170 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_btree_read_buf_block+0x205/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_trans_read_buf_map+0x303/0x4f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? trace_xfs_trans_getsb+0x170/0x170 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_verify_fsbno+0x74/0x130 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_btree_ptr_to_daddr+0x19b/0x660 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_buf_set_ref+0x1d/0x50 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_btree_read_buf_block+0x233/0x300 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? mark_lock+0x8f/0x320
>  ? xfs_btree_readahead+0x250/0x250 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_verify_fsbno+0x74/0x130 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_btree_ptr_to_daddr+0x19b/0x660 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_dio_write_end_io+0x32f/0x3f0 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? __module_address+0x86/0x1e0
>  ? ret_from_fork+0x1f/0x30
>  ? deref_stack_reg+0x17f/0x210
>  ? ret_from_fork+0x1f/0x30
>  ? unwind_next_frame+0x16b/0x2240
>  ? ret_from_fork+0x1f/0x30
>  ? stack_trace_save+0x1e0/0x1e0
>  ? arch_stack_walk+0xb7/0xf0
>  ? lock_is_held_type+0x8b/0x110
>  ? find_busiest_group+0x104e/0x2480
>  ? load_balance+0x2540/0x2540
>  ? stack_trace_save+0x10a/0x1e0
>  ? mark_lock+0x8f/0x320
>  ? __lock_acquire+0x122d/0x1e90
>  ? lock_is_held_type+0x8b/0x110
>  ? rcu_lock_acquire+0x30/0x30
>  ? xfs_buf_ioend+0x248/0x450 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? xfs_buf_ioend+0x248/0x450 [xfs e56ce85f3b18232dbd061be3c73dc29bed4ad37b]
>  ? __module_address+0x86/0x1e0
>  ? ret_from_fork+0x1f/0x30
>  ? deref_stack_reg+0x17f/0x210
>  ? ret_from_fork+0x1f/0x30
>  ? unwind_next_frame+0x16b/0x2240
>  ? stack_trace_save+0x10a/0x1e0
>  ? deref_stack_reg+0x17f/0x210
>  ? look_up_lock_class+0x65/0x130
>  ? register_lock_class+0x5d/0x860
>  ? mark_lock+0x8f/0x320
>  __lock_acquire+0x122d/0x1e90
>  lock_acquire+0x13a/0x310
>  ? __flush_work+0xb9/0x170
>  ? read_lock_is_recursive+0x10/0x10
>  ? lock_is_held_type+0x8b/0x110
>  ? rcu_lock_acquire+0x30/0x30
>  __flush_work+0xd5/0x170
>  ? __flush_work+0xb9/0x170
>  ? flush_work+0x10/0x10
>  ? lock_is_held_type+0x8b/0x110
>  ? __lock_acquire+0x1e90/0x1e90
>  ? do_raw_spin_unlock+0x112/0x890
>  ? mark_lock+0x8f/0x320
>  ? lockdep_hardirqs_on_prepare+0x2d5/0x610
>  __cancel_work_timer+0x36b/0x470
>  ? cancel_work_sync+0x10/0x10
>  ? mark_lock+0x8f/0x320
>  ? lockdep_hardirqs_on_prepare+0x2d5/0x610
>  ? nvmet_tcp_release_queue_work+0x24d/0x1000 [nvmet_tcp f61749ac066e0812c28869697bc2623872f02bd4]
>  ? datagram_poll+0x380/0x380
>  nvmet_tcp_release_queue_work+0x25c/0x1000 [nvmet_tcp f61749ac066e0812c28869697bc2623872f02bd4]
>  process_one_work+0x756/0xbc0
>  ? rescuer_thread+0x13f0/0x13f0
>  ? lock_acquired+0x2f2/0x930
>  ? worker_thread+0xf55/0x1480
>  worker_thread+0x97a/0x1480
>  ? rcu_lock_release+0x20/0x20
>  kthread+0x228/0x2b0
>  ? rcu_lock_release+0x20/0x20
>  ? kthread_blkcg+0xa0/0xa0
>  ret_from_fork+0x1f/0x30
>  </TASK>
From: Keith Busch <kbusch@kernel.org>

I couldn't test the existing tcp or rdma options, so I had to make a
loop poll option. The last patch fixes the polling queues when used with
fabrics.

Note, this depends on a patch I sent earlier today that I should have
just included in this series:

https://lore.kernel.org/linux-block/20230321215001.2655451-1-kbusch@meta.com/T/#u

Keith Busch (3):
  nvme-fabrics: add queue setup helpers
  nvme: add polling options for loop target
  blk-mq: directly poll requests

 block/blk-mq.c              |  4 +-
 drivers/nvme/host/fabrics.c | 95 +++++++++++++++++++++++++++++++++++++
 drivers/nvme/host/fabrics.h |  5 ++
 drivers/nvme/host/rdma.c    | 81 ++-----------------------------
 drivers/nvme/host/tcp.c     | 92 ++---------------------------------
 drivers/nvme/target/loop.c  | 63 ++++++++++++++++++++++--

 6 files changed, 168 insertions(+), 172 deletions(-)
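If you want to exercise the new loop poll option from this series
against the target script above, the connect would presumably look
something like this (hypothetical example: -P/--nr-poll-queues is the
existing nvme-cli fabrics option that this series wires up for the loop
transport):

$ ./target.sh 1 1
$ nvme connect -t loop -n testnqn1 -P 2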