iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

Message ID 49223bb6-e2a1-eee7-cf4e-701957e6727f@mellanox.com (mailing list archive)
State Accepted
Commit Message

Max Gurtovoy Feb. 1, 2017, 3:17 p.m. UTC
Hi Raju,
please apply the attached patch, which I want to push soon (I still
haven't found the chance to test it).
I'm not sure it will solve your problem, but let's try it.

thanks,
Max.

On 2/1/2017 11:08 AM, Raju  Rangoju wrote:
>
> Hello Sagi,
>
> I intermittently see an issue with iser when unloading the iw_cxgb4 module while traffic is running. Apparently the RDMA resources are not released when iser receives the RDMA_CM_EVENT_DEVICE_REMOVAL event while IO is in progress. On receiving the DEVICE_REMOVAL event, iser_cma_handler() destroys the device by calling iser_cleanup_handler(). iser_free_ib_conn_res() destroys the QP and calls iser_free_fastreg_pool() to free the memory regions in the fastreg_pool list, and then it calls ib_dealloc_pd().
>
> Issue:
>
> iSCSI uses its .xmit_task and .cleanup_task callbacks to get/put MRs from the iser fr_pool (fastreg_pool) during normal IO. If the DEVICE_REMOVAL event is received at this point, iser_cma_handler()->iser_cleanup_handler() simply releases the MRs that are available in the fr_pool list (some MRs may have been moved to the running task list) and eventually calls ib_dealloc_pd(), which ends up hitting a kernel panic because some registered MRs have not been freed.
>
> iser_free_fastreg_pool() complains about the registered regions: "pool still has %d regions registered"
>
> Trace:
>
> iser: iser_free_fastreg_pool: pool still has 1 regions registered
> iser: iser_device_try_release: device ffff880508660080 refcount 0
> iw_cxgb4:c4iw_destroy_cq ib_cq ffff8803f3addc00
> iw_cxgb4:c4iw_wait_for_reply add wr_waitp ffffc9000dd83a28
> ------------[ cut here ]------------
> WARNING: CPU: 7 PID: 14790 at drivers/infiniband/core/verbs.c:305 ib_dealloc_pd+0x87/0xd0 [ib_core]
> Modules linked in: rdma_ucm ib_uverbs iw_cxgb4(OE-) autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod bnx2fc fcoe libfcoe 8021q libfc garp stp llc scsi_transport_fc cpufreq_ondemand be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio libcxgb ib_iser rdma_cm ib_cm iw_cm ib_core configfs ipv6 crc_ccitt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi uinput ppdev iTCO_wdt iTCO_vendor_support serio_raw pcspkr parport_pc parport tpm_infineon sg i2c_i801 i2c_core lpc_ich mfd_core e1000e acpi_cpufreq i7core_edac edac_core ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) pata_acpi(E) ata_generic(E) ata_piix(E) floppy(E) cxgb4(OE) ptp(E) pps_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
> CPU: 7 PID: 14790 Comm: rmmod Tainted: G           OE   4.10.0-rc4+ #22
> Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
> Call Trace:
> dump_stack+0x51/0x78
> __warn+0xfd/0x120
> warn_slowpath_null+0x1d/0x20
> ib_dealloc_pd+0x87/0xd0 [ib_core]
> ? ib_unregister_event_handler+0x6d/0x80 [ib_core]
> ? mutex_lock+0x16/0x40
> iser_device_try_release+0x81/0x120 [ib_iser]
> ? iser_free_rx_descriptors+0xd3/0xf0 [ib_iser]
> iser_free_ib_conn_res+0x75/0xb0 [ib_iser]
> iser_cleanup_handler+0x41/0x70 [ib_iser]
> iser_cma_handler+0x1c9/0x220 [ib_iser]
> cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
> cma_process_remove+0x127/0x170 [rdma_cm]
> ? kobject_cleanup+0x82/0x1b0
> ? kobject_release+0xd/0x10
> cma_remove_one+0x6f/0x90 [rdma_cm]
> ib_unregister_device+0xe7/0x190 [ib_core]
> c4iw_unregister_device+0x79/0x90 [iw_cxgb4]
> c4iw_remove+0x45/0x6c [iw_cxgb4]
> c4iw_exit_module+0x31/0x75 [iw_cxgb4]
> SyS_delete_module+0x183/0x1d0
> ? syscall_trace_enter+0x154/0x1f0
> ? SyS_munmap+0x6e/0x90
> do_syscall_64+0x6c/0x160
> entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x37d22e8ee7
> RSP: 002b:00007ffedd1877b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> RAX: ffffffffffffffda RBX: 00007ffedd1877c0 RCX: 00000037d22e8ee7
> RDX: 00007ffedd1877af RSI: 0000000000000880 RDI: 00007ffedd1877c0
> RBP: 00007ffedd187810 R08: 00007f0120b48700 R09: 0000000000000100
> R10: 0000000000000011 R11: 0000000000000206 R12: 0000000000000880
> R13: 00007ffedd188735 R14: 0000000000000000 R15: 0000000000000001
> ---[ end trace 9bdbdddd5759d7e6 ]---
>
>
> Steps to reproduce:
> 1. Bring up the iser target setup
> 2. Bring up the iser initiator setup
> 3. From the DUT (initiator), log in to all the targets and start IOzone traffic on all the mounted LUNs.
> 4. Now unload the iw_cxgb4 module on the iser initiator setup.
>
>
> This is a generic issue, seen with other vendors also.
>
> Could you give me a few pointers on how to debug it further to address this issue?
> I am happy to provide any further details.
>
> Thank you for any help you can provide,
> -Raju
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
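
The failure mode described in the report above can be modelled in userspace. This is only a sketch under assumptions: every name in it (mr_pool, mr_desc, pool_get, pool_put, pool_free) is hypothetical and does not correspond to the ib_iser structures; only the warning text is taken from the trace. It shows that a teardown which walks just the free list cannot release descriptors that tasks are still holding.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace model; these are NOT the ib_iser structures. */
struct mr_desc {
    struct mr_desc *next;
};

struct mr_pool {
    struct mr_desc *free_list; /* descriptors currently available */
    int size;                  /* descriptors allocated and not yet freed */
};

/* Allocate `size` descriptors and place all of them on the free list. */
static int pool_init(struct mr_pool *pool, int size)
{
    pool->free_list = NULL;
    pool->size = 0;
    for (int i = 0; i < size; i++) {
        struct mr_desc *d = malloc(sizeof(*d));
        if (!d)
            return -1;
        d->next = pool->free_list;
        pool->free_list = d;
        pool->size++;
    }
    return 0;
}

/* A task takes a descriptor off the free list for the duration of an I/O. */
static struct mr_desc *pool_get(struct mr_pool *pool)
{
    struct mr_desc *d = pool->free_list;
    if (d)
        pool->free_list = d->next;
    return d;
}

/* The task's cleanup path hands the descriptor back to the free list. */
static void pool_put(struct mr_pool *pool, struct mr_desc *d)
{
    d->next = pool->free_list;
    pool->free_list = d;
}

/*
 * Teardown frees only what is on the free list. Descriptors held by
 * tasks are unreachable from here; their count is reported and returned.
 */
static int pool_free(struct mr_pool *pool)
{
    int freed = 0;

    while (pool->free_list) {
        struct mr_desc *d = pool->free_list;
        pool->free_list = d->next;
        free(d);
        freed++;
    }

    int remaining = pool->size - freed;
    if (remaining)
        fprintf(stderr, "pool still has %d regions registered\n", remaining);
    pool->size = remaining;
    return remaining;
}
```

With a pool of four descriptors and one still held by a task, pool_free() returns 1 and prints the warning. Once the task hands the descriptor back with pool_put(), a second pool_free() returns 0.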
From c1ec2dc3660fbd3e99dfc4aa3d766a526205aa6d Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <maxg@mellanox.com>
Date: Wed, 1 Feb 2017 13:09:48 +0200
Subject: [PATCH 1/1] IB/iser: access active_qps field atomically

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/ulp/iser/iser_verbs.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
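
The diff body is not shown in this archive, so how the patch makes the access atomic (a lock, an atomic type, or something else) is unknown. As a generic illustration only, a shared counter can be updated without lost increments by using C11 atomics. The names below (qp_created, qp_destroyed, qps_active) are hypothetical and are not taken from iser_verbs.c; kernel code would use the kernel's own primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a global count of active QPs. */
static atomic_int active_qps;

/* Record a newly created QP; returns the count after the update. */
static int qp_created(void)
{
    return atomic_fetch_add(&active_qps, 1) + 1;
}

/* Record a destroyed QP; returns the count after the update. */
static int qp_destroyed(void)
{
    return atomic_fetch_sub(&active_qps, 1) - 1;
}

/* Read the current count. */
static int qps_active(void)
{
    return atomic_load(&active_qps);
}
```

Each update is a single read-modify-write, so two threads changing the counter at the same time cannot overwrite each other's result.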

Comments

Robert LeBlanc Feb. 1, 2017, 5:47 p.m. UTC | #1
On Wed, Feb 1, 2017 at 8:17 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
> Hi Raju,
> please apply the attached patch, which I want to push soon (I still
> haven't found the chance to test it).
> I'm not sure it will solve your problem, but let's try it.
>
> thanks,
> Max.

I tried this to see if it would help with my iser D state problem, but no luck.

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
Raju Rangoju Feb. 2, 2017, 5:09 a.m. UTC | #2
Hi Max,

I have tried the patch, but no luck. Issue is still seen.

-Raju

-----Original Message-----
From: Max Gurtovoy [mailto:maxg@mellanox.com] 
Sent: 01 February 2017 20:48
To: Raju Rangoju <rajur@chelsio.com>; Sagi Grimberg <sagi@grimberg.me>; linux-rdma@vger.kernel.org
Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja <bharat@chelsio.com>
Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

Hi Raju,
please apply the attached patch, which I want to push soon (I still haven't found the chance to test it).
I'm not sure it will solve your problem, but let's try it.

thanks,
Max.

Max Gurtovoy Feb. 2, 2017, 8:09 a.m. UTC | #3
On 2/2/2017 7:09 AM, Raju  Rangoju wrote:
> Hi Max,
>
> I have tried the patch, but no luck. Issue is still seen.
>
> -Raju

Thanks Raju.
I sent it anyway because this is the right behaviour.
We'll continue investigating this issue.

Max Gurtovoy Feb. 19, 2017, 9:23 a.m. UTC | #4
Adding Vladimir, who has been debugging this issue.

Sagi,
a comment added in commit 3a940daf6fa1 ("IB/iser: Protect tasks cleanup
in case IB device was already released") says that the "DEVICE_REMOVAL
event might have already released the device", but it is possible, and
this is the case now, that not all the tasks were cleaned up. We actually
destroy the low-level structures while the upper layer (iSCSI) still has
some tasks that should be cleaned up.
We also need to think about the case where iscsid is absent, which makes
the situation more difficult.

Vladimir,
please add your patch for Raju to check and let's start thinking of 
pushing this fix to main code.

Max.
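
The ordering problem described above (low-level structures destroyed while the upper layer still holds tasks) is commonly handled by letting the last user perform the teardown. The sketch below illustrates that general idea with a plain reference count. It is not Vladimir's patch, which is not shown in this thread, and every name in it (conn_res, conn_res_get, conn_res_put) is hypothetical. It is single-threaded for brevity; real code would need an atomic reference count.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in ib_iser. */
struct conn_res {
    int refs;      /* one for the connection plus one per in-flight task */
    bool released; /* set once the low-level resources are torn down */
};

/* The connection itself holds the initial reference. */
static void conn_res_init(struct conn_res *r)
{
    r->refs = 1;
    r->released = false;
}

/* A task takes a reference before it starts using the resources. */
static void conn_res_get(struct conn_res *r)
{
    r->refs++;
}

/*
 * Drop one reference. Only the caller that drops the last reference
 * tears the resources down; returns true in that case.
 */
static bool conn_res_put(struct conn_res *r)
{
    if (--r->refs == 0) {
        r->released = true;
        return true;
    }
    return false;
}
```

In this model a device removal event only drops the connection's own reference. If a task is still in flight, the resources stay alive until that task's cleanup drops the final reference.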


Vladimir Neyelov Feb. 20, 2017, 12:46 p.m. UTC | #5
Hi Raju,
Please try this patch, which solves the problem; it was checked with our tests.
Thanks,
Vladimir


-----Original Message-----
From: Max Gurtovoy 
Sent: Sunday, February 19, 2017 11:23 AM
To: Raju Rangoju <rajur@chelsio.com>; Sagi Grimberg <sagi@grimberg.me>; linux-rdma@vger.kernel.org
Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja <bharat@chelsio.com>; Vladimir Neyelov <vladimirn@mellanox.com>
Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

Adding Vladimir, who has been debugging this issue.

Sagi,
a comment added in commit 3a940daf6fa1 ("IB/iser: Protect tasks cleanup in case IB device was already released") says that the "DEVICE_REMOVAL event might have already released the device", but it is possible, and this is the case now, that not all the tasks were cleaned up. We actually destroy the low-level structures while the upper layer (iSCSI) still has some tasks that should be cleaned up.
We also need to think about the case where iscsid is absent, which makes the situation more difficult.

Vladimir,
please add your patch for Raju to check and let's start thinking of pushing this fix to main code.

Max.


Robert LeBlanc Feb. 21, 2017, 5:06 p.m. UTC | #6
Although not exactly similar, this patch does not help my isert D
state problem. The description sounds very much like what I'm seeing.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, Feb 20, 2017 at 5:46 AM, Vladimir Neyelov
<vladimirn@mellanox.com> wrote:
> Hi Raju,
> Please try this patch, which solves the problem; it was checked with our tests.
> Thanks,
> Vladimir
>
>
> -----Original Message-----
> From: Max Gurtovoy
> Sent: Sunday, February 19, 2017 11:23 AM
> To: Raju Rangoju <rajur@chelsio.com>; Sagi Grimberg <sagi@grimberg.me>; linux-rdma@vger.kernel.org
> Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja <bharat@chelsio.com>; Vladimir Neyelov <vladimirn@mellanox.com>
> Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress
>
> Adding Vladimir that was debugging this issue.
>
> Sagi,
> there is a comment that was added in commit 3a940daf6fa1 "IB/iser:
> Protect tasks cleanup in case IB device was already released"
> that "DEVICE_REMOVAL event might have already released the device"
> but it is possible and this is the case now, that not all the tasks where cleaned up. We actually destroy the low level structures but the upper layer (iscsi) still has some tasks that should be cleaned.
> Also we need to think about the case of the absence of the iscsid and this makes the situation more difficult.
>
> Vladimir,
> please add your patch for Raju to check and let's start thinking of pushing this fix to main code.
>
> Max.
>
>
> On 2/2/2017 7:09 AM, Raju  Rangoju wrote:
>> Hi Max,
>>
>> I have tried the patch, but no luck. Issue is still seen.
>>
>> -Raju
>>
>> -----Original Message-----
>> From: Max Gurtovoy [mailto:maxg@mellanox.com]
>> Sent: 01 February 2017 20:48
>> To: Raju Rangoju <rajur@chelsio.com>; Sagi Grimberg
>> <sagi@grimberg.me>; linux-rdma@vger.kernel.org
>> Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja
>> <bharat@chelsio.com>
>> Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is
>> unloaded while IO is in progress
>>
>> hi Raju,
>> please apply the attached patch I want to push soon (still haven't found the chance to test it).
>> I'm not sure it will solve your problem but let's try it.
>>
>> thanks,
>> Max.
>>
>> On 2/1/2017 11:08 AM, Raju  Rangoju wrote:
>>>
>>> Hello Sagi,
>>>
>>> I intermittently see an issue with iser when unloading the iw_cxgb4 module while traffic is running. Apparently the rdma resources are not released when iser receives the RDMA_CM_EVENT_DEVICE_REMOVAL event while IO is in progress. On receiving the DEVICE_REMOVAL event, iser_cma_handler() destroys the device by calling iser_cleanup_handler(). iser_free_ib_conn_res() destroys the qp, calls iser_free_fastreg_pool() to free the Memory Regions in the fastreg_pool list, and then calls ib_dealloc_pd.
>>>
>>> Issue:
>>>
>>> iSCSI uses its .xmit_task and .cleanup_task callbacks to get/put MRs from the iser fr_pool (fastreg_pool) during normal IO. If the DEVICE_REMOVAL event is received at this point, iser_cma_handler()->iser_cleanup_handler() simply releases the MRs still available in the fr_pool list (some MRs may have been moved to the running task list) and eventually calls ib_dealloc_pd, which ends up hitting a kernel panic because some registered MRs have not been freed.
>>>
>>> iser_free_fastreg_pool() complains about the registered regions; "pool still has %d regions registered"
>>>
>>> Trace:
>>>
>>> iser: iser_free_fastreg_pool: pool still has 1 regions registered
>>> iser: iser_device_try_release: device ffff880508660080 refcount 0
>>> iw_cxgb4:c4iw_destroy_cq ib_cq ffff8803f3addc00
>>> iw_cxgb4:c4iw_wait_for_reply add wr_waitp ffffc9000dd83a28
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 7 PID: 14790 at drivers/infiniband/core/verbs.c:305 ib_dealloc_pd+0x87/0xd0 [ib_core]
>>> Modules linked in: rdma_ucm ib_uverbs iw_cxgb4(OE-) autofs4 target_core_iblock target_core_file target_core_pscsi target_core_mod bnx2fc fcoe libfcoe 8021q libfc garp stp llc scsi_transport_fc cpufreq_ondemand be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio libcxgb ib_iser rdma_cm ib_cm iw_cm ib_core configfs ipv6 crc_ccitt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi uinput ppdev iTCO_wdt iTCO_vendor_support serio_raw pcspkr parport_pc parport tpm_infineon sg i2c_i801 i2c_core lpc_ich mfd_core e1000e acpi_cpufreq i7core_edac edac_core ioatdma dca ext4(E) mbcache(E) jbd2(E) sd_mod(E) pata_acpi(E) ata_generic(E) ata_piix(E) floppy(E) cxgb4(OE) ptp(E) pps_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E)
>>> CPU: 7 PID: 14790 Comm: rmmod Tainted: G           OE   4.10.0-rc4+ #22
>>> Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0        07/29/10
>>> Call Trace:
>>> dump_stack+0x51/0x78
>>> __warn+0xfd/0x120
>>> warn_slowpath_null+0x1d/0x20
>>> ib_dealloc_pd+0x87/0xd0 [ib_core]
>>> ? ib_unregister_event_handler+0x6d/0x80 [ib_core]
>>> ? mutex_lock+0x16/0x40
>>> iser_device_try_release+0x81/0x120 [ib_iser]
>>> ? iser_free_rx_descriptors+0xd3/0xf0 [ib_iser]
>>> iser_free_ib_conn_res+0x75/0xb0 [ib_iser]
>>> iser_cleanup_handler+0x41/0x70 [ib_iser]
>>> iser_cma_handler+0x1c9/0x220 [ib_iser]
>>> cma_remove_id_dev+0x8f/0xa0 [rdma_cm]
>>> cma_process_remove+0x127/0x170 [rdma_cm]
>>> ? kobject_cleanup+0x82/0x1b0
>>> ? kobject_release+0xd/0x10
>>> cma_remove_one+0x6f/0x90 [rdma_cm]
>>> ib_unregister_device+0xe7/0x190 [ib_core]
>>> c4iw_unregister_device+0x79/0x90 [iw_cxgb4]
>>> c4iw_remove+0x45/0x6c [iw_cxgb4]
>>> c4iw_exit_module+0x31/0x75 [iw_cxgb4]
>>> SyS_delete_module+0x183/0x1d0
>>> ? syscall_trace_enter+0x154/0x1f0
>>> ? SyS_munmap+0x6e/0x90
>>> do_syscall_64+0x6c/0x160
>>> entry_SYSCALL64_slow_path+0x25/0x25
>>> RIP: 0033:0x37d22e8ee7
>>> RSP: 002b:00007ffedd1877b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>>> RAX: ffffffffffffffda RBX: 00007ffedd1877c0 RCX: 00000037d22e8ee7
>>> RDX: 00007ffedd1877af RSI: 0000000000000880 RDI: 00007ffedd1877c0
>>> RBP: 00007ffedd187810 R08: 00007f0120b48700 R09: 0000000000000100
>>> R10: 0000000000000011 R11: 0000000000000206 R12: 0000000000000880
>>> R13: 00007ffedd188735 R14: 0000000000000000 R15: 0000000000000001
>>> ---[ end trace 9bdbdddd5759d7e6 ]---
>>>
>>>
>>> Steps to reproduce:
>>> 1. Bring up the iser target setup
>>> 2. Bring up the iser initiator setup
>>> 3. From the DUT (initiator), log in to all the targets and start IOzone traffic on all the mounted LUNs.
>>> 4. Now unload iw_cxgb4 module on the iser initiator setup.
>>>
>>>
>>> This is a generic issue, seen with other vendors also.
>>>
>>> Could you give me a few pointers on how to debug this issue further?
>>> I am happy to provide any further details.
>>>
>>> Thank you for any help you can provide,
>>> -Raju
>>>
Sagi Grimberg Feb. 22, 2017, 7:59 p.m. UTC | #7
> Hi Raju,
> Try this patch, which solves this problem; it was checked with our tests.

I might be missing something, but how is this patch any different
than what I sent (other than duplicating some code)?
Vladimir Neyelov Feb. 23, 2017, 8:27 a.m. UTC | #8
Hi,
I didn't see your patch. But if it's a similar patch, it's very good that two people thought of the same solution.
Vladimir.

-----Original Message-----
From: Sagi Grimberg [mailto:sagi@grimberg.me] 
Sent: Wednesday, February 22, 2017 9:59 PM
To: Vladimir Neyelov <vladimirn@mellanox.com>; Max Gurtovoy <maxg@mellanox.com>; Raju Rangoju <rajur@chelsio.com>; linux-rdma@vger.kernel.org
Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja <bharat@chelsio.com>
Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress

> Hi Raju,
> Try this patch, which solves this problem; it was checked with our tests.

I might be missing something, but how is this patch any different than what I sent (other than duplicating some code)?
Max Gurtovoy Feb. 23, 2017, 10:05 a.m. UTC | #9
On 2/23/2017 10:27 AM, Vladimir Neyelov wrote:
> Hi,
> I didn't see your patch. But if it's a similar patch, it's very good that two people thought of the same solution.
> Vladimir.
>
> -----Original Message-----
> From: Sagi Grimberg [mailto:sagi@grimberg.me]
> Sent: Wednesday, February 22, 2017 9:59 PM
> To: Vladimir Neyelov <vladimirn@mellanox.com>; Max Gurtovoy <maxg@mellanox.com>; Raju Rangoju <rajur@chelsio.com>; linux-rdma@vger.kernel.org
> Cc: SWise OGC <swise@opengridcomputing.com>; Potnuri Bharat Teja <bharat@chelsio.com>
> Subject: Re: iSER fails to release rdma resources (WRs) if iw_cxgb4 is unloaded while IO is in progress
>
>> Hi Raju,
>> Try this patch, which solves this problem; it was checked with our tests.
>
> I might be missing something, but how is this patch any different than what I sent (other than duplicating some code)?
>

Sagi,
can you send your patch again ?
I missed it too.

thanks,
Max.
Patch

diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 6a9d1cb..30b622f 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -597,7 +597,9 @@  static void iser_free_ib_conn_res(struct iser_conn *iser_conn,
 		  iser_conn, ib_conn->cma_id, ib_conn->qp);
 
 	if (ib_conn->qp != NULL) {
+		mutex_lock(&ig.connlist_mutex);
 		ib_conn->comp->active_qps--;
+		mutex_unlock(&ig.connlist_mutex);
 		rdma_destroy_qp(ib_conn->cma_id);
 		ib_conn->qp = NULL;
 	}
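
The hunk above takes ig.connlist_mutex around the decrement of ib_conn->comp->active_qps. Presumably the counter is shared by several connections that can be torn down at the same time, so an unlocked decrement could lose updates. A minimal userspace sketch of that idea follows, using pthreads instead of kernel mutexes. It is not iser code: struct comp_ctx, release_conn(), run_teardown() and NUM_CONNS are hypothetical names chosen for illustration, and only the names connlist_mutex and active_qps are borrowed from the patch.

```c
/* Userspace illustration only (pthreads), not kernel or iser code.
 * All names except connlist_mutex/active_qps are hypothetical. */
#include <assert.h>
#include <pthread.h>

#define NUM_CONNS 8

/* Stand-in for a completion context shared by several connections. */
struct comp_ctx {
	int active_qps;	/* number of QPs currently using this context */
};

static pthread_mutex_t connlist_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct comp_ctx comp;

/* Each "connection" drops its QP reference under the mutex,
 * mirroring the lock/decrement/unlock sequence in the hunk. */
static void *release_conn(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&connlist_mutex);
	comp.active_qps--;
	pthread_mutex_unlock(&connlist_mutex);
	return NULL;
}

/* Tear down NUM_CONNS connections concurrently and return the
 * remaining QP count, which is 0 when no decrement is lost. */
int run_teardown(void)
{
	pthread_t threads[NUM_CONNS];
	int i, rc;

	comp.active_qps = NUM_CONNS;

	for (i = 0; i < NUM_CONNS; i++) {
		rc = pthread_create(&threads[i], NULL, release_conn, NULL);
		assert(rc == 0);
	}
	for (i = 0; i < NUM_CONNS; i++) {
		rc = pthread_join(threads[i], NULL);
		assert(rc == 0);
	}
	return comp.active_qps;
}
```

Calling run_teardown() from a small main() and building with -pthread should leave the counter at 0. Without the lock, the read-modify-write in the decrement is a data race between threads, which is the kind of problem the hunk appears to guard against.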