From patchwork Sat Oct 29 16:21:03 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chuck Lever X-Patchwork-Id: 9403973 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 542D760588 for ; Sat, 29 Oct 2016 16:21:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 47A232909D for ; Sat, 29 Oct 2016 16:21:11 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3AF4C290D2; Sat, 29 Oct 2016 16:21:11 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9318E2909D for ; Sat, 29 Oct 2016 16:21:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932390AbcJ2QVJ convert rfc822-to-8bit (ORCPT ); Sat, 29 Oct 2016 12:21:09 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:25566 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932255AbcJ2QVI (ORCPT ); Sat, 29 Oct 2016 12:21:08 -0400 Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u9TGL5Ye031021 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 29 Oct 2016 16:21:06 GMT Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by aserv0021.oracle.com (8.13.8/8.13.8) with ESMTP id u9TGL5lb027612 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Sat, 29 Oct 2016 16:21:05 GMT Received: from abhmp0001.oracle.com (abhmp0001.oracle.com [141.146.116.7]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id u9TGL5UX016460; Sat, 29 Oct 2016 16:21:05 GMT Received: from anon-dhcp-171.1015granger.net (/68.46.169.226) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Sat, 29 Oct 2016 09:21:04 -0700 From: Chuck Lever Subject: svc_xprt_put is no longer BH-safe Date: Sat, 29 Oct 2016 12:21:03 -0400 Message-Id: <1934654A-78D7-4180-A0AA-4B144254BFC2@oracle.com> Cc: Linux NFS Mailing List To: "J. Bruce Fields" Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) X-Mailer: Apple Mail (2.3124) X-Source-IP: aserv0021.oracle.com [141.146.126.233] Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Hi Bruce- I hit this lockdep splat this morning Oct 29 11:38:25 klimt kernel: ================================= Oct 29 11:38:25 klimt kernel: [ INFO: inconsistent lock state ] Oct 29 11:38:25 klimt kernel: 4.9.0-rc2-00003-g114ddae #9 Not tainted Oct 29 11:38:25 klimt kernel: --------------------------------- Oct 29 11:38:25 klimt kernel: inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. Oct 29 11:38:25 klimt kernel: swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes: Oct 29 11:38:25 klimt kernel: ( Oct 29 11:38:25 klimt kernel: &(&xps->xps_lock)->rlock Oct 29 11:38:25 klimt kernel: ){+.?...} Oct 29 11:38:25 klimt kernel: , at: Oct 29 11:38:25 klimt kernel: [] xprt_switch_free+0x26/0xb0 [sunrpc] Oct 29 11:38:25 klimt kernel: {SOFTIRQ-ON-W} state was registered at: Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] __lock_acquire+0x343/0x1670 Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] lock_acquire+0x197/0x1f0 Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] _raw_spin_lock+0x38/0x50 Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] xprt_switch_free+0x26/0xb0 [sunrpc] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] xprt_switch_put+0x22/0x30 [sunrpc] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] svc_xprt_free+0x5d/0x80 [sunrpc] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] svc_xprt_release+0x12a/0x140 [sunrpc] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] svc_recv+0xcb2/0xed0 [sunrpc] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] nfsd+0xe8/0x160 [nfsd] Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] kthread+0x10b/0x120 Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] ret_from_fork+0x2a/0x40 Oct 29 11:38:25 klimt kernel: irq event stamp: 483745258 Oct 29 11:38:25 klimt kernel: hardirqs last enabled at (483745258): Oct 29 11:38:25 klimt kernel: [] __local_bh_enable_ip+0x95/0xd0 Oct 29 11:38:25 klimt kernel: hardirqs last disabled at (483745257): Oct 29 11:38:25 klimt kernel: [] __local_bh_enable_ip+0x55/0xd0 Oct 29 11:38:25 klimt kernel: softirqs last enabled at (483744848): Oct 29 11:38:25 klimt kernel: [] _local_bh_enable+0x42/0x50 Oct 29 11:38:25 klimt kernel: softirqs last disabled at (483744849): Oct 29 11:38:25 klimt kernel: [] irq_exit+0x5b/0xf0 Oct 29 11:38:25 klimt kernel: #012other info that might help us debug this: Oct 29 11:38:25 klimt kernel: Possible unsafe locking scenario: Oct 29 11:38:25 klimt kernel: CPU0 Oct 29 11:38:25 klimt kernel: ---- Oct 29 11:38:25 klimt kernel: lock( Oct 29 11:38:25 klimt kernel: &(&xps->xps_lock)->rlock Oct 29 11:38:25 klimt kernel: ); Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: lock( Oct 29 11:38:25 klimt kernel: &(&xps->xps_lock)->rlock Oct 29 11:38:25 klimt kernel: ); Oct 29 11:38:25 klimt kernel: #012 *** DEADLOCK *** Oct 29 11:38:25 klimt kernel: no locks held by swapper/6/0. Oct 29 11:38:25 klimt kernel: #012stack backtrace: Oct 29 11:38:25 klimt kernel: CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.9.0-rc2-00003-g114ddae #9 Oct 29 11:38:25 klimt kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015 Oct 29 11:38:25 klimt kernel: ffff88087bd83be8 ffffffff81392e51 ffff880857008040 ffffffff827e2500 Oct 29 11:38:25 klimt kernel: ffff88087bd83c38 ffffffff811b3e76 0000000000000001 0000000000000001 Oct 29 11:38:25 klimt kernel: 0000000000000000 0000000000000006 ffff880857008040 ffffffff810e1060 Oct 29 11:38:25 klimt kernel: Call Trace: Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] dump_stack+0x85/0xc4 Oct 29 11:38:25 klimt kernel: [] print_usage_bug+0x1eb/0x1fc Oct 29 11:38:25 klimt kernel: [] ? print_irq_inversion_bug+0x200/0x200 Oct 29 11:38:25 klimt kernel: [] mark_lock+0x175/0x290 Oct 29 11:38:25 klimt kernel: [] __lock_acquire+0x28f/0x1670 Oct 29 11:38:25 klimt kernel: [] lock_acquire+0x197/0x1f0 Oct 29 11:38:25 klimt kernel: [] ? xprt_switch_free+0x26/0xb0 [sunrpc] Oct 29 11:38:25 klimt kernel: [] _raw_spin_lock+0x38/0x50 Oct 29 11:38:25 klimt kernel: [] ? xprt_switch_free+0x26/0xb0 [sunrpc] Oct 29 11:38:25 klimt kernel: [] xprt_switch_free+0x26/0xb0 [sunrpc] Oct 29 11:38:25 klimt kernel: [] xprt_switch_put+0x22/0x30 [sunrpc] Oct 29 11:38:25 klimt kernel: [] svc_xprt_free+0x5d/0x80 [sunrpc] Oct 29 11:38:25 klimt kernel: [] svc_xprt_put+0x1d/0x20 [sunrpc] Oct 29 11:38:25 klimt kernel: [] svc_rdma_wc_receive+0xcb/0xe0 [rpcrdma] Oct 29 11:38:25 klimt kernel: [] __ib_process_cq+0x35/0xc0 [ib_core] Oct 29 11:38:25 klimt kernel: [] ib_poll_handler+0x22/0x60 [ib_core] Oct 29 11:38:25 klimt kernel: [] irq_poll_softirq+0x85/0x100 Oct 29 11:38:25 klimt kernel: [] __do_softirq+0x1f9/0x425 Oct 29 11:38:25 klimt kernel: [] irq_exit+0x5b/0xf0 Oct 29 11:38:25 klimt kernel: [] do_IRQ+0xef/0x110 Oct 29 11:38:25 klimt kernel: [] common_interrupt+0x96/0x96 Oct 29 11:38:25 klimt kernel: Oct 29 11:38:25 klimt kernel: [] ? cpuidle_enter_state+0x22c/0x300 Oct 29 11:38:25 klimt kernel: [] cpuidle_enter+0x17/0x20 Oct 29 11:38:25 klimt kernel: [] call_cpuidle+0x3d/0x50 Oct 29 11:38:25 klimt kernel: [] cpu_startup_entry+0x19e/0x240 Oct 29 11:38:25 klimt kernel: [] start_secondary+0x160/0x1a0 In commit 39a9beab5acb83176e8b9a4f0778749a09341f1f ('rpc: share one xps between all backchannels') you added: svc_xprt_free() is invoked by svc_xprt_put(). svc_xprt_put() is called from svc_rdma in soft IRQ context (eg. svc_rdma_wc_receive). However, xprt_switch_put() takes a spin lock (xps_lock) which is locked everywhere without disabling BHs. It looks to me like 39a9beab5acb makes svc_xprt_put() no longer BH-safe? Not sure if svc_xprt_put() was intended to be BH-safe beforehand. Maybe xprt_switch_put() could be invoked in ->xpo_free, but that seems like a temporary solution. --- Chuck Lever -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c index f5572e3..4f01f63 100644 --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -136,6 +136,8 @@ static void svc_xprt_free(struct kref *kref) /* See comment on corresponding get in xs_setup_bc_tcp(): */ if (xprt->xpt_bc_xprt) xprt_put(xprt->xpt_bc_xprt); + if (xprt->xpt_bc_xps) + xprt_switch_put(xprt->xpt_bc_xps); xprt->xpt_ops->xpo_free(xprt); module_put(owner); }