From patchwork Sat Jun 1 13:19:16 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bob Copeland X-Patchwork-Id: 2648281 Return-Path: X-Original-To: patchwork-linux-wireless@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id 37F1DDFE76 for ; Sat, 1 Jun 2013 13:20:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756840Ab3FANUO (ORCPT ); Sat, 1 Jun 2013 09:20:14 -0400 Received: from mail-ie0-f172.google.com ([209.85.223.172]:36558 "EHLO mail-ie0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756482Ab3FANUN (ORCPT ); Sat, 1 Jun 2013 09:20:13 -0400 Received: by mail-ie0-f172.google.com with SMTP id 17so6665489iea.3 for ; Sat, 01 Jun 2013 06:20:12 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=date:from:to:cc:subject:message-id:mime-version:content-type :content-disposition:user-agent:x-gm-message-state; bh=QCXdQrSA9to/n8IEpYKFl00Nkq7T76Bv9rYtcrJ9+DM=; b=kVR+tgipMkWXngOsvekO3foarsdb+olqHabq1lIIIEZu4vdBt02Xu+NEfMsqM5o0yK /CATWelGBLvUshMxEo1JslW2CHfm7d7fLZH5w0qRdJ+ry9/A3B7WTWKQ9/K3clnUPuTP YqKHg+zI3lOI/O86eURhXqUCeLhcvk90Zj2X+zY2uF8zgOiY8fyxm45gyBkvbN7d5zOr tB9ZQl+5Kb3GkfZY3XIcbVPkl29f1pJpZaMURvc11XbT51udivo39+KUHsuJ7+0BkpxP lbALYv/LjHVetX//o0PL1ny47Obu7dakGa289WEMnG1W5rto6l1WddmCexajYrI5ZXWg UYOQ== X-Received: by 10.50.73.194 with SMTP id n2mr3816891igv.73.1370092812405; Sat, 01 Jun 2013 06:20:12 -0700 (PDT) Received: from hash.lan (CPE0018e7fe5281-CM78cd8ec60a0d.cpe.net.cable.rogers.com. [99.242.255.161]) by mx.google.com with ESMTPSA id qr3sm8477031igb.1.2013.06.01.06.20.10 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sat, 01 Jun 2013 06:20:11 -0700 (PDT) Received: from bob by hash.lan with local (Exim 4.72) (envelope-from ) id 1UiliQ-0000fl-HG; Sat, 01 Jun 2013 09:19:26 -0400 Date: Sat, 1 Jun 2013 09:19:16 -0400 From: Bob Copeland To: johannes@sipsolutions.net, thomas@cozybit.com Cc: linux-wireless@vger.kernel.org, devel@lists.open80211s.org Subject: [PATCH] cfg80211: fix deadlock in cfg80211_leave_mesh() Message-ID: <20130601131916.GA2484@localhost> MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-06-14) X-Gm-Message-State: ALoCoQmeBRZiNa47z5pvH3LB9VlcGcDFAdJs22/OgO9UWsvs/9wnwr4pZ561+i3+P9GJdC7Wfs4j Sender: linux-wireless-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org As of "cfg80211/mac80211: use cfg80211 wdev mutex in mac80211", mac80211 expects to be able to take the wdev mutex around sdata accesses. This causes a recursive deadlock since __cfg80211_leave_mesh() already holds the wdev mutex. Removing the sdata_lock() calls in ieee80211_stop_mesh() alone won't fix this, as the cancel_work_sync() in mesh runs the iface work, and various work items also want to take the wdev lock (not just in mesh, see e.g. ieee80211_sta_rx_queued_mgmt().) We don't seem to need the wdev lock held over rdev_leave_mesh() anyway, so drop it before calling. Fixes: [ 75.571222] ============================================= [ 75.571222] [ INFO: possible recursive locking detected ] [ 75.571222] 3.10.0-rc1-a8cd57b+ #113 Not tainted [ 75.571222] --------------------------------------------- [ 75.571222] iw/2597 is trying to acquire lock: [ 75.571222] (&wdev->mtx){+.+.+.}, at: [] ieee80211_stop_mesh+0x60/0x160 [mac80211] [ 75.571222] [ 75.571222] but task is already holding lock: [ 75.571222] (&wdev->mtx){+.+.+.}, at: [] cfg80211_leave_mesh+0x35/0x370 [cfg80211] [ 75.571222] [ 75.571222] other info that might help us debug this: [ 75.571222] Possible unsafe locking scenario: [ 75.571222] [ 75.571222] CPU0 [ 75.571222] ---- [ 75.571222] lock(&wdev->mtx); [ 75.571222] lock(&wdev->mtx); [ 75.571222] [ 75.571222] *** DEADLOCK *** [ 75.571222] [ 75.571222] May be due to missing lock nesting notation [ 75.571222] [ 75.571222] 4 locks held by iw/2597: [ 75.571222] #0: (cb_lock){++++++}, at: [] genl_rcv+0x1d/0x40 [ 75.571222] #1: (genl_mutex){+.+.+.}, at: [] genl_lock+0x17/0x20 [ 75.571222] #2: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20 [ 75.571222] #3: (&wdev->mtx){+.+.+.}, at: [] cfg80211_leave_mesh+0x35/0x370 [cfg80211] [ 75.571222] [ 75.571222] stack backtrace: [ 75.571222] CPU: 1 PID: 2597 Comm: iw Not tainted 3.10.0-rc1-a8cd57b+ #113 [ 75.571222] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 75.571222] ffffffff82155790 ffff8800035c5768 ffffffff817b8335 ffff8800035c5868 [ 75.571222] ffffffff810a3da2 0000000000000000 0000000000000000 0000000000000002 [ 75.571222] 0000000000000046 00000000035c5818 ffffffff82155790 ffff8800057e2360 [ 75.571222] Call Trace: [ 75.571222] [] dump_stack+0x19/0x1b [ 75.571222] [] __lock_acquire+0xba2/0x1d30 [ 75.571222] [] ? _raw_spin_unlock_irqrestore+0x4d/0x70 [ 75.571222] [] ? trace_hardirqs_on_caller+0x16d/0x200 [ 75.571222] [] lock_acquire+0x17a/0x200 [ 75.571222] [] ? ieee80211_stop_mesh+0x60/0x160 [mac80211] [ 75.571222] [] mutex_lock_nested+0x6e/0x380 [ 75.571222] [] ? ieee80211_stop_mesh+0x60/0x160 [mac80211] [ 75.571222] [] ? ieee80211_bss_info_change_notify+0x202/0x3d0 [mac80211] [ 75.571222] [] ieee80211_stop_mesh+0x60/0x160 [mac80211] [ 75.571222] [] ieee80211_leave_mesh+0x1d/0x30 [mac80211] [ 75.571222] [] cfg80211_leave_mesh+0x188/0x370 [cfg80211] [ 75.571222] [] nl80211_leave_mesh+0x19/0x20 [cfg80211] [ 75.571222] [] genl_family_rcv_msg+0x266/0x2e0 [ 75.571222] [] ? genl_lock+0x20/0x20 [ 75.571222] [] genl_rcv_msg+0x8e/0xc0 [ 75.571222] [] netlink_rcv_skb+0x6e/0xd0 [ 75.571222] [] ? genl_rcv+0x1d/0x40 [ 75.571222] [] genl_rcv+0x2c/0x40 [ 75.571222] [] netlink_unicast+0x11a/0x1f0 [ 75.571222] [] ? sock_has_perm+0x5/0x1f0 [ 75.571222] [] netlink_sendmsg+0x30d/0x360 [ 75.571222] [] sock_sendmsg+0xa6/0xd0 [ 75.571222] [] ? lock_release_non_nested+0xae/0x2e0 [ 75.571222] [] __sys_sendmsg+0x369/0x390 [ 75.571222] [] ? up_read+0x23/0x40 [ 75.571222] [] ? __do_page_fault+0x4b4/0x570 [ 75.571222] [] ? dput+0x13d/0x1d0 [ 75.571222] [] ? fget_light+0x12c/0x430 [ 75.571222] [] SyS_sendmsg+0x4d/0x80 [ 75.571222] [] system_call_fastpath+0x16/0x1b Signed-off-by: Bob Copeland --- I don't like dropping locks, but I can't see any races added by this -- join/leave are already serialized by rtnl and I didn't see leave_mesh using the wdev fields in a way that would matter -- but let me know if I missed something. We could also split out the cancel_work_sync into a separate (unlocked) op but this approach seems simpler. Thoughts? net/wireless/mesh.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/wireless/mesh.c b/net/wireless/mesh.c index 5dfb289..6344a81 100644 --- a/net/wireless/mesh.c +++ b/net/wireless/mesh.c @@ -250,7 +250,9 @@ static int __cfg80211_leave_mesh(struct cfg80211_registered_device *rdev, if (!wdev->mesh_id_len) return -ENOTCONN; + wdev_unlock(wdev); err = rdev_leave_mesh(rdev, dev); + wdev_lock(wdev); if (!err) { wdev->mesh_id_len = 0; wdev->channel = NULL;