From patchwork Thu Feb 12 01:43:27 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Ledford X-Patchwork-Id: 5815031 Return-Path: X-Original-To: patchwork-linux-rdma@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 3023C9F37F for ; Thu, 12 Feb 2015 01:44:18 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4CB04201ED for ; Thu, 12 Feb 2015 01:44:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4980120218 for ; Thu, 12 Feb 2015 01:44:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753484AbbBLBoL (ORCPT ); Wed, 11 Feb 2015 20:44:11 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40295 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751417AbbBLBoK (ORCPT ); Wed, 11 Feb 2015 20:44:10 -0500 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t1C1i4GN028593 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 11 Feb 2015 20:44:04 -0500 Received: from linux-ws.xsintricity.com ([10.10.116.22]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t1C1i1Ef029536; Wed, 11 Feb 2015 20:44:04 -0500 From: Doug Ledford To: linux-rdma@vger.kernel.org, roland@kernel.org Cc: Or Gerlitz , Erez Shitrit , Doug Ledford Subject: [PATCH 04/22] IB/ipoib: fix mcast_dev_flush/mcast_restart_task race Date: Wed, 11 Feb 2015 20:43:27 -0500 Message-Id: <4133bfe2ad236899caf0831475ea636354a54883.1423703861.git.dledford@redhat.com> In-Reply-To: References: In-Reply-To: References: X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Our mcast_dev_flush routine and our mcast_restart_task can race against each other. In particular, they both hold the priv->lock while manipulating the rbtree and while removing mcast entries from the multicast_list and while adding entries to the remove_list, but they also both drop their locks prior to doing the actual removes. The mcast_dev_flush routine is run entirely under the rtnl lock and so has at least some locking. The actual race condition is like this: Thread 1 Thread 2 ifconfig ib0 up start multicast join for broadcast multicast join completes for broadcast start to add more multicast joins call mcast_restart_task to add new entries ifconfig ib0 down mcast_dev_flush mcast_leave(mcast A) mcast_leave(mcast A) As mcast_leave calls ib_sa_multicast_leave, and as member in core/multicast.c is ref counted, we run into an unbalanced refcount issue. To avoid stomping on each others removes, take the rtnl lock specifically when we are deleting the entries from the remove list. Signed-off-by: Doug Ledford --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 37 ++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index a52c9f3f7e4..41325960e4e 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -802,7 +802,10 @@ void ipoib_mcast_dev_flush(struct net_device *dev) spin_unlock_irqrestore(&priv->lock, flags); - /* seperate between the wait to the leave*/ + /* + * make sure the in-flight joins have finished before we attempt + * to leave + */ list_for_each_entry_safe(mcast, tmcast, &remove_list, list) if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) wait_for_completion(&mcast->done); @@ -923,14 +926,38 @@ void ipoib_mcast_restart_task(struct work_struct *work) netif_addr_unlock(dev); local_irq_restore(flags); - /* We have to cancel outside of the spinlock */ + /* + * make sure the in-flight joins have finished before we attempt + * to leave + */ + list_for_each_entry_safe(mcast, tmcast, &remove_list, list) + if (test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) + wait_for_completion(&mcast->done); + + /* + * We have to cancel outside of the spinlock, but we have to + * take the rtnl lock or else we race with the removal of + * entries from the remove list in mcast_dev_flush as part + * of ipoib_stop() which will call mcast_stop_thread with + * flush == 1 while holding the rtnl lock, and the + * flush_workqueue won't complete until this restart_mcast_task + * completes. So do like the carrier on task and attempt to + * take the rtnl lock, but if we can't before the ADMIN_UP flag + * goes away, then just return and know that the remove list will + * get flushed later by mcast_dev_flush. + */ + while (!rtnl_trylock()) { + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + return; + else + msleep(20); + } list_for_each_entry_safe(mcast, tmcast, &remove_list, list) { ipoib_mcast_leave(mcast->dev, mcast); ipoib_mcast_free(mcast); } - - if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) - ipoib_mcast_start_thread(dev); + ipoib_mcast_start_thread(dev); + rtnl_unlock(); } #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG