From patchwork Tue Nov 4 09:23:44 2014
X-Patchwork-Submitter: "zheng.li"
X-Patchwork-Id: 5224061
From: Zheng Li <zheng.x.li@oracle.com>
To: linux-rdma@vger.kernel.org, ogerlitz@mellanox.com, yishaih@mellanox.com,
	jackm@dev.mellanox.co.il, eli@mellanox.com
Cc: joe.jin@oracle.com, john.sobecki@oracle.com, guru.anbalagane@oracle.com,
	zheng.x.li@oracle.com
Subject: [PATCH] net/mlx4_core: Convert rcu locking to rwlock in CQ.
Date: Tue, 4 Nov 2014 17:23:44 +0800
Message-Id: <1415093024-13041-1-git-send-email-zheng.x.li@oracle.com>
X-Mailer: git-send-email 1.7.6.5
X-Mailing-List: linux-rdma@vger.kernel.org

The mlx4_ib_cq_comp dispatching was racy, and that race was originally fixed
by using RCU locking. However, the RCU-based fix introduced a regression in
IB port failure tests, so this patch changes the protection to an rwlock.

The rcu_XXX APIs synchronize threads by relying on preempt_disable/enable to
keep one thread from crossing another, but they are not safe against
interrupt routines. In this case we are in interrupt context:
mlx4_msi_x_interrupt -> mlx4_eq_int -> mlx4_cq_completion, so rcu_XXX cannot
be used to protect this code.
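For illustration only, below is a minimal userspace sketch of the reader/writer
pattern the patch applies: lookups take the lock for reading (and may run
concurrently), while insert/delete take it for writing. It uses
pthread_rwlock_t and a plain array in place of the kernel rwlock_t and radix
tree; the names cq_entry, table_lookup, table_insert and table_remove are
made up for the example and are not mlx4 symbols. Build with -pthread.

/* Illustrative analogue of the rwlock-protected CQ table (not driver code). */
#include <pthread.h>
#include <stdio.h>

#define TABLE_SIZE 64

struct cq_entry {
	unsigned int cqn;
	int arm_sn;
};

static struct cq_entry *table[TABLE_SIZE];
static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Lookup path (the kernel completion handler): readers run concurrently. */
static struct cq_entry *table_lookup(unsigned int cqn)
{
	struct cq_entry *cq;

	pthread_rwlock_rdlock(&table_lock);
	cq = table[cqn % TABLE_SIZE];
	pthread_rwlock_unlock(&table_lock);
	return cq;
}

/* Alloc/free paths: writers get exclusive access to the table. */
static void table_insert(struct cq_entry *cq)
{
	pthread_rwlock_wrlock(&table_lock);
	table[cq->cqn % TABLE_SIZE] = cq;
	pthread_rwlock_unlock(&table_lock);
}

static void table_remove(unsigned int cqn)
{
	pthread_rwlock_wrlock(&table_lock);
	table[cqn % TABLE_SIZE] = NULL;
	pthread_rwlock_unlock(&table_lock);
}

int main(void)
{
	struct cq_entry cq = { .cqn = 7, .arm_sn = 0 };

	table_insert(&cq);
	if (table_lookup(7))
		printf("cq %u found\n", cq.cqn);
	table_remove(7);
	return 0;
}

In the kernel version the readers run in hard interrupt context, which is why
the patch uses read_lock() there and write_lock_irq() on the insert/delete
paths: the writer disables local interrupts so a completion interrupt arriving
on the same CPU cannot spin on a lock that CPU already holds for writing.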
The stack is below:

crash> bt
PID: 32068  TASK: ffff880ed14405c0  CPU: 20  COMMAND: "kworker/u:1"
 #0 [ffff880ec7c01a00] __schedule at ffffffff81506664
 #1 [ffff880ec7c01b18] schedule at ffffffff81506d45
 #2 [ffff880ec7c01b28] schedule_timeout at ffffffff8150719c
 #3 [ffff880ec7c01bd8] wait_for_common at ffffffff81506bcd
 #4 [ffff880ec7c01c68] wait_for_completion at ffffffff81506cfd
 #5 [ffff880ec7c01c78] synchronize_sched at ffffffff810dda48
 #6 [ffff880ec7c01cc8] mlx4_cq_free at ffffffffa027e67e [mlx4_core]
 #7 [ffff880ec7c01cf8] mlx4_ib_destroy_cq at ffffffffa03238a3 [mlx4_ib]
 #8 [ffff880ec7c01d18] ib_destroy_cq at ffffffffa0301c9e [ib_core]
 #9 [ffff880ec7c01d28] rds_ib_conn_shutdown at ffffffffa041cf5d [rds_rdma]
#10 [ffff880ec7c01dd8] rds_conn_shutdown at ffffffffa02b7f18 [rds]
#11 [ffff880ec7c01e38] rds_shutdown_worker at ffffffffa02bd14a [rds]
#12 [ffff880ec7c01e58] process_one_work at ffffffff8108c3b9
#13 [ffff880ec7c01ea8] worker_thread at ffffffff8108ccfa
#14 [ffff880ec7c01ee8] kthread at ffffffff810912d7
#15 [ffff880ec7c01f48] kernel_thread_helper at ffffffff815123c4

Signed-off-by: Zheng Li <zheng.x.li@oracle.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Guru <guru.anbalagane@oracle.com>
Signed-off-by: John Sobecki <john.sobecki@oracle.com>
---
 drivers/net/ethernet/mellanox/mlx4/cq.c   | 26 +++++++++++++++-----------
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  2 +-
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index 56022d6..8b3b849 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -54,15 +54,19 @@
 
 void mlx4_cq_completion(struct mlx4_dev *dev, u32 cqn)
 {
+	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	cq = radix_tree_lookup(&mlx4_priv(dev)->cq_table.tree,
-			       cqn & (dev->caps.num_cqs - 1));
+	read_lock(&cq_table->lock);
+
+	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
 	if (!cq) {
 		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
 		return;
 	}
 
+	read_unlock(&cq_table->lock);
+
 	++cq->arm_sn;
 
 	cq->comp(cq);
@@ -73,13 +77,13 @@ void mlx4_cq_event(struct mlx4_dev *dev, u32 cqn, int event_type)
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	struct mlx4_cq *cq;
 
-	spin_lock(&cq_table->lock);
+	read_lock(&cq_table->lock);
 
 	cq = radix_tree_lookup(&cq_table->tree, cqn & (dev->caps.num_cqs - 1));
 	if (cq)
 		atomic_inc(&cq->refcount);
 
-	spin_unlock(&cq_table->lock);
+	read_unlock(&cq_table->lock);
 
 	if (!cq) {
 		mlx4_warn(dev, "Async event for bogus CQ %08x\n", cqn);
@@ -256,9 +260,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	if (err)
 		return err;
 
-	spin_lock_irq(&cq_table->lock);
+	write_lock_irq(&cq_table->lock);
 	err = radix_tree_insert(&cq_table->tree, cq->cqn, cq);
-	spin_unlock_irq(&cq_table->lock);
+	write_unlock_irq(&cq_table->lock);
 	if (err)
 		goto err_icm;
 
@@ -297,9 +301,9 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 	return 0;
 
 err_radix:
-	spin_lock_irq(&cq_table->lock);
+	write_lock_irq(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	write_unlock_irq(&cq_table->lock);
 
 err_icm:
 	mlx4_cq_free_icm(dev, cq->cqn);
@@ -320,9 +324,9 @@ void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq)
 
 	synchronize_irq(priv->eq_table.eq[cq->vector].irq);
 
-	spin_lock_irq(&cq_table->lock);
+	write_lock_irq(&cq_table->lock);
 	radix_tree_delete(&cq_table->tree, cq->cqn);
-	spin_unlock_irq(&cq_table->lock);
+	write_unlock_irq(&cq_table->lock);
 
 	if (atomic_dec_and_test(&cq->refcount))
 		complete(&cq->free);
@@ -337,7 +341,7 @@ int mlx4_init_cq_table(struct mlx4_dev *dev)
 	struct mlx4_cq_table *cq_table = &mlx4_priv(dev)->cq_table;
 	int err;
 
-	spin_lock_init(&cq_table->lock);
+	rwlock_init(&cq_table->lock);
 	INIT_RADIX_TREE(&cq_table->tree, GFP_ATOMIC);
 	if (mlx4_is_slave(dev))
 		return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index de10dbb..42e2348 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -642,7 +642,7 @@ struct mlx4_mr_table {
 
 struct mlx4_cq_table {
 	struct mlx4_bitmap	bitmap;
-	spinlock_t		lock;
+	rwlock_t		lock;
 	struct radix_tree_root	tree;
 	struct mlx4_icm_table	table;
 	struct mlx4_icm_table	cmpt_table;