From patchwork Tue Dec 18 12:15:56 2018
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 10735547
X-Patchwork-Delegate: jgg@ziepe.ca
From: Leon Romanovsky
To: Doug Ledford, Jason Gunthorpe
Cc: Huy Nguyen, RDMA mailing list, Daniel Jurgens, Haggai Eran,
    Leon Romanovsky
Subject: [PATCH rdma-next] IB/mlx5: Fix long EEH recover time with NVMe offloads
Date: Tue, 18 Dec 2018 14:15:56 +0200
Message-Id: <20181218121556.24976-1-leon@kernel.org>

From: Huy Nguyen

On an NVMe offloads connection with many IO queues, EEH takes a long
time to recover. The culprit is the synchronize_srcu() call in
destroy_mkey(), which waits out an SRCU grace period for every mkey
even though only ODP mkeys can be touched by the page fault handlers
that read under dev->mr_srcu. Fix this by calling synchronize_srcu()
only when an ODP mkey is involved.
Fixes: b4cfe447d47b ("IB/mlx5: Implement on demand paging by adding support for MMU notifiers")
Signed-off-by: Huy Nguyen
Reviewed-by: Daniel Jurgens
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/mr.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index d45a9163c198..0f01059724e9 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -73,7 +73,8 @@ static int destroy_mkey(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)

 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 	/* Wait until all page fault handlers using the mr complete. */
-	synchronize_srcu(&dev->mr_srcu);
+	if (mr->umem && mr->umem->is_odp)
+		synchronize_srcu(&dev->mr_srcu);
 #endif

 	return err;
@@ -237,6 +238,9 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 {
 	struct mlx5_mr_cache *cache = &dev->cache;
 	struct mlx5_cache_ent *ent = &cache->ent[c];
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+	bool odp_mkey_exist = false;
+#endif
 	struct mlx5_ib_mr *tmp_mr;
 	struct mlx5_ib_mr *mr;
 	LIST_HEAD(del_list);
@@ -249,6 +253,10 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 			break;
 		}
 		mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
+#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
+		if (mr->umem && mr->umem->is_odp)
+			odp_mkey_exist = true;
+#endif
 		list_move(&mr->list, &del_list);
 		ent->cur--;
 		ent->size--;
@@ -257,7 +265,8 @@ static void remove_keys(struct mlx5_ib_dev *dev, int c, int num)
 	}

 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	synchronize_srcu(&dev->mr_srcu);
+	if (odp_mkey_exist)
+		synchronize_srcu(&dev->mr_srcu);
 #endif

 	list_for_each_entry_safe(mr, tmp_mr, &del_list, list) {
@@ -572,6 +581,7 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
 {
 	struct mlx5_mr_cache *cache = &dev->cache;
 	struct mlx5_cache_ent *ent = &cache->ent[c];
+	bool odp_mkey_exist = false;
 	struct mlx5_ib_mr *tmp_mr;
 	struct mlx5_ib_mr *mr;
 	LIST_HEAD(del_list);
@@ -584,6 +594,8 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
 			break;
 		}
 		mr = list_first_entry(&ent->head, struct mlx5_ib_mr, list);
+		if (mr->umem && mr->umem->is_odp)
+			odp_mkey_exist = true;
 		list_move(&mr->list, &del_list);
 		ent->cur--;
 		ent->size--;
@@ -592,7 +604,8 @@ static void clean_keys(struct mlx5_ib_dev *dev, int c)
 	}

 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
-	synchronize_srcu(&dev->mr_srcu);
+	if (odp_mkey_exist)
+		synchronize_srcu(&dev->mr_srcu);
 #endif

 	list_for_each_entry_safe(mr, tmp_mr, &del_list, list) {
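
For readers less familiar with the pattern in remove_keys() and
clean_keys(): rather than paying the grace-period wait unconditionally
per sweep, the patch scans the batch once, remembers whether any mkey
was ODP, and waits only in that case. Below is a minimal user-space C
sketch of that idea; fake_mkey, expensive_sync() and destroy_batch()
are hypothetical stand-ins for illustration, not mlx5 code. Only the
odp_mkey_exist flag and the mr->umem && mr->umem->is_odp test come
from the patch itself.

/*
 * Minimal user-space sketch (NOT mlx5 code): scan the batch once,
 * remember whether any entry was ODP, and issue the expensive
 * grace-period wait at most once, and only if actually needed.
 */
#include <stdbool.h>
#include <stdio.h>

struct fake_mkey {
	bool is_odp;	/* stands in for mr->umem && mr->umem->is_odp */
};

/* Stand-in for synchronize_srcu(&dev->mr_srcu): assumed to be costly. */
static void expensive_sync(void)
{
	puts("waiting for SRCU grace period (expensive)");
}

static void destroy_batch(const struct fake_mkey *keys, int n)
{
	bool odp_mkey_exist = false;

	for (int i = 0; i < n; i++) {
		if (keys[i].is_odp)
			odp_mkey_exist = true;
		/* ...move the key to a private delete list... */
	}

	/* One conditional wait for the whole batch, not one per key. */
	if (odp_mkey_exist)
		expensive_sync();

	/* ...now it is safe to free the delete list... */
}

int main(void)
{
	/* An NVMe offload batch: no ODP keys, so no wait at all. */
	struct fake_mkey nvme_keys[64] = { { .is_odp = false } };

	destroy_batch(nvme_keys, 64);
	return 0;
}

The key design point is that remove_keys() and clean_keys() already
batched the wait to once per sweep; the new flag additionally skips it
when the sweep contained no ODP mkeys, which is the common case for
NVMe offload connections with many IO queues and is what shortens EEH
recovery time.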