From patchwork Thu Sep 24 02:10:54 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wengang Wang
X-Patchwork-Id: 7253251
From: Wengang Wang
To: linux-rdma@vger.kernel.org
Cc: wen.gang.wang@oracle.com
Subject: [PATCH] mlx4: vmalloc for mlx4_ib_wq.wrid and mlx4_ib_srq.wrid
Date: Thu, 24 Sep 2015 10:10:54 +0800
Message-Id: <1443060654-10402-1-git-send-email-wen.gang.wang@oracle.com>
X-Mailer: git-send-email 2.1.0
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Use __vmalloc to allocate memory for mlx4_ib_wq.wrid and mlx4_ib_srq.wrid.

The kmalloc for wrid has failed several times with a call trace like the
following:

kworker/u:4: page allocation failure: order:4, mode:0x2000d0
Pid: 16388, comm: kworker/u:4 Not tainted
Call Trace:
 [] warn_alloc_failed+0xf3/0x160
 [] ? __alloc_pages_direct_compact+0x1fa/0x200
 [] __alloc_pages_slowpath+0x4a6/0x7b0
 [] __alloc_pages_nodemask+0x2fb/0x320
 [] kmem_getpages+0x67/0x1c0
 [] fallback_alloc+0x187/0x250
 [] ____cache_alloc_node+0x9a/0x150
 [] __kmalloc+0x18b/0x340
 [] ? create_qp_common+0x431/0x8e0 [mlx4_ib]
 [] create_qp_common+0x431/0x8e0 [mlx4_ib]
 [] ? kzalloc.clone.1+0xe/0x10 [mlx4_ib]
 [] mlx4_ib_create_qp+0x207/0x310 [mlx4_ib]
 [] ib_create_qp+0x41/0x1c0 [ib_core]
 [] ipoib_cm_create_tx_qp+0xc8/0x130 [ib_ipoib]
 [] ? __vmalloc_node+0x35/0x40
 [] ipoib_cm_tx_init+0x65/0x380 [ib_ipoib]
 [] ? sched_clock_cpu+0xcd/0x110
 [] ? xen_mc_flush+0xb0/0x1b0
 [] ipoib_cm_tx_start+0x230/0x3d0 [ib_ipoib]
 [] process_one_work+0x180/0x420
 [] worker_thread+0x12e/0x390
 [] ? manage_workers+0x180/0x180
 [] kthread+0xce/0xe0
 [] ? xen_end_context_switch+0x1e/0x30
 [] ? kthread_freezable_should_stop+0x70/0x70
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_freezable_should_stop+0x70/0x70

The allocation needs 16 contiguous pages and failed.
At that time there was actually more than 100MB of free memory, but it was
fragmented:

Node 0 Normal: 10268*4kB (UM) 7443*8kB (UEM) 1647*16kB (UM) 35*32kB (UR) 1*64kB (R) 4*128kB (R) 1*256kB (R) 0*512kB 1*1024kB (R) 0*2048kB 0*4096kB = 129944kB

I also hit the same failure at order 3.

__vmalloc builds the buffer from single pages, so it does not depend on
physically contiguous free memory.

Signed-off-by: Wengang Wang
---
 drivers/infiniband/hw/mlx4/qp.c  | 15 +++++++++------
 drivers/infiniband/hw/mlx4/srq.c |  6 ++++--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 4ad9be3..754ceb9 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -786,8 +787,10 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 		if (err)
 			goto err_mtt;
 
-		qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof (u64), gfp);
-		qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof (u64), gfp);
+		qp->sq.wrid = __vmalloc(qp->sq.wqe_cnt * sizeof(u64), gfp,
+					PAGE_KERNEL);
+		qp->rq.wrid = __vmalloc(qp->rq.wqe_cnt * sizeof(u64), gfp,
+					PAGE_KERNEL);
 		if (!qp->sq.wrid || !qp->rq.wrid) {
 			err = -ENOMEM;
 			goto err_wrid;
@@ -874,8 +877,8 @@ err_wrid:
 		if (qp_has_rq(init_attr))
 			mlx4_ib_db_unmap_user(to_mucontext(pd->uobject->context), &qp->db);
 	} else {
-		kfree(qp->sq.wrid);
-		kfree(qp->rq.wrid);
+		vfree(qp->sq.wrid);
+		vfree(qp->rq.wrid);
 	}
 
 err_mtt:
@@ -1050,8 +1053,8 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
 					      &qp->db);
 		ib_umem_release(qp->umem);
 	} else {
-		kfree(qp->sq.wrid);
-		kfree(qp->rq.wrid);
+		vfree(qp->sq.wrid);
+		vfree(qp->rq.wrid);
 		if (qp->mlx4_ib_qp_type & (MLX4_IB_QPT_PROXY_SMI_OWNER |
 		    MLX4_IB_QPT_PROXY_SMI | MLX4_IB_QPT_PROXY_GSI))
 			free_proxy_bufs(&dev->ib_dev, qp);
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index dce5dfe..6d21bb2 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include "mlx4_ib.h"
 #include "user.h"
@@ -170,7 +171,8 @@ struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd,
 		if (err)
 			goto err_mtt;
 
-		srq->wrid = kmalloc(srq->msrq.max * sizeof (u64), GFP_KERNEL);
+		srq->wrid = __vmalloc(srq->msrq.max * sizeof(u64), GFP_KERNEL,
+				      PAGE_KERNEL);
 		if (!srq->wrid) {
 			err = -ENOMEM;
 			goto err_mtt;
@@ -204,7 +206,7 @@ err_wrid:
 	if (pd->uobject)
 		mlx4_ib_db_unmap_user(to_mucontext(pd->uobject->context), &srq->db);
 	else
-		kfree(srq->wrid);
+		vfree(srq->wrid);
 
 err_mtt:
 	mlx4_mtt_cleanup(dev->dev, &srq->mtt);