From patchwork Wed Sep 3 17:13:57 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shawn Bohrer
X-Patchwork-Id: 4836171
From: Shawn Bohrer
To: Roland Dreier
Cc: Sean Hefty, hal.rosenstock@gmail.com, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, tomk@rgmadvisors.com, Yishai Hadas,
    Or Gerlitz, Haggai Eran, Shachar Raindel, Christoph Lameter,
    Shawn Bohrer
Subject: [PATCH v3] ib_umem_release should decrement mm->pinned_vm from ib_umem_get
Date: Wed, 3 Sep 2014 12:13:57 -0500
Message-Id: <1409764437-29699-1-git-send-email-shawn.bohrer@gmail.com>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <6B2A6E60C06CCC42AE31809BF572352B010E240634@MTLDAG02.mtl.com>
References: <6B2A6E60C06CCC42AE31809BF572352B010E240634@MTLDAG02.mtl.com>

From: Shawn Bohrer

While debugging an application that received -ENOMEM from ib_reg_mr(), I
found that ib_umem_get() can fail because the pinned_vm count has
wrapped, causing it to always be larger than the lock limit even with
RLIMIT_MEMLOCK set to RLIM_INFINITY.

The wrapping of pinned_vm occurs because the process that calls
ib_reg_mr() has its mm->pinned_vm count incremented, but later a
different process, with a different mm_struct than the one that
allocated the ib_umem struct, ends up releasing it. This decrements the
new process's mm->pinned_vm count past zero, wrapping it around.
I'm not entirely sure what circumstances cause a different process to
release the ib_umem than the one that allocated it, but the kernel stack
trace of the freeing process from my situation looks like the following:

Call Trace:
 [] dump_stack+0x19/0x1b
 [] ib_umem_release+0x1f5/0x200 [ib_core]
 [] mlx4_ib_destroy_qp+0x241/0x440 [mlx4_ib]
 [] ib_destroy_qp+0x12c/0x170 [ib_core]
 [] ib_uverbs_close+0x259/0x4e0 [ib_uverbs]
 [] __fput+0xba/0x240
 [] ____fput+0xe/0x10
 [] task_work_run+0xc4/0xe0
 [] do_notify_resume+0x95/0xa0
 [] int_signal+0x12/0x17

The following patch fixes the issue by storing the pid struct of the
process that calls ib_umem_get() so that ib_umem_release() and/or
ib_umem_account() can properly decrement the pinned_vm count of the
correct mm_struct.

Signed-off-by: Shawn Bohrer
Reviewed-by: Shachar Raindel
---
v3 changes:
* Fix resource leak with put_task_struct()

v2 changes:
* Updated to use get_task_pid to avoid keeping a reference to the mm

 drivers/infiniband/core/umem.c | 19 +++++++++++++------
 include/rdma/ib_umem.h         |  1 +
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index a3a2e9c..df0c4f6 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -105,6 +105,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 	umem->length = size;
 	umem->offset = addr & ~PAGE_MASK;
 	umem->page_size = PAGE_SIZE;
+	umem->pid = get_task_pid(current, PIDTYPE_PID);
 	/*
 	 * We ask for writable memory if any access flags other than
 	 * "remote read" are set.  "Local write" and "remote write"
@@ -198,6 +199,7 @@ out:
 	if (ret < 0) {
 		if (need_release)
 			__ib_umem_release(context->device, umem, 0);
+		put_pid(umem->pid);
 		kfree(umem);
 	} else
 		current->mm->pinned_vm = locked;
@@ -230,15 +232,19 @@ void ib_umem_release(struct ib_umem *umem)
 {
 	struct ib_ucontext *context = umem->context;
 	struct mm_struct *mm;
+	struct task_struct *task;
 	unsigned long diff;

 	__ib_umem_release(umem->context->device, umem, 1);

-	mm = get_task_mm(current);
-	if (!mm) {
-		kfree(umem);
-		return;
-	}
+	task = get_pid_task(umem->pid, PIDTYPE_PID);
+	put_pid(umem->pid);
+	if (!task)
+		goto out;
+	mm = get_task_mm(task);
+	put_task_struct(task);
+	if (!mm)
+		goto out;

 	diff = PAGE_ALIGN(umem->length + umem->offset) >> PAGE_SHIFT;
@@ -262,9 +268,10 @@ void ib_umem_release(struct ib_umem *umem)
 	} else
 		down_write(&mm->mmap_sem);

-	current->mm->pinned_vm -= diff;
+	mm->pinned_vm -= diff;
 	up_write(&mm->mmap_sem);
 	mmput(mm);
+out:
 	kfree(umem);
 }
 EXPORT_SYMBOL(ib_umem_release);

diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 1ea0b65..a2bf41e 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -47,6 +47,7 @@ struct ib_umem {
 	int writable;
 	int hugetlb;
 	struct work_struct work;
+	struct pid *pid;
 	struct mm_struct *mm;
 	unsigned long diff;
 	struct sg_table sg_head;