From patchwork Fri Oct 19 23:34:06 2018
X-Patchwork-Submitter: "Saleem, Shiraz"
X-Patchwork-Id: 10650225
From: Shiraz Saleem
To: dledford@redhat.com, jgg@ziepe.ca
Cc: linux-rdma@vger.kernel.org, Shiraz Saleem
Subject: [PATCH RFC 1/4] RDMA/umem: Minimize SG table entries
Date: Fri, 19 Oct 2018 18:34:06 -0500
Message-Id: <20181019233409.1104-2-shiraz.saleem@intel.com>
X-Mailer: git-send-email 2.8.3
In-Reply-To: <20181019233409.1104-1-shiraz.saleem@intel.com>
References: <20181019233409.1104-1-shiraz.saleem@intel.com>
X-Mailing-List: linux-rdma@vger.kernel.org

Squash contiguous regions of PAGE_SIZE pages into a single SG entry
as opposed to one SG entry per page. This reduces the SG table size
and is friendliest to the IOMMU.

Suggested-by: Jason Gunthorpe
Reviewed-by: Michael J. Ruhl
Signed-off-by: Shiraz Saleem
---
 drivers/infiniband/core/umem.c | 66 ++++++++++++++++++++----------------------
 1 file changed, 31 insertions(+), 35 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c6144df..486d6d7 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -39,6 +39,7 @@
 #include <linux/export.h>
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
+#include <linux/pagemap.h>
 #include <rdma/ib_umem_odp.h>
 
 #include "uverbs.h"
@@ -46,18 +47,16 @@
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
-        struct scatterlist *sg;
+        struct sg_page_iter sg_iter;
         struct page *page;
-        int i;
 
         if (umem->nmap > 0)
                 ib_dma_unmap_sg(dev, umem->sg_head.sgl,
-                                umem->npages,
+                                umem->sg_head.orig_nents,
                                 DMA_BIDIRECTIONAL);
 
-        for_each_sg(umem->sg_head.sgl, sg, umem->npages, i) {
-
-                page = sg_page(sg);
+        for_each_sg_page(umem->sg_head.sgl, &sg_iter, umem->sg_head.orig_nents, 0) {
+                page = sg_page_iter_page(&sg_iter);
                 if (!PageDirty(page) && umem->writable && dirty)
                         set_page_dirty_lock(page);
                 put_page(page);
@@ -92,7 +91,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
         int ret;
         int i;
         unsigned long dma_attrs = 0;
-        struct scatterlist *sg, *sg_list_start;
         unsigned int gup_flags = FOLL_WRITE;
 
         if (dmasync)
@@ -138,7 +136,13 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
         /* We assume the memory is from hugetlb until proved otherwise */
         umem->hugetlb = 1;
 
-        page_list = (struct page **) __get_free_page(GFP_KERNEL);
+        npages = ib_umem_num_pages(umem);
+        if (npages == 0 || npages > UINT_MAX) {
+                ret = -EINVAL;
+                goto umem_kfree;
+        }
+
+        page_list = kmalloc_array(npages, sizeof(*page_list), GFP_KERNEL);
         if (!page_list) {
                 ret = -ENOMEM;
                 goto umem_kfree;
@@ -152,12 +156,6 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
         if (!vma_list)
                 umem->hugetlb = 0;
 
-        npages = ib_umem_num_pages(umem);
-        if (npages == 0 || npages > UINT_MAX) {
-                ret = -EINVAL;
-                goto out;
-        }
-
         lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
 
         down_write(&mm->mmap_sem);
@@ -172,50 +170,48 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 
         cur_base = addr & PAGE_MASK;
 
-        ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL);
-        if (ret)
-                goto vma;
-
         if (!umem->writable)
                 gup_flags |= FOLL_FORCE;
 
-        sg_list_start = umem->sg_head.sgl;
-
         while (npages) {
                 down_read(&mm->mmap_sem);
                 ret = get_user_pages_longterm(cur_base,
                                      min_t(unsigned long, npages,
                                            PAGE_SIZE / sizeof (struct page *)),
-                                     gup_flags, page_list, vma_list);
+                                     gup_flags, page_list + umem->npages, vma_list);
                 if (ret < 0) {
                         up_read(&mm->mmap_sem);
-                        goto umem_release;
+                        release_pages(page_list, umem->npages);
+                        goto vma;
                 }
 
                 umem->npages += ret;
                 cur_base += ret * PAGE_SIZE;
                 npages -= ret;
 
-                /* Continue to hold the mmap_sem as vma_list access
-                 * needs to be protected.
-                 */
-                for_each_sg(sg_list_start, sg, ret, i) {
+                for(i = 0; i < ret && umem->hugetlb; i++) {
                         if (vma_list && !is_vm_hugetlb_page(vma_list[i]))
                                 umem->hugetlb = 0;
-
-                        sg_set_page(sg, page_list[i], PAGE_SIZE, 0);
                 }
                 up_read(&mm->mmap_sem);
+        }
 
-                /* preparing for next loop */
-                sg_list_start = sg;
+        ret = sg_alloc_table_from_pages(&umem->sg_head,
+                                        page_list,
+                                        umem->npages,
+                                        0,
+                                        umem->npages << PAGE_SHIFT,
+                                        GFP_KERNEL);
+        if (ret) {
+                release_pages(page_list, umem->npages);
+                goto vma;
         }
 
         umem->nmap = ib_dma_map_sg_attrs(context->device,
-                                  umem->sg_head.sgl,
-                                  umem->npages,
-                                  DMA_BIDIRECTIONAL,
-                                  dma_attrs);
+                                         umem->sg_head.sgl,
+                                         umem->sg_head.orig_nents,
+                                         DMA_BIDIRECTIONAL,
+                                         dma_attrs);
 
         if (!umem->nmap) {
                 ret = -ENOMEM;
@@ -234,7 +230,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
 out:
         if (vma_list)
                 free_page((unsigned long) vma_list);
-        free_page((unsigned long) page_list);
+        kfree(page_list);
 umem_kfree:
         if (ret) {
                 mmdrop(umem->owning_mm);
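
For reference, sg_alloc_table_from_pages() walks the supplied page array and
merges runs of physically contiguous pages into single scatterlist entries;
that is what shrinks the table compared to the old one-entry-per-page
sg_set_page() loop. A minimal sketch of the call pattern is below; the wrapper
name and the pages/npages arguments are hypothetical and not part of the patch.

/*
 * Illustrative sketch only (not from this patch): build an SG table from an
 * already-pinned page array. "pages" and "npages" are hypothetical stand-ins.
 */
#include <linux/mm.h>
#include <linux/scatterlist.h>

static int build_umem_sg_table(struct sg_table *sgt, struct page **pages,
                               unsigned int npages)
{
        int ret;

        /*
         * One call replaces the per-page sg_set_page() loop; neighbouring
         * pages that are physically contiguous end up sharing one entry.
         */
        ret = sg_alloc_table_from_pages(sgt, pages, npages,
                                        0, /* offset into the first page */
                                        (unsigned long)npages << PAGE_SHIFT,
                                        GFP_KERNEL);
        if (ret)
                return ret;

        /*
         * sgt->orig_nents is now <= npages; it equals npages only when no
         * two neighbouring pages happened to be physically contiguous.
         */
        return 0;
}

Because the helper needs the complete page array up front, the patch defers
table allocation until after the get_user_pages_longterm() loop finishes and
switches page_list from a single __get_free_page() page to a kmalloc_array()
allocation sized for all of the pinned pages.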