From patchwork Sun Jan 14 22:35:57 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bernard Metzler
X-Patchwork-Id: 10162871
From: Bernard Metzler
To: linux-rdma@vger.kernel.org
Cc: Bernard Metzler
Subject: [PATCH v3 07/13] SoftiWarp application buffer management
Date: Sun, 14 Jan 2018 17:35:57 -0500
X-Mailer: git-send-email 2.13.6
In-Reply-To: <20180114223603.19961-1-bmt@zurich.ibm.com>
References: <20180114223603.19961-1-bmt@zurich.ibm.com>
Message-Id: <20180114223603.19961-8-bmt@zurich.ibm.com>
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

Signed-off-by: Bernard Metzler
---
 drivers/infiniband/sw/siw/siw_mem.c | 243 ++++++++++++++++++++++++++++++++++++
 1 file changed, 243 insertions(+)
 create mode 100644 drivers/infiniband/sw/siw/siw_mem.c

diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
new file mode 100644
index 000000000000..314a0d4caa41
--- /dev/null
+++ b/drivers/infiniband/sw/siw/siw_mem.c
@@ -0,0 +1,243 @@
+/*
+ * Software iWARP device driver
+ *
+ * Authors: Animesh Trivedi
+ *          Bernard Metzler
+ *
+ * Copyright (c) 2008-2017, IBM Corporation
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ *   Redistribution and use in source and binary forms, with or
+ *   without modification, are permitted provided that the following
+ *   conditions are met:
+ *
+ *   - Redistributions of source code must retain the above copyright notice,
+ *     this list of conditions and the following disclaimer.
+ *
+ *   - Redistributions in binary form must reproduce the above copyright
+ *     notice, this list of conditions and the following disclaimer in the
+ *     documentation and/or other materials provided with the distribution.
+ *
+ *   - Neither the name of IBM nor the names of its contributors may be
+ *     used to endorse or promote products derived from this software without
+ *     specific prior written permission.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "siw.h"
+#include "siw_debug.h"
+
+static void siw_umem_update_stats(struct work_struct *work)
+{
+	struct siw_umem *umem = container_of(work, struct siw_umem, work);
+	struct mm_struct *mm_s = umem->mm_s;
+
+	down_write(&mm_s->mmap_sem);
+	mm_s->pinned_vm -= umem->num_pages;
+	up_write(&mm_s->mmap_sem);
+
+	mmput(mm_s);
+
+	kfree(umem->page_chunk);
+	kfree(umem);
+}
+
+static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages)
+{
+	struct page **p = chunk->p;
+
+	while (num_pages--) {
+		put_page(*p);
+		p++;
+	}
+}
+
+void siw_umem_release(struct siw_umem *umem)
+{
+	struct task_struct *task = get_pid_task(umem->pid, PIDTYPE_PID);
+	int i, num_pages = umem->num_pages;
+
+	for (i = 0; num_pages; i++) {
+		int to_free = min_t(int, PAGES_PER_CHUNK, num_pages);
+
+		siw_free_plist(&umem->page_chunk[i], to_free);
+		kfree(umem->page_chunk[i].p);
+		num_pages -= to_free;
+	}
+	put_pid(umem->pid);
+	if (task) {
+		struct mm_struct *mm_s = get_task_mm(task);
+
+		put_task_struct(task);
+		if (mm_s) {
+			if (down_write_trylock(&mm_s->mmap_sem)) {
+				mm_s->pinned_vm -= umem->num_pages;
+				up_write(&mm_s->mmap_sem);
+				mmput(mm_s);
+			} else {
+				/*
+				 * Schedule delayed accounting if
+				 * mm semaphore is not available
+				 */
+				INIT_WORK(&umem->work, siw_umem_update_stats);
+				umem->mm_s = mm_s;
+				schedule_work(&umem->work);
+
+				return;
+			}
+		}
+	}
+	kfree(umem->page_chunk);
+	kfree(umem);
+}
+
+void siw_pbl_free(struct siw_pbl *pbl)
+{
+	kfree(pbl);
+}
+
+/*
+ * Get physical address backed by PBL element. Address is referenced
+ * by linear byte offset into list of variably sized PB elements.
+ * Optionally, provide remaining len within current element, and
+ * current PBL index for later resume at same element.
+ */
+u64 siw_pbl_get_buffer(struct siw_pbl *pbl, u64 off, int *len, int *idx)
+{
+	int i = idx ? *idx : 0;
+
+	while (i < pbl->num_buf) {
+		struct siw_pble *pble = &pbl->pbe[i];
+
+		if (pble->pbl_off + pble->size > off) {
+			u64 pble_off = off - pble->pbl_off;
+
+			if (len)
+				*len = pble->size - pble_off;
+			if (idx)
+				*idx = i;
+
+			return pble->addr + pble_off;
+		}
+		i++;
+	}
+	if (len)
+		*len = 0;
+	return 0;
+}
+
+struct siw_pbl *siw_pbl_alloc(u32 num_buf)
+{
+	struct siw_pbl *pbl;
+	int buf_size = sizeof(*pbl);
+
+	if (num_buf == 0)
+		return ERR_PTR(-EINVAL);
+
+	buf_size += ((num_buf - 1) * sizeof(struct siw_pble));
+
+	pbl = kzalloc(buf_size, GFP_KERNEL);
+	if (!pbl)
+		return ERR_PTR(-ENOMEM);
+
+	pbl->max_buf = num_buf;
+
+	return pbl;
+}
+
+struct siw_umem *siw_umem_get(u64 start, u64 len)
+{
+	struct siw_umem *umem;
+	u64 first_page_va;
+	unsigned long mlock_limit;
+	int num_pages, num_chunks, i, rv = 0;
+
+	if (!can_do_mlock())
+		return ERR_PTR(-EPERM);
+
+	if (!len)
+		return ERR_PTR(-EINVAL);
+
+	first_page_va = start & PAGE_MASK;
+	num_pages = PAGE_ALIGN(start + len - first_page_va) >> PAGE_SHIFT;
+	num_chunks = (num_pages >> CHUNK_SHIFT) + 1;
+
+	umem = kzalloc(sizeof(*umem), GFP_KERNEL);
+	if (!umem)
+		return ERR_PTR(-ENOMEM);
+
+	umem->pid = get_task_pid(current, PIDTYPE_PID);
+
+	down_write(&current->mm->mmap_sem);
+
+	mlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+
+	if (num_pages + current->mm->pinned_vm > mlock_limit) {
+		rv = -ENOMEM;
+		goto out;
+	}
+	umem->fp_addr = first_page_va;
+
+	umem->page_chunk = kcalloc(num_chunks, sizeof(struct siw_page_chunk),
+				   GFP_KERNEL);
+	if (!umem->page_chunk) {
+		rv = -ENOMEM;
+		goto out;
+	}
+	for (i = 0; num_pages; i++) {
+		int got, nents = min_t(int, num_pages, PAGES_PER_CHUNK);
+
+		umem->page_chunk[i].p = kcalloc(nents, sizeof(struct page *),
+						GFP_KERNEL);
+		if (!umem->page_chunk[i].p) {
+			rv = -ENOMEM;
+			goto out;
+		}
+		got = 0;
+		while (nents) {
+			struct page **plist = &umem->page_chunk[i].p[got];
+
+			rv = get_user_pages(first_page_va, nents, FOLL_WRITE,
+					    plist, NULL);
+			if (rv < 0)
+				goto out;
+
+			umem->num_pages += rv;
+			current->mm->pinned_vm += rv;
+			first_page_va += rv * PAGE_SIZE;
+			nents -= rv;
+			got += rv;
+		}
+		num_pages -= got;
+	}
+out:
+	up_write(&current->mm->mmap_sem);
+
+	if (rv > 0)
+		return umem;
+
+	siw_umem_release(umem);
+
+	return ERR_PTR(rv);
+}
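
Illustration only, not part of the patch: the offset-to-address walk in
siw_pbl_get_buffer() can be tried in user space with a small stand-in.
The struct layouts below are assumptions. The real definitions live in
siw.h, which this patch does not show, so demo_pble, demo_pbl and
demo_pbl_get_buffer are hypothetical names that only mirror the fields
the function touches (addr, size, pbl_off, num_buf, pbe[]).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for the driver's siw.h types. */
struct demo_pble {
	uint64_t addr;     /* start address of this element */
	uint64_t size;     /* element length in bytes */
	uint64_t pbl_off;  /* byte offset of this element within the list */
};

struct demo_pbl {
	unsigned int num_buf;
	struct demo_pble pbe[3];
};

/* Same walk as siw_pbl_get_buffer(): find the element covering 'off'. */
static uint64_t demo_pbl_get_buffer(const struct demo_pbl *pbl, uint64_t off,
				    int *len, int *idx)
{
	unsigned int i = idx ? (unsigned int)*idx : 0;

	assert(pbl != NULL);

	while (i < pbl->num_buf) {
		const struct demo_pble *pble = &pbl->pbe[i];

		if (pble->pbl_off + pble->size > off) {
			uint64_t pble_off = off - pble->pbl_off;

			if (len)	/* bytes left in this element */
				*len = (int)(pble->size - pble_off);
			if (idx)	/* remember where to resume */
				*idx = (int)i;
			return pble->addr + pble_off;
		}
		i++;
	}
	if (len)
		*len = 0;
	return 0;	/* offset lies beyond the last element */
}

/* Three elements of 4096, 8192 and 4096 bytes at unrelated addresses. */
static const struct demo_pbl demo = {
	.num_buf = 3,
	.pbe = {
		{ .addr = 0x10000, .size = 4096, .pbl_off = 0 },
		{ .addr = 0x40000, .size = 8192, .pbl_off = 4096 },
		{ .addr = 0x90000, .size = 4096, .pbl_off = 12288 },
	},
};
```

With this table, offset 5000 falls 904 bytes into the second element, so
the lookup yields 0x40000 + 904 with 7288 bytes remaining and idx set to
1. An offset of 16384 is past the end of the list and yields 0 with a
remaining length of 0. Because the walk starts at *idx when idx is
given, a caller that reuses idx across calls presumably has to ask for
non-decreasing offsets.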