From patchwork Mon Jan 14 09:54:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
X-Patchwork-Id: 10761855
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: akpm@linux-foundation.org, Michal Hocko, Alexey Kardashevskiy,
	David Gibson, Andrea Arcangeli, mpe@ellerman.id.au
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Subject: [PATCH V7 3/4] powerpc/mm/iommu: Allow migration of cma allocated
 pages during mm_iommu_do_alloc
Date: Mon, 14 Jan 2019 15:24:36 +0530
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190114095438.32470-1-aneesh.kumar@linux.ibm.com>
References: <20190114095438.32470-1-aneesh.kumar@linux.ibm.com>
Message-Id: <20190114095438.32470-5-aneesh.kumar@linux.ibm.com>

The current code doesn't do page migration if the allocated page is a
compound page. With HugeTLB migration support, we can end up allocating
hugetlb pages from the CMA region. THP pages can also be allocated from
the CMA region. This patch updates the code to handle compound pages
correctly.

The patch also switches to a single get_user_pages() call with the right
count, instead of one get_user_pages() call per page, which avoids
walking the page tables repeatedly. Since these references are a
long-term pin, switch to get_user_pages_longterm(), which makes sure we
fail correctly if the guest RAM is backed by DAX pages.

The patch also converts the hpas member of mm_iommu_table_group_mem_t to
a union, so that the same storage location can hold struct page
pointers. We cannot convert all the code paths to use struct page *,
because hpas is accessed in real mode, where the struct page * to pfn
conversion cannot be done.
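As an aside for readers following the union trick above, here is a
minimal standalone C sketch of the idea (the types and helpers are mock
stand-ins, not the kernel's): one buffer first holds the struct page
pointers filled in by the pinning call, then is rewritten in place with
physical addresses, so real-mode code never needs a struct page to pfn
conversion.

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

/* Mock stand-ins for kernel types and helpers; illustrative only. */
struct page { uint64_t pfn; };
typedef uint64_t phys_addr_t;

static uint64_t page_to_pfn(struct page *page) { return page->pfn; }

struct mem_region {
	uint64_t entries;
	union {				/* one allocation, two views */
		struct page **hpages;	/* filled by the pinning step */
		phys_addr_t *hpas;	/* rewritten in place afterwards */
	};
};

static void convert_to_phys(struct mem_region *mem)
{
	for (uint64_t i = 0; i < mem->entries; i++) {
		struct page *page = mem->hpages[i];

		/* Overwrite the pointer slot with the physical address;
		 * slot i is never read as a pointer again after this. */
		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
	}
}

int main(void)
{
	struct page pages[2] = { { .pfn = 0x1000 }, { .pfn = 0x2000 } };
	struct page *ptrs[2] = { &pages[0], &pages[1] };
	struct mem_region mem = { .entries = 2, .hpages = ptrs };

	convert_to_phys(&mem);
	for (uint64_t i = 0; i < mem.entries; i++)
		printf("hpas[%llu] = 0x%llx\n", (unsigned long long)i,
		       (unsigned long long)mem.hpas[i]);
	return 0;
}

The in-place rewrite works because pointers and phys_addr_t have the
same size on ppc64, so the i-th slot can be reused once its pointer has
been consumed.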
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/mmu_context_iommu.c | 126 +++++++++-------------------
 1 file changed, 39 insertions(+), 87 deletions(-)

diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index a712a650a8b6..f11a2f15071f 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 
 static DEFINE_MUTEX(mem_list_mutex);
 
@@ -34,8 +35,18 @@ struct mm_iommu_table_group_mem_t {
 	atomic64_t mapped;
 	unsigned int pageshift;
 	u64 ua;			/* userspace address */
-	u64 entries;		/* number of entries in hpas[] */
-	u64 *hpas;		/* vmalloc'ed */
+	u64 entries;		/* number of entries in hpas/hpages[] */
+	/*
+	 * in mm_iommu_get we temporarily use this to store
+	 * struct page address.
+	 *
+	 * We need to convert ua to hpa in real mode. Make it
+	 * simpler by storing physical address.
+	 */
+	union {
+		struct page **hpages;	/* vmalloc'ed */
+		phys_addr_t *hpas;
+	};
 #define MM_IOMMU_TABLE_INVALID_HPA	((uint64_t)-1)
 	u64 dev_hpa;		/* Device memory base address */
 };
@@ -80,64 +91,15 @@ bool mm_iommu_preregistered(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
 
-/*
- * Taken from alloc_migrate_target with changes to remove CMA allocations
- */
-struct page *new_iommu_non_cma_page(struct page *page, unsigned long private)
-{
-	gfp_t gfp_mask = GFP_USER;
-	struct page *new_page;
-
-	if (PageCompound(page))
-		return NULL;
-
-	if (PageHighMem(page))
-		gfp_mask |= __GFP_HIGHMEM;
-
-	/*
-	 * We don't want the allocation to force an OOM if possibe
-	 */
-	new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN);
-	return new_page;
-}
-
-static int mm_iommu_move_page_from_cma(struct page *page)
-{
-	int ret = 0;
-	LIST_HEAD(cma_migrate_pages);
-
-	/* Ignore huge pages for now */
-	if (PageCompound(page))
-		return -EBUSY;
-
-	lru_add_drain();
-	ret = isolate_lru_page(page);
-	if (ret)
-		return ret;
-
-	list_add(&page->lru, &cma_migrate_pages);
-	put_page(page); /* Drop the gup reference */
-
-	ret = migrate_pages(&cma_migrate_pages, new_iommu_non_cma_page,
-			    NULL, 0, MIGRATE_SYNC, MR_CONTIG_RANGE);
-	if (ret) {
-		if (!list_empty(&cma_migrate_pages))
-			putback_movable_pages(&cma_migrate_pages);
-	}
-
-	return 0;
-}
-
 static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
-		unsigned long entries, unsigned long dev_hpa,
-		struct mm_iommu_table_group_mem_t **pmem)
+			      unsigned long entries, unsigned long dev_hpa,
+			      struct mm_iommu_table_group_mem_t **pmem)
 {
 	struct mm_iommu_table_group_mem_t *mem;
-	long i, j, ret = 0, locked_entries = 0;
+	long i, ret = 0, locked_entries = 0;
 	unsigned int pageshift;
 	unsigned long flags;
 	unsigned long cur_ua;
-	struct page *page = NULL;
 
 	mutex_lock(&mem_list_mutex);
 
@@ -187,41 +149,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 		goto unlock_exit;
 	}
 
+	down_read(&mm->mmap_sem);
+	ret = get_user_pages_longterm(ua, entries, FOLL_WRITE, mem->hpages, NULL);
+	up_read(&mm->mmap_sem);
+	if (ret != entries) {
+		/* free the reference taken */
+		for (i = 0; i < ret; i++)
+			put_page(mem->hpages[i]);
+
+		vfree(mem->hpas);
+		kfree(mem);
+		ret = -EFAULT;
+		goto unlock_exit;
+	} else {
+		ret = 0;
+	}
+
+	pageshift = PAGE_SHIFT;
 	for (i = 0; i < entries; ++i) {
+		struct page *page = mem->hpages[i];
+
 		cur_ua = ua + (i << PAGE_SHIFT);
-		if (1 != get_user_pages_fast(cur_ua,
-					1/* pages */, 1/* iswrite */, &page)) {
-			ret = -EFAULT;
-			for (j = 0; j < i; ++j)
-				put_page(pfn_to_page(mem->hpas[j] >>
-						PAGE_SHIFT));
-			vfree(mem->hpas);
-			kfree(mem);
-			goto unlock_exit;
-		}
-		/*
-		 * If we get a page from the CMA zone, since we are going to
-		 * be pinning these entries, we might as well move them out
-		 * of the CMA zone if possible. NOTE: faulting in + migration
-		 * can be expensive. Batching can be considered later
-		 */
-		if (is_migrate_cma_page(page)) {
-			if (mm_iommu_move_page_from_cma(page))
-				goto populate;
-			if (1 != get_user_pages_fast(cur_ua,
-						1/* pages */, 1/* iswrite */,
-						&page)) {
-				ret = -EFAULT;
-				for (j = 0; j < i; ++j)
-					put_page(pfn_to_page(mem->hpas[j] >>
-							PAGE_SHIFT));
-				vfree(mem->hpas);
-				kfree(mem);
-				goto unlock_exit;
-			}
-		}
-populate:
-		pageshift = PAGE_SHIFT;
 		if (mem->pageshift > PAGE_SHIFT && PageCompound(page)) {
 			pte_t *pte;
 			struct page *head = compound_head(page);
@@ -239,6 +187,10 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 			local_irq_restore(flags);
 		}
 		mem->pageshift = min(mem->pageshift, pageshift);
+		/*
+		 * We don't need struct page reference any more, switch
+		 * to physical address.
+		 */
 		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
 	}
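To illustrate the all-or-nothing pinning pattern the new code uses, a
minimal standalone C sketch follows. The types and helper names here
are hypothetical mock stand-ins; the kernel code uses
get_user_pages_longterm(), put_page(), vfree() and kfree().

#include <stdio.h>

/* Mock page with a visible refcount, so the cleanup can be checked. */
struct page { int refcount; };

static struct page fake_pages[4];

/* Hypothetical helper mimicking a short return from
 * get_user_pages_longterm(): only 'avail' of 'want' pages get pinned. */
static long pin_pages(long want, long avail, struct page **out)
{
	long n = want < avail ? want : avail;

	for (long i = 0; i < n; i++) {
		fake_pages[i].refcount++;
		out[i] = &fake_pages[i];
	}
	return n;
}

static void put_page(struct page *p)
{
	p->refcount--;
}

/*
 * All-or-nothing pinning: on a short count, drop every reference that
 * was taken before reporting failure, mirroring the patch's cleanup
 * before it frees the tracking structure.
 */
static int pin_all_or_nothing(long entries, long avail, struct page **hpages)
{
	long got = pin_pages(entries, avail, hpages);

	if (got != entries) {
		for (long i = 0; i < got; i++)
			put_page(hpages[i]);
		return -1;	/* the patch returns -EFAULT here */
	}
	return 0;
}

int main(void)
{
	struct page *hpages[4];

	/* Only 2 of 4 pages can be pinned: expect failure, no leaked refs. */
	int rc = pin_all_or_nothing(4, 2, hpages);

	printf("rc=%d, refcount[0]=%d, refcount[1]=%d\n",
	       rc, fake_pages[0].refcount, fake_pages[1].refcount);
	return 0;
}

A short count still pins the pages that were returned, which is why the
partial put_page() loop is required before bailing out with an error.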