From patchwork Tue Sep 6 06:27:31 2016
X-Patchwork-Submitter: Education Directorate
X-Patchwork-Id: 9315769
Subject: [PATCH v3] KVM: PPC: Book3S HV: Migrate pinned pages out of CMA
To: Alexey Kardashevskiy, linuxppc-dev@lists.ozlabs.org,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
References: <20160714042536.GG18277@balbir.ozlabs.ibm.com>
	<3ba0fa6c-bfe6-a395-9c32-db8d6261559d@ozlabs.ru>
Cc: "linux-mm@kvack.org", Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman
From: Balbir Singh
Message-ID: <2e840fe0-40cf-abf0-4fe6-a621ce46ae13@gmail.com>
Date: Tue, 6 Sep 2016 16:27:31 +1000
X-Mailing-List: kvm@vger.kernel.org

When PCI device pass-through is enabled via VFIO, KVM-PPC pins pages
using get_user_pages_fast(). One downside of the pinning is that the
page could be in the CMA region, which is also used for other
allocations such as the hash page table. Ideally we want the pinned
pages to come from a non-CMA region.

This patch (currently only for KVM PPC with VFIO) forcefully migrates
such pages out of CMA (huge pages are omitted for the moment). There
are more efficient ways of doing this, but they would be more elaborate
and would impact a larger audience beyond just the KVM PPC
implementation.

The magic is in new_iommu_non_cma_page(), which allocates the
replacement page from a non-CMA region.

I've tested the patches lightly at my end. The full solution requires
migration of THP pages in the CMA region; that work will be done
incrementally on top of this.

Previous discussion was at
http://permalink.gmane.org/gmane.linux.kernel.mm/136738

Cc: Benjamin Herrenschmidt
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Alexey Kardashevskiy
Signed-off-by: Balbir Singh
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/include/asm/mmu_context.h |  1 +
 arch/powerpc/mm/mmu_context_iommu.c    | 81 ++++++++++++++++++++++++++++++++--
 2 files changed, 78 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 9d2cd0c..475d1be 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -18,6 +18,7 @@ extern void destroy_context(struct mm_struct *mm);
 
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 struct mm_iommu_table_group_mem_t;
+extern int isolate_lru_page(struct page *page);	/* from internal.h */
 extern bool mm_iommu_preregistered(void);
 extern long mm_iommu_get(unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem);
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index da6a216..e0f1c33 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -15,6 +15,9 @@
 #include <linux/rculist.h>
 #include <linux/vmalloc.h>
 #include <linux/mutex.h>
+#include <linux/migrate.h>
+#include <linux/hugetlb.h>
+#include <linux/swap.h>
 #include <asm/mmu_context.h>
 
 static DEFINE_MUTEX(mem_list_mutex);
@@ -72,6 +75,55 @@ bool mm_iommu_preregistered(void)
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
 
+/*
+ * Taken from alloc_migrate_target with changes to remove CMA allocations
+ */
+struct page *new_iommu_non_cma_page(struct page *page, unsigned long private,
+					int **resultp)
+{
+	gfp_t gfp_mask = GFP_USER;
+	struct page *new_page;
+
+	if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))
+		return NULL;
+
+	if (PageHighMem(page))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	/*
+	 * We don't want the allocation to force an OOM if possible
+	 */
+	new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN);
+	return new_page;
+}
+
+static int mm_iommu_move_page_from_cma(struct page *page)
+{
+	int ret = 0;
+	LIST_HEAD(cma_migrate_pages);
+
+	/* Ignore huge pages for now */
+	if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))
+		return -EBUSY;
+
+	lru_add_drain();
+	ret = isolate_lru_page(page);
+	if (ret)
+		return ret;
+
+	list_add(&page->lru, &cma_migrate_pages);
+	put_page(page); /* Drop the gup reference */
+
+	ret = migrate_pages(&cma_migrate_pages, new_iommu_non_cma_page,
+				NULL, 0, MIGRATE_SYNC, MR_CMA);
+	if (ret) {
+		if (!list_empty(&cma_migrate_pages))
+			putback_movable_pages(&cma_migrate_pages);
+	}
+
+	return 0;
+}
+
 long mm_iommu_get(unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem)
 {
@@ -124,15 +176,36 @@ long mm_iommu_get(unsigned long ua, unsigned long entries,
 	for (i = 0; i < entries; ++i) {
 		if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT),
 					1/* pages */, 1/* iswrite */, &page)) {
+			ret = -EFAULT;
 			for (j = 0; j < i; ++j)
-				put_page(pfn_to_page(
-						mem->hpas[j] >> PAGE_SHIFT));
+				put_page(pfn_to_page(mem->hpas[j] >>
+						PAGE_SHIFT));
 			vfree(mem->hpas);
 			kfree(mem);
-			ret = -EFAULT;
 			goto unlock_exit;
 		}
-
+		/*
+		 * If we get a page from the CMA zone, since we are going to
+		 * be pinning these entries, we might as well move them out
+		 * of the CMA zone if possible. NOTE: faulting in + migration
+		 * can be expensive. Batching can be considered later
+		 */
+		if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
+			if (mm_iommu_move_page_from_cma(page))
+				goto populate;
+			if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT),
+						1/* pages */, 1/* iswrite */,
+						&page)) {
+				ret = -EFAULT;
+				for (j = 0; j < i; ++j)
+					put_page(pfn_to_page(mem->hpas[j] >>
+							PAGE_SHIFT));
+				vfree(mem->hpas);
+				kfree(mem);
+				goto unlock_exit;
+			}
+		}
+populate:
 		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
 	}
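
For reviewers skimming the diff, below is a condensed sketch of the control
flow the patch adds to the pinning loop in mm_iommu_get(). It is illustrative
only and not part of the patch: the helper name pin_page_avoiding_cma() is
invented for the sketch, mm_iommu_move_page_from_cma() is static in the patch
and is simply assumed visible here, and the MM APIs used
(get_user_pages_fast(), get_pageblock_migratetype(), MIGRATE_CMA) are the same
ones the diff itself relies on.

/*
 * Illustrative sketch only -- not part of the patch. It condenses the flow
 * the patch adds to mm_iommu_get() into one hypothetical helper.
 */
#include <linux/mm.h>
#include <linux/mmzone.h>

/* Added by the patch; static there, assumed visible for this sketch. */
extern int mm_iommu_move_page_from_cma(struct page *page);

/*
 * Pin the page backing user address @ua. If the pinned page sits in a
 * CMA pageblock, try to migrate it out and re-pin the replacement.
 */
static int pin_page_avoiding_cma(unsigned long ua, struct page **pagep)
{
	struct page *page;

	if (get_user_pages_fast(ua, 1 /* pages */, 1 /* iswrite */, &page) != 1)
		return -EFAULT;

	if (get_pageblock_migratetype(page) != MIGRATE_CMA) {
		*pagep = page;		/* already outside CMA, keep it */
		return 0;
	}

	if (mm_iommu_move_page_from_cma(page)) {
		/*
		 * The page could not even be isolated, so the gup reference
		 * is still held; keep the CMA page, which is what the
		 * patch's "goto populate" path does.
		 */
		*pagep = page;
		return 0;
	}

	/*
	 * The helper dropped the gup reference and, if migration succeeded,
	 * the user address is now backed by a page allocated via
	 * new_iommu_non_cma_page(), i.e. from a non-CMA region. Fault it in
	 * and pin it again.
	 */
	if (get_user_pages_fast(ua, 1 /* pages */, 1 /* iswrite */, &page) != 1)
		return -EFAULT;

	*pagep = page;
	return 0;
}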