From patchwork Wed May 6 20:05:39 2015
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 6352281
X-Patchwork-Delegate: dan.j.williams@gmail.com
From: Dan Williams
To: linux-kernel@vger.kernel.org
Date: Wed, 06 May 2015 16:05:39 -0400
Message-ID: <20150506200539.40425.14211.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <20150506200219.40425.74411.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <20150506200219.40425.74411.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.17.1-8-g92dd
Cc: axboe@kernel.dk, riel@redhat.com, linux-nvdimm@lists.01.org, hch@lst.de,
 mgorman@suse.de, linux-fsdevel@vger.kernel.org, akpm@linux-foundation.org,
 mingo@kernel.org
Subject: [Linux-nvdimm] [PATCH v2 08/10] x86: support kmap_atomic_pfn_t() for
 persistent memory

It would be unfortunate if the kmap infrastructure escaped its current
32-bit/HIGHMEM bonds and leaked into 64-bit code.  Instead, if the user
has enabled CONFIG_PMEM_IO, we direct the kmap_atomic_pfn_t()
implementation to scan a list of pre-mapped persistent memory address
ranges inserted by the pmem driver.

The __pfn_t to resource lookup is admittedly an inefficient walk of a
linked list, but there are two mitigating factors:

1/ The number of persistent memory ranges is bounded by the number of
   DIMMs, which is on the order of tens, not hundreds.

2/ The lookup yields the entire range.  If it becomes inefficient to do
   a kmap_atomic_pfn_t() one PAGE_SIZE at a time, the caller can
   amortize a single lookup across all the kmap operations it needs to
   perform in that range.

Signed-off-by: Dan Williams
---
 arch/Kconfig             |  3 +
 arch/x86/Kconfig         |  2 +
 arch/x86/kernel/Makefile |  1
 arch/x86/kernel/kmap.c   | 95 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/block/pmem.c     |  6 +++
 include/linux/highmem.h  | 23 +++++++++++
 6 files changed, 130 insertions(+)
 create mode 100644 arch/x86/kernel/kmap.c

diff --git a/arch/Kconfig b/arch/Kconfig
index f7f800860c00..69d3a3fa21af 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -206,6 +206,9 @@ config HAVE_DMA_CONTIGUOUS
 config HAVE_DMA_PFN
 	bool
 
+config HAVE_KMAP_PFN
+	bool
+
 config GENERIC_SMP_IDLE_THREAD
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1fae5e842423..eddaea839500 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1434,7 +1434,9 @@ config X86_PMEM_LEGACY
 	  Say Y if unsure.
 
 config X86_PMEM_DMA
+	depends on !HIGHMEM
 	def_bool PMEM_IO
+	select HAVE_KMAP_PFN
 	select HAVE_DMA_PFN
 
 config HIGHPTE
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9bcd0b56ca17..44c323342996 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -96,6 +96,7 @@ obj-$(CONFIG_PARAVIRT)		+= paravirt.o paravirt_patch_$(BITS).o
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= paravirt-spinlocks.o
 obj-$(CONFIG_PARAVIRT_CLOCK)	+= pvclock.o
 obj-$(CONFIG_X86_PMEM_LEGACY)	+= pmem.o
+obj-$(CONFIG_X86_PMEM_DMA)	+= kmap.o
 
 obj-$(CONFIG_PCSPKR_PLATFORM)	+= pcspeaker.o
 
diff --git a/arch/x86/kernel/kmap.c b/arch/x86/kernel/kmap.c
new file mode 100644
index 000000000000..d597c475377b
--- /dev/null
+++ b/arch/x86/kernel/kmap.c
@@ -0,0 +1,95 @@
+/*
+ * Copyright(c) 2015 Intel Corporation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+
+static LIST_HEAD(ranges);
+
+struct kmap {
+	struct list_head list;
+	struct resource *res;
+	struct device *dev;
+	void *base;
+};
+
+static void teardown_kmap(void *data)
+{
+	struct kmap *kmap = data;
+
+	dev_dbg(kmap->dev, "kmap unregister %pr\n", kmap->res);
+	list_del_rcu(&kmap->list);
+	synchronize_rcu();
+	kfree(kmap);
+}
+
+int devm_register_kmap_pfn_range(struct device *dev, struct resource *res,
+		void *base)
+{
+	struct kmap *kmap = kzalloc(sizeof(*kmap), GFP_KERNEL);
+	int rc;
+
+	if (!kmap)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&kmap->list);
+	kmap->res = res;
+	kmap->base = base;
+	kmap->dev = dev;
+	rc = devm_add_action(dev, teardown_kmap, kmap);
+	if (rc) {
+		kfree(kmap);
+		return rc;
+	}
+	dev_dbg(kmap->dev, "kmap register %pr\n", kmap->res);
+	list_add_rcu(&kmap->list, &ranges);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devm_register_kmap_pfn_range);
+
+void *kmap_atomic_pfn_t(__pfn_t pfn)
+{
+	struct page *page = __pfn_t_to_page(pfn);
+	resource_size_t addr;
+	struct kmap *kmap;
+
+	if (page)
+		return kmap_atomic(page);
+	addr = __pfn_t_to_phys(pfn);
+	rcu_read_lock();
+	list_for_each_entry_rcu(kmap, &ranges, list)
+		if (addr >= kmap->res->start && addr <= kmap->res->end)
+			return kmap->base + addr - kmap->res->start;
+
+	/* only unlock in the error case */
+	rcu_read_unlock();
+	return NULL;
+}
+EXPORT_SYMBOL(kmap_atomic_pfn_t);
+
+void kunmap_atomic_pfn_t(void *addr)
+{
+	rcu_read_unlock();
+
+	/*
+	 * If the original __pfn_t had an entry in the memmap then
+	 * 'addr' will be outside of vmalloc space i.e. it came from
+	 * page_address()
+	 */
+	if (!is_vmalloc_addr(addr))
+		kunmap_atomic(addr);
+}
+EXPORT_SYMBOL(kunmap_atomic_pfn_t);
diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
index 41bb424533e6..2a847651f8de 100644
--- a/drivers/block/pmem.c
+++ b/drivers/block/pmem.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include
 
 #define PMEM_MINORS		16
 
@@ -147,6 +148,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
 	if (!pmem->virt_addr)
 		goto out_release_region;
 
+	err = devm_register_kmap_pfn_range(dev, res, pmem->virt_addr);
+	if (err)
+		goto out_unmap;
+
+	err = -ENOMEM;
 	pmem->pmem_queue = blk_alloc_queue(GFP_KERNEL);
 	if (!pmem->pmem_queue)
 		goto out_unmap;
diff --git a/include/linux/highmem.h b/include/linux/highmem.h
index 9286a46b7d69..85fd52d43a9a 100644
--- a/include/linux/highmem.h
+++ b/include/linux/highmem.h
@@ -83,6 +83,29 @@ static inline void __kunmap_atomic(void *addr)
 
 #endif /* CONFIG_HIGHMEM */
 
+#ifdef CONFIG_HAVE_KMAP_PFN
+extern void *kmap_atomic_pfn_t(__pfn_t pfn);
+extern void kunmap_atomic_pfn_t(void *addr);
+extern int devm_register_kmap_pfn_range(struct device *dev,
+		struct resource *res, void *base);
+#else
+static inline void *kmap_atomic_pfn_t(__pfn_t pfn)
+{
+	return kmap_atomic(__pfn_t_to_page(pfn));
+}
+
+static inline void kunmap_atomic_pfn_t(void *addr)
+{
+	__kunmap_atomic(addr);
+}
+
+static inline int devm_register_kmap_pfn_range(struct device *dev,
+		struct resource *res, void *base)
+{
+	return 0;
+}
+#endif /* CONFIG_HAVE_KMAP_PFN */
+
 #if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_32)
 
 DECLARE_PER_CPU(int, __kmap_atomic_idx);
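
Illustration of mitigating factor 2/ from the changelog. This is a
user-space sketch, not kernel code and not part of the patch: struct
pmem_range, range_lookup(), range_translate(), fill_pages() and the
4096-byte PAGE_SIZE are all hypothetical names and values chosen for the
example. It only models the idea that one list walk can serve every
page that falls inside the range it found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u	/* assumed page size, for the sketch only */

/* Hypothetical stand-in for the patch's struct kmap plus its resource. */
struct pmem_range {
	uint64_t start;			/* first physical address, inclusive */
	uint64_t end;			/* last physical address, inclusive */
	char *base;			/* where the range is already mapped */
	struct pmem_range *next;	/* singly linked list of ranges */
};

/* Linear walk using the same inclusive bounds check as the patch. */
static const struct pmem_range *range_lookup(const struct pmem_range *head,
		uint64_t addr)
{
	for (; head; head = head->next)
		if (addr >= head->start && addr <= head->end)
			return head;
	return NULL;
}

/* Translate with a range that was already found; no list walk here. */
static char *range_translate(const struct pmem_range *r, uint64_t addr)
{
	assert(r != NULL);
	return r->base + (addr - r->start);
}

/*
 * Write 'value' to the first byte of up to 'npages' pages starting at
 * 'addr'.  The list is only walked when 'addr' leaves the cached range,
 * so a run of pages inside one range costs a single lookup.  Returns
 * the number of pages written; '*lookups' counts the list walks.
 */
static size_t fill_pages(const struct pmem_range *head, uint64_t addr,
		size_t npages, char value, size_t *lookups)
{
	const struct pmem_range *r = NULL;
	size_t done = 0;

	*lookups = 0;
	while (done < npages) {
		if (!r || addr < r->start || addr > r->end) {
			r = range_lookup(head, addr);
			(*lookups)++;
			if (!r)
				break;	/* address not covered by any range */
		}
		range_translate(r, addr)[0] = value;
		addr += PAGE_SIZE;
		done++;
	}
	return done;
}
```

With a four-page range registered, filling all four pages through
fill_pages() performs one list walk instead of four. How a real caller
would obtain and hold on to the range is not specified by this patch, so
the caching shown here is an assumption.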