From patchwork Mon Oct 22 20:13:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652411 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 655BC13BF for ; Mon, 22 Oct 2018 20:18:40 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4E27928EC3 for ; Mon, 22 Oct 2018 20:18:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4214E28ED3; Mon, 22 Oct 2018 20:18:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id DF76E28EC4 for ; Mon, 22 Oct 2018 20:18:37 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 7CB292117B57F; Mon, 22 Oct 2018 13:18:37 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=134.134.136.31; helo=mga06.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 280E82117B555 for ; Mon, 22 Oct 2018 13:18:37 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from 
orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="83268375" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga007.jf.intel.com with ESMTP; 22 Oct 2018 13:18:37 -0700 Subject: [PATCH 1/9] mm/resource: return real error codes from walk failures To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:19 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201319.471D7B85@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP

walk_system_ram_range() can return an error code either because *it*
failed, or because the 'func' that it calls returned an error. The
memory hotplug code does the following:

	ret = walk_system_ram_range(..., func);
	if (ret)
		return ret;

and 'ret' makes it out to userspace, eventually. The problem is that
walk_system_ram_range() failures that result from *it* failing (as
opposed to 'func') return -1. That leads to a very odd -EPERM (-1)
return code out to userspace.

Make walk_system_ram_range() return -EINVAL for internal failures to
keep userspace less confused.

This return code is compatible with all the callers that I audited.
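
As a rough userspace illustration of the problem described above (not
kernel code): the helper names walk_ranges_sketch(), ok_func() and
busy_func() below are hypothetical stand-ins, and the loop is only an
assumption about the general shape of such a walker. It shows that a
default of -1 cannot be told apart from -EPERM once it is read as an
errno value, while errors from the callback pass through unchanged.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a range walker: calls func once per range
 * and returns default_ret if no range was walked at all.
 */
static int walk_ranges_sketch(int nr_ranges, int default_ret,
			      int (*func)(int idx))
{
	int ret = default_ret;
	int i;

	for (i = 0; i < nr_ranges; i++) {
		ret = func(i);
		if (ret)
			break;	/* the callback's own error wins */
	}
	return ret;
}

/* Callback that always succeeds. */
static int ok_func(int idx)
{
	(void)idx;
	return 0;
}

/* Callback that fails on the second range with a real errno value. */
static int busy_func(int idx)
{
	return idx == 1 ? -EBUSY : 0;
}
```

With no ranges to walk, walk_ranges_sketch(0, -1, ok_func) yields -1,
which userspace decodes as EPERM; passing -EINVAL as the default gives
a value that at least describes the failure.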

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/kernel/resource.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1 kernel/resource.c
--- a/kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2018-10-22 13:12:21.000930395 -0700
+++ b/kernel/resource.c	2018-10-22 13:12:21.003930395 -0700
@@ -375,7 +375,7 @@ static int __walk_iomem_res_desc(resourc
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
@@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long
 	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	start = (u64) start_pfn << PAGE_SHIFT;
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;

From patchwork Mon Oct 22 20:13:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652415 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3CA0914BB for ; Mon, 22 Oct 2018 20:18:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 261FC28EC3 for ; Mon, 22 Oct 2018 20:18:41 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1A53428ED4; Mon, 22 Oct 2018 20:18:41 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, 
RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 9AADF28EC3 for ; Mon, 22 Oct 2018 20:18:40 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 91C992117CEB2; Mon, 22 Oct 2018 13:18:40 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=134.134.136.31; helo=mga06.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 434422117B555 for ; Mon, 22 Oct 2018 13:18:39 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="101509365" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by fmsmga001.fm.intel.com with ESMTP; 22 Oct 2018 13:18:38 -0700 Subject: [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:20 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201320.45C9785C@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP

Add the actual driver which will own the DAX range. This allows very
nice parity with the other possible "owners" of a DAX region: device
DAX and filesystem DAX. It also greatly simplifies the process of
handing off control of the memory between the different owners since
it's just a matter of unbinding and rebinding the device to different
drivers.

I tried to do this all internally to the kernel and the locking and
"self-destruction" of the old device context was a nightmare. Having
userspace drive it is a wonderful simplification.

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/drivers/dax/kmem.c |  152 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2018-09-18 12:39:53.059362935 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:21.502930393 -0700
@@ -0,0 +1,152 @@
+// this is just a copy of drivers/dax/pmem.c with
+// s/dax_pmem/dax_kmem/ for now.
+//
+// need real license
+/*
+ * Copyright(c) 2016-2018 Intel Corporation. All rights reserved.
+ */
+#include
+#include
+#include
+#include
+#include "../nvdimm/pfn.h"
+#include "../nvdimm/nd.h"
+#include "device-dax.h"
+
+struct dax_kmem {
+	struct device *dev;
+	struct percpu_ref ref;
+	struct dev_pagemap pgmap;
+	struct completion cmp;
+};
+
+static struct dax_kmem *to_dax_kmem(struct percpu_ref *ref)
+{
+	return container_of(ref, struct dax_kmem, ref);
+}
+
+static void dax_kmem_percpu_release(struct percpu_ref *ref)
+{
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	complete(&dax_kmem->cmp);
+}
+
+static void dax_kmem_percpu_exit(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	wait_for_completion(&dax_kmem->cmp);
+	percpu_ref_exit(ref);
+}
+
+static void dax_kmem_percpu_kill(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	percpu_ref_kill(ref);
+}
+
+static int dax_kmem_probe(struct device *dev)
+{
+	void *addr;
+	struct resource res;
+	int rc, id, region_id;
+	struct nd_pfn_sb *pfn_sb;
+	struct dev_dax *dev_dax;
+	struct dax_kmem *dax_kmem;
+	struct nd_namespace_io *nsio;
+	struct dax_region *dax_region;
+	struct nd_namespace_common *ndns;
+	struct nd_dax *nd_dax = to_nd_dax(dev);
+	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
+
+	ndns = nvdimm_namespace_common_probe(dev);
+	if (IS_ERR(ndns))
+		return PTR_ERR(ndns);
+	nsio = to_nd_namespace_io(&ndns->dev);
+
+	dax_kmem = devm_kzalloc(dev, sizeof(*dax_kmem), GFP_KERNEL);
+	if (!dax_kmem)
+		return -ENOMEM;
+
+	/* parse the 'pfn' info block via ->rw_bytes */
+	rc = devm_nsio_enable(dev, nsio);
+	if (rc)
+		return rc;
+	rc = nvdimm_setup_pfn(nd_pfn, &dax_kmem->pgmap);
+	if (rc)
+		return rc;
+	devm_nsio_disable(dev, nsio);
+
+	pfn_sb = nd_pfn->pfn_sb;
+
+	if (!devm_request_mem_region(dev, nsio->res.start,
+				resource_size(&nsio->res),
+				dev_name(&ndns->dev))) {
+		dev_warn(dev, "could not reserve region %pR\n", &nsio->res);
+		return -EBUSY;
+	}
+
+	dax_kmem->dev = dev;
+	init_completion(&dax_kmem->cmp);
+	rc = percpu_ref_init(&dax_kmem->ref, dax_kmem_percpu_release, 0,
+			GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_exit,
+			&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	dax_kmem->pgmap.ref = &dax_kmem->ref;
+	addr = devm_memremap_pages(dev, &dax_kmem->pgmap);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_kill,
+			&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	/* adjust the dax_region resource to the start of data */
+	memcpy(&res, &dax_kmem->pgmap.res, sizeof(res));
+	res.start += le64_to_cpu(pfn_sb->dataoff);
+
+	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
+	if (rc != 2)
+		return -EINVAL;
+
+	dax_region = alloc_dax_region(dev, region_id, &res,
+			le32_to_cpu(pfn_sb->align), addr, PFN_DEV|PFN_MAP);
+	if (!dax_region)
+		return -ENOMEM;
+
+	/* TODO: support for subdividing a dax region... */
+	dev_dax = devm_create_dev_dax(dax_region, id, &res, 1);
+
+	/* child dev_dax instances now own the lifetime of the dax_region */
+	dax_region_put(dax_region);
+
+	return PTR_ERR_OR_ZERO(dev_dax);
+}
+
+static struct nd_device_driver dax_kmem_driver = {
+	.probe = dax_kmem_probe,
+	.drv = {
+		.name = "dax_kmem",
+	},
+	.type = ND_DRIVER_DAX_PMEM,
+};
+
+module_nd_driver(dax_kmem_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);

From patchwork Mon Oct 22 20:13:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652419 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 488F813BF for ; Mon, 22 Oct 2018 20:18:44 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3213528EC3 for ; Mon, 22 Oct 2018 20:18:43 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2632F28ED4; Mon, 22 Oct 2018 20:18:43 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id B551A28EC4 for ; Mon, 22 Oct 2018 20:18:42 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id AB2EA2117CEB5; Mon, 22 Oct 2018 13:18:42 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: Pass (sender SPF authorized) 
identity=helo; client-ip=192.55.52.136; helo=mga12.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 600DD2117CEAF for ; Mon, 22 Oct 2018 13:18:41 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="80062767" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by fmsmga007.fm.intel.com with ESMTP; 22 Oct 2018 13:18:40 -0700 Subject: [PATCH 3/9] dax: add more kmem device infrastructure To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:22 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201322.6C8A7B2A@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP The previous patch is a simple copy of the pmem driver. This makes it easy while this is in development to keep the pmem and kmem code in sync. This actually adds some necessary infrastructure for the new driver to compile. 

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/drivers/dax/kmem.c         |   10 +++++-----
 b/include/uapi/linux/ndctl.h |    2 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff -puN drivers/dax/kmem.c~dax-kmem-try-again-2018-2-header drivers/dax/kmem.c
--- a/drivers/dax/kmem.c~dax-kmem-try-again-2018-2-header	2018-10-22 13:12:22.000930392 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:22.005930392 -0700
@@ -27,7 +27,7 @@ static struct dax_kmem *to_dax_kmem(stru
 
 static void dax_kmem_percpu_release(struct percpu_ref *ref)
 {
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	complete(&dax_kmem->cmp);
@@ -36,7 +36,7 @@ static void dax_kmem_percpu_release(stru
 static void dax_kmem_percpu_exit(void *data)
 {
 	struct percpu_ref *ref = data;
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	wait_for_completion(&dax_kmem->cmp);
@@ -46,7 +46,7 @@ static void dax_kmem_percpu_exit(void *d
 static void dax_kmem_percpu_kill(void *data)
 {
 	struct percpu_ref *ref = data;
-	struct dax_kmem *dax_kmem = to_dax_pmem(ref);
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
 
 	dev_dbg(dax_kmem->dev, "trace\n");
 	percpu_ref_kill(ref);
@@ -142,11 +142,11 @@ static struct nd_device_driver dax_kmem_
 	.drv = {
 		.name = "dax_kmem",
 	},
-	.type = ND_DRIVER_DAX_PMEM,
+	.type = ND_DRIVER_DAX_KMEM,
 };
 
 module_nd_driver(dax_kmem_driver);
 
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Intel Corporation");
-MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_KMEM);
diff -puN include/uapi/linux/ndctl.h~dax-kmem-try-again-2018-2-header include/uapi/linux/ndctl.h
--- a/include/uapi/linux/ndctl.h~dax-kmem-try-again-2018-2-header	2018-10-22 13:12:22.002930392 -0700
+++ b/include/uapi/linux/ndctl.h	2018-10-22 13:12:22.005930392 -0700
@@ -197,6 +197,7 @@ static inline const char *nvdimm_cmd_nam
 #define ND_DEVICE_NAMESPACE_PMEM 5  /* PMEM namespace (may alias with BLK) */
 #define ND_DEVICE_NAMESPACE_BLK 6   /* BLK namespace (may alias with PMEM) */
 #define ND_DEVICE_DAX_PMEM 7        /* Device DAX interface to pmem */
+#define ND_DEVICE_DAX_KMEM 8        /* Normal kernel-managed system memory */
 
 enum nd_driver_flags {
 	ND_DRIVER_DIMM            = 1 << ND_DEVICE_DIMM,
@@ -206,6 +207,7 @@ enum nd_driver_flags {
 	ND_DRIVER_NAMESPACE_PMEM  = 1 << ND_DEVICE_NAMESPACE_PMEM,
 	ND_DRIVER_NAMESPACE_BLK   = 1 << ND_DEVICE_NAMESPACE_BLK,
 	ND_DRIVER_DAX_PMEM	  = 1 << ND_DEVICE_DAX_PMEM,
+	ND_DRIVER_DAX_KMEM	  = 1 << ND_DEVICE_DAX_KMEM,
 };
 
 enum {

From patchwork Mon Oct 22 20:13:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652421 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4CAAE14BB for ; Mon, 22 Oct 2018 20:18:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 36DF328EC4 for ; Mon, 22 Oct 2018 20:18:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2B7C328EE2; Mon, 22 Oct 2018 20:18:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id CDA7028EC4 for ; Mon, 22 Oct 2018 20:18:45 +0000 (UTC) Received: from 
[127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id C4AEB2117CEB4; Mon, 22 Oct 2018 13:18:45 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=134.134.136.24; helo=mga09.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id CF72A2117CEB1 for ; Mon, 22 Oct 2018 13:18:42 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="80737009" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by fmsmga008.fm.intel.com with ESMTP; 22 Oct 2018 13:18:41 -0700 Subject: [PATCH 4/9] dax/kmem: allow PMEM devices to bind to KMEM driver To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:24 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201324.EBB64302@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP

Currently, a persistent memory device's mode must be coordinated
with the driver to which it needs to bind. To change it from the
fsdax to the device-dax driver, you first change the mode of the
device itself.

Instead of adding a new device mode, allow the PMEM mode to also
bind to the KMEM driver.

As I write this, I'm realizing that it might have just been
better to add a new device mode, rather than hijacking the PMEM
mode. If this is the case, please speak up, NVDIMM folks. :)

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/drivers/nvdimm/bus.c |   15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff -puN drivers/nvdimm/bus.c~dax-kmem-try-again-2018-3-bus-match-override drivers/nvdimm/bus.c
--- a/drivers/nvdimm/bus.c~dax-kmem-try-again-2018-3-bus-match-override	2018-10-22 13:12:22.522930391 -0700
+++ b/drivers/nvdimm/bus.c	2018-10-22 13:12:22.525930391 -0700
@@ -464,11 +464,24 @@ static struct nd_device_driver nd_bus_dr
 static int nvdimm_bus_match(struct device *dev, struct device_driver *drv)
 {
 	struct nd_device_driver *nd_drv = to_nd_device_driver(drv);
+	bool match;
 
 	if (is_nvdimm_bus(dev) && nd_drv == &nd_bus_driver)
 		return true;
 
-	return !!test_bit(to_nd_device_type(dev), &nd_drv->type);
+	match = !!test_bit(to_nd_device_type(dev), &nd_drv->type);
+
+	/*
+	 * We allow PMEM devices to be bound to the KMEM driver.
+	 * Force a match if we detect a PMEM device type but
+	 * a KMEM device driver.
+	 */
+	if (!match &&
+	    (to_nd_device_type(dev) == ND_DEVICE_DAX_PMEM) &&
+	    (nd_drv->type == ND_DRIVER_DAX_KMEM))
+		match = true;
+
+	return match;
 }
 
 static ASYNC_DOMAIN_EXCLUSIVE(nd_async_domain);

From patchwork Mon Oct 22 20:13:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652425 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BB12513BF for ; Mon, 22 Oct 2018 20:18:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A3AFB28EC3 for ; Mon, 22 Oct 2018 20:18:46 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8EEE728EC4; Mon, 22 Oct 2018 20:18:46 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E515E28ED4 for ; Mon, 22 Oct 2018 20:18:45 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id DBEE32194D3B8; Mon, 22 Oct 2018 13:18:45 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=192.55.52.43; helo=mga05.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 340242117B57B for ; Mon, 22 Oct 2018 13:18:44 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="90396632" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by FMSMGA003.fm.intel.com with ESMTP; 22 Oct 2018 13:18:42 -0700 Subject: [PATCH 5/9] dax/kmem: add more nd dax kmem infrastructure To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:26 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201326.5E3F2752@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP Each DAX mode has a set of wrappers and helpers. Add them for the kmem mode. 
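
The "wrappers and helpers" mentioned above usually come as a pair: an
is_*() check against a per-mode device type and a to_*() conversion
built on container_of(). A rough userspace sketch of that pattern
follows. Every name ending in _sketch is hypothetical, and the macro
is only a stand-in for the kernel's container_of().

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct device_type_sketch {
	const char *name;
};

struct device_sketch {
	const struct device_type_sketch *type;
};

/* Shared base object that embeds the device, as nd_pfn does. */
struct pfn_sketch {
	int id;
	struct device_sketch dev;
};

/* Per-mode wrapper around the shared base object. */
struct dax_kmem_sketch {
	struct pfn_sketch pfn;
};

static const struct device_type_sketch dax_kmem_type_sketch = {
	.name = "nd_dax_kmem",
};

/* A device belongs to this mode if it carries this mode's type. */
static bool is_dax_kmem_sketch(const struct device_sketch *dev)
{
	return dev ? dev->type == &dax_kmem_type_sketch : false;
}

/* Walk back from the embedded device to the enclosing wrapper. */
static struct dax_kmem_sketch *to_dax_kmem_sketch(struct device_sketch *dev)
{
	return container_of_sketch(dev, struct dax_kmem_sketch, pfn.dev);
}
```

The type check only works once devices of the new mode are actually
created with the matching device type, so the conversion helper should
be paired with the check before its result is trusted.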

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/drivers/nvdimm/bus.c      |    2 ++
 b/drivers/nvdimm/dax_devs.c |   35 +++++++++++++++++++++++++++++++++++
 b/drivers/nvdimm/nd.h       |    6 ++++++
 3 files changed, 43 insertions(+)

diff -puN drivers/nvdimm/bus.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/bus.c
--- a/drivers/nvdimm/bus.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.024930389 -0700
+++ b/drivers/nvdimm/bus.c	2018-10-22 13:12:23.031930389 -0700
@@ -46,6 +46,8 @@ static int to_nd_device_type(struct devi
 		return ND_DEVICE_REGION_BLK;
 	else if (is_nd_dax(dev))
 		return ND_DEVICE_DAX_PMEM;
+	else if (is_nd_dax_kmem(dev))
+		return ND_DEVICE_DAX_KMEM;
 	else if (is_nd_region(dev->parent))
 		return nd_region_to_nstype(to_nd_region(dev->parent));
 
diff -puN drivers/nvdimm/dax_devs.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/dax_devs.c
--- a/drivers/nvdimm/dax_devs.c~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.026930389 -0700
+++ b/drivers/nvdimm/dax_devs.c	2018-10-22 13:12:23.031930389 -0700
@@ -51,6 +51,41 @@ struct nd_dax *to_nd_dax(struct device *
 }
 EXPORT_SYMBOL(to_nd_dax);
 
+/* nd_dax_kmem */
+static void nd_dax_kmem_release(struct device *dev)
+{
+	struct nd_region *nd_region = to_nd_region(dev->parent);
+	struct nd_dax_kmem *nd_dax_kmem = to_nd_dax_kmem(dev);
+	struct nd_pfn *nd_pfn = &nd_dax_kmem->nd_pfn;
+
+	dev_dbg(dev, "trace\n");
+	nd_detach_ndns(dev, &nd_pfn->ndns);
+	ida_simple_remove(&nd_region->dax_ida, nd_pfn->id);
+	kfree(nd_pfn->uuid);
+	kfree(nd_dax_kmem);
+}
+
+static struct device_type nd_dax_kmem_device_type = {
+	.name = "nd_dax_kmem",
+	.release = nd_dax_kmem_release,
+};
+
+bool is_nd_dax_kmem(struct device *dev)
+{
+	return dev ? dev->type == &nd_dax_kmem_device_type : false;
+}
+EXPORT_SYMBOL(is_nd_dax_kmem);
+
+struct nd_dax_kmem *to_nd_dax_kmem(struct device *dev)
+{
+	struct nd_dax_kmem *nd_dax_kmem = container_of(dev, struct nd_dax_kmem, nd_pfn.dev);
+
+	WARN_ON(!is_nd_dax_kmem(dev));
+	return nd_dax_kmem;
+}
+EXPORT_SYMBOL(to_nd_dax_kmem);
+/* end nd_dax_kmem */
+
 static const struct attribute_group *nd_dax_attribute_groups[] = {
 	&nd_pfn_attribute_group,
 	&nd_device_attribute_group,
diff -puN drivers/nvdimm/nd.h~dax-kmem-try-again-2018-4-bus-dev-type-kmem drivers/nvdimm/nd.h
--- a/drivers/nvdimm/nd.h~dax-kmem-try-again-2018-4-bus-dev-type-kmem	2018-10-22 13:12:23.027930389 -0700
+++ b/drivers/nvdimm/nd.h	2018-10-22 13:12:23.031930389 -0700
@@ -215,6 +215,10 @@ struct nd_dax {
 	struct nd_pfn nd_pfn;
 };
 
+struct nd_dax_kmem {
+	struct nd_pfn nd_pfn;
+};
+
 enum nd_async_mode {
 	ND_SYNC,
 	ND_ASYNC,
@@ -318,9 +322,11 @@ static inline int nd_pfn_validate(struct
 #endif
 
 struct nd_dax *to_nd_dax(struct device *dev);
+struct nd_dax_kmem *to_nd_dax_kmem(struct device *dev);
 #if IS_ENABLED(CONFIG_NVDIMM_DAX)
 int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns);
 bool is_nd_dax(struct device *dev);
+bool is_nd_dax_kmem(struct device *dev);
 struct device *nd_dax_create(struct nd_region *nd_region);
 #else
 static inline int nd_dax_probe(struct device *dev,

From patchwork Mon Oct 22 20:13:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652427 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A082713BF for ; Mon, 22 Oct 2018 20:18:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 89B6D28EC3 for ; Mon, 22 Oct 2018 20:18:48 +0000 (UTC) Received: by mail.wl.linuxfoundation.org 
(Postfix, from userid 486) id 7DD4528ED4; Mon, 22 Oct 2018 20:18:48 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 13D9828EC3 for ; Mon, 22 Oct 2018 20:18:48 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 0B7042117CEBF; Mon, 22 Oct 2018 13:18:48 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=134.134.136.24; helo=mga09.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 0E8D62117CEB3 for ; Mon, 22 Oct 2018 13:18:46 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="99739367" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by fmsmga004.fm.intel.com with ESMTP; 22 Oct 2018 13:18:45 -0700 Subject: [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:27 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> 
Message-Id: <20181022201327.F1642450@viggo.jf.intel.com>
X-BeenThere: linux-nvdimm@lists.01.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Linux-nvdimm developer list."
Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen, ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org
MIME-Version: 1.0
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
X-Virus-Scanned: ClamAV using ClamSMTP

The kernel/resource.c code is used to manage the physical address
space. We can view the current resource configuration in /proc/iomem.
An example of this is at the bottom of this description.

The nvdimm subsystem "owns" the physical address resources which map
to persistent memory and has resources inserted for them as
"Persistent Memory". We want to use this persistent memory, but as
volatile memory, just like RAM. The best way to do this is to leave
the existing resource in place, but add a "System RAM" resource
underneath it. This clearly communicates the ownership relationship
of this memory.

The request_resource_conflict() API only deals with the top-level
resources. Replace it with __request_region() which will search for
!IORESOURCE_BUSY areas lower in the resource tree than the top level.

We also rework the old error message a bit since we do not get the
conflicting entry back: only an indication that we *had* a conflict.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space. But, this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.

This should have no real effect on the existing collision detection
because the areas that truly conflict should be marked
IORESOURCE_BUSY.
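
The parent/child placement described in this message can be modeled in user space. What follows is only a minimal sketch under stated assumptions, not the kernel's implementation: struct toy_res and toy_request_region() are hypothetical names, the `busy` field stands in for IORESOURCE_BUSY, and locking, waiting on muxed regions, and allocation are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the kernel's struct resource (hypothetical layout). */
struct toy_res {
	uint64_t start, end;		/* inclusive address range */
	bool busy;			/* models IORESOURCE_BUSY */
	const char *name;
	struct toy_res *child;		/* first child, sorted by start */
	struct toy_res *sibling;	/* next resource at the same level */
};

/*
 * Claim [req->start, req->end] below @parent. On overlap with a
 * non-busy resource, descend and retry there; the range must then fit
 * inside that resource. Any other overlap is a conflict.
 * Returns true on success.
 */
static bool toy_request_region(struct toy_res *parent, struct toy_res *req)
{
	assert(req->start <= req->end);

	for (;;) {
		struct toy_res **link = &parent->child;
		struct toy_res *cur;

		/* The request must fit inside the current parent. */
		if (req->start < parent->start || req->end > parent->end)
			return false;

		/* Skip siblings that end before the requested range. */
		while ((cur = *link) != NULL && cur->end < req->start)
			link = &cur->sibling;

		if (cur == NULL || cur->start > req->end) {
			/* Free gap at this level: link the new resource in. */
			req->sibling = cur;
			*link = req;
			return true;
		}

		if (cur->busy)
			return false;	/* truly conflicting range */

		parent = cur;		/* not busy: try to become its child */
	}
}
```

In this model, once a non-busy "Persistent Memory (legacy)" entry covers a0000000-afffffff, a busy "System RAM" request for a0000000-a7ffffff is linked in as its child, and a later request that overlaps the child fails because the child is busy.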
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/mm/memory_hotplug.c | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child 2018-10-22 13:12:23.570930388 -0700
+++ b/mm/memory_hotplug.c 2018-10-22 13:12:23.573930388 -0700
@@ -99,24 +99,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	const char *resource_name = "System RAM";
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict = request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
-
pr_debug("System RAM resource %pR cannot be added\n", res); - kfree(res); + /* + * Request ownership of the new memory range. This might be + * a child of an existing resource that was present but + * not marked as busy. + */ + res = __request_region(&iomem_resource, start, size, + resource_name, flags); + + if (!res) { + pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n", + start, start + size); return ERR_PTR(-EEXIST); } return res; From patchwork Mon Oct 22 20:13:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652431 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A64215BB5 for ; Mon, 22 Oct 2018 20:18:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8EA0728EC3 for ; Mon, 22 Oct 2018 20:18:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 82E5428ED4; Mon, 22 Oct 2018 20:18:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2CC6D28EC4 for ; Mon, 22 Oct 2018 20:18:49 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 24C142117D260; Mon, 22 Oct 2018 13:18:49 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=192.55.52.93; helo=mga11.intel.com; 
envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id BAEE12117CEBF for ; Mon, 22 Oct 2018 13:18:47 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="273527112" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by fmsmga005.fm.intel.com with ESMTP; 22 Oct 2018 13:18:47 -0700 Subject: [PATCH 7/9] dax/kmem: actually perform memory hotplug To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:29 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201329.518577A4@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP This is the meat of this whole series. When the "kmem" device's probe function is called and we know we have a good persistent memory device, hotplug the memory back into the main kernel. 
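
As a rough illustration of how the probe path in this patch derives the arguments it passes to add_memory(), here is a small user-space sketch. The toy_* names and the request struct are hypothetical and exist only for this example. The two facts it relies on are that dev_to_node() can report a negative node when the device has no NUMA affinity, and that resource_size() of an inclusive range is end - start + 1.

```c
#include <stdint.h>

/* Hypothetical container for the arguments handed to add_memory(). */
struct toy_hotplug_req {
	int nid;		/* target NUMA node */
	uint64_t start;		/* first physical address */
	uint64_t size;		/* length in bytes */
};

/* A negative node means "no affinity"; fall back to node 0. */
static int toy_pick_node(int dev_node)
{
	return dev_node < 0 ? 0 : dev_node;
}

/* Size of an inclusive [start, end] range, as resource_size() computes it. */
static uint64_t toy_range_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* Build the hotplug request for a device on @dev_node owning [start, end]. */
static struct toy_hotplug_req toy_plan_hotplug(int dev_node, uint64_t start,
					       uint64_t end)
{
	struct toy_hotplug_req req = {
		.nid = toy_pick_node(dev_node),
		.start = start,
		.size = toy_range_size(start, end),
	};

	return req;
}
```

For a device without NUMA affinity that owns a0000000-afffffff, this yields node 0, a start of 0xa0000000 and a size of 0x10000000 bytes.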
Cc: Dan Williams Cc: Dave Jiang Cc: Ross Zwisler Cc: Vishal Verma Cc: Tom Lendacky Cc: Andrew Morton Cc: Michal Hocko Cc: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Huang Ying Cc: Fengguang Wu --- b/drivers/dax/kmem.c | 28 +++++++++++++++++++++++++--- 1 file changed, 25 insertions(+), 3 deletions(-) diff -puN drivers/dax/kmem.c~dax-kmem-hotplug drivers/dax/kmem.c --- a/drivers/dax/kmem.c~dax-kmem-hotplug 2018-10-22 13:12:24.069930387 -0700 +++ b/drivers/dax/kmem.c 2018-10-22 13:12:24.072930387 -0700 @@ -55,10 +55,12 @@ static void dax_kmem_percpu_kill(void *d static int dax_kmem_probe(struct device *dev) { void *addr; + int numa_node; struct resource res; int rc, id, region_id; struct nd_pfn_sb *pfn_sb; struct dev_dax *dev_dax; + struct resource *new_res; struct dax_kmem *dax_kmem; struct nd_namespace_io *nsio; struct dax_region *dax_region; @@ -86,13 +88,30 @@ static int dax_kmem_probe(struct device pfn_sb = nd_pfn->pfn_sb; - if (!devm_request_mem_region(dev, nsio->res.start, - resource_size(&nsio->res), - dev_name(&ndns->dev))) { + new_res = devm_request_mem_region(dev, nsio->res.start, + resource_size(&nsio->res), + "System RAM (pmem)"); + if (!new_res) { dev_warn(dev, "could not reserve region %pR\n", &nsio->res); return -EBUSY; } + /* + * Set flags appropriate for System RAM. Leave ..._BUSY clear + * so that add_memory() can add a child resource. 
+ */ + new_res->flags = IORESOURCE_SYSTEM_RAM; + + numa_node = dev_to_node(dev); + if (numa_node < 0) { + pr_warn_once("bad numa_node: %d, forcing to 0\n", numa_node); + numa_node = 0; + } + + rc = add_memory(numa_node, nsio->res.start, resource_size(&nsio->res)); + if (rc) + return rc; + dax_kmem->dev = dev; init_completion(&dax_kmem->cmp); rc = percpu_ref_init(&dax_kmem->ref, dax_kmem_percpu_release, 0, @@ -106,6 +125,9 @@ static int dax_kmem_probe(struct device return rc; dax_kmem->pgmap.ref = &dax_kmem->ref; + + dax_kmem->pgmap.res.name = "name_kmem_override2"; + addr = devm_memremap_pages(dev, &dax_kmem->pgmap); if (IS_ERR(addr)) return PTR_ERR(addr); From patchwork Mon Oct 22 20:13:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652433 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C4AC813BF for ; Mon, 22 Oct 2018 20:18:52 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A932D2916B for ; Mon, 22 Oct 2018 20:18:51 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9C3F829182; Mon, 22 Oct 2018 20:18:51 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 44F0E2916B for ; Mon, 22 Oct 2018 20:18:51 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 3BCB32117D26A; Mon, 22 Oct 
2018 13:18:51 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=192.55.52.93; helo=mga11.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 240892117CEBE for ; Mon, 22 Oct 2018 13:18:49 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="101693469" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga001.jf.intel.com with ESMTP; 22 Oct 2018 13:18:48 -0700 Subject: [PATCH 8/9] dax/kmem: let walk_system_ram_range() search child resources To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:31 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201331.8DDC3CDD@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP In the process of onlining memory, we use walk_system_ram_range() to find the actual RAM areas inside of the area being onlined. However, it currently only finds memory resources which are "top-level" iomem_resources. Children are not currently searched which causes it to skip System RAM in areas like this (in the format of /proc/iomem): a0000000-bfffffff : Persistent Memory (legacy) a0000000-afffffff : System RAM Changing the true->false here allows children to be searched as well. We need this because we add a new "System RAM" resource underneath the "persistent memory" resource when we use persistent memory in a volatile mode. Cc: Dan Williams Cc: Dave Jiang Cc: Ross Zwisler Cc: Vishal Verma Cc: Tom Lendacky Cc: Andrew Morton Cc: Michal Hocko Cc: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Huang Ying Cc: Fengguang Wu --- b/kernel/resource.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff -puN kernel/resource.c~mm-walk_system_ram_range-search-child-resources kernel/resource.c --- a/kernel/resource.c~mm-walk_system_ram_range-search-child-resources 2018-10-22 13:12:24.565930386 -0700 +++ b/kernel/resource.c 2018-10-22 13:12:24.572930386 -0700 @@ -445,6 +445,9 @@ int walk_mem_res(u64 start, u64 end, voi * This function calls the @func callback against all memory ranges of type * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY. * It is to be used only for System RAM. + * + * This will find System RAM ranges that are children of top-level resources + * in addition to top-level System RAM resources. 
*/ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages, void *arg, int (*func)(unsigned long, unsigned long, void *)) @@ -460,7 +463,7 @@ int walk_system_ram_range(unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; while (start < end && !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, - true, &res)) { + false, &res)) { pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT; end_pfn = (res.end + 1) >> PAGE_SHIFT; if (end_pfn > pfn) From patchwork Mon Oct 22 20:13:32 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dave Hansen X-Patchwork-Id: 10652435 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E2F4814BB for ; Mon, 22 Oct 2018 20:18:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CDD372916E for ; Mon, 22 Oct 2018 20:18:52 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C1B502918B; Mon, 22 Oct 2018 20:18:52 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5AAD72916E for ; Mon, 22 Oct 2018 20:18:52 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 5225B2117D265; Mon, 22 Oct 2018 13:18:52 -0700 (PDT) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: None (no SPF record) identity=mailfrom; client-ip=192.55.52.88; 
helo=mga01.intel.com; envelope-from=dave.hansen@linux.intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 029642117CEBA for ; Mon, 22 Oct 2018 13:18:50 -0700 (PDT) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 22 Oct 2018 13:18:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,413,1534834800"; d="scan'208";a="102363901" Received: from viggo.jf.intel.com (HELO localhost.localdomain) ([10.54.77.144]) by orsmga002.jf.intel.com with ESMTP; 22 Oct 2018 13:18:50 -0700 Subject: [PATCH 9/9] dax/kmem: actually enable the code in Makefile To: linux-kernel@vger.kernel.org From: Dave Hansen Date: Mon, 22 Oct 2018 13:13:32 -0700 References: <20181022201317.8558C1D8@viggo.jf.intel.com> In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com> Message-Id: <20181022201332.FC3B5EB7@viggo.jf.intel.com> X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: thomas.lendacky@amd.com, mhocko@suse.com, linux-nvdimm@lists.01.org, Dave Hansen , ying.huang@intel.com, linux-mm@kvack.org, zwisler@kernel.org, fengguang.wu@intel.com, akpm@linux-foundation.org MIME-Version: 1.0 Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP Most of the new code was dead up to this point. Now that all the pieces are in place, enable it. 
Cc: Dan Williams Cc: Dave Jiang Cc: Ross Zwisler Cc: Vishal Verma Cc: Tom Lendacky Cc: Andrew Morton Cc: Michal Hocko Cc: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Huang Ying Cc: Fengguang Wu --- b/drivers/dax/Makefile | 2 ++ 1 file changed, 2 insertions(+) diff -puN drivers/dax/Makefile~dax-kmem-makefile drivers/dax/Makefile --- a/drivers/dax/Makefile~dax-kmem-makefile 2018-10-22 13:12:25.068930384 -0700 +++ b/drivers/dax/Makefile 2018-10-22 13:12:25.071930384 -0700 @@ -2,7 +2,9 @@ obj-$(CONFIG_DAX) += dax.o obj-$(CONFIG_DEV_DAX) += device_dax.o obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o +obj-$(CONFIG_DEV_DAX_PMEM) += dax_kmem.o dax-y := super.o dax_pmem-y := pmem.o +dax_kmem-y := kmem.o device_dax-y := device.o