From patchwork Mon Oct 22 20:13:27 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10652447
Subject: [PATCH 6/9] mm/memory-hotplug: allow memory resources to be children
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
 ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com
From: Dave Hansen
Date: Mon, 22 Oct 2018 13:13:27 -0700
References: <20181022201317.8558C1D8@viggo.jf.intel.com>
In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com>
Message-Id: <20181022201327.F1642450@viggo.jf.intel.com>
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:
X-Virus-Scanned: ClamAV using ClamSMTP

The kernel/resource.c code is used to manage the physical address
space.  We can view the current resource configuration in
/proc/iomem.  An example of this is at the bottom of this
description.

The nvdimm subsystem "owns" the physical address resources which
map to persistent memory and has resources inserted for them as
"Persistent Memory".  We want to use this persistent memory, but
as volatile memory, just like RAM.  The best way to do this is
to leave the existing resource in place, but add a "System RAM"
resource underneath it.  This clearly communicates the ownership
relationship of this memory.

The request_resource_conflict() API only deals with the top-level
resources.  Replace it with __request_region(), which will search
for !IORESOURCE_BUSY areas lower in the resource tree than the
top level.

We also rework the old error message a bit since we do not get
the conflicting entry back: only an indication that we *had* a
conflict.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space.  But this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.
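
To illustrate the kind of lookup __request_region() performs as
described above, here is a minimal user-space sketch.  It is not
kernel code: struct toy_res, TOY_BUSY and toy_request_region() are
hypothetical names invented for this illustration, and locking,
muxed regions and allocation are all left out.  It only models the
rule that a new range may descend into a conflicting entry when
that entry is not busy and fully contains the range.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_BUSY 0x1u	/* hypothetical stand-in for IORESOURCE_BUSY */

/* Hypothetical, simplified model of a resource tree node. */
struct toy_res {
	const char *name;
	uint64_t start, end;	/* inclusive range */
	unsigned int flags;
	struct toy_res *parent, *sibling, *child;
};

/*
 * Try to claim 'res' somewhere below 'root'.  Children are kept
 * sorted by start address.  Returns 0 on success and -1 when the
 * range collides with a busy entry or straddles an entry boundary.
 */
static int toy_request_region(struct toy_res *root, struct toy_res *res)
{
	struct toy_res *parent = root;

	if (res->end < res->start ||
	    res->start < root->start || res->end > root->end)
		return -1;

	for (;;) {
		struct toy_res **p = &parent->child;
		struct toy_res *conflict = NULL;

		for (; *p; p = &(*p)->sibling) {
			if ((*p)->start > res->end)
				break;		/* insert before *p */
			if ((*p)->end >= res->start) {
				conflict = *p;	/* ranges overlap */
				break;
			}
		}

		if (!conflict) {
			res->parent = parent;
			res->sibling = *p;
			*p = res;
			return 0;
		}

		/* A busy or only partially overlapping entry is a real conflict. */
		if ((conflict->flags & TOY_BUSY) ||
		    conflict->start > res->start || conflict->end < res->end)
			return -1;

		/* Not busy and fully containing: retry one level down. */
		parent = conflict;
	}
}
```

In this model, with a non-busy "Persistent Memory (legacy)" entry
covering a0000000-afffffff, a busy "System RAM" request for
a0000000-a7ffffff is inserted as its child, while a second busy
request overlapping that child is refused.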

This should have no real effect on the existing collision
detection because the areas that truly conflict should be marked
IORESOURCE_BUSY.

00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/mm/memory_hotplug.c |   31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child	2018-10-22 13:12:23.570930388 -0700
+++ b/mm/memory_hotplug.c	2018-10-22 13:12:23.573930388 -0700
@@ -99,24 +99,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	const char *resource_name = "System RAM";
 
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict = request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
-		pr_debug("System RAM resource %pR cannot be added\n", res);
-		kfree(res);
+	/*
+	 * Request ownership of the new memory range.  This might be
+	 * a child of an existing resource that was present but
+	 * not marked as busy.
+	 */
+	res = __request_region(&iomem_resource, start, size,
+			       resource_name, flags);
+
+	if (!res) {
+		pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n",
+				start, start + size);
 		return ERR_PTR(-EEXIST);
 	}
 	return res;