From patchwork Thu Jan 24 23:14:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10780285
Subject: [PATCH 1/5] mm/resource: return real error codes from walk failures
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com
From: Dave Hansen
Date: Thu, 24 Jan 2019 15:14:42 -0800
References: <20190124231441.37A4A305@viggo.jf.intel.com>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>
Message-Id: <20190124231442.EFD29EE0@viggo.jf.intel.com>

From: Dave Hansen

walk_system_ram_range() can return an error code either because
*it* failed, or because the 'func' that it calls returned an
error.  The memory hotplug code does the following:

	ret = walk_system_ram_range(..., func);
	if (ret)
		return ret;

and 'ret' makes it out to userspace, eventually.  The problem
is, walk_system_ram_range() failures that result from *it*
failing (as opposed to 'func') return -1.  That leads to a very
odd -EPERM (-1) return code out to userspace.

Make walk_system_ram_range() return -EINVAL for internal
failures to keep userspace less confused.

This return code is compatible with all the callers that I
audited.

Signed-off-by: Dave Hansen
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
Reviewed-by: Bjorn Helgaas
Acked-by: Michael Ellerman (powerpc)
---

 b/kernel/resource.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff -puN kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1 kernel/resource.c
--- a/kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2019-01-24 15:13:13.950199540 -0800
+++ b/kernel/resource.c	2019-01-24 15:13:13.954199540 -0800
@@ -375,7 +375,7 @@ static int __walk_iomem_res_desc(resourc
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
@@ -453,7 +453,7 @@ int walk_system_ram_range(unsigned long
 	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	start = (u64) start_pfn << PAGE_SHIFT;
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;

From patchwork Thu Jan 24 23:14:44 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10780291
Subject: [PATCH 2/5] mm/resource: move HMM pr_debug() deeper into resource code
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,jglisse@redhat.com
From: Dave Hansen
Date: Thu, 24 Jan 2019 15:14:44 -0800
References: <20190124231441.37A4A305@viggo.jf.intel.com>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>
Message-Id: <20190124231444.38182DD8@viggo.jf.intel.com>

From: Dave Hansen

HMM consumes physical address space for its own use, even though nothing
is mapped or accessible there.  It uses a special resource description
(IORES_DESC_DEVICE_PRIVATE_MEMORY) to uniquely identify these areas.

When HMM consumes address space, it makes a best guess about what to
consume.  However, it is possible that a future memory or device hotplug
can collide with the reserved area.  In the case of these conflicts,
there is an error message in register_memory_resource().

Later patches in this series move register_memory_resource() from
using request_resource_conflict() to __request_region().
Unfortunately, __request_region() does not return the conflict like
the previous function did, which makes it impossible to check for
IORES_DESC_DEVICE_PRIVATE_MEMORY in a conflicting resource.

Instead of warning in register_memory_resource(), move the check into
the core resource code itself (__request_region()) where the
conflicting resource _is_ available.  This has the added bonus of
producing a warning in case of HMM conflicts with devices *or* RAM
address space, as opposed to the RAM-only warnings that were there
previously.

Signed-off-by: Dave Hansen
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Jerome Glisse
Reviewed-by: Jerome Glisse
---

 b/kernel/resource.c   |   10 ++++++++++
 b/mm/memory_hotplug.c |    5 -----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff -puN kernel/resource.c~move-request_region-check kernel/resource.c
--- a/kernel/resource.c~move-request_region-check	2019-01-24 15:13:14.453199539 -0800
+++ b/kernel/resource.c	2019-01-24 15:13:14.458199539 -0800
@@ -1123,6 +1123,16 @@ struct resource * __request_region(struc
 		conflict = __request_resource(parent, res);
 		if (!conflict)
 			break;
+		/*
+		 * mm/hmm.c reserves physical addresses which then
+		 * become unavailable to other users.  Conflicts are
+		 * not expected.  Be verbose if one is encountered.
+		 */
+		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
+			pr_debug("Resource conflict with unaddressable "
+				 "device memory at %#010llx !\n",
+				 (unsigned long long)start);
+		}
 		if (conflict != parent) {
 			if (!(conflict->flags & IORESOURCE_BUSY)) {
 				parent = conflict;
diff -puN mm/memory_hotplug.c~move-request_region-check mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~move-request_region-check	2019-01-24 15:13:14.455199539 -0800
+++ b/mm/memory_hotplug.c	2019-01-24 15:13:14.459199539 -0800
@@ -109,11 +109,6 @@ static struct resource *register_memory_
 	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	conflict = request_resource_conflict(&iomem_resource, res);
 	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
 		pr_debug("System RAM resource %pR cannot be added\n", res);
 		kfree(res);
 		return ERR_PTR(-EEXIST);

From patchwork Thu Jan 24 23:14:45 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10780295
Subject: [PATCH 3/5] mm/memory-hotplug: allow memory resources to be children
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com
From: Dave Hansen
Date: Thu, 24 Jan 2019 15:14:45 -0800
References: <20190124231441.37A4A305@viggo.jf.intel.com>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>
Message-Id: <20190124231445.5D8EEDAF@viggo.jf.intel.com>

From: Dave Hansen

The kernel/resource.c code is used to manage the physical address
space.  The current resource configuration can be viewed in
/proc/iomem.  An example of this is at the bottom of this
description.

The nvdimm subsystem "owns" the physical address resources which
map to persistent memory and has resources inserted for them as
"Persistent Memory".  The best way to repurpose this for volatile
use is to leave the existing resource in place, but add a "System
RAM" resource underneath it.  This clearly communicates the
ownership relationship of this memory.

The request_resource_conflict() API only deals with the
top-level resources.  Replace it with __request_region() which
will search for !IORESOURCE_BUSY areas lower in the resource
tree than the top level.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space.  But, this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.

This should have no real effect on the existing collision
detection because the areas that truly conflict should be marked
IORESOURCE_BUSY.

00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Signed-off-by: Dave Hansen
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
---

 b/mm/memory_hotplug.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child	2019-01-24 15:13:14.979199537 -0800
+++ b/mm/memory_hotplug.c	2019-01-24 15:13:14.983199537 -0800
@@ -98,19 +98,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	char *resource_name = "System RAM";
 
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict = request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		pr_debug("System RAM resource %pR cannot be added\n", res);
-		kfree(res);
+	/*
+	 * Request ownership of the new memory range.  This might be
+	 * a child of an existing resource that was present but
+	 * not marked as busy.
+	 */
+	res = __request_region(&iomem_resource, start, size,
+			       resource_name, flags);
+
+	if (!res) {
+		pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n",
+				start, start + size);
 		return ERR_PTR(-EEXIST);
 	}
 	return res;

From patchwork Thu Jan 24 23:14:47 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10780297
Subject: [PATCH 4/5] dax/kmem: let walk_system_ram_range() search child resources
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com
From: Dave Hansen
Date: Thu, 24 Jan 2019 15:14:47 -0800
References: <20190124231441.37A4A305@viggo.jf.intel.com>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>
Message-Id: <20190124231447.74358AA5@viggo.jf.intel.com>

From:
Dave Hansen In the process of onlining memory, we use walk_system_ram_range() to find the actual RAM areas inside of the area being onlined. However, it currently only finds memory resources which are "top-level" iomem_resources. Children are not currently searched which causes it to skip System RAM in areas like this (in the format of /proc/iomem): a0000000-bfffffff : Persistent Memory (legacy) a0000000-afffffff : System RAM Changing the true->false here allows children to be searched as well. We need this because we add a new "System RAM" resource underneath the "persistent memory" resource when we use persistent memory in a volatile mode. Signed-off-by: Dave Hansen Cc: Dan Williams Cc: Dave Jiang Cc: Ross Zwisler Cc: Vishal Verma Cc: Tom Lendacky Cc: Andrew Morton Cc: Michal Hocko Cc: linux-nvdimm@lists.01.org Cc: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org Cc: Huang Ying Cc: Fengguang Wu Cc: Borislav Petkov Cc: Bjorn Helgaas Cc: Yaowei Bai Cc: Takashi Iwai Cc: Jerome Glisse --- b/kernel/resource.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff -puN kernel/resource.c~mm-walk_system_ram_range-search-child-resources kernel/resource.c --- a/kernel/resource.c~mm-walk_system_ram_range-search-child-resources 2019-01-24 15:13:15.482199536 -0800 +++ b/kernel/resource.c 2019-01-24 15:13:15.486199536 -0800 @@ -445,6 +445,9 @@ int walk_mem_res(u64 start, u64 end, voi * This function calls the @func callback against all memory ranges of type * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY. * It is to be used only for System RAM. + * + * This will find System RAM ranges that are children of top-level resources + * in addition to top-level System RAM resources. 
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -460,7 +463,7 @@ int walk_system_ram_range(unsigned long
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    true, &res)) {
+				    false, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)

From patchwork Thu Jan 24 23:14:48 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10780299
Subject: [PATCH 5/5] dax: "Hotplug" persistent memory for use like normal RAM
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen, dan.j.williams@intel.com, dave.jiang@intel.com,
 zwisler@kernel.org, vishal.l.verma@intel.com, thomas.lendacky@amd.com,
 akpm@linux-foundation.org, mhocko@suse.com, linux-nvdimm@lists.01.org,
 linux-mm@kvack.org, ying.huang@intel.com, fengguang.wu@intel.com,
 bp@suse.de, bhelgaas@google.com, baiyaowei@cmss.chinamobile.com,
 tiwai@suse.de, jglisse@redhat.com
From: Dave Hansen
Date: Thu, 24 Jan 2019 15:14:48 -0800
References: <20190124231441.37A4A305@viggo.jf.intel.com>
In-Reply-To: <20190124231441.37A4A305@viggo.jf.intel.com>
Message-Id: <20190124231448.E102D18E@viggo.jf.intel.com>

From: Dave Hansen

This is intended for use with NVDIMMs that are physically persistent
(physically like flash) so that they can be used as a cost-effective
RAM replacement. Intel Optane DC persistent memory is one
implementation of this kind of NVDIMM.

Currently, a persistent memory region is "owned" by a device driver,
either the "Device DAX" or "Filesystem DAX" drivers. These drivers
allow applications to explicitly use persistent memory, generally
by being modified to use special, new libraries. (DIMM-based
persistent memory hardware/software is described in great detail
here: Documentation/nvdimm/nvdimm.txt).

However, this limits persistent memory use to applications which
*have* been modified. To make it more broadly usable, this driver
"hotplugs" memory into the kernel, to be managed and used just like
normal RAM would be.

To make this work, management software must remove the device from
being controlled by the "Device DAX" infrastructure:

	echo -n dax0.0 > /sys/bus/dax/drivers/device_dax/remove_id
	echo -n dax0.0 > /sys/bus/dax/drivers/device_dax/unbind

and then bind it to this new driver:

	echo -n dax0.0 > /sys/bus/dax/drivers/kmem/new_id
	echo -n dax0.0 > /sys/bus/dax/drivers/kmem/bind

After this, there will be a number of new memory sections visible
in sysfs that can be onlined, or that may get onlined by existing
udev-initiated memory hotplug rules.

This rebinding procedure is currently a one-way trip. Once memory
is bound to "kmem", it's there permanently and cannot be
unbound and assigned back to device_dax.

The kmem driver will never bind to a dax device unless the device
is *explicitly* bound to the driver. There are two reasons for
this: One, since it is a one-way trip, it cannot be undone if
bound incorrectly. Two, the kmem driver destroys data on the
device. Consider the case where you have good data on a pmem
device. It would be catastrophic if you compiled in "kmem" but
left out the "device_dax" driver: kmem would take over the device
and write volatile data all over your good data.

This inherits any existing NUMA information for the newly-added
memory from the persistent memory device that came from the
firmware. On Intel platforms, the firmware has guarantees that
require each socket's persistent memory to be in a separate
memory-only NUMA node. That means that this patch is not expected
to create NUMA nodes, but will simply hotplug memory into existing
nodes.

Because the memory lands in these dedicated NUMA nodes, the existing
NUMA APIs and tools are sufficient to create policies for
applications or memory areas to have affinity for or an aversion to
using this memory.

There is currently some metadata at the beginning of pmem regions.
The section-size memory hotplug restrictions, plus this small
reserved area can cause the "loss" of a section or two of capacity.
This should be fixable in follow-on patches. But, as a first step,
losing 256MB of memory (worst case) out of hundreds of gigabytes
is a good tradeoff vs. the required code to fix this up precisely.
This calculation is also the reason we export
memory_block_size_bytes().
Signed-off-by: Dave Hansen
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
---

 b/drivers/base/memory.c |    1
 b/drivers/dax/Kconfig   |   16 +++++++
 b/drivers/dax/Makefile  |    1
 b/drivers/dax/kmem.c    |  108 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 126 insertions(+)

diff -puN drivers/base/memory.c~dax-kmem-try-4 drivers/base/memory.c
--- a/drivers/base/memory.c~dax-kmem-try-4	2019-01-24 15:13:15.987199535 -0800
+++ b/drivers/base/memory.c	2019-01-24 15:13:15.994199535 -0800
@@ -88,6 +88,7 @@ unsigned long __weak memory_block_size_b
 {
 	return MIN_MEMORY_BLOCK_SIZE;
 }
+EXPORT_SYMBOL_GPL(memory_block_size_bytes);
 
 static unsigned long get_memory_block_size(void)
 {
diff -puN drivers/dax/Kconfig~dax-kmem-try-4 drivers/dax/Kconfig
--- a/drivers/dax/Kconfig~dax-kmem-try-4	2019-01-24 15:13:15.988199535 -0800
+++ b/drivers/dax/Kconfig	2019-01-24 15:13:15.994199535 -0800
@@ -32,6 +32,22 @@ config DEV_DAX_PMEM
 
 	  Say M if unsure
 
+config DEV_DAX_KMEM
+	tristate "KMEM DAX: volatile-use of persistent memory"
+	default DEV_DAX
+	depends on DEV_DAX
+	depends on MEMORY_HOTPLUG # for add_memory() and friends
+	help
+	  Support access to persistent memory as if it were RAM. This
+	  allows easier use of persistent memory by unmodified
+	  applications.
+
+	  To use this feature, a DAX device must be unbound from the
+	  device_dax driver (PMEM DAX) and bound to this kmem driver
+	  on each boot.
+
+	  Say N if unsure.
+
 config DEV_DAX_PMEM_COMPAT
 	tristate "PMEM DAX: support the deprecated /sys/class/dax interface"
 	depends on DEV_DAX_PMEM
diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2018-12-03 08:41:47.355756491 -0800
+++ b/drivers/dax/kmem.c	2019-01-24 15:13:15.994199535 -0800
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-2018 Intel Corporation. All rights reserved. */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "dax-private.h"
+#include "bus.h"
+
+int dev_dax_kmem_probe(struct device *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct resource *res = &dev_dax->region->res;
+	resource_size_t kmem_start;
+	resource_size_t kmem_size;
+	resource_size_t kmem_end;
+	struct resource *new_res;
+	int numa_node;
+	int rc;
+
+	/*
+	 * Ensure good NUMA information for the persistent memory.
+	 * Without this check, there is a risk that slow memory
+	 * could be mixed in a node with faster memory, causing
+	 * unavoidable performance issues.
+	 */
+	numa_node = dev_dax->target_node;
+	if (numa_node < 0) {
+		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
+			 res, numa_node);
+		return -EINVAL;
+	}
+
+	/* Hotplug starting at the beginning of the next block: */
+	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+
+	kmem_size = resource_size(res);
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~(memory_block_size_bytes() - 1);
+	kmem_end = kmem_start+kmem_size;
+
+	/* Region is permanently reserved.  Hot-remove not yet implemented. */
+	new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+	if (!new_res) {
+		dev_warn(dev, "could not reserve region [%pa-%pa]\n",
+			 &kmem_start, &kmem_end);
+		return -EBUSY;
+	}
+
+	/*
+	 * Set flags appropriate for System RAM.  Leave ..._BUSY clear
+	 * so that add_memory() can add a child resource.  Do not
+	 * inherit flags from the parent since it may set new flags
+	 * unknown to us that will break add_memory() below.
+	 */
+	new_res->flags = IORESOURCE_SYSTEM_RAM;
+	new_res->name = dev_name(dev);
+
+	rc = add_memory(numa_node, new_res->start, resource_size(new_res));
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
+static int dev_dax_kmem_remove(struct device *dev)
+{
+	/*
+	 * Purposely leak the request_mem_region() for the device-dax
+	 * range and return '0' to ->remove() attempts. The removal of
+	 * the device from the driver always succeeds, but the region
+	 * is permanently pinned as reserved by the unreleased
+	 * request_mem_region().
+	 */
+	return 0;
+}
+
+static struct dax_device_driver device_dax_kmem_driver = {
+	.drv = {
+		.probe = dev_dax_kmem_probe,
+		.remove = dev_dax_kmem_remove,
+	},
+};
+
+static int __init dax_kmem_init(void)
+{
+	return dax_driver_register(&device_dax_kmem_driver);
+}
+
+static void __exit dax_kmem_exit(void)
+{
+	dax_driver_unregister(&device_dax_kmem_driver);
+}
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL v2");
+module_init(dax_kmem_init);
+module_exit(dax_kmem_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff -puN drivers/dax/Makefile~dax-kmem-try-4 drivers/dax/Makefile
--- a/drivers/dax/Makefile~dax-kmem-try-4	2019-01-24 15:13:15.990199535 -0800
+++ b/drivers/dax/Makefile	2019-01-24 15:13:15.994199535 -0800
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
+obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 
 dax-y := super.o
 dax-y += bus.o