From patchwork Mon Feb 25 18:57:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10829025
Subject: [PATCH 1/5] mm/resource: return real error codes from walk failures
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
,bhelgaas@google.com,mpe@ellerman.id.au,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com,benh@kernel.crashing.org,paulus@samba.org,linuxppc-dev@lists.ozlabs.org,keith.busch@intel.com
From: Dave Hansen
Date: Mon, 25 Feb 2019 10:57:30 -0800
References: <20190225185727.BCBD768C@viggo.jf.intel.com>
In-Reply-To: <20190225185727.BCBD768C@viggo.jf.intel.com>
Message-Id: <20190225185730.D8AA7812@viggo.jf.intel.com>

From: Dave Hansen

walk_system_ram_range() can return an error code either because
*it* failed, or because the 'func' that it calls returned an
error.  The memory hotplug code does the following:

	ret = walk_system_ram_range(..., func);
	if (ret)
		return ret;

and 'ret' makes it out to userspace, eventually.  The problem
is, walk_system_ram_range() failures that result from *it*
failing (as opposed to 'func') return -1.  That leads to a very
odd -EPERM (-1) return code out to userspace.

Make walk_system_ram_range() return -EINVAL for internal
failures to keep userspace less confused.

This return code is compatible with all the callers that I
audited.

This changes both the generic mm/ and powerpc-specific
implementations to have the same return value.

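
For illustration only, the return-code behavior described above can be
modeled in userspace.  This is a rough sketch, not the kernel code: the
names pfn_range, walk_fn, walk_ranges(), count_pages() and always_busy()
are made up for the example, and the flat array stands in for the real
memory map.  It shows the walker returning -EINVAL when nothing in the
requested range matched, while an error from 'func' is passed through
unchanged.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel implementation. */
struct pfn_range {
	unsigned long start;	/* first pfn in the range */
	unsigned long end;	/* one past the last pfn */
};

typedef int (*walk_fn)(unsigned long start_pfn, unsigned long nr_pages,
		       void *arg);

/*
 * Call 'func' on every part of [start_pfn, start_pfn + nr_pages) that
 * overlaps an entry in 'map'.  Returns the first non-zero value from
 * 'func', 0 if every call succeeded, or -EINVAL if nothing overlapped.
 */
static int walk_ranges(const struct pfn_range *map, size_t nr_ranges,
		       unsigned long start_pfn, unsigned long nr_pages,
		       walk_fn func, void *arg)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	int ret = -EINVAL;	/* a bare -1 here would read as -EPERM */
	size_t i;

	for (i = 0; i < nr_ranges; i++) {
		unsigned long s = map[i].start > start_pfn ?
				  map[i].start : start_pfn;
		unsigned long e = map[i].end < end_pfn ?
				  map[i].end : end_pfn;

		if (s >= e)
			continue;	/* no overlap with this entry */
		ret = func(s, e - s, arg);
		if (ret)
			break;		/* propagate the callback's error */
	}
	return ret;
}

/* Example callback: add up the pages that were visited. */
static int count_pages(unsigned long start_pfn, unsigned long nr_pages,
		       void *arg)
{
	(void)start_pfn;
	*(unsigned long *)arg += nr_pages;
	return 0;
}

/* Example callback: always fail with its own error code. */
static int always_busy(unsigned long start_pfn, unsigned long nr_pages,
		       void *arg)
{
	(void)start_pfn;
	(void)nr_pages;
	(void)arg;
	return -EBUSY;
}
```

With a map of { 10, 20 } and { 30, 40 }, walking pfns 20-29 with
count_pages() hits no entry and yields -EINVAL, while walking any
populated range with always_busy() yields that callback's -EBUSY.
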
Signed-off-by: Dave Hansen
Reviewed-by: Bjorn Helgaas
Acked-by: Michael Ellerman (powerpc)
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Keith Busch
---

 b/arch/powerpc/mm/mem.c |    2 +-
 b/kernel/resource.c     |    4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff -puN arch/powerpc/mm/mem.c~memory-hotplug-walk_system_ram_range-returns-neg-1 arch/powerpc/mm/mem.c
--- a/arch/powerpc/mm/mem.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2019-02-25 10:56:47.452908034 -0800
+++ b/arch/powerpc/mm/mem.c	2019-02-25 10:56:47.458908034 -0800
@@ -189,7 +189,7 @@ walk_system_ram_range(unsigned long star
 	struct memblock_region *reg;
 	unsigned long end_pfn = start_pfn + nr_pages;
 	unsigned long tstart, tend;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	for_each_memblock(memory, reg) {
 		tstart = max(start_pfn, memblock_region_memory_base_pfn(reg));
diff -puN kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1 kernel/resource.c
--- a/kernel/resource.c~memory-hotplug-walk_system_ram_range-returns-neg-1	2019-02-25 10:56:47.454908034 -0800
+++ b/kernel/resource.c	2019-02-25 10:56:47.459908034 -0800
@@ -382,7 +382,7 @@ static int __walk_iomem_res_desc(resourc
 				 int (*func)(struct resource *, void *))
 {
 	struct resource res;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
@@ -462,7 +462,7 @@ int walk_system_ram_range(unsigned long
 	unsigned long flags;
 	struct resource res;
 	unsigned long pfn, end_pfn;
-	int ret = -1;
+	int ret = -EINVAL;
 
 	start = (u64) start_pfn << PAGE_SHIFT;
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;

From patchwork Mon Feb 25 18:57:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10829031
Subject: [PATCH 2/5] mm/resource: move HMM pr_debug() deeper into resource code
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
,jglisse@redhat.com,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,keith.busch@intel.com
From: Dave Hansen
Date: Mon, 25 Feb 2019 10:57:33 -0800
References: <20190225185727.BCBD768C@viggo.jf.intel.com>
In-Reply-To: <20190225185727.BCBD768C@viggo.jf.intel.com>
Message-Id: <20190225185733.FB5686EB@viggo.jf.intel.com>

From: Dave Hansen

HMM consumes physical address space for its own use, even
though nothing is mapped or accessible there.  It uses a
special resource description (IORES_DESC_DEVICE_PRIVATE_MEMORY)
to uniquely identify these areas.

When HMM consumes address space, it makes a best guess about
what to consume.  However, it is possible that a future memory
or device hotplug can collide with the reserved area.  In the
case of these conflicts, there is an error message in
register_memory_resource().

Later patches in this series move register_memory_resource()
from using request_resource_conflict() to __request_region().
Unfortunately, __request_region() does not return the conflict
like the previous function did, which makes it impossible to
check for IORES_DESC_DEVICE_PRIVATE_MEMORY in a conflicting
resource.

Instead of warning in register_memory_resource(), move the
check into the core resource code itself (__request_region())
where the conflicting resource _is_ available.  This has the
added bonus of producing a warning in case of HMM conflicts
with devices *or* RAM address space, as opposed to the RAM-only
warnings that were there previously.

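
As a rough userspace illustration of the check described above (not the
kernel code): the request path is the one place that still holds the
conflicting resource, so it is the natural place to inspect the
descriptor and warn.  Everything here is hypothetical for the sake of
the sketch: the res_desc enum, struct res, find_conflict() and
request_checked() are stand-ins, and the 'warned' out-parameter exists
only so the behavior can be observed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's resource descriptors. */
enum res_desc { DESC_NONE, DESC_DEVICE_PRIVATE };

struct res {
	unsigned long long start, end;	/* inclusive range */
	enum res_desc desc;
	const char *name;
};

/* Return the first table entry that overlaps 'req', or NULL. */
static const struct res *find_conflict(const struct res *table, size_t n,
				       const struct res *req)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i].start <= req->end && table[i].end >= req->start)
			return &table[i];
	return NULL;
}

/*
 * Returns 0 if 'req' is free, -EBUSY on any conflict.  A warning is
 * printed (and *warned set) only when the conflicting entry is
 * device-private, whatever kind of resource 'req' itself is.
 */
static int request_checked(const struct res *table, size_t n,
			   const struct res *req, int *warned)
{
	const struct res *conflict = find_conflict(table, n, req);

	*warned = 0;
	if (!conflict)
		return 0;
	if (conflict->desc == DESC_DEVICE_PRIVATE) {
		fprintf(stderr,
			"Unaddressable device %s [%#llx-%#llx] conflicts with [%#llx-%#llx]\n",
			conflict->name, conflict->start, conflict->end,
			req->start, req->end);
		*warned = 1;
	}
	return -EBUSY;
}
```

Because the check sits in the request path rather than in one caller,
a device request and a RAM request that collide with the device-private
entry both produce the warning.
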
Signed-off-by: Dave Hansen
Reviewed-by: Jerome Glisse
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Keith Busch
---

 b/kernel/resource.c   |    9 +++++++++
 b/mm/memory_hotplug.c |    5 -----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff -puN kernel/resource.c~move-request_region-check kernel/resource.c
--- a/kernel/resource.c~move-request_region-check	2019-02-25 10:56:48.581908031 -0800
+++ b/kernel/resource.c	2019-02-25 10:56:48.588908031 -0800
@@ -1132,6 +1132,15 @@ struct resource * __request_region(struc
 		conflict = __request_resource(parent, res);
 		if (!conflict)
 			break;
+		/*
+		 * mm/hmm.c reserves physical addresses which then
+		 * become unavailable to other users.  Conflicts are
+		 * not expected.  Warn to aid debugging if encountered.
+		 */
+		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
+			pr_warn("Unaddressable device %s %pR conflicts with %pR",
+				conflict->name, conflict, res);
+		}
 		if (conflict != parent) {
 			if (!(conflict->flags & IORESOURCE_BUSY)) {
 				parent = conflict;
diff -puN mm/memory_hotplug.c~move-request_region-check mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~move-request_region-check	2019-02-25 10:56:48.583908031 -0800
+++ b/mm/memory_hotplug.c	2019-02-25 10:56:48.588908031 -0800
@@ -111,11 +111,6 @@ static struct resource *register_memory_
 	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	conflict = request_resource_conflict(&iomem_resource, res);
 	if (conflict) {
-		if (conflict->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) {
-			pr_debug("Device unaddressable memory block "
-				 "memory hotplug at %#010llx !\n",
-				 (unsigned long long)start);
-		}
 		pr_debug("System RAM resource %pR cannot be added\n", res);
 		kfree(res);
 		return ERR_PTR(-EEXIST);

From patchwork Mon Feb 25 18:57:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10829037
Subject: [PATCH 3/5] mm/memory-hotplug: allow memory resources to be children
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
,dan.j.williams@intel.com,vishal.l.verma@intel.com,dave.jiang@intel.com,zwisler@kernel.org,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com,keith.busch@intel.com
From: Dave Hansen
Date: Mon, 25 Feb 2019 10:57:36 -0800
References: <20190225185727.BCBD768C@viggo.jf.intel.com>
In-Reply-To: <20190225185727.BCBD768C@viggo.jf.intel.com>
Message-Id: <20190225185736.7B4711BC@viggo.jf.intel.com>

From: Dave Hansen

The kernel/resource.c code is used to manage the physical
address space.  The current resource configuration can be
viewed in /proc/iomem.  An example of this is at the bottom of
this description.

The nvdimm subsystem "owns" the physical address resources which
map to persistent memory and has resources inserted for them as
"Persistent Memory".  The best way to repurpose this for volatile
use is to leave the existing resource in place, but add a "System
RAM" resource underneath it.  This clearly communicates the
ownership relationship of this memory.

The request_resource_conflict() API only deals with the
top-level resources.  Replace it with __request_region() which
will search for !IORESOURCE_BUSY areas lower in the resource
tree than the top level.

We *could* also simply truncate the existing top-level
"Persistent Memory" resource and take over the released address
space.  But, this means that if we ever decide to hot-unplug the
"RAM" and give it back, we need to recreate the original setup,
which may mean going back to the BIOS tables.

This should have no real effect on the existing collision
detection because the areas that truly conflict should be marked
IORESOURCE_BUSY.

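
The descent into non-busy resources can be sketched in userspace as
follows.  This is only a model written for this description, not the
kernel implementation: RES_BUSY, struct res, try_insert() and
request_region() are hypothetical names, there is no locking, and the
caller supplies the storage for the new resource.  The idea it shows is
that a conflict with a resource that is not busy is retried one level
down, while a conflict with a busy resource fails.

```c
#include <errno.h>
#include <stddef.h>

#define RES_BUSY 0x1u	/* hypothetical stand-in for IORESOURCE_BUSY */

struct res {
	unsigned long long start, end;	/* inclusive range */
	unsigned int flags;
	const char *name;
	struct res *parent, *child, *sibling;
};

/*
 * Try to link 'new' directly under 'root', keeping children sorted.
 * Returns NULL on success, otherwise the conflicting resource, which
 * is 'root' itself when 'new' does not fit inside it.
 */
static struct res *try_insert(struct res *root, struct res *new)
{
	struct res **p = &root->child;

	if (new->start < root->start || new->end > root->end)
		return root;
	for (; *p; p = &(*p)->sibling) {
		if ((*p)->start > new->end)
			break;		/* slot found before this child */
		if ((*p)->end >= new->start)
			return *p;	/* ranges overlap */
	}
	new->sibling = *p;
	new->parent = root;
	*p = new;
	return NULL;
}

/*
 * Insert 'new' somewhere below 'root'.  Returns 0 on success or
 * -EEXIST if the range collides with a busy resource.
 */
static int request_region(struct res *root, struct res *new)
{
	struct res *parent = root;

	for (;;) {
		struct res *conflict = try_insert(parent, new);

		if (!conflict)
			return 0;
		if (conflict != parent && !(conflict->flags & RES_BUSY)) {
			parent = conflict;	/* retry one level down */
			continue;
		}
		return -EEXIST;
	}
}
```

Using the ranges from the /proc/iomem example, a busy "System RAM"
request for a0000000-a7ffffff ends up as a child of the non-busy
"Persistent Memory (legacy)" entry, and a second request overlapping
that child fails with -EEXIST.
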
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-9fffffff : System RAM
  01000000-01e071d0 : Kernel code
  01e071d1-027dfdff : Kernel data
  02dc6000-0305dfff : Kernel bss
a0000000-afffffff : Persistent Memory (legacy)
  a0000000-a7ffffff : System RAM
b0000000-bffdffff : System RAM
bffe0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00

Signed-off-by: Dave Hansen
Reviewed-by: Dan Williams
Reviewed-by: Vishal Verma
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
Cc: Keith Busch
---

 b/mm/memory_hotplug.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff -puN mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child mm/memory_hotplug.c
--- a/mm/memory_hotplug.c~mm-memory-hotplug-allow-memory-resource-to-be-child	2019-02-25 10:56:49.707908029 -0800
+++ b/mm/memory_hotplug.c	2019-02-25 10:56:49.711908029 -0800
@@ -100,19 +100,21 @@ void mem_hotplug_done(void)
 /* add this memory to iomem resource */
 static struct resource *register_memory_resource(u64 start, u64 size)
 {
-	struct resource *res, *conflict;
-	res = kzalloc(sizeof(struct resource), GFP_KERNEL);
-	if (!res)
-		return ERR_PTR(-ENOMEM);
+	struct resource *res;
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+	char *resource_name = "System RAM";
 
-	res->name = "System RAM";
-	res->start = start;
-	res->end = start + size - 1;
-	res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-	conflict = request_resource_conflict(&iomem_resource, res);
-	if (conflict) {
-		pr_debug("System RAM resource %pR cannot be added\n", res);
-		kfree(res);
+	/*
+	 * Request ownership of the new memory range.  This might be
+	 * a child of an existing resource that was present but
+	 * not marked as busy.
+	 */
+	res = __request_region(&iomem_resource, start, size,
+			       resource_name, flags);
+
+	if (!res) {
+		pr_debug("Unable to reserve System RAM region: %016llx->%016llx\n",
+				start, start + size);
 		return ERR_PTR(-EEXIST);
 	}
 	return res;

From patchwork Mon Feb 25 18:57:38 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10829039
Subject: [PATCH 4/5] mm/resource: let walk_system_ram_range() search child resources
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen ,keith.busch@intel.com,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com
From: Dave Hansen
Date: Mon, 25 Feb 2019 10:57:38 -0800
References: <20190225185727.BCBD768C@viggo.jf.intel.com>
In-Reply-To: <20190225185727.BCBD768C@viggo.jf.intel.com>
Message-Id: <20190225185738.F6C24E62@viggo.jf.intel.com>
X-Virus-Scanned: ClamAV
 using ClamSMTP

From: Dave Hansen

In the process of onlining memory, we use walk_system_ram_range() to find
the actual RAM areas inside of the area being onlined.

However, it currently only finds memory resources which are "top-level"
iomem_resources. Children are not currently searched, which causes it to
skip System RAM in areas like this (in the format of /proc/iomem):

a0000000-bfffffff : Persistent Memory (legacy)
  a0000000-afffffff : System RAM

Changing the fifth argument of find_next_iomem_res() from true to false
here allows children to be searched as well. We need this because we add
a new "System RAM" resource underneath the "persistent memory" resource
when we use persistent memory in a volatile mode.

Signed-off-by: Dave Hansen
Cc: Keith Busch
Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
---

 b/kernel/resource.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff -puN kernel/resource.c~mm-walk_system_ram_range-search-child-resources kernel/resource.c
--- a/kernel/resource.c~mm-walk_system_ram_range-search-child-resources	2019-02-25 10:56:50.750908026 -0800
+++ b/kernel/resource.c	2019-02-25 10:56:50.754908026 -0800
@@ -454,6 +454,9 @@ int walk_mem_res(u64 start, u64 end, voi
  * This function calls the @func callback against all memory ranges of type
  * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOUCE_BUSY.
  * It is to be used only for System RAM.
+ *
+ * This will find System RAM ranges that are children of top-level resources
+ * in addition to top-level System RAM resources.
  */
 int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *))
@@ -469,7 +472,7 @@ int walk_system_ram_range(unsigned long
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
-				    true, &res)) {
+				    false, &res)) {
 		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		end_pfn = (res.end + 1) >> PAGE_SHIFT;
 		if (end_pfn > pfn)

From patchwork Mon Feb 25 18:57:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10829043
Subject: [PATCH 5/5] dax: "Hotplug" persistent memory for use like normal RAM
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen ,dan.j.williams@intel.com,keith.busch@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com,bp@suse.de,bhelgaas@google.com,baiyaowei@cmss.chinamobile.com,tiwai@suse.de,jglisse@redhat.com
From: Dave Hansen
Date: Mon, 25 Feb 2019 10:57:40 -0800
References: <20190225185727.BCBD768C@viggo.jf.intel.com>
In-Reply-To: <20190225185727.BCBD768C@viggo.jf.intel.com>
Message-Id: <20190225185740.8660866F@viggo.jf.intel.com>
X-Virus-Scanned: ClamAV using
 ClamSMTP

From: Dave Hansen

This is intended for use with NVDIMMs that are physically persistent
(physically like flash) so that they can be used as a cost-effective
RAM replacement. Intel Optane DC persistent memory is one
implementation of this kind of NVDIMM.

Currently, a persistent memory region is "owned" by a device driver,
either the "Device DAX" or "Filesystem DAX" drivers. These drivers
allow applications to explicitly use persistent memory, generally
by being modified to use special, new libraries. (DIMM-based
persistent memory hardware/software is described in great detail
here: Documentation/nvdimm/nvdimm.txt).

However, this limits persistent memory use to applications which
*have* been modified. To make it more broadly usable, this driver
"hotplugs" memory into the kernel, to be managed and used just like
normal RAM would be.

To make this work, management software must remove the device from
being controlled by the "Device DAX" infrastructure:

	echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind

and then tell the new driver that it can bind to the device:

	echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id

After this, there will be a number of new memory sections visible
in sysfs that can be onlined, or that may get onlined by existing
udev-initiated memory hotplug rules.

This rebinding procedure is currently a one-way trip. Once memory
is bound to "kmem", it's there permanently and cannot be
unbound and assigned back to device_dax.

The kmem driver will never bind to a dax device unless the device
is *explicitly* bound to the driver. There are two reasons for
this: One, since it is a one-way trip, it cannot be undone if
bound incorrectly. Two, the kmem driver destroys data on the
device. Consider a pmem device that holds good data. It would be
catastrophic if you compiled in "kmem" but left out the
"device_dax" driver: kmem would take over the device and write
volatile data all over your good data.
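
For management software that performs the unbind and bind steps described
above from a program rather than a shell, a minimal userspace sketch might
look like the following. The helper names write_attr() and rebind_to_kmem()
are hypothetical and not part of this patch; only the two sysfs paths are
taken from the description above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write one string to a sysfs-style attribute file.
 * Returns 0 on success and -1 on any failure.
 */
int write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	size_t len = strlen(value);
	int ok;

	if (!f)
		return -1;
	ok = fwrite(value, 1, len, f) == len;
	/* With buffered I/O, a rejected write may only show up at close. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: move a DAX device (for example "dax0.0") from
 * device_dax to kmem, in the same order as the echo commands above.
 */
int rebind_to_kmem(const char *dev_name)
{
	if (write_attr("/sys/bus/dax/drivers/device_dax/unbind", dev_name))
		return -1;
	return write_attr("/sys/bus/dax/drivers/kmem/new_id", dev_name);
}
```

Stopping after a failed unbind keeps the device with device_dax, which
matters because the bind to kmem cannot be undone.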
This inherits any existing NUMA information for the newly-added
memory from the persistent memory device that came from the
firmware. On Intel platforms, the firmware has guarantees that
require each socket's persistent memory to be in a separate
memory-only NUMA node. That means that this patch is not expected
to create NUMA nodes, but will simply hotplug memory into existing
nodes.

Because the memory is placed in these NUMA nodes, the existing NUMA
APIs and tools are sufficient to create policies for applications or
memory areas to have affinity for or an aversion to using this memory.

There is currently some metadata at the beginning of pmem regions.
The section-size memory hotplug restrictions, plus this small
reserved area, can cause the "loss" of a section or two of capacity.
This should be fixable in follow-on patches. But, as a first step,
losing 256MB of memory (worst case) out of hundreds of gigabytes
is a good tradeoff vs. the required code to fix this up precisely.
This calculation is also the reason we export
memory_block_size_bytes().
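
To make the capacity "loss" concrete, here is a small userspace model of
the trimming arithmetic: round the start up to a block boundary, then round
the remaining size down to whole blocks. It is only a sketch. The names
trim_to_blocks() and lost_bytes() are hypothetical, the block size is
passed in rather than read from memory_block_size_bytes(), and the guard
for regions smaller than the skipped bytes is an addition for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the part of a region that gets hotplugged. */
struct trimmed_range {
	uint64_t start;	/* first hotplugged byte, block aligned */
	uint64_t size;	/* hotplugged bytes, a multiple of the block size */
};

/* block must be a non-zero power of two. */
struct trimmed_range trim_to_blocks(uint64_t start, uint64_t size,
				    uint64_t block)
{
	struct trimmed_range r;
	uint64_t skipped;

	assert(block != 0 && (block & (block - 1)) == 0);

	/* Round the start up to the next block boundary. */
	r.start = (start + block - 1) & ~(block - 1);
	skipped = r.start - start;

	/* Guard added for this sketch: nothing usable is left. */
	if (skipped >= size) {
		r.size = 0;
		return r;
	}

	/* Give up the skipped bytes, then keep only complete blocks. */
	r.size = (size - skipped) & ~(block - 1);
	return r;
}

/* Bytes of the region that end up not being hotplugged. */
uint64_t lost_bytes(uint64_t start, uint64_t size, uint64_t block)
{
	return size - trim_to_blocks(start, size, block).size;
}
```

For example, assuming 128MB blocks, a 4GB region that starts 2MB past a
block boundary (start 0x100200000) is trimmed to start at 0x108000000 with
size 0xf8000000: 126MB is skipped at the front and 2MB at the tail, so one
block's worth of capacity is lost. A region that is already block aligned
loses nothing.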

Signed-off-by: Dave Hansen
Reviewed-by: Dan Williams
Reviewed-by: Keith Busch
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
Cc: Borislav Petkov
Cc: Bjorn Helgaas
Cc: Yaowei Bai
Cc: Takashi Iwai
Cc: Jerome Glisse
Reviewed-by: Vishal Verma
---

 b/drivers/base/memory.c | 1
 b/drivers/dax/Kconfig | 16 +++++++
 b/drivers/dax/Makefile | 1
 b/drivers/dax/kmem.c | 108 ++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 126 insertions(+)

diff -puN drivers/base/memory.c~dax-kmem-try-4 drivers/base/memory.c
--- a/drivers/base/memory.c~dax-kmem-try-4	2019-02-25 10:56:51.791908023 -0800
+++ b/drivers/base/memory.c	2019-02-25 10:56:51.800908023 -0800
@@ -88,6 +88,7 @@ unsigned long __weak memory_block_size_b
 {
 	return MIN_MEMORY_BLOCK_SIZE;
 }
+EXPORT_SYMBOL_GPL(memory_block_size_bytes);

 static unsigned long get_memory_block_size(void)
 {
diff -puN drivers/dax/Kconfig~dax-kmem-try-4 drivers/dax/Kconfig
--- a/drivers/dax/Kconfig~dax-kmem-try-4	2019-02-25 10:56:51.793908023 -0800
+++ b/drivers/dax/Kconfig	2019-02-25 10:56:51.800908023 -0800
@@ -32,6 +32,22 @@ config DEV_DAX_PMEM

 	  Say M if unsure

+config DEV_DAX_KMEM
+	tristate "KMEM DAX: volatile-use of persistent memory"
+	default DEV_DAX
+	depends on DEV_DAX
+	depends on MEMORY_HOTPLUG # for add_memory() and friends
+	help
+	  Support access to persistent memory as if it were RAM. This
+	  allows easier use of persistent memory by unmodified
+	  applications.
+
+	  To use this feature, a DAX device must be unbound from the
+	  device_dax driver (PMEM DAX) and bound to this kmem driver
+	  on each boot.
+
+	  Say N if unsure.
+
 config DEV_DAX_PMEM_COMPAT
 	tristate "PMEM DAX: support the deprecated /sys/class/dax interface"
 	depends on DEV_DAX_PMEM
diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2019-02-15 15:42:29.903470860 -0800
+++ b/drivers/dax/kmem.c	2019-02-25 10:56:51.800908023 -0800
@@ -0,0 +1,108 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2016-2019 Intel Corporation. All rights reserved. */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "dax-private.h"
+#include "bus.h"
+
+int dev_dax_kmem_probe(struct device *dev)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct resource *res = &dev_dax->region->res;
+	resource_size_t kmem_start;
+	resource_size_t kmem_size;
+	resource_size_t kmem_end;
+	struct resource *new_res;
+	int numa_node;
+	int rc;
+
+	/*
+	 * Ensure good NUMA information for the persistent memory.
+	 * Without this check, there is a risk that slow memory
+	 * could be mixed in a node with faster memory, causing
+	 * unavoidable performance issues.
+	 */
+	numa_node = dev_dax->target_node;
+	if (numa_node < 0) {
+		dev_warn(dev, "rejecting DAX region %pR with invalid node: %d\n",
+			 res, numa_node);
+		return -EINVAL;
+	}
+
+	/* Hotplug starting at the beginning of the next block: */
+	kmem_start = ALIGN(res->start, memory_block_size_bytes());
+
+	kmem_size = resource_size(res);
+	/* Adjust the size down to compensate for moving up kmem_start: */
+	kmem_size -= kmem_start - res->start;
+	/* Align the size down to cover only complete blocks: */
+	kmem_size &= ~(memory_block_size_bytes() - 1);
+	kmem_end = kmem_start + kmem_size;
+
+	/* Region is permanently reserved. Hot-remove not yet implemented. */
+	new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+	if (!new_res) {
+		dev_warn(dev, "could not reserve region [%pa-%pa]\n",
+			 &kmem_start, &kmem_end);
+		return -EBUSY;
+	}
+
+	/*
+	 * Set flags appropriate for System RAM. Leave ..._BUSY clear
+	 * so that add_memory() can add a child resource. Do not
+	 * inherit flags from the parent since it may set new flags
+	 * unknown to us that will break add_memory() below.
+	 */
+	new_res->flags = IORESOURCE_SYSTEM_RAM;
+	new_res->name = dev_name(dev);
+
+	rc = add_memory(numa_node, new_res->start, resource_size(new_res));
+	if (rc)
+		return rc;
+
+	return 0;
+}
+
+static int dev_dax_kmem_remove(struct device *dev)
+{
+	/*
+	 * Purposely leak the request_mem_region() for the device-dax
+	 * range and return '0' to ->remove() attempts. The removal of
+	 * the device from the driver always succeeds, but the region
+	 * is permanently pinned as reserved by the unreleased
+	 * request_mem_region().
+	 */
+	return 0;
+}
+
+static struct dax_device_driver device_dax_kmem_driver = {
+	.drv = {
+		.probe = dev_dax_kmem_probe,
+		.remove = dev_dax_kmem_remove,
+	},
+};
+
+static int __init dax_kmem_init(void)
+{
+	return dax_driver_register(&device_dax_kmem_driver);
+}
+
+static void __exit dax_kmem_exit(void)
+{
+	dax_driver_unregister(&device_dax_kmem_driver);
+}
+
+MODULE_AUTHOR("Intel Corporation");
+MODULE_LICENSE("GPL v2");
+module_init(dax_kmem_init);
+module_exit(dax_kmem_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff -puN drivers/dax/Makefile~dax-kmem-try-4 drivers/dax/Makefile
--- a/drivers/dax/Makefile~dax-kmem-try-4	2019-02-25 10:56:51.796908023 -0800
+++ b/drivers/dax/Makefile	2019-02-25 10:56:51.800908023 -0800
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
+obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o

 dax-y := super.o
 dax-y += bus.o