From patchwork Mon Oct 22 20:13:17 2018
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10652413
Subject: [PATCH 0/9] Allow persistent memory to be used like normal RAM
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
 ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,
 vishal.l.verma@intel.com,thomas.lendacky@amd.com,
 akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,
 linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com
From: Dave Hansen
Date: Mon, 22 Oct 2018 13:13:17 -0700
Message-Id: <20181022201317.8558C1D8@viggo.jf.intel.com>

Persistent memory is cool.  But, currently, you have to rewrite
your applications to use it.  Wouldn't it be cool if you could
just have it show up in your system like normal RAM and get to
it like a slow blob of memory?  Well... have I got the patch
series for you!

This series adds a new "driver" to which pmem devices can be
attached.  Once attached, the memory "owned" by the device is
hot-added to the kernel and managed like any other memory.  On
systems with an HMAT (a new ACPI table), each socket (roughly)
will have a separate NUMA node for its persistent memory, so
this newly-added memory can be selected by its unique NUMA
node.

This is highly RFC, and I really want the feedback from the
nvdimm/pmem folks about whether this is a viable long-term
perversion of their code and device mode.  It's insufficiently
documented and probably not bisectable either.

Todo:
1. The device re-binding hacks are ham-fisted at best.  We
   need a better way of doing this, especially so the kmem
   driver does not get in the way of normal pmem devices.
2. When the device has no proper node, we default it to
   NUMA node 0.  Is that OK?
3. We muck with the 'struct resource' code quite a bit.  It
   definitely needs a once-over from folks more familiar
   with it than I am.
4. Is there a better way to do this than starting with a
   copy of pmem.c?

Here's how I set up a system to test this thing:

1. Boot qemu with lots of memory: "-m 4096", for instance.
2. Reserve 512MB of physical memory.  Reserving a spot at 2GB
   physical seems to work:
	memmap=512M!0x0000000080000000
   This will end up looking like a pmem device at boot.
3. When booted, convert the fsdax device to "device dax":
	ndctl create-namespace -fe namespace0.0 -m dax
4. In the background, the kmem driver will probably bind to the
   new device.
5. Now, online the new memory sections.  Perhaps:

	# Print MemTotal before and after each block comes online
	grep ^MemTotal /proc/meminfo
	for f in $(grep -vl online /sys/devices/system/memory/*/state); do
		echo "$f: $(cat "$f")"
		echo online > "$f"
		grep ^MemTotal /proc/meminfo
	done

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
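
For anyone repeating the onlining step from the test recipe, here is a
slightly more defensive take on that loop.  It is only a sketch, not
part of the series: online_offline_blocks is a made-up helper name, and
the optional directory argument is an assumption added so the loop can
be dry-run against a scratch directory instead of the real sysfs tree.

```shell
#!/bin/sh
# Sketch only: online every memory block that is not already online.
# online_offline_blocks is a hypothetical helper, not part of the series.
# $1 is an assumed optional override of the sysfs memory directory.

online_offline_blocks() {
    dir="${1:-/sys/devices/system/memory}"
    for f in "$dir"/memory*/state; do
        # If the glob matched nothing, the pattern is left as-is; skip it.
        [ -e "$f" ] || continue
        state=$(cat "$f")
        # Leave blocks that are already online alone.
        [ "$state" = "online" ] && continue
        echo "$f: $state"
        # The write can fail (not root, block cannot be onlined);
        # report it and carry on with the remaining blocks.
        echo online > "$f" || echo "failed to online $f" >&2
    done
}
```

Run as root, something like "grep ^MemTotal /proc/meminfo;
online_offline_blocks; grep ^MemTotal /proc/meminfo" should show
MemTotal growing if the kmem driver has bound and the blocks came
online.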