From patchwork Mon Oct 22 20:13:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Hansen
X-Patchwork-Id: 10652423
Subject: [PATCH 2/9] dax: kernel memory driver for mm ownership of DAX
To: linux-kernel@vger.kernel.org
Cc: Dave Hansen
 ,dan.j.williams@intel.com,dave.jiang@intel.com,zwisler@kernel.org,vishal.l.verma@intel.com,thomas.lendacky@amd.com,akpm@linux-foundation.org,mhocko@suse.com,linux-nvdimm@lists.01.org,linux-mm@kvack.org,ying.huang@intel.com,fengguang.wu@intel.com
From: Dave Hansen
Date: Mon, 22 Oct 2018 13:13:20 -0700
References: <20181022201317.8558C1D8@viggo.jf.intel.com>
In-Reply-To: <20181022201317.8558C1D8@viggo.jf.intel.com>
Message-Id: <20181022201320.45C9785C@viggo.jf.intel.com>

Add the actual driver which will own the DAX range.  This
gives very nice parity with the other possible "owners" of
a DAX region: device DAX and filesystem DAX.  It also greatly
simplifies the process of handing off control of the memory
between the different owners, since it's just a matter of
unbinding the device from one driver and rebinding it to
another.

I tried to do this all internally to the kernel, and the
locking and "self-destruction" of the old device context was
a nightmare.  Having userspace drive it is a wonderful
simplification.

Cc: Dan Williams
Cc: Dave Jiang
Cc: Ross Zwisler
Cc: Vishal Verma
Cc: Tom Lendacky
Cc: Andrew Morton
Cc: Michal Hocko
Cc: linux-nvdimm@lists.01.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: Huang Ying
Cc: Fengguang Wu
---

 b/drivers/dax/kmem.c |  152 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 152 insertions(+)

diff -puN /dev/null drivers/dax/kmem.c
--- /dev/null	2018-09-18 12:39:53.059362935 -0700
+++ b/drivers/dax/kmem.c	2018-10-22 13:12:21.502930393 -0700
@@ -0,0 +1,152 @@
+// this is just a copy of drivers/dax/pmem.c with
+// s/dax_pmem/dax_kmem/ for now.
+//
+// need real license
+/*
+ * Copyright(c) 2016-2018 Intel Corporation. All rights reserved.
+ */
+#include <linux/percpu-refcount.h>
+#include <linux/memremap.h>
+#include <linux/module.h>
+#include <linux/pfn_t.h>
+#include "../nvdimm/pfn.h"
+#include "../nvdimm/nd.h"
+#include "device-dax.h"
+
+struct dax_kmem {
+	struct device *dev;
+	struct percpu_ref ref;
+	struct dev_pagemap pgmap;
+	struct completion cmp;
+};
+
+static struct dax_kmem *to_dax_kmem(struct percpu_ref *ref)
+{
+	return container_of(ref, struct dax_kmem, ref);
+}
+
+static void dax_kmem_percpu_release(struct percpu_ref *ref)
+{
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	complete(&dax_kmem->cmp);
+}
+
+static void dax_kmem_percpu_exit(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	wait_for_completion(&dax_kmem->cmp);
+	percpu_ref_exit(ref);
+}
+
+static void dax_kmem_percpu_kill(void *data)
+{
+	struct percpu_ref *ref = data;
+	struct dax_kmem *dax_kmem = to_dax_kmem(ref);
+
+	dev_dbg(dax_kmem->dev, "trace\n");
+	percpu_ref_kill(ref);
+}
+
+static int dax_kmem_probe(struct device *dev)
+{
+	void *addr;
+	struct resource res;
+	int rc, id, region_id;
+	struct nd_pfn_sb *pfn_sb;
+	struct dev_dax *dev_dax;
+	struct dax_kmem *dax_kmem;
+	struct nd_namespace_io *nsio;
+	struct dax_region *dax_region;
+	struct nd_namespace_common *ndns;
+	struct nd_dax *nd_dax = to_nd_dax(dev);
+	struct nd_pfn *nd_pfn = &nd_dax->nd_pfn;
+
+	ndns = nvdimm_namespace_common_probe(dev);
+	if (IS_ERR(ndns))
+		return PTR_ERR(ndns);
+	nsio = to_nd_namespace_io(&ndns->dev);
+
+	dax_kmem = devm_kzalloc(dev, sizeof(*dax_kmem), GFP_KERNEL);
+	if (!dax_kmem)
+		return -ENOMEM;
+
+	/* parse the 'pfn' info block via ->rw_bytes */
+	rc = devm_nsio_enable(dev, nsio);
+	if (rc)
+		return rc;
+	rc = nvdimm_setup_pfn(nd_pfn, &dax_kmem->pgmap);
+	if (rc)
+		return rc;
+	devm_nsio_disable(dev, nsio);
+
+	pfn_sb = nd_pfn->pfn_sb;
+
+	if (!devm_request_mem_region(dev, nsio->res.start,
+				resource_size(&nsio->res),
+				dev_name(&ndns->dev))) {
+		dev_warn(dev, "could not reserve region %pR\n", &nsio->res);
+		return -EBUSY;
+	}
+
+	dax_kmem->dev = dev;
+	init_completion(&dax_kmem->cmp);
+	rc = percpu_ref_init(&dax_kmem->ref, dax_kmem_percpu_release, 0,
+			GFP_KERNEL);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_exit,
+							&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	dax_kmem->pgmap.ref = &dax_kmem->ref;
+	addr = devm_memremap_pages(dev, &dax_kmem->pgmap);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
+
+	rc = devm_add_action_or_reset(dev, dax_kmem_percpu_kill,
+							&dax_kmem->ref);
+	if (rc)
+		return rc;
+
+	/* adjust the dax_region resource to the start of data */
+	memcpy(&res, &dax_kmem->pgmap.res, sizeof(res));
+	res.start += le64_to_cpu(pfn_sb->dataoff);
+
+	rc = sscanf(dev_name(&ndns->dev), "namespace%d.%d", &region_id, &id);
+	if (rc != 2)
+		return -EINVAL;
+
+	dax_region = alloc_dax_region(dev, region_id, &res,
+			le32_to_cpu(pfn_sb->align), addr, PFN_DEV|PFN_MAP);
+	if (!dax_region)
+		return -ENOMEM;
+
+	/* TODO: support for subdividing a dax region... */
+	dev_dax = devm_create_dev_dax(dax_region, id, &res, 1);
+
+	/* child dev_dax instances now own the lifetime of the dax_region */
+	dax_region_put(dax_region);
+
+	return PTR_ERR_OR_ZERO(dev_dax);
+}
+
+static struct nd_device_driver dax_kmem_driver = {
+	.probe = dax_kmem_probe,
+	.drv = {
+		.name = "dax_kmem",
+	},
+	.type = ND_DRIVER_DAX_PMEM,
+};
+
+module_nd_driver(dax_kmem_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+MODULE_ALIAS_ND_DEVICE(ND_DEVICE_DAX_PMEM);
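
Illustration only, not part of the patch: the changelog says that handing
the memory between owners is "just a matter of unbinding and rebinding the
device to different drivers", driven from userspace. A minimal userspace
sketch of that hand-off, using the generic sysfs driver "bind"/"unbind"
attributes, might look like the code below. The assumptions are that the
device lives on the "nd" bus (so the root would be /sys/bus/nd) and that
the old owner is the existing "dax_pmem" driver; the device name "dax0.0"
and the helper names write_attr(), driver_attr_path() and rebind_device()
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Write the device name to one sysfs-style attribute file.
 * Returns 0 on success or a negative errno value. */
static int write_attr(const char *path, const char *dev)
{
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (!f)
		return -errno;
	if (fputs(dev, f) == EOF)
		rc = -EIO;
	/* stdio buffers the write, so a rejection by the kernel
	 * typically only shows up when the stream is flushed here */
	if (fclose(f) == EOF && rc == 0)
		rc = -errno;
	return rc;
}

/* Build "<root>/drivers/<driver>/<attr>" into buf. */
static int driver_attr_path(char *buf, size_t len, const char *root,
			    const char *driver, const char *attr)
{
	int n = snprintf(buf, len, "%s/drivers/%s/%s", root, driver, attr);

	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;
	return 0;
}

/* Unbind dev from driver `from`, then bind it to driver `to`.
 * `root` is the bus directory; /sys/bus/nd is assumed for this patch. */
static int rebind_device(const char *root, const char *dev,
			 const char *from, const char *to)
{
	char path[512];
	int rc;

	assert(root && dev && from && to);

	rc = driver_attr_path(path, sizeof(path), root, from, "unbind");
	if (rc)
		return rc;
	rc = write_attr(path, dev);
	if (rc)
		return rc;

	rc = driver_attr_path(path, sizeof(path), root, to, "bind");
	if (rc)
		return rc;
	return write_attr(path, dev);
}
```

A caller running as root would do something like
rebind_device("/sys/bus/nd", "dax0.0", "dax_pmem", "dax_kmem"). The bus
root is passed in rather than hard-coded so the helper can be pointed at a
scratch directory when experimenting. If the bind step fails, the device is
left without a driver, so a real tool would probably want to try rebinding
it to the original owner.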