From patchwork Tue Jan 15 04:24:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 10763995 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5CC486C5 for ; Tue, 15 Jan 2019 04:37:15 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4B86120564 for ; Tue, 15 Jan 2019 04:37:15 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3F9002BDD3; Tue, 15 Jan 2019 04:37:15 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C695D20564 for ; Tue, 15 Jan 2019 04:37:14 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 9F4A6211B76A2; Mon, 14 Jan 2019 20:37:14 -0800 (PST) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=192.55.52.115; helo=mga14.intel.com; envelope-from=dan.j.williams@intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 93DA2211B63CA for ; Mon, 14 Jan 2019 20:37:13 -0800 (PST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Jan 2019 20:37:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,480,1539673200"; d="scan'208";a="311872147" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.16]) by fmsmga005.fm.intel.com with ESMTP; 14 Jan 2019 20:37:13 -0800 Subject: [daxctl PATCH v2 1/2] daxctl: Support the /sys/bus/dax ABI From: Dan Williams To: linux-nvdimm@lists.01.org Date: Mon, 14 Jan 2019 20:24:36 -0800 Message-ID: <154752627642.1710673.17923183512691145854.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <154752627111.1710673.1469336464670065810.stgit@dwillia2-desk3.amr.corp.intel.com> References: <154752627111.1710673.1469336464670065810.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.18-2-gc94f MIME-Version: 1.0 X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP The kernel is implementing a '/sys/bus/dax' ABI to allow for alternate device-DAX drivers to be bound to device instances. In support of this conversion, teach the libdaxctl subsystem-layout-specific code to parse the new layout. For backwards compatibility the implementation transparently and optionally supports either '/sys/bus/dax' or '/sys/class/dax'. Signed-off-by: Dan Williams --- daxctl/lib/libdaxctl-private.h | 11 ++++++ daxctl/lib/libdaxctl.c | 70 +++++++++++++++++++++++++++++----------- util/sysfs.c | 2 + 3 files changed, 62 insertions(+), 21 deletions(-) diff --git a/daxctl/lib/libdaxctl-private.h b/daxctl/lib/libdaxctl-private.h index f7667324026f..4a462e7245d2 100644 --- a/daxctl/lib/libdaxctl-private.h +++ b/daxctl/lib/libdaxctl-private.h @@ -15,6 +15,17 @@ #define DAXCTL_EXPORT __attribute__ ((visibility("default"))) +enum dax_subsystem { + DAX_UNKNOWN, + DAX_CLASS, + DAX_BUS, +}; + +static const char *dax_subsystems[] = { + [DAX_CLASS] = "/sys/class/dax", + [DAX_BUS] = "/sys/bus/dax/devices", +}; + /** * struct daxctl_region - container for dax_devices */ diff --git a/daxctl/lib/libdaxctl.c b/daxctl/lib/libdaxctl.c index 22f4210a7ea0..c2e3a52d6c7c 100644 --- a/daxctl/lib/libdaxctl.c +++ b/daxctl/lib/libdaxctl.c @@ -444,26 +444,38 @@ static void dax_devices_init(struct daxctl_region *region) { struct daxctl_ctx *ctx = daxctl_region_get_ctx(region); char daxdev_fmt[50]; - char *region_path; + size_t i; if (region->devices_init) return; region->devices_init = 1; sprintf(daxdev_fmt, "dax%d.", region->id); - if (asprintf(®ion_path, "%s/dax", region->region_path) < 0) { - dbg(ctx, "region path alloc fail\n"); - return; + for (i = 0; i < ARRAY_SIZE(dax_subsystems); i++) { + char *region_path; + + if (i == DAX_BUS) + region_path = region->region_path; + else if (i == DAX_CLASS) { + if (asprintf(®ion_path, "%s/dax", + region->region_path) < 0) { + dbg(ctx, "region path alloc fail\n"); + continue; + } + } else + continue; + sysfs_device_parse(ctx, region_path, daxdev_fmt, region, + add_dax_dev); + if (i == DAX_CLASS) + free(region_path); } - sysfs_device_parse(ctx, region_path, daxdev_fmt, region, add_dax_dev); - free(region_path); } -static char *dax_region_path(const char *base, const char *device) +static char *dax_region_path(const char *device, enum dax_subsystem subsys) { char *path, *region_path, *c; - if (asprintf(&path, "%s/%s", base, device) < 0) + if (asprintf(&path, "%s/%s", dax_subsystems[subsys], device) < 0) return NULL; /* dax_region must be the instance's direct parent */ @@ -472,7 +484,11 @@ static char *dax_region_path(const char *base, const char *device) if (!region_path) return NULL; - /* 'region_path' is now regionX/dax/daxX.Y', trim back to regionX */ + /* + * 'region_path' is now regionX/dax/daxX.Y' (DAX_CLASS), or + * regionX/daxX.Y (DAX_BUS), trim it back to the regionX + * component + */ c = strrchr(region_path, '/'); if (!c) { free(region_path); @@ -480,6 +496,9 @@ static char *dax_region_path(const char *base, const char *device) } *c = '\0'; + if (subsys == DAX_BUS) + return region_path; + c = strrchr(region_path, '/'); if (!c) { free(region_path); @@ -490,20 +509,15 @@ static char *dax_region_path(const char *base, const char *device) return region_path; } -static void dax_regions_init(struct daxctl_ctx *ctx) +static void __dax_regions_init(struct daxctl_ctx *ctx, enum dax_subsystem subsys) { - const char *base = "/sys/class/dax"; struct dirent *de; - DIR *dir; + DIR *dir = NULL; - if (ctx->regions_init) - return; - - ctx->regions_init = 1; - - dir = opendir(base); + dir = opendir(dax_subsystems[subsys]); if (!dir) { - dbg(ctx, "no dax regions found\n"); + dbg(ctx, "no dax regions found via: %s\n", + dax_subsystems[subsys]); return; } @@ -516,7 +530,7 @@ static void dax_regions_init(struct daxctl_ctx *ctx) continue; if (sscanf(de->d_name, "dax%d.%d", ®ion_id, &id) != 2) continue; - dev_path = dax_region_path(base, de->d_name); + dev_path = dax_region_path(de->d_name, subsys); if (!dev_path) { err(ctx, "dax region path allocation failure\n"); continue; @@ -529,6 +543,22 @@ static void dax_regions_init(struct daxctl_ctx *ctx) closedir(dir); } +static void dax_regions_init(struct daxctl_ctx *ctx) +{ + size_t i; + + if (ctx->regions_init) + return; + + ctx->regions_init = 1; + + for (i = 0; i < ARRAY_SIZE(dax_subsystems); i++) { + if (i == DAX_UNKNOWN) + continue; + __dax_regions_init(ctx, i); + } +} + DAXCTL_EXPORT struct daxctl_dev *daxctl_dev_get_first(struct daxctl_region *region) { dax_devices_init(region); diff --git a/util/sysfs.c b/util/sysfs.c index 0440fd0f49a3..9f7bc1f4930f 100644 --- a/util/sysfs.c +++ b/util/sysfs.c @@ -91,7 +91,7 @@ int __sysfs_device_parse(struct log_ctx *ctx, const char *base_path, struct dirent *de; DIR *dir; - log_dbg(ctx, "base: %s dev: %s\n", base_path, dev_name); + log_dbg(ctx, "base: '%s' dev: '%s'\n", base_path, dev_name); dir = opendir(base_path); if (!dir) { log_dbg(ctx, "no \"%s\" devices found\n", dev_name); From patchwork Tue Jan 15 04:24:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 10763997 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 604B413B4 for ; Tue, 15 Jan 2019 04:37:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4C4E920564 for ; Tue, 15 Jan 2019 04:37:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3D6592BDD3; Tue, 15 Jan 2019 04:37:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C16A420564 for ; Tue, 15 Jan 2019 04:37:20 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id BAA63211B76AD; Mon, 14 Jan 2019 20:37:20 -0800 (PST) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=192.55.52.93; helo=mga11.intel.com; envelope-from=dan.j.williams@intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 8721F211B63CA for ; Mon, 14 Jan 2019 20:37:19 -0800 (PST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 14 Jan 2019 20:37:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,480,1539673200"; d="scan'208";a="127781172" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.16]) by orsmga001.jf.intel.com with ESMTP; 14 Jan 2019 20:37:18 -0800 Subject: [daxctl PATCH v2 2/2] daxctl: Opt-in to /sys/bus/dax ABI From: Dan Williams To: linux-nvdimm@lists.01.org Date: Mon, 14 Jan 2019 20:24:41 -0800 Message-ID: <154752628167.1710673.10834552344037521034.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <154752627111.1710673.1469336464670065810.stgit@dwillia2-desk3.amr.corp.intel.com> References: <154752627111.1710673.1469336464670065810.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.18-2-gc94f MIME-Version: 1.0 X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP The kernel is implementing a '/sys/bus/dax' ABI to allow for alternate device-DAX drivers to be bound to device instances. While the kernel conversion to '/sys/bus/dax' does not effect the primary ndctl use case of putting namespaces into 'devdax' mode since that uses libnvdimm namespace device relative paths, it does break current implementations of 'ndctl list -X' and 'daxctl list'. It is also known to break fio and some pmdk versions that explicitly reference "/sys/class/dax". In order to avoid userspace regressions the kernel can be configured to maintain '/sys/class/dax' as the default ABI. However, once all '/sys/class/dax' users have been converted, or removed from the installation, an administrator can opt-in to the new '/sys/bus/dax' ABI. The 'dax migrate-device-model' command installs a modprobe rule to blacklist the dax_pmem_compat module and arrange for the dax_pmem module to auto-load in response to the detection of device-DAX instances emitted from the libnvdimm subsystem. Reported-by: Jeff Moyer Signed-off-by: Dan Williams --- .gitignore | 1 Documentation/daxctl/Makefile.am | 3 + .../daxctl/daxctl-migrate-device-model.txt | 47 ++++++++++++++++++++ configure.ac | 5 ++ daxctl/Makefile.am | 10 ++++ daxctl/builtin.h | 1 daxctl/daxctl.c | 1 daxctl/lib/Makefile.am | 2 + daxctl/lib/daxctl.conf | 2 + daxctl/migrate.c | 41 +++++++++++++++++ ndctl.spec.in | 1 11 files changed, 113 insertions(+), 1 deletion(-) create mode 100644 Documentation/daxctl/daxctl-migrate-device-model.txt create mode 100644 daxctl/lib/daxctl.conf create mode 100644 daxctl/migrate.c diff --git a/.gitignore b/.gitignore index 5a3d8e4507e5..3ef9ff7a9a2f 100644 --- a/.gitignore +++ b/.gitignore @@ -18,6 +18,7 @@ Documentation/ndctl/asciidoc.conf Documentation/daxctl/asciidoctor-extensions.rb Documentation/ndctl/asciidoctor-extensions.rb .dirstamp +daxctl/config.h daxctl/daxctl daxctl/lib/libdaxctl.la daxctl/lib/libdaxctl.lo diff --git a/Documentation/daxctl/Makefile.am b/Documentation/daxctl/Makefile.am index fc0fbe138d93..6aba035ff543 100644 --- a/Documentation/daxctl/Makefile.am +++ b/Documentation/daxctl/Makefile.am @@ -27,7 +27,8 @@ endif man1_MANS = \ daxctl.1 \ - daxctl-list.1 + daxctl-list.1 \ + daxctl-migrate-device-model.1 CLEANFILES = $(man1_MANS) diff --git a/Documentation/daxctl/daxctl-migrate-device-model.txt b/Documentation/daxctl/daxctl-migrate-device-model.txt new file mode 100644 index 000000000000..23e89f1afafc --- /dev/null +++ b/Documentation/daxctl/daxctl-migrate-device-model.txt @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 + +daxctl-migrate-device-model(1) +============================== + +NAME +---- +daxctl-migrate-device-model - Opt-in to the /sys/bus/dax device-model, +allow for alternative Device-DAX instance drivers. + +SYNOPSIS +-------- +[verse] +'daxctl migrate-device-model' + +Arrange for modprobe to disable the dax_pmem_compat, if present, and +instead deploy the dax_pmem module to convert to the /sys/bus/dax model. +Kernel versions prior to v5.1 may not support /sys/bus/dax in which case +the result of this command is a nop until the kernel is updated. The +motivation for changing from /sys/class/dax to /sys/bus/dax is to allow +for alternative drivers for Device-DAX instances, in particular the +dax_kmem driver. + +By default device-dax publishes a /dev/daxX.Y character device for +userspace to directly map performance differentiated memory. This is +fine if the memory is to be exclusively consumed / managed by userspace. +Alternatively an administrator may want the kernel to manage the memory, +make it available via malloc(), allow for over-provisioning, and / or +apply kernel-based resource control schemes to the memory. In that case +the memory fronted by a given Device-DAX instance can be assigned to the +dax_kmem driver which arranges for the core-kernel memory-management +sub-system to assume management of the memory range. + +This behavior is opt-in for consideration of existing applications / +scripts that may be hard coded to use /sys/class/dax. Fixes have been +submitted to applications known to have these direct dependencies +http://git.kernel.dk/cgit/fio/commit/?id=b08e7d6b18b4[FIO] +https://github.com/pmem/pmdk/commit/91bc8620884e[PMDK], however, there may +be others and a system-owner should be aware of the potential for +regression of Device-DAX consuming scripts, applications, or older +daxctl binaries. + +The modprobe policy established by this utility becomes effective after +the next reboot, or after all DAX related modules have been removed and +re-loaded with "udevadm trigger" + +include::../copyright.txt[] diff --git a/configure.ac b/configure.ac index a02a2d80e1d5..5b4f1fc8d346 100644 --- a/configure.ac +++ b/configure.ac @@ -159,6 +159,11 @@ ndctl_monitorconf=monitor.conf AC_SUBST([ndctl_monitorconfdir]) AC_SUBST([ndctl_monitorconf]) +daxctl_modprobe_datadir=${datadir}/daxctl +daxctl_modprobe_data=daxctl.conf +AC_SUBST([daxctl_modprobe_datadir]) +AC_SUBST([daxctl_modprobe_data]) + my_CFLAGS="\ -Wall \ -Wchar-subscripts \ diff --git a/daxctl/Makefile.am b/daxctl/Makefile.am index fe467d030c38..e424d435c22b 100644 --- a/daxctl/Makefile.am +++ b/daxctl/Makefile.am @@ -2,9 +2,19 @@ include $(top_srcdir)/Makefile.am.in bin_PROGRAMS = daxctl +DISTCLEANFILES = config.h +BUILT_SOURCES = config.h +config.h: Makefile + $(AM_V_GEN) echo "/* Autogenerated by daxctl/Makefile.am */" >$@ + $(AM_V_GEN) echo '#define DAXCTL_MODPROBE_DATA \ + "$(daxctl_modprobe_datadir)/$(daxctl_modprobe_data)"' >>$@ + $(AM_V_GEN) echo '#define DAXCTL_MODPROBE_INSTALL \ + "$(sysconfdir)/modprobe.d/$(daxctl_modprobe_data)"' >>$@ + daxctl_SOURCES =\ daxctl.c \ list.c \ + migrate.c \ ../util/json.c daxctl_LDADD =\ diff --git a/daxctl/builtin.h b/daxctl/builtin.h index dae2615b7ddb..00ef5e930417 100644 --- a/daxctl/builtin.h +++ b/daxctl/builtin.h @@ -5,4 +5,5 @@ struct daxctl_ctx; int cmd_list(int argc, const char **argv, struct daxctl_ctx *ctx); +int cmd_migrate(int argc, const char **argv, struct daxctl_ctx *ctx); #endif /* _DAXCTL_BUILTIN_H_ */ diff --git a/daxctl/daxctl.c b/daxctl/daxctl.c index cb6f50e39170..2e41747b5ea9 100644 --- a/daxctl/daxctl.c +++ b/daxctl/daxctl.c @@ -70,6 +70,7 @@ static struct cmd_struct commands[] = { { "version", .d_fn = cmd_version }, { "list", .d_fn = cmd_list }, { "help", .d_fn = cmd_help }, + { "migrate-device-model", .d_fn = cmd_migrate }, }; int main(int argc, const char **argv) diff --git a/daxctl/lib/Makefile.am b/daxctl/lib/Makefile.am index 0167e3995b00..d3d4852916ca 100644 --- a/daxctl/lib/Makefile.am +++ b/daxctl/lib/Makefile.am @@ -18,6 +18,8 @@ libdaxctl_la_SOURCES =\ libdaxctl_la_LIBADD =\ $(UUID_LIBS) +daxctl_modprobe_data_DATA = daxctl.conf + EXTRA_DIST += libdaxctl.sym libdaxctl_la_LDFLAGS = $(AM_LDFLAGS) \ diff --git a/daxctl/lib/daxctl.conf b/daxctl/lib/daxctl.conf new file mode 100644 index 000000000000..c64a088cbc0b --- /dev/null +++ b/daxctl/lib/daxctl.conf @@ -0,0 +1,2 @@ +blacklist dax_pmem_compat +alias nd:t7* dax_pmem diff --git a/daxctl/migrate.c b/daxctl/migrate.c new file mode 100644 index 000000000000..f7bf8347c4ea --- /dev/null +++ b/daxctl/migrate.c @@ -0,0 +1,41 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* Copyright(c) 2015-2018 Intel Corporation. All rights reserved. */ +#include +#include +#include +#include +#include +#include +#include +#include +#include + +int cmd_migrate(int argc, const char **argv, struct daxctl_ctx *ctx) +{ + int i; + static const struct option options[] = { + OPT_END(), + }; + const char * const u[] = { + "daxctl migrate-device-model", + NULL + }; + + argc = parse_options(argc, argv, options, u, 0); + for (i = 0; i < argc; i++) + error("unknown parameter \"%s\"\n", argv[i]); + + if (argc) + usage_with_options(u, options); + + if (symlink(DAXCTL_MODPROBE_DATA, DAXCTL_MODPROBE_INSTALL) == 0) { + fprintf(stderr, " Success: installed %s\n", + DAXCTL_MODPROBE_INSTALL); + return EXIT_SUCCESS; + } + + error("failed to install %s: %s\n", DAXCTL_MODPROBE_INSTALL, + strerror(errno)); + + return EXIT_FAILURE; +} diff --git a/ndctl.spec.in b/ndctl.spec.in index bc4d65c1f988..bc65a471a6d2 100644 --- a/ndctl.spec.in +++ b/ndctl.spec.in @@ -126,6 +126,7 @@ make check %license util/COPYING licenses/BSD-MIT licenses/CC0 %{_bindir}/daxctl %{_mandir}/man1/daxctl* +%{_datadir}/daxctl/daxctl.conf %files -n LNAME %defattr(-,root,root)