
[v5,09/12] nfit/libnvdimm: add support for issue secure erase DSM to Intel nvdimm

Message ID 153186089522.27463.4537738384176593789.stgit@djiang5-desk3.ch.intel.com (mailing list archive)
State New, archived

Commit Message

Dave Jiang July 17, 2018, 8:54 p.m. UTC
Add support to issue a secure erase DSM to the Intel nvdimm. The
required passphrase is acquired from userspace through the kernel key
management facility. To trigger the action, "erase" is written to the
"security" sysfs attribute. libnvdimm will support the generic erase
API call.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/intel.c  |   55 ++++++++++++++++++++++++++++++++++++++++++++
 drivers/nvdimm/dimm_devs.c |   47 ++++++++++++++++++++++++++++++++++++++
 include/linux/libnvdimm.h  |    2 ++
 3 files changed, 104 insertions(+)
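
For illustration, a minimal user-space sketch of triggering the erase once the
passphrase key has been loaded into the kernel keyring (the nmem0 path below is
only an example; substitute the DIMM device being erased):

/* hypothetical example: write "erase" to the DIMM's security attribute */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	const char *attr = "/sys/bus/nd/devices/nmem0/security";  /* assumed path */
	int fd = open(attr, O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (write(fd, "erase", strlen("erase")) < 0) {
		perror("write");
		close(fd);
		return 1;
	}
	close(fd);
	return 0;
}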

Comments

Elliott, Robert (Servers) July 18, 2018, 5:27 p.m. UTC | #1
> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On Behalf Of Dave Jiang
> Sent: Tuesday, July 17, 2018 3:55 PM
> Subject: [PATCH v5 09/12] nfit/libnvdimm: add support for issue secure erase DSM to Intel nvdimm
...
> +static int intel_dimm_security_erase(struct nvdimm_bus *nvdimm_bus,
> +		struct nvdimm *nvdimm, struct nvdimm_key_data *nkey)
...
> +	/* DIMM unlocked, invalidate all CPU caches before we read it */
> +	wbinvd_on_all_cpus();

For this function, that comment should use "erased" rather than
"unlocked".

For both this function and intel_dimm_security_unlock() in patch 04/12,
could the driver do a loop of clflushopts on one CPU via
clflush_cache_range() rather than run wbinvd on all CPUs?
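
As a rough sketch of the idea (not from the patch; the helper name and the
start/size parameters are placeholders, and it assumes the driver can get a
kernel mapping of the DIMM's persistent memory range):

/*
 * Flush only the affected range from one CPU via clflushopt
 * (clflush_cache_range()) instead of running wbinvd on every CPU.
 */
static void intel_dimm_flush_range(phys_addr_t start, resource_size_t size)
{
	void *addr = memremap(start, size, MEMREMAP_WB);
	resource_size_t off;

	if (!addr)
		return;

	/* clflush_cache_range() takes an unsigned int size, so walk in chunks */
	for (off = 0; off < size; off += SZ_1G)
		clflush_cache_range(addr + off,
				min_t(resource_size_t, SZ_1G, size - off));

	memunmap(addr);
}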

---
Robert Elliott, HPE Persistent Memory
Dave Jiang July 18, 2018, 5:41 p.m. UTC | #2
On 07/18/2018 10:27 AM, Elliott, Robert (Persistent Memory) wrote:
> 
> 
>> -----Original Message-----
>> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On Behalf Of Dave Jiang
>> Sent: Tuesday, July 17, 2018 3:55 PM
>> Subject: [PATCH v5 09/12] nfit/libnvdimm: add support for issue secure erase DSM to Intel nvdimm
> ...
>  +static int intel_dimm_security_erase(struct nvdimm_bus *nvdimm_bus,
>> +struct nvdimm *nvdimm, struct nvdimm_key_data *nkey)
> ...
>> +/* DIMM unlocked, invalidate all CPU caches before we read it */
>> +wbinvd_on_all_cpus();
> 
> For this function, that comment should use "erased" rather than
> "unlocked".
> 
> For both this function and intel_dimm_security_unlock() in patch 04/12,
> could the driver do a loop of clflushopts on one CPU via
> clflush_cache_range() rather than run wbinvd on all CPUs?

The loop should work, but wbinvd is going to have less overall impact
on performance for really huge ranges. Also, unlock should happen only
once, during NVDIMM initialization. So wbinvd should be ok.

BTW thanks for looking over the patches.

> 
> ---
> Robert Elliott, HPE Persistent Memory
> 
Elliott, Robert (Servers) July 19, 2018, 1:43 a.m. UTC | #3
> -----Original Message-----
> From: Dave Jiang <dave.jiang@intel.com>
> Sent: Wednesday, July 18, 2018 12:41 PM
> To: Elliott, Robert (Persistent Memory) <elliott@hpe.com>; Williams,
> Dan J <dan.j.williams@intel.com>
> Cc: dhowells@redhat.com; Schofield, Alison
> <alison.schofield@intel.com>; keyrings@vger.kernel.org;
> keescook@chromium.org; linux-nvdimm@lists.01.org
> Subject: Re: [PATCH v5 09/12] nfit/libnvdimm: add support for issue
> secure erase DSM to Intel nvdimm
> 
> 
> 
> On 07/18/2018 10:27 AM, Elliott, Robert (Persistent Memory) wrote:
> >
> >
> >> -----Original Message-----
> >> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On
> Behalf Of Dave Jiang
> >> Sent: Tuesday, July 17, 2018 3:55 PM
> >> Subject: [PATCH v5 09/12] nfit/libnvdimm: add support for issue
> secure erase DSM to Intel nvdimm
> > ...
> >  +static int intel_dimm_security_erase(struct nvdimm_bus *nvdimm_bus,
> >> +struct nvdimm *nvdimm, struct nvdimm_key_data *nkey)
> > ...
> >> +/* DIMM unlocked, invalidate all CPU caches before we read it */
> >> +wbinvd_on_all_cpus();
> >
> > For this function, that comment should use "erased" rather than
> > "unlocked".
> >
> > For both this function and intel_dimm_security_unlock() in patch
> 04/12,
> > could the driver do a loop of clflushopts on one CPU via
> > clflush_cache_range() rather than run wbinvd on all CPUs?
> 
> The loop should work, but wbinvd is going to be less overall impact
> to the performance for really huge ranges. Also, unlock should happen
> only once and during NVDIMM initialization. So wbinvd should be ok.

Unlike unlock, secure erase could be requested at any time.

wbinvd must run on every physical core on every physical CPU, while
clflushopt flushes everything from just one CPU core.

wbinvd adds huge interrupt latencies, generating complaints like these:
	https://patchwork.kernel.org/patch/37090/
	https://lists.xenproject.org/archives/html/xen-devel/2011-09/msg00675.html

Also, there's no need to disrupt cache content for other addresses;
only the data at the addresses just erased or unlocked is a concern.
clflushopt avoids disrupting other threads.

Related topic: a flush is also necessary before sending the secure erase or
unlock command.  Otherwise, there could be dirty write data that gets
written by the concluding flush (overwriting the now-unlocked or just-erased
data).  For unlock during boot, you might assume that no writes have
occurred yet, but that isn't true for secure erase on demand.  Flushing
before both commands is safest.

---
Robert Elliott, HPE Persistent Memory
Juston Li July 19, 2018, 6:09 a.m. UTC | #4
> -----Original Message-----
> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On Behalf Of
> Elliott, Robert (Persistent Memory)
> Sent: Wednesday, July 18, 2018 6:43 PM
> To: Jiang, Dave <dave.jiang@intel.com>; Williams, Dan J
> <dan.j.williams@intel.com>
> Cc: dhowells@redhat.com; Schofield, Alison <alison.schofield@intel.com>;
> keyrings@vger.kernel.org; keescook@chromium.org; linux-
> nvdimm@lists.01.org
> Subject: RE: [PATCH v5 09/12] nfit/libnvdimm: add support for issue secure erase
> DSM to Intel nvdimm
> 
>
> Related topic: a flush is also necessary before sending the secure erase or unlock
> command.  Otherwise, there could be dirty write data that gets written by the
> concluding flush (overwriting the now-unlocked or just-erased data).  For unlock
> during boot, you might assume that no writes have occurred yet, but that isn't
> true for secure erase on demand.  Flushing before both commands is safest.
> 

I was wondering this too.

Is it handled by the fact that the DIMM must be disabled to do a secure erase?
I'm assuming that means the namespace the DIMM is part of must also be disabled
first? Then no further writes can occur, and provided that dirty write data is
flushed when the namespace is disabled, it should be safe to issue a secure erase.

Thanks
Juston
Dave Jiang July 19, 2018, 8:06 p.m. UTC | #5
On 07/18/2018 06:43 PM, Elliott, Robert (Persistent Memory) wrote:
> 
> 
>> -----Original Message-----
>> From: Dave Jiang <dave.jiang@intel.com>
>> Sent: Wednesday, July 18, 2018 12:41 PM
>> To: Elliott, Robert (Persistent Memory) <elliott@hpe.com>; Williams,
>> Dan J <dan.j.williams@intel.com>
>> Cc: dhowells@redhat.com; Schofield, Alison
>> <alison.schofield@intel.com>; keyrings@vger.kernel.org;
>> keescook@chromium.org; linux-nvdimm@lists.01.org
>> Subject: Re: [PATCH v5 09/12] nfit/libnvdimm: add support for issue
>> secure erase DSM to Intel nvdimm
>>
>>
>>
>> On 07/18/2018 10:27 AM, Elliott, Robert (Persistent Memory) wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Linux-nvdimm [mailto:linux-nvdimm-bounces@lists.01.org] On
>> Behalf Of Dave Jiang
>>>> Sent: Tuesday, July 17, 2018 3:55 PM
>>>> Subject: [PATCH v5 09/12] nfit/libnvdimm: add support for issue
>> secure erase DSM to Intel nvdimm
>>> ...
>>>  +static int intel_dimm_security_erase(struct nvdimm_bus *nvdimm_bus,
>>>> +struct nvdimm *nvdimm, struct nvdimm_key_data *nkey)
>>> ...
>>>> +/* DIMM unlocked, invalidate all CPU caches before we read it */
>>>> +wbinvd_on_all_cpus();
>>>
>>> For this function, that comment should use "erased" rather than
>>> "unlocked".
>>>
>>> For both this function and intel_dimm_security_unlock() in patch
>> 04/12,
>>> could the driver do a loop of clflushopts on one CPU via
>>> clflush_cache_range() rather than run wbinvd on all CPUs?
>>
>> The loop should work, but wbinvd is going to be less overall impact
>> to the performance for really huge ranges. Also, unlock should happen
>> only once and during NVDIMM initialization. So wbinvd should be ok.
> 
> Unlike unlock, secure erase could be requested at any time.
> 
> wbinvd must run on every physical core on every physical CPU, while
> clflushopt flushes everything from just one CPU core.
> 
> wbinvd adds huge interrupt latencies, generating complaints like these:
> https://patchwork.kernel.org/patch/37090/
> https://lists.xenproject.org/archives/html/xen-devel/2011-09/msg00675.html
> 
> Also, there's no need to disrupt cache content for other addresses;
> only the data at the addresses just erased or unlocked is a concern.
> clflushopt avoids disrupting other threads.

Yes, secure erase could be requested at any time, but it is unlikely to
happen frequently. Also, in order to do a secure erase, one must disable
the regions impacted by the dimm and also the dimm itself. More likely
than not, the admin is doing maintenance and not expecting running
workloads (at least not on the pmem). The concern is more that the admin
wants to finish the task quickly than whether there's a performance
impact while the maintenance task is going on.

Also, looping over potentially TB-sized ranges with CLFLUSHOPT may take
a while (many minutes?). Yes, it just flushes cache from one CPU, but it
also causes cross-CPU traffic to maintain coherency, plus KTI traffic
and/or reads from the media to check directory bits. WBINVD is pretty
heavy-handed, but it's the only option we have that doesn't have to plow
through each cache line in the huge range.

> 
> Related topic: a flush is also necessary before sending the secure erase or
> unlock command.  Otherwise, there could be dirty write data that gets
> written by the concluding flush (overwriting the now-unlocked or just-erased
>> data).  For unlock during boot, you might assume that no writes have
> occurred yet, but that isn't true for secure erase on demand.  Flushing
> before both commands is safest.

Yes, I missed that. Thanks for catching it. I'll add the flush before
executing secure erase. It's probably not necessary for unlock, since
there's no data that would be in the CPU cache until the DIMMs are
accessible.
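
For reference, something along these lines on top of the patch below (just a
sketch of the intended change, also picking up the comment wording you noted;
not the actual respin):

 	if (!test_bit(NVDIMM_INTEL_SECURE_ERASE, &nfit_mem->dsm_mask))
 		return -ENOTTY;
 
+	/* flush dirty data out of the CPU caches before erasing the DIMM */
+	wbinvd_on_all_cpus();
+
 	memcpy(nd_cmd.cmd.passphrase, nkey->data, ND_INTEL_PASSPHRASE_SIZE);
...
-	/* DIMM unlocked, invalidate all CPU caches before we read it */
+	/* DIMM erased, invalidate all CPU caches before we read it */
 	wbinvd_on_all_cpus();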

Patch

diff --git a/drivers/acpi/nfit/intel.c b/drivers/acpi/nfit/intel.c
index 0ab56f03ebc4..6449db6db85d 100644
--- a/drivers/acpi/nfit/intel.c
+++ b/drivers/acpi/nfit/intel.c
@@ -18,6 +18,60 @@ 
 #include "intel.h"
 #include "nfit.h"
 
+static int intel_dimm_security_erase(struct nvdimm_bus *nvdimm_bus,
+		struct nvdimm *nvdimm, struct nvdimm_key_data *nkey)
+{
+	struct nvdimm_bus_descriptor *nd_desc = to_nd_desc(nvdimm_bus);
+	int cmd_rc, rc = 0;
+	struct nfit_mem *nfit_mem = nvdimm_provider_data(nvdimm);
+	struct {
+		struct nd_cmd_pkg pkg;
+		struct nd_intel_secure_erase cmd;
+	} nd_cmd = {
+		.pkg = {
+			.nd_command = NVDIMM_INTEL_SECURE_ERASE,
+			.nd_family = NVDIMM_FAMILY_INTEL,
+			.nd_size_in = ND_INTEL_PASSPHRASE_SIZE,
+			.nd_size_out = ND_INTEL_STATUS_SIZE,
+			.nd_fw_size = ND_INTEL_STATUS_SIZE,
+		},
+		.cmd = {
+			.status = 0,
+		},
+	};
+
+	if (!test_bit(NVDIMM_INTEL_SECURE_ERASE, &nfit_mem->dsm_mask))
+		return -ENOTTY;
+
+	memcpy(nd_cmd.cmd.passphrase, nkey->data, ND_INTEL_PASSPHRASE_SIZE);
+	rc = nd_desc->ndctl(nd_desc, nvdimm, ND_CMD_CALL, &nd_cmd,
+			sizeof(nd_cmd), &cmd_rc);
+	if (rc < 0)
+		goto out;
+	if (cmd_rc < 0) {
+		rc = cmd_rc;
+		goto out;
+	}
+
+	switch (nd_cmd.cmd.status) {
+	case 0:
+		break;
+	case ND_INTEL_STATUS_INVALID_PASS:
+		rc = -EINVAL;
+		goto out;
+	case ND_INTEL_STATUS_INVALID_STATE:
+	default:
+		rc = -ENXIO;
+		goto out;
+	}
+
+	/* DIMM unlocked, invalidate all CPU caches before we read it */
+	wbinvd_on_all_cpus();
+
+ out:
+	return rc;
+}
+
 static int intel_dimm_security_freeze_lock(struct nvdimm_bus *nvdimm_bus,
 		struct nvdimm *nvdimm)
 {
@@ -308,4 +362,5 @@  struct nvdimm_security_ops intel_security_ops = {
 	.change_key = intel_dimm_security_update_passphrase,
 	.disable = intel_dimm_security_disable,
 	.freeze_lock = intel_dimm_security_freeze_lock,
+	.erase = intel_dimm_security_erase,
 };
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index 6a653476bb7c..ed56649bc971 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -125,6 +125,51 @@  int nvdimm_security_get_state(struct device *dev)
 			&nvdimm->state);
 }
 
+static int nvdimm_security_erase(struct device *dev)
+{
+	struct nvdimm *nvdimm = to_nvdimm(dev);
+	struct nvdimm_bus *nvdimm_bus = walk_to_nvdimm_bus(dev);
+	struct key *key;
+	void *payload;
+	int rc = 0;
+
+	if (!nvdimm->security_ops)
+		return 0;
+
+	/* lock the device and disallow driver bind */
+	device_lock(dev);
+	/* No driver data means dimm is disabled. Proceed if so. */
+	if (dev_get_drvdata(dev)) {
+		dev_warn(dev, "Unable to secure erase while DIMM active.\n");
+		rc = -EINVAL;
+		goto out;
+	}
+
+	if (nvdimm->state == NVDIMM_SECURITY_UNSUPPORTED)
+		goto out;
+
+	key = nvdimm_search_key(dev);
+	if (!key)
+		key = nvdimm_request_key(dev);
+	if (!key) {
+		rc = -ENXIO;
+		goto out;
+	}
+
+	down_read(&key->sem);
+	payload = key->payload.data[0];
+	rc = nvdimm->security_ops->erase(nvdimm_bus, nvdimm, payload);
+	up_read(&key->sem);
+	/* remove key since secure erase kills the passphrase */
+	key_invalidate(key);
+	key_put(key);
+
+ out:
+	device_unlock(dev);
+	nvdimm_security_get_state(dev);
+	return rc;
+}
+
 static int nvdimm_security_freeze_lock(struct device *dev)
 {
 	struct nvdimm *nvdimm = to_nvdimm(dev);
@@ -694,6 +739,8 @@  static ssize_t security_store(struct device *dev,
 		rc = nvdimm_security_disable(dev);
 	else if (sysfs_streq(buf, "freeze"))
 		rc = nvdimm_security_freeze_lock(dev);
+	else if (sysfs_streq(buf, "erase"))
+		rc = nvdimm_security_erase(dev);
 	else
 		return -EINVAL;
 
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 1836599ed5b8..1ac5acb3c457 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -187,6 +187,8 @@  struct nvdimm_security_ops {
 			struct nvdimm *nvdimm, struct nvdimm_key_data *nkey);
 	int (*freeze_lock)(struct nvdimm_bus *nvdimm_bus,
 			struct nvdimm *nvdimm);
+	int (*erase)(struct nvdimm_bus *nvdimm_bus,
+			struct nvdimm *nvdimm, struct nvdimm_key_data *nkey);
 };
 
 void badrange_init(struct badrange *badrange);