diff mbox series

libnvdimm: Fix __nd_ioctl() to check error in cmd_rc

Message ID 20181107185241.26982-1-toshi.kani@hpe.com (mailing list archive)
State New, archived
Headers show
Series libnvdimm: Fix __nd_ioctl() to check error in cmd_rc | expand

Commit Message

Kani, Toshi Nov. 7, 2018, 6:52 p.m. UTC
ndctl zero-labels completes with a large number of zeroed nmems when
it fails to do zeroing on a protected NVDIMM.

  # ndctl zero-labels nmem1
  zeroed 65504 nmems

When an ACPI call completes with error, xlat_status() called from
acpi_nfit_ctl() sets error to *cmd_rc.  __nd_ioctl(), however, does
not check this error and returns with success.

Fix __nd_ioctl() to check this error in cmd_rc.

Fixes: 006358b35c73a ("libnvdimm: add support for clear poison list and badblocks for device dax")
Reported-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: <stable@vger.kernel.org>
---
 drivers/nvdimm/bus.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Dan Williams Nov. 7, 2018, 7:34 p.m. UTC | #1
On Wed, Nov 7, 2018 at 10:52 AM Toshi Kani <toshi.kani@hpe.com> wrote:
>
> ndctl zero-labels completes with a large number of zeroed nmems when
> it fails to do zeroing on a protected NVDIMM.
>
>   # ndctl zero-labels nmem1
>   zeroed 65504 nmems
>
> When an ACPI call completes with error, xlat_status() called from
> acpi_nfit_ctl() sets error to *cmd_rc.  __nd_ioctl(), however, does
> not check this error and returns with success.
>
> Fix __nd_ioctl() to check this error in cmd_rc.

So this arrangement is by design and the bug is in the ndctl utility.

A successful return code from the ioctl means that the command was
successfully submitted to firmware. It's then up to userspace to parse
if there was a command specific error returned in the response
payload. Automatically returning cmd_rc removes the ability for
userspace tooling to do its own command specific error handling. With
this change userspace could no longer be sure if the failure is in the
submission or the execution of the command, or determine if the
command response payload is valid.
Kani, Toshi Nov. 7, 2018, 9:26 p.m. UTC | #2
On Wed, 2018-11-07 at 11:34 -0800, Dan Williams wrote:
> On Wed, Nov 7, 2018 at 10:52 AM Toshi Kani <toshi.kani@hpe.com> wrote:
> > 
> > ndctl zero-labels completes with a large number of zeroed nmems when
> > it fails to do zeroing on a protected NVDIMM.
> > 
> >   # ndctl zero-labels nmem1
> >   zeroed 65504 nmems
> > 
> > When an ACPI call completes with error, xlat_status() called from
> > acpi_nfit_ctl() sets error to *cmd_rc.  __nd_ioctl(), however, does
> > not check this error and returns with success.
> > 
> > Fix __nd_ioctl() to check this error in cmd_rc.
> 
> So this arrangement is by design and the bug is in the ndctl utility.
> 
> A successful return code from the ioctl means that the command was
> successfully submitted to firmware. It's then up to userspace to parse
> if there was a command specific error returned in the response
> payload. Automatically returning cmd_rc removes the ability for
> userspace tooling to do its own command specific error handling. With
> this change userspace could no longer be sure if the failure is in the
> submission or the execution of the command, or determine if the
> command response payload is valid.

I see.  I was wondering which side needs to be fixed, and decided to
follow kernel-internal ACPI calls like nvdimm_clear_poison().  I agree
that a command error code is necessary if user space tool needs to deal
with it.  OK, I will look into fixing ndctl.

Thanks,
-Toshi
diff mbox series

Patch

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index f1fb39921236..af12817d8a02 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -1050,6 +1050,10 @@  static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 	rc = nd_desc->ndctl(nd_desc, nvdimm, cmd, buf, buf_len, &cmd_rc);
 	if (rc < 0)
 		goto out_unlock;
+	if (cmd_rc < 0) {
+		rc = cmd_rc;
+		goto out_unlock;
+	}
 
 	if (!nvdimm && cmd == ND_CMD_CLEAR_ERROR && cmd_rc >= 0) {
 		struct nd_cmd_clear_error *clear_err = buf;