
[1/2,v3] tpm: cmd_ready command can be issued only after granting locality

Message ID 20180214134319.4400-2-tomas.winkler@intel.com (mailing list archive)
State New, archived

Commit Message

Winkler, Tomas Feb. 14, 2018, 1:43 p.m. UTC
The correct sequence is to request locality first and only then
perform the cmd_ready handshake; otherwise the hardware drops the
subsequent message because, from the device's point of view, the
cmd_ready handshake was never performed. Symmetrically, locality may be
relinquished only after the go_idle handshake has completed. This
requires go_idle to poll for completion, and locality relinquish
must likewise poll for completion so that it is not overridden
in a back-to-back command flow.

The issue is only visible on devices that support multiple localities.
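
The ordering described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: struct fake_tpm and
all fake_* / transmit_* names are hypothetical and do not exist in the
driver; the model merely assumes that the device ignores cmd_ready unless a
locality has already been granted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a multi-locality device; not a driver type. */
struct fake_tpm {
	bool locality_granted;
	bool ready;
};

static void fake_request_locality(struct fake_tpm *t)
{
	t->locality_granted = true;
}

/* Assumption: cmd_ready is silently ignored without a granted locality. */
static void fake_cmd_ready(struct fake_tpm *t)
{
	if (t->locality_granted)
		t->ready = true;
}

/* A command is dropped (-1) unless the cmd_ready handshake took effect. */
static int fake_send(const struct fake_tpm *t)
{
	return t->ready ? 0 : -1;
}

static void fake_go_idle(struct fake_tpm *t)
{
	t->ready = false;
}

static void fake_relinquish_locality(struct fake_tpm *t)
{
	t->locality_granted = false;
}

/* Order from the commit message: locality, cmd_ready, ..., go_idle, relinquish. */
static int transmit_correct_order(struct fake_tpm *t)
{
	int rc;

	assert(t != NULL);
	fake_request_locality(t);
	fake_cmd_ready(t);
	rc = fake_send(t);
	fake_go_idle(t);
	fake_relinquish_locality(t);
	return rc;
}

/* Order before the fix: cmd_ready is issued before locality is granted. */
static int transmit_wrong_order(struct fake_tpm *t)
{
	int rc;

	assert(t != NULL);
	fake_cmd_ready(t);		/* ignored by the model */
	fake_request_locality(t);
	rc = fake_send(t);
	fake_go_idle(t);
	fake_relinquish_locality(t);
	return rc;
}
```

In this model transmit_correct_order() succeeds, while
transmit_wrong_order() fails because the command is sent to a device that
never entered the ready state.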

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
---
V2: poll for locality relinquish completion
V3: 1. Print error message upon locality relinquish failure
    2. Don't override rc code on error path with locality relinquish
       return value.

 drivers/char/tpm/tpm-interface.c |  21 +++++---
 drivers/char/tpm/tpm_crb.c       | 108 +++++++++++++++++++++++++++------------
 drivers/char/tpm/tpm_tis_core.c  |   4 +-
 include/linux/tpm.h              |   2 +-
 4 files changed, 94 insertions(+), 41 deletions(-)

Comments

Jarkko Sakkinen Feb. 19, 2018, 11:27 a.m. UTC | #1
On Wed, Feb 14, 2018 at 03:43:18PM +0200, Tomas Winkler wrote:
>  	if (need_locality && chip->ops->relinquish_locality) {
> -		chip->ops->relinquish_locality(chip, chip->locality);
> +		/* this coud be on error path, don't override error code */
> +		int l_rc = chip->ops->relinquish_locality(chip, chip->locality);

All local variable declarations must be in the beginning of the
function.

> +
> +		if (l_rc) {
> +			dev_err(&chip->dev, "%s: relinquish_locality: error %d\n",
> +				__func__, l_rc);
> +			rc = l_rc;
> +		}

Your comment about not overriding error code is incorrect.

The value of 'rc' should never be overridden, which supports keeping
the "just print" behavior that we had for a locality error.

Is your fix somehow dependent on changing relinquish_locality()
behavior? If not, please remove this change. If you want to contribute
such behavioral change, you should make a separate patch of it.

Now it's like a trojan horse bundled inside a bug fix.

/Jarkko
Winkler, Tomas Feb. 19, 2018, 11:43 a.m. UTC | #2
> 
> On Wed, Feb 14, 2018 at 03:43:18PM +0200, Tomas Winkler wrote:
> >  	if (need_locality && chip->ops->relinquish_locality) {
> > -		chip->ops->relinquish_locality(chip, chip->locality);
> > +		/* this coud be on error path, don't override error code */
> > +		int l_rc = chip->ops->relinquish_locality(chip, chip->locality);
> 
> All local variable declarations must be in the beginning of the function.

Who says?


> 
> > +
> > +		if (l_rc) {
> > +			dev_err(&chip->dev, "%s: relinquish_locality: error %d\n",
> > +				__func__, l_rc);
> > +			rc = l_rc;
> > +		}
> 
> Your comment about not overriding error code is incorrect. 

Please explain? 

> The value of 'rc' should be never overridden, which kind of supports to "just
> print" behavior that we had for a locality error.

You are not consistent; you agreed to propagate it to user space.
The error is propagated when locality relinquish fails; in that case
the device is pretty much in a non-functional state and previous errors do not matter much.
The rc value is not modified if relinquish_locality succeeds.

> Is your fix somehow dependent on changing relinquish_locality() behavior? If
> not, please remove this change. If you want to contribute such behavioral
> change, you should make a separate patch of it.

The issue is structural; this is required only because the locality relinquish sits inside the error-path handling.

> Now it's like a trojan horse bundled inside a bug fix.

Not sure I understand your metaphor.
Please review again.

Thanks
Tomas
Jarkko Sakkinen Feb. 20, 2018, 2:12 p.m. UTC | #3
On Mon, 2018-02-19 at 13:27 +0200, Jarkko Sakkinen wrote:
> On Wed, Feb 14, 2018 at 03:43:18PM +0200, Tomas Winkler wrote:
> >  	if (need_locality && chip->ops->relinquish_locality) {
> > -		chip->ops->relinquish_locality(chip, chip->locality);
> > +		/* this coud be on error path, don't override error code */
> > +		int l_rc = chip->ops->relinquish_locality(chip, chip->locality);
> 
> All local variable declarations must be in the beginning of the
> function.
> 
> > +
> > +		if (l_rc) {
> > +			dev_err(&chip->dev, "%s: relinquish_locality: error %d\n",
> > +				__func__, l_rc);
> > +			rc = l_rc;
> > +		}
> 
> Your comment about not overriding error code is incorrect.
> 
> The value of 'rc' should be never overridden, which kind of supports
> to "just print" behavior that we had for a locality error.
> 
> Is your fix somehow dependent on changing relinquish_locality()
> behavior? If not, please remove this change. If you want to contribute
> such behavioral change, you should make a separate patch of it.
> 
> Now it's like a trojan horse bundled inside a bug fix.

Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

[And while doing this noticed a flaw in my test suite:
https://github.com/jsakkine-intel/tpm2-scripts/issues/3]

/Jarkko
Jarkko Sakkinen Feb. 20, 2018, 2:57 p.m. UTC | #4
On Mon, 2018-02-19 at 11:43 +0000, Winkler, Tomas wrote:
> > All local variable declarations must be in the beginning of the
> > function.
>
> Who says?

It is consistent with how we do everything else.

It is much easier to see the stack allocation when variables are
declared only at the beginning of each function. If you really
need such a pattern, it would be a better idea to consider an
additional helper function.

> > Your comment about not overriding error code is incorrect.
>
> Please explain?

'l_rc' overrides 'rc' in the case when both are non-zero.
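
The two policies under discussion can be written out side by side. This is
a minimal sketch, not driver code: the helper names merge_rc_override,
merge_rc_print_only and demo_rc_policies are hypothetical, and fprintf
stands in for dev_err.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* v3 behavior: a relinquish failure replaces whatever 'rc' held before. */
static int merge_rc_override(int rc, int l_rc)
{
	return l_rc ? l_rc : rc;
}

/* "Just print" behavior: report the relinquish failure, never touch 'rc'. */
static int merge_rc_print_only(int rc, int l_rc)
{
	if (l_rc)
		fprintf(stderr, "relinquish_locality: error %d\n", l_rc);
	return rc;
}

/* Walk through the cases; the interesting one is when both are non-zero. */
void demo_rc_policies(void)
{
	/* Both failed: override loses the original -EIO, print-only keeps it. */
	assert(merge_rc_override(-EIO, -ETIME) == -ETIME);
	assert(merge_rc_print_only(-EIO, -ETIME) == -EIO);

	/* Only relinquish failed: print-only hides the failure from the caller. */
	assert(merge_rc_override(0, -ETIME) == -ETIME);
	assert(merge_rc_print_only(0, -ETIME) == 0);

	/* Relinquish succeeded: both policies leave 'rc' untouched. */
	assert(merge_rc_override(-EIO, 0) == -EIO);
	assert(merge_rc_print_only(-EIO, 0) == -EIO);
}
```

The sketch only makes the trade-off explicit: one policy drops the first
error, the other drops the relinquish error from the return value.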

> > The value of 'rc' should be never overridden, which kind of
> > supports to "just
> > print" behavior that we had for a locality error.
>
> You are not consistent; you agreed to propagate it to user
> space. The error is propagated when locality relinquish fails;
> in that case the device is pretty much in a non-functional
> state and previous errors do not matter much. The rc value is
> not modified if relinquish_locality succeeds.

Well, sometimes you fail to notice things and I failed to notice the
collision above. The commit message does not describe why 'l_rc'
overrides 'rc' in the case when both are non-zero. What was the
reasoning, which made you end up with this priority order?  Why is
'l_rc' more important than 'rc'?

My question is: does it really make sense to have this change as part
of a high-priority bug fix that should be as localized as possible?
It seems like a non-trivial problem by itself.

/Jarkko
Winkler, Tomas Feb. 20, 2018, 8:26 p.m. UTC | #5
> On Mon, 2018-02-19 at 11:43 +0000, Winkler, Tomas wrote:
> > > All local variable declarations must be in the beginning of the
> > > function.
> >
> > Who says?
>
> It is coherent how we have everything else.

Then I will have to care about its value outside the only scope where the variable is relevant.

> It is much easier to see the stack allocation this way when the allocation is
> only done in the beginning of each function. If you really need to do such
> pattern, then it would be a better idea to consider an additional helper
> function.

The code block decides whether to modify 'rc'. I'm not sure an additional function would make
the code cleaner; on the contrary.

> > > Your comment about not overriding error code is incorrect.
> >
> > Please explain?
>
> 'l_rc' overrides 'rc' in the case when both are non-zero.

Yes, that was the intention; we cannot return more than one value.
If l_rc is set, it has higher priority.

> > > The value of 'rc' should be never overridden, which kind of supports
> > > to "just print" behavior that we had for a locality error.
> >
> > You are not consistent; you agreed to propagate it to user
> > space. The error is propagated when locality relinquish fails;
> > in that case the device is pretty much in a non-functional state and
> > previous errors do not matter much. The rc value is not modified if
> > relinquish_locality succeeds.
>
> Well, sometimes you fail to notice things and I failed to notice the collision
> above. The commit message does not describe why 'l_rc'
> overrides 'rc' in the case when both are non-zero. What was the reasoning,
> which made you end up with this priority order?  Why is 'l_rc' more
> important than 'rc'?

Because it's fatal. I'm not sure it matters much what the previous error was; it cannot be recovered.
That's my understanding of this flow.

> My take is that does it really make sense have this change as part of a high
> priority bug fix that should be as localized as possible?
> Seems like a non-trivial problem by itself.

Yes, the issue here is that an error path can also fail. Now, what is the correct return value?

In any case, to resolve this dispute, I will post a version where the error is just printed.
First, however fatal the error is, it is very unlikely to happen.
Second, the driver will find the device not responding on a subsequent command.

Not perfect, but at least we will have a functional driver.

Thanks
Tomas
Jarkko Sakkinen Feb. 20, 2018, 11:03 p.m. UTC | #6
On Tue, Feb 20, 2018 at 08:26:45PM +0000, Winkler, Tomas wrote:
> > 
> > On Mon, 2018-02-19 at 11:43 +0000, Winkler, Tomas wrote:
> > > > All local variable declarations must be in the beginning of the
> > > > function.
> > >
> > > Who says?
> > 
> > It is coherent how we have everything else.
> Then I will have to care about its value outside the only scope where the variable is relevant.
> 
> > It is much easier to see the stack allocation this way when the allocation is
> > only done in the beginning of each function. If you really need to do such
> > pattern, then it would be a better idea to consider an additional helper
> > function.
> The code block decides whether to modify 'rc'. I'm not sure an additional function would make
> the code cleaner; on the contrary.
> > 
> > > > Your comment about not overriding error code is incorrect.
> > >
> > > Please explain?
> > 
> > 'l_rc' overrides 'rc' in the case when both are non-zero.
> 
> Yes, that was the intention; we cannot return more than one value.
> If l_rc is set, it has higher priority.
> 
> > 
> > > > The value of 'rc' should be never overridden, which kind of supports
> > > > to "just print" behavior that we had for a locality error.
> > >
> > > You are not consistent; you agreed to propagate it to user
> > > space. The error is propagated when locality relinquish fails;
> > > in that case the device is pretty much in a non-functional state and
> > > previous errors do not matter much. The rc value is not modified if
> > > relinquish_locality succeeds.
> > 
> > Well, sometimes you fail to notice things and I failed to notice the collision
> > above. The commit message does not describe why 'l_rc'
> > overrides 'rc' in the case when both are non-zero. What was the reasoning,
> > which made you end up with this priority order?  Why is 'l_rc' more
> > important than 'rc'?
> 
> Because it's fatal. I'm not sure it matters much what the previous error was; it cannot be recovered.
> That's my understanding of this flow.
> 
>  
> > My take is that does it really make sense have this change as part of a high
> > priority bug fix that should be as localized as possible?
> > Seems like a non-trivial problem by itself.
> 
> Yes, the issue here is that an error path can also fail. Now, what is the correct return value?
> 
> In any case, to resolve this dispute, I will post a version where the error is just printed.
> First, however fatal the error is, it is very unlikely to happen.
> Second, the driver will find the device not responding on a subsequent command.
> 
> Not perfect, but at least we will have a functional driver.
> 
> Thanks
> Tomas
> 

Please add my Tested-by to the next version. Thanks.

/Jarkko

Patch

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 9e80a953d693..f47b29c1a963 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -422,8 +422,6 @@  ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
 	if (!(flags & TPM_TRANSMIT_UNLOCKED))
 		mutex_lock(&chip->tpm_mutex);
 
-	if (chip->dev.parent)
-		pm_runtime_get_sync(chip->dev.parent);
 
 	if (chip->ops->clk_enable != NULL)
 		chip->ops->clk_enable(chip, true);
@@ -439,6 +437,9 @@  ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
 		chip->locality = rc;
 	}
 
+	if (chip->dev.parent)
+		pm_runtime_get_sync(chip->dev.parent);
+
 	rc = tpm2_prepare_space(chip, space, ordinal, buf);
 	if (rc)
 		goto out;
@@ -499,17 +500,25 @@  ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
 	rc = tpm2_commit_space(chip, space, ordinal, buf, &len);
 
 out:
+	if (chip->dev.parent)
+		pm_runtime_put_sync(chip->dev.parent);
+
 	if (need_locality && chip->ops->relinquish_locality) {
-		chip->ops->relinquish_locality(chip, chip->locality);
+		/* this coud be on error path, don't override error code */
+		int l_rc = chip->ops->relinquish_locality(chip, chip->locality);
+
+		if (l_rc) {
+			dev_err(&chip->dev, "%s: relinquish_locality: error %d\n",
+				__func__, l_rc);
+			rc = l_rc;
+		}
 		chip->locality = -1;
 	}
+
 out_no_locality:
 	if (chip->ops->clk_enable != NULL)
 		chip->ops->clk_enable(chip, false);
 
-	if (chip->dev.parent)
-		pm_runtime_put_sync(chip->dev.parent);
-
 	if (!(flags & TPM_TRANSMIT_UNLOCKED))
 		mutex_unlock(&chip->tpm_mutex);
 	return rc ? rc : len;
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index 7b3c2a8aa9de..497edd9848cd 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -112,6 +112,25 @@  struct tpm2_crb_smc {
 	u32 smc_func_id;
 };
 
+static bool crb_wait_for_reg_32(u32 __iomem *reg, u32 mask, u32 value,
+				unsigned long timeout)
+{
+	ktime_t start;
+	ktime_t stop;
+
+	start = ktime_get();
+	stop = ktime_add(start, ms_to_ktime(timeout));
+
+	do {
+		if ((ioread32(reg) & mask) == value)
+			return true;
+
+		usleep_range(50, 100);
+	} while (ktime_before(ktime_get(), stop));
+
+	return ((ioread32(reg) & mask) == value);
+}
+
 /**
  * crb_go_idle - request tpm crb device to go the idle state
  *
@@ -128,7 +147,7 @@  struct tpm2_crb_smc {
  *
  * Return: 0 always
  */
-static int __maybe_unused crb_go_idle(struct device *dev, struct crb_priv *priv)
+static int crb_go_idle(struct device *dev, struct crb_priv *priv)
 {
 	if ((priv->sm == ACPI_TPM2_START_METHOD) ||
 	    (priv->sm == ACPI_TPM2_COMMAND_BUFFER_WITH_START_METHOD) ||
@@ -136,30 +155,17 @@  static int __maybe_unused crb_go_idle(struct device *dev, struct crb_priv *priv)
 		return 0;
 
 	iowrite32(CRB_CTRL_REQ_GO_IDLE, &priv->regs_t->ctrl_req);
-	/* we don't really care when this settles */
 
+	if (!crb_wait_for_reg_32(&priv->regs_t->ctrl_req,
+				 CRB_CTRL_REQ_GO_IDLE/* mask */,
+				 0, /* value */
+				 TPM2_TIMEOUT_C)) {
+		dev_warn(dev, "goIdle timed out\n");
+		return -ETIME;
+	}
 	return 0;
 }
 
-static bool crb_wait_for_reg_32(u32 __iomem *reg, u32 mask, u32 value,
-				unsigned long timeout)
-{
-	ktime_t start;
-	ktime_t stop;
-
-	start = ktime_get();
-	stop = ktime_add(start, ms_to_ktime(timeout));
-
-	do {
-		if ((ioread32(reg) & mask) == value)
-			return true;
-
-		usleep_range(50, 100);
-	} while (ktime_before(ktime_get(), stop));
-
-	return false;
-}
-
 /**
  * crb_cmd_ready - request tpm crb device to enter ready state
  *
@@ -175,8 +181,7 @@  static bool crb_wait_for_reg_32(u32 __iomem *reg, u32 mask, u32 value,
  *
  * Return: 0 on success -ETIME on timeout;
  */
-static int __maybe_unused crb_cmd_ready(struct device *dev,
-					struct crb_priv *priv)
+static int crb_cmd_ready(struct device *dev, struct crb_priv *priv)
 {
 	if ((priv->sm == ACPI_TPM2_START_METHOD) ||
 	    (priv->sm == ACPI_TPM2_COMMAND_BUFFER_WITH_START_METHOD) ||
@@ -195,11 +200,11 @@  static int __maybe_unused crb_cmd_ready(struct device *dev,
 	return 0;
 }
 
-static int crb_request_locality(struct tpm_chip *chip, int loc)
+static int __crb_request_locality(struct device *dev,
+				  struct crb_priv *priv, int loc)
 {
-	struct crb_priv *priv = dev_get_drvdata(&chip->dev);
 	u32 value = CRB_LOC_STATE_LOC_ASSIGNED |
-		CRB_LOC_STATE_TPM_REG_VALID_STS;
+		    CRB_LOC_STATE_TPM_REG_VALID_STS;
 
 	if (!priv->regs_h)
 		return 0;
@@ -207,21 +212,45 @@  static int crb_request_locality(struct tpm_chip *chip, int loc)
 	iowrite32(CRB_LOC_CTRL_REQUEST_ACCESS, &priv->regs_h->loc_ctrl);
 	if (!crb_wait_for_reg_32(&priv->regs_h->loc_state, value, value,
 				 TPM2_TIMEOUT_C)) {
-		dev_warn(&chip->dev, "TPM_LOC_STATE_x.requestAccess timed out\n");
+		dev_warn(dev, "TPM_LOC_STATE_x.requestAccess timed out\n");
 		return -ETIME;
 	}
 
 	return 0;
 }
 
-static void crb_relinquish_locality(struct tpm_chip *chip, int loc)
+static int crb_request_locality(struct tpm_chip *chip, int loc)
 {
 	struct crb_priv *priv = dev_get_drvdata(&chip->dev);
 
+	return __crb_request_locality(&chip->dev, priv, loc);
+}
+
+static int __crb_relinquish_locality(struct device *dev,
+				     struct crb_priv *priv, int loc)
+{
+	u32 mask = CRB_LOC_STATE_LOC_ASSIGNED |
+		   CRB_LOC_STATE_TPM_REG_VALID_STS;
+	u32 value = CRB_LOC_STATE_TPM_REG_VALID_STS;
+
 	if (!priv->regs_h)
-		return;
+		return 0;
 
 	iowrite32(CRB_LOC_CTRL_RELINQUISH, &priv->regs_h->loc_ctrl);
+	if (!crb_wait_for_reg_32(&priv->regs_h->loc_state, mask, value,
+				 TPM2_TIMEOUT_C)) {
+		dev_warn(dev, "TPM_LOC_STATE_x.requestAccess timed out\n");
+		return -ETIME;
+	}
+
+	return 0;
+}
+
+static int crb_relinquish_locality(struct tpm_chip *chip, int loc)
+{
+	struct crb_priv *priv = dev_get_drvdata(&chip->dev);
+
+	return __crb_relinquish_locality(&chip->dev, priv, loc);
 }
 
 static u8 crb_status(struct tpm_chip *chip)
@@ -475,6 +504,10 @@  static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
 			dev_warn(dev, FW_BUG "Bad ACPI memory layout");
 	}
 
+	ret = __crb_request_locality(dev, priv, 0);
+	if (ret)
+		return ret;
+
 	priv->regs_t = crb_map_res(dev, priv, &io_res, buf->control_address,
 				   sizeof(struct crb_regs_tail));
 	if (IS_ERR(priv->regs_t))
@@ -531,6 +564,8 @@  static int crb_map_io(struct acpi_device *device, struct crb_priv *priv,
 
 	crb_go_idle(dev, priv);
 
+	__crb_relinquish_locality(dev, priv, 0);
+
 	return ret;
 }
 
@@ -588,10 +623,14 @@  static int crb_acpi_add(struct acpi_device *device)
 	chip->acpi_dev_handle = device->handle;
 	chip->flags = TPM_CHIP_FLAG_TPM2;
 
-	rc  = crb_cmd_ready(dev, priv);
+	rc = __crb_request_locality(dev, priv, 0);
 	if (rc)
 		return rc;
 
+	rc  = crb_cmd_ready(dev, priv);
+	if (rc)
+		goto out;
+
 	pm_runtime_get_noresume(dev);
 	pm_runtime_set_active(dev);
 	pm_runtime_enable(dev);
@@ -601,12 +640,15 @@  static int crb_acpi_add(struct acpi_device *device)
 		crb_go_idle(dev, priv);
 		pm_runtime_put_noidle(dev);
 		pm_runtime_disable(dev);
-		return rc;
+		goto out;
 	}
 
-	pm_runtime_put(dev);
+	pm_runtime_put_sync(dev);
 
-	return 0;
+out:
+	__crb_relinquish_locality(dev, priv, 0);
+
+	return rc;
 }
 
 static int crb_acpi_remove(struct acpi_device *device)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 183a5f54d875..a22b12adbdfd 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -143,11 +143,13 @@  static bool check_locality(struct tpm_chip *chip, int l)
 	return false;
 }
 
-static void release_locality(struct tpm_chip *chip, int l)
+static int release_locality(struct tpm_chip *chip, int l)
 {
 	struct tpm_tis_data *priv = dev_get_drvdata(&chip->dev);
 
 	tpm_tis_write8(priv, TPM_ACCESS(l), TPM_ACCESS_ACTIVE_LOCALITY);
+
+	return 0;
 }
 
 static int request_locality(struct tpm_chip *chip, int l)
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index bcdd3790e94d..06639fb6ab85 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -44,7 +44,7 @@  struct tpm_class_ops {
 	bool (*update_timeouts)(struct tpm_chip *chip,
 				unsigned long *timeout_cap);
 	int (*request_locality)(struct tpm_chip *chip, int loc);
-	void (*relinquish_locality)(struct tpm_chip *chip, int loc);
+	int (*relinquish_locality)(struct tpm_chip *chip, int loc);
 	void (*clk_enable)(struct tpm_chip *chip, bool value);
 };
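
The polling pattern used by crb_wait_for_reg_32() in this patch, including
the final re-read of the register after the deadline, can be sketched in
user space as follows. This is an illustration under stated assumptions,
not the driver code: wait_for_reg_32 and now_ms are hypothetical names, a
plain volatile variable stands in for the MMIO register, and
clock_gettime/nanosleep replace ktime_get/usleep_range.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in milliseconds. */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until (*reg & mask) == value or until timeout_ms has elapsed.
 * After the deadline the register is read one more time, so a value that
 * settled during the last sleep is not reported as a timeout.
 */
static bool wait_for_reg_32(const volatile uint32_t *reg, uint32_t mask,
			    uint32_t value, int64_t timeout_ms)
{
	const struct timespec nap = { 0, 50 * 1000 };	/* 50 microseconds */
	int64_t stop;

	assert(reg != NULL);
	stop = now_ms() + timeout_ms;

	do {
		if ((*reg & mask) == value)
			return true;

		nanosleep(&nap, NULL);
	} while (now_ms() < stop);

	return (*reg & mask) == value;
}
```

For the relinquish case in the patch, the mask covers both the "assigned"
and "register valid" bits while the expected value contains only the
"register valid" bit, i.e. the wait ends once the locality is no longer
assigned.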