tpm: add retry logic

Message ID 1521249754.12827.7.camel@HansenPartnership.com (mailing list archive)
State New, archived

Commit Message

James Bottomley March 17, 2018, 1:22 a.m. UTC
TPM2 can return TPM2_RC_RETRY to any command and when it does we get
unexpected failures inside the kernel that surprise users (this is
mostly observed in the trusted key handling code).  The UEFI 2.6 spec
has advice on how to handle this:

    The firmware SHALL not return TPM2_RC_RETRY prior to the completion
    of the call to ExitBootServices().

    Implementer’s Note: the implementation of this function should check
    the return value in the TPM response and, if it is TPM2_RC_RETRY,
    resend the command. The implementation may abort if a sufficient
    number of retries has been done.

So we follow that advice in our tpm_transmit() code using
TPM2_DURATION_SHORT as the initial wait duration and
TPM2_DURATION_LONG as the maximum wait time.  This should fix all the
in-kernel use cases and also means that user space TSS implementations
don't have to have their own retry handling.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: stable@vger.kernel.org
---
 drivers/char/tpm/tpm-interface.c | 75 ++++++++++++++++++++++++++++++++--------
 drivers/char/tpm/tpm.h           |  1 +
 2 files changed, 61 insertions(+), 15 deletions(-)
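The backoff schedule described in the commit message can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: the millisecond values 20 and 2000 are assumed stand-ins for TPM2_DURATION_SHORT and TPM2_DURATION_LONG, and struct fake_tpm, fake_transmit() and transmit_with_retry() are hypothetical names. Instead of sleeping, the sketch adds up the time it would have waited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values in milliseconds, standing in for the kernel constants. */
#define SKETCH_DURATION_SHORT 20u
#define SKETCH_DURATION_LONG  2000u

/* 0x0922 is the TPM2_RC_RETRY value the patch adds to tpm.h. */
#define SKETCH_RC_RETRY   0x0922u
#define SKETCH_RC_SUCCESS 0u

/* Hypothetical device model: answers RETRY a fixed number of times. */
struct fake_tpm {
	unsigned int retries_left;
	unsigned int transmissions;
};

static uint32_t fake_transmit(struct fake_tpm *tpm)
{
	tpm->transmissions++;
	if (tpm->retries_left > 0) {
		tpm->retries_left--;
		return SKETCH_RC_RETRY;
	}
	return SKETCH_RC_SUCCESS;
}

/*
 * Same loop shape as the patch: on RETRY, double the delay, give up
 * once it exceeds the cap, otherwise wait and resend.  The wait is
 * accumulated in *slept_ms rather than actually slept.
 * Returns the last return code seen.
 */
static uint32_t transmit_with_retry(struct fake_tpm *tpm,
				    unsigned int *slept_ms)
{
	unsigned int delay_ms = SKETCH_DURATION_SHORT;
	uint32_t rc;

	assert(tpm != NULL && slept_ms != NULL);
	*slept_ms = 0;

	for (;;) {
		rc = fake_transmit(tpm);
		if (rc != SKETCH_RC_RETRY)
			break;
		delay_ms *= 2;
		if (delay_ms > SKETCH_DURATION_LONG)
			break;
		*slept_ms += delay_ms;
	}
	return rc;
}
```

Because the delay is doubled before the first wait, the first sleep is twice the initial value. With the assumed constants, a device that never stops answering RETRY is sent the command seven times and the loop gives up after roughly 2.5 seconds of accumulated waiting.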

Comments

Jarkko Sakkinen March 19, 2018, 9:17 p.m. UTC | #1
On Fri, Mar 16, 2018 at 06:22:34PM -0700, James Bottomley wrote:
> TPM2 can return TPM2_RC_RETRY to any command and when it does we get
> unexpected failures inside the kernel that surprise users (this is
> mostly observed in the trusted key handling code).  The UEFI 2.6 spec
> has advice on how to handle this:
> 
>     The firmware SHALL not return TPM2_RC_RETRY prior to the completion
>     of the call to ExitBootServices().
> 
>     Implementer’s Note: the implementation of this function should check
>     the return value in the TPM response and, if it is TPM2_RC_RETRY,
>     resend the command. The implementation may abort if a sufficient
>     number of retries has been done.

When does the TPM decide to send this code anyway? The TCG
specifications do not cover this well, which makes the whole review
process quite difficult.

"the TPM was not able to start the command" is essentially a
tautology... I wonder who came up with such a bad description.

> So we follow that advice in our tpm_transmit() code using
> TPM2_DURATION_SHORT as the initial wait duration and
> TPM2_DURATION_LONG as the maximum wait time.  This should fix all the
> in-kernel use cases and also means that user space TSS implementations
> don't have to have their own retry handling.
> 
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: stable@vger.kernel.org

Didn't look at this before responding to your previous email. This
should be a separate commit before the self test change I guess.

Maybe we could:

1. Create a two patch patch set and check that it applies cleanly
   to my tree.
2.

> ---
>  drivers/char/tpm/tpm-interface.c | 75 ++++++++++++++++++++++++++++++++--------
>  drivers/char/tpm/tpm.h           |  1 +
>  2 files changed, 61 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> index 7c3380201960..1d25d1dca2f0 100644
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -398,21 +398,10 @@ static void tpm_relinquish_locality(struct tpm_chip *chip)
>  	chip->locality = -1;
>  }
>  
> -/**
> - * tpm_transmit - Internal kernel interface to transmit TPM commands.
> - *
> - * @chip: TPM chip to use
> - * @space: tpm space
> - * @buf: TPM command buffer
> - * @bufsiz: length of the TPM command buffer
> - * @flags: tpm transmit flags - bitmap
> - *
> - * Return:
> - *     0 when the operation is successful.
> - *     A negative number for system errors (errno).
> - */
> -ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
> -		     u8 *buf, size_t bufsiz, unsigned int flags)
> +static ssize_t tpm_transmit_internal(struct tpm_chip *chip,
> +				     struct tpm_space *space,
> +				     u8 *buf, size_t bufsiz,
> +				     unsigned int flags)

I would name this tpm_try_transmit(), because that is exactly what
it is given the existence of TPM_RC_RETRY. That name is more exact
and less abstract than tpm_transmit_internal(), and therefore the
better choice.

/Jarkko

James Bottomley March 21, 2018, 6:26 p.m. UTC | #2
On Mon, 2018-03-19 at 23:17 +0200, Jarkko Sakkinen wrote:
> Maybe we could:
> 
> 1. Create a two patch patch set and check that it applies cleanly
>    to my tree.
> 2.

OK, I've built a two patch series on your tree with the original "tpm:
fix intermittent failure with self tests" patch removed and created a
new version of it based on the working retry logic.

I'll send it as a proper two sequence patch set.

James

Jarkko Sakkinen March 22, 2018, 2:21 p.m. UTC | #3
On Wed, 2018-03-21 at 11:26 -0700, James Bottomley wrote:
> OK, I've built a two patch series on your tree with the original "tpm:
> fix intermittent failure with self tests" patch removed and created a
> new version of it based on the working retry logic.
> 
> I'll send it as a proper two sequence patch set.

Awesome, thank you!

/Jarkko

Patch

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 7c3380201960..1d25d1dca2f0 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -398,21 +398,10 @@  static void tpm_relinquish_locality(struct tpm_chip *chip)
 	chip->locality = -1;
 }
 
-/**
- * tpm_transmit - Internal kernel interface to transmit TPM commands.
- *
- * @chip: TPM chip to use
- * @space: tpm space
- * @buf: TPM command buffer
- * @bufsiz: length of the TPM command buffer
- * @flags: tpm transmit flags - bitmap
- *
- * Return:
- *     0 when the operation is successful.
- *     A negative number for system errors (errno).
- */
-ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
-		     u8 *buf, size_t bufsiz, unsigned int flags)
+static ssize_t tpm_transmit_internal(struct tpm_chip *chip,
+				     struct tpm_space *space,
+				     u8 *buf, size_t bufsiz,
+				     unsigned int flags)
 {
 	struct tpm_output_header *header = (void *)buf;
 	int rc;
@@ -550,6 +539,62 @@  ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
 }
 
 /**
+ * tpm_transmit - Internal kernel interface to transmit TPM commands.
+ *
+ * @chip: TPM chip to use
+ * @space: tpm space
+ * @buf: TPM command buffer
+ * @bufsiz: length of the TPM command buffer
+ * @flags: tpm transmit flags - bitmap
+ *
+ * A wrapper around tpm_transmit_internal that handles TPM2_RC_RETRY
+ * returns from the TPM and retransmits the command after a delay up
+ * to a maximum wait of TPM2_DURATION_LONG.
+ *
+ * Note: TPM1 never returns TPM2_RC_RETRY so the retry logic is TPM2
+ * only
+ *
+ * Return:
+ *     the length of the return when the operation is successful.
+ *     A negative number for system errors (errno).
+ */
+ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
+		     u8 *buf, size_t bufsiz, unsigned int flags)
+{
+	struct tpm_output_header *header = (struct tpm_output_header *)buf;
+	/* space for header and handles */
+	u8 save[TPM_HEADER_SIZE + 3*sizeof(u32)];
+	unsigned int delay_msec = TPM2_DURATION_SHORT;
+	u32 rc = 0;
+	ssize_t ret;
+	const size_t save_size = min(space ? sizeof(save): TPM_HEADER_SIZE,
+				     bufsiz);
+
+	/*
+	 * Subtlety here: if we have a space, the handles will be
+	 * transformed, so when we restore the header we also have to
+	 * restore the handles.
+	 */
+	memcpy(save, buf, save_size);
+
+	for (;;) {
+		ret = tpm_transmit_internal(chip, space, buf, bufsiz, flags);
+		if (ret < 0)
+			break;
+		rc = be32_to_cpu(header->return_code);
+		if (rc != TPM2_RC_RETRY)
+			break;
+		delay_msec *= 2;
+		if (delay_msec > TPM2_DURATION_LONG) {
+			dev_err(&chip->dev, "TPM is in retry loop\n");
+			break;
+		}
+		tpm_msleep(delay_msec);
+		memcpy(buf, save, save_size);
+	}
+	return ret;
+}
+/**
  * tpm_transmit_cmd - send a tpm command to the device
  *    The function extracts tpm out header return code
  *
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index b2547a11271a..c542f877cea7 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -113,6 +113,7 @@  enum tpm2_return_codes {
 	TPM2_RC_COMMAND_CODE    = 0x0143,
 	TPM2_RC_TESTING		= 0x090A, /* RC_WARN */
 	TPM2_RC_REFERENCE_H0	= 0x0910,
+	TPM2_RC_RETRY		= 0x0922,
 };
 
 enum tpm2_algorithms {
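
The "subtlety" comment in the new tpm_transmit() exists because the command buffer is also the response buffer: a RETRY response overwrites the command header, so the header has to be put back before resending. A minimal user-space sketch of that save-and-restore step follows. It assumes the 10-byte header layout (2-byte tag, 4-byte length, 4-byte code) and uses hypothetical names (struct echo_tpm, echo_transmit(), transmit_restoring_header()). It saves only the header and leaves out the handle area that the patch also saves when a space is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HEADER_SIZE 10   /* tag (2) + length (4) + code (4) */
#define SKETCH_CODE_OFFSET 6    /* command code / return code field */
#define SKETCH_RC_RETRY    0x0922u

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/* Hypothetical device model: records the command code it was sent. */
struct echo_tpm {
	unsigned int retries_left;
	uint32_t last_command_code;
};

/* Reads the command from buf, then writes the response over it. */
static void echo_transmit(struct echo_tpm *tpm, uint8_t *buf)
{
	uint32_t rc = 0;

	tpm->last_command_code = get_be32(buf + SKETCH_CODE_OFFSET);
	if (tpm->retries_left > 0) {
		tpm->retries_left--;
		rc = SKETCH_RC_RETRY;
	}
	buf[0] = 0x80;
	buf[1] = 0x01;
	put_be32(buf + 2, SKETCH_HEADER_SIZE);
	put_be32(buf + SKETCH_CODE_OFFSET, rc);
}

/* Resend up to max_tries times, restoring the saved header each time. */
static uint32_t transmit_restoring_header(struct echo_tpm *tpm, uint8_t *buf,
					  size_t bufsiz, unsigned int max_tries)
{
	uint8_t save[SKETCH_HEADER_SIZE];
	uint32_t rc = SKETCH_RC_RETRY;
	unsigned int i;

	assert(bufsiz >= SKETCH_HEADER_SIZE);
	memcpy(save, buf, sizeof(save));

	for (i = 0; i < max_tries; i++) {
		if (i > 0)
			memcpy(buf, save, sizeof(save)); /* undo the overwrite */
		echo_transmit(tpm, buf);
		rc = get_be32(buf + SKETCH_CODE_OFFSET);
		if (rc != SKETCH_RC_RETRY)
			break;
	}
	return rc;
}
```

Without the memcpy() back into buf, the second attempt in this model would send the previous response's return code (0x0922) in place of the original command code.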