diff mbox series

[v2] tpm_tis: work around status register bug in STMicroelectronics TPM

Message ID c0ba1e2931ca7c46a21a43f2b9a6add2e188d6c8.1586996553.git.osandov@fb.com (mailing list archive)
State New, archived
Headers show
Series [v2] tpm_tis: work around status register bug in STMicroelectronics TPM | expand

Commit Message

Omar Sandoval April 16, 2020, 12:23 a.m. UTC
From: Omar Sandoval <osandov@fb.com>

We've encountered a particular model of STMicroelectronics TPM that
transiently returns a bad value in the status register. This causes the
kernel to believe that the TPM is ready to receive a command when it
actually isn't, which in turn causes the send to time out in
get_burstcount(). In testing, reading the status register one extra time
convinces the TPM to return a valid value.

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
 drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Paul Menzel April 16, 2020, 6:22 a.m. UTC | #1
Dear Omar,


Am 16.04.20 um 02:23 schrieb Omar Sandoval:
> From: Omar Sandoval <osandov@fb.com>

Thank you for the patch.

> We've encountered a particular model of STMicroelectronics TPM that

Please add models you are encountering this with to the commit message.

> transiently returns a bad value in the status register. This causes the

Have you contacted STMMicroelectronics?

> kernel to believe that the TPM is ready to receive a command when it
> actually isn't, which in turn causes the send to time out in
> get_burstcount(). In testing, reading the status register one extra time
> convinces the TPM to return a valid value.
> 
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
>   drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++++
>   1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> index 27c6ca031e23..5a2f6acaf768 100644
> --- a/drivers/char/tpm/tpm_tis_core.c
> +++ b/drivers/char/tpm/tpm_tis_core.c
> @@ -238,6 +238,25 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
>   	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
>   	if (rc < 0)
>   		return 0;
> +	/*
> +	 * Some STMicroelectronics TPMs have a bug where the status register is
> +	 * sometimes bogus (all 1s) if read immediately after the access
> +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
> +	 * as 0, so this is clearly invalid. Reading the register a second time
> +	 * returns a valid value.
> +	 */
> +	if (unlikely(status == 0xff)) {

I’d like to see a debug message here, saying the TPM is buggy. Maybe the 
model can be printed to, and that the TPM manufacturer should be contacted.

> +		rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> +		if (rc < 0)
> +			return 0;
> +		/*
> +		 * The status is somehow still bad. This hasn't been observed in
> +		 * practice, but clear it just in case so that it doesn't appear
> +		 * to be ready.
> +		 */
> +		if (unlikely(status == 0xff))
> +			status = 0;
> +	}
>   
>   	return status;
>   }


Kind regards,

Paul
Omar Sandoval April 16, 2020, 7:02 p.m. UTC | #2
On Thu, Apr 16, 2020 at 08:22:10AM +0200, Paul Menzel wrote:
> Dear Omar,
> 
> 
> Am 16.04.20 um 02:23 schrieb Omar Sandoval:
> > From: Omar Sandoval <osandov@fb.com>
> 
> Thank you for the patch.
> 
> > We've encountered a particular model of STMicroelectronics TPM that
> 
> Please add models you are encountering this with to the commit message.
> 
> > transiently returns a bad value in the status register. This causes the
> 
> Have you contacted STMMicroelectronics?
> 
> > kernel to believe that the TPM is ready to receive a command when it
> > actually isn't, which in turn causes the send to time out in
> > get_burstcount(). In testing, reading the status register one extra time
> > convinces the TPM to return a valid value.
> > 
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > ---
> >   drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++++
> >   1 file changed, 19 insertions(+)
> > 
> > diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > index 27c6ca031e23..5a2f6acaf768 100644
> > --- a/drivers/char/tpm/tpm_tis_core.c
> > +++ b/drivers/char/tpm/tpm_tis_core.c
> > @@ -238,6 +238,25 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
> >   	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> >   	if (rc < 0)
> >   		return 0;
> > +	/*
> > +	 * Some STMicroelectronics TPMs have a bug where the status register is
> > +	 * sometimes bogus (all 1s) if read immediately after the access
> > +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
> > +	 * as 0, so this is clearly invalid. Reading the register a second time
> > +	 * returns a valid value.
> > +	 */
> > +	if (unlikely(status == 0xff)) {
> 
> I’d like to see a debug message here, saying the TPM is buggy. Maybe the
> model can be printed to, and that the TPM manufacturer should be contacted.

How can I get the model information? (Sorry, I'm not very familiar with
TPMs, I'm just the guy on the team that ended up tracking this down.)

Thanks!
Ken Goldman April 20, 2020, 10:36 p.m. UTC | #3
The model information is returned by getcapability.

TSSes like this one https://sourceforge.net/projects/ibmtpm20tss/ 
(shameless plug) have command line tools that will return the information.

E.g., this returns the TPM spec revision, TPM vendor part number, TPM 
firmware version, etc.

getcapability -cap 6 -pc 13

I assume other TSSes have similar tools.  If you want to experiment with 
TPMs, the command line tools are convenient.

On 4/16/2020 3:02 PM, Omar Sandoval wrote:
> How can I get the model information? (Sorry, I'm not very familiar with
> TPMs, I'm just the guy on the team that ended up tracking this down.)
Mimi Zohar April 21, 2020, 1:44 p.m. UTC | #4
Hi Omar,

On Thu, 2020-04-16 at 12:02 -0700, Omar Sandoval wrote:
> On Thu, Apr 16, 2020 at 08:22:10AM +0200, Paul Menzel wrote:
> > Dear Omar,
> > 
> > 
> > Am 16.04.20 um 02:23 schrieb Omar Sandoval:
> > > From: Omar Sandoval <osandov@fb.com>
> > 
> > Thank you for the patch.
> > 
> > > We've encountered a particular model of STMicroelectronics TPM that
> > 
> > Please add models you are encountering this with to the commit message.
> > 
> > > transiently returns a bad value in the status register. This causes the
> > 
> > Have you contacted STMMicroelectronics?

Also how transient is it?  Is this something that only happens early,
for example before selftests finishes?  Could you get some statistics
here?

> > 
> > > kernel to believe that the TPM is ready to receive a command when it
> > > actually isn't, which in turn causes the send to time out in
> > > get_burstcount(). In testing, reading the status register one extra time
> > > convinces the TPM to return a valid value.

Is this only affecting get_burstcount()?

> > > 
> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
> > > ---
> > >   drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++++
> > >   1 file changed, 19 insertions(+)
> > > 
> > > diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > > index 27c6ca031e23..5a2f6acaf768 100644
> > > --- a/drivers/char/tpm/tpm_tis_core.c
> > > +++ b/drivers/char/tpm/tpm_tis_core.c
> > > @@ -238,6 +238,25 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
> > >   	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> > >   	if (rc < 0)
> > >   		return 0;
> > > +	/*
> > > +	 * Some STMicroelectronics TPMs have a bug where the status register is
> > > +	 * sometimes bogus (all 1s) if read immediately after the access
> > > +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
> > > +	 * as 0, so this is clearly invalid. Reading the register a second time
> > > +	 * returns a valid value.
> > > +	 */
> > > +	if (unlikely(status == 0xff)) {
> > 
> > I’d like to see a debug message here, saying the TPM is buggy. Maybe the
> > model can be printed to, and that the TPM manufacturer should be contacted.
> 
> How can I get the model information? (Sorry, I'm not very familiar with
> TPMs, I'm just the guy on the team that ended up tracking this down.)

Ken's post yesterday suggested using a userspace tool.

In general, Linux does support buggy HW, like the iTPM support.  As
Paul said, see if there is a vendor solution first.  Whatever fix is
upstreamed should be very specific with a clear explanation of the
problem.

thanks,

Mimi
Benoit HOUYERE April 21, 2020, 8:56 p.m. UTC | #5
>Hi Omar,
>
>On Thu, 2020-04-16 at 12:02 -0700, Omar Sandoval wrote:
>> On Thu, Apr 16, 2020 at 08:22:10AM +0200, Paul Menzel wrote:
>> > Dear Omar,
>> > 
>> > 
>> > Am 16.04.20 um 02:23 schrieb Omar Sandoval:
>> > > From: Omar Sandoval <osandov@fb.com>
>> > 
>> > Thank you for the patch.
>> > 
>> > > We've encountered a particular model of STMicroelectronics TPM 
>> > > that
>> > 
>> > Please add models you are encountering this with to the commit message.
>> > 
>> > > transiently returns a bad value in the status register. This 
>> > > causes the
>> > 
>> > Have you contacted STMMicroelectronics?
>
>Also how transient is it?  Is this something that only happens early, for example before selftests finishes?  Could you get some statistics here?
>
>> > 
>> > > kernel to believe that the TPM is ready to receive a command when 
>> > > it actually isn't, which in turn causes the send to time out in 
>> > > get_burstcount(). In testing, reading the status register one 
>> > > extra time convinces the TPM to return a valid value.
>
>Is this only affecting get_burstcount()?
>
Issue seen is that TPM has not had time to update STS register after locality request (STS Initial value = 0xFF) when STS register reading (tpm_tis_status(chip)) occurs (for less than 1µs). It's happen when time between two events is very short (with PC chipset, High SPI clock frequency and so on).
As explained, at next condition (if ((status & TPM_STS_COMMAND_READY) == 0) {)  status will be at 0xFF and consider wrongly in TPM Ready state (check only one bit), TPM is in fact in idle state and remains in this state because  does not execute (tpm_tis_ready(chip);).
TPM driver goes to "out_err" and stopped after timeout (750 ms) during get_burstcount function where TPM report 0x00 (correct value in TPM idle state but not expected in Ready State).

static int tpm_tis_send_data(struct tpm_chip *chip, const u8 *buf, size_t len)
{
.....
	bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;

	status = tpm_tis_status(chip);

	if ((status & TPM_STS_COMMAND_READY) == 0) {
		tpm_tis_ready(chip);
		if (wait_for_tpm_stat
		    (chip, TPM_STS_COMMAND_READY, chip->timeout_b,
		     &priv->int_queue, false) < 0) {
			rc = -ETIME;
			goto out_err;
		}
	}

	while (count < len - 1) {
		burstcnt = get_burstcount(chip);
		if (burstcnt < 0) {
			dev_err(&chip->dev, "Unable to read burstcount\n");
			rc = burstcnt;
			goto out_err;
		}
>> > > 
>> > > Signed-off-by: Omar Sandoval <osandov@fb.com>
>> > > ---
>> > >   drivers/char/tpm/tpm_tis_core.c | 19 +++++++++++++++++++
>> > >   1 file changed, 19 insertions(+)
>> > > 
>> > > diff --git a/drivers/char/tpm/tpm_tis_core.c 
>> > > b/drivers/char/tpm/tpm_tis_core.c index 27c6ca031e23..5a2f6acaf768 
>> > > 100644
>> > > --- a/drivers/char/tpm/tpm_tis_core.c
>> > > +++ b/drivers/char/tpm/tpm_tis_core.c
>> > > @@ -238,6 +238,25 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
>> > >   	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
>> > >   	if (rc < 0)
>> > >   		return 0;
>> > > +	/*
>> > > +	 * Some STMicroelectronics TPMs have a bug where the status register is
>> > > +	 * sometimes bogus (all 1s) if read immediately after the access
>> > > +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
>> > > +	 * as 0, so this is clearly invalid. Reading the register a second time
>> > > +	 * returns a valid value.
>> > > +	 */
>> > > +	if (unlikely(status == 0xff)) {
>> > 
>> > I’d like to see a debug message here, saying the TPM is buggy. Maybe 
>> > the model can be printed to, and that the TPM manufacturer should be contacted.
>> 
>> How can I get the model information? (Sorry, I'm not very familiar 
>> with TPMs, I'm just the guy on the team that ended up tracking this 
>> down.)
>
>
>Ken's post yesterday suggested using a userspace tool.
>
>In general, Linux does support buggy HW, like the iTPM support.  As Paul said, see if there is a vendor solution first.  Whatever fix is upstreamed should be >very specific with a clear explanation of the problem.

>thanks,

>Mimi

Issue occurs on several legacy models and corrected on latest TPM versions. Several corrections are possible. Omar's proposal is quite simple, short and efficient. Penalty time is only condition check but for all TPM_status access.

Other possibility  is to check status register validity (bit 5 is always at 0) at the first reading and modify wait_for_stat function (already inserted for I2C patch).
 
static int tpm_tis_send_data(struct tpm_chip *chip, const u8 *buf, size_t len)
{
.....
	bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;

	- status = tpm_tis_status(chip);
	+ if (wait_for_tpm_stat_result
		    (chip, TPM_STS_GO, 0 ,chip->timeout_c,
		     &priv->int_queue, false) < 0) {
			rc = -ETIME;
			goto out_err;
		}
	if ((status & TPM_STS_COMMAND_READY) == 0) {
		tpm_tis_ready(chip);



-static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
-		unsigned long timeout, wait_queue_head_t *queue,
-		bool check_cancel)
+static int wait_for_tpm_stat_result(struct tpm_chip *chip, u8 mask,
+				    u8 mask_result, unsigned long timeout,
+				    wait_queue_head_t *queue,
+				    bool check_cancel)
 {
 	unsigned long stop;
 	long rc;
@@ -55,7 +56,7 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 
 	/* check current status */
 	status = chip->ops->status(chip);
-	if ((status & mask) == mask)
+	if ((status & mask) == mask_result)
 		return 0;
 
 	stop = jiffies + timeout;
@@ -83,7 +84,7 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
 			usleep_range(TPM_TIMEOUT_USECS_MIN,
 				     TPM_TIMEOUT_USECS_MAX);
 			status = chip->ops->status(chip);
-			if ((status & mask) == mask)
+			if ((status & mask) == mask_result)
 				return 0;
 		} while (time_before(jiffies, stop));
 	}

Best Regards,

Benoit
Mimi Zohar April 21, 2020, 10:17 p.m. UTC | #6
On Tue, 2020-04-21 at 20:56 +0000, Benoit HOUYERE wrote:

> Issue occurs on several legacy models and corrected on latest TPM
> versions. Several corrections are possible. Omar's proposal is quite
> simple, short and efficient. Penalty time is only condition check
> but for all TPM_status access.
> 
> Other possibility  is to check status register validity (bit 5 is
> always at 0) at the first reading and modify wait_for_stat function
> (already inserted for I2C patch).

Benoit, thank you for the explanation.

It sounds like by "already inserted for I2C patch", you mean that this
proposed solution is part of the i2c patch set.  If that is the case,
to simplify backporting, the fix should be the first patch in the i2c
patch set.

Mimi
Benoit HOUYERE April 21, 2020, 10:22 p.m. UTC | #7
On Tue, 2020-04-21 at 20:56 +0000, Benoit HOUYERE wrote:

>> Issue occurs on several legacy models and corrected on latest TPM 
>> versions. Several corrections are possible. Omar's proposal is quite 
>> simple, short and efficient. Penalty time is only condition check but 
>> for all TPM_status access.
>> 
>> Other possibility  is to check status register validity (bit 5 is 
>> always at 0) at the first reading and modify wait_for_stat function 
>> (already inserted for I2C patch).

>Benoit, thank you for the explanation.

>It sounds like by "already inserted for I2C patch", you mean that this proposed solution is part of the i2c patch set.  If that is the case, to simplify backporting, the fix should be the first patch in the i2c patch set.

>Mimi
Ok, I will check with Amir before.
diff mbox series

Patch

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 27c6ca031e23..5a2f6acaf768 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -238,6 +238,25 @@  static u8 tpm_tis_status(struct tpm_chip *chip)
 	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
 	if (rc < 0)
 		return 0;
+	/*
+	 * Some STMicroelectronics TPMs have a bug where the status register is
+	 * sometimes bogus (all 1s) if read immediately after the access
+	 * register is written to. Bits 0, 1, and 5 are always supposed to read
+	 * as 0, so this is clearly invalid. Reading the register a second time
+	 * returns a valid value.
+	 */
+	if (unlikely(status == 0xff)) {
+		rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
+		if (rc < 0)
+			return 0;
+		/*
+		 * The status is somehow still bad. This hasn't been observed in
+		 * practice, but clear it just in case so that it doesn't appear
+		 * to be ready.
+		 */
+		if (unlikely(status == 0xff))
+			status = 0;
+	}
 
 	return status;
 }