diff mbox series

[1/8] tpm: block messages while suspended

Message ID 20190613180931.65445-2-swboyd@chromium.org (mailing list archive)
State New, archived
Headers show
Series tpm: Add driver for cr50 | expand

Commit Message

Stephen Boyd June 13, 2019, 6:09 p.m. UTC
From: Andrey Pronin <apronin@chromium.org>

Other drivers or userspace may initiate sending a message to the tpm
while the device itself and the controller of the bus it is on are
suspended. That may break the bus driver logic.
Block sending messages while the device is suspended.

Signed-off-by: Andrey Pronin <apronin@chromium.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---

I don't think this was ever posted before.

 drivers/char/tpm/tpm-interface.c | 16 ++++++++++++++--
 include/linux/tpm.h              |  2 ++
 2 files changed, 16 insertions(+), 2 deletions(-)

Comments

Jason Gunthorpe June 13, 2019, 11:26 p.m. UTC | #1
On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> From: Andrey Pronin <apronin@chromium.org>
> 
> Other drivers or userspace may initiate sending a message to the tpm
> while the device itself and the controller of the bus it is on are
> suspended. That may break the bus driver logic.
> Block sending messages while the device is suspended.
> 
> Signed-off-by: Andrey Pronin <apronin@chromium.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> 
> I don't think this was ever posted before.

Use a real lock.

Jason
Jarkko Sakkinen June 14, 2019, 3:27 p.m. UTC | #2
On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> index 1b5436b213a2..48df005228d0 100644
> --- a/include/linux/tpm.h
> +++ b/include/linux/tpm.h
> @@ -132,6 +132,8 @@ struct tpm_chip {
>  	int dev_num;		/* /dev/tpm# */
>  	unsigned long is_open;	/* only one allowed */
>  
> +	unsigned long is_suspended;
> +
>  	char hwrng_name[64];
>  	struct hwrng hwrng;

I think it would better idea to have a bitmask of some sort that
would have bits for 'open' and 'suspended'.

/Jarkko
Stephen Boyd June 14, 2019, 6:12 p.m. UTC | #3
Quoting Jason Gunthorpe (2019-06-13 16:26:13)
> On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> > From: Andrey Pronin <apronin@chromium.org>
> > 
> > Other drivers or userspace may initiate sending a message to the tpm
> > while the device itself and the controller of the bus it is on are
> > suspended. That may break the bus driver logic.
> > Block sending messages while the device is suspended.
> > 
> > Signed-off-by: Andrey Pronin <apronin@chromium.org>
> > Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> > 
> > I don't think this was ever posted before.
> 
> Use a real lock.
> 

To make sure the bit is tested under a lock so that suspend/resume can't
update the bit in parallel?
Stephen Boyd June 14, 2019, 6:13 p.m. UTC | #4
Quoting Jarkko Sakkinen (2019-06-14 08:27:00)
> On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> > diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> > index 1b5436b213a2..48df005228d0 100644
> > --- a/include/linux/tpm.h
> > +++ b/include/linux/tpm.h
> > @@ -132,6 +132,8 @@ struct tpm_chip {
> >       int dev_num;            /* /dev/tpm# */
> >       unsigned long is_open;  /* only one allowed */
> >  
> > +     unsigned long is_suspended;
> > +
> >       char hwrng_name[64];
> >       struct hwrng hwrng;
> 
> I think it would better idea to have a bitmask of some sort that
> would have bits for 'open' and 'suspended'.
> 

Sure. I can combine is_open and is_suspended into some sort of 'unsigned
long flags' member and then have #define TPM_IS_OPEN 0 and #define
TPM_IS_SUSPENDED 1 defines?
Jarkko Sakkinen June 17, 2019, 4:38 p.m. UTC | #5
On Fri, Jun 14, 2019 at 11:13:13AM -0700, Stephen Boyd wrote:
> Quoting Jarkko Sakkinen (2019-06-14 08:27:00)
> > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> > > diff --git a/include/linux/tpm.h b/include/linux/tpm.h
> > > index 1b5436b213a2..48df005228d0 100644
> > > --- a/include/linux/tpm.h
> > > +++ b/include/linux/tpm.h
> > > @@ -132,6 +132,8 @@ struct tpm_chip {
> > >       int dev_num;            /* /dev/tpm# */
> > >       unsigned long is_open;  /* only one allowed */
> > >  
> > > +     unsigned long is_suspended;
> > > +
> > >       char hwrng_name[64];
> > >       struct hwrng hwrng;
> > 
> > I think it would better idea to have a bitmask of some sort that
> > would have bits for 'open' and 'suspended'.
> > 
> 
> Sure. I can combine is_open and is_suspended into some sort of 'unsigned
> long flags' member and then have #define TPM_IS_OPEN 0 and #define
> TPM_IS_SUSPENDED 1 defines?

Sounds sustainable.

/Jarkko
Jason Gunthorpe June 17, 2019, 10:51 p.m. UTC | #6
On Fri, Jun 14, 2019 at 11:12:36AM -0700, Stephen Boyd wrote:
> Quoting Jason Gunthorpe (2019-06-13 16:26:13)
> > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> > > From: Andrey Pronin <apronin@chromium.org>
> > > 
> > > Other drivers or userspace may initiate sending a message to the tpm
> > > while the device itself and the controller of the bus it is on are
> > > suspended. That may break the bus driver logic.
> > > Block sending messages while the device is suspended.
> > > 
> > > Signed-off-by: Andrey Pronin <apronin@chromium.org>
> > > Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> > > 
> > > I don't think this was ever posted before.
> > 
> > Use a real lock.
> > 
> 
> To make sure the bit is tested under a lock so that suspend/resume can't
> update the bit in parallel?

No, just use a real lock, don't make locks out of test bit/set bit

Jason
Stephen Boyd June 21, 2019, 1:03 a.m. UTC | #7
Quoting Jason Gunthorpe (2019-06-17 15:51:34)
> On Fri, Jun 14, 2019 at 11:12:36AM -0700, Stephen Boyd wrote:
> > Quoting Jason Gunthorpe (2019-06-13 16:26:13)
> > > On Thu, Jun 13, 2019 at 11:09:24AM -0700, Stephen Boyd wrote:
> > > > From: Andrey Pronin <apronin@chromium.org>
> > > > 
> > > > Other drivers or userspace may initiate sending a message to the tpm
> > > > while the device itself and the controller of the bus it is on are
> > > > suspended. That may break the bus driver logic.
> > > > Block sending messages while the device is suspended.
> > > > 
> > > > Signed-off-by: Andrey Pronin <apronin@chromium.org>
> > > > Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> > > > 
> > > > I don't think this was ever posted before.
> > > 
> > > Use a real lock.
> > > 
> > 
> > To make sure the bit is tested under a lock so that suspend/resume can't
> > update the bit in parallel?
> 
> No, just use a real lock, don't make locks out of test bit/set bit
> 

Ok. I looked back on the history of this change in our kernel (seems it
wasn't attempted upstream for some time) and it looks like the problem
may have been that the khwrng kthread (i.e. hwrng_fill()) isn't frozen
across suspend/resume. This kthread runs concurrently with devices being
resumed, the cr50 hardware is still suspended, and then a tpm command is
sent and it hangs the I2C bus because the device hasn't been properly
resumed yet.

I suspect a better approach than trying to hold of all TPM commands
across suspend/resume would be to fix the caller here to not even try to
read the hwrng during this time. It's a general problem for other hwrngs
that have some suspend/resume hooks too. This kthread is going to be
running while suspend/resume is going on if the random entropy gets too
low, and that probably shouldn't be the case.

What do you think of the attached patch? I haven't tested it, but it
would make sure that the kthread is frozen so that the hardware can be
resumed before the kthread is thawed and tries to go touch the hardware.

----8<-----
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 95be7228f327..3b88af3149a7 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -13,6 +13,7 @@
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/err.h>
+#include <linux/freezer.h>
 #include <linux/fs.h>
 #include <linux/hw_random.h>
 #include <linux/kernel.h>
@@ -421,7 +422,9 @@ static int hwrng_fillfn(void *unused)
 {
 	long rc;
 
-	while (!kthread_should_stop()) {
+	set_freezable();
+
+	while (!kthread_freezable_should_stop(NULL)) {
 		struct hwrng *rng;
 
 		rng = get_current_rng();
Herbert Xu June 24, 2019, 2:26 p.m. UTC | #8
On Thu, Jun 20, 2019 at 06:03:17PM -0700, Stephen Boyd wrote:
>
> What do you think of the attached patch? I haven't tested it, but it
> would make sure that the kthread is frozen so that the hardware can be
> resumed before the kthread is thawed and tries to go touch the hardware.

Looks good to me.

Cheers,
Stephen Boyd July 16, 2019, 6:28 p.m. UTC | #9
Quoting Herbert Xu (2019-06-24 07:26:54)
> On Thu, Jun 20, 2019 at 06:03:17PM -0700, Stephen Boyd wrote:
> >
> > What do you think of the attached patch? I haven't tested it, but it
> > would make sure that the kthread is frozen so that the hardware can be
> > resumed before the kthread is thawed and tries to go touch the hardware.
> 
> Looks good to me.
> 

Thanks for checking. I haven't been able to reproduce the problem yet to
confirm this is actually fixing anything, even after tweaking the sysctl
values for khwrng to try and force the thread to run continuously.

From reading the bug report that caused this 'is_suspended' code to be
added to the driver I'm fairly convinced this is the right solution. To
give some more background, it looks like we use s2idle suspend (i.e.
all CPUs are idle across suspend) and then we have the kthread running
to ask the hardware for some more random numbers. The i2c transaction
fails when asking the hwrng for data, and we see these messages printed
on the resume path:

	i2c_designware i2c_designware.2: i2c_dw_handle_tx_abort: lost arbitration
	tpm tpm0: i2c transfer failed (attempt 3/3): -11
	tpm0: tpm_transmit: tpm_send: error -11
	hwrng: no data available

Userspace tasks are thawed after this failure so it looks like something
in the kernel kicks khwrng to grab more data before the i2c bus can be
resumed.
diff mbox series

Patch

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index ae1030c9b086..7232527652e8 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -86,6 +86,11 @@  static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
 		return -E2BIG;
 	}
 
+	if (test_bit(0, &chip->is_suspended)) {
+		dev_warn(&chip->dev, "blocking transmit while suspended\n");
+		return -EAGAIN;
+	}
+
 	rc = chip->ops->send(chip, buf, count);
 	if (rc < 0) {
 		if (rc != -EPIPE)
@@ -403,14 +408,19 @@  int tpm_pm_suspend(struct device *dev)
 		return 0;
 
 	if (!tpm_chip_start(chip)) {
-		if (chip->flags & TPM_CHIP_FLAG_TPM2)
+		if (chip->flags & TPM_CHIP_FLAG_TPM2) {
 			tpm2_shutdown(chip, TPM2_SU_STATE);
-		else
+			set_bit(0, &chip->is_suspended);
+		} else {
 			rc = tpm1_pm_suspend(chip, tpm_suspend_pcr);
+		}
 
 		tpm_chip_stop(chip);
 	}
 
+	if (!rc)
+		set_bit(0, &chip->is_suspended);
+
 	return rc;
 }
 EXPORT_SYMBOL_GPL(tpm_pm_suspend);
@@ -426,6 +436,8 @@  int tpm_pm_resume(struct device *dev)
 	if (chip == NULL)
 		return -ENODEV;
 
+	clear_bit(0, &chip->is_suspended);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(tpm_pm_resume);
diff --git a/include/linux/tpm.h b/include/linux/tpm.h
index 1b5436b213a2..48df005228d0 100644
--- a/include/linux/tpm.h
+++ b/include/linux/tpm.h
@@ -132,6 +132,8 @@  struct tpm_chip {
 	int dev_num;		/* /dev/tpm# */
 	unsigned long is_open;	/* only one allowed */
 
+	unsigned long is_suspended;
+
 	char hwrng_name[64];
 	struct hwrng hwrng;