diff mbox series

tpm/tpm_crb: Avoid unaligned reads in crb_recv():

Message ID 20190201111949.14881-1-jarkko.sakkinen@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series tpm/tpm_crb: Avoid unaligned reads in crb_recv(): | expand

Commit Message

Jarkko Sakkinen Feb. 1, 2019, 11:19 a.m. UTC
The current approach to read first 6 bytes from the response and then tail
of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
(e.g. read 32-bit word from address aligned to a 16-bits), depending on how
memcpy_fromio() is implemented. If this happens, the read will fail and the
memory controller will fill the read with 1's.

This was triggered by 170d13ca3a2f, which should be probably refined to
check and react to the address alignment. Before that commit, on x86
memcpy_fromio() turned out to be memcpy(). By a luck GCC has done the right
thing (from tpm_crb's perspective) for us so far, but we should not rely on
that. Thus, it makes sense to fix this also in tpm_crb, not least because
the fix can be then backported to stable kernels and make them more robust
when compiled in differing environments.

Cc: stable@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: James Morris <jmorris@namei.org>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
---
 drivers/char/tpm/tpm_crb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Linus Torvalds Feb. 1, 2019, 5:49 p.m. UTC | #1
On Fri, Feb 1, 2019 at 3:33 AM Jarkko Sakkinen
<jarkko.sakkinen@linux.intel.com> wrote:
>
> Thus, it makes sense to fix this also in tpm_crb, not least because
> the fix can be then backported to stable kernels and make them more robust
> when compiled in differing environments.

Ack, looks sane to me, and should help both the backport and probably
generate better code too.

In the meantime, I've committed the iomem.c change with a *long*
commit message. For all we know, there might be other cases like this
lurking somewhere else that just happened to work. Plus it's the right
thing to do anyway.

               Linus
Jerry Snitselaar Feb. 1, 2019, 6:42 p.m. UTC | #2
On Fri Feb 01 19, Jarkko Sakkinen wrote:
>The current approach to read first 6 bytes from the response and then tail
>of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
>(e.g. read 32-bit word from address aligned to a 16-bits), depending on how
>memcpy_fromio() is implemented. If this happens, the read will fail and the
>memory controller will fill the read with 1's.
>
>This was triggered by 170d13ca3a2f, which should be probably refined to
>check and react to the address alignment. Before that commit, on x86
>memcpy_fromio() turned out to be memcpy(). By a luck GCC has done the right
>thing (from tpm_crb's perspective) for us so far, but we should not rely on
>that. Thus, it makes sense to fix this also in tpm_crb, not least because
>the fix can be then backported to stable kernels and make them more robust
>when compiled in differing environments.
>
>Cc: stable@vger.kernel.org
>Cc: Linus Torvalds <torvalds@linux-foundation.org>
>Cc: James Morris <jmorris@namei.org>
>Cc: Tomas Winkler <tomas.winkler@intel.com>
>Cc: Jerry Snitselaar <jsnitsel@redhat.com>
>Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface")
>Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
>---
> drivers/char/tpm/tpm_crb.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
>diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
>index 36952ef98f90..7f47e43aa9f1 100644
>--- a/drivers/char/tpm/tpm_crb.c
>+++ b/drivers/char/tpm/tpm_crb.c
>@@ -288,18 +288,18 @@ static int crb_recv(struct tpm_chip *chip, u8 *buf, size_t count)
> 	unsigned int expected;
>
> 	/* sanity check */
>-	if (count < 6)
>+	if (count < 8)
> 		return -EIO;
>
> 	if (ioread32(&priv->regs_t->ctrl_sts) & CRB_CTRL_STS_ERROR)
> 		return -EIO;
>
>-	memcpy_fromio(buf, priv->rsp, 6);
>+	memcpy_fromio(buf, priv->rsp, 8);
> 	expected = be32_to_cpup((__be32 *) &buf[2]);
>-	if (expected > count || expected < 6)
>+	if (expected > count || expected < 8)
> 		return -EIO;
>
>-	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
>+	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
>
> 	return expected;
> }
>-- 
>2.17.1
>

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Winkler, Tomas Feb. 1, 2019, 7:20 p.m. UTC | #3
> 
> The current approach to read first 6 bytes from the response and then tail of
> the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> memcpy_fromio() is implemented. If this happens, the read will fail and the
> memory controller will fill the read with 1's.
> 
> This was triggered by 170d13ca3a2f, which should be probably refined to check
> and react to the address alignment. Before that commit, on x86
> memcpy_fromio() turned out to be memcpy(). By a luck GCC has done the right
> thing (from tpm_crb's perspective) for us so far, but we should not rely on that.
> Thus, it makes sense to fix this also in tpm_crb, not least because the fix can be
> then backported to stable kernels and make them more robust when compiled
> in differing environments.
> 
> Cc: stable@vger.kernel.org
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: James Morris <jmorris@namei.org>
> Cc: Tomas Winkler <tomas.winkler@intel.com>
> Cc: Jerry Snitselaar <jsnitsel@redhat.com>
> Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface")
> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> ---
>  drivers/char/tpm/tpm_crb.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c index
> 36952ef98f90..7f47e43aa9f1 100644
> --- a/drivers/char/tpm/tpm_crb.c
> +++ b/drivers/char/tpm/tpm_crb.c
> @@ -288,18 +288,18 @@ static int crb_recv(struct tpm_chip *chip, u8 *buf,
> size_t count)
>  	unsigned int expected;
> 
>  	/* sanity check */
> -	if (count < 6)
> +	if (count < 8)
>  		return -EIO;
Why don't you already enforce reading at least the whole TPM header 10bytes, we are reading the whole buffer at one call anyway.
Who every is asking for 8 bytes from the protocol level, is doing something wrong.

>  	if (ioread32(&priv->regs_t->ctrl_sts) & CRB_CTRL_STS_ERROR)
>  		return -EIO;
> 
> -	memcpy_fromio(buf, priv->rsp, 6);
> +	memcpy_fromio(buf, priv->rsp, 8);
Maybe a short comment will spare someone looking into git history 
>  	expected = be32_to_cpup((__be32 *) &buf[2]);
> -	if (expected > count || expected < 6)
> +	if (expected > count || expected < 8)
Expected should be at least tpm header, right?
>  		return -EIO;
> 
> -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
> 
>  	return expected;
>  }
Otherwise ready the first 8 bytes looks good. 
Thanks 
Tomas
Jarkko Sakkinen Feb. 4, 2019, 11:44 a.m. UTC | #4
On Fri, Feb 01, 2019 at 07:20:42PM +0000, Winkler, Tomas wrote:
> > 
> > The current approach to read first 6 bytes from the response and then tail of
> > the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> > (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> > memcpy_fromio() is implemented. If this happens, the read will fail and the
> > memory controller will fill the read with 1's.
> > 
> > This was triggered by 170d13ca3a2f, which should be probably refined to check
> > and react to the address alignment. Before that commit, on x86
> > memcpy_fromio() turned out to be memcpy(). By a luck GCC has done the right
> > thing (from tpm_crb's perspective) for us so far, but we should not rely on that.
> > Thus, it makes sense to fix this also in tpm_crb, not least because the fix can be
> > then backported to stable kernels and make them more robust when compiled
> > in differing environments.
> > 
> > Cc: stable@vger.kernel.org
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: James Morris <jmorris@namei.org>
> > Cc: Tomas Winkler <tomas.winkler@intel.com>
> > Cc: Jerry Snitselaar <jsnitsel@redhat.com>
> > Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface")
> > Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
> > ---
> >  drivers/char/tpm/tpm_crb.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c index
> > 36952ef98f90..7f47e43aa9f1 100644
> > --- a/drivers/char/tpm/tpm_crb.c
> > +++ b/drivers/char/tpm/tpm_crb.c
> > @@ -288,18 +288,18 @@ static int crb_recv(struct tpm_chip *chip, u8 *buf,
> > size_t count)
> >  	unsigned int expected;
> > 
> >  	/* sanity check */
> > -	if (count < 6)
> > +	if (count < 8)
> >  		return -EIO;
> Why don't you already enforce reading at least the whole TPM header 10bytes, we are reading the whole buffer at one call anyway.
> Who every is asking for 8 bytes from the protocol level, is doing something wrong.
> 
> >  	if (ioread32(&priv->regs_t->ctrl_sts) & CRB_CTRL_STS_ERROR)
> >  		return -EIO;
> > 
> > -	memcpy_fromio(buf, priv->rsp, 6);
> > +	memcpy_fromio(buf, priv->rsp, 8);
> Maybe a short comment will spare someone looking into git history 
> >  	expected = be32_to_cpup((__be32 *) &buf[2]);
> > -	if (expected > count || expected < 6)
> > +	if (expected > count || expected < 8)
> Expected should be at least tpm header, right?
> >  		return -EIO;
> > 
> > -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> > +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
> > 
> >  	return expected;
> >  }
> Otherwise ready the first 8 bytes looks good. 
> Thanks 
> Tomas

Your proposals look sane, thank you.

/Jarkko
Jarkko Sakkinen Feb. 4, 2019, 11:47 a.m. UTC | #5
On Fri, Feb 01, 2019 at 09:49:05AM -0800, Linus Torvalds wrote:
> On Fri, Feb 1, 2019 at 3:33 AM Jarkko Sakkinen
> <jarkko.sakkinen@linux.intel.com> wrote:
> >
> > Thus, it makes sense to fix this also in tpm_crb, not least because
> > the fix can be then backported to stable kernels and make them more robust
> > when compiled in differing environments.
> 
> Ack, looks sane to me, and should help both the backport and probably
> generate better code too.
> 
> In the meantime, I've committed the iomem.c change with a *long*
> commit message. For all we know, there might be other cases like this
> lurking somewhere else that just happened to work. Plus it's the right
> thing to do anyway.
> 
>                Linus

Great!

Umh, so, should my tpm_crb change committed to 5.0 with the minor changes
implemented suggested by Tomas or not? Can also include it to my 5.1 PR.
Either WFM.

/Jarkko
David Laight Feb. 4, 2019, 12:17 p.m. UTC | #6
From: Jarkko Sakkinen
> Sent: 01 February 2019 11:20
> The current approach to read first 6 bytes from the response and then tail
> of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> memcpy_fromio() is implemented. If this happens, the read will fail and the
> memory controller will fill the read with 1's.

To my mind memcpy_to/fromio() should only be used on IO addresses that are
adequately like memory, and should be implemented in a way that that won't
generate invalid bus cycles.
Also memcpy_fromio() should also be allowed to do 'aligned' accesses that
go beyond the ends of the required memory area.

...
> 
> -	memcpy_fromio(buf, priv->rsp, 6);
> +	memcpy_fromio(buf, priv->rsp, 8);
>  	expected = be32_to_cpup((__be32 *) &buf[2]);
> -	if (expected > count || expected < 6)
> +	if (expected > count || expected < 8)
>  		return -EIO;
> 
> -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);

Why not just use readl() or readq() ?

Bound to generate better code.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Jarkko Sakkinen Feb. 5, 2019, 10:44 a.m. UTC | #7
On Mon, Feb 04, 2019 at 12:17:43PM +0000, David Laight wrote:
> From: Jarkko Sakkinen
> > Sent: 01 February 2019 11:20
> > The current approach to read first 6 bytes from the response and then tail
> > of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> > (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> > memcpy_fromio() is implemented. If this happens, the read will fail and the
> > memory controller will fill the read with 1's.
> 
> To my mind memcpy_to/fromio() should only be used on IO addresses that are
> adequately like memory, and should be implemented in a way that that won't
> generate invalid bus cycles.
> Also memcpy_fromio() should also be allowed to do 'aligned' accesses that
> go beyond the ends of the required memory area.
> 
> ...
> > 
> > -	memcpy_fromio(buf, priv->rsp, 6);
> > +	memcpy_fromio(buf, priv->rsp, 8);
> >  	expected = be32_to_cpup((__be32 *) &buf[2]);
> > -	if (expected > count || expected < 6)
> > +	if (expected > count || expected < 8)
> >  		return -EIO;
> > 
> > -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> > +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
> 
> Why not just use readl() or readq() ?
> 
> Bound to generate better code.

For the first read can be done. The second read is of variable
length.

/Jarkko
Jarkko Sakkinen Feb. 5, 2019, 10:47 a.m. UTC | #8
On Tue, Feb 05, 2019 at 12:44:06PM +0200, Jarkko Sakkinen wrote:
> On Mon, Feb 04, 2019 at 12:17:43PM +0000, David Laight wrote:
> > From: Jarkko Sakkinen
> > > Sent: 01 February 2019 11:20
> > > The current approach to read first 6 bytes from the response and then tail
> > > of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> > > (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> > > memcpy_fromio() is implemented. If this happens, the read will fail and the
> > > memory controller will fill the read with 1's.
> > 
> > To my mind memcpy_to/fromio() should only be used on IO addresses that are
> > adequately like memory, and should be implemented in a way that that won't
> > generate invalid bus cycles.
> > Also memcpy_fromio() should also be allowed to do 'aligned' accesses that
> > go beyond the ends of the required memory area.
> > 
> > ...
> > > 
> > > -	memcpy_fromio(buf, priv->rsp, 6);
> > > +	memcpy_fromio(buf, priv->rsp, 8);
> > >  	expected = be32_to_cpup((__be32 *) &buf[2]);
> > > -	if (expected > count || expected < 6)
> > > +	if (expected > count || expected < 8)
> > >  		return -EIO;
> > > 
> > > -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> > > +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
> > 
> > Why not just use readl() or readq() ?
> > 
> > Bound to generate better code.
> 
> For the first read can be done. The second read is of variable
> length.

Neither can be done to the first one, because readq() does
le64_to_cpu(). Shoud not do any conversions, only raw read.
So I'll just stick it to what we have.

/jarkko
Jarkko Sakkinen Feb. 5, 2019, 10:49 a.m. UTC | #9
On Tue, Feb 05, 2019 at 12:47:47PM +0200, Jarkko Sakkinen wrote:
> On Tue, Feb 05, 2019 at 12:44:06PM +0200, Jarkko Sakkinen wrote:
> > On Mon, Feb 04, 2019 at 12:17:43PM +0000, David Laight wrote:
> > > From: Jarkko Sakkinen
> > > > Sent: 01 February 2019 11:20
> > > > The current approach to read first 6 bytes from the response and then tail
> > > > of the response, can cause the 2nd memcpy_fromio() to do an unaligned read
> > > > (e.g. read 32-bit word from address aligned to a 16-bits), depending on how
> > > > memcpy_fromio() is implemented. If this happens, the read will fail and the
> > > > memory controller will fill the read with 1's.
> > > 
> > > To my mind memcpy_to/fromio() should only be used on IO addresses that are
> > > adequately like memory, and should be implemented in a way that that won't
> > > generate invalid bus cycles.
> > > Also memcpy_fromio() should also be allowed to do 'aligned' accesses that
> > > go beyond the ends of the required memory area.
> > > 
> > > ...
> > > > 
> > > > -	memcpy_fromio(buf, priv->rsp, 6);
> > > > +	memcpy_fromio(buf, priv->rsp, 8);
> > > >  	expected = be32_to_cpup((__be32 *) &buf[2]);
> > > > -	if (expected > count || expected < 6)
> > > > +	if (expected > count || expected < 8)
> > > >  		return -EIO;
> > > > 
> > > > -	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
> > > > +	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
> > > 
> > > Why not just use readl() or readq() ?
> > > 
> > > Bound to generate better code.
> > 
> > For the first read can be done. The second read is of variable
> > length.
> 
> Neither can be done to the first one, because readq() does
> le64_to_cpu(). Shoud not do any conversions, only raw read.
> So I'll just stick it to what we have.

ATM tmp_crb is only used in LE architectures (x86 and ARM64), but
I still hate to have hidden extra byte order conversions, even if
they get compiled as nil ops. It is more wrong than to use
memcpy_fromio().

/Jarkko
diff mbox series

Patch

diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index 36952ef98f90..7f47e43aa9f1 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -288,18 +288,18 @@  static int crb_recv(struct tpm_chip *chip, u8 *buf, size_t count)
 	unsigned int expected;
 
 	/* sanity check */
-	if (count < 6)
+	if (count < 8)
 		return -EIO;
 
 	if (ioread32(&priv->regs_t->ctrl_sts) & CRB_CTRL_STS_ERROR)
 		return -EIO;
 
-	memcpy_fromio(buf, priv->rsp, 6);
+	memcpy_fromio(buf, priv->rsp, 8);
 	expected = be32_to_cpup((__be32 *) &buf[2]);
-	if (expected > count || expected < 6)
+	if (expected > count || expected < 8)
 		return -EIO;
 
-	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
+	memcpy_fromio(&buf[8], &priv->rsp[8], expected - 8);
 
 	return expected;
 }