
crypto: sun4i-ss: prevent deadlock on emulated hardware

Message ID 20180614193659.29261-1-clabbe.montjoie@gmail.com (mailing list archive)
State New, archived

Commit Message

Corentin Labbe June 14, 2018, 7:36 p.m. UTC
Booting a qemu-emulated cubieboard with the sun4i-ss driver enabled never
completes.
This is because sun4i-ss deadlocks, taking all CPU time in an infinite loop.
Since the crypto hardware is not implemented, all registers read as 0,
so sun4i-ss never makes progress in any operation (TX_CNT is always 0).

A first idea was to add a "TX_CNT always zero" timeout, but this makes the cipher/hash loops
more complex just to guard against a case that never happens on real hardware.

The best fix is to check at probe time whether we are running on a virtual
machine whose hardware is advertised but not implemented, and to prevent
sun4i-ss from loading in that case.
Letting sun4i-ss load is useless anyway, since all crypto algorithms would be
disabled after failing the crypto selftests.

Tested-on: qemu-cubieboard
Tested-on: cubieboard2

Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
---
 drivers/crypto/sunxi-ss/sun4i-ss-core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
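
For reference, the "TX_CNT always zero" timeout that the commit message
considers and rejects could look roughly like the sketch below. It is a
userspace model, not driver code: fake_read_zero(), fake_read_ready() and
wait_for_tx_slots() are hypothetical names, and the function pointer stands
in for the register read the driver would perform.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read on unimplemented hardware:
 * every read returns 0, so the slot count never becomes non-zero. */
static uint32_t fake_read_zero(void)
{
	return 0;
}

/* Hypothetical stand-in for working hardware reporting 4 free slots. */
static uint32_t fake_read_ready(void)
{
	return 4;
}

/* Poll the slot count at most max_polls times instead of forever.
 * Returns 0 and stores the count on success, -ETIMEDOUT otherwise. */
static int wait_for_tx_slots(uint32_t (*read_reg)(void),
			     unsigned int max_polls, uint32_t *slots)
{
	while (max_polls--) {
		uint32_t v = read_reg();

		if (v) {
			*slots = v;
			return 0;
		}
	}
	return -ETIMEDOUT;
}
```

Every cipher and hash loop would need this extra bookkeeping and an error
path, which is the added complexity the commit message objects to.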

Comments

Maxime Ripard June 15, 2018, 7:57 a.m. UTC | #1
On Thu, Jun 14, 2018 at 09:36:59PM +0200, Corentin Labbe wrote:
> diff --git a/drivers/crypto/sunxi-ss/sun4i-ss-core.c b/drivers/crypto/sunxi-ss/sun4i-ss-core.c
> index a81d89b3b7d8..a178e80adcf3 100644
> --- a/drivers/crypto/sunxi-ss/sun4i-ss-core.c
> +++ b/drivers/crypto/sunxi-ss/sun4i-ss-core.c
> @@ -341,9 +341,18 @@ static int sun4i_ss_probe(struct platform_device *pdev)
>  	 * I expect to be a sort of Security System Revision number.
>  	 * Since the A80 seems to have an other version of SS
>  	 * this info could be useful
> +	 * Detect virtual machine with non-implemented hardware
> +	 * (qemu-cubieboard) by checking the register value after a write to it.
> +	 * On non-implemented hardware, all registers are read as 0.
> +	 * On real hardware we should have a value > 0.
>  	 */
>  	writel(SS_ENABLED, ss->base + SS_CTL);
>  	v = readl(ss->base + SS_CTL);
> +	if (!v) {
> +		dev_err(&pdev->dev, "Qemu with non-implemented SS detected.\n");
> +		err = -ENODEV;
> +		goto error_rst;
> +	}

This is the wrong way to tackle the issue. There are multiple reasons why
this could happen (for example the device not being clocked, or being
held in reset). There's nothing specific about qemu here, and
the fundamental issue isn't that the device isn't functional in qemu,
it's that qemu lies about which hardware it can emulate in the DT it
passes to the kernel.

There's no way this can scale, quite apart from the fact that qemu should
patch the DT according to what it can do. We should not be chasing after
each and every device that is broken in qemu.

NAK.

Maxime
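
The objection that a zero readback is not specific to qemu can be
illustrated with a toy userspace model. It assumes, as the review states,
that an unclocked device and a device held in reset also read back 0. The
names struct fake_ss, fake_writel(), fake_readl() and probe_readback() are
hypothetical and exist only for this sketch.

```c
#include <errno.h>
#include <stdint.h>

enum ss_state { SS_OK, SS_UNCLOCKED, SS_IN_RESET, SS_NOT_EMULATED };

struct fake_ss {
	enum ss_state state;
	uint32_t ctl;
};

/* Writes only stick when the modelled device is fully functional. */
static void fake_writel(struct fake_ss *ss, uint32_t v)
{
	if (ss->state == SS_OK)
		ss->ctl = v;
}

/* Assumption for this model: every non-functional state reads as 0. */
static uint32_t fake_readl(const struct fake_ss *ss)
{
	return ss->state == SS_OK ? ss->ctl : 0;
}

/* The check proposed by the patch: write the enable bit, read it back. */
static int probe_readback(struct fake_ss *ss)
{
	fake_writel(ss, 1);
	return fake_readl(ss) ? 0 : -ENODEV;
}
```

In this model all three failure states return the same -ENODEV, so the
check cannot tell the caller which one it hit or how to fix it.
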
Corentin Labbe June 15, 2018, 8:15 a.m. UTC | #2
On Fri, Jun 15, 2018 at 09:57:54AM +0200, Maxime Ripard wrote:
> This is the wrong way to tackle the issue. There are multiple reasons why
> this could happen (for example the device not being clocked, or being
> held in reset). There's nothing specific about qemu here, and
> the fundamental issue isn't that the device isn't functional in qemu,
> it's that qemu lies about which hardware it can emulate in the DT it
> passes to the kernel.
> 
> There's no way this can scale, quite apart from the fact that qemu should
> patch the DT according to what it can do. We should not be chasing after
> each and every device that is broken in qemu.
> 
> NAK.
> 

My fix also detects when the device is badly clocked.
Since it could therefore catch problems unrelated to qemu, I will send a v2 with an updated comment.

Regards
Maxime Ripard June 15, 2018, 9:04 a.m. UTC | #3
On Fri, Jun 15, 2018 at 10:15:54AM +0200, Corentin Labbe wrote:
> My fix also detects when the device is badly clocked.

In which case, the proper fix is to enable the clock, not throw the
kernel's arms up in the air.

Maxime
Corentin Labbe June 15, 2018, 9:16 a.m. UTC | #4
On Fri, Jun 15, 2018 at 11:04:12AM +0200, Maxime Ripard wrote:
> On Fri, Jun 15, 2018 at 10:15:54AM +0200, Corentin Labbe wrote:
> > My fix also detects when the device is badly clocked.
> 
> In which case, the proper fix is to enable the clock, not throw the
> kernel's arms up in the air.
> 

By "badly" I mean "not clocked" or "with the wrong frequency".

I could change the clock rate range test to exit (it only issues a warning for now).
But I think this fix detects all cases and still lets someone play with overclocking/downclocking.

Regards
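
The alternative mentioned above, turning the clock rate range test from a
warning into a hard failure, might be sketched as follows. This is a
userspace model only: check_clk_rate() and its strict flag are hypothetical,
and any bounds passed in are illustrative rather than the driver's real
limits.

```c
#include <errno.h>
#include <stdio.h>

/* Returns 0 when rate is inside [min_hz, max_hz]. Outside the range it
 * always warns; it fails with -EINVAL only when strict is non-zero. */
static int check_clk_rate(const char *name, unsigned long rate,
			  unsigned long min_hz, unsigned long max_hz,
			  int strict)
{
	if (rate >= min_hz && rate <= max_hz)
		return 0;

	fprintf(stderr, "clock %s at %lu Hz is outside [%lu, %lu] Hz\n",
		name, rate, min_hz, max_hz);
	return strict ? -EINVAL : 0;
}
```

With strict set, a rate of 0 (not clocked) aborts the probe, but so does any
deliberate overclocking or downclocking, which is the trade-off described
above.
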
Maxime Ripard June 15, 2018, 11:08 a.m. UTC | #5
On Fri, Jun 15, 2018 at 11:16:50AM +0200, Corentin Labbe wrote:
> On Fri, Jun 15, 2018 at 11:04:12AM +0200, Maxime Ripard wrote:
> > On Fri, Jun 15, 2018 at 10:15:54AM +0200, Corentin Labbe wrote:
> > > My fix also detects when the device is badly clocked.
> > 
> > In which case, the proper fix is to enable the clock, not throw the
> > kernel's arms up in the air.
> > 
> 
> By "badly" I mean "not clocked" or "with the wrong frequency".
>
> I could change the clock rate range test to exit (it only issues a
> warning for now).  But I think this fix detects all cases and still
> lets someone play with overclocking/downclocking.

You're still trying to fix the consequence when you should be fixing
the cause.

Maxime

Patch

diff --git a/drivers/crypto/sunxi-ss/sun4i-ss-core.c b/drivers/crypto/sunxi-ss/sun4i-ss-core.c
index a81d89b3b7d8..a178e80adcf3 100644
--- a/drivers/crypto/sunxi-ss/sun4i-ss-core.c
+++ b/drivers/crypto/sunxi-ss/sun4i-ss-core.c
@@ -341,9 +341,18 @@  static int sun4i_ss_probe(struct platform_device *pdev)
 	 * I expect to be a sort of Security System Revision number.
 	 * Since the A80 seems to have an other version of SS
 	 * this info could be useful
+	 * Detect virtual machine with non-implemented hardware
+	 * (qemu-cubieboard) by checking the register value after a write to it.
+	 * On non-implemented hardware, all registers are read as 0.
+	 * On real hardware we should have a value > 0.
 	 */
 	writel(SS_ENABLED, ss->base + SS_CTL);
 	v = readl(ss->base + SS_CTL);
+	if (!v) {
+		dev_err(&pdev->dev, "Qemu with non-implemented SS detected.\n");
+		err = -ENODEV;
+		goto error_rst;
+	}
 	v >>= 16;
 	v &= 0x07;
 	dev_info(&pdev->dev, "Die ID %d\n", v);
@@ -398,6 +407,7 @@  static int sun4i_ss_probe(struct platform_device *pdev)
 			break;
 		}
 	}
+error_rst:
 	if (ss->reset)
 		reset_control_assert(ss->reset);
 error_clk: