diff mbox

[v18,6/7] crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

Message ID 991b2298-bb3f-dad3-c93b-b43ee5f372de@deltatee.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Headers show

Commit Message

Logan Gunthorpe July 3, 2018, 11:57 p.m. UTC
On 03/07/18 04:21 PM, Andy Shevchenko wrote:
> It is an explicit call to BUG().
> That's why we see wrong instruction trap.

Ok, I think I see the problem... the code is rather confusing:

Prior to the patch, IOs were BE depending on caam_little_end but if
caam_imx was set, then it wrote two LE writes with the high one first.
After the patch, it writes two BE writes with the high one first.

To confirm, can you try the patch below?

If this is the case, we can either revert the commit or fold in this
patch depending on what others think is clearer.

Thanks,

Logan

--

 static inline u64 rd_reg64(void __iomem *reg)

Comments

Horia Geanta July 4, 2018, 11:45 a.m. UTC | #1
On 7/4/2018 2:58 AM, Logan Gunthorpe wrote:
> 
> 
> On 03/07/18 04:21 PM, Andy Shevchenko wrote:
>> It is an explicit call to BUG().
>> That's why we see wrong instruction trap.
> 
> Ok, I think I see the problem... the code is rather confusing:
> 
> Prior to the patch, IOs were BE depending on caam_little_end but if
> caam_imx was set, then it wrote two LE writes with the high one first.
> After the patch, it writes two BE writes with the high one first.
> 
I think there are two separate issues here:

1. Semantics of operations in io-64-nonatomic-lo-hi.h, io-64-nonatomic-hi-lo.h

Logan, you mentioned the following (which unfortunately I somehow missed):
https://lore.kernel.org/lkml/c3f2e061-5ed1-5c74-b955-3d2bfb0da074@deltatee.com
The lo_hi/hi_lo functions seem to always refer to the data being written
or read not to the address operated on.

OTOH, initial commit that added asm-generic/io-64-nonatomic-lo-hi.h and
asm-generic/io-64-nonatomic-hi-lo.h mentions:
797a796a13df6 ("asm-generic: architecture independent readq/writeq for 32bit
environment")
- <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with
the order of lower address -> higher address
- <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with
reversed order

I think we should keep the initial semantics when adding support for
io{read|write}64, i.e. "lo" and "hi" to refer to address and not to value.

Actually this is the semantics intended for the CAAM patch, see the note at the
end of the commit message that refers to addresses, not values:
To be consistent with CAAM engine HW spec: in case of 64-bit registers,
irrespective of device endianness, the lower address should be read from
/ written to first, followed by the upper address.


2. CAAM driver I/O accessors for i.MX case

CAAM block in some i.MX parts has broken endianness for 64b registers.
For e.g. for i.MX6S/SL/D/Q even though CAAM is little endian, BARs for I/O rings
have to be programmed as:
I/O Ring BAR+0: unused
I/O Ring BAR+4: IOVA (32-bit little endian)
when the proper layout (for a 64b register) would have been to program IOVA at
BAR+0.

This explains why I/O accessors in CAAM driver handle things differently in case
caam_imx=true.

Since this quirk cannot be accommodated in generic fashion, code dealing with
caam_imx has to stay.

Horia
Fabio Estevam July 4, 2018, 3:06 p.m. UTC | #2
Hi Horia,

On Wed, Jul 4, 2018 at 8:45 AM, Horia Geanta <horia.geanta@nxp.com> wrote:

> I think there are two separate issues here:
>
> 1. Semantics of operations in io-64-nonatomic-lo-hi.h, io-64-nonatomic-hi-lo.h
>
> Logan, you mentioned the following (which unfortunately I somehow missed):
> https://lore.kernel.org/lkml/c3f2e061-5ed1-5c74-b955-3d2bfb0da074@deltatee.com
> The lo_hi/hi_lo functions seem to always refer to the data being written
> or read not to the address operated on.
>
> OTOH, initial commit that added asm-generic/io-64-nonatomic-lo-hi.h and
> asm-generic/io-64-nonatomic-hi-lo.h mentions:
> 797a796a13df6 ("asm-generic: architecture independent readq/writeq for 32bit
> environment")
> - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with
> the order of lower address -> higher address
> - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with
> reversed order
>
> I think we should keep the initial semantics when adding support for
> io{read|write}64, i.e. "lo" and "hi" to refer to address and not to value.
>
> Actually this is the semantics intended for the CAAM patch, see the note at the
> end of the commit message that refers to addresses, not values:
> To be consistent with CAAM engine HW spec: in case of 64-bit registers,
> irrespective of device endianness, the lower address should be read from
> / written to first, followed by the upper address.
>
>
> 2. CAAM driver I/O accessors for i.MX case
>
> CAAM block in some i.MX parts has broken endianness for 64b registers.
> For e.g. for i.MX6S/SL/D/Q even though CAAM is little endian, BARs for I/O rings
> have to be programmed as:
> I/O Ring BAR+0: unused
> I/O Ring BAR+4: IOVA (32-bit little endian)
> when the proper layout (for a 64b register) would have been to program IOVA at
> BAR+0.
>
> This explains why I/O accessors in CAAM driver handle things differently in case
> caam_imx=true.
>
> Since this quirk cannot be accommodated in generic fashion, code dealing with
> caam_imx has to stay.

Should I sent a revert of patch 46e4bf08f6388 for the boot regression for now?

Then the two issues you pointed out could be fixed later.
Horia Geanta July 4, 2018, 3:59 p.m. UTC | #3
On 7/4/2018 6:06 PM, Fabio Estevam wrote:
> Hi Horia,
> 
> On Wed, Jul 4, 2018 at 8:45 AM, Horia Geanta <horia.geanta@nxp.com> wrote:
> 
>> I think there are two separate issues here:
>>
>> 1. Semantics of operations in io-64-nonatomic-lo-hi.h, io-64-nonatomic-hi-lo.h
>>
>> Logan, you mentioned the following (which unfortunately I somehow missed):
>> https://lore.kernel.org/lkml/c3f2e061-5ed1-5c74-b955-3d2bfb0da074@deltatee.com
>> The lo_hi/hi_lo functions seem to always refer to the data being written
>> or read not to the address operated on.
>>
>> OTOH, initial commit that added asm-generic/io-64-nonatomic-lo-hi.h and
>> asm-generic/io-64-nonatomic-hi-lo.h mentions:
>> 797a796a13df6 ("asm-generic: architecture independent readq/writeq for 32bit
>> environment")
>> - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with
>> the order of lower address -> higher address
>> - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with
>> reversed order
>>
>> I think we should keep the initial semantics when adding support for
>> io{read|write}64, i.e. "lo" and "hi" to refer to address and not to value.
>>
>> Actually this is the semantics intended for the CAAM patch, see the note at the
>> end of the commit message that refers to addresses, not values:
>> To be consistent with CAAM engine HW spec: in case of 64-bit registers,
>> irrespective of device endianness, the lower address should be read from
>> / written to first, followed by the upper address.
>>
>>
>> 2. CAAM driver I/O accessors for i.MX case
>>
>> CAAM block in some i.MX parts has broken endianness for 64b registers.
>> For e.g. for i.MX6S/SL/D/Q even though CAAM is little endian, BARs for I/O rings
>> have to be programmed as:
>> I/O Ring BAR+0: unused
>> I/O Ring BAR+4: IOVA (32-bit little endian)
>> when the proper layout (for a 64b register) would have been to program IOVA at
>> BAR+0.
>>
>> This explains why I/O accessors in CAAM driver handle things differently in case
>> caam_imx=true.
>>
>> Since this quirk cannot be accommodated in generic fashion, code dealing with
>> caam_imx has to stay.
> 
> Should I sent a revert of patch 46e4bf08f6388 for the boot regression for now?
> 
> Then the two issues you pointed out could be fixed later.
> 
I guess it would be better this way.

Thanks,
Horia
Fabio Estevam July 4, 2018, 4:16 p.m. UTC | #4
On Wed, Jul 4, 2018 at 12:59 PM, Horia Geanta <horia.geanta@nxp.com> wrote:

> I guess it would be better this way.

Sounds good. I will submit the revert patch then.

Thanks
Logan Gunthorpe July 4, 2018, 5:01 p.m. UTC | #5
On 7/4/2018 9:06 AM, Fabio Estevam wrote:
>> Logan, you mentioned the following (which unfortunately I somehow missed):
>> https://lore.kernel.org/lkml/c3f2e061-5ed1-5c74-b955-3d2bfb0da074@deltatee.com
>> The lo_hi/hi_lo functions seem to always refer to the data being written
>> or read not to the address operated on.

Oy, yes that was more than a year ago.

>> OTOH, initial commit that added asm-generic/io-64-nonatomic-lo-hi.h and
>> asm-generic/io-64-nonatomic-hi-lo.h mentions:
>> 797a796a13df6 ("asm-generic: architecture independent readq/writeq for 32bit
>> environment")
>> - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with
>> the order of lower address -> higher address
>> - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with
>> reversed order

Hmm, well in fairness that commit didn't add any BE operations so
lower/higher address is the same as lower/higher data being written.
hi-lo/lo-hi is a bit ambiguous in that sense and designing it to match
the semantics of the only user seemed to make sense at the time. I
didn't even check the rough comments in an old commit message which I
wouldn't really take as canonical. Also, it seems to me, most hardware
would expect you to write in order of the address (if it cares at all)
not in the order of the higher/lower data word. Though, I have no
explicit examples only a gut feeling.

>> I think we should keep the initial semantics when adding support for
>> io{read|write}64, i.e. "lo" and "hi" to refer to address and not to value.
>>
>> Actually this is the semantics intended for the CAAM patch, see the note at the
>> end of the commit message that refers to addresses, not values:
>> To be consistent with CAAM engine HW spec: in case of 64-bit registers,
>> irrespective of device endianness, the lower address should be read from
>> / written to first, followed by the upper address.
>>
>>
>> 2. CAAM driver I/O accessors for i.MX case
>> Since this quirk cannot be accommodated in generic fashion, code dealing with
>> caam_imx has to stay.

Yes, though IMO, I think the patch I sent earlier is clearer than the
current code with the nested ifs in different static functions. In any
case, I'll drop it and let you make any changes you see fit.

Logan
Andy Shevchenko July 4, 2018, 5:10 p.m. UTC | #6
On Wed, Jul 4, 2018 at 8:01 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 7/4/2018 9:06 AM, Fabio Estevam wrote:

> Hmm, well in fairness that commit didn't add any BE operations so
> lower/higher address is the same as lower/higher data being written.
> hi-lo/lo-hi is a bit ambiguous in that sense and designing it to match
> the semantics of the only user seemed to make sense at the time. I
> didn't even check the rough comments in an old commit message which I
> wouldn't really take as canonical. Also, it seems to me, most hardware
> would expect you to write in order of the address (if it cares at all)
> not in the order of the higher/lower data word. Though, I have no
> explicit examples only a gut feeling.

We have an iDMA 32-bit hardware (see drivers/dma/dw/) which has an
extension 64-bit registers where only one of them has a specific bit
to "commit" the changes written to all of them. And by some very
unknown reason that bit is in lo part which automatically means we
must to write it last.
Logan Gunthorpe July 4, 2018, 5:13 p.m. UTC | #7
On 7/4/2018 11:10 AM, Andy Shevchenko wrote:
> We have an iDMA 32-bit hardware (see drivers/dma/dw/) which has an
> extension 64-bit registers where only one of them has a specific bit
> to "commit" the changes written to all of them. And by some very
> unknown reason that bit is in lo part which automatically means we
> must to write it last.

And it supports both BE and LE? And in both cases it's the lo part?

Logan
Andy Shevchenko July 4, 2018, 5:16 p.m. UTC | #8
On Wed, Jul 4, 2018 at 8:13 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 7/4/2018 11:10 AM, Andy Shevchenko wrote:
>> We have an iDMA 32-bit hardware (see drivers/dma/dw/) which has an
>> extension 64-bit registers where only one of them has a specific bit
>> to "commit" the changes written to all of them. And by some very
>> unknown reason that bit is in lo part which automatically means we
>> must to write it last.
>
> And it supports both BE and LE? And in both cases it's the lo part?

It's only LE for now.

P.S. If you more interested in code in kernel look for idma32_fifo_partition()
(While the bit is set in each of 32-bit part, it's actually present in
only one place)
Logan Gunthorpe July 4, 2018, 5:21 p.m. UTC | #9
On 7/4/2018 11:16 AM, Andy Shevchenko wrote:
> On Wed, Jul 4, 2018 at 8:13 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>> On 7/4/2018 11:10 AM, Andy Shevchenko wrote:
>>> We have an iDMA 32-bit hardware (see drivers/dma/dw/) which has an
>>> extension 64-bit registers where only one of them has a specific bit
>>> to "commit" the changes written to all of them. And by some very
>>> unknown reason that bit is in lo part which automatically means we
>>> must to write it last.
>>
>> And it supports both BE and LE? And in both cases it's the lo part?
> 
> It's only LE for now.

So the main question is if they were to add BE support, would they leave
the trigger in the same address or swap it to the other address so that
it's always the LO part that triggers? Otherwise it's hard to say what
we really want for the BE variants of the non-atomic hi-lo operations.

In the end, the driver author is free to use specifically which ever
function is necessary in any given situation (lo-hi vs hi-lo) and they
can read the definitions and no one is using them yet. So either way is
probably just as valid and it's probably not really worth fussing too
much about.

Logan
Andy Shevchenko July 4, 2018, 5:21 p.m. UTC | #10
On Wed, Jul 4, 2018 at 8:16 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> On Wed, Jul 4, 2018 at 8:13 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
>> On 7/4/2018 11:10 AM, Andy Shevchenko wrote:
>>> We have an iDMA 32-bit hardware (see drivers/dma/dw/) which has an
>>> extension 64-bit registers where only one of them has a specific bit
>>> to "commit" the changes written to all of them. And by some very
>>> unknown reason that bit is in lo part which automatically means we
>>> must to write it last.
>>
>> And it supports both BE and LE? And in both cases it's the lo part?
>
> It's only LE for now.
>
> P.S. If you more interested in code in kernel look for idma32_fifo_partition()
> (While the bit is set in each of 32-bit part, it's actually present in
> only one place)

I think it doesn't contradict with what you are saying rather supports it.

I would expect to have lo-hi and hi-lo semantics done according to the
data flow, not to the address.
Logan Gunthorpe July 4, 2018, 5:23 p.m. UTC | #11
On 7/4/2018 11:21 AM, Andy Shevchenko wrote:
> I think it doesn't contradict with what you are saying rather supports it.
> 
> I would expect to have lo-hi and hi-lo semantics done according to the
> data flow, not to the address.

Hmm, no, I expected the opposite...

Logan
Andy Shevchenko July 4, 2018, 5:25 p.m. UTC | #12
On Wed, Jul 4, 2018 at 8:23 PM, Logan Gunthorpe <logang@deltatee.com> wrote:
> On 7/4/2018 11:21 AM, Andy Shevchenko wrote:
>> I think it doesn't contradict with what you are saying rather supports it.
>>
>> I would expect to have lo-hi and hi-lo semantics done according to the
>> data flow, not to the address.
>
> Hmm, no, I expected the opposite...

Ah, sorry, misread you. Then it means I'm for the what I said and
opposing address flow.
Andy Shevchenko July 4, 2018, 5:32 p.m. UTC | #13
On Wed, Jul 4, 2018 at 2:45 PM, Horia Geanta <horia.geanta@nxp.com> wrote:
> On 7/4/2018 2:58 AM, Logan Gunthorpe wrote:
>>
>>
>> On 03/07/18 04:21 PM, Andy Shevchenko wrote:
>>> It is an explicit call to BUG().
>>> That's why we see wrong instruction trap.
>>
>> Ok, I think I see the problem... the code is rather confusing:
>>
>> Prior to the patch, IOs were BE depending on caam_little_end but if
>> caam_imx was set, then it wrote two LE writes with the high one first.
>> After the patch, it writes two BE writes with the high one first.
>>
> I think there are two separate issues here:
>
> 1. Semantics of operations in io-64-nonatomic-lo-hi.h, io-64-nonatomic-hi-lo.h
>
> Logan, you mentioned the following (which unfortunately I somehow missed):
> https://lore.kernel.org/lkml/c3f2e061-5ed1-5c74-b955-3d2bfb0da074@deltatee.com
> The lo_hi/hi_lo functions seem to always refer to the data being written
> or read not to the address operated on.
>
> OTOH, initial commit that added asm-generic/io-64-nonatomic-lo-hi.h and
> asm-generic/io-64-nonatomic-hi-lo.h mentions:
> 797a796a13df6 ("asm-generic: architecture independent readq/writeq for 32bit
> environment")
> - <asm-generic/io-64-nonatomic-lo-hi.h> provides non-atomic readq/ writeq with
> the order of lower address -> higher address
> - <asm-generic/io-64-nonatomic-hi-lo.h> provides non-atomic readq/ writeq with
> reversed order
>
> I think we should keep the initial semantics when adding support for
> io{read|write}64, i.e. "lo" and "hi" to refer to address and not to value.

There is a ambiguity with the commit message.
Since it had implemented only LE IO accessors the address semantics is
the same as data flow.

As a driver writer intuitively I would expect data flow semantics.

It means it would be invariant to LE BE accessors, right?

lo-hi: LE (0x0 0x4) BE (0x4 0x0)
hi-lo: LE (0x4 0x0) BE (0x0 0x4)
Logan Gunthorpe July 4, 2018, 5:36 p.m. UTC | #14
On 7/4/2018 11:32 AM, Andy Shevchenko wrote:
> It means it would be invariant to LE BE accessors, right?
> 
> lo-hi: LE (0x0 0x4) BE (0x4 0x0)
> hi-lo: LE (0x4 0x0) BE (0x0 0x4)


Ok, well, given that this is what I implemented originally and the
argument seems a little bike-sheddy. I vote we just leave the current
patch as is.

Logan
diff mbox

Patch

diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 5826acd9194e..5f70c460da25 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -138,10 +138,14 @@  static inline void clrsetbits_32(void __iomem
*reg, u32 clear, u32 set)
  */
 static inline void wr_reg64(void __iomem *reg, u64 data)
 {
-       if (!caam_imx && caam_little_end)
+       if (caam_imx && caam_little_end) {
+               iowrite32(data >> 32, reg);
+               iowrite32(data, reg + sizeof(u32));
+       } else if (caam_little_end) {
                iowrite64(data, reg);
-       else
+       } else {
                iowrite64be(data, reg);
+       }
 }