diff mbox

[01/11] Initialize the mapping of KASan shadow memory

Message ID B8AC3E80E903784988AB3003E3E97330C006371B@dggemm510-mbs.china.huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Abbott Liu Nov. 16, 2017, 2:24 p.m. UTC
On 16/11/17  17:54 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
>On Thu, Nov 16 2017 at  3:07:54 am GMT, "Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:

>>>On 15/11/17 13:16, Liuwenliang (Abbott Liu) wrote:

>>>> On 09/11/17  18:36 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:

>>>>> On Wed, Nov 15 2017 at 10:20:02 am GMT, "Liuwenliang (Abbott Liu)"

>>>>> <liuwenliang@huawei.com> wrote:

>>>>>> diff --git a/arch/arm/include/asm/cp15.h

>>>>>> b/arch/arm/include/asm/cp15.h index dbdbce1..6db1f51 100644

>>>>>> --- a/arch/arm/include/asm/cp15.h

>>>>>> +++ b/arch/arm/include/asm/cp15.h

>>>>>> @@ -64,6 +64,43 @@

>>>>>>  #define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : :

>>>>>> "r" ((t)(v)))

>>>>>>  #define write_sysreg(v, ...)           __write_sysreg(v, __VA_ARGS__)

>>>>>>

>>>>>> +#ifdef CONFIG_ARM_LPAE

>>>>>> +#define TTBR0           __ACCESS_CP15_64(0, c2)

>>>>>> +#define TTBR1           __ACCESS_CP15_64(1, c2)

>>>>>> +#define PAR             __ACCESS_CP15_64(0, c7)

>>>>>> +#else

>>>>>> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)

>>>>>> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)

>>>>>> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)

>>>>>> +#endif

>>>>> Again: there is no point in not having these register encodings

>>>>> cohabiting. They are both perfectly defined in the architecture.

>>>>> Just suffix one (or even both) with their respective size, making

>>>>> it obvious which one you're talking about.

>>>>

>>>> I am sorry that I didn't point why I need to define TTBR0/ TTBR1/PAR

>>>> in to different way between CONFIG_ARM_LPAE and non CONFIG_ARM_LPAE.

>>>> The following description is the reason:

>>>> Here is the description come from

>>>> DDI0406C2c_arm_architecture_reference_manual.pdf:

>>>[...]

>>>

>>>You're missing the point. TTBR0 existence as a 64bit CP15 register has

>>>nothing to do the kernel being compiled with LPAE or not. It has

>>>everything to do with the HW supporting LPAE, and it is the kernel's job

>>>to use the right accessor depending on how it is compiled. On a CPU

>>>supporting LPAE, both TTBR0 accessors are valid. It is the kernel that

>>>chooses to use one rather than the other.

>>

>> Thanks for your review.  I don't think both TTBR0 accessors(64bit

>> accessor and 32bit accessor) are valid on a CPU supporting LPAE which

>> the LPAE is enabled. Here is the description come form

>> DDI0406C2c_arm_architecture_reference_manual.pdf (=ARM® Architecture

>> Reference Manual ARMv7-A and ARMv7-R edition) which you can get the

>> document by google "ARM® Architecture Reference Manual ARMv7-A and

>> ARMv7-R edition".


>Trust me, from where I seat, I have a much better source than Google for

>that document. Who would have thought?


>Nothing in what you randomly quote invalids what I've been saying. And

>to show you what's wrong with your reasoning, let me describe a

>scenario,


>I have a non-LPAE kernel that runs on my system. It uses the 32bit

>version of the TTBRs. It turns out that this kernel runs under a

>hypervisor (KVM, Xen, or your toy of the day). The hypervisor

>context-switches vcpus without even looking at whether the configuration

>of that guest. It doesn't have to care. It just blindly uses the 64bit

>version of the TTBRs.


>The architecture *guarantees* that it works (it even works with a 32bit

>guest under a 64bit hypervisor). In your world, this doesn't work. I

>guess the architecture wins.


>> So, I think if you access TTBR0/TTBR1 on CPU supporting LPAE, you must

>> use "mcrr/mrrc" instruction (__ACCESS_CP15_64). If you access

>> TTBR0/TTBR1 on CPU supporting LPAE by "mcr/mrc" instruction which is

>> 32bit version (__ACCESS_CP15), even if the CPU doesn't report error,

>> you also lose the high or low 32bit of the TTBR0/TTBR1.


>It is not about "supporting LPAE". It is about using the accessor that

>makes sense in a particular context. Yes, the architecture allows you to

>do something stupid. Don't do it. It doesn't mean the accessors cannot

>be used, and I hope that my example above demonstrates it.


>Conclusion: I still stand by my request that both versions of TTBRs/PAR

>are described without depending on the kernel configuration, because

>this has nothing to do with the kernel configuration.


Thanks for your reviews.
Yes, you are right. I have tested that "mcrr/mrrc" instruction (__ACCESS_CP15_64)
can work on non LPAE on vexpress_a9.

Here is the code I tested on vexpress_a9 and vexpress_a15:

Comments

Marc Zyngier Nov. 16, 2017, 2:40 p.m. UTC | #1
On Thu, Nov 16 2017 at  2:24:31 pm GMT, "Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:
> On 16/11/17  17:54 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
>>On Thu, Nov 16 2017 at 3:07:54 am GMT, "Liuwenliang (Abbott Liu)"
>> <liuwenliang@huawei.com> wrote:
>>>>On 15/11/17 13:16, Liuwenliang (Abbott Liu) wrote:
>>>>> On 09/11/17  18:36 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
>>>>>> On Wed, Nov 15 2017 at 10:20:02 am GMT, "Liuwenliang (Abbott Liu)"
>>>>>> <liuwenliang@huawei.com> wrote:
>>>>>>> diff --git a/arch/arm/include/asm/cp15.h
>>>>>>> b/arch/arm/include/asm/cp15.h index dbdbce1..6db1f51 100644
>>>>>>> --- a/arch/arm/include/asm/cp15.h
>>>>>>> +++ b/arch/arm/include/asm/cp15.h
>>>>>>> @@ -64,6 +64,43 @@
>>>>>>>  #define __write_sysreg(v, r, w, c, t) asm volatile(w " " c : :
>>>>>>> "r" ((t)(v)))
>>>>>>>  #define write_sysreg(v, ...)           __write_sysreg(v, __VA_ARGS__)
>>>>>>>
>>>>>>> +#ifdef CONFIG_ARM_LPAE
>>>>>>> +#define TTBR0           __ACCESS_CP15_64(0, c2)
>>>>>>> +#define TTBR1           __ACCESS_CP15_64(1, c2)
>>>>>>> +#define PAR             __ACCESS_CP15_64(0, c7)
>>>>>>> +#else
>>>>>>> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
>>>>>>> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
>>>>>>> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
>>>>>>> +#endif
>>>>>> Again: there is no point in not having these register encodings
>>>>>> cohabiting. They are both perfectly defined in the architecture.
>>>>>> Just suffix one (or even both) with their respective size, making
>>>>>> it obvious which one you're talking about.
>>>>>
>>>>> I am sorry that I didn't point why I need to define TTBR0/ TTBR1/PAR
>>>>> in to different way between CONFIG_ARM_LPAE and non CONFIG_ARM_LPAE.
>>>>> The following description is the reason:
>>>>> Here is the description come from
>>>>> DDI0406C2c_arm_architecture_reference_manual.pdf:
>>>>[...]
>>>>
>>>>You're missing the point. TTBR0 existence as a 64bit CP15 register has
>>>>nothing to do the kernel being compiled with LPAE or not. It has
>>>>everything to do with the HW supporting LPAE, and it is the kernel's job
>>>>to use the right accessor depending on how it is compiled. On a CPU
>>>>supporting LPAE, both TTBR0 accessors are valid. It is the kernel that
>>>>chooses to use one rather than the other.
>>>
>>> Thanks for your review.  I don't think both TTBR0 accessors(64bit
>>> accessor and 32bit accessor) are valid on a CPU supporting LPAE which
>>> the LPAE is enabled. Here is the description come form
>>> DDI0406C2c_arm_architecture_reference_manual.pdf (=ARM® Architecture
>>> Reference Manual ARMv7-A and ARMv7-R edition) which you can get the
>>> document by google "ARM® Architecture Reference Manual ARMv7-A and
>>> ARMv7-R edition".
>
>>Trust me, from where I seat, I have a much better source than Google for
>>that document. Who would have thought?
>
>>Nothing in what you randomly quote invalids what I've been saying. And
>>to show you what's wrong with your reasoning, let me describe a
>>scenario,
>
>>I have a non-LPAE kernel that runs on my system. It uses the 32bit
>>version of the TTBRs. It turns out that this kernel runs under a
>>hypervisor (KVM, Xen, or your toy of the day). The hypervisor
>>context-switches vcpus without even looking at whether the configuration
>>of that guest. It doesn't have to care. It just blindly uses the 64bit
>>version of the TTBRs.
>
>>The architecture *guarantees* that it works (it even works with a 32bit
>>guest under a 64bit hypervisor). In your world, this doesn't work. I
>>guess the architecture wins.
>
>>> So, I think if you access TTBR0/TTBR1 on CPU supporting LPAE, you must
>>> use "mcrr/mrrc" instruction (__ACCESS_CP15_64). If you access
>>> TTBR0/TTBR1 on CPU supporting LPAE by "mcr/mrc" instruction which is
>>> 32bit version (__ACCESS_CP15), even if the CPU doesn't report error,
>>> you also lose the high or low 32bit of the TTBR0/TTBR1.
>
>>It is not about "supporting LPAE". It is about using the accessor that
>>makes sense in a particular context. Yes, the architecture allows you to
>>do something stupid. Don't do it. It doesn't mean the accessors cannot
>>be used, and I hope that my example above demonstrates it.
>
>>Conclusion: I still stand by my request that both versions of TTBRs/PAR
>>are described without depending on the kernel configuration, because
>>this has nothing to do with the kernel configuration.
>
> Thanks for your reviews.
> Yes, you are right. I have tested that "mcrr/mrrc" instruction
> (__ACCESS_CP15_64) can work on non LPAE on vexpress_a9.

No, it doesn't. It cannot work, because Cortex-A9 predates the invention
of the 64bit accessor. I suspect that you are testing stuff in QEMU,
which is giving you a SW model that always supports LPAE. I suggest you
test this code on *real* HW, and not only on QEMU.

What I have said is:

- If the CPU supports LPAE, then both 32 and 64bit accessors work
- If the CPU doesn't support LPAE, then only the 32bit accssor work
- In both cases, that's a function of the CPU, and not of the kernel
  configuration.
- Both accessors can be safely defined as long as we ensure that they
  are used in the right context.

> Here is the code I tested on vexpress_a9 and vexpress_a15:
> --- a/arch/arm/include/asm/cp15.h
> +++ b/arch/arm/include/asm/cp15.h
> @@ -64,6 +64,56 @@
>  #define __write_sysreg(v, r, w, c, t)  asm volatile(w " " c : : "r" ((t)(v)))
>  #define write_sysreg(v, ...)           __write_sysreg(v, __VA_ARGS__)
>
> +#define TTBR0           __ACCESS_CP15_64(0, c2)
> +#define TTBR1           __ACCESS_CP15_64(1, c2)
> +#define PAR             __ACCESS_CP15_64(0, c7)

You still need to add the 32bit accessors.

	M.
Abbott Liu Nov. 17, 2017, 1:39 a.m. UTC | #2
On 16/11/17  22:41 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
>- If the CPU supports LPAE, then both 32 and 64bit accessors work


I don't how 32bit accessor can work on CPU supporting LPAE, give me your solution.

Thanks.
Abbott Liu Nov. 17, 2017, 7:18 a.m. UTC | #3
On 16/11/17  22:41 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
>No, it doesn't. It cannot work, because Cortex-A9 predates the invention
>of the 64bit accessor. I suspect that you are testing stuff in QEMU,
>which is giving you a SW model that always supports LPAE. I suggest you
>test this code on *real* HW, and not only on QEMU.

I am sorry. My test is fault. I only defined TTBR0 as __ACCESS_CP15_64,
but I don't use the definition TTBR0 as __ACCESS_CP15_64. 

Now I use the definition TTBR0 as __ACCESS_CP15_64 on CPU supporting
LPAE(vexpress_a9), I find it doesn't work and report undefined instruction error
when execute "mrrc" instruction.

So, you are right that 64bit accessor of TTBR0 cannot work on LPAE.
Christoffer Dall Nov. 17, 2017, 7:35 a.m. UTC | #4
On Fri, Nov 17, 2017 at 07:18:45AM +0000, Liuwenliang (Abbott Liu) wrote:
> On 16/11/17  22:41 Marc Zyngier [mailto:marc.zyngier@arm.com] wrote:
> >No, it doesn't. It cannot work, because Cortex-A9 predates the invention
> >of the 64bit accessor. I suspect that you are testing stuff in QEMU,
> >which is giving you a SW model that always supports LPAE. I suggest you
> >test this code on *real* HW, and not only on QEMU.
> 
> I am sorry. My test is fault. I only defined TTBR0 as __ACCESS_CP15_64,
> but I don't use the definition TTBR0 as __ACCESS_CP15_64. 
> 
> Now I use the definition TTBR0 as __ACCESS_CP15_64 on CPU supporting
> LPAE(vexpress_a9)

What does a "CPU supporting LPAE(vexpress_a9) mean?  As Marc pointed
out, a Cortex-A9 doesn't support LPAE.  If you configure your kernel
with LPAE it's not going to work on a Cortex-A9.

> I find it doesn't work and report undefined instruction error
> when execute "mrrc" instruction.
> 
> So, you are right that 64bit accessor of TTBR0 cannot work on LPAE.
> 

It's the other way around.  It doesn't work WITHOUT LPAE, it only works
WITH LPAE.

The ARM ARM explains this quite clearly:

"Accessing TTBR0

To access TTBR0 in an implementation that does not include the Large
Physical Address Extension, or bits[31:0] of TTBR0 in an implementation
that includes the Large Physical Address Extension, software reads or
writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm>
set to c0, and <opc2> set to 0. For example:

MRC p15, 0, <Rt>, c2, c0, 0
  ; Read 32-bit TTBR0 into Rt
MCR p15, 0, <Rt>, c2, c0, 0
  ; Write Rt to 32-bit TTBR0

In an implementation that includes the Large Physical Address Extension,
to access all 64 bits of TTBR0, software performs a 64-bit read or write
of the CP15 registers with <CRm> set to c2 and <opc1> set to 0. For
example:

MRRC p15, 0, <Rt>, <Rt2>, c2
  ; Read 64-bit TTBR0 into Rt (low word) and Rt2 (high word)
MCRR p15, 0, <Rt>, <Rt2>, c2
  ; Write Rt (low word) and Rt2 (high word) to 64-bit TTBR0

In these MRRC and MCRR instructions, Rt holds the least-significant word
of TTBR0, and Rt2 holds the most-significant word."

That is, if your processor (like the Cortex-A9) does NOT support LPAE,
all you have is the 32-bit accessors (MRC and MCR).

If your processor does support LPAE (like a Cortex-A15 for example),
then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
the lower 32-bits of the 64-bit register.

Hope this helps,
-Christoffer
Abbott Liu Nov. 18, 2017, 10:40 a.m. UTC | #5
On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:
>If your processor does support LPAE (like a Cortex-A15 for example),
>then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
>accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
>the lower 32-bits of the 64-bit register.
>
>Hope this helps,
>-Christoffer

If you know the higher 32-bits of the 64-bits cp15's register is not useful for your system,
then you can use the 32-bit accessor to get or set the 64-bit cp15's register.
But if the higher 32-bits of the 64-bits cp15's register is useful for your system,
then you can't use the 32-bit accessor to get or set the 64-bit cp15's register.

TTBR0/TTBR1/PAR's higher 32-bits is useful for CPU supporting LPAE.
The following description which comes from ARM(r) Architecture Reference
Manual ARMv7-A and ARMv7-R edition tell us the reason:

64-bit TTBR0 and TTBR1 format:
...
BADDR, bits[39:x] : 
Translation table base address, bits[39:x]. Defining the translation table base address width on
page B4-1698 describes how x is defined.
The value of x determines the required alignment of the translation table, which must be aligned to
2x bytes.

Abbott Liu: Because BADDR on CPU supporting LPAE may be bigger than max value of 32-bit, so bits[39:32] may 
be valid value which is useful for the system.

64-bit PAR format
...
PA[39:12]
Physical Address. The physical address corresponding to the supplied virtual address. This field
returns address bits[39:12].

Abbott Liu: Because Physical Address on CPU supporting LPAE may be bigger than max value of 32-bit, 
so bits[39:32] may be valid value which is useful for the system.

Conclusion: Don't use 32-bit accessor to get or set TTBR0/TTBR1/PAR on CPU supporting LPAE,
if you do that, your system may run error.
Marc Zyngier Nov. 18, 2017, 1:48 p.m. UTC | #6
On Sat, 18 Nov 2017 10:40:08 +0000
"Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:

> On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:
> >If your processor does support LPAE (like a Cortex-A15 for example),
> >then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
> >accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
> >the lower 32-bits of the 64-bit register.
> >
> >Hope this helps,
> >-Christoffer  
> 
> If you know the higher 32-bits of the 64-bits cp15's register is not useful for your system,
> then you can use the 32-bit accessor to get or set the 64-bit cp15's register.
> But if the higher 32-bits of the 64-bits cp15's register is useful for your system,
> then you can't use the 32-bit accessor to get or set the 64-bit cp15's register.
> 
> TTBR0/TTBR1/PAR's higher 32-bits is useful for CPU supporting LPAE.
> The following description which comes from ARM(r) Architecture Reference
> Manual ARMv7-A and ARMv7-R edition tell us the reason:
> 
> 64-bit TTBR0 and TTBR1 format:
> ...
> BADDR, bits[39:x] : 
> Translation table base address, bits[39:x]. Defining the translation table base address width on
> page B4-1698 describes how x is defined.
> The value of x determines the required alignment of the translation table, which must be aligned to
> 2x bytes.
> 
> Abbott Liu: Because BADDR on CPU supporting LPAE may be bigger than max value of 32-bit, so bits[39:32] may 
> be valid value which is useful for the system.
> 
> 64-bit PAR format
> ...
> PA[39:12]
> Physical Address. The physical address corresponding to the supplied virtual address. This field
> returns address bits[39:12].
> 
> Abbott Liu: Because Physical Address on CPU supporting LPAE may be bigger than max value of 32-bit, 
> so bits[39:32] may be valid value which is useful for the system.
> 
> Conclusion: Don't use 32-bit accessor to get or set TTBR0/TTBR1/PAR on CPU supporting LPAE,
> if you do that, your system may run error.

That's not really true. You can run an non-LPAE kernel that uses the
32bit accessors an a Cortex-A15 that supports LPAE. You're just limited
to 4GB of physical space. And you're pretty much guaranteed to have
some memory below 4GB (one way or another), or you'd have a slight
problem setting up your page tables.

	M.
Abbott Liu Nov. 21, 2017, 7:59 a.m. UTC | #7
On Nov 17, 2017  21:49  Marc Zyngier [mailto:marc.zyngier@arm.com]  wrote:
>On Sat, 18 Nov 2017 10:40:08 +0000
>"Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:

>> On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:
>> >If your processor does support LPAE (like a Cortex-A15 for example),
>> >then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
>> >accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
>> >the lower 32-bits of the 64-bit register.
>> >
>> >Hope this helps,
>> >-Christoffer
>>
>> If you know the higher 32-bits of the 64-bits cp15's register is not useful for your system,
>> then you can use the 32-bit accessor to get or set the 64-bit cp15's register.
>> But if the higher 32-bits of the 64-bits cp15's register is useful for your system,
>> then you can't use the 32-bit accessor to get or set the 64-bit cp15's register.
>>
>> TTBR0/TTBR1/PAR's higher 32-bits is useful for CPU supporting LPAE.
>> The following description which comes from ARM(r) Architecture Reference
>> Manual ARMv7-A and ARMv7-R edition tell us the reason:
>>
>> 64-bit TTBR0 and TTBR1 format:
>> ...
>> BADDR, bits[39:x] :
>> Translation table base address, bits[39:x]. Defining the translation table base address width on
>> page B4-1698 describes how x is defined.
>> The value of x determines the required alignment of the translation table, which must be aligned to
>> 2x bytes.
>>
>> Abbott Liu: Because BADDR on CPU supporting LPAE may be bigger than max value of 32-bit, so bits[39:32] may
>> be valid value which is useful for the system.
>>
>> 64-bit PAR format
>> ...
>> PA[39:12]
>> Physical Address. The physical address corresponding to the supplied virtual address. This field
>> returns address bits[39:12].
>>
>> Abbott Liu: Because Physical Address on CPU supporting LPAE may be bigger than max value of 32-bit,
>> so bits[39:32] may be valid value which is useful for the system.
>>
>> Conclusion: Don't use 32-bit accessor to get or set TTBR0/TTBR1/PAR on CPU supporting LPAE,
>> if you do that, your system may run error.

>That's not really true. You can run an non-LPAE kernel that uses the
>32bit accessors an a Cortex-A15 that supports LPAE. You're just limited
>to 4GB of physical space. And you're pretty much guaranteed to have
>some memory below 4GB (one way or another), or you'd have a slight
>problem setting up your page tables.

>       M.
>--
>Without deviation from the norm, progress is not possible.

Thanks for your review.
Please don't ask people to limit to 4GB of physical space on CPU
supporting LPAE, please don't ask people to guaranteed to have some
memory below 4GB on CPU supporting LPAE.
Why people select CPU supporting LPAE(just like cortex A15)? 
Because some of people think 4GB physical space is not enough for their 
system, maybe they want to use 8GB/16GB DDR space.
Then you tell them that they must guaranteed to have some memory below 4GB,
just only because you think the code as follow:
+#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
+#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
+#define PAR             __ACCESS_CP15(c7, 0, c4, 0)

is better than the code like this:

+#ifdef CONFIG_ARM_LPAE
+#define TTBR0           __ACCESS_CP15_64(0, c2)
+#define TTBR1           __ACCESS_CP15_64(1, c2)
+#define PAR             __ACCESS_CP15_64(0, c7)
+#else
+#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
+#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
+#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
+#endif


So,I think the following code: 
+#ifdef CONFIG_ARM_LPAE
+#define TTBR0           __ACCESS_CP15_64(0, c2)
+#define TTBR1           __ACCESS_CP15_64(1, c2)
+#define PAR             __ACCESS_CP15_64(0, c7)
+#else
+#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
+#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
+#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
+#endif

is better because it's not necessary to ask people to guaranteed to
have some memory below 4GB on CPU supporting LPAE. 
If we want to ask people to guaranteed to have some memory below 4GB 
on CPU supporting LPAE, there need to modify some other code.
I think it makes the simple problem more complex to modify some other code for this.
Russell King (Oracle) Nov. 21, 2017, 9:40 a.m. UTC | #8
On Tue, Nov 21, 2017 at 07:59:01AM +0000, Liuwenliang (Abbott Liu) wrote:
> On Nov 17, 2017  21:49  Marc Zyngier [mailto:marc.zyngier@arm.com]  wrote:
> >On Sat, 18 Nov 2017 10:40:08 +0000
> >"Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:
> 
> >> On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:
> >> >If your processor does support LPAE (like a Cortex-A15 for example),
> >> >then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
> >> >accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
> >> >the lower 32-bits of the 64-bit register.
> >> >
> >> >Hope this helps,
> >> >-Christoffer
> >>
> >> If you know the higher 32-bits of the 64-bits cp15's register is not useful for your system,
> >> then you can use the 32-bit accessor to get or set the 64-bit cp15's register.
> >> But if the higher 32-bits of the 64-bits cp15's register is useful for your system,
> >> then you can't use the 32-bit accessor to get or set the 64-bit cp15's register.
> >>
> >> TTBR0/TTBR1/PAR's higher 32-bits is useful for CPU supporting LPAE.
> >> The following description which comes from ARM(r) Architecture Reference
> >> Manual ARMv7-A and ARMv7-R edition tell us the reason:
> >>
> >> 64-bit TTBR0 and TTBR1 format:
> >> ...
> >> BADDR, bits[39:x] :
> >> Translation table base address, bits[39:x]. Defining the translation table base address width on
> >> page B4-1698 describes how x is defined.
> >> The value of x determines the required alignment of the translation table, which must be aligned to
> >> 2x bytes.
> >>
> >> Abbott Liu: Because BADDR on CPU supporting LPAE may be bigger than max value of 32-bit, so bits[39:32] may
> >> be valid value which is useful for the system.
> >>
> >> 64-bit PAR format
> >> ...
> >> PA[39:12]
> >> Physical Address. The physical address corresponding to the supplied virtual address. This field
> >> returns address bits[39:12].
> >>
> >> Abbott Liu: Because Physical Address on CPU supporting LPAE may be bigger than max value of 32-bit,
> >> so bits[39:32] may be valid value which is useful for the system.
> >>
> >> Conclusion: Don't use 32-bit accessor to get or set TTBR0/TTBR1/PAR on CPU supporting LPAE,
> >> if you do that, your system may run error.
> 
> >That's not really true. You can run an non-LPAE kernel that uses the
> >32bit accessors an a Cortex-A15 that supports LPAE. You're just limited
> >to 4GB of physical space. And you're pretty much guaranteed to have
> >some memory below 4GB (one way or another), or you'd have a slight
> >problem setting up your page tables.
> 
> >       M.
> >--
> >Without deviation from the norm, progress is not possible.
> 
> Thanks for your review.
> Please don't ask people to limit to 4GB of physical space on CPU
> supporting LPAE, please don't ask people to guaranteed to have some
> memory below 4GB on CPU supporting LPAE.

A LPAE-capable CPU must always have memory below 4GB physical, no ifs
no buts.

If there's no memory below 4GB physical, then the CPU has no accessible
memory before the MMU is enabled.  That means operating systems such as
Linux are completely unbootable.

There must _always_ be accessible memory below 4GB physical.  This is
not negotiable, it's a fundamental requirement.

The location of physical memory has nothing to do with the accessors.
This point I'm making also has nothing to do with the accessors.
Marc Zyngier Nov. 21, 2017, 9:46 a.m. UTC | #9
On 21/11/17 07:59, Liuwenliang (Abbott Liu) wrote:
> On Nov 17, 2017  21:49  Marc Zyngier [mailto:marc.zyngier@arm.com]  wrote:
>> On Sat, 18 Nov 2017 10:40:08 +0000
>> "Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:
> 
>>> On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:
>>>> If your processor does support LPAE (like a Cortex-A15 for example),
>>>> then you have both the 32-bit accessors (MRC and MCR) and the 64-bit
>>>> accessors (MRRC, MCRR), and using the 32-bit accessor will simply access
>>>> the lower 32-bits of the 64-bit register.
>>>>
>>>> Hope this helps,
>>>> -Christoffer
>>>
>>> If you know the higher 32-bits of the 64-bits cp15's register is not useful for your system,
>>> then you can use the 32-bit accessor to get or set the 64-bit cp15's register.
>>> But if the higher 32-bits of the 64-bits cp15's register is useful for your system,
>>> then you can't use the 32-bit accessor to get or set the 64-bit cp15's register.
>>>
>>> TTBR0/TTBR1/PAR's higher 32-bits is useful for CPU supporting LPAE.
>>> The following description which comes from ARM(r) Architecture Reference
>>> Manual ARMv7-A and ARMv7-R edition tell us the reason:
>>>
>>> 64-bit TTBR0 and TTBR1 format:
>>> ...
>>> BADDR, bits[39:x] :
>>> Translation table base address, bits[39:x]. Defining the translation table base address width on
>>> page B4-1698 describes how x is defined.
>>> The value of x determines the required alignment of the translation table, which must be aligned to
>>> 2x bytes.
>>>
>>> Abbott Liu: Because BADDR on CPU supporting LPAE may be bigger than max value of 32-bit, so bits[39:32] may
>>> be valid value which is useful for the system.
>>>
>>> 64-bit PAR format
>>> ...
>>> PA[39:12]
>>> Physical Address. The physical address corresponding to the supplied virtual address. This field
>>> returns address bits[39:12].
>>>
>>> Abbott Liu: Because Physical Address on CPU supporting LPAE may be bigger than max value of 32-bit,
>>> so bits[39:32] may be valid value which is useful for the system.
>>>
>>> Conclusion: Don't use 32-bit accessor to get or set TTBR0/TTBR1/PAR on CPU supporting LPAE,
>>> if you do that, your system may run error.
> 
>> That's not really true. You can run an non-LPAE kernel that uses the
>> 32bit accessors an a Cortex-A15 that supports LPAE. You're just limited
>> to 4GB of physical space. And you're pretty much guaranteed to have
>> some memory below 4GB (one way or another), or you'd have a slight
>> problem setting up your page tables.
> 
>>       M.
>> --
>> Without deviation from the norm, progress is not possible.
> 

> Thanks for your review.
> Please don't ask people to limit to 4GB of physical space on CPU
> supporting LPAE, please don't ask people to guaranteed to have some
> memory below 4GB on CPU supporting LPAE.

Please tell me how you enable LPAE if you don't. I've truly curious.
Because otherwise, you should really take a step back and seriously
reconsider what you're writing. Hint: where do you think the page tables
required to enable LPAE will be? How do you even *boot*?

> Why people select CPU supporting LPAE(just like cortex A15)? 
> Because some of people think 4GB physical space is not enough for their 
> system, maybe they want to use 8GB/16GB DDR space.
> Then you tell them that they must guaranteed to have some memory below 4GB,
> just only because you think the code as follow:
> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
> 
> is better than the code like this:
> 
> +#ifdef CONFIG_ARM_LPAE
> +#define TTBR0           __ACCESS_CP15_64(0, c2)
> +#define TTBR1           __ACCESS_CP15_64(1, c2)
> +#define PAR             __ACCESS_CP15_64(0, c7)
> +#else
> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
> +#endif
> 
> 
> So,I think the following code: 
> +#ifdef CONFIG_ARM_LPAE
> +#define TTBR0           __ACCESS_CP15_64(0, c2)
> +#define TTBR1           __ACCESS_CP15_64(1, c2)
> +#define PAR             __ACCESS_CP15_64(0, c7)
> +#else
> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
> +#endif
> 
> is better because it's not necessary to ask people to guaranteed to
> have some memory below 4GB on CPU supporting LPAE. 

NAK.

> If we want to ask people to guaranteed to have some memory below 4GB 
> on CPU supporting LPAE, there need to modify some other code.
> I think it makes the simple problem more complex to modify some other code for this.

At this stage, you've proven that you don't understand the problem at hand.

	M.
Mark Rutland Nov. 21, 2017, 12:29 p.m. UTC | #10
On Tue, Nov 21, 2017 at 07:59:01AM +0000, Liuwenliang (Abbott Liu) wrote:
> On Nov 17, 2017  21:49  Marc Zyngier [mailto:marc.zyngier@arm.com]  wrote:
> >On Sat, 18 Nov 2017 10:40:08 +0000
> >"Liuwenliang (Abbott Liu)" <liuwenliang@huawei.com> wrote:
> >> On Nov 17, 2017  15:36 Christoffer Dall [mailto:cdall@linaro.org]  wrote:

> Please don't ask people to limit to 4GB of physical space on CPU
> supporting LPAE, please don't ask people to guaranteed to have some
> memory below 4GB on CPU supporting LPAE.

I don't think that Marc is suggesting that you'd always use the 32-bit
accessors on an LPAE system, just that all the definitions should exist
regardless of configuration.

So rather than this:

> +#ifdef CONFIG_ARM_LPAE
> +#define TTBR0           __ACCESS_CP15_64(0, c2)
> +#define TTBR1           __ACCESS_CP15_64(1, c2)
> +#define PAR             __ACCESS_CP15_64(0, c7)
> +#else
> +#define TTBR0           __ACCESS_CP15(c2, 0, c0, 0)
> +#define TTBR1           __ACCESS_CP15(c2, 0, c0, 1)
> +#define PAR             __ACCESS_CP15(c7, 0, c4, 0)
> +#endif

... you'd have the following in cp15.h:

#define TTBR0_64	__ACCESS_CP15_64(0, c2)
#define TTBR1_64	__ACCESS_CP15_64(1, c2)
#define PAR_64		__ACCESS_CP15_64(0, c7)

#define TTBR0_32	__ACCESS_CP15(c2, 0, c0, 0)
#define TTBR1_32	__ACCESS_CP15(c2, 0, c0, 1)
#define PAR_32		__ACCESS_CP15(c7, 0, c4, 0)

... and elsewhere, where it matters, we choose which to use depending on
the kernel configuration, e.g.

void set_ttbr0(u64 val)
{
	if (IS_ENABLED(CONFIG_ARM_LPAE))
		write_sysreg(val, TTBR0_64);
	else
		write_sysreg(val, TTBR0_32);
}

Thanks,
Mark.
diff mbox

Patch

--- a/arch/arm/include/asm/cp15.h
+++ b/arch/arm/include/asm/cp15.h
@@ -64,6 +64,56 @@ 
 #define __write_sysreg(v, r, w, c, t)  asm volatile(w " " c : : "r" ((t)(v)))
 #define write_sysreg(v, ...)           __write_sysreg(v, __VA_ARGS__)

+#define TTBR0           __ACCESS_CP15_64(0, c2)
+#define TTBR1           __ACCESS_CP15_64(1, c2)
+#define PAR             __ACCESS_CP15_64(0, c7)
+#define VTTBR           __ACCESS_CP15_64(6, c2)
+#define CNTV_CVAL       __ACCESS_CP15_64(3, c14)
+#define CNTVOFF         __ACCESS_CP15_64(4, c14)
+
+#define MIDR            __ACCESS_CP15(c0, 0, c0, 0)
+#define CSSELR          __ACCESS_CP15(c0, 2, c0, 0)
+#define VPIDR           __ACCESS_CP15(c0, 4, c0, 0)
+#define VMPIDR          __ACCESS_CP15(c0, 4, c0, 5)
+#define SCTLR           __ACCESS_CP15(c1, 0, c0, 0)
+#define CPACR           __ACCESS_CP15(c1, 0, c0, 2)
+#define HCR             __ACCESS_CP15(c1, 4, c1, 0)
+#define HDCR            __ACCESS_CP15(c1, 4, c1, 1)
+#define HCPTR           __ACCESS_CP15(c1, 4, c1, 2)
+#define HSTR            __ACCESS_CP15(c1, 4, c1, 3)
+#define TTBCR           __ACCESS_CP15(c2, 0, c0, 2)
+#define HTCR            __ACCESS_CP15(c2, 4, c0, 2)
+#define VTCR            __ACCESS_CP15(c2, 4, c1, 2)
+#define DACR            __ACCESS_CP15(c3, 0, c0, 0)
+#define DFSR            __ACCESS_CP15(c5, 0, c0, 0)
+#define IFSR            __ACCESS_CP15(c5, 0, c0, 1)
+#define ADFSR           __ACCESS_CP15(c5, 0, c1, 0)
+#define AIFSR           __ACCESS_CP15(c5, 0, c1, 1)
+#define HSR             __ACCESS_CP15(c5, 4, c2, 0)
+#define DFAR            __ACCESS_CP15(c6, 0, c0, 0)
+#define IFAR            __ACCESS_CP15(c6, 0, c0, 2)
+#define HDFAR           __ACCESS_CP15(c6, 4, c0, 0)
+#define HIFAR           __ACCESS_CP15(c6, 4, c0, 2)
+#define HPFAR           __ACCESS_CP15(c6, 4, c0, 4)
+#define ICIALLUIS       __ACCESS_CP15(c7, 0, c1, 0)
+#define ATS1CPR         __ACCESS_CP15(c7, 0, c8, 0)
+#define TLBIALLIS       __ACCESS_CP15(c8, 0, c3, 0)
+#define TLBIALL         __ACCESS_CP15(c8, 0, c7, 0)
+#define TLBIALLNSNHIS   __ACCESS_CP15(c8, 4, c3, 4)
+#define PRRR            __ACCESS_CP15(c10, 0, c2, 0)
+#define NMRR            __ACCESS_CP15(c10, 0, c2, 1)
+#define AMAIR0          __ACCESS_CP15(c10, 0, c3, 0)
+#define AMAIR1          __ACCESS_CP15(c10, 0, c3, 1)
+#define VBAR            __ACCESS_CP15(c12, 0, c0, 0)
+#define CID             __ACCESS_CP15(c13, 0, c0, 1)
+#define TID_URW         __ACCESS_CP15(c13, 0, c0, 2)
+#define TID_URO         __ACCESS_CP15(c13, 0, c0, 3)
+#define TID_PRIV        __ACCESS_CP15(c13, 0, c0, 4)
+#define HTPIDR          __ACCESS_CP15(c13, 4, c0, 2)
+#define CNTKCTL         __ACCESS_CP15(c14, 0, c1, 0)
+#define CNTV_CTL        __ACCESS_CP15(c14, 0, c3, 1)
+#define CNTHCTL         __ACCESS_CP15(c14, 4, c1, 0)
+
extern unsigned long cr_alignment;     /* defined in entry-armv.S */

 static inline unsigned long get_cr(void)
diff --git a/arch/arm/include/asm/kvm_hyp.h b/arch/arm/include/asm/kvm_hyp.h
index 14b5903..8db8a8c 100644
--- a/arch/arm/include/asm/kvm_hyp.h
+++ b/arch/arm/include/asm/kvm_hyp.h
@@ -37,56 +37,6 @@ 
        __val;                                                  \
 })

-#define TTBR0              __ACCESS_CP15_64(0, c2)
-#define TTBR1              __ACCESS_CP15_64(1, c2)
-#define VTTBR              __ACCESS_CP15_64(6, c2)
-#define PAR                __ACCESS_CP15_64(0, c7)
-#define CNTV_CVAL  __ACCESS_CP15_64(3, c14)
-#define CNTVOFF            __ACCESS_CP15_64(4, c14)
-
-#define MIDR               __ACCESS_CP15(c0, 0, c0, 0)
-#define CSSELR             __ACCESS_CP15(c0, 2, c0, 0)
-#define VPIDR              __ACCESS_CP15(c0, 4, c0, 0)
-#define VMPIDR             __ACCESS_CP15(c0, 4, c0, 5)
-#define SCTLR              __ACCESS_CP15(c1, 0, c0, 0)
-#define CPACR              __ACCESS_CP15(c1, 0, c0, 2)
-#define HCR                __ACCESS_CP15(c1, 4, c1, 0)
-#define HDCR               __ACCESS_CP15(c1, 4, c1, 1)
-#define HCPTR              __ACCESS_CP15(c1, 4, c1, 2)
-#define HSTR               __ACCESS_CP15(c1, 4, c1, 3)
-#define TTBCR              __ACCESS_CP15(c2, 0, c0, 2)
-#define HTCR               __ACCESS_CP15(c2, 4, c0, 2)
-#define VTCR               __ACCESS_CP15(c2, 4, c1, 2)
-#define DACR               __ACCESS_CP15(c3, 0, c0, 0)
-#define DFSR               __ACCESS_CP15(c5, 0, c0, 0)
-#define IFSR               __ACCESS_CP15(c5, 0, c0, 1)
-#define ADFSR              __ACCESS_CP15(c5, 0, c1, 0)
-#define AIFSR              __ACCESS_CP15(c5, 0, c1, 1)
-#define HSR                __ACCESS_CP15(c5, 4, c2, 0)
-#define DFAR               __ACCESS_CP15(c6, 0, c0, 0)
-#define IFAR               __ACCESS_CP15(c6, 0, c0, 2)
-#define HDFAR              __ACCESS_CP15(c6, 4, c0, 0)
-#define HIFAR              __ACCESS_CP15(c6, 4, c0, 2)
-#define HPFAR              __ACCESS_CP15(c6, 4, c0, 4)
-#define ICIALLUIS  __ACCESS_CP15(c7, 0, c1, 0)
-#define ATS1CPR            __ACCESS_CP15(c7, 0, c8, 0)
-#define TLBIALLIS  __ACCESS_CP15(c8, 0, c3, 0)
-#define TLBIALL            __ACCESS_CP15(c8, 0, c7, 0)
-#define TLBIALLNSNHIS      __ACCESS_CP15(c8, 4, c3, 4)
-#define PRRR               __ACCESS_CP15(c10, 0, c2, 0)
-#define NMRR               __ACCESS_CP15(c10, 0, c2, 1)
-#define AMAIR0             __ACCESS_CP15(c10, 0, c3, 0)
-#define AMAIR1             __ACCESS_CP15(c10, 0, c3, 1)
-#define VBAR               __ACCESS_CP15(c12, 0, c0, 0)
-#define CID                __ACCESS_CP15(c13, 0, c0, 1)
-#define TID_URW            __ACCESS_CP15(c13, 0, c0, 2)
-#define TID_URO            __ACCESS_CP15(c13, 0, c0, 3)
-#define TID_PRIV   __ACCESS_CP15(c13, 0, c0, 4)
-#define HTPIDR             __ACCESS_CP15(c13, 4, c0, 2)
-#define CNTKCTL            __ACCESS_CP15(c14, 0, c1, 0)
-#define CNTV_CTL   __ACCESS_CP15(c14, 0, c3, 1)
-#define CNTHCTL            __ACCESS_CP15(c14, 4, c1, 0)
-