| Message ID | 1344648306-15619-6-git-send-email-cyril@ti.com (mailing list archive) |
|---|---|
| State | New, archived |
On Fri, 10 Aug 2012, Cyril Chemparathy wrote:

> This patch adds support for 64-bit physical addresses in virt_to_phys()
> patching. This does not do real 64-bit add/sub, but instead patches in the
> upper 32-bits of the phys_offset directly into the output of virt_to_phys.
>
> There is no corresponding change on the phys_to_virt() side, because
> computations on the upper 32-bits would be discarded anyway.
>
> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> ---
>  arch/arm/include/asm/memory.h |   22 ++++++++++++++++++----
>  arch/arm/kernel/head.S        |    4 ++++
>  arch/arm/kernel/setup.c       |    2 +-
>  3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 81e1714..dc5fbf3 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -154,14 +154,28 @@
>  #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>
>  extern unsigned long __pv_offset;
> -extern unsigned long __pv_phys_offset;
> +extern phys_addr_t __pv_phys_offset;
>  #define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
>
>  static inline phys_addr_t __virt_to_phys(unsigned long x)
>  {
> -	unsigned long t;
> -	early_patch_imm8("add", t, x, __pv_offset, 0);
> -	return t;
> +	unsigned long tlo, thi;
> +
> +	early_patch_imm8("add", tlo, x, __pv_offset, 0);
> +
> +#ifdef CONFIG_ARM_LPAE
> +	/*
> +	 * On LPAE, we do not _need_ to do 64-bit arithmetic because the high
> +	 * order 32 bits are never changed by the phys-virt offset.  We simply
> +	 * patch in the high order physical address bits instead.
> +	 */
> +#ifdef __ARMEB__
> +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
> +#else
> +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
> +#endif
> +#endif
> +	return (u64)tlo | (u64)thi << 32;
>  }

Hmmm... I'm afraid this is going to be suboptimal when LPAE is not
selected.

First of all, you do not need to cast tlo to a u64 in the return value.

Then, I'm not sure if the compiler is smart enough to see that the
returned value is a phys_addr_t which can be a u32, and in this case the
(u64)thi << 32 is going to be truncated right away, and therefore there
is no point in emitting the corresponding instructions.

Furthermore, if LPAE is not defined, then thi never gets initialized and
should produce a warning.  Did you test compilation of the code with LPAE
turned off?

I'd prefer something like this where more stuff is validated by the
compiler:

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	unsigned long tlo, thi;
	phys_addr_t ret;

	early_patch_imm8("add", tlo, x, __pv_offset, 0);
	ret = tlo;

	if (sizeof(phys_addr_t) > 4) {
#ifdef __ARMEB__
		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
#else
		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
#endif
		ret |= ((u64)thi) << 32;
	}

	return ret;
}

This should let the compiler optimize things whether LPAE is enabled or
not while validating both cases.

Nicolas
On 08/11/12 23:39, Nicolas Pitre wrote:
> On Fri, 10 Aug 2012, Cyril Chemparathy wrote:
>
>> This patch adds support for 64-bit physical addresses in virt_to_phys()
>> patching. This does not do real 64-bit add/sub, but instead patches in the
>> upper 32-bits of the phys_offset directly into the output of virt_to_phys.
>>
>> There is no corresponding change on the phys_to_virt() side, because
>> computations on the upper 32-bits would be discarded anyway.
>>
>> Signed-off-by: Cyril Chemparathy <cyril@ti.com>
>> ---
>>  arch/arm/include/asm/memory.h |   22 ++++++++++++++++++----
>>  arch/arm/kernel/head.S        |    4 ++++
>>  arch/arm/kernel/setup.c       |    2 +-
>>  3 files changed, 23 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
>> index 81e1714..dc5fbf3 100644
>> --- a/arch/arm/include/asm/memory.h
>> +++ b/arch/arm/include/asm/memory.h
>> @@ -154,14 +154,28 @@
>>  #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
>>
>>  extern unsigned long __pv_offset;
>> -extern unsigned long __pv_phys_offset;
>> +extern phys_addr_t __pv_phys_offset;
>>  #define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
>>
>>  static inline phys_addr_t __virt_to_phys(unsigned long x)
>>  {
>> -	unsigned long t;
>> -	early_patch_imm8("add", t, x, __pv_offset, 0);
>> -	return t;
>> +	unsigned long tlo, thi;
>> +
>> +	early_patch_imm8("add", tlo, x, __pv_offset, 0);
>> +
>> +#ifdef CONFIG_ARM_LPAE
>> +	/*
>> +	 * On LPAE, we do not _need_ to do 64-bit arithmetic because the high
>> +	 * order 32 bits are never changed by the phys-virt offset.  We simply
>> +	 * patch in the high order physical address bits instead.
>> +	 */
>> +#ifdef __ARMEB__
>> +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
>> +#else
>> +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
>> +#endif
>> +#endif
>> +	return (u64)tlo | (u64)thi << 32;
>>  }
>
> Hmmm... I'm afraid this is going to be suboptimal when LPAE is not
> selected.

I understand your concern, but I don't see the sub-optimality.  I tested
the following function with GCC versions 4.3.3 and 4.7.  This is after
the other changes that I mentioned in my previous email, but with the
__virt_to_phys() code itself unchanged:

phys_addr_t ____test_virt_to_phys(unsigned long addr)
{
	return __virt_to_phys(addr);
}

The resulting code in both cases looks like:

<____test_virt_to_phys>:
	b	c04f0528
	bx	lr

[the branch of course gets patched to an add]

> First of all, you do not need to cast tlo to a u64 in the return value.

True enough.

> Then, I'm not sure if the compiler is smart enough to see that the
> returned value is a phys_addr_t which can be a u32, and in this case the
> (u64)thi << 32 is going to be truncated right away, and therefore there
> is no point in emitting the corresponding instructions.

In this case, it appears to be smart enough.  However, I agree that
relying on compiler smarts is probably not the best thing for us to do.

> Furthermore, if LPAE is not defined, then thi never gets initialized and
> should produce a warning.  Did you test compilation of the code with
> LPAE turned off?

Sure.  One of our test platforms is non-LPAE.  The compiler does not
produce warnings on this, and this is consistent across both compiler
versions.

> I'd prefer something like this where more stuff is validated by the
> compiler:
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> 	unsigned long tlo, thi;
> 	phys_addr_t ret;
>
> 	early_patch_imm8("add", tlo, x, __pv_offset, 0);
> 	ret = tlo;
>
> 	if (sizeof(phys_addr_t) > 4) {
> #ifdef __ARMEB__
> 		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
> #else
> 		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
> #endif
> 		ret |= ((u64)thi) << 32;
> 	}
>
> 	return ret;
> }
>
> This should let the compiler optimize things whether LPAE is enabled or
> not while validating both cases.

Agreed on the principle, but more below...

I've meanwhile been chasing down another problem - the code generated for
the LPAE case.  The original code resulted in the following:

<____test_virt_to_phys>:
	mov	r2, #0
	b	c01bc800	# patch: add r1, r0, __pv_offset
	b	c01bc810	# patch: mov r0, __phys_offset_high
	orr	r2, r2, r1
	mov	r3, r0
	mov	r1, r3
	mov	r0, r2
	bx	lr

Yikes!  This code does a bunch of futile register shuffling and a
pointless or, all in the name of generating the result in a 64-bit
register-pair from the 32-bit halves.

In order to get past this, I tried adding operand qualifiers (R = upper
32-bits, Q = lower 32-bits) in the patch macros, in the hope that
treating these as native 64-bit register pairs would eliminate the need
to shuffle them around after the inline assembly blocks.  This then
allows us to implement __virt_to_phys() as follows:

static inline phys_addr_t __virt_to_phys(unsigned long x)
{
	phys_addr_t t;

	if (sizeof(t) == 4) {
		t = x;
		early_patch_imm8("add", t, "", t, __pv_offset, 0);
		return t;
	}

	/*
	 * On LPAE, we do not _need_ to do 64-bit arithmetic because
	 * the high order 32 bits are never changed by the phys-virt
	 * offset.  We simply patch in the high order physical address
	 * bits instead.
	 *
	 * Note: the mov _must_ be first here.  From the compiler's
	 * perspective, this is the initializer for the variable.  The
	 * mov itself initializes only the upper half.  The subsequent
	 * add treats t as a read/write operand and initializes the
	 * lower half.
	 */
#ifdef __ARMEB__
	early_patch_imm8_mov("mov", t, "R", __pv_phys_offset, 0);
#else
	early_patch_imm8_mov("mov", t, "R", __pv_phys_offset, 4);
#endif
	early_patch_imm8("add", t, "Q", x, __pv_offset, 0);

	return t;
}

With this, we get significantly better looking generated code:

<____test_virt_to_phys>:
	b	c01d519c	# patch: mov r3, __phys_offset_high
	b	c01d51ac	# patch: add r2, r0, __pv_offset
	mov	r0, r2
	mov	r1, r3
	bx	lr

This is about as far along as I've been able to proceed.  I still
haven't figured out a way to get it to patch in place without an extra
register pair.

Overall, this is still a bit too kludgy for my liking.  In particular,
the read/write operand forces add/sub/... users to initialize the result
variable.  I am currently leaning towards adding native support for
64-bit operations in the runtime patch code, instead of having to hack
around it with 32-bit primitives.  Better ideas, any one?

Thanks
-- Cyril.
On Sun, 12 Aug 2012, Cyril Chemparathy wrote:

> On 08/11/12 23:39, Nicolas Pitre wrote:
> > On Fri, 10 Aug 2012, Cyril Chemparathy wrote:
> >
> > > This patch adds support for 64-bit physical addresses in
> > > virt_to_phys() patching. This does not do real 64-bit add/sub, but
> > > instead patches in the upper 32-bits of the phys_offset directly
> > > into the output of virt_to_phys.
> > >
> > > There is no corresponding change on the phys_to_virt() side, because
> > > computations on the upper 32-bits would be discarded anyway.
> > >
> > > Signed-off-by: Cyril Chemparathy <cyril@ti.com>
> > > ---
> > >  arch/arm/include/asm/memory.h |   22 ++++++++++++++++++----
> > >  arch/arm/kernel/head.S        |    4 ++++
> > >  arch/arm/kernel/setup.c       |    2 +-
> > >  3 files changed, 23 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> > > index 81e1714..dc5fbf3 100644
> > > --- a/arch/arm/include/asm/memory.h
> > > +++ b/arch/arm/include/asm/memory.h
> > > @@ -154,14 +154,28 @@
> > >  #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> > >
> > >  extern unsigned long __pv_offset;
> > > -extern unsigned long __pv_phys_offset;
> > > +extern phys_addr_t __pv_phys_offset;
> > >  #define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
> > >
> > >  static inline phys_addr_t __virt_to_phys(unsigned long x)
> > >  {
> > > -	unsigned long t;
> > > -	early_patch_imm8("add", t, x, __pv_offset, 0);
> > > -	return t;
> > > +	unsigned long tlo, thi;
> > > +
> > > +	early_patch_imm8("add", tlo, x, __pv_offset, 0);
> > > +
> > > +#ifdef CONFIG_ARM_LPAE
> > > +	/*
> > > +	 * On LPAE, we do not _need_ to do 64-bit arithmetic because the high
> > > +	 * order 32 bits are never changed by the phys-virt offset.  We simply
> > > +	 * patch in the high order physical address bits instead.
> > > +	 */
> > > +#ifdef __ARMEB__
> > > +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
> > > +#else
> > > +	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
> > > +#endif
> > > +#endif
> > > +	return (u64)tlo | (u64)thi << 32;
> > >  }
> >
> > Hmmm... I'm afraid this is going to be suboptimal when LPAE is not
> > selected.
>
> I understand your concern, but I don't see the sub-optimality.  I tested
> the following function with GCC versions 4.3.3 and 4.7.  This is after
> the other changes that I mentioned in my previous email, but with the
> __virt_to_phys() code itself unchanged:
>
> phys_addr_t ____test_virt_to_phys(unsigned long addr)
> {
> 	return __virt_to_phys(addr);
> }
>
> The resulting code in both cases looks like:
>
> <____test_virt_to_phys>:
> 	b	c04f0528
> 	bx	lr
>
> [the branch of course gets patched to an add]

OK.  Please add such information and results in your commit log.

> > First of all, you do not need to cast tlo to a u64 in the return
> > value.
>
> True enough.
>
> > Then, I'm not sure if the compiler is smart enough to see that the
> > returned value is a phys_addr_t which can be a u32, and in this case
> > the (u64)thi << 32 is going to be truncated right away, and therefore
> > there is no point in emitting the corresponding instructions.
>
> In this case, it appears to be smart enough.  However, I agree that
> relying on compiler smarts is probably not the best thing for us to do.

Well, we do rely on a lot of compiler smarts in the kernel.  As long as
sufficiently old compiler versions do get it right then it should be
fine.

> > Furthermore, if LPAE is not defined, then thi never gets initialized
> > and should produce a warning.  Did you test compilation of the code
> > with LPAE turned off?
>
> Sure.  One of our test platforms is non-LPAE.  The compiler does not
> produce warnings on this, and this is consistent across both compiler
> versions.

That is odd.  I still prefer my version nevertheless, so as to avoid this
potential issue in the future.

> > I'd prefer something like this where more stuff is validated by the
> > compiler:
> >
> > static inline phys_addr_t __virt_to_phys(unsigned long x)
> > {
> > 	unsigned long tlo, thi;
> > 	phys_addr_t ret;
> >
> > 	early_patch_imm8("add", tlo, x, __pv_offset, 0);
> > 	ret = tlo;
> >
> > 	if (sizeof(phys_addr_t) > 4) {
> > #ifdef __ARMEB__
> > 		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
> > #else
> > 		early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
> > #endif
> > 		ret |= ((u64)thi) << 32;
> > 	}
> >
> > 	return ret;
> > }
> >
> > This should let the compiler optimize things whether LPAE is enabled
> > or not while validating both cases.
>
> Agreed on the principle, but more below...
>
> I've meanwhile been chasing down another problem - the code generated
> for the LPAE case.  The original code resulted in the following:
>
> <____test_virt_to_phys>:
> 	mov	r2, #0
> 	b	c01bc800	# patch: add r1, r0, __pv_offset
> 	b	c01bc810	# patch: mov r0, __phys_offset_high
> 	orr	r2, r2, r1
> 	mov	r3, r0
> 	mov	r1, r3
> 	mov	r0, r2
> 	bx	lr
>
> Yikes!  This code does a bunch of futile register shuffling and a
> pointless or, all in the name of generating the result in a 64-bit
> register-pair from the 32-bit halves.
>
> In order to get past this, I tried adding operand qualifiers (R = upper
> 32-bits, Q = lower 32-bits) in the patch macros, in the hope that
> treating these as native 64-bit register pairs would eliminate the need
> to shuffle them around after the inline assembly blocks.  This then
> allows us to implement __virt_to_phys() as follows:
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> 	phys_addr_t t;
>
> 	if (sizeof(t) == 4) {
> 		t = x;
> 		early_patch_imm8("add", t, "", t, __pv_offset, 0);
> 		return t;
> 	}
>
> 	/*
> 	 * On LPAE, we do not _need_ to do 64-bit arithmetic because
> 	 * the high order 32 bits are never changed by the phys-virt
> 	 * offset.  We simply patch in the high order physical address
> 	 * bits instead.
> 	 *
> 	 * Note: the mov _must_ be first here.  From the compiler's
> 	 * perspective, this is the initializer for the variable.  The
> 	 * mov itself initializes only the upper half.  The subsequent
> 	 * add treats t as a read/write operand and initializes the
> 	 * lower half.
> 	 */
> #ifdef __ARMEB__
> 	early_patch_imm8_mov("mov", t, "R", __pv_phys_offset, 0);
> #else
> 	early_patch_imm8_mov("mov", t, "R", __pv_phys_offset, 4);
> #endif
> 	early_patch_imm8("add", t, "Q", x, __pv_offset, 0);
>
> 	return t;
> }
>
> With this, we get significantly better looking generated code:
>
> <____test_virt_to_phys>:
> 	b	c01d519c	# patch: mov r3, __phys_offset_high
> 	b	c01d51ac	# patch: add r2, r0, __pv_offset
> 	mov	r0, r2
> 	mov	r1, r3
> 	bx	lr
>
> This is about as far along as I've been able to proceed.  I still
> haven't figured out a way to get it to patch in place without an extra
> register pair.
>
> Overall, this is still a bit too kludgy for my liking.  In particular,
> the read/write operand forces add/sub/... users to initialize the
> result variable.  I am currently leaning towards adding native support
> for 64-bit operations in the runtime patch code, instead of having to
> hack around it with 32-bit primitives.  Better ideas, any one?

The best I've come up with is this:

//typedef unsigned long phys_addr_t;
typedef unsigned long long phys_addr_t;

phys_addr_t __virt_to_phys(unsigned long v)
{
	phys_addr_t p;

	if (sizeof(phys_addr_t) > 4) {
		asm ("add %Q0, %1, #PV_OFFSET" : "=r" (p) : "r" (v));
		asm ("mov %R0, #PHYS_HIGHBITS" : "+r" (p));
	} else {
		asm ("add %0, %1, #PV_OFFSET" : "=r" (p) : "r" (v));
	}

	return p;
}

The add is done first as this reduces the life span of the v variable.
Because p is listed as an early clobber, then in theory the register
holding v could be used to store the low part of p right away.  If you
put the mov first then you can't do that, especially on big endian.

It seems that there is no clean way to combine the LPAE and non-LPAE
cases.

Nicolas
```diff
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 81e1714..dc5fbf3 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -154,14 +154,28 @@
 #ifdef CONFIG_ARM_PATCH_PHYS_VIRT
 
 extern unsigned long __pv_offset;
-extern unsigned long __pv_phys_offset;
+extern phys_addr_t __pv_phys_offset;
 #define PHYS_OFFSET	__virt_to_phys(PAGE_OFFSET)
 
 static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
-	unsigned long t;
-	early_patch_imm8("add", t, x, __pv_offset, 0);
-	return t;
+	unsigned long tlo, thi;
+
+	early_patch_imm8("add", tlo, x, __pv_offset, 0);
+
+#ifdef CONFIG_ARM_LPAE
+	/*
+	 * On LPAE, we do not _need_ to do 64-bit arithmetic because the high
+	 * order 32 bits are never changed by the phys-virt offset.  We simply
+	 * patch in the high order physical address bits instead.
+	 */
+#ifdef __ARMEB__
+	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 0);
+#else
+	early_patch_imm8_mov("mov", thi, __pv_phys_offset, 4);
+#endif
+#endif
+	return (u64)tlo | (u64)thi << 32;
 }
 
 static inline unsigned long __phys_to_virt(phys_addr_t x)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 69a3c09..61fb8df 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -530,7 +530,11 @@ ENDPROC(__fixup_pv_offsets)
 	.align
 1:	.long .
+#if defined(CONFIG_ARM_LPAE) && defined(__ARMEB__)
+	.long __pv_phys_offset + 4
+#else
 	.long __pv_phys_offset
+#endif
 	.long __pv_offset
 	.long PAGE_OFFSET
 #endif
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 59e0f57..edb4f42 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -159,7 +159,7 @@ DEFINE_PER_CPU(struct cpuinfo_arm, cpu_data);
  * The initializers here prevent these from landing in the BSS section.
  */
 unsigned long __pv_offset = 0xdeadbeef;
-unsigned long __pv_phys_offset = 0xdeadbeef;
+phys_addr_t __pv_phys_offset = 0xdeadbeef;
 EXPORT_SYMBOL(__pv_phys_offset);
 #endif
```
This patch adds support for 64-bit physical addresses in virt_to_phys()
patching.  This does not do real 64-bit add/sub, but instead patches in the
upper 32-bits of the phys_offset directly into the output of virt_to_phys.

There is no corresponding change on the phys_to_virt() side, because
computations on the upper 32-bits would be discarded anyway.

Signed-off-by: Cyril Chemparathy <cyril@ti.com>
---
 arch/arm/include/asm/memory.h |   22 ++++++++++++++++++----
 arch/arm/kernel/head.S        |    4 ++++
 arch/arm/kernel/setup.c       |    2 +-
 3 files changed, 23 insertions(+), 5 deletions(-)