ARM: decompressor: avoid ADRL pseudo-instruction

Message ID 20201109205155.1207545-1-ndesaulniers@google.com (mailing list archive)
State New, archived

Commit Message

Nick Desaulniers Nov. 9, 2020, 8:51 p.m. UTC
As Ard notes in
commit 54781938ec34 ("crypto: arm/sha256-neon - avoid ADRL pseudo
instruction")
commit 0f5e8323777b ("crypto: arm/sha512-neon - avoid ADRL pseudo
instruction")

  The ADRL pseudo instruction is not an architectural construct, but a
  convenience macro that was supported by the ARM proprietary assembler
  and adopted by binutils GAS as well, but only when assembling in 32-bit
  ARM mode. Therefore, it can only be used in assembler code that is known
  to assemble in ARM mode only, but as it turns out, the Clang assembler
  does not implement ADRL at all, and so it is better to get rid of it
  entirely.

  So replace the ADRL instruction with an ADR instruction that refers to
  a nearer symbol, and apply the delta explicitly using an additional
  instruction.

We can use the same technique to generate the same offset. It looks like
the ADRL pseudo instruction assembles to two SUB instructions in this
case. Because the immediate operand of an ARM SUB instruction must be an
8-bit value rotated right by an even number of bits, and the distance
between the reference and the symbol cannot be encoded that way, we need
to use an intermediary symbol (cache_off in this case) to cover the full
range.
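
To illustrate the immediate-encoding constraint: an ARM-mode (A32) data-processing immediate is an 8-bit value rotated right by an even number of bits. The following is a minimal sketch in Python, not kernel code. The function name and the sample distance 0x12C4 are hypothetical, chosen only to show a delta that does not fit one SUB but splits into two pieces that each do.

```python
def is_arm_modified_immediate(value: int) -> bool:
    """Return True if value fits an A32 data-processing immediate.

    Such an immediate is an 8-bit constant rotated right by an even
    number of bits (0, 2, ..., 30) within a 32-bit word.
    """
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by `rot` undoes a rotate-right by `rot`.
        undone = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if undone < 0x100:
            return True
    return False


# Hypothetical distance between the instruction and the target symbol.
distance = 0x12C4
# Splitting it at an intermediate point gives two encodable pieces.
first_step, second_step = 0x1200, 0xC4
assert first_step + second_step == distance
```

Under these assumptions, `is_arm_modified_immediate(distance)` is false while both `first_step` and `second_step` pass, which is why two SUB instructions with an intermediate symbol are needed.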

Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Suggested-by: Jian Cai <jiancai@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 arch/arm/boot/compressed/head.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Ard Biesheuvel Nov. 9, 2020, 8:53 p.m. UTC | #1
This is already fixed in Russell's for-next tree.
Nick Desaulniers Nov. 9, 2020, 9:09 p.m. UTC | #2
On Mon, Nov 9, 2020 at 12:53 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> This is already fixed in Russell's for-next tree.

Ah right, trolling through lore, there was:
https://lore.kernel.org/linux-arm-kernel/20200914095706.3985-1-ardb@kernel.org/

I didn't see anything in linux-next today, or
https://www.armlinux.org.uk/developer/patches/ Incoming or Applied.

Did it just get merged into the for-next branch, or is for-next not
getting pulled into linux-next?
Ard Biesheuvel Nov. 9, 2020, 9:45 p.m. UTC | #3
On Mon, 9 Nov 2020 at 22:09, Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> Ah right, trolling through lore, there was:
> https://lore.kernel.org/linux-arm-kernel/20200914095706.3985-1-ardb@kernel.org/
>
> I didn't see anything in linux-next today, or
> https://www.armlinux.org.uk/developer/patches/ Incoming or Applied.
>
> Did it just get merged into the for-next branch, or is for-next not
> getting pulled into linux-next?


It should appear tomorrow.
Nick Desaulniers Nov. 10, 2020, 7:33 p.m. UTC | #4
On Mon, Nov 9, 2020 at 1:45 PM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Mon, 9 Nov 2020 at 22:09, Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > I didn't see anything in linux-next today, or
> > https://www.armlinux.org.uk/developer/patches/ Incoming or Applied.
> >
> > Did it just get merged into the for-next branch, or is for-next not
> > getting pulled into linux-next?
>
>
> It should appear tomorrow.

Yep, LGTM.

Patch

diff --git a/arch/arm/boot/compressed/head.S b/arch/arm/boot/compressed/head.S
index 2e04ec5b5446..b3eac6f9a709 100644
--- a/arch/arm/boot/compressed/head.S
+++ b/arch/arm/boot/compressed/head.S
@@ -1440,7 +1440,9 @@  ENTRY(efi_enter_kernel)
 		mov	r4, r0			@ preserve image base
 		mov	r8, r1			@ preserve DT pointer
 
- ARM(		adrl	r0, call_cache_fn	)
+ ARM(		sub	r0, pc, #.L__efi_enter_kernel-cache_off	)
+ ARM(		sub	r0, r0, #cache_off-call_cache_fn	)
+.L__efi_enter_kernel:
  THUMB(		adr	r0, call_cache_fn	)
 		adr	r1, 0f			@ clean the region of code we
 		bl	cache_clean_flush	@ may run with the MMU off
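
The arithmetic behind the two SUB instructions can be checked with a small model. This is a sketch in Python under stated assumptions, not kernel code: the addresses are hypothetical, and it assumes an ARM-mode build, where the THUMB() line expands to nothing and .L__efi_enter_kernel therefore sits directly after the two 4-byte SUB instructions. In ARM state, reading pc yields the address of the current instruction plus 8.

```python
ARM_PC_READ_AHEAD = 8  # in ARM (A32) state, pc reads as instruction address + 8
ARM_INSN_SIZE = 4      # every A32 instruction is 4 bytes


def efi_enter_kernel_r0(first_sub_addr: int, cache_off: int,
                        call_cache_fn: int) -> int:
    """Model the value left in r0 by the two SUB instructions in the patch."""
    # .L__efi_enter_kernel is placed right after the two SUBs.
    label = first_sub_addr + 2 * ARM_INSN_SIZE
    # sub r0, pc, #.L__efi_enter_kernel-cache_off
    pc = first_sub_addr + ARM_PC_READ_AHEAD
    r0 = pc - (label - cache_off)
    # sub r0, r0, #cache_off-call_cache_fn
    r0 = r0 - (cache_off - call_cache_fn)
    return r0


# Hypothetical layout: call_cache_fn < cache_off < efi_enter_kernel code.
result = efi_enter_kernel_r0(first_sub_addr=0x1900,
                             cache_off=0x1400,
                             call_cache_fn=0x1000)
```

Because pc read by the first SUB equals the address of .L__efi_enter_kernel, the first instruction leaves the address of cache_off in r0, and the second steps from there back to call_cache_fn. With the hypothetical addresses above, `result` equals 0x1000.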