[0/2] use adrp/add pairs for PLT entries

Message ID 20181122084646.3247-1-ard.biesheuvel@linaro.org (mailing list archive)

Message

Ard Biesheuvel Nov. 22, 2018, 8:46 a.m. UTC
Currently, PLT entries use a non-idiomatic movn/movz/movz/br instruction
sequence which is also longer than necessary. Also, the code emitting
them does not use the instruction generation code but open codes the
opcodes directly.

The extended KASLR range is now 4 GB, given that we switched to the
small code model everywhere else (including for modules), so we can
switch to adrp/add/br sequences which are easier on the I-cache.

So implement adrp/add pair generation in the instruction generation code
and wire it up into the PLT code. Note that the Cortex-A53 errata handling
requires some special care to ensure that generated veneers are not
susceptible to the erratum.
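As a rough illustration of what an adrp/add/br veneer looks like at the encoding level, here is a minimal standalone sketch. It is not the code from this series: the names `plt_sketch` and `make_plt_sketch` are hypothetical, the use of x16 as the scratch register is an assumption, and the Cortex-A53 erratum handling mentioned above is deliberately left out. It only shows how the page delta is split across the ADRP immediate fields and why the reach is +/- 4 GB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical three-instruction veneer: adrp x16, target; add x16, x16, :lo12:target; br x16 */
struct plt_sketch {
	uint32_t adrp;
	uint32_t add;
	uint32_t br;
};

#define SKETCH_REG 16u	/* assumption: x16 (IP0) is used as the scratch register */

/*
 * Fill in *out for a veneer located at address pc that branches to target.
 * Returns false if the target page is outside the ADRP range
 * (signed 21-bit page offset, i.e. +/- 4 GB).
 */
static bool make_plt_sketch(uint64_t pc, uint64_t target, struct plt_sketch *out)
{
	/* ADRP works on 4 KB pages: compute the signed page delta. */
	int64_t pages = (int64_t)(target >> 12) - (int64_t)(pc >> 12);

	if (pages < -(1LL << 20) || pages >= (1LL << 20))
		return false;

	/* Two's complement 21-bit immediate, split into immlo (2 bits) and immhi (19 bits). */
	uint32_t imm = (uint32_t)pages & 0x1fffffu;
	uint32_t immlo = imm & 0x3u;
	uint32_t immhi = imm >> 2;

	/* ADRP Xd: op=1 (bit 31), immlo in bits 30:29, immhi in bits 23:5, Rd in bits 4:0. */
	out->adrp = 0x90000000u | (immlo << 29) | (immhi << 5) | SKETCH_REG;

	/* ADD Xd, Xn, #imm12 (64-bit, no shift): imm12 in bits 21:10, Rn in bits 9:5. */
	out->add = 0x91000000u | ((uint32_t)(target & 0xfffu) << 10) |
		   (SKETCH_REG << 5) | SKETCH_REG;

	/* BR Xn: Rn in bits 9:5. */
	out->br = 0xd61f0000u | (SKETCH_REG << 5);

	return true;
}
```

A caller would check the return value and fall back to some other strategy when the target is out of range; what that fallback should be is outside the scope of this sketch.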

Cc: Torsten Duwe <duwe@lst.de>
Cc: Jessica Yu <jeyu@kernel.org>

Ard Biesheuvel (2):
  arm64/insn: add support for emitting ADR/ADRP instructions
  arm64/module: switch to ADRP/ADD sequences for PLT entries

 arch/arm64/include/asm/insn.h   |  8 ++
 arch/arm64/include/asm/module.h | 36 ++------
 arch/arm64/kernel/ftrace.c      |  2 +-
 arch/arm64/kernel/insn.c        | 29 ++++++
 arch/arm64/kernel/module-plts.c | 93 +++++++++++++++-----
 arch/arm64/kernel/module.c      |  4 +-
 6 files changed, 119 insertions(+), 53 deletions(-)

Comments

Will Deacon Nov. 27, 2018, 7:44 p.m. UTC | #1
Hi Ard,

On Thu, Nov 22, 2018 at 09:46:44AM +0100, Ard Biesheuvel wrote:
> Currently, PLT entries use a non-idiomatic movn/movz/movz/br instruction
> sequence which is also longer than necessary. Also, the code emitting
> them does not use the instruction generation code but open codes the
> opcodes directly.
> 
> The extended KASLR range is now 4 GB, given that we switched to the
> small code model everywhere else (including for modules), so we can
> switch to adrp/add/br sequences which are easier on the I-cache.
> 
> So implement adrp/add pair generation in the instruction generation code
> and wire it up into the PLT code. Note that the Cortex-A53 errata handling
> requires some special care to ensure that generated veneers are not
> susceptible to the erratum.
> 
> Cc: Torsten Duwe <duwe@lst.de>
> Cc: Jessica Yu <jeyu@kernel.org>

I've applied this, with a couple of extra comments in the plt comparison
code and the Reviewed-by from Torsten. There were some trivial conflicts
with Jessica's rework of the plt lookup, but I think I got it right. Please
take a look at for-next/core when you get a chance.

Will

Ard Biesheuvel Nov. 27, 2018, 9:13 p.m. UTC | #2
On Tue, 27 Nov 2018 at 20:44, Will Deacon <will.deacon@arm.com> wrote:
>
> Hi Ard,
>
> On Thu, Nov 22, 2018 at 09:46:44AM +0100, Ard Biesheuvel wrote:
> > Currently, PLT entries use a non-idiomatic movn/movz/movz/br instruction
> > sequence which is also longer than necessary. Also, the code emitting
> > them does not use the instruction generation code but open codes the
> > opcodes directly.
> >
> > The extended KASLR range is now 4 GB, given that we switched to the
> > small code model everywhere else (including for modules), so we can
> > switch to adrp/add/br sequences which are easier on the I-cache.
> >
> > So implement adrp/add pair generation in the instruction generation code
> > and wire it up into the PLT code. Note that the Cortex-A53 errata handling
> > requires some special care to ensure that generated veneers are not
> > susceptible to the erratum.
> >
> > Cc: Torsten Duwe <duwe@lst.de>
> > Cc: Jessica Yu <jeyu@kernel.org>
>
> I've applied this, with a couple of extra comments in the plt comparison
> code and the Reviewed-by from Torsten. There were some trivial conflicts
> with Jessica's rework of the plt lookup, but I think I got it right. Please
> take a look at for-next/core when you get a chance.
>

Looks fine to me.