Message ID | 20240826071624.350108-1-xukuohai@huaweicloud.com (mailing list archive) |
---|---|
Series | bpf, arm64: Simplify jited prologue/epilogue |
Xu Kuohai <xukuohai@huaweicloud.com> writes:

> From: Xu Kuohai <xukuohai@huawei.com>
>
> The arm64 jit blindly saves/restores all callee-saved registers, making
> the jited result look a bit too complicated. For example, for an empty
> prog, the jited result is:
>
> [...]
>
> Xu Kuohai (2):
>   bpf, arm64: Get rid of fpb
>   bpf, arm64: Avoid blindly saving/restoring all callee-saved registers

Acked-by: Puranjay Mohan <puranjay@kernel.org>

Thanks,
Puranjay Mohan
Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 26 Aug 2024 15:16:22 +0800 you wrote:
> From: Xu Kuohai <xukuohai@huawei.com>
>
> The arm64 jit blindly saves/restores all callee-saved registers, making
> the jited result look a bit too complicated. For example, for an empty
> prog, the jited result is:
>
> [...]

Here is the summary with links:
  - [bpf-next,1/2] bpf, arm64: Get rid of fpb
    https://git.kernel.org/bpf/bpf-next/c/bd737fcb6485
  - [bpf-next,2/2] bpf, arm64: Avoid blindly saving/restoring all callee-saved registers
    https://git.kernel.org/bpf/bpf-next/c/5d4fa9ec5643

You are awesome, thank you!
From: Xu Kuohai <xukuohai@huawei.com>

The arm64 jit blindly saves/restores all callee-saved registers, making
the jited result look a bit too complicated. For example, for an empty
prog, the jited result is:

   0:   bti jc
   4:   mov x9, lr
   8:   nop
   c:   paciasp
  10:   stp fp, lr, [sp, #-16]!
  14:   mov fp, sp
  18:   stp x19, x20, [sp, #-16]!
  1c:   stp x21, x22, [sp, #-16]!
  20:   stp x26, x25, [sp, #-16]!
  24:   mov x26, #0
  28:   stp x26, x25, [sp, #-16]!
  2c:   mov x26, sp
  30:   stp x27, x28, [sp, #-16]!
  34:   mov x25, sp
  38:   bti j // tailcall target
  3c:   sub sp, sp, #0
  40:   mov x7, #0
  44:   add sp, sp, #0
  48:   ldp x27, x28, [sp], #16
  4c:   ldp x26, x25, [sp], #16
  50:   ldp x26, x25, [sp], #16
  54:   ldp x21, x22, [sp], #16
  58:   ldp x19, x20, [sp], #16
  5c:   ldp fp, lr, [sp], #16
  60:   mov x0, x7
  64:   autiasp
  68:   ret

Clearly, there is no need to save/restore unused callee-saved registers.
This series makes the jited image save/restore only the callee-saved
registers it actually uses.

With this change, the jited result for an empty prog is:

   0:   bti jc
   4:   mov x9, lr
   8:   nop
   c:   paciasp
  10:   stp fp, lr, [sp, #-16]!
  14:   mov fp, sp
  18:   stp xzr, x26, [sp, #-16]!
  1c:   mov x26, sp
  20:   bti j // tailcall target
  24:   mov x7, #0
  28:   ldp xzr, x26, [sp], #16
  2c:   ldp fp, lr, [sp], #16
  30:   mov x0, x7
  34:   autiasp
  38:   ret

Xu Kuohai (2):
  bpf, arm64: Get rid of fpb
  bpf, arm64: Avoid blindly saving/restoring all callee-saved registers

 arch/arm64/net/bpf_jit_comp.c | 394 +++++++++++++++++-----------------
 1 file changed, 192 insertions(+), 202 deletions(-)
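For readers who want a feel for the idea, here is a minimal user-space sketch of "save only what is used". It is not the code from this series: `reg_bit()`, `emit_saves()` and the bitmask layout are hypothetical names invented for illustration, and pushing a leftover single register with `str` is an assumption rather than something the cover letter states. In the listings above the empty prog shrinks from 27 instructions (offsets 0x0 to 0x68) to 15 (0x0 to 0x38); the sketch only models the part of that saving that comes from skipping unused x19..x28 pairs.

```c
#include <stdint.h>
#include <stdio.h>

#define FIRST_CALLEE_SAVED 19 /* x19 */
#define NUM_CALLEE_SAVED   10 /* x19..x28 */

/* Hypothetical helper: bit i of the mask stands for register x(19 + i). */
uint32_t reg_bit(int reg)
{
	return (uint32_t)1 << (reg - FIRST_CALLEE_SAVED);
}

/*
 * Hypothetical sketch: print the save sequence for the registers set in
 * used_mask and return the number of instructions printed. Registers are
 * pushed in pairs with stp. A leftover single register is pushed with str
 * (an assumption), still reserving 16 bytes so sp stays 16-byte aligned.
 */
int emit_saves(FILE *out, uint32_t used_mask)
{
	int regs[NUM_CALLEE_SAVED];
	int n = 0, insns = 0, i;

	/* Collect the used registers in ascending order. */
	for (i = 0; i < NUM_CALLEE_SAVED; i++)
		if (used_mask & ((uint32_t)1 << i))
			regs[n++] = FIRST_CALLEE_SAVED + i;

	/* Push full pairs. */
	for (i = 0; i + 1 < n; i += 2) {
		fprintf(out, "stp x%d, x%d, [sp, #-16]!\n", regs[i], regs[i + 1]);
		insns++;
	}

	/* Push the odd register out, if any. */
	if (n % 2) {
		fprintf(out, "str x%d, [sp, #-16]!\n", regs[n - 1]);
		insns++;
	}

	return insns;
}
```

Calling `emit_saves(stdout, 0)` prints nothing, which corresponds to a prog that touches no callee-saved registers, while a mask with all ten bits set prints five `stp` instructions. A matching restore sequence would walk the same list in reverse with post-indexed `ldp`/`ldr`.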