Message ID | 20191216091343.23260-7-bjorn.topel@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | riscv: BPF JIT fix, optimizations and far jumps support | expand |
On 12/16/19 10:13 AM, Björn Töpel wrote: > This commit makes sure that the JIT images is kept close to the kernel > text, so BPF calls can use relative calling with auipc/jalr or jal > instead of loading the full 64-bit address and jalr. > > The BPF JIT image region is 128 MB before the kernel text. > > Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> > --- > arch/riscv/include/asm/pgtable.h | 4 ++++ > arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > index 7ff0ed4f292e..cc3f49415620 100644 > --- a/arch/riscv/include/asm/pgtable.h > +++ b/arch/riscv/include/asm/pgtable.h > @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, > #define VMALLOC_END (PAGE_OFFSET - 1) > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) > > +#define BPF_JIT_REGION_SIZE (SZ_128M) > +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) > +#define BPF_JIT_REGION_END (VMALLOC_END) > + Series looks good to me, thanks; I'd like to get an ACK from Palmer/others on this one. > /* > * Roughly size the vmemmap space to be large enough to fit enough > * struct pages to map half the virtual address space. Then > diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c > index 8aa19c846881..46cff093f526 100644 > --- a/arch/riscv/net/bpf_jit_comp.c > +++ b/arch/riscv/net/bpf_jit_comp.c > @@ -1656,3 +1656,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) > tmp : orig_prog); > return prog; > } > + > +void *bpf_jit_alloc_exec(unsigned long size) > +{ > + return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START, > + BPF_JIT_REGION_END, GFP_KERNEL, > + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, > + __builtin_return_address(0)); > +} > + > +void bpf_jit_free_exec(void *addr) > +{ > + return vfree(addr); > +} >
On Mon, 16 Dec 2019 at 16:09, Daniel Borkmann <daniel@iogearbox.net> wrote: > > On 12/16/19 10:13 AM, Björn Töpel wrote: > > This commit makes sure that the JIT images is kept close to the kernel > > text, so BPF calls can use relative calling with auipc/jalr or jal > > instead of loading the full 64-bit address and jalr. > > > > The BPF JIT image region is 128 MB before the kernel text. > > > > Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> > > --- > > arch/riscv/include/asm/pgtable.h | 4 ++++ > > arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ > > 2 files changed, 17 insertions(+) > > > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > > index 7ff0ed4f292e..cc3f49415620 100644 > > --- a/arch/riscv/include/asm/pgtable.h > > +++ b/arch/riscv/include/asm/pgtable.h > > @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, > > #define VMALLOC_END (PAGE_OFFSET - 1) > > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) > > > > +#define BPF_JIT_REGION_SIZE (SZ_128M) > > +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) > > +#define BPF_JIT_REGION_END (VMALLOC_END) > > + > > Series looks good to me, thanks; I'd like to get an ACK from Palmer/others on this one. > Palmer/Paul/Albert, any thoughts/input? I can respin the the series w/o the call optimizations (excluding this patch and the next), but I'd prefer not given it's a nice speed/clean up for calls.
On Mon, 16 Dec 2019 01:13:40 PST (-0800), Bjorn Topel wrote: > This commit makes sure that the JIT images is kept close to the kernel > text, so BPF calls can use relative calling with auipc/jalr or jal > instead of loading the full 64-bit address and jalr. > > The BPF JIT image region is 128 MB before the kernel text. > > Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> > --- > arch/riscv/include/asm/pgtable.h | 4 ++++ > arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > index 7ff0ed4f292e..cc3f49415620 100644 > --- a/arch/riscv/include/asm/pgtable.h > +++ b/arch/riscv/include/asm/pgtable.h > @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, > #define VMALLOC_END (PAGE_OFFSET - 1) > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) > > +#define BPF_JIT_REGION_SIZE (SZ_128M) > +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) > +#define BPF_JIT_REGION_END (VMALLOC_END) > + > /* > * Roughly size the vmemmap space to be large enough to fit enough > * struct pages to map half the virtual address space. Then > diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c > index 8aa19c846881..46cff093f526 100644 > --- a/arch/riscv/net/bpf_jit_comp.c > +++ b/arch/riscv/net/bpf_jit_comp.c > @@ -1656,3 +1656,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) > tmp : orig_prog); > return prog; > } > + > +void *bpf_jit_alloc_exec(unsigned long size) > +{ > + return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START, > + BPF_JIT_REGION_END, GFP_KERNEL, > + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, > + __builtin_return_address(0)); > +} > + > +void bpf_jit_free_exec(void *addr) > +{ > + return vfree(addr); > +} Ah, I guess I should have read the whole patch set :) Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> and feel free to put the same on whatever patch I just asked that question on, though maybe this one should go before the others as they sort of depend on it?
On Wed, 18 Dec 2019, Björn Töpel wrote: > On Mon, 16 Dec 2019 at 16:09, Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > On 12/16/19 10:13 AM, Björn Töpel wrote: > > > This commit makes sure that the JIT images is kept close to the kernel > > > text, so BPF calls can use relative calling with auipc/jalr or jal > > > instead of loading the full 64-bit address and jalr. > > > > > > The BPF JIT image region is 128 MB before the kernel text. > > > > > > Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> > > > --- > > > arch/riscv/include/asm/pgtable.h | 4 ++++ > > > arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ > > > 2 files changed, 17 insertions(+) > > > > > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > > > index 7ff0ed4f292e..cc3f49415620 100644 > > > --- a/arch/riscv/include/asm/pgtable.h > > > +++ b/arch/riscv/include/asm/pgtable.h > > > @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, > > > #define VMALLOC_END (PAGE_OFFSET - 1) > > > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) > > > > > > +#define BPF_JIT_REGION_SIZE (SZ_128M) > > > +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) > > > +#define BPF_JIT_REGION_END (VMALLOC_END) > > > + > > > > Series looks good to me, thanks; I'd like to get an ACK from Palmer/others on this one. > > > > Palmer/Paul/Albert, any thoughts/input? I can respin the the series > w/o the call optimizations (excluding this patch and the next), but > I'd prefer not given it's a nice speed/clean up for calls. Looks like Palmer's already reviewed it. One additional request: could you update the VM layout debugging messages in arch/riscv/mm/init.c to include this new area? thanks, - Paul
Paul, Back from the holidays. Sorry about the delay! On Sat, 4 Jan 2020 at 02:32, Paul Walmsley <paul.walmsley@sifive.com> wrote: > [...] > > Looks like Palmer's already reviewed it. One additional request: could > you update the VM layout debugging messages in arch/riscv/mm/init.c to > include this new area? > Sure, I'll send that as a follow-up! Related; Other archs, e.g. arm64, has moved away from dumping the VM layout (commit 071929dbdd86 ("arm64: Stop printing the virtual memory layout")), and instead rely on _PTDUMP. Going forward that probably applies to riscv as well! Cheers, Björn
On Tue, 7 Jan 2020, Björn Töpel wrote: > On Sat, 4 Jan 2020 at 02:32, Paul Walmsley <paul.walmsley@sifive.com> wrote: > > > Looks like Palmer's already reviewed it. One additional request: could > > you update the VM layout debugging messages in arch/riscv/mm/init.c to > > include this new area? > > Sure, I'll send that as a follow-up! Thanks. > Related; Other archs, e.g. arm64, has moved away from dumping the VM > layout (commit 071929dbdd86 ("arm64: Stop printing the virtual memory > layout")), and instead rely on _PTDUMP. Going forward that probably > applies to riscv as well! For the specific case of the page table dumper: we're waiting for the generic ptdump patchset to be merged first - hopefully for v5.6. The RISC-V integration patches against it were posted to the lists back in October. But, to me, that targets a different use-case than the VM layout debug print code. - Paul
On 12/16/19 4:13 AM, Björn Töpel wrote: > This commit makes sure that the JIT images is kept close to the kernel > text, so BPF calls can use relative calling with auipc/jalr or jal > instead of loading the full 64-bit address and jalr. > > The BPF JIT image region is 128 MB before the kernel text. > > Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> > --- > arch/riscv/include/asm/pgtable.h | 4 ++++ > arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h > index 7ff0ed4f292e..cc3f49415620 100644 > --- a/arch/riscv/include/asm/pgtable.h > +++ b/arch/riscv/include/asm/pgtable.h > @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, > #define VMALLOC_END (PAGE_OFFSET - 1) > #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) > > +#define BPF_JIT_REGION_SIZE (SZ_128M) > +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) > +#define BPF_JIT_REGION_END (VMALLOC_END) > + > /* > * Roughly size the vmemmap space to be large enough to fit enough > * struct pages to map half the virtual address space. Then > diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c > index 8aa19c846881..46cff093f526 100644 > --- a/arch/riscv/net/bpf_jit_comp.c > +++ b/arch/riscv/net/bpf_jit_comp.c > @@ -1656,3 +1656,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) > tmp : orig_prog); > return prog; > } > + > +void *bpf_jit_alloc_exec(unsigned long size) > +{ > + return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START, > + BPF_JIT_REGION_END, GFP_KERNEL, > + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, > + __builtin_return_address(0)); > +} > + > +void bpf_jit_free_exec(void *addr) > +{ > + return vfree(addr); > +} I think it would be better to completely avoid this patch and the definition of this new zone by using the generic implementation if we had the patch discussed here regarding modules memory allocation (that in any case we need to fix modules loading): https://lore.kernel.org/linux-riscv/d868acf5-7242-93dc-0051-f97e64dc4387@ghiti.fr/T/#m2be30cb71dc9aa834a50d346961acee26158a238 Alex
On Sun, 2 Feb 2020 at 14:37, Alex Ghiti <alex@ghiti.fr> wrote: > [...] > > I think it would be better to completely avoid this patch and the > definition of this > new zone by using the generic implementation if we had the patch > discussed here > regarding modules memory allocation (that in any case we need to fix > modules loading): > > https://lore.kernel.org/linux-riscv/d868acf5-7242-93dc-0051-f97e64dc4387@ghiti.fr/T/#m2be30cb71dc9aa834a50d346961acee26158a238 > This patch is already upstream. I agree that when the module allocation fix is upstream, the BPF image allocation can be folded into the module allocation. IOW, I wont send any page table dumper patch for BPF memory. But keep in mind that the RV BPF JIT relies on having the kernel text within the 32b range (as does modules) Cheers, Björn
On 2/3/20 7:28 AM, Björn Töpel wrote: > On Sun, 2 Feb 2020 at 14:37, Alex Ghiti <alex@ghiti.fr> wrote: > [...] >> I think it would be better to completely avoid this patch and the >> definition of this >> new zone by using the generic implementation if we had the patch >> discussed here >> regarding modules memory allocation (that in any case we need to fix >> modules loading): >> >> https://lore.kernel.org/linux-riscv/d868acf5-7242-93dc-0051-f97e64dc4387@ghiti.fr/T/#m2be30cb71dc9aa834a50d346961acee26158a238 >> > This patch is already upstream. I agree that when the module > allocation fix is upstream, the BPF image allocation can be folded > into the module allocation. IOW, I wont send any page table dumper > patch for BPF memory. Too late then :) I'll remove this zone with the patch regarding module allocation. > > But keep in mind that the RV BPF JIT relies on having the kernel text > within the 32b range (as does modules) Yep, same constraints as for modules ;) Thanks, Alex > > Cheers, > Björn
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 7ff0ed4f292e..cc3f49415620 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -404,6 +404,10 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) +#define BPF_JIT_REGION_SIZE (SZ_128M) +#define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) +#define BPF_JIT_REGION_END (VMALLOC_END) + /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then diff --git a/arch/riscv/net/bpf_jit_comp.c b/arch/riscv/net/bpf_jit_comp.c index 8aa19c846881..46cff093f526 100644 --- a/arch/riscv/net/bpf_jit_comp.c +++ b/arch/riscv/net/bpf_jit_comp.c @@ -1656,3 +1656,16 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog) tmp : orig_prog); return prog; } + +void *bpf_jit_alloc_exec(unsigned long size) +{ + return __vmalloc_node_range(size, PAGE_SIZE, BPF_JIT_REGION_START, + BPF_JIT_REGION_END, GFP_KERNEL, + PAGE_KERNEL_EXEC, 0, NUMA_NO_NODE, + __builtin_return_address(0)); +} + +void bpf_jit_free_exec(void *addr) +{ + return vfree(addr); +}
This commit makes sure that the JIT images is kept close to the kernel text, so BPF calls can use relative calling with auipc/jalr or jal instead of loading the full 64-bit address and jalr. The BPF JIT image region is 128 MB before the kernel text. Signed-off-by: Björn Töpel <bjorn.topel@gmail.com> --- arch/riscv/include/asm/pgtable.h | 4 ++++ arch/riscv/net/bpf_jit_comp.c | 13 +++++++++++++ 2 files changed, 17 insertions(+)