Message ID | 20210706073211.349889-5-ravi.bangoria@linux.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | bpf powerpc: Add BPF_PROBE_MEM support for 64bit JIT |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Not a local patch |
On 06/07/2021 at 09:32, Ravi Bangoria wrote:
> On PowerPC with KUAP enabled, any kernel code which wants to
> access userspace needs to be surrounded by disable-enable KUAP.
> But that is not happening for BPF_PROBE_MEM load instruction.
> So, when BPF program tries to access invalid userspace address,
> page-fault handler considers it as bad KUAP fault:
>
>   Kernel attempted to read user page (d0000000) - exploit attempt? (uid: 0)
>
> Considering the fact that PTR_TO_BTF_ID (which uses BPF_PROBE_MEM
> mode) could either be a valid kernel pointer or NULL but should
> never be a pointer to userspace address, execute BPF_PROBE_MEM load
> only if addr > TASK_SIZE_MAX, otherwise set dst_reg=0 and move on.
>
> This will catch NULL, valid or invalid userspace pointers. Only bad
> kernel pointer will be handled by BPF exception table.
>
> [Alexei suggested for x86]
> Suggested-by: Alexei Starovoitov <ast@kernel.org>
> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> ---
>  arch/powerpc/net/bpf_jit_comp64.c | 38 +++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 1884c6dca89a..46becae76210 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -753,6 +753,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		/* dst = *(u8 *)(ul) (src + off) */
>  		case BPF_LDX | BPF_MEM | BPF_B:
>  		case BPF_LDX | BPF_PROBE_MEM | BPF_B:
> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
> +				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));

The preferred way to clear a register is 'li reg, 0'.

> +				PPC_JMP((ctx->idx + 2) * 4);
> +			}
>  			EMIT(PPC_RAW_LBZ(dst_reg, src_reg, off));
>  			if (insn_is_zext(&insn[i + 1]))
>  				addrs[++i] = ctx->idx * 4;
> @@ -763,6 +771,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		/* dst = *(u16 *)(ul) (src + off) */
>  		case BPF_LDX | BPF_MEM | BPF_H:
>  		case BPF_LDX | BPF_PROBE_MEM | BPF_H:
> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
> +				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
> +				PPC_JMP((ctx->idx + 2) * 4);
> +			}

That code seems strictly identical to the previous one and the next one.
Can you refactor it into a function?

>  			EMIT(PPC_RAW_LHZ(dst_reg, src_reg, off));
>  			if (insn_is_zext(&insn[i + 1]))
>  				addrs[++i] = ctx->idx * 4;
> @@ -773,6 +789,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		/* dst = *(u32 *)(ul) (src + off) */
>  		case BPF_LDX | BPF_MEM | BPF_W:
>  		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
> +				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
> +				PPC_JMP((ctx->idx + 2) * 4);
> +			}
>  			EMIT(PPC_RAW_LWZ(dst_reg, src_reg, off));
>  			if (insn_is_zext(&insn[i + 1]))
>  				addrs[++i] = ctx->idx * 4;
> @@ -783,6 +807,20 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>  		/* dst = *(u64 *)(ul) (src + off) */
>  		case BPF_LDX | BPF_MEM | BPF_DW:
>  		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
> +				if (off % 4)

That test is worth a comment.

And I'd prefer

	if (off & 3) {
		PPC_BCC(COND_GT, (ctx->idx + 5) * 4);
		EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
		PPC_JMP((ctx->idx + 3) * 4);
	} else {
		PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
		EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
		PPC_JMP((ctx->idx + 2) * 4);
	}

> +					PPC_BCC(COND_GT, (ctx->idx + 5) * 4);
> +				else
> +					PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));

Use PPC_RAW_LI(dst_reg, 0);

> +				if (off % 4)
> +					PPC_JMP((ctx->idx + 3) * 4);
> +				else
> +					PPC_JMP((ctx->idx + 2) * 4);
> +			}
>  			PPC_BPF_LL(dst_reg, src_reg, off);
>  			ret = add_extable_entry(fp, image, pass, code, ctx, dst_reg);
>  			if (ret)
>
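As a side note on the 'li reg, 0' suggestion: the review does not give its reason, but both forms are a single 4-byte instruction, so swapping one for the other leaves the branch offsets in the patch unchanged. The sketch below encodes both instructions by hand from the PowerPC instruction formats. The helper names encode_li_zero() and encode_xor() are made up for this illustration and are not the kernel's PPC_RAW_LI()/PPC_RAW_XOR() macros.

```c
#include <assert.h>
#include <stdint.h>

/*
 * li rD, 0 is the extended mnemonic for addi rD, 0, 0 (primary opcode 14).
 * It names no source register. Hypothetical helper, for illustration only.
 */
static uint32_t encode_li_zero(unsigned int rd)
{
	assert(rd < 32);
	return (14u << 26) | (rd << 21);
}

/*
 * xor rA, rS, rB: primary opcode 31, extended opcode 316.
 * Clearing a register this way names it as both source operands.
 */
static uint32_t encode_xor(unsigned int ra, unsigned int rs, unsigned int rb)
{
	assert(ra < 32 && rs < 32 && rb < 32);
	return (31u << 26) | (rs << 21) | (ra << 16) | (rb << 11) | (316u << 1);
}
```

For r3, encode_li_zero(3) gives 0x38600000 and encode_xor(3, 3, 3) gives 0x7C631A78: one word each.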
>> @@ -763,6 +771,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>  		/* dst = *(u16 *)(ul) (src + off) */
>>  		case BPF_LDX | BPF_MEM | BPF_H:
>>  		case BPF_LDX | BPF_PROBE_MEM | BPF_H:
>> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
>> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
>> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
>> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
>> +				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
>> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
>> +				PPC_JMP((ctx->idx + 2) * 4);
>> +			}
>
> That code seems strictly identical to the previous one and the next one.
> Can you refactor it into a function?

I'll check this.

>
>>  			EMIT(PPC_RAW_LHZ(dst_reg, src_reg, off));
>>  			if (insn_is_zext(&insn[i + 1]))
>>  				addrs[++i] = ctx->idx * 4;
>> @@ -773,6 +789,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>  		/* dst = *(u32 *)(ul) (src + off) */
>>  		case BPF_LDX | BPF_MEM | BPF_W:
>>  		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
>> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
>> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
>> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
>> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
>> +				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
>> +				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
>> +				PPC_JMP((ctx->idx + 2) * 4);
>> +			}
>>  			EMIT(PPC_RAW_LWZ(dst_reg, src_reg, off));
>>  			if (insn_is_zext(&insn[i + 1]))
>>  				addrs[++i] = ctx->idx * 4;
>> @@ -783,6 +807,20 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
>>  		/* dst = *(u64 *)(ul) (src + off) */
>>  		case BPF_LDX | BPF_MEM | BPF_DW:
>>  		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
>> +			if (BPF_MODE(code) == BPF_PROBE_MEM) {
>> +				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
>> +				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
>> +				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
>> +				if (off % 4)
>
> That test is worth a comment.

The (off % 4) test is based on how PPC_BPF_LL() emits instructions.

>
> And I'd prefer
>
> 	if (off & 3) {
> 		PPC_BCC(COND_GT, (ctx->idx + 5) * 4);
> 		EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
> 		PPC_JMP((ctx->idx + 3) * 4);
> 	} else {
> 		PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
> 		EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
> 		PPC_JMP((ctx->idx + 2) * 4);
> 	}

Yes this is neat.

Thanks for the review,
Ravi
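To make the point about the (off % 4) test concrete, here is a minimal sketch in plain C. It assumes that PPC_BPF_LL() emits a single load when the offset is a multiple of 4 and a two-instruction sequence otherwise; that assumption is inferred from the reply above and is not shown in the patch. The function names ll_insn_count(), jmp_skip() and tests_agree_for_all_offsets() are hypothetical and exist only for this illustration. The sketch also checks that the suggested (off & 3) form selects the same offsets as (off % 4), including negative ones.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed model of PPC_BPF_LL(): one instruction for a word-aligned
 * offset, two instructions otherwise. Hypothetical helper.
 */
static int ll_insn_count(int16_t off)
{
	return (off & 3) ? 2 : 1;
}

/*
 * Distance, in instructions, from the unconditional jump to the first
 * instruction after the load: the jump itself plus the load sequence.
 * This yields the "+ 2" and "+ 3" seen in the PPC_JMP() calls.
 */
static int jmp_skip(int load_insns)
{
	assert(load_insns == 1 || load_insns == 2);
	return 1 + load_insns;
}

/* Returns 1 when "off % 4" and "off & 3" agree on every signed 16-bit offset. */
static int tests_agree_for_all_offsets(void)
{
	for (int off = INT16_MIN; off <= INT16_MAX; off++)
		if (((off % 4) != 0) != ((off & 3) != 0))
			return 0;
	return 1;
}
```

With off = 8 the model gives one load instruction and a jump distance of 2; with off = 6 it gives two load instructions and a jump distance of 3.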
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 1884c6dca89a..46becae76210 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -753,6 +753,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		/* dst = *(u8 *)(ul) (src + off) */
 		case BPF_LDX | BPF_MEM | BPF_B:
 		case BPF_LDX | BPF_PROBE_MEM | BPF_B:
+			if (BPF_MODE(code) == BPF_PROBE_MEM) {
+				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
+				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
+				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
+				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
+				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
+				PPC_JMP((ctx->idx + 2) * 4);
+			}
 			EMIT(PPC_RAW_LBZ(dst_reg, src_reg, off));
 			if (insn_is_zext(&insn[i + 1]))
 				addrs[++i] = ctx->idx * 4;
@@ -763,6 +771,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		/* dst = *(u16 *)(ul) (src + off) */
 		case BPF_LDX | BPF_MEM | BPF_H:
 		case BPF_LDX | BPF_PROBE_MEM | BPF_H:
+			if (BPF_MODE(code) == BPF_PROBE_MEM) {
+				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
+				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
+				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
+				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
+				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
+				PPC_JMP((ctx->idx + 2) * 4);
+			}
 			EMIT(PPC_RAW_LHZ(dst_reg, src_reg, off));
 			if (insn_is_zext(&insn[i + 1]))
 				addrs[++i] = ctx->idx * 4;
@@ -773,6 +789,14 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		/* dst = *(u32 *)(ul) (src + off) */
 		case BPF_LDX | BPF_MEM | BPF_W:
 		case BPF_LDX | BPF_PROBE_MEM | BPF_W:
+			if (BPF_MODE(code) == BPF_PROBE_MEM) {
+				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
+				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
+				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
+				PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
+				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
+				PPC_JMP((ctx->idx + 2) * 4);
+			}
 			EMIT(PPC_RAW_LWZ(dst_reg, src_reg, off));
 			if (insn_is_zext(&insn[i + 1]))
 				addrs[++i] = ctx->idx * 4;
@@ -783,6 +807,20 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct codegen_context *
 		/* dst = *(u64 *)(ul) (src + off) */
 		case BPF_LDX | BPF_MEM | BPF_DW:
 		case BPF_LDX | BPF_PROBE_MEM | BPF_DW:
+			if (BPF_MODE(code) == BPF_PROBE_MEM) {
+				EMIT(PPC_RAW_ADDI(b2p[TMP_REG_1], src_reg, off));
+				PPC_LI64(b2p[TMP_REG_2], TASK_SIZE_MAX);
+				EMIT(PPC_RAW_CMPLD(b2p[TMP_REG_1], b2p[TMP_REG_2]));
+				if (off % 4)
+					PPC_BCC(COND_GT, (ctx->idx + 5) * 4);
+				else
+					PPC_BCC(COND_GT, (ctx->idx + 4) * 4);
+				EMIT(PPC_RAW_XOR(dst_reg, dst_reg, dst_reg));
+				if (off % 4)
+					PPC_JMP((ctx->idx + 3) * 4);
+				else
+					PPC_JMP((ctx->idx + 2) * 4);
+			}
 			PPC_BPF_LL(dst_reg, src_reg, off);
 			ret = add_extable_entry(fp, image, pass, code, ctx, dst_reg);
 			if (ret)
On PowerPC with KUAP enabled, any kernel code which wants to
access userspace needs to be surrounded by disable-enable KUAP.
But that is not happening for BPF_PROBE_MEM load instruction.
So, when BPF program tries to access invalid userspace address,
page-fault handler considers it as bad KUAP fault:

  Kernel attempted to read user page (d0000000) - exploit attempt? (uid: 0)

Considering the fact that PTR_TO_BTF_ID (which uses BPF_PROBE_MEM
mode) could either be a valid kernel pointer or NULL but should
never be a pointer to userspace address, execute BPF_PROBE_MEM load
only if addr > TASK_SIZE_MAX, otherwise set dst_reg=0 and move on.

This will catch NULL, valid or invalid userspace pointers. Only bad
kernel pointer will be handled by BPF exception table.

[Alexei suggested for x86]
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 arch/powerpc/net/bpf_jit_comp64.c | 38 +++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
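
The rule stated in the commit message (load only if addr > TASK_SIZE_MAX, otherwise set dst_reg=0) can be modelled in ordinary C, outside the JIT. This is only a sketch of the intended semantics, not the emitted PowerPC code. TOY_TASK_SIZE_MAX is a made-up value chosen for the example (the real TASK_SIZE_MAX depends on the kernel configuration), and guarded_probe_load() and fake_kernel_load() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up limit for illustration; not the real TASK_SIZE_MAX. */
#define TOY_TASK_SIZE_MAX 0x0010000000000000ULL

typedef uint64_t (*load_fn)(uint64_t addr);

/*
 * Model of the guarded BPF_PROBE_MEM load: the load runs only for
 * addresses strictly above the limit. NULL and any userspace address,
 * valid or not, yield 0 without touching memory.
 */
static uint64_t guarded_probe_load(uint64_t addr, load_fn do_load)
{
	assert(do_load);
	if (addr > TOY_TASK_SIZE_MAX)
		return do_load(addr);
	return 0;
}

/* Stand-in for the real load; always "reads" the same value. */
static uint64_t fake_kernel_load(uint64_t addr)
{
	(void)addr;
	return 0x1234;
}
```

In this model, guarded_probe_load(0xd0000000, fake_kernel_load), which uses the address from the fault message, returns 0 without calling the load. An address above the toy limit reaches fake_kernel_load(). A bad kernel address would still reach the load, which is the case the commit message leaves to the BPF exception table.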