Message ID | 20220421155620.81048-1-larysa.zaremba@intel.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | BPF |
Headers | show |
Series | Accessing XDP packet memory from the end | expand |
On 21/04/2022 17.56, Larysa Zaremba wrote: > Dear all, > Our team has encountered a need of accessing data_meta in a following way: > > int xdp_meta_prog(struct xdp_md *ctx) > { > void *data_meta_ptr = (void *)(long)ctx->data_meta; > void *data_end = (void *)(long)ctx->data_end; > void *data = (void *)(long)ctx->data; > u64 data_size = sizeof(u32); > u32 magic_meta; > u8 offset; > > offset = (u8)((s64)data - (s64)data_meta_ptr); I'm not sure the verifier can handle this 'offset' calc. As it cannot statically know the sized based on this statement. Maybe this is not the issue. > if (offset < data_size) { > bpf_printk("invalid offset: %ld\n", offset); > return XDP_DROP; > } > > data_meta_ptr += offset; > data_meta_ptr -= data_size; > > if (data_meta_ptr + data_size > data) { > return XDP_DROP; > } > > magic_meta = *((u32 *)data); > bpf_printk("Magic: %d\n", magic_meta); > return XDP_PASS; > } > > Unfortunately, verifier claims this code attempts to access packet with > an offset of -2 (a constant part) and negative offset is generally forbidden. Are you forgetting to mention: - Have you modified the NIC driver to adjust data_meta pointer and provide info in this area? p.s. this is exactly what I'm also working towards[1], so I'll be happy to collaborate. I'm missing the driver code, as link[1] is focused on decoding BTF data_meta area in userspace for AF_XDP. [1] https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(), > which is pretty good, but not ideal for the hot path. > The second one is the patch at the end. > Are you saying, verifier cannot handle that driver changed data_meta pointer and provided info there (without calling bpf_xdp_adjust_meta)? > Do you see any other way of accessing memory from the end of data_meta/data? > What do you think about both suggested solutions? > > Best regards, > Larysa Zaremba > > --- > > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off, > } > > err = reg->range < 0 ? -EINVAL : > - __check_mem_access(env, regno, off, size, reg->range, > - zero_size_allowed); > + __check_mem_access(env, regno, off + reg->smin_value, size, > + reg->range + reg->smin_value, zero_size_allowed); > + err = err ? : > + __check_mem_access(env, regno, off + reg->umax_value, size, > + reg->range + reg->umax_value, zero_size_allowed); > if (err) { > verbose(env, "R%d offset is outside of the packet\n", regno); > return err; >
Larysa Zaremba <larysa.zaremba@intel.com> writes: > Dear all, > Our team has encountered a need of accessing data_meta in a following way: > > int xdp_meta_prog(struct xdp_md *ctx) > { > void *data_meta_ptr = (void *)(long)ctx->data_meta; > void *data_end = (void *)(long)ctx->data_end; > void *data = (void *)(long)ctx->data; > u64 data_size = sizeof(u32); > u32 magic_meta; > u8 offset; > > offset = (u8)((s64)data - (s64)data_meta_ptr); > if (offset < data_size) { > bpf_printk("invalid offset: %ld\n", offset); > return XDP_DROP; > } > > data_meta_ptr += offset; > data_meta_ptr -= data_size; > > if (data_meta_ptr + data_size > data) { > return XDP_DROP; > } > > magic_meta = *((u32 *)data); > bpf_printk("Magic: %d\n", magic_meta); > return XDP_PASS; > } > > Unfortunately, verifier claims this code attempts to access packet with > an offset of -2 (a constant part) and negative offset is generally forbidden. > > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(), > which is pretty good, but not ideal for the hot path. > The second one is the patch at the end. > > Do you see any other way of accessing memory from the end of data_meta/data? > What do you think about both suggested solutions? The problem is that the compiler is generating code that the verifier doesn't understand. It's notoriously hard to get LLVM to produce code that preserves the right bounds checks which is why projects like Cilium use helpers with inline ASM to produce the right loads, like in [0]. Adapting that cilium helper to load from the metadata area, your example can be rewritten as follows (which works just fine with no verifier changes): static __always_inline int xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len) { void *from; int ret; /* LLVM tends to generate code that verifier doesn't understand, * so force it the way we want it in order to open up a range * on the reg. */ asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t" "r2 = *(u32 *)(%[ctx] +0)\n\t" "%[off] &= %[offmax]\n\t" "r1 += %[off]\n\t" "%[from] = r1\n\t" "r1 += %[len]\n\t" "if r1 > r2 goto +2\n\t" "%[ret] = 0\n\t" "goto +1\n\t" "%[ret] = %[errno]\n\t" : [ret]"=r"(ret), [from]"=r"(from) : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len), [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL) : "r1", "r2"); if (!ret) __builtin_memcpy(to, from, len); return ret; } SEC("xdp") int xdp_meta_prog(struct xdp_md *ctx) { void *data_meta_ptr = (void *)(long)ctx->data_meta; void *data = (void *)(long)ctx->data; __u32 magic_meta; __u8 offset; int ret; offset = (__u8)((__s64)data - (__s64)data_meta_ptr); ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta)); if (ret) { bpf_printk("load bytes failed: %d\n", ret); return XDP_DROP; } bpf_printk("Magic: %d\n", magic_meta); return XDP_PASS; } -Toke [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/ctx/xdp.h#L35
From: Jesper Dangaard Brouer <jbrouer@redhat.com> Date: Thu, 21 Apr 2022 18:27:47 +0200 > On 21/04/2022 17.56, Larysa Zaremba wrote: > > Dear all, > > Our team has encountered a need of accessing data_meta in a following way: > > > > int xdp_meta_prog(struct xdp_md *ctx) > > { > > void *data_meta_ptr = (void *)(long)ctx->data_meta; > > void *data_end = (void *)(long)ctx->data_end; > > void *data = (void *)(long)ctx->data; > > u64 data_size = sizeof(u32); > > u32 magic_meta; > > u8 offset; > > > > offset = (u8)((s64)data - (s64)data_meta_ptr); > > I'm not sure the verifier can handle this 'offset' calc. As it cannot > statically know the sized based on this statement. Maybe this is not the > issue. > > > if (offset < data_size) { > > bpf_printk("invalid offset: %ld\n", offset); > > return XDP_DROP; > > } > > > > data_meta_ptr += offset; > > data_meta_ptr -= data_size; > > > > if (data_meta_ptr + data_size > data) { > > return XDP_DROP; > > } > > > > magic_meta = *((u32 *)data); > > bpf_printk("Magic: %d\n", magic_meta); > > return XDP_PASS; > > } > > > > Unfortunately, verifier claims this code attempts to access packet with > > an offset of -2 (a constant part) and negative offset is generally forbidden. > > Are you forgetting to mention: > - Have you modified the NIC driver to adjust data_meta pointer and > provide info in this area? Exactly. Previously, @data_meta == @data prior to running BPF program in 100% cases. Now, the driver can provide arbitrary metadata and set @data_meta to be @data - 32, data - 48 or so. > > p.s. this is exactly what I'm also working towards[1], so I'll be happy > to collaborate. I'm missing the driver code, as link[1] is focused on > decoding BTF data_meta area in userspace for AF_XDP. Yeah, we're almost about to post a first RFC to LKML. This issue is the last one, the rest just needs to be rebased to fix some minors and polish the code. It will contain the kernel core part and the driver part (only ice for now). Then we could e.g. fuse it with your changes (we weren't touching AF_XDP part) etc. But for now, until an RFC is posted, you could take a look at the code in my GH[0] if you're wish :) The second half of the ice code is not committed yet tho. > > [1] > https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction > > > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(), > > which is pretty good, but not ideal for the hot path. > > The second one is the patch at the end. > > > > Are you saying, verifier cannot handle that driver changed data_meta > pointer and provided info there (without calling bpf_xdp_adjust_meta)? Correct. I suspect the verifier just assumes that @data_meta always equals @data when executing BPF prog. Let's assume: offset = data - data_meta; // 64 bytes data_meta += offset; // equals to data now /* Let's say xdp_meta_generic is 48 bytes long, then */ data_meta -= sizeof(struct xdp_meta_generic); /* data_meta is now 16 bytes past the original data_meta, * or data - 48. */ bpf_printk("magic: 0x%04x\n", ((struct xdp_meta_generic)data_meta)->magic); So in fact, this code is absolutely correct, it doesn't go past the bounds in either direction, but the verifier claims it goes out of bounds to the left by 48 bytes (not counting the offsetof). OTOH, data_meta = (void *)ctx->data_meta; bpf_printk("magic: 0x%04x\n", ((struct xdp_meta_generic)data_meta)->magic); works with no issues. The verifier still thinks @data_meta == @data, but this code effectively accesses the metadata, not the frame itself. > > > > Do you see any other way of accessing memory from the end of data_meta/data? > > What do you think about both suggested solutions? > > > > Best regards, > > Larysa Zaremba > > > > --- > > > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off, > > } > > > > err = reg->range < 0 ? -EINVAL : > > - __check_mem_access(env, regno, off, size, reg->range, > > - zero_size_allowed); > > + __check_mem_access(env, regno, off + reg->smin_value, size, > > + reg->range + reg->smin_value, zero_size_allowed); > > + err = err ? : > > + __check_mem_access(env, regno, off + reg->umax_value, size, > > + reg->range + reg->umax_value, zero_size_allowed); > > if (err) { > > verbose(env, "R%d offset is outside of the packet\n", regno); > > return err; [0] https://github.com/alobakin/linux/commits/xdp_hints Thanks, Al
From: Toke Høiland-Jørgensen <toke@redhat.com> Date: Thu, 21 Apr 2022 19:17:11 +0200 > Larysa Zaremba <larysa.zaremba@intel.com> writes: > > > Dear all, > > Our team has encountered a need of accessing data_meta in a following way: > > > > int xdp_meta_prog(struct xdp_md *ctx) > > { > > void *data_meta_ptr = (void *)(long)ctx->data_meta; > > void *data_end = (void *)(long)ctx->data_end; > > void *data = (void *)(long)ctx->data; > > u64 data_size = sizeof(u32); > > u32 magic_meta; > > u8 offset; > > > > offset = (u8)((s64)data - (s64)data_meta_ptr); > > if (offset < data_size) { > > bpf_printk("invalid offset: %ld\n", offset); > > return XDP_DROP; > > } > > > > data_meta_ptr += offset; > > data_meta_ptr -= data_size; > > > > if (data_meta_ptr + data_size > data) { > > return XDP_DROP; > > } > > > > magic_meta = *((u32 *)data); > > bpf_printk("Magic: %d\n", magic_meta); > > return XDP_PASS; > > } > > > > Unfortunately, verifier claims this code attempts to access packet with > > an offset of -2 (a constant part) and negative offset is generally forbidden. > > > > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(), > > which is pretty good, but not ideal for the hot path. > > The second one is the patch at the end. > > > > Do you see any other way of accessing memory from the end of data_meta/data? > > What do you think about both suggested solutions? > > The problem is that the compiler is generating code that the verifier > doesn't understand. It's notoriously hard to get LLVM to produce code > that preserves the right bounds checks which is why projects like Cilium > use helpers with inline ASM to produce the right loads, like in [0]. > > Adapting that cilium helper to load from the metadata area, your example > can be rewritten as follows (which works just fine with no verifier > changes): > > static __always_inline int > xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len) > { > void *from; > int ret; > /* LLVM tends to generate code that verifier doesn't understand, > * so force it the way we want it in order to open up a range > * on the reg. > */ > asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t" > "r2 = *(u32 *)(%[ctx] +0)\n\t" > "%[off] &= %[offmax]\n\t" > "r1 += %[off]\n\t" > "%[from] = r1\n\t" > "r1 += %[len]\n\t" > "if r1 > r2 goto +2\n\t" > "%[ret] = 0\n\t" > "goto +1\n\t" > "%[ret] = %[errno]\n\t" > : [ret]"=r"(ret), [from]"=r"(from) > : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len), > [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL) > : "r1", "r2"); > if (!ret) > __builtin_memcpy(to, from, len); > return ret; > } > > > SEC("xdp") > int xdp_meta_prog(struct xdp_md *ctx) > { > void *data_meta_ptr = (void *)(long)ctx->data_meta; > void *data = (void *)(long)ctx->data; > __u32 magic_meta; > __u8 offset; > int ret; > > offset = (__u8)((__s64)data - (__s64)data_meta_ptr); > ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta)); > if (ret) { > bpf_printk("load bytes failed: %d\n", ret); > return XDP_DROP; > } > > bpf_printk("Magic: %d\n", magic_meta); > return XDP_PASS; > } At the moment, we use this (based on Cilium's and your), it works just like we want C code to work previously: #define __CTX_OFF_MAX 0xff static __always_inline void * can_i_access_meta_please(const struct xdp_md *ctx, __u64 off, const __u64 len) { void *ret; /* LLVM tends to generate code that verifier doesn't understand, * so force it the way we want it in order to open up a range * on the reg. */ asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t" "r2 = *(u32 *)(%[ctx] +0)\n\t" "%[off] &= %[offmax]\n\t" "r1 += %[off]\n\t" "%[ret] = r1\n\t" "r1 += %[len]\n\t" "if r1 > r2 goto +1\n\t" "goto +1\n\t" "%[ret] = %[null]\n\t" : [ret]"=r"(ret) : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len), [offmax]"i"(__CTX_OFF_MAX), [null]"i"(NULL) : "r1", "r2"); return ret; } SEC("xdp") int xdp_prognum_n0_meta(struct xdp_md *ctx) { void *data_meta = (void *)(__s64)ctx->data_meta; void *data = (void *)(__s64)ctx->data; struct xdp_meta_generic *md; __u64 offset; offset = (__u64)((__s64)data - (__s64)data_meta); md = can_i_access_meta_please(ctx, offset, sizeof(*md)); if (__builtin_expect(!md, 0)) { bpf_printk("No you can't\n"); return XDP_DROP; } bpf_printk("Magic: 0x%04x\n", md->magic_id); return XDP_PASS; } Thanks for the help! It's a shame LLVM still suck on generating correct object code from C. I guess we'll define a helper above in one of the headers to not copy-paste it back and forth between each program wanting to access only the generic part of the metadata (which is always being placed at the end). > > -Toke > > > [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/ctx/xdp.h#L35 Thanks, Al
Alexander Lobakin <alexandr.lobakin@intel.com> writes: > From: Toke Høiland-Jørgensen <toke@redhat.com> > Date: Thu, 21 Apr 2022 19:17:11 +0200 > >> Larysa Zaremba <larysa.zaremba@intel.com> writes: >> >> > Dear all, >> > Our team has encountered a need of accessing data_meta in a following way: >> > >> > int xdp_meta_prog(struct xdp_md *ctx) >> > { >> > void *data_meta_ptr = (void *)(long)ctx->data_meta; >> > void *data_end = (void *)(long)ctx->data_end; >> > void *data = (void *)(long)ctx->data; >> > u64 data_size = sizeof(u32); >> > u32 magic_meta; >> > u8 offset; >> > >> > offset = (u8)((s64)data - (s64)data_meta_ptr); >> > if (offset < data_size) { >> > bpf_printk("invalid offset: %ld\n", offset); >> > return XDP_DROP; >> > } >> > >> > data_meta_ptr += offset; >> > data_meta_ptr -= data_size; >> > >> > if (data_meta_ptr + data_size > data) { >> > return XDP_DROP; >> > } >> > >> > magic_meta = *((u32 *)data); >> > bpf_printk("Magic: %d\n", magic_meta); >> > return XDP_PASS; >> > } >> > >> > Unfortunately, verifier claims this code attempts to access packet with >> > an offset of -2 (a constant part) and negative offset is generally forbidden. >> > >> > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(), >> > which is pretty good, but not ideal for the hot path. >> > The second one is the patch at the end. >> > >> > Do you see any other way of accessing memory from the end of data_meta/data? >> > What do you think about both suggested solutions? >> >> The problem is that the compiler is generating code that the verifier >> doesn't understand. It's notoriously hard to get LLVM to produce code >> that preserves the right bounds checks which is why projects like Cilium >> use helpers with inline ASM to produce the right loads, like in [0]. >> >> Adapting that cilium helper to load from the metadata area, your example >> can be rewritten as follows (which works just fine with no verifier >> changes): >> >> static __always_inline int >> xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len) >> { >> void *from; >> int ret; >> /* LLVM tends to generate code that verifier doesn't understand, >> * so force it the way we want it in order to open up a range >> * on the reg. >> */ >> asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t" >> "r2 = *(u32 *)(%[ctx] +0)\n\t" >> "%[off] &= %[offmax]\n\t" >> "r1 += %[off]\n\t" >> "%[from] = r1\n\t" >> "r1 += %[len]\n\t" >> "if r1 > r2 goto +2\n\t" >> "%[ret] = 0\n\t" >> "goto +1\n\t" >> "%[ret] = %[errno]\n\t" >> : [ret]"=r"(ret), [from]"=r"(from) >> : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len), >> [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL) >> : "r1", "r2"); >> if (!ret) >> __builtin_memcpy(to, from, len); >> return ret; >> } >> >> >> SEC("xdp") >> int xdp_meta_prog(struct xdp_md *ctx) >> { >> void *data_meta_ptr = (void *)(long)ctx->data_meta; >> void *data = (void *)(long)ctx->data; >> __u32 magic_meta; >> __u8 offset; >> int ret; >> >> offset = (__u8)((__s64)data - (__s64)data_meta_ptr); >> ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta)); >> if (ret) { >> bpf_printk("load bytes failed: %d\n", ret); >> return XDP_DROP; >> } >> >> bpf_printk("Magic: %d\n", magic_meta); >> return XDP_PASS; >> } > > At the moment, we use this (based on Cilium's and your), it works > just like we want C code to work previously: > > #define __CTX_OFF_MAX 0xff > > static __always_inline void * > can_i_access_meta_please(const struct xdp_md *ctx, __u64 off, const __u64 len) > { > void *ret; > > /* LLVM tends to generate code that verifier doesn't understand, > * so force it the way we want it in order to open up a range > * on the reg. > */ > asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t" > "r2 = *(u32 *)(%[ctx] +0)\n\t" > "%[off] &= %[offmax]\n\t" > "r1 += %[off]\n\t" > "%[ret] = r1\n\t" > "r1 += %[len]\n\t" > "if r1 > r2 goto +1\n\t" > "goto +1\n\t" > "%[ret] = %[null]\n\t" > : [ret]"=r"(ret) > : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len), > [offmax]"i"(__CTX_OFF_MAX), [null]"i"(NULL) > : "r1", "r2"); > > return ret; > } > > SEC("xdp") > int xdp_prognum_n0_meta(struct xdp_md *ctx) > { > void *data_meta = (void *)(__s64)ctx->data_meta; > void *data = (void *)(__s64)ctx->data; > struct xdp_meta_generic *md; > __u64 offset; > > offset = (__u64)((__s64)data - (__s64)data_meta); > > md = can_i_access_meta_please(ctx, offset, sizeof(*md)); > if (__builtin_expect(!md, 0)) { > bpf_printk("No you can't\n"); > return XDP_DROP; > } > > bpf_printk("Magic: 0x%04x\n", md->magic_id); > return XDP_PASS; > } > > Thanks for the help! Great! You're welcome! :) > It's a shame LLVM still suck on generating correct object code from C. > I guess we'll define a helper above in one of the headers to not > copy-paste it back and forth between each program wanting to access > only the generic part of the metadata (which is always being placed at > the end). Yeah, it would be nice if LLVM could just generate code that works, but in the meantime we'll just have to define a helper. I suspect we'll need to define some helper functions to work with xdp-hints style metadata field anyway, so wrapping the reader into that somewhere would probably make sense, no? -Toke
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off, } err = reg->range < 0 ? -EINVAL : - __check_mem_access(env, regno, off, size, reg->range, - zero_size_allowed); + __check_mem_access(env, regno, off + reg->smin_value, size, + reg->range + reg->smin_value, zero_size_allowed); + err = err ? : + __check_mem_access(env, regno, off + reg->umax_value, size, + reg->range + reg->umax_value, zero_size_allowed); if (err) { verbose(env, "R%d offset is outside of the packet\n", regno); return err;