Message ID | 20220306234311.452206-1-memxor@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Introduce bpf_packet_pointer helper | expand |
On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > returning a packet pointer with a fixed immutable range. This can be useful to > enable DPA without having to use memcpy (currently the case in > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > > The intended usage to read and write data for multi-buff XDP is: > > int err = 0; > char buf[N]; > > off &= 0xffff; > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > if (unlikely(!ptr)) { > if (err < 0) > return XDP_ABORTED; > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > if (err < 0) > return XDP_ABORTED; > ptr = buf; > } > ... > // Do some stores and loads in [ptr, ptr + N) region > ... > if (unlikely(ptr == buf)) { > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > if (err < 0) > return XDP_ABORTED; > } > > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > it is also more meaningful to the user to see return value as R0=pkt. > > This series is meant to collect feedback on the approach, next version can > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > hooks, and explore not resetting range to zero on r0 += rX, instead check access > like check_mem_region_access (var_off + off < range), since there would be no > data_end to compare against and obtain a new range. > > The common name and func_id is supposed to allow writing generic code using > bpf_packet_pointer that works for both XDP and TC programs. > > Please see the individual patches for implementation details. > Joanne is working on a "bpf_dynptr" framework that will support exactly this feature, in addition to working with dynamically allocated memory, working with memory of statically unknown size (but safe and checked at runtime), etc. And all that within a generic common feature implemented uniformly within the verifier. E.g., it won't need any of the custom bits of logic added in patch #2 and #3. So I'm thinking that instead of custom-implementing a partial case of bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr and do it only once there? See also my ARG_CONSTANT comment. It seems like a pretty common thing where input constant is used to characterize some pointer returned from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need that for bpf_dynptr for exactly this "give me direct access of N bytes, if possible" case. So improving/generalizing it now before dynptr lands makes a lot of sense, outside of bpf_packet_pointer() feature itself. > Kumar Kartikeya Dwivedi (5): > bpf: Add ARG_SCALAR and ARG_CONSTANT > bpf: Introduce pkt_uid concept for PTR_TO_PACKET > bpf: Introduce bpf_packet_pointer helper to do DPA > selftests/bpf: Add verifier tests for pkt pointer with pkt_uid > selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer > > include/linux/bpf.h | 4 + > include/linux/bpf_verifier.h | 9 +- > include/uapi/linux/bpf.h | 12 ++ > kernel/bpf/verifier.c | 97 ++++++++++-- > net/core/filter.c | 48 +++--- > tools/include/uapi/linux/bpf.h | 12 ++ > .../bpf/prog_tests/xdp_adjust_frags.c | 46 ++++-- > .../bpf/progs/test_xdp_update_frags.c | 46 ++++-- > tools/testing/selftests/bpf/verifier/xdp.c | 146 ++++++++++++++++++ > 9 files changed, 358 insertions(+), 62 deletions(-) > > -- > 2.35.1 >
On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > > returning a packet pointer with a fixed immutable range. This can be useful to > > enable DPA without having to use memcpy (currently the case in > > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > > > > The intended usage to read and write data for multi-buff XDP is: > > > > int err = 0; > > char buf[N]; > > > > off &= 0xffff; > > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > > if (unlikely(!ptr)) { > > if (err < 0) > > return XDP_ABORTED; > > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > > if (err < 0) > > return XDP_ABORTED; > > ptr = buf; > > } > > ... > > // Do some stores and loads in [ptr, ptr + N) region > > ... > > if (unlikely(ptr == buf)) { > > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > > if (err < 0) > > return XDP_ABORTED; > > } > > > > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > > it is also more meaningful to the user to see return value as R0=pkt. > > > > This series is meant to collect feedback on the approach, next version can > > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > > hooks, and explore not resetting range to zero on r0 += rX, instead check access > > like check_mem_region_access (var_off + off < range), since there would be no > > data_end to compare against and obtain a new range. > > > > The common name and func_id is supposed to allow writing generic code using > > bpf_packet_pointer that works for both XDP and TC programs. > > > > Please see the individual patches for implementation details. > > > > Joanne is working on a "bpf_dynptr" framework that will support > exactly this feature, in addition to working with dynamically > allocated memory, working with memory of statically unknown size (but > safe and checked at runtime), etc. And all that within a generic > common feature implemented uniformly within the verifier. E.g., it > won't need any of the custom bits of logic added in patch #2 and #3. > So I'm thinking that instead of custom-implementing a partial case of > bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > and do it only once there? > Interesting stuff, looking forward to it. > See also my ARG_CONSTANT comment. It seems like a pretty common thing > where input constant is used to characterize some pointer returned > from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > that for bpf_dynptr for exactly this "give me direct access of N > bytes, if possible" case. So improving/generalizing it now before > dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > feature itself. No worries, we can continue the discussion in patch 1, I'll split out the arg changes into a separate patch, and wait for dynptr to be posted before reworking this. > > > Kumar Kartikeya Dwivedi (5): > > bpf: Add ARG_SCALAR and ARG_CONSTANT > > bpf: Introduce pkt_uid concept for PTR_TO_PACKET > > bpf: Introduce bpf_packet_pointer helper to do DPA > > selftests/bpf: Add verifier tests for pkt pointer with pkt_uid > > selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer > > > > include/linux/bpf.h | 4 + > > include/linux/bpf_verifier.h | 9 +- > > include/uapi/linux/bpf.h | 12 ++ > > kernel/bpf/verifier.c | 97 ++++++++++-- > > net/core/filter.c | 48 +++--- > > tools/include/uapi/linux/bpf.h | 12 ++ > > .../bpf/prog_tests/xdp_adjust_frags.c | 46 ++++-- > > .../bpf/progs/test_xdp_update_frags.c | 46 ++++-- > > tools/testing/selftests/bpf/verifier/xdp.c | 146 ++++++++++++++++++ > > 9 files changed, 358 insertions(+), 62 deletions(-) > > > > -- > > 2.35.1 > > -- Kartikeya
Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: >> > >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' >> > returning a packet pointer with a fixed immutable range. This can be useful to >> > enable DPA without having to use memcpy (currently the case in >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). >> > >> > The intended usage to read and write data for multi-buff XDP is: >> > >> > int err = 0; >> > char buf[N]; >> > >> > off &= 0xffff; >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); >> > if (unlikely(!ptr)) { >> > if (err < 0) >> > return XDP_ABORTED; >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); >> > if (err < 0) >> > return XDP_ABORTED; >> > ptr = buf; >> > } >> > ... >> > // Do some stores and loads in [ptr, ptr + N) region >> > ... >> > if (unlikely(ptr == buf)) { >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); >> > if (err < 0) >> > return XDP_ABORTED; >> > } >> > >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and >> > it is also more meaningful to the user to see return value as R0=pkt. >> > >> > This series is meant to collect feedback on the approach, next version can >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access >> > like check_mem_region_access (var_off + off < range), since there would be no >> > data_end to compare against and obtain a new range. >> > >> > The common name and func_id is supposed to allow writing generic code using >> > bpf_packet_pointer that works for both XDP and TC programs. >> > >> > Please see the individual patches for implementation details. >> > >> >> Joanne is working on a "bpf_dynptr" framework that will support >> exactly this feature, in addition to working with dynamically >> allocated memory, working with memory of statically unknown size (but >> safe and checked at runtime), etc. And all that within a generic >> common feature implemented uniformly within the verifier. E.g., it >> won't need any of the custom bits of logic added in patch #2 and #3. >> So I'm thinking that instead of custom-implementing a partial case of >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr >> and do it only once there? >> > > Interesting stuff, looking forward to it. > >> See also my ARG_CONSTANT comment. It seems like a pretty common thing >> where input constant is used to characterize some pointer returned >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need >> that for bpf_dynptr for exactly this "give me direct access of N >> bytes, if possible" case. So improving/generalizing it now before >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() >> feature itself. > > No worries, we can continue the discussion in patch 1, I'll split out the arg > changes into a separate patch, and wait for dynptr to be posted before reworking > this. This does raise the question of what we do in the meantime, though? Your patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're making it, really has to go in before those hit a release and become UAPI. One option would be to still make the change to those other helpers; they'd become a bit slower, but if we have a solution for that coming, that may be OK for a single release? WDYT? -Toke
On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > > > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > >> > > >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > >> > returning a packet pointer with a fixed immutable range. This can be useful to > >> > enable DPA without having to use memcpy (currently the case in > >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > >> > > >> > The intended usage to read and write data for multi-buff XDP is: > >> > > >> > int err = 0; > >> > char buf[N]; > >> > > >> > off &= 0xffff; > >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > >> > if (unlikely(!ptr)) { > >> > if (err < 0) > >> > return XDP_ABORTED; > >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > >> > if (err < 0) > >> > return XDP_ABORTED; > >> > ptr = buf; > >> > } > >> > ... > >> > // Do some stores and loads in [ptr, ptr + N) region > >> > ... > >> > if (unlikely(ptr == buf)) { > >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > >> > if (err < 0) > >> > return XDP_ABORTED; > >> > } > >> > > >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > >> > it is also more meaningful to the user to see return value as R0=pkt. > >> > > >> > This series is meant to collect feedback on the approach, next version can > >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access > >> > like check_mem_region_access (var_off + off < range), since there would be no > >> > data_end to compare against and obtain a new range. > >> > > >> > The common name and func_id is supposed to allow writing generic code using > >> > bpf_packet_pointer that works for both XDP and TC programs. > >> > > >> > Please see the individual patches for implementation details. > >> > > >> > >> Joanne is working on a "bpf_dynptr" framework that will support > >> exactly this feature, in addition to working with dynamically > >> allocated memory, working with memory of statically unknown size (but > >> safe and checked at runtime), etc. And all that within a generic > >> common feature implemented uniformly within the verifier. E.g., it > >> won't need any of the custom bits of logic added in patch #2 and #3. > >> So I'm thinking that instead of custom-implementing a partial case of > >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > >> and do it only once there? > >> > > > > Interesting stuff, looking forward to it. > > > >> See also my ARG_CONSTANT comment. It seems like a pretty common thing > >> where input constant is used to characterize some pointer returned > >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > >> that for bpf_dynptr for exactly this "give me direct access of N > >> bytes, if possible" case. So improving/generalizing it now before > >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > >> feature itself. > > > > No worries, we can continue the discussion in patch 1, I'll split out the arg > > changes into a separate patch, and wait for dynptr to be posted before reworking > > this. > > This does raise the question of what we do in the meantime, though? Your > patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're > making it, really has to go in before those hit a release and become > UAPI. > > One option would be to still make the change to those other helpers; > they'd become a bit slower, but if we have a solution for that coming, > that may be OK for a single release? WDYT? I must have missed important changes to bpf_xdp_{load,store}_bytes(). Does anything change about its behavior? If there are some fixes specific to those helpers, we should fix them as well as a separate patch. My main objection is adding a bpf_packet_pointer() special case when we have a generic mechanism in the works that will come this use case (among other use cases). > > -Toke >
Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: >> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: >> >> > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' >> >> > returning a packet pointer with a fixed immutable range. This can be useful to >> >> > enable DPA without having to use memcpy (currently the case in >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). >> >> > >> >> > The intended usage to read and write data for multi-buff XDP is: >> >> > >> >> > int err = 0; >> >> > char buf[N]; >> >> > >> >> > off &= 0xffff; >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); >> >> > if (unlikely(!ptr)) { >> >> > if (err < 0) >> >> > return XDP_ABORTED; >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); >> >> > if (err < 0) >> >> > return XDP_ABORTED; >> >> > ptr = buf; >> >> > } >> >> > ... >> >> > // Do some stores and loads in [ptr, ptr + N) region >> >> > ... >> >> > if (unlikely(ptr == buf)) { >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); >> >> > if (err < 0) >> >> > return XDP_ABORTED; >> >> > } >> >> > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and >> >> > it is also more meaningful to the user to see return value as R0=pkt. >> >> > >> >> > This series is meant to collect feedback on the approach, next version can >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access >> >> > like check_mem_region_access (var_off + off < range), since there would be no >> >> > data_end to compare against and obtain a new range. >> >> > >> >> > The common name and func_id is supposed to allow writing generic code using >> >> > bpf_packet_pointer that works for both XDP and TC programs. >> >> > >> >> > Please see the individual patches for implementation details. >> >> > >> >> >> >> Joanne is working on a "bpf_dynptr" framework that will support >> >> exactly this feature, in addition to working with dynamically >> >> allocated memory, working with memory of statically unknown size (but >> >> safe and checked at runtime), etc. And all that within a generic >> >> common feature implemented uniformly within the verifier. E.g., it >> >> won't need any of the custom bits of logic added in patch #2 and #3. >> >> So I'm thinking that instead of custom-implementing a partial case of >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr >> >> and do it only once there? >> >> >> > >> > Interesting stuff, looking forward to it. >> > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing >> >> where input constant is used to characterize some pointer returned >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need >> >> that for bpf_dynptr for exactly this "give me direct access of N >> >> bytes, if possible" case. So improving/generalizing it now before >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() >> >> feature itself. >> > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg >> > changes into a separate patch, and wait for dynptr to be posted before reworking >> > this. >> >> This does raise the question of what we do in the meantime, though? Your >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're >> making it, really has to go in before those hit a release and become >> UAPI. >> >> One option would be to still make the change to those other helpers; >> they'd become a bit slower, but if we have a solution for that coming, >> that may be OK for a single release? WDYT? > > I must have missed important changes to bpf_xdp_{load,store}_bytes(). > Does anything change about its behavior? If there are some fixes > specific to those helpers, we should fix them as well as a separate > patch. My main objection is adding a bpf_packet_pointer() special case > when we have a generic mechanism in the works that will come this use > case (among other use cases). Well it's not a functional change per se, but Kartikeya's patch is removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the use of the bpf_xdp_pointer()) in favour of making it available directly to BPF. So if we don't do that change before those helpers are finalised, we will end up either introducing a performance regression for code using those helpers, or being stuck with the bpf_xdp_pointer() use inside them even though it makes more sense to move it out to BPF. So the "safe" thing to do would do the change to the store/load helpers now, and get rid of the bpf_xdp_pointer() entirely until it can be introduced as a BPF helper in a generic way. Of course this depends on whether you consider performance regressions to be something to avoid, but this being XDP IMO we should :) -Toke
On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > >> > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > >> >> > > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to > >> >> > enable DPA without having to use memcpy (currently the case in > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > >> >> > > >> >> > The intended usage to read and write data for multi-buff XDP is: > >> >> > > >> >> > int err = 0; > >> >> > char buf[N]; > >> >> > > >> >> > off &= 0xffff; > >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > >> >> > if (unlikely(!ptr)) { > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > ptr = buf; > >> >> > } > >> >> > ... > >> >> > // Do some stores and loads in [ptr, ptr + N) region > >> >> > ... > >> >> > if (unlikely(ptr == buf)) { > >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > } > >> >> > > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > >> >> > it is also more meaningful to the user to see return value as R0=pkt. > >> >> > > >> >> > This series is meant to collect feedback on the approach, next version can > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access > >> >> > like check_mem_region_access (var_off + off < range), since there would be no > >> >> > data_end to compare against and obtain a new range. > >> >> > > >> >> > The common name and func_id is supposed to allow writing generic code using > >> >> > bpf_packet_pointer that works for both XDP and TC programs. > >> >> > > >> >> > Please see the individual patches for implementation details. > >> >> > > >> >> > >> >> Joanne is working on a "bpf_dynptr" framework that will support > >> >> exactly this feature, in addition to working with dynamically > >> >> allocated memory, working with memory of statically unknown size (but > >> >> safe and checked at runtime), etc. And all that within a generic > >> >> common feature implemented uniformly within the verifier. E.g., it > >> >> won't need any of the custom bits of logic added in patch #2 and #3. > >> >> So I'm thinking that instead of custom-implementing a partial case of > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > >> >> and do it only once there? > >> >> > >> > > >> > Interesting stuff, looking forward to it. > >> > > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing > >> >> where input constant is used to characterize some pointer returned > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > >> >> that for bpf_dynptr for exactly this "give me direct access of N > >> >> bytes, if possible" case. So improving/generalizing it now before > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > >> >> feature itself. > >> > > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg > >> > changes into a separate patch, and wait for dynptr to be posted before reworking > >> > this. > >> > >> This does raise the question of what we do in the meantime, though? Your > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're > >> making it, really has to go in before those hit a release and become > >> UAPI. > >> > >> One option would be to still make the change to those other helpers; > >> they'd become a bit slower, but if we have a solution for that coming, > >> that may be OK for a single release? WDYT? > > > > I must have missed important changes to bpf_xdp_{load,store}_bytes(). > > Does anything change about its behavior? If there are some fixes > > specific to those helpers, we should fix them as well as a separate > > patch. My main objection is adding a bpf_packet_pointer() special case > > when we have a generic mechanism in the works that will come this use > > case (among other use cases). > > Well it's not a functional change per se, but Kartikeya's patch is > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the > use of the bpf_xdp_pointer()) in favour of making it available directly > to BPF. So if we don't do that change before those helpers are > finalised, we will end up either introducing a performance regression > for code using those helpers, or being stuck with the bpf_xdp_pointer() > use inside them even though it makes more sense to move it out to BPF. > > So the "safe" thing to do would do the change to the store/load helpers > now, and get rid of the bpf_xdp_pointer() entirely until it can be > introduced as a BPF helper in a generic way. Of course this depends on > whether you consider performance regressions to be something to avoid, > but this being XDP IMO we should :) I don't follow this logic. Would you mean by "get rid of the bpf_xdp_pointer" ? It's just an internal static function. Also I don't believe that this patch set and exposing bpf_xdp_pointer as a helper actually gives measurable performance wins. It looks quirky to me and hard to use.
On Fri, Mar 11, 2022 at 05:00:42AM IST, Toke Høiland-Jørgensen wrote: > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > >> > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > >> >> > > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to > >> >> > enable DPA without having to use memcpy (currently the case in > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > >> >> > > >> >> > The intended usage to read and write data for multi-buff XDP is: > >> >> > > >> >> > int err = 0; > >> >> > char buf[N]; > >> >> > > >> >> > off &= 0xffff; > >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > >> >> > if (unlikely(!ptr)) { > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > ptr = buf; > >> >> > } > >> >> > ... > >> >> > // Do some stores and loads in [ptr, ptr + N) region > >> >> > ... > >> >> > if (unlikely(ptr == buf)) { > >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > >> >> > if (err < 0) > >> >> > return XDP_ABORTED; > >> >> > } > >> >> > > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > >> >> > it is also more meaningful to the user to see return value as R0=pkt. > >> >> > > >> >> > This series is meant to collect feedback on the approach, next version can > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access > >> >> > like check_mem_region_access (var_off + off < range), since there would be no > >> >> > data_end to compare against and obtain a new range. > >> >> > > >> >> > The common name and func_id is supposed to allow writing generic code using > >> >> > bpf_packet_pointer that works for both XDP and TC programs. > >> >> > > >> >> > Please see the individual patches for implementation details. > >> >> > > >> >> > >> >> Joanne is working on a "bpf_dynptr" framework that will support > >> >> exactly this feature, in addition to working with dynamically > >> >> allocated memory, working with memory of statically unknown size (but > >> >> safe and checked at runtime), etc. And all that within a generic > >> >> common feature implemented uniformly within the verifier. E.g., it > >> >> won't need any of the custom bits of logic added in patch #2 and #3. > >> >> So I'm thinking that instead of custom-implementing a partial case of > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > >> >> and do it only once there? > >> >> > >> > > >> > Interesting stuff, looking forward to it. > >> > > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing > >> >> where input constant is used to characterize some pointer returned > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > >> >> that for bpf_dynptr for exactly this "give me direct access of N > >> >> bytes, if possible" case. So improving/generalizing it now before > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > >> >> feature itself. > >> > > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg > >> > changes into a separate patch, and wait for dynptr to be posted before reworking > >> > this. > >> > >> This does raise the question of what we do in the meantime, though? Your > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're > >> making it, really has to go in before those hit a release and become > >> UAPI. > >> > >> One option would be to still make the change to those other helpers; > >> they'd become a bit slower, but if we have a solution for that coming, > >> that may be OK for a single release? WDYT? > > > > I must have missed important changes to bpf_xdp_{load,store}_bytes(). > > Does anything change about its behavior? If there are some fixes > > specific to those helpers, we should fix them as well as a separate > > patch. My main objection is adding a bpf_packet_pointer() special case > > when we have a generic mechanism in the works that will come this use > > case (among other use cases). > > Well it's not a functional change per se, but Kartikeya's patch is > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the > use of the bpf_xdp_pointer()) in favour of making it available directly > to BPF. So if we don't do that change before those helpers are > finalised, we will end up either introducing a performance regression > for code using those helpers, or being stuck with the bpf_xdp_pointer() > use inside them even though it makes more sense to move it out to BPF. > So IIUC, the case we're worried about is when a linear region is in head or a frag and bpf_xdp_pointer can be used to do a direct memcpy for it. But in my testing there doesn't seem to be any difference. With or without the call, the time taken e.g. for bpf_xdp_load_bytes lies in the 30-40ns range. It would make sense, because for this case the code in bpf_xdp_pointer and bpf_xdp_copy_buf are almost the same, just that the latter has a conditional jump out of the loop based on len. bpf_xdp_copy_buf is still only doing a single memcpy, the cost seems to be dominated by that. Otoh, removing it would improve the case for the other scenario (when region touches two or more frags) because we wouldn't spend time in bpf_xdp_pointer and returning NULL from it failing to find a linear region, but that shouldn't be a regression. Please let me know if I missed something. > So the "safe" thing to do would do the change to the store/load helpers > now, and get rid of the bpf_xdp_pointer() entirely until it can be > introduced as a BPF helper in a generic way. Of course this depends on > whether you consider performance regressions to be something to avoid, > but this being XDP IMO we should :) > > -Toke > -- Kartikeya
On Fri, Mar 11, 2022 at 05:05:24AM IST, Alexei Starovoitov wrote: > On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > >> > > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > > >> > > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > >> >> > > > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to > > >> >> > enable DPA without having to use memcpy (currently the case in > > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > > >> >> > > > >> >> > The intended usage to read and write data for multi-buff XDP is: > > >> >> > > > >> >> > int err = 0; > > >> >> > char buf[N]; > > >> >> > > > >> >> > off &= 0xffff; > > >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > > >> >> > if (unlikely(!ptr)) { > > >> >> > if (err < 0) > > >> >> > return XDP_ABORTED; > > >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > > >> >> > if (err < 0) > > >> >> > return XDP_ABORTED; > > >> >> > ptr = buf; > > >> >> > } > > >> >> > ... > > >> >> > // Do some stores and loads in [ptr, ptr + N) region > > >> >> > ... > > >> >> > if (unlikely(ptr == buf)) { > > >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > > >> >> > if (err < 0) > > >> >> > return XDP_ABORTED; > > >> >> > } > > >> >> > > > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > > >> >> > it is also more meaningful to the user to see return value as R0=pkt. > > >> >> > > > >> >> > This series is meant to collect feedback on the approach, next version can > > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access > > >> >> > like check_mem_region_access (var_off + off < range), since there would be no > > >> >> > data_end to compare against and obtain a new range. > > >> >> > > > >> >> > The common name and func_id is supposed to allow writing generic code using > > >> >> > bpf_packet_pointer that works for both XDP and TC programs. > > >> >> > > > >> >> > Please see the individual patches for implementation details. > > >> >> > > > >> >> > > >> >> Joanne is working on a "bpf_dynptr" framework that will support > > >> >> exactly this feature, in addition to working with dynamically > > >> >> allocated memory, working with memory of statically unknown size (but > > >> >> safe and checked at runtime), etc. And all that within a generic > > >> >> common feature implemented uniformly within the verifier. E.g., it > > >> >> won't need any of the custom bits of logic added in patch #2 and #3. > > >> >> So I'm thinking that instead of custom-implementing a partial case of > > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > > >> >> and do it only once there? > > >> >> > > >> > > > >> > Interesting stuff, looking forward to it. > > >> > > > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing > > >> >> where input constant is used to characterize some pointer returned > > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > > >> >> that for bpf_dynptr for exactly this "give me direct access of N > > >> >> bytes, if possible" case. So improving/generalizing it now before > > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > > >> >> feature itself. > > >> > > > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg > > >> > changes into a separate patch, and wait for dynptr to be posted before reworking > > >> > this. > > >> > > >> This does raise the question of what we do in the meantime, though? Your > > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're > > >> making it, really has to go in before those hit a release and become > > >> UAPI. > > >> > > >> One option would be to still make the change to those other helpers; > > >> they'd become a bit slower, but if we have a solution for that coming, > > >> that may be OK for a single release? WDYT? > > > > > > I must have missed important changes to bpf_xdp_{load,store}_bytes(). > > > Does anything change about its behavior? If there are some fixes > > > specific to those helpers, we should fix them as well as a separate > > > patch. My main objection is adding a bpf_packet_pointer() special case > > > when we have a generic mechanism in the works that will come this use > > > case (among other use cases). > > > > Well it's not a functional change per se, but Kartikeya's patch is > > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the > > use of the bpf_xdp_pointer()) in favour of making it available directly > > to BPF. So if we don't do that change before those helpers are > > finalised, we will end up either introducing a performance regression > > for code using those helpers, or being stuck with the bpf_xdp_pointer() > > use inside them even though it makes more sense to move it out to BPF. > > > > So the "safe" thing to do would do the change to the store/load helpers > > now, and get rid of the bpf_xdp_pointer() entirely until it can be > > introduced as a BPF helper in a generic way. Of course this depends on > > whether you consider performance regressions to be something to avoid, > > but this being XDP IMO we should :) > > I don't follow this logic. > Would you mean by "get rid of the bpf_xdp_pointer" ? > It's just an internal static function. > > Also I don't believe that this patch set and exposing > bpf_xdp_pointer as a helper actually gives measurable performance wins. > It looks quirky to me and hard to use. This is actually inspired from your idea to avoid memcpy when reading and writing to multi-buff XDP [0]. But instead of passing in the stack or mem pointer (as discussed in that thread), I let the user set it and detect it themselves, which makes the implementation simpler. I am sure accessing a few bytes directly is going to be faster than first memcpy'ing it to a local buffer, reading, and then possibly writing things back again using a memcpy, but I will be happy to come with some numbers when I respin this later, when Joanne posts the dynptr series. Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf pointer' idea, but then we have to conservatively invalidate the pointer even if it points to stack buffer on clear_all_pkt_pointers. The current approach looked better to me. [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com -- Kartikeya
On Fri, Mar 11, 2022 at 12:55:45PM IST, Kumar Kartikeya Dwivedi wrote: > On Fri, Mar 11, 2022 at 05:05:24AM IST, Alexei Starovoitov wrote: > > On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > > > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: > > > > > > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > > >> > > > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > > > >> > > > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: > > > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > >> >> > > > > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' > > > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to > > > >> >> > enable DPA without having to use memcpy (currently the case in > > > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). > > > >> >> > > > > >> >> > The intended usage to read and write data for multi-buff XDP is: > > > >> >> > > > > >> >> > int err = 0; > > > >> >> > char buf[N]; > > > >> >> > > > > >> >> > off &= 0xffff; > > > >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); > > > >> >> > if (unlikely(!ptr)) { > > > >> >> > if (err < 0) > > > >> >> > return XDP_ABORTED; > > > >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); > > > >> >> > if (err < 0) > > > >> >> > return XDP_ABORTED; > > > >> >> > ptr = buf; > > > >> >> > } > > > >> >> > ... > > > >> >> > // Do some stores and loads in [ptr, ptr + N) region > > > >> >> > ... > > > >> >> > if (unlikely(ptr == buf)) { > > > >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); > > > >> >> > if (err < 0) > > > >> >> > return XDP_ABORTED; > > > >> >> > } > > > >> >> > > > > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because > > > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and > > > >> >> > it is also more meaningful to the user to see return value as R0=pkt. > > > >> >> > > > > >> >> > This series is meant to collect feedback on the approach, next version can > > > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC > > > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access > > > >> >> > like check_mem_region_access (var_off + off < range), since there would be no > > > >> >> > data_end to compare against and obtain a new range. > > > >> >> > > > > >> >> > The common name and func_id is supposed to allow writing generic code using > > > >> >> > bpf_packet_pointer that works for both XDP and TC programs. > > > >> >> > > > > >> >> > Please see the individual patches for implementation details. > > > >> >> > > > > >> >> > > > >> >> Joanne is working on a "bpf_dynptr" framework that will support > > > >> >> exactly this feature, in addition to working with dynamically > > > >> >> allocated memory, working with memory of statically unknown size (but > > > >> >> safe and checked at runtime), etc. And all that within a generic > > > >> >> common feature implemented uniformly within the verifier. E.g., it > > > >> >> won't need any of the custom bits of logic added in patch #2 and #3. > > > >> >> So I'm thinking that instead of custom-implementing a partial case of > > > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr > > > >> >> and do it only once there? > > > >> >> > > > >> > > > > >> > Interesting stuff, looking forward to it. > > > >> > > > > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing > > > >> >> where input constant is used to characterize some pointer returned > > > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need > > > >> >> that for bpf_dynptr for exactly this "give me direct access of N > > > >> >> bytes, if possible" case. So improving/generalizing it now before > > > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() > > > >> >> feature itself. > > > >> > > > > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg > > > >> > changes into a separate patch, and wait for dynptr to be posted before reworking > > > >> > this. > > > >> > > > >> This does raise the question of what we do in the meantime, though? Your > > > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're > > > >> making it, really has to go in before those hit a release and become > > > >> UAPI. > > > >> > > > >> One option would be to still make the change to those other helpers; > > > >> they'd become a bit slower, but if we have a solution for that coming, > > > >> that may be OK for a single release? WDYT? > > > > > > > > I must have missed important changes to bpf_xdp_{load,store}_bytes(). > > > > Does anything change about its behavior? If there are some fixes > > > > specific to those helpers, we should fix them as well as a separate > > > > patch. My main objection is adding a bpf_packet_pointer() special case > > > > when we have a generic mechanism in the works that will come this use > > > > case (among other use cases). > > > > > > Well it's not a functional change per se, but Kartikeya's patch is > > > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the > > > use of the bpf_xdp_pointer()) in favour of making it available directly > > > to BPF. So if we don't do that change before those helpers are > > > finalised, we will end up either introducing a performance regression > > > for code using those helpers, or being stuck with the bpf_xdp_pointer() > > > use inside them even though it makes more sense to move it out to BPF. > > > > > > So the "safe" thing to do would do the change to the store/load helpers > > > now, and get rid of the bpf_xdp_pointer() entirely until it can be > > > introduced as a BPF helper in a generic way. Of course this depends on > > > whether you consider performance regressions to be something to avoid, > > > but this being XDP IMO we should :) > > > > I don't follow this logic. > > Would you mean by "get rid of the bpf_xdp_pointer" ? > > It's just an internal static function. > > > > Also I don't believe that this patch set and exposing > > bpf_xdp_pointer as a helper actually gives measurable performance wins. > > It looks quirky to me and hard to use. > > This is actually inspired from your idea to avoid memcpy when reading and > writing to multi-buff XDP [0]. But instead of passing in the stack or mem > pointer (as discussed in that thread), I let the user set it and detect it > themselves, which makes the implementation simpler. > > I am sure accessing a few bytes directly is going to be faster than first > memcpy'ing it to a local buffer, reading, and then possibly writing things > back again using a memcpy, but I will be happy to come with some numbers when > I respin this later, when Joanne posts the dynptr series. > > Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf > pointer' idea, but then we have to conservatively invalidate the pointer even if > it points to stack buffer on clear_all_pkt_pointers. The current approach looked > better to me. > > [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com > This is probably the correct link: https://lore.kernel.org/bpf/CAADnVQ+XXGUxzqMdbPMYf+t_ViDkqvGDdogrmv-wH-dckzujLw@mail.gmail.com > -- > Kartikeya -- Kartikeya
Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: > On Fri, Mar 11, 2022 at 05:00:42AM IST, Toke Høiland-Jørgensen wrote: >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes: >> >> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> >> >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes: >> >> >> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote: >> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: >> >> >> > >> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer' >> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to >> >> >> > enable DPA without having to use memcpy (currently the case in >> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes). >> >> >> > >> >> >> > The intended usage to read and write data for multi-buff XDP is: >> >> >> > >> >> >> > int err = 0; >> >> >> > char buf[N]; >> >> >> > >> >> >> > off &= 0xffff; >> >> >> > ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err); >> >> >> > if (unlikely(!ptr)) { >> >> >> > if (err < 0) >> >> >> > return XDP_ABORTED; >> >> >> > err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf)); >> >> >> > if (err < 0) >> >> >> > return XDP_ABORTED; >> >> >> > ptr = buf; >> >> >> > } >> >> >> > ... >> >> >> > // Do some stores and loads in [ptr, ptr + N) region >> >> >> > ... >> >> >> > if (unlikely(ptr == buf)) { >> >> >> > err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf)); >> >> >> > if (err < 0) >> >> >> > return XDP_ABORTED; >> >> >> > } >> >> >> > >> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because >> >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and >> >> >> > it is also more meaningful to the user to see return value as R0=pkt. >> >> >> > >> >> >> > This series is meant to collect feedback on the approach, next version can >> >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC >> >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access >> >> >> > like check_mem_region_access (var_off + off < range), since there would be no >> >> >> > data_end to compare against and obtain a new range. >> >> >> > >> >> >> > The common name and func_id is supposed to allow writing generic code using >> >> >> > bpf_packet_pointer that works for both XDP and TC programs. >> >> >> > >> >> >> > Please see the individual patches for implementation details. >> >> >> > >> >> >> >> >> >> Joanne is working on a "bpf_dynptr" framework that will support >> >> >> exactly this feature, in addition to working with dynamically >> >> >> allocated memory, working with memory of statically unknown size (but >> >> >> safe and checked at runtime), etc. And all that within a generic >> >> >> common feature implemented uniformly within the verifier. E.g., it >> >> >> won't need any of the custom bits of logic added in patch #2 and #3. >> >> >> So I'm thinking that instead of custom-implementing a partial case of >> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr >> >> >> and do it only once there? >> >> >> >> >> > >> >> > Interesting stuff, looking forward to it. >> >> > >> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing >> >> >> where input constant is used to characterize some pointer returned >> >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need >> >> >> that for bpf_dynptr for exactly this "give me direct access of N >> >> >> bytes, if possible" case. So improving/generalizing it now before >> >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer() >> >> >> feature itself. >> >> > >> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg >> >> > changes into a separate patch, and wait for dynptr to be posted before reworking >> >> > this. >> >> >> >> This does raise the question of what we do in the meantime, though? Your >> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're >> >> making it, really has to go in before those hit a release and become >> >> UAPI. >> >> >> >> One option would be to still make the change to those other helpers; >> >> they'd become a bit slower, but if we have a solution for that coming, >> >> that may be OK for a single release? WDYT? >> > >> > I must have missed important changes to bpf_xdp_{load,store}_bytes(). >> > Does anything change about its behavior? If there are some fixes >> > specific to those helpers, we should fix them as well as a separate >> > patch. My main objection is adding a bpf_packet_pointer() special case >> > when we have a generic mechanism in the works that will come this use >> > case (among other use cases). >> >> Well it's not a functional change per se, but Kartikeya's patch is >> removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the >> use of the bpf_xdp_pointer()) in favour of making it available directly >> to BPF. So if we don't do that change before those helpers are >> finalised, we will end up either introducing a performance regression >> for code using those helpers, or being stuck with the bpf_xdp_pointer() >> use inside them even though it makes more sense to move it out to BPF. >> > > So IIUC, the case we're worried about is when a linear region is in head or a > frag and bpf_xdp_pointer can be used to do a direct memcpy for it. But in my > testing there doesn't seem to be any difference. With or without the call, the > time taken e.g. for bpf_xdp_load_bytes lies in the 30-40ns range. It would make > sense, because for this case the code in bpf_xdp_pointer and bpf_xdp_copy_buf > are almost the same, just that the latter has a conditional jump out of the loop > based on len. bpf_xdp_copy_buf is still only doing a single memcpy, the cost > seems to be dominated by that. > > Otoh, removing it would improve the case for the other scenario (when region > touches two or more frags) because we wouldn't spend time in bpf_xdp_pointer and > returning NULL from it failing to find a linear region, but that shouldn't be a > regression. Yeah, that was basically what I was worried about; thanks for testing! So this implies that the current use of the bpf_xdp_pointer() helper function is pretty pointless, right? But at least it's an internal detail so there's no hurry in fixing it... -Toke
On Thu, Mar 10, 2022 at 11:25 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > > > Also I don't believe that this patch set and exposing > > bpf_xdp_pointer as a helper actually gives measurable performance wins. > > It looks quirky to me and hard to use. > > This is actually inspired from your idea to avoid memcpy when reading and > writing to multi-buff XDP [0]. But instead of passing in the stack or mem > pointer (as discussed in that thread), I let the user set it and detect it > themselves, which makes the implementation simpler. > > I am sure accessing a few bytes directly is going to be faster than first > memcpy'ing it to a local buffer, reading, and then possibly writing things > back again using a memcpy, but I will be happy to come with some numbers when > I respin this later, when Joanne posts the dynptr series. > > Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf > pointer' idea, but then we have to conservatively invalidate the pointer even if > it points to stack buffer on clear_all_pkt_pointers. The current approach looked > better to me. > > [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com There is a big difference in what was proposed earlier vs what you've implemented. In that proposal bpf_packet_pointer would behave similar to skb_header_pointer. The kernel developers are familiar with it and the concept is tested by time. While your bpf_packet_pointer is different. It fails on frag boundaries, so the user has to open code the checks and add explicit bpf_xdp_load_bytes. I guess you're arguing that it's the same. My point that it's a big conceptual difference in api. This kind of difference should be discussed instead of dumping a patch set. Especially without RFC tag. Please focus on kptr patches. They were close enough and have a chance of landing in this release. This bpf_packet_pointer stuff is certainly not going to make it.