Message ID: 20211116073742.7941-1-ciara.loftus@intel.com (mailing list archive)
Series: XDP_REDIRECT_XSK and Batched AF_XDP Rx
On 16/11/2021 08.37, Ciara Loftus wrote:
> The common case for AF_XDP sockets (xsks) is creating a single xsk on a queue for sending and
> receiving frames as this is analogous to HW packet steering through RSS and other classification
> methods in the NIC. AF_XDP uses the xdp redirect infrastructure to direct packets to the socket. It
> was designed for the much more complicated case of DEVMAP xdp_redirects which direct traffic to
> another netdev and thus potentially another driver. In the xsk redirect case, by skipping the
> unnecessary parts of this common code we can significantly improve performance and pave the way
> for batching in the driver. This RFC proposes one such way to simplify the infrastructure which
> yields a 27% increase in throughput and a decrease in cycles per packet of 24 cycles [1]. The goal
> of this RFC is to start a discussion on how best to simplify the single-socket datapath while
> providing one method as an example.
>
> Current approach:
> 1. XSK pointer: an xsk is created and a handle to the xsk is stored in the XSKMAP.
> 2. XDP program: bpf_redirect_map helper triggers the XSKMAP lookup which stores the result (handle
> to the xsk) and the map type (XSKMAP) in the percpu bpf_redirect_info struct. The XDP_REDIRECT
> action is returned.
> 3. XDP_REDIRECT handling called by the driver: the map type (XSKMAP) is read from the
> bpf_redirect_info which selects the xsk_map_redirect path. The xsk pointer is retrieved from the
> bpf_redirect_info and the XDP descriptor is pushed to the xsk's Rx ring. The socket is added to a
> list for flushing later.
> 4. xdp_do_flush: iterate through the lists of all maps that can be used for redirect (CPUMAP,
> DEVMAP and XSKMAP). When XSKMAP is flushed, go through all xsks that had any traffic redirected to
> them and bump the Rx ring head pointer(s).
>
> For the end goal of submitting the descriptor to the Rx ring and bumping the head pointer of that
> ring, only some of these steps are needed. The rest is overhead. The bpf_redirect_map
> infrastructure is needed for all other redirect operations, but is not necessary when redirecting
> to a single AF_XDP socket. And similarly, flushing the list for every map type in step 4 is not
> necessary when only one socket needs to be flushed.
>
> Proposed approach:
> 1. XSK pointer: an xsk is created and a handle to the xsk is stored both in the XSKMAP and also the
> netdev_rx_queue struct.
> 2. XDP program: new bpf_redirect_xsk helper returns XDP_REDIRECT_XSK.
> 3. XDP_REDIRECT_XSK handling called by the driver: the xsk pointer is retrieved from the
> netdev_rx_queue struct and the XDP descriptor is pushed to the xsk's Rx ring.
> 4. xsk_flush: fetch the handle from the netdev_rx_queue and flush the xsk.
>
> This fast path is triggered on XDP_REDIRECT_XSK if:
> (i) AF_XDP socket SW Rx ring configured
> (ii) Exactly one xsk attached to the queue
> If any of these conditions are not met, fall back to the same behavior as the original approach:
> xdp_redirect_map. This is handled under the hood in the new bpf_xdp_redirect_xsk helper so the user
> does not need to be aware of these conditions.
>
> Batching:
> With this new approach it is possible to optimize the driver by submitting a batch of descriptors
> to the Rx ring in Step 3 of the new approach by simply verifying that the action returned from
> every program run of each packet in a batch equals XDP_REDIRECT_XSK. That's because with this
> action we know the socket to redirect to will be the same for each packet in the batch. This is
> not possible with XDP_REDIRECT because the xsk pointer is stored in the bpf_redirect_info and not
> guaranteed to be the same for every packet in a batch.
>
> [1] Performance:
> The benchmarks were performed on a VM running on a 2.4GHz Ice Lake host with an i40e device passed
> through. The xdpsock app was run on a single core with busy polling and configured in 'rxonly' mode.
> ./xdpsock -i <iface> -r -B
> The improvement in throughput when using the new bpf helper and XDP action was measured at ~13% for
> scalar processing, with a reduction in cycles per packet of ~13. A further ~14% improvement in
> throughput and a reduction of ~11 cycles per packet was measured when the batched i40e driver path
> was used, for a total improvement of ~27% in throughput and a reduction of ~24 cycles per packet.
>
> Other approaches considered:
> Two other approaches were considered, the advantage of both being that neither involved introducing
> a new XDP action. The first alternative approach considered was to create a new map type
> BPF_MAP_TYPE_XSKMAP_DIRECT. When the XDP_REDIRECT action was returned, this map type could be
> checked and used as an indicator to skip the map lookup and use the netdev_rx_queue xsk instead.
> The second approach considered was similar and involved using a new bpf_redirect_info flag which
> could be used in a similar fashion.
> While both approaches yielded a performance improvement they were measured at about half of what
> was measured for the approach outlined in this RFC. It seems using bpf_redirect_info is too
> expensive.

I think it was Bjørn who discovered that accessing the per-CPU
bpf_redirect_info struct has an overhead of approx 2 ns (times 2.4GHz,
~4.8 cycles). Your reduction in cycles per packet was ~13, of which
~4.8 is a large share.

The code accesses this_cpu_ptr(&bpf_redirect_info) two times.
One time in the BPF-helper redirect call and a second time in
xdp_do_redirect.
(Hint: xdp_redirect_map ends up calling __bpf_xdp_redirect_map.)

Thus, it seems (as you say) the bpf_redirect_info approach is too
expensive. Maybe we should look at storing bpf_redirect_info in a place
that doesn't require the this_cpu_ptr() lookup... or cache the lookup
per NAPI cycle.
Have you tried this?

> Also, centralised processing of XDP actions was investigated. This would involve porting all drivers
> to a common interface for handling XDP actions which would greatly simplify the work involved in
> adding support for new XDP actions such as XDP_REDIRECT_XSK. However it was deemed at this point to
> be more complex than adding support for the new action to every driver. Should this series be
> considered worth pursuing for a proper patch set, the intention would be to update each driver
> individually.

I'm fine with adding a new helper, but I don't like introducing a new
XDP_REDIRECT_XSK action, which requires updating ALL the drivers.

With the XDP_REDIRECT infra we believed we didn't need to add more
XDP-action code to drivers, as we multiplex/add new features by
extending the bpf_redirect_info.
In this extreme performance case, it seems the this_cpu_ptr "lookup" of
bpf_redirect_info is the performance issue itself.

Could you experiment with different approaches that modify
xdp_do_redirect() to handle if the new helper bpf_redirect_xsk was
called, prior to the this_cpu_ptr() call?
(Thus avoiding introducing a new XDP-action.)

> Thank you to Magnus Karlsson and Maciej Fijalkowski for several suggestions and insight provided.
>
> TODO:
> * Add selftest(s)
> * Add support for all copy and zero copy drivers
> * Libxdp support
>
> The series applies on commit e5043894b21f ("bpftool: Use libbpf_get_error() to check error")
>
> Thanks,
> Ciara
>
> Ciara Loftus (8):
>   xsk: add struct xdp_sock to netdev_rx_queue
>   bpf: add bpf_redirect_xsk helper and XDP_REDIRECT_XSK action
>   xsk: handle XDP_REDIRECT_XSK and expose xsk_rcv/flush
>   i40e: handle the XDP_REDIRECT_XSK action
>   xsk: implement a batched version of xsk_rcv
>   i40e: isolate descriptor processing in separate function
>   i40e: introduce batched XDP rx descriptor processing
>   libbpf: use bpf_redirect_xsk in the default program
>
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c   |  13 +-
>  .../ethernet/intel/i40e/i40e_txrx_common.h    |   1 +
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c    | 285 +++++++++++++++---
>  include/linux/netdevice.h                     |   2 +
>  include/net/xdp_sock_drv.h                    |  49 +++
>  include/net/xsk_buff_pool.h                   |  22 ++
>  include/uapi/linux/bpf.h                      |  13 +
>  kernel/bpf/verifier.c                         |   7 +-
>  net/core/dev.c                                |  14 +
>  net/core/filter.c                             |  26 ++
>  net/xdp/xsk.c                                 |  69 ++++-
>  net/xdp/xsk_queue.h                           |  31 ++
>  tools/include/uapi/linux/bpf.h                |  13 +
>  tools/lib/bpf/xsk.c                           |  50 ++-
>  14 files changed, 551 insertions(+), 44 deletions(-)
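Jesper's numbers above can be illustrated with a small userspace model (all names here are stand-ins, not the kernel code): today the percpu lookup is paid twice per packet, once in the helper and once in xdp_do_redirect(), while caching the pointer per NAPI cycle would pay it once per batch.

```c
/* Toy model of the this_cpu_ptr() cost discussed above. The lookup is
 * simulated with a counter; struct and function names are illustrative. */

struct redirect_info { int target; };

static struct redirect_info percpu_ri;   /* stands in for the percpu var  */
static int lookup_count;                 /* how often the lookup is paid  */

static struct redirect_info *this_cpu_ptr_model(void)
{
    lookup_count++;                      /* ~2 ns / ~4.8 cycles each time */
    return &percpu_ri;
}

/* Current scheme: two lookups per packet. */
static void redirect_per_packet(void)
{
    this_cpu_ptr_model()->target = 1;    /* BPF-helper side     */
    (void)this_cpu_ptr_model()->target;  /* xdp_do_redirect side */
}

/* Cached scheme: one lookup per NAPI budget of n packets. */
static void napi_cycle_cached(int n)
{
    struct redirect_info *ri = this_cpu_ptr_model();

    for (int i = 0; i < n; i++) {
        ri->target = 1;
        (void)ri->target;
    }
}
```

For a 64-packet NAPI budget the per-packet scheme pays 128 lookups where the cached scheme pays 1, which is the kind of saving being discussed.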
> On 16/11/2021 08.37, Ciara Loftus wrote:
> > The common case for AF_XDP sockets (xsks) is creating a single xsk on a queue for sending and
> > receiving frames as this is analogous to HW packet steering through RSS and other classification
> > methods in the NIC. [...]
>
> [... cover letter and performance discussion snipped ...]
>
> I'm fine with adding a new helper, but I don't like introducing a new
> XDP_REDIRECT_XSK action, which requires updating ALL the drivers.
>
> With the XDP_REDIRECT infra we believed we didn't need to add more
> XDP-action code to drivers, as we multiplex/add new features by
> extending the bpf_redirect_info.
> In this extreme performance case, it seems the this_cpu_ptr "lookup" of
> bpf_redirect_info is the performance issue itself.
>
> Could you experiment with different approaches that modify
> xdp_do_redirect() to handle if the new helper bpf_redirect_xsk was
> called, prior to the this_cpu_ptr() call?
> (Thus avoiding introducing a new XDP-action.)

Thanks for your feedback Jesper.
I understand the hesitation of adding a new action. If we can achieve the
same improvement without introducing a new action I would be very happy!
Without the new action we'll need a new way to indicate that the
bpf_redirect_xsk helper was called. Maybe another new field in the netdev
alongside the xsk_refcnt. Or else extend bpf_redirect_info - if we can
find a new home for it, since it's too costly to access.
Thanks for your suggestions. I'll experiment as you suggested and
report back.

Thanks,
Ciara

> [... remainder of the cover letter and diffstat snipped ...]
"Loftus, Ciara" <ciara.loftus@intel.com> writes:

>> I'm fine with adding a new helper, but I don't like introducing a new
>> XDP_REDIRECT_XSK action, which requires updating ALL the drivers.
>>
>> With the XDP_REDIRECT infra we believed we didn't need to add more
>> XDP-action code to drivers, as we multiplex/add new features by
>> extending the bpf_redirect_info.
>> In this extreme performance case, it seems the this_cpu_ptr "lookup" of
>> bpf_redirect_info is the performance issue itself.
>>
>> Could you experiment with different approaches that modify
>> xdp_do_redirect() to handle if the new helper bpf_redirect_xsk was
>> called, prior to the this_cpu_ptr() call?
>> (Thus avoiding introducing a new XDP-action.)
>
> Thanks for your feedback Jesper.
> I understand the hesitation of adding a new action. If we can achieve the
> same improvement without introducing a new action I would be very happy!
> Without the new action we'll need a new way to indicate that the
> bpf_redirect_xsk helper was called. Maybe another new field in the netdev
> alongside the xsk_refcnt. Or else extend bpf_redirect_info - if we can
> find a new home for it, since it's too costly to access.
> Thanks for your suggestions. I'll experiment as you suggested and
> report back.

I'll add a +1 to the "let's try to solve this without a new return code" :)

Also, I don't think we need a new helper either; the bpf_redirect()
helper takes a flags argument, so we could just use ifindex=0,
flags=DEV_XSK or something like that.

Also, I think the batching in the driver idea can be generalised: we
just need to generalise the idea of "are all these packets going to the
same place" and have a batched version of xdp_do_redirect(), no? The
other map types do batching internally already, though, so I'm wondering
why batching in the driver helps XSK?

-Toke
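Toke's flag idea could be modeled in userspace roughly like this (the flag name BPF_F_REDIRECT_XSK and both function names are hypothetical, and the real bpf_redirect() flags are kernel-defined): the helper records a flag instead of a new action, and xdp_do_redirect() branches on it before doing the regular percpu bookkeeping.

```c
/* Toy model: reuse bpf_redirect()'s flags argument instead of a new
 * action or helper. All names are illustrative, not kernel API. */

#define BPF_F_REDIRECT_XSK  (1U << 1)   /* hypothetical flag value */
#define XDP_REDIRECT        4           /* existing action code    */

struct redirect_state {
    unsigned int flags;
    int percpu_lookups;   /* counts simulated this_cpu_ptr() accesses */
};

static struct redirect_state ri;  /* stands in for bpf_redirect_info */

/* Helper side: ifindex=0, record the flag, return the existing action. */
static int bpf_redirect_model(int ifindex, unsigned int flags)
{
    (void)ifindex;
    ri.flags = flags;
    return XDP_REDIRECT;
}

/* Driver side: branch on the flag before the expensive map handling.
 * Returns 1 for the direct-xsk push, 0 for the regular map path. */
static int xdp_do_redirect_model(void)
{
    if (ri.flags & BPF_F_REDIRECT_XSK)
        return 1;                 /* direct xsk push, no map walk */
    ri.percpu_lookups++;          /* regular path: percpu access  */
    return 0;
}
```

Note the caveat Ciara raises below: with this layout the validity of the xsk is only checked at xdp_do_redirect() time, not in the helper.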
On Tue, Nov 16, 2021 at 07:37:34AM +0000, Ciara Loftus wrote:
> The common case for AF_XDP sockets (xsks) is creating a single xsk on a queue for sending and
> receiving frames as this is analogous to HW packet steering through RSS and other classification
> methods in the NIC. [...]
>
> This fast path is triggered on XDP_REDIRECT_XSK if:
> (i) AF_XDP socket SW Rx ring configured
> (ii) Exactly one xsk attached to the queue
> If any of these conditions are not met, fall back to the same behavior as the original approach:
> xdp_redirect_map. This is handled under the hood in the new bpf_xdp_redirect_xsk helper so the user
> does not need to be aware of these conditions.

I don't think the micro optimization for a specific use case warrants the
addition of new apis. Please optimize it without adding new actions and
new helpers.
> "Loftus, Ciara" <ciara.loftus@intel.com> writes:
>
> [... earlier discussion snipped ...]
>
> I'll add a +1 to the "let's try to solve this without a new return code" :)
>
> Also, I don't think we need a new helper either; the bpf_redirect()
> helper takes a flags argument, so we could just use ifindex=0,
> flags=DEV_XSK or something like that.

The advantage of a new helper is that we can access the netdev struct
from it and check if there's a valid xsk stored in it, before returning
XDP_REDIRECT without the xskmap lookup. However, I think your suggestion
could work too. We would just have to delay the check until
xdp_do_redirect. At this point though, if there isn't a valid xsk we
might have to drop the packet instead of falling back to the xskmap.

> Also, I think the batching in the driver idea can be generalised: we
> just need to generalise the idea of "are all these packets going to the
> same place" and have a batched version of xdp_do_redirect(), no? The
> other map types do batching internally already, though, so I'm wondering
> why batching in the driver helps XSK?

With the current infrastructure, figuring out if "all the packets are
going to the same place" looks like an expensive operation which could
undo the benefits of the batching that would come after it. We would need
to run the program N=batch_size times, store the actions and
bpf_redirect_info for each run, and perform a series of compares. The new
action really helped here because it could easily indicate if all the
packets in a batch were going to the same place. But I understand it's
not an option. Maybe if we can mitigate the cost of accessing the
bpf_redirect_info as Jesper suggested, we can use a flag in it to signal
what the new action was signalling.
I'm not familiar with the other map types and how they handle batching,
so I will look into that.

Appreciate your feedback. I have a few avenues to explore.

Ciara
"Loftus, Ciara" <ciara.loftus@intel.com> writes:

> [... earlier discussion snipped ...]
>
> The advantage of a new helper is that we can access the netdev struct
> from it and check if there's a valid xsk stored in it, before returning
> XDP_REDIRECT without the xskmap lookup. However, I think your suggestion
> could work too. We would just have to delay the check until
> xdp_do_redirect. At this point though, if there isn't a valid xsk we
> might have to drop the packet instead of falling back to the xskmap.

I think it's OK to require the user to make sure that there is such a
socket loaded before using that flag...

> With the current infrastructure, figuring out if "all the packets are
> going to the same place" looks like an expensive operation which could
> undo the benefits of the batching that would come after it. We would need
> to run the program N=batch_size times, store the actions and
> bpf_redirect_info for each run, and perform a series of compares. The new
> action really helped here because it could easily indicate if all the
> packets in a batch were going to the same place. But I understand it's
> not an option. Maybe if we can mitigate the cost of accessing the
> bpf_redirect_info as Jesper suggested, we can use a flag in it to signal
> what the new action was signalling.

Yes, it would probably require comparing the contents of the whole
bpf_redirect_info, at least as it is now; but assuming we can find a way
to mitigate the cost of accessing that structure, this may not be so
bad, and I believe there is some potential for compressing the state
further so we can get down to a single, or maybe two, 64-bit compares...

-Toke