Message ID | 20230605144433.290114-1-maciej.fijalkowski@intel.com (mailing list archive) |
---|---|
Series | xsk: multi-buffer support |
Great to see this proceeding! Thought I'd weigh in on this part:

> Unfortunately, we had to introduce a new bind flag (XDP_USE_SG) on the
> AF_XDP level to enable multi-buffer support. It would be great if you
> have ideas on how to get rid of it. The reason we need to
> differentiate between non multi-buffer and multi-buffer is the
> behaviour when the kernel gets a packet that is larger than the frame
> size. Without multi-buffer, this packet is dropped and marked in the
> stats. With multi-buffer on, we want to split it up into multiple
> frames instead.
>
> At the start, we thought that riding on the .frags section name of
> the XDP program was a good idea. You do not have to introduce yet
> another flag and all AF_XDP users must load an XDP program anyway
> to get any traffic up to the socket, so why not just say that the XDP
> program decides if the AF_XDP socket should get multi-buffer packets
> or not? The problem is that we can create an AF_XDP socket that is Tx
> only and that works without having to load an XDP program at
> all. Another problem is that the XDP program might change during the
> execution, so we would have to check this for every single packet.

I agree that it's better to tie the enabling of this to a socket flag
instead of to the XDP program, for a couple of reasons:

- The XDP program can, as you say, be changed, but it can also be shared
  between several sockets in a single XSK, so this really needs to be
  tied to the socket.

- The XDP program is often installed implicitly by libxdp, in which case
  the program can't really set the flag on the program itself.

There's a related question of whether the frags flag on the XDP program
should be a prerequisite for enabling it at the socket? I think probably
it should, right?

Also, related to the first point above, how does the driver respond to
two different sockets being attached to the same device with two
different values of the flag? (As you can probably tell I didn't look at
the details of the implementation...)

-Toke
On Mon, Jun 05, 2023 at 06:58:25PM +0200, Toke Høiland-Jørgensen wrote:
> Great to see this proceeding! Thought I'd weigh in on this part:

Hey Toke that is very nice to hear and thanks for chiming in:)

>
> > Unfortunately, we had to introduce a new bind flag (XDP_USE_SG) on the
> > AF_XDP level to enable multi-buffer support. It would be great if you
> > have ideas on how to get rid of it. The reason we need to
> > differentiate between non multi-buffer and multi-buffer is the
> > behaviour when the kernel gets a packet that is larger than the frame
> > size. Without multi-buffer, this packet is dropped and marked in the
> > stats. With multi-buffer on, we want to split it up into multiple
> > frames instead.
> >
> > At the start, we thought that riding on the .frags section name of
> > the XDP program was a good idea. You do not have to introduce yet
> > another flag and all AF_XDP users must load an XDP program anyway
> > to get any traffic up to the socket, so why not just say that the XDP
> > program decides if the AF_XDP socket should get multi-buffer packets
> > or not? The problem is that we can create an AF_XDP socket that is Tx
> > only and that works without having to load an XDP program at
> > all. Another problem is that the XDP program might change during the
> > execution, so we would have to check this for every single packet.
>
> I agree that it's better to tie the enabling of this to a socket flag
> instead of to the XDP program, for a couple of reasons:
>
> - The XDP program can, as you say, be changed, but it can also be shared
>   between several sockets in a single XSK, so this really needs to be
>   tied to the socket.

exactly

>
> - The XDP program is often installed implicitly by libxdp, in which case
>   the program can't really set the flag on the program itself.
>
> There's a related question of whether the frags flag on the XDP program
> should be a prerequisite for enabling it at the socket? I think probably
> it should, right?

These are two separate events (loading XDP prog vs loading AF_XDP socket)
which are unordered, so you can load mbuf AF_XDP socket in the first place
and then non-mbuf XDP prog and it will still work at some circumstances -
i will quote here commit msg from patch 02:

<quote>
Such capability of the application needs to be independent of the
xdp_prog's frag support capability since there are cases where even a
single xdp_buffer may need to be split into multiple descriptors owing to
a smaller xsk frame size.

For e.g., with NIC rx_buffer size set to 4kB, a 3kB packet will
constitute of a single buffer and so will be sent as such to AF_XDP layer
irrespective of 'xdp.frags' capability of the XDP program. Now if the xsk
frame size is set to 2kB by the AF_XDP application, then the packet will
need to be split into 2 descriptors if AF_XDP application can handle
multi-buffer, else it needs to be dropped.
</quote>

>
> Also, related to the first point above, how does the driver respond to
> two different sockets being attached to the same device with two
> different values of the flag? (As you can probably tell I didn't look at
> the details of the implementation...)

If we talk about zero-copy multi-buffer enabled driver then it will
combine all of the frags that belong to particular packet onto xdp_buff
which then will be redirected and AF_XDP core will check XDP_USE_SG flag
vs the length of xdp_buff - if len is bigger than a chunk size from XSK
pool (implies mbuf) and there is no XDP_USE_SG flag on socket - packet
will be dropped.

So driver is agnostic to that. AF_XDP core handles case you brought up
respectively.

Also what we actually attach down to driver is XSK pool not XSK socket
itself as you know. XSK pool does not carry any info regarding frags.

>
> -Toke
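
For illustration only, the Rx-side decision described in the message above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not code from the patch series: `MODEL_USE_SG`, `struct model_xsk` and `model_rx_descs()` are hypothetical names, the frame size is stored per socket for simplicity (in the thread it is the chunk size of the XSK pool), and headroom or zero-length packets are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket bind flag discussed in the
 * thread; the real XDP_USE_SG definition lives in the kernel UAPI headers. */
#define MODEL_USE_SG (1u << 0)

/* Hypothetical model of one AF_XDP socket: only the fields needed here. */
struct model_xsk {
	uint32_t bind_flags;	/* MODEL_USE_SG set: app handles multi-buffer */
	uint32_t frame_size;	/* chunk size in bytes (assumed per socket) */
};

/*
 * Returns the number of Rx descriptors a packet of pkt_len bytes would
 * occupy on this socket, or 0 if the packet would be dropped.
 */
static uint32_t model_rx_descs(const struct model_xsk *xs, uint32_t pkt_len)
{
	assert(xs->frame_size > 0);

	/* Fits in one frame: delivered as a single descriptor either way. */
	if (pkt_len <= xs->frame_size)
		return 1;

	/* Larger than a frame and the socket did not ask for multi-buffer. */
	if (!(xs->bind_flags & MODEL_USE_SG))
		return 0;

	/* Ceiling division without risking overflow in the addition. */
	return pkt_len / xs->frame_size + (pkt_len % xs->frame_size != 0);
}
```

With a 2048-byte frame size and the 3072-byte packet from the quoted commit message, a socket with the flag set gets 2 descriptors from `model_rx_descs()` and a socket without it gets 0 (a drop). The outcome depends only on the socket's own flag, which matches the point that the driver stays agnostic.
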
Maciej Fijalkowski <maciej.fijalkowski@intel.com> writes:

> On Mon, Jun 05, 2023 at 06:58:25PM +0200, Toke Høiland-Jørgensen wrote:
>> Great to see this proceeding! Thought I'd weigh in on this part:
>
> Hey Toke that is very nice to hear and thanks for chiming in:)
>
>>
>> > Unfortunately, we had to introduce a new bind flag (XDP_USE_SG) on the
>> > AF_XDP level to enable multi-buffer support. It would be great if you
>> > have ideas on how to get rid of it. The reason we need to
>> > differentiate between non multi-buffer and multi-buffer is the
>> > behaviour when the kernel gets a packet that is larger than the frame
>> > size. Without multi-buffer, this packet is dropped and marked in the
>> > stats. With multi-buffer on, we want to split it up into multiple
>> > frames instead.
>> >
>> > At the start, we thought that riding on the .frags section name of
>> > the XDP program was a good idea. You do not have to introduce yet
>> > another flag and all AF_XDP users must load an XDP program anyway
>> > to get any traffic up to the socket, so why not just say that the XDP
>> > program decides if the AF_XDP socket should get multi-buffer packets
>> > or not? The problem is that we can create an AF_XDP socket that is Tx
>> > only and that works without having to load an XDP program at
>> > all. Another problem is that the XDP program might change during the
>> > execution, so we would have to check this for every single packet.
>>
>> I agree that it's better to tie the enabling of this to a socket flag
>> instead of to the XDP program, for a couple of reasons:
>>
>> - The XDP program can, as you say, be changed, but it can also be shared
>>   between several sockets in a single XSK, so this really needs to be
>>   tied to the socket.
>
> exactly
>
>>
>> - The XDP program is often installed implicitly by libxdp, in which case
>>   the program can't really set the flag on the program itself.
>>
>> There's a related question of whether the frags flag on the XDP program
>> should be a prerequisite for enabling it at the socket? I think probably
>> it should, right?
>
> These are two separate events (loading XDP prog vs loading AF_XDP socket)
> which are unordered, so you can load mbuf AF_XDP socket in the first place
> and then non-mbuf XDP prog and it will still work at some circumstances -
> i will quote here commit msg from patch 02:
>
> <quote>
> Such capability of the application needs to be independent of the
> xdp_prog's frag support capability since there are cases where even a
> single xdp_buffer may need to be split into multiple descriptors owing to
> a smaller xsk frame size.
>
> For e.g., with NIC rx_buffer size set to 4kB, a 3kB packet will
> constitute of a single buffer and so will be sent as such to AF_XDP layer
> irrespective of 'xdp.frags' capability of the XDP program. Now if the xsk
> frame size is set to 2kB by the AF_XDP application, then the packet will
> need to be split into 2 descriptors if AF_XDP application can handle
> multi-buffer, else it needs to be dropped.
> </quote>

Ah, right, the fact that the XSK frame size is set independently was not
present in my mind. Okay, makes sense. So the frags flag in the XDP
program is only needed to be able to attach the program to the large-MTU
interface in the first place, then. I guess for the default libxdp
program we can just always set the frags flag (if the kernel supports
it), since there's no processing of the packet data on the kernel side
there...

>>
>> Also, related to the first point above, how does the driver respond to
>> two different sockets being attached to the same device with two
>> different values of the flag? (As you can probably tell I didn't look at
>> the details of the implementation...)
>
> If we talk about zero-copy multi-buffer enabled driver then it will
> combine all of the frags that belong to particular packet onto xdp_buff
> which then will be redirected and AF_XDP core will check XDP_USE_SG flag
> vs the length of xdp_buff - if len is bigger than a chunk size from XSK
> pool (implies mbuf) and there is no XDP_USE_SG flag on socket - packet
> will be dropped.
>
> So driver is agnostic to that. AF_XDP core handles case you brought up
> respectively.
>
> Also what we actually attach down to driver is XSK pool not XSK socket
> itself as you know. XSK pool does not carry any info regarding frags.

Alright, makes sense, thanks for clarifying! :)

-Toke