Message ID | 20240822200252.472298-1-wangfe@google.com (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | xfrm: add SA information to the offloaded packet | expand |
On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > From: wangfe <wangfe@google.com> > > In packet offload mode, append Security Association (SA) information > to each packet, replicating the crypto offload implementation. > The XFRM_XMIT flag is set to enable packet to be returned immediately > from the validate_xmit_xfrm function, thus aligning with the existing > code path for packet offload mode. > > This SA info helps HW offload match packets to their correct security > policies. The XFRM interface ID is included, which is crucial in setups > with multiple XFRM interfaces where source/destination addresses alone > can't pinpoint the right policy. > > Signed-off-by: wangfe <wangfe@google.com> Applied to ipsec-next, thanks Feng!
On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > From: wangfe <wangfe@google.com> > > > > In packet offload mode, append Security Association (SA) information > > to each packet, replicating the crypto offload implementation. > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > from the validate_xmit_xfrm function, thus aligning with the existing > > code path for packet offload mode. > > > > This SA info helps HW offload match packets to their correct security > > policies. The XFRM interface ID is included, which is crucial in setups > > with multiple XFRM interfaces where source/destination addresses alone > > can't pinpoint the right policy. > > > > Signed-off-by: wangfe <wangfe@google.com> > > Applied to ipsec-next, thanks Feng! Stephen, can you please explain why do you think that this is correct thing to do? There are no in-tree any drivers which is using this information, and it is unclear to me how state is released and it has controversial code around validity of xfrm_offload() too. For example: + sp->olen++; + sp->xvec[sp->len++] = x; + xfrm_state_hold(x); + + xo = xfrm_offload(skb); + if (!xo) { <--- previous code handled this case perfectly in validate_xmit_xfrm + secpath_reset(skb); + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); + kfree_skb(skb); + return -EINVAL; <--- xfrm state leak + } Can you please revert/drop this patch for now? Thanks
Hi Leon, Thank you for your insightful questions and comments. Just like in crypto offload mode, now pass SA (Security Association) information to the driver in packet offload mode. This helps the driver quickly identify the packet's source and destination, rather than having to parse the packet itself. The xfrm interface ID is especially useful here. As you can see in the kernel code (https://elixir.bootlin.com/linux/v6.10/source/net/xfrm/xfrm_policy.c#L1993), it's used to differentiate between various policies in complex network setups. During my testing of packet offload mode without the patch, I observed that the sec_path was NULL. Consequently, xo was also NULL when validate_xmit_xfrm was called, leading to an immediate return at line 125 (https://elixir.bootlin.com/linux/v6.10/source/net/xfrm/xfrm_device.c#L125). Regarding the potential xfrm_state leak, upon further investigation, it may not be the case. It seems that both secpath_reset and kfree_skb invoke the xfrm_state_put function, which should be responsible for releasing the state. The call flow appears to be as follows: kfree_skb -> skb_release_all -> skb_release_head_state -> skb_ext_put -> skb_ext_put_sp -> xfrm_state_put. Please let me know if you have any further questions or concerns. I appreciate your valuable feedback! Feng On Wed, Aug 28, 2024 at 4:26 AM Leon Romanovsky <leon@kernel.org> wrote: > > On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > > From: wangfe <wangfe@google.com> > > > > > > In packet offload mode, append Security Association (SA) information > > > to each packet, replicating the crypto offload implementation. > > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > > from the validate_xmit_xfrm function, thus aligning with the existing > > > code path for packet offload mode. > > > > > > This SA info helps HW offload match packets to their correct security > > > policies. The XFRM interface ID is included, which is crucial in setups > > > with multiple XFRM interfaces where source/destination addresses alone > > > can't pinpoint the right policy. > > > > > > Signed-off-by: wangfe <wangfe@google.com> > > > > Applied to ipsec-next, thanks Feng! > > Stephen, can you please explain why do you think that this is correct > thing to do? > > There are no in-tree any drivers which is using this information, and it > is unclear to me how state is released and it has controversial code > around validity of xfrm_offload() too. > > For example: > + sp->olen++; > + sp->xvec[sp->len++] = x; > + xfrm_state_hold(x); > + > + xo = xfrm_offload(skb); > + if (!xo) { <--- previous code handled this case perfectly in validate_xmit_xfrm > + secpath_reset(skb); > + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); > + kfree_skb(skb); > + return -EINVAL; <--- xfrm state leak > + } > > > Can you please revert/drop this patch for now? > > Thanks
On Wed, Aug 28, 2024 at 02:25:22PM -0700, Feng Wang wrote: > Hi Leon, Please don't top-post your replies when you are replying to a mailing list. It makes it hard to follow the conversation. > > Thank you for your insightful questions and comments. > > Just like in crypto offload mode, now pass SA (Security Association) > information to the driver in packet offload mode. This helps the > driver quickly identify the packet's source and destination, rather > than having to parse the packet itself. The xfrm interface ID is > especially useful here. As you can see in the kernel code > (https://elixir.bootlin.com/linux/v6.10/source/net/xfrm/xfrm_policy.c#L1993), > it's used to differentiate between various policies in complex network > setups. Which in-kernel driver use this information? > > During my testing of packet offload mode without the patch, I observed > that the sec_path was NULL. Consequently, xo was also NULL when > validate_xmit_xfrm was called, leading to an immediate return at line > 125 (https://elixir.bootlin.com/linux/v6.10/source/net/xfrm/xfrm_device.c#L125). It is intentional, because the packet is forwarded to HW as plain text and it is not offloaded (doesn't have xfrm_offload()). > > Regarding the potential xfrm_state leak, upon further investigation, > it may not be the case. It seems that both secpath_reset and kfree_skb > invoke the xfrm_state_put function, which should be responsible for > releasing the state. The call flow appears to be as follows: kfree_skb > -> skb_release_all -> skb_release_head_state -> skb_ext_put -> > skb_ext_put_sp -> xfrm_state_put. You are trying to make same flow as for crypto, but it is not the same, in crypto case secpath_reset() was called to release SKB extensions and perform cleanup, first and only after that new xfrm_state_hold() was called, but in new code SKB is not reset. Thanks > > Please let me know if you have any further questions or concerns. I > appreciate your valuable feedback! > > Feng > > On Wed, Aug 28, 2024 at 4:26 AM Leon Romanovsky <leon@kernel.org> wrote: > > > > On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > > > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > > > From: wangfe <wangfe@google.com> > > > > > > > > In packet offload mode, append Security Association (SA) information > > > > to each packet, replicating the crypto offload implementation. > > > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > > > from the validate_xmit_xfrm function, thus aligning with the existing > > > > code path for packet offload mode. > > > > > > > > This SA info helps HW offload match packets to their correct security > > > > policies. The XFRM interface ID is included, which is crucial in setups > > > > with multiple XFRM interfaces where source/destination addresses alone > > > > can't pinpoint the right policy. > > > > > > > > Signed-off-by: wangfe <wangfe@google.com> > > > > > > Applied to ipsec-next, thanks Feng! > > > > Stephen, can you please explain why do you think that this is correct > > thing to do? > > > > There are no in-tree any drivers which is using this information, and it > > is unclear to me how state is released and it has controversial code > > around validity of xfrm_offload() too. > > > > For example: > > + sp->olen++; > > + sp->xvec[sp->len++] = x; > > + xfrm_state_hold(x); > > + > > + xo = xfrm_offload(skb); > > + if (!xo) { <--- previous code handled this case perfectly in validate_xmit_xfrm > > + secpath_reset(skb); > > + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); > > + kfree_skb(skb); > > + return -EINVAL; <--- xfrm state leak > > + } > > > > > > Can you please revert/drop this patch for now? > > > > Thanks
Hi Leon, Thank you again for your thoughtful questions and comments. I'd like to provide further clarification and address your points: SA Information Usage: There are several instances in the kernel code where it's used, such as in esp4(6)_offload.c and xfrm.c. This clearly demonstrates how SA information is used. Moreover, passing this information to the driver shouldn't negatively impact those drivers that don't require it. Regarding a driver example, the function mlx5e_ipsec_feature_check caught my attention. https://elixir.bootlin.com/linux/v6.10/source/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h#L89) As you're more familiar with this codebase, I defer to your expertise on whether it's an appropriate sample. However, the crucial point is that including this information empowers certain drivers to leverage it without affecting those that don't need it. validate_xmit_xfrm Function: My primary goal in discussing the validate_xmit_xfrm function is to assure you that my patch maintains the existing packet offload code flow, avoiding any unintended disruption. State Release: I've noticed that secpath_reset() is called before xfrm_output(). The sequence seems to be: xfrmi_xmit2 -> xfrmi_scrub_packet -> secpath_reset(), followed by xfrmi_xmit2 calling dst_output, which is essentially xfrm_output(). I'm also open to moving the xfrm_state_hold(x) after the if (!xo) check block. This would ensure the state is held only when everything is ok. I'll gladly make this adjustment if you believe it's the better option. Thank you once again for your valuable insights and collaboration. Your feedback is greatly appreciated! Feng
On Thu, Aug 29, 2024 at 02:19:25PM -0700, Feng Wang wrote: > Hi Leon, > > Thank you again for your thoughtful questions and comments. I'd like > to provide further clarification and address your points: > > SA Information Usage: > > There are several instances in the kernel code where it's used, such > as in esp4(6)_offload.c and xfrm.c. This clearly demonstrates how SA > information is used. Moreover, passing this information to the driver > shouldn't negatively impact those drivers that don't require it. > Regarding a driver example, the function mlx5e_ipsec_feature_check > caught my attention. > https://elixir.bootlin.com/linux/v6.10/source/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.h#L89) > As you're more familiar with this codebase, I defer to your expertise > on whether it's an appropriate sample. This function is not involved when packet is going to be offloaded for "IPsec packet offload". > However, the crucial point is that including this information empowers certain drivers to leverage > it without affecting those that don't need it. Can you please provide a list of drivers that will benefit from this change? Can you give a complete flow (including driver) which didn't work before and will work after this change? > > validate_xmit_xfrm Function: > My primary goal in discussing the validate_xmit_xfrm function is to > assure you that my patch maintains the existing packet offload code > flow, avoiding any unintended disruption. The whole idea of packet offload is to skip everything in XFRM stack and present packet as plain text. > > State Release: > I've noticed that secpath_reset() is called before xfrm_output(). The > sequence seems to be: xfrmi_xmit2 -> xfrmi_scrub_packet -> > secpath_reset(), followed by xfrmi_xmit2 calling dst_output, which is > essentially xfrm_output(). > I'm also open to moving the xfrm_state_hold(x) after the if (!xo) > check block. This would ensure the state is held only when everything > is ok. I'll gladly make this adjustment if you believe it's the better > option. > > Thank you once again for your valuable insights and collaboration. > Your feedback is greatly appreciated! > > Feng >
Hi Leon, I believe you are right about the mlx5e_ipsec_feature_check function. And it shows that the driver can indeed make use of the SA information. Similarly, in packet offload mode, drivers can potentially leverage this information for their own purposes. The patch is designed to be non-intrusive, so drivers that don't utilize this information won't be affected in any way. I'm also curious about why the mlx driver doesn't seem to use the XFRM interface ID in the same way that xfrm_policy_match() does. https://elixir.bootlin.com/linux/v6.10.7/source/net/xfrm/xfrm_policy.c#L1993 This ID is critical in scenarios with multiple IPsec tunnels, where source and destination addresses alone might not be sufficient to identify the correct security policy. Perhaps there's a specific reason or design choice behind this in the mlx driver? Thank you once again for your valuable insights and collaboration. Feng
On Fri, Aug 30, 2024 at 05:27:29PM -0700, Feng Wang wrote: > Hi Leon, > > I believe you are right about the mlx5e_ipsec_feature_check function. > And it shows that the driver can indeed make use of the SA > information. Similarly, in packet offload mode, drivers can > potentially leverage this information for their own purposes. The > patch is designed to be non-intrusive, so drivers that don't utilize > this information won't be affected in any way. I asked about examples of such drivers. Can you please provide them? > > I'm also curious about why the mlx driver doesn't seem to use the XFRM > interface ID in the same way that xfrm_policy_match() does. > https://elixir.bootlin.com/linux/v6.10.7/source/net/xfrm/xfrm_policy.c#L1993a HW offload is always last in packet TX traversal and it means that if HW catches that packet and it meets the HW offload requirements, it will be encrypted. The main idea is that routing (sending to right if_id) is handled by the upper layers and HW offload is just a final step. > This ID is critical in scenarios with multiple IPsec tunnels, where > source and destination addresses alone might not be sufficient to > identify the correct security policy. Perhaps there's a specific > reason or design choice behind this in the mlx driver? It is not specific to mlx5, but to all HW offload drivers. They should implement both policy and SA offloading. It is violation of current mailing list deign to do not offload policy. If you offload both policy and SA, you won't need if_id at all. > > Thank you once again for your valuable insights and collaboration. > > Feng
On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > From: wangfe <wangfe@google.com> > > > > In packet offload mode, append Security Association (SA) information > > to each packet, replicating the crypto offload implementation. > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > from the validate_xmit_xfrm function, thus aligning with the existing > > code path for packet offload mode. > > > > This SA info helps HW offload match packets to their correct security > > policies. The XFRM interface ID is included, which is crucial in setups > > with multiple XFRM interfaces where source/destination addresses alone > > can't pinpoint the right policy. > > > > Signed-off-by: wangfe <wangfe@google.com> > > Applied to ipsec-next, thanks Feng! Steffen, What is your position on this patch? It is the same patch (logically) as the one that was rejected before? https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ Thanks
Sorry for the delay. I'm on vacation, so responses will take a bit longer during the next two weeks. On Sat, Aug 31, 2024 at 08:39:34PM +0300, Leon Romanovsky wrote: > On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > > From: wangfe <wangfe@google.com> > > > > > > In packet offload mode, append Security Association (SA) information > > > to each packet, replicating the crypto offload implementation. > > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > > from the validate_xmit_xfrm function, thus aligning with the existing > > > code path for packet offload mode. > > > > > > This SA info helps HW offload match packets to their correct security > > > policies. The XFRM interface ID is included, which is crucial in setups > > > with multiple XFRM interfaces where source/destination addresses alone > > > can't pinpoint the right policy. > > > > > > Signed-off-by: wangfe <wangfe@google.com> > > > > Applied to ipsec-next, thanks Feng! > > Steffen, > > What is your position on this patch? > It is the same patch (logically) as the one that was rejected before? > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ This is an infrastructure patch to support routing based IPsec with xfrm interfaces. I just did not notice it because it was not mentioned in the commit message of the first patchset. This should have been included into the packet offload API patchset, but I overlooked that xfrm interfaces can't work with packet offload mode. The stack infrastructure should be complete, so that drivers can implement that without the need to fix the stack before. In case the patch has issues, we should fix it.
On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > From: wangfe <wangfe@google.com> > > In packet offload mode, append Security Association (SA) information > to each packet, replicating the crypto offload implementation. > The XFRM_XMIT flag is set to enable packet to be returned immediately > from the validate_xmit_xfrm function, thus aligning with the existing > code path for packet offload mode. > > This SA info helps HW offload match packets to their correct security > policies. The XFRM interface ID is included, which is crucial in setups > with multiple XFRM interfaces where source/destination addresses alone > can't pinpoint the right policy. > > Signed-off-by: wangfe <wangfe@google.com> > --- > v2: > - Add why HW offload requires the SA info to the commit message > v1: https://lore.kernel.org/all/20240812182317.1962756-1-wangfe@google.com/ > --- > net/xfrm/xfrm_output.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c > index e5722c95b8bb..a12588e7b060 100644 > --- a/net/xfrm/xfrm_output.c > +++ b/net/xfrm/xfrm_output.c > @@ -706,6 +706,8 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb) > struct xfrm_state *x = skb_dst(skb)->xfrm; > int family; > int err; > + struct xfrm_offload *xo; > + struct sec_path *sp; > > family = (x->xso.type != XFRM_DEV_OFFLOAD_PACKET) ? x->outer_mode.family > : skb_dst(skb)->ops->family; > @@ -728,6 +730,25 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb) > kfree_skb(skb); > return -EHOSTUNREACH; > } > + sp = secpath_set(skb); Maybe we should do that only if if we really use xfrm interfaces, i.e. if xfrm if_id is not zero.
On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > Sorry for the delay. I'm on vacation, so responses will take a bit > longer during the next two weeks. > > On Sat, Aug 31, 2024 at 08:39:34PM +0300, Leon Romanovsky wrote: > > On Wed, Aug 28, 2024 at 07:32:47AM +0200, Steffen Klassert wrote: > > > On Thu, Aug 22, 2024 at 01:02:52PM -0700, Feng Wang wrote: > > > > From: wangfe <wangfe@google.com> > > > > > > > > In packet offload mode, append Security Association (SA) information > > > > to each packet, replicating the crypto offload implementation. > > > > The XFRM_XMIT flag is set to enable packet to be returned immediately > > > > from the validate_xmit_xfrm function, thus aligning with the existing > > > > code path for packet offload mode. > > > > > > > > This SA info helps HW offload match packets to their correct security > > > > policies. The XFRM interface ID is included, which is crucial in setups > > > > with multiple XFRM interfaces where source/destination addresses alone > > > > can't pinpoint the right policy. > > > > > > > > Signed-off-by: wangfe <wangfe@google.com> > > > > > > Applied to ipsec-next, thanks Feng! > > > > Steffen, > > > > What is your position on this patch? > > It is the same patch (logically) as the one that was rejected before? > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > This is an infrastructure patch to support routing based IPsec > with xfrm interfaces. I just did not notice it because it was not > mentioned in the commit message of the first patchset. This should have > been included into the packet offload API patchset, but I overlooked > that xfrm interfaces can't work with packet offload mode. The stack > infrastructure should be complete, so that drivers can implement > that without the need to fix the stack before. Core implementation that is not used by any upstream code is rarely right thing to do. It is not tested, complicates the code and mostly overlooked when patches are reviewed. The better way will be to extend the stack when this feature will be actually used and needed. IMHO, attempt to enrich core code without showing users of this new flow is comparable to premature optimization. And Feng more than once said that this code is for some out-of-tree driver. > > In case the patch has issues, we should fix it. Yes, this patch doesn't have in-kernel users. Thanks
This patch simply assigns a value to a field, replicating existing crypto offload behavior - it's working/tested in that mode. Many instances within the kernel code utilize this information in different cases, making the implementation pretty simple and safe. Hi Leon, "It is not specific to mlx5, but to all HW offload drivers. They should implement both policy and SA offloading. It is violation of current mailing list deign to do not offload policy. If you offload both policy and SA, you won't need if_id at all." Could you please clarify why the if_id is unnecessary in scenarios with hardware offload? For instance, imagine I have two tunnel sessions sharing the same source and destination addresses. One tunnel utilizes xfrm ID 1, while the other uses xfrm ID 2. If a packet is sent out via xfrm ID 1 and lacks any specific markings, how does the hardware offload determine that this packet belongs to xfrm ID 1 and not xfrm ID 2? This distinction is crucial for the hardware to locate the correct encryption information and encrypt the packet accordingly. Thanks for your help. Feng
On Tue, Sep 03, 2024 at 11:19:41AM -0700, Feng Wang wrote: > This patch simply assigns a value to a field, replicating existing > crypto offload behavior - it's working/tested in that mode. Many > instances within the kernel code utilize this information in different > cases, making the implementation pretty simple and safe. Not really, the thing that you are adding secpath in the place which wasn't designed to have it, is very problematic. Crypto and packet offloads are two different things as they treat packets differently. First one sees them as ESP packets, while the second one sees them as plain text packets. > > Hi Leon, > > "It is not specific to mlx5, but to all HW offload drivers. They should > implement both policy and SA offloading. It is violation of current mailing > list deign to do not offload policy. If you offload both policy and SA, you > won't need if_id at all." > > Could you please clarify why the if_id is unnecessary in scenarios > with hardware offload? I have no objections to support xfrm IDs in the packet offload, my request is to have upstream driver which uses this feature. > > For instance, imagine I have two tunnel sessions sharing the same > source and destination addresses. One tunnel utilizes xfrm ID 1, while > the other uses xfrm ID 2. If a packet is sent out via xfrm ID 1 and > lacks any specific markings, how does the hardware offload determine > that this packet belongs to xfrm ID 1 and not xfrm ID 2? This > distinction is crucial for the hardware to locate the correct > encryption information and encrypt the packet accordingly. HW doesn't know about xfrm ID, as this information is kernel specific. In order HW will use this information someone needs to pass it to HW, while configuring SA/policy and nothing in the kernel does that. Thanks > > Thanks for your help. > > Feng
Hi Leon, I'm looking at the MLX5 driver to understand how the SA information is used. In mlx5e_ipsec_handle_tx_skb(), it appears we might leverage the current MLX5 implementation to verify the xfrm id. https://elixir.bootlin.com/linux/v6.10/source/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c#L271 During the mlx5e_xfrm_add_state() function, the xfrm ID (x->if_id) is passed to the driver along with the associated xfrm_state pointer. Therefore, by checking the if_id within the skb tx function like mlx5e_ipsec_handle_tx_skb(), we should be able to demonstrate the use case effectively. What’s your opinion? Thanks for your help. Feng
On Wed, Sep 04, 2024 at 10:41:38AM -0700, Feng Wang wrote: > Hi Leon, > > I'm looking at the MLX5 driver to understand how the SA information is > used. In mlx5e_ipsec_handle_tx_skb(), it appears we might leverage the > current MLX5 implementation to verify the xfrm id. > https://elixir.bootlin.com/linux/v6.10/source/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.c#L271 > > During the mlx5e_xfrm_add_state() function, the xfrm ID (x->if_id) is > passed to the driver along with the associated xfrm_state pointer. > Therefore, by checking the if_id within the skb tx function like > mlx5e_ipsec_handle_tx_skb(), we should be able to demonstrate the use > case effectively. > > What’s your opinion? Packet offloaded packets don't pass mlx5e_ipsec_handle_tx_skb() because SKB is treated as plain text and not encrypted. In order to support this feature in mlx5, you will need to do two things: 1. Create rule which matches x->if_id in mlx5 flow steering, while creating SAs (see tx_add_rule()->setup_fte_reg_a()). This register is used in the transmit steering tables, and is loaded with the value of flow_table_metadata field in the Ethernet Segment of the WQE. 2. Set x->if_id from SKB in flow_table_metadata to allow HW to catch these packets. It means change mlx5e datapath to set this value from SKB. The first item is easy, just move setup_fte_reg_a() to the right place, but the second one is more complex as whole packet offload assumption that we are working with plain text packets. I'm not even talking about eswitch mode, which will bring more complexity. Thanks > > Thanks for your help. > > Feng
Hi Leon, I'm currently reading the mlx5e_xmit() function to understand the MLX5 datapath. I'm particularly interested in understanding how packet offload mode functions. Could you please clarify if CONFIG_MLX5_EN_IPSEC needs to be defined when MLX5 employs packet offload mode? Additionally, are there any specific flags associated with packet offload that I should be mindful of? Any guidance would be greatly appreciated. Thanks, Feng
On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > Steffen, > > > > > > What is your position on this patch? > > > It is the same patch (logically) as the one that was rejected before? > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > This is an infrastructure patch to support routing based IPsec > > with xfrm interfaces. I just did not notice it because it was not > > mentioned in the commit message of the first patchset. This should have > > been included into the packet offload API patchset, but I overlooked > > that xfrm interfaces can't work with packet offload mode. The stack > > infrastructure should be complete, so that drivers can implement > > that without the need to fix the stack before. > > Core implementation that is not used by any upstream code is rarely > right thing to do. It is not tested, complicates the code and mostly > overlooked when patches are reviewed. The better way will be to extend > the stack when this feature will be actually used and needed. This is our tradeoff, an API should be fully designed from the beginning, everything else is bad design and will likely result in band aids (as it happens here). The API can be connected to netdevsim to test it. Currently the combination of xfrm interfaces and packet offload is just broken. Unfortunalely this patch does not fix it. I think we need to do three things: - Fix xfrm interfaces + packet offload combination - Extend netdevsim to support packet offload - Extend the API for xfrm interfaces (and everything else we forgot). > IMHO, attempt to enrich core code without showing users of this new flow > is comparable to premature optimization. > > And Feng more than once said that this code is for some out-of-tree > driver. It is an API, so everyone can use it. Anyway, I'm thinking about reverting it for now and do it right in the next development cycle.
On Mon, Sep 09, 2024 at 11:09:06AM +0200, Steffen Klassert wrote: > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > Anyway, I'm thinking about reverting it for now and do it right > in the next development cycle. I finally decided to revert it. Let's work on it after the next merge window.
On Mon, Sep 09, 2024 at 11:09:05AM +0200, Steffen Klassert wrote: > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > > > Steffen, > > > > > > > > What is your position on this patch? > > > > It is the same patch (logically) as the one that was rejected before? > > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > > > This is an infrastructure patch to support routing based IPsec > > > with xfrm interfaces. I just did not notice it because it was not > > > mentioned in the commit message of the first patchset. This should have > > > been included into the packet offload API patchset, but I overlooked > > > that xfrm interfaces can't work with packet offload mode. The stack > > > infrastructure should be complete, so that drivers can implement > > > that without the need to fix the stack before. > > > > Core implementation that is not used by any upstream code is rarely > > right thing to do. It is not tested, complicates the code and mostly > > overlooked when patches are reviewed. The better way will be to extend > > the stack when this feature will be actually used and needed. > > This is our tradeoff, an API should be fully designed from the > beginning, everything else is bad design and will likely result > in band aids (as it happens here). The API can be connected to > netdevsim to test it. > > Currently the combination of xfrm interfaces and packet offload > is just broken. I don't think that it is broken. It is just not implemented. XFRM interfaces are optional field, which is not really popular in the field. > Unfortunalely this patch does not fix it. > > I think we need to do three things: > > - Fix xfrm interfaces + packet offload combination > > - Extend netdevsim to support packet offload > > - Extend the API for xfrm interfaces (and everything > else we forgot). This is the most challenging part. It is not clear what should we extend if customers are not asking for it and they are extremely happy with the current IPsec packet offload state. BTW, I'm aware of one gap, which is not clear how to handle, and it is combination of policy sockets and offload. > > > IMHO, attempt to enrich core code without showing users of this new flow > > is comparable to premature optimization. > > > > And Feng more than once said that this code is for some out-of-tree > > driver. > > It is an API, so everyone can use it. Of course, as long as in-kernel user exists. Thanks
Hi Steffen, Can you reconsider the revert of the CL? Based on our discussion, I believe we've reached a consensus that the xfrm id is necessary. If some customers prefer not to utilize it, we can extend the offload flag (something like XFRM_DEV_OFFLOAD_FLAG_ACQ) to manage this behavior. The default setting would remain consistent with the current implementation. Thank you! Feng On Wed, Sep 11, 2024 at 3:40 AM Leon Romanovsky <leon@kernel.org> wrote: > > On Mon, Sep 09, 2024 at 11:09:05AM +0200, Steffen Klassert wrote: > > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > > > > > Steffen, > > > > > > > > > > What is your position on this patch? > > > > > It is the same patch (logically) as the one that was rejected before? > > > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > > > > > This is an infrastructure patch to support routing based IPsec > > > > with xfrm interfaces. I just did not notice it because it was not > > > > mentioned in the commit message of the first patchset. This should have > > > > been included into the packet offload API patchset, but I overlooked > > > > that xfrm interfaces can't work with packet offload mode. The stack > > > > infrastructure should be complete, so that drivers can implement > > > > that without the need to fix the stack before. > > > > > > Core implementation that is not used by any upstream code is rarely > > > right thing to do. It is not tested, complicates the code and mostly > > > overlooked when patches are reviewed. The better way will be to extend > > > the stack when this feature will be actually used and needed. > > > > This is our tradeoff, an API should be fully designed from the > > beginning, everything else is bad design and will likely result > > in band aids (as it happens here). The API can be connected to > > netdevsim to test it. > > > > Currently the combination of xfrm interfaces and packet offload > > is just broken. > > I don't think that it is broken. It is just not implemented. XFRM > interfaces are optional field, which is not really popular in the > field. > > > Unfortunalely this patch does not fix it. > > > > I think we need to do three things: > > > > - Fix xfrm interfaces + packet offload combination > > > > - Extend netdevsim to support packet offload > > > > - Extend the API for xfrm interfaces (and everything > > else we forgot). > > This is the most challenging part. It is not clear what should > we extend if customers are not asking for it and they are extremely > happy with the current IPsec packet offload state. > > BTW, I'm aware of one gap, which is not clear how to handle, and > it is combination of policy sockets and offload. > > > > > > IMHO, attempt to enrich core code without showing users of this new flow > > > is comparable to premature optimization. > > > > > > And Feng more than once said that this code is for some out-of-tree > > > driver. > > > > It is an API, so everyone can use it. > > Of course, as long as in-kernel user exists. > > Thanks
On Wed, Sep 11, 2024 at 04:43:55PM -0700, Feng Wang wrote: > Hi Steffen, > > Can you reconsider the revert of the CL? Based on our discussion, I > believe we've reached a consensus that the xfrm id is necessary. It is only half of the consensus, the other half is that upstream customer should exist. Thanks
On Wed, Sep 11, 2024 at 04:43:55PM -0700, Feng Wang wrote: > Hi Steffen, > > Can you reconsider the revert of the CL? Based on our discussion, I > believe we've reached a consensus that the xfrm id is necessary. I did not say it is necessary, all I want is a complete API that exposes all xfrm stack features correctly to HW. > If > some customers prefer not to utilize it, we can extend the offload > flag (something like XFRM_DEV_OFFLOAD_FLAG_ACQ) to manage this > behavior. The default setting would remain consistent with the current > implementation. No new flags please. The easiest thing would be to upstream your driver, that is the prefered way and would just end this discussion.
On Wed, Sep 11, 2024 at 01:40:40PM +0300, Leon Romanovsky wrote: > On Mon, Sep 09, 2024 at 11:09:05AM +0200, Steffen Klassert wrote: > > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > > > > > Steffen, > > > > > > > > > > What is your position on this patch? > > > > > It is the same patch (logically) as the one that was rejected before? > > > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > > > > > This is an infrastructure patch to support routing based IPsec > > > > with xfrm interfaces. I just did not notice it because it was not > > > > mentioned in the commit message of the first patchset. This should have > > > > been included into the packet offload API patchset, but I overlooked > > > > that xfrm interfaces can't work with packet offload mode. The stack > > > > infrastructure should be complete, so that drivers can implement > > > > that without the need to fix the stack before. > > > > > > Core implementation that is not used by any upstream code is rarely > > > right thing to do. It is not tested, complicates the code and mostly > > > overlooked when patches are reviewed. The better way will be to extend > > > the stack when this feature will be actually used and needed. > > > > This is our tradeoff, an API should be fully designed from the > > beginning, everything else is bad design and will likely result > > in band aids (as it happens here). The API can be connected to > > netdevsim to test it. > > > > Currently the combination of xfrm interfaces and packet offload > > is just broken. > > I don't think that it is broken. I don't see anything that prevents you from offloading a SA with an xfrm interface ID. The binding to the interface is just ignored in that case. > It is just not implemented. XFRM > interfaces are optional field, which is not really popular in the > field. It is very popular, I know of more than a billion devices that are using xfrm interfaces. > > > Unfortunalely this patch does not fix it. > > > > I think we need to do three things: > > > > - Fix xfrm interfaces + packet offload combination > > > > - Extend netdevsim to support packet offload > > > > - Extend the API for xfrm interfaces (and everything > > else we forgot). > > This is the most challenging part. It is not clear what should > we extend if customers are not asking for it and they are extremely > happy with the current IPsec packet offload state. We just need to push the information down to the driver, and reject the offload if not supported. > > BTW, I'm aware of one gap, which is not clear how to handle, and > it is combination of policy sockets and offload. Socket policies are a bit special as they are configured by the application that uses the socket. I don't think that we can even configure offload for a socket policy.
Hi Steffen, The easiest thing would be to upstream your driver, that is the prefered way and would just end this discussion. I will try to upstream the xfrm interface id handling code to netdevsim, thus it will have an in-driver implementation. Thanks, Feng On Tue, Sep 24, 2024 at 3:34 AM Steffen Klassert <steffen.klassert@secunet.com> wrote: > > On Wed, Sep 11, 2024 at 01:40:40PM +0300, Leon Romanovsky wrote: > > On Mon, Sep 09, 2024 at 11:09:05AM +0200, Steffen Klassert wrote: > > > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > > > > > > > Steffen, > > > > > > > > > > > > What is your position on this patch? > > > > > > It is the same patch (logically) as the one that was rejected before? > > > > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > > > > > > > This is an infrastructure patch to support routing based IPsec > > > > > with xfrm interfaces. I just did not notice it because it was not > > > > > mentioned in the commit message of the first patchset. This should have > > > > > been included into the packet offload API patchset, but I overlooked > > > > > that xfrm interfaces can't work with packet offload mode. The stack > > > > > infrastructure should be complete, so that drivers can implement > > > > > that without the need to fix the stack before. > > > > > > > > Core implementation that is not used by any upstream code is rarely > > > > right thing to do. It is not tested, complicates the code and mostly > > > > overlooked when patches are reviewed. The better way will be to extend > > > > the stack when this feature will be actually used and needed. > > > > > > This is our tradeoff, an API should be fully designed from the > > > beginning, everything else is bad design and will likely result > > > in band aids (as it happens here). The API can be connected to > > > netdevsim to test it. > > > > > > Currently the combination of xfrm interfaces and packet offload > > > is just broken. > > > > I don't think that it is broken. > > I don't see anything that prevents you from offloading a SA > with an xfrm interface ID. The binding to the interface is > just ignored in that case. > > > It is just not implemented. XFRM > > interfaces are optional field, which is not really popular in the > > field. > > It is very popular, I know of more than a billion devices that > are using xfrm interfaces. > > > > > > Unfortunalely this patch does not fix it. > > > > > > I think we need to do three things: > > > > > > - Fix xfrm interfaces + packet offload combination > > > > > > - Extend netdevsim to support packet offload > > > > > > - Extend the API for xfrm interfaces (and everything > > > else we forgot). > > > > This is the most challenging part. It is not clear what should > > we extend if customers are not asking for it and they are extremely > > happy with the current IPsec packet offload state. > > We just need to push the information down to the driver, > and reject the offload if not supported. > > > > > BTW, I'm aware of one gap, which is not clear how to handle, and > > it is combination of policy sockets and offload. > > Socket policies are a bit special as they are configured by > the application that uses the socket. I don't think that > we can even configure offload for a socket policy. >
Please don't top post! On Tue, Sep 24, 2024 at 10:57:12AM -0700, Feng Wang wrote: > Hi Steffen, > > The easiest thing would be to upstream your driver, that is the > prefered way and would just end this discussion. > > I will try to upstream the xfrm interface id handling code to > netdevsim, thus it will have an in-driver implementation. Netdevsim is just the second best option and still might lead to discussion. Upstream the google driver that actually uses it, this _is_ the prefered way.
On Tue, Sep 24, 2024 at 08:10:41PM +0200, Steffen Klassert wrote: > Please don't top post! > > On Tue, Sep 24, 2024 at 10:57:12AM -0700, Feng Wang wrote: > > Hi Steffen, > > > > The easiest thing would be to upstream your driver, that is the > > prefered way and would just end this discussion. > > > > I will try to upstream the xfrm interface id handling code to > > netdevsim, thus it will have an in-driver implementation. > > Netdevsim is just the second best option and still might > lead to discussion. netdevsim should come with relevant selftests, otherwise this new feature won't be tested properly. We can't rely on someone out-of-tree to test it. > > Upstream the google driver that actually uses it, this _is_ > the prefered way. +1 >
On Tue, Sep 24, 2024 at 12:34:32PM +0200, Steffen Klassert wrote: > On Wed, Sep 11, 2024 at 01:40:40PM +0300, Leon Romanovsky wrote: > > On Mon, Sep 09, 2024 at 11:09:05AM +0200, Steffen Klassert wrote: > > > On Mon, Sep 02, 2024 at 12:44:52PM +0300, Leon Romanovsky wrote: > > > > On Mon, Sep 02, 2024 at 09:44:24AM +0200, Steffen Klassert wrote: > > > > > > > > > > > > Steffen, > > > > > > > > > > > > What is your position on this patch? > > > > > > It is the same patch (logically) as the one that was rejected before? > > > > > > https://lore.kernel.org/all/ZfpnCIv+8eYd7CpO@gauss3.secunet.de/ > > > > > > > > > > This is an infrastructure patch to support routing based IPsec > > > > > with xfrm interfaces. I just did not notice it because it was not > > > > > mentioned in the commit message of the first patchset. This should have > > > > > been included into the packet offload API patchset, but I overlooked > > > > > that xfrm interfaces can't work with packet offload mode. The stack > > > > > infrastructure should be complete, so that drivers can implement > > > > > that without the need to fix the stack before. > > > > > > > > Core implementation that is not used by any upstream code is rarely > > > > right thing to do. It is not tested, complicates the code and mostly > > > > overlooked when patches are reviewed. The better way will be to extend > > > > the stack when this feature will be actually used and needed. > > > > > > This is our tradeoff, an API should be fully designed from the > > > beginning, everything else is bad design and will likely result > > > in band aids (as it happens here). The API can be connected to > > > netdevsim to test it. > > > > > > Currently the combination of xfrm interfaces and packet offload > > > is just broken. > > > > I don't think that it is broken. > > I don't see anything that prevents you from offloading a SA > with an xfrm interface ID. The binding to the interface is > just ignored in that case. > > > It is just not implemented. XFRM > > interfaces are optional field, which is not really popular in the > > field. > > It is very popular, I know of more than a billion devices that > are using xfrm interfaces. We see different parts of "the field". In my case it it enterprise/cloud world, and I can say same sentence as you with "are NOT using ..." words instead. This is why so important to see google's driver (which is Android) to understand the real need from this feature. > > > > > > Unfortunalely this patch does not fix it. > > > > > > I think we need to do three things: > > > > > > - Fix xfrm interfaces + packet offload combination > > > > > > - Extend netdevsim to support packet offload > > > > > > - Extend the API for xfrm interfaces (and everything > > > else we forgot). > > > > This is the most challenging part. It is not clear what should > > we extend if customers are not asking for it and they are extremely > > happy with the current IPsec packet offload state. > > We just need to push the information down to the driver, > and reject the offload if not supported. Yes and in addition to that it will be beneficial do not add this information to SKB if it won't be used. > > > > > BTW, I'm aware of one gap, which is not clear how to handle, and > > it is combination of policy sockets and offload. > > Socket policies are a bit special as they are configured by > the application that uses the socket. I don't think that > we can even configure offload for a socket policy. One of the idea is to iterate over all devices and check if they support offload, if yes, then offload, if not, then fallback to software for that device. This is just rough idea with many caveats. Thanks
diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c index e5722c95b8bb..a12588e7b060 100644 --- a/net/xfrm/xfrm_output.c +++ b/net/xfrm/xfrm_output.c @@ -706,6 +706,8 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb) struct xfrm_state *x = skb_dst(skb)->xfrm; int family; int err; + struct xfrm_offload *xo; + struct sec_path *sp; family = (x->xso.type != XFRM_DEV_OFFLOAD_PACKET) ? x->outer_mode.family : skb_dst(skb)->ops->family; @@ -728,6 +730,25 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb) kfree_skb(skb); return -EHOSTUNREACH; } + sp = secpath_set(skb); + if (!sp) { + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); + kfree_skb(skb); + return -ENOMEM; + } + + sp->olen++; + sp->xvec[sp->len++] = x; + xfrm_state_hold(x); + + xo = xfrm_offload(skb); + if (!xo) { + secpath_reset(skb); + XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR); + kfree_skb(skb); + return -EINVAL; + } + xo->flags |= XFRM_XMIT; return xfrm_output_resume(sk, skb, 0); }