Message ID:    OS3P286MB229572718C0B4E7229710062F5AB9@OS3P286MB2295.JPNP286.PROD.OUTLOOK.COM (mailing list archive)
State:         Deferred
Delegated to:  Netdev Maintainers
Series:        [net-next,v2,1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute
Sorry, there is a typo in the mail, I will resend shortly, please ignore it for now.

On 2023/2/23 20:21, Eddy Tao wrote:
> Use on stack sw_flow_key in ovs_packet_cmd_execute
> [...]
On Thu, Feb 23, 2023 at 08:24:50PM +0800, Eddy Tao wrote:
> Sorry, there is a typo in the mail, i will resend shortly, please ignore it
> for now

net-next is now closed. You'll need to repost this patch after v6.3-rc1
has been tagged. Or post it as an RFC.

Ref: https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
On Thu, 2023-02-23 at 20:24 +0800, Eddy Tao wrote:
> Sorry, there is a typo in the mail, i will resend shortly,
please, don't do that.
# Form letter - net-next is closed
The merge window for v6.3 has begun and therefore net-next is closed
for new drivers, features, code refactoring and optimizations.
We are currently accepting bug fixes only.
Please repost when net-next reopens after Mar 6th.
RFC patches sent for review only are obviously welcome at any time.
Sure, I will redo the post when the window opens.

Have a great day,
eddy
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index fcee6012293b..ae3146d51079 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -596,8 +596,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 	struct nlattr **a = info->attrs;
 	struct sw_flow_actions *acts;
 	struct sk_buff *packet;
-	struct sw_flow *flow;
-	struct sw_flow_actions *sf_acts;
+	struct sw_flow_key key;
 	struct datapath *dp;
 	struct vport *input_vport;
 	u16 mru = 0;
@@ -636,24 +635,20 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 	}
 
 	/* Build an sw_flow for sending this packet. */
-	flow = ovs_flow_alloc();
-	err = PTR_ERR(flow);
-	if (IS_ERR(flow))
-		goto err_kfree_skb;
+	memset(&key, 0, sizeof(key));
 
 	err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
-					     packet, &flow->key, log);
+					     packet, &key, log);
 	if (err)
-		goto err_flow_free;
+		goto err_kfree_skb;
 
 	err = ovs_nla_copy_actions(net, a[OVS_PACKET_ATTR_ACTIONS],
-				   &flow->key, &acts, log);
+				   &key, &acts, log);
 	if (err)
-		goto err_flow_free;
+		goto err_kfree_skb;
 
-	rcu_assign_pointer(flow->sf_acts, acts);
-	packet->priority = flow->key.phy.priority;
-	packet->mark = flow->key.phy.skb_mark;
+	packet->priority = key.phy.priority;
+	packet->mark = key.phy.skb_mark;
 
 	rcu_read_lock();
 	dp = get_dp_rcu(net, ovs_header->dp_ifindex);
@@ -661,7 +656,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 	if (!dp)
 		goto err_unlock;
 
-	input_vport = ovs_vport_rcu(dp, flow->key.phy.in_port);
+	input_vport = ovs_vport_rcu(dp, key.phy.in_port);
 	if (!input_vport)
 		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
 
@@ -670,20 +665,17 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
 
 	packet->dev = input_vport->dev;
 	OVS_CB(packet)->input_vport = input_vport;
-	sf_acts = rcu_dereference(flow->sf_acts);
 
 	local_bh_disable();
-	err = ovs_execute_actions(dp, packet, sf_acts, &flow->key);
+	err = ovs_execute_actions(dp, packet, acts, &key);
 	local_bh_enable();
 	rcu_read_unlock();
 
-	ovs_flow_free(flow, false);
+	ovs_nla_free_flow_actions(acts);
 	return err;
 
 err_unlock:
 	rcu_read_unlock();
-err_flow_free:
-	ovs_flow_free(flow, false);
 err_kfree_skb:
 	kfree_skb(packet);
 err:
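A side effect visible at the end of the diff above is the simpler error unwind: once there is no heap-allocated flow, the err_flow_free label disappears and every failure after the packet exists falls through a single cleanup label. Below is a minimal userspace sketch of that control-flow pattern; extract_key and copy_actions are hypothetical stand-ins, not the kernel APIs.

/*
 * Sketch of the unwind simplification (hypothetical helpers, not kernel code).
 * With the key on the stack, there is no intermediate "free the flow" step.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pkt_key { unsigned int in_port; };      /* stand-in for sw_flow_key */

static int extract_key(struct pkt_key *key) { key->in_port = 1; return 0; }
static int copy_actions(void **acts) { *acts = malloc(16); return *acts ? 0 : -ENOMEM; }

static int cmd_execute(void)
{
	struct pkt_key key;                    /* on the stack, as in the patch */
	void *acts = NULL;
	void *packet = malloc(64);
	int err = -ENOMEM;

	if (!packet)
		goto err;

	memset(&key, 0, sizeof(key));

	err = extract_key(&key);
	if (err)
		goto err_free_packet;          /* no separate flow object to free */

	err = copy_actions(&acts);
	if (err)
		goto err_free_packet;

	printf("executing actions for in_port %u\n", key.in_port);

	free(acts);
	free(packet);
	return 0;

err_free_packet:
	free(packet);
err:
	return err;
}

int main(void) { return cmd_execute() ? 1 : 0; }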
Use on stack sw_flow_key in ovs_packet_cmd_execute

Reason: As key functions in the slow path, ovs_packet_cmd_execute and
ovs_flow_cmd_new allocate transient memory for a sw_flow and free it at
the end of the function. This is inefficient in two respects:
1. A sw_flow_key is all the function actually needs.
2. The alloc/free pair involves kmem_cache operations, which add up when
   the system is under frequent slow-path load.

Existing code in ovs_flow_cmd_new/set/get already uses the stack to store
sw_flow_mask and sw_flow_key deliberately.

Performance benefit: ovs_packet_cmd_execute efficiency improved
- Avoids 2 calls to kmem_cache alloc
- Avoids memzero of 200 bytes
- 6% fewer instructions

Testing topology

            +-----+
      nic1--|     |--nic1
      nic2--|     |--nic2
VM1(16cpus) | ovs | VM2(16 cpus)
      nic3--|4cpus|--nic3
      nic4--|     |--nic4
            +-----+

2 threads on each vnic, with affinity set on the client side

netperf -H $peer -p $((port+$i)) -t UDP_RR  -l 60 -- -R 1 -r 8K,8K
netperf -H $peer -p $((port+$i)) -t TCP_RR  -l 60 -- -R 1 -r 120,240
netperf -H $peer -p $((port+$i)) -t TCP_CRR -l 60 -- -R 1 -r 120,240

Before the fix
Mode        Iterations  Variance  Average
UDP_RR      10          1.31%     46724
TCP_RR      10          6.26%     77188
TCP_CRR     10          0.10%     20505
UDP_STREAM  10          4.55%     19907
TCP_STREAM  10          9.93%     28942

After the fix
Mode        Iterations  Variance  Average
UDP_RR      10          1.51%     49097
TCP_RR      10          5.58%     78540
TCP_CRR     10          0.14%     20542
UDP_STREAM  10          11.17%    22532
TCP_STREAM  10          11.14%    28579

Signed-off-by: Eddy Tao <taoyuan_eddy@hotmail.com>
---
V1 -> V2: Further reduce memory usage by using sw_flow_key instead of
          sw_flow; revise the change description and provide data

 net/openvswitch/datapath.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)
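The commit message's central claim, dropping the kmem_cache alloc/free round trip by keeping only the key on the stack, can be reduced to a small userspace analogue. The struct and function names below are hypothetical stand-ins, not the kernel's types; this is only a sketch of the before/after pattern.

/*
 * Minimal userspace analogue of the change: instead of heap-allocating a
 * wrapper object just to use one embedded member, place that member on the
 * stack. Types and names are illustrative only.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pkt_key  { unsigned int priority; unsigned int in_port; }; /* ~ sw_flow_key */
struct pkt_flow { struct pkt_key key; void *actions; };           /* ~ sw_flow */

/* Before: allocate and zero the whole wrapper, use only ->key, then free it. */
static unsigned int via_heap(void)
{
	struct pkt_flow *flow = calloc(1, sizeof(*flow));
	unsigned int port;

	if (!flow)
		return 0;
	flow->key.in_port = 3;
	port = flow->key.in_port;
	free(flow);
	return port;
}

/* After: only the key lives on the stack; no allocator round trip at all. */
static unsigned int via_stack(void)
{
	struct pkt_key key;

	memset(&key, 0, sizeof(key));
	key.in_port = 3;
	return key.in_port;
}

int main(void)
{
	printf("heap path: in_port=%u, stack path: in_port=%u\n",
	       via_heap(), via_stack());
	return 0;
}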