diff mbox series

[net-next,v5,6/7] net/sched: act_ct: offload UDP NEW connections

Message ID 20230127183845.597861-7-vladbu@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Allow offloading of UDP NEW connections via act_ct | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit fail Errors and warnings before: 0 this patch: 1
netdev/cc_maintainers warning 5 maintainers not CCed: nathan@kernel.org edumazet@google.com llvm@lists.linux.dev ndesaulniers@google.com trix@redhat.com
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 0 this patch: 1
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 98 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vlad Buslov Jan. 27, 2023, 6:38 p.m. UTC
Modify the offload algorithm of UDP connections to the following:

- Offload NEW connection as unidirectional.

- When connection state changes to ESTABLISHED also update the hardware
flow. However, in order to prevent act_ct from spamming offload add wq for
every packet coming in reply direction in this state verify whether
connection has already been updated to ESTABLISHED in the drivers. If that
it the case, then skip flow_table and let conntrack handle such packets
which will also allow conntrack to potentially promote the connection to
ASSURED.

- When connection state changes to ASSURED set the flow_table flow
NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
the reply direction.

All other protocols have their offload algorithm preserved and are always
offloaded as bidirectional.

Note that this change tries to minimize the load on flow_table add
workqueue. First, it tracks the last ctinfo that was offloaded by using new
flow 'ext_data' field and doesn't schedule the refresh for reply direction
packets when the offloads have already been updated with current ctinfo.
Second, when 'add' task executes on workqueue it always update the offload
with current flow state (by checking 'bidirectional' flow flag and
obtaining actual ctinfo/cookie through meta action instead of caching any
of these from the moment of scheduling the 'add' work) preventing the need
from scheduling more updates if state changed concurrently while the 'add'
work was pending on workqueue.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
---

Notes:
    Changes V4 -> V5:
    
    - Make clang happy.
    
    Changes V3 -> V4:
    
    - Refactor the patch to leverage the refresh code and new flow 'ext_data'
    field in order to change the offload state instead of relying on async gc
    update.

 net/sched/act_ct.c | 51 +++++++++++++++++++++++++++++++++++-----------
 1 file changed, 39 insertions(+), 12 deletions(-)

Comments

Pablo Neira Ayuso Jan. 28, 2023, 3:26 p.m. UTC | #1
Hi Vlad,

On Fri, Jan 27, 2023 at 07:38:44PM +0100, Vlad Buslov wrote:
> Modify the offload algorithm of UDP connections to the following:
> 
> - Offload NEW connection as unidirectional.
> 
> - When connection state changes to ESTABLISHED also update the hardware
> flow. However, in order to prevent act_ct from spamming offload add wq for
> every packet coming in reply direction in this state verify whether
> connection has already been updated to ESTABLISHED in the drivers. If that
> it the case, then skip flow_table and let conntrack handle such packets
> which will also allow conntrack to potentially promote the connection to
> ASSURED.
> 
> - When connection state changes to ASSURED set the flow_table flow
> NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
> the reply direction.
> 
> All other protocols have their offload algorithm preserved and are always
> offloaded as bidirectional.
> 
> Note that this change tries to minimize the load on flow_table add
> workqueue. First, it tracks the last ctinfo that was offloaded by using new
> flow 'ext_data' field and doesn't schedule the refresh for reply direction
> packets when the offloads have already been updated with current ctinfo.
> Second, when 'add' task executes on workqueue it always update the offload
> with current flow state (by checking 'bidirectional' flow flag and
> obtaining actual ctinfo/cookie through meta action instead of caching any
> of these from the moment of scheduling the 'add' work) preventing the need
> from scheduling more updates if state changed concurrently while the 'add'
> work was pending on workqueue.

Could you use a flag to achieve what you need instead of this ext_data
field? Better this ext_data and the flag, I prefer the flags.

Thanks
Vlad Buslov Jan. 28, 2023, 3:31 p.m. UTC | #2
On Sat 28 Jan 2023 at 16:26, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Hi Vlad,
>
> On Fri, Jan 27, 2023 at 07:38:44PM +0100, Vlad Buslov wrote:
>> Modify the offload algorithm of UDP connections to the following:
>> 
>> - Offload NEW connection as unidirectional.
>> 
>> - When connection state changes to ESTABLISHED also update the hardware
>> flow. However, in order to prevent act_ct from spamming offload add wq for
>> every packet coming in reply direction in this state verify whether
>> connection has already been updated to ESTABLISHED in the drivers. If that
>> it the case, then skip flow_table and let conntrack handle such packets
>> which will also allow conntrack to potentially promote the connection to
>> ASSURED.
>> 
>> - When connection state changes to ASSURED set the flow_table flow
>> NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
>> the reply direction.
>> 
>> All other protocols have their offload algorithm preserved and are always
>> offloaded as bidirectional.
>> 
>> Note that this change tries to minimize the load on flow_table add
>> workqueue. First, it tracks the last ctinfo that was offloaded by using new
>> flow 'ext_data' field and doesn't schedule the refresh for reply direction
>> packets when the offloads have already been updated with current ctinfo.
>> Second, when 'add' task executes on workqueue it always update the offload
>> with current flow state (by checking 'bidirectional' flow flag and
>> obtaining actual ctinfo/cookie through meta action instead of caching any
>> of these from the moment of scheduling the 'add' work) preventing the need
>> from scheduling more updates if state changed concurrently while the 'add'
>> work was pending on workqueue.
>
> Could you use a flag to achieve what you need instead of this ext_data
> field? Better this ext_data and the flag, I prefer the flags.

Sure, np. Do you prefer the functionality to be offloaded to gc (as in
earlier versions of this series) or leverage 'refresh' code as in
versions 4-5?
Pablo Neira Ayuso Jan. 28, 2023, 7:09 p.m. UTC | #3
On Sat, Jan 28, 2023 at 05:31:40PM +0200, Vlad Buslov wrote:
> 
> On Sat 28 Jan 2023 at 16:26, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Hi Vlad,
> >
> > On Fri, Jan 27, 2023 at 07:38:44PM +0100, Vlad Buslov wrote:
> >> Modify the offload algorithm of UDP connections to the following:
> >> 
> >> - Offload NEW connection as unidirectional.
> >> 
> >> - When connection state changes to ESTABLISHED also update the hardware
> >> flow. However, in order to prevent act_ct from spamming offload add wq for
> >> every packet coming in reply direction in this state verify whether
> >> connection has already been updated to ESTABLISHED in the drivers. If that
> >> it the case, then skip flow_table and let conntrack handle such packets
> >> which will also allow conntrack to potentially promote the connection to
> >> ASSURED.
> >> 
> >> - When connection state changes to ASSURED set the flow_table flow
> >> NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
> >> the reply direction.
> >> 
> >> All other protocols have their offload algorithm preserved and are always
> >> offloaded as bidirectional.
> >> 
> >> Note that this change tries to minimize the load on flow_table add
> >> workqueue. First, it tracks the last ctinfo that was offloaded by using new
> >> flow 'ext_data' field and doesn't schedule the refresh for reply direction
> >> packets when the offloads have already been updated with current ctinfo.
> >> Second, when 'add' task executes on workqueue it always update the offload
> >> with current flow state (by checking 'bidirectional' flow flag and
> >> obtaining actual ctinfo/cookie through meta action instead of caching any
> >> of these from the moment of scheduling the 'add' work) preventing the need
> >> from scheduling more updates if state changed concurrently while the 'add'
> >> work was pending on workqueue.
> >
> > Could you use a flag to achieve what you need instead of this ext_data
> > field? Better this ext_data and the flag, I prefer the flags.
> 
> Sure, np. Do you prefer the functionality to be offloaded to gc (as in
> earlier versions of this series) or leverage 'refresh' code as in
> versions 4-5?

No, I prefer generic gc/refresh mechanism is not used for this.

What I mean is: could you replace this new ->ext_data generic pointer
by a flag to annotate what you need? Between this generic pointer and
a flag, I prefer a flag.
Vlad Buslov Jan. 28, 2023, 7:28 p.m. UTC | #4
On Sat 28 Jan 2023 at 20:09, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Sat, Jan 28, 2023 at 05:31:40PM +0200, Vlad Buslov wrote:
>> 
>> On Sat 28 Jan 2023 at 16:26, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> > Hi Vlad,
>> >
>> > On Fri, Jan 27, 2023 at 07:38:44PM +0100, Vlad Buslov wrote:
>> >> Modify the offload algorithm of UDP connections to the following:
>> >> 
>> >> - Offload NEW connection as unidirectional.
>> >> 
>> >> - When connection state changes to ESTABLISHED also update the hardware
>> >> flow. However, in order to prevent act_ct from spamming offload add wq for
>> >> every packet coming in reply direction in this state verify whether
>> >> connection has already been updated to ESTABLISHED in the drivers. If that
>> >> it the case, then skip flow_table and let conntrack handle such packets
>> >> which will also allow conntrack to potentially promote the connection to
>> >> ASSURED.
>> >> 
>> >> - When connection state changes to ASSURED set the flow_table flow
>> >> NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
>> >> the reply direction.
>> >> 
>> >> All other protocols have their offload algorithm preserved and are always
>> >> offloaded as bidirectional.
>> >> 
>> >> Note that this change tries to minimize the load on flow_table add
>> >> workqueue. First, it tracks the last ctinfo that was offloaded by using new
>> >> flow 'ext_data' field and doesn't schedule the refresh for reply direction
>> >> packets when the offloads have already been updated with current ctinfo.
>> >> Second, when 'add' task executes on workqueue it always update the offload
>> >> with current flow state (by checking 'bidirectional' flow flag and
>> >> obtaining actual ctinfo/cookie through meta action instead of caching any
>> >> of these from the moment of scheduling the 'add' work) preventing the need
>> >> from scheduling more updates if state changed concurrently while the 'add'
>> >> work was pending on workqueue.
>> >
>> > Could you use a flag to achieve what you need instead of this ext_data
>> > field? Better this ext_data and the flag, I prefer the flags.
>> 
>> Sure, np. Do you prefer the functionality to be offloaded to gc (as in
>> earlier versions of this series) or leverage 'refresh' code as in
>> versions 4-5?
>
> No, I prefer generic gc/refresh mechanism is not used for this.
>
> What I mean is: could you replace this new ->ext_data generic pointer
> by a flag to annotate what you need?

Yes, will do.

> Between this generic pointer and a flag, I prefer a flag.

Got it.
diff mbox series

Patch

diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 2b81a7898662..5107f4149474 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -401,7 +401,7 @@  static void tcf_ct_flow_tc_ifidx(struct flow_offload *entry,
 
 static void tcf_ct_flow_table_add(struct tcf_ct_flow_table *ct_ft,
 				  struct nf_conn *ct,
-				  bool tcp)
+				  bool tcp, bool bidirectional)
 {
 	struct nf_conn_act_ct_ext *act_ct_ext;
 	struct flow_offload *entry;
@@ -420,6 +420,8 @@  static void tcf_ct_flow_table_add(struct tcf_ct_flow_table *ct_ft,
 		ct->proto.tcp.seen[0].flags |= IP_CT_TCP_FLAG_BE_LIBERAL;
 		ct->proto.tcp.seen[1].flags |= IP_CT_TCP_FLAG_BE_LIBERAL;
 	}
+	if (bidirectional)
+		__set_bit(NF_FLOW_HW_BIDIRECTIONAL, &entry->flags);
 
 	act_ct_ext = nf_conn_act_ct_ext_find(ct);
 	if (act_ct_ext) {
@@ -443,26 +445,34 @@  static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
 					   struct nf_conn *ct,
 					   enum ip_conntrack_info ctinfo)
 {
-	bool tcp = false;
-
-	if ((ctinfo != IP_CT_ESTABLISHED && ctinfo != IP_CT_ESTABLISHED_REPLY) ||
-	    !test_bit(IPS_ASSURED_BIT, &ct->status))
-		return;
+	bool tcp = false, bidirectional = true;
 
 	switch (nf_ct_protonum(ct)) {
 	case IPPROTO_TCP:
-		tcp = true;
-		if (ct->proto.tcp.state != TCP_CONNTRACK_ESTABLISHED)
+		if ((ctinfo != IP_CT_ESTABLISHED &&
+		     ctinfo != IP_CT_ESTABLISHED_REPLY) ||
+		    !test_bit(IPS_ASSURED_BIT, &ct->status) ||
+		    ct->proto.tcp.state != TCP_CONNTRACK_ESTABLISHED)
 			return;
+
+		tcp = true;
 		break;
 	case IPPROTO_UDP:
+		if (!nf_ct_is_confirmed(ct))
+			return;
+		if (!test_bit(IPS_ASSURED_BIT, &ct->status))
+			bidirectional = false;
 		break;
 #ifdef CONFIG_NF_CT_PROTO_GRE
 	case IPPROTO_GRE: {
 		struct nf_conntrack_tuple *tuple;
 
-		if (ct->status & IPS_NAT_MASK)
+		if ((ctinfo != IP_CT_ESTABLISHED &&
+		     ctinfo != IP_CT_ESTABLISHED_REPLY) ||
+		    !test_bit(IPS_ASSURED_BIT, &ct->status) ||
+		    ct->status & IPS_NAT_MASK)
 			return;
+
 		tuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
 		/* No support for GRE v1 */
 		if (tuple->src.u.gre.key || tuple->dst.u.gre.key)
@@ -478,7 +488,7 @@  static void tcf_ct_flow_table_process_conn(struct tcf_ct_flow_table *ct_ft,
 	    ct->status & IPS_SEQ_ADJUST)
 		return;
 
-	tcf_ct_flow_table_add(ct_ft, ct, tcp);
+	tcf_ct_flow_table_add(ct_ft, ct, tcp, bidirectional);
 }
 
 static bool
@@ -657,13 +667,30 @@  static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
 	flow = container_of(tuplehash, struct flow_offload, tuplehash[dir]);
 	ct = flow->ct;
 
+	if (dir == FLOW_OFFLOAD_DIR_REPLY &&
+	    !test_bit(NF_FLOW_HW_BIDIRECTIONAL, &flow->flags)) {
+		/* Only offload reply direction after connection became
+		 * assured.
+		 */
+		if (test_bit(IPS_ASSURED_BIT, &ct->status))
+			set_bit(NF_FLOW_HW_BIDIRECTIONAL, &flow->flags);
+		else if (READ_ONCE(flow->ext_data) == IP_CT_ESTABLISHED)
+			/* If flow_table flow has already been updated to the
+			 * established state, then don't refresh.
+			 */
+			return false;
+	}
+
 	if (tcph && (unlikely(tcph->fin || tcph->rst))) {
 		flow_offload_teardown(nf_ft, flow);
 		return false;
 	}
 
-	ctinfo = dir == FLOW_OFFLOAD_DIR_ORIGINAL ? IP_CT_ESTABLISHED :
-						    IP_CT_ESTABLISHED_REPLY;
+	if (dir == FLOW_OFFLOAD_DIR_ORIGINAL)
+		ctinfo = test_bit(IPS_SEEN_REPLY_BIT, &ct->status) ?
+			IP_CT_ESTABLISHED : IP_CT_NEW;
+	else
+		ctinfo = IP_CT_ESTABLISHED_REPLY;
 
 	flow_offload_refresh(nf_ft, flow);
 	nf_conntrack_get(&ct->ct_general);