diff mbox series

[RFC,net-next,2/3] net: flow_offload: add action stats api

Message ID 20220816092338.12613-3-ozsh@nvidia.com (mailing list archive)
State RFC
Delegated to: Netdev Maintainers
Series net: flow_offload: add support for per action hw stats

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 116882 this patch: 116882
netdev/cc_maintainers           warning  5 maintainers not CCed: davem@davemloft.net pabeni@redhat.com xiyou.wangcong@gmail.com edumazet@google.com kuba@kernel.org
netdev/build_clang              success  Errors and warnings before: 170 this patch: 170
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 125410 this patch: 125410
netdev/checkpatch               warning  WARNING: line length of 83 exceeds 80 columns; WARNING: line length of 84 exceeds 80 columns
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Oz Shlomo Aug. 16, 2022, 9:23 a.m. UTC
The current offload api provides visibility to flow hw stats.
This works as long as the flow stats values apply to all the flow's
actions. However, this assumption breaks when an action, such as police,
decides to drop or jump over other actions.

Extend the flow_offload api to return stat record per action instance.
Use the per action stats value, if available, when updating the action
instance counters.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
---
 include/net/flow_offload.h |  6 ++++++
 include/net/pkt_cls.h      | 26 ++++++++++++++++----------
 net/sched/cls_flower.c     |  4 +++-
 net/sched/cls_matchall.c   |  2 +-
 4 files changed, 26 insertions(+), 12 deletions(-)
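
The selection rule described in the commit message (use the per action
record if the driver supplied one, otherwise fall back to the flow-wide
record) can be sketched in user space as follows. This is an illustration
only, not the kernel code: the structures are trimmed stand-ins,
pick_action_stats() is a hypothetical helper name, and the num_actions
bounds check is an assumption of this sketch that the posted patch does
not contain.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel structures in the patch. */
struct flow_stats {
	uint64_t pkts;
	uint64_t bytes;
};

struct flow_act_stats {
	unsigned int num_actions;
	struct flow_stats stats[];	/* one record per action instance */
};

/*
 * Hypothetical helper: return the stats record to apply to action index i.
 * The per action record is used when the driver supplied one; otherwise
 * the flow-wide record applies.
 * The bounds check against num_actions is an addition of this sketch; the
 * posted patch indexes act_stats->stats[i] without it.
 */
const struct flow_stats *
pick_action_stats(const struct flow_stats *flow_stats,
		  const struct flow_act_stats *act_stats, unsigned int i)
{
	if (act_stats && i < act_stats->num_actions)
		return &act_stats->stats[i];
	return flow_stats;
}
```

With a police action that drops part of the traffic, the record at index 0
would carry the full flow count while later indexes carry only what the
police action let through.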

Comments

Edward Cree Aug. 16, 2022, 1:42 p.m. UTC | #1
On 16/08/2022 10:23, Oz Shlomo wrote:
> The current offload api provides visibility to flow hw stats.
> This works as long as the flow stats values apply to all the flow's
> actions. However, this assumption breaks when an action, such as police,
> decides to drop or jump over other actions.
> 
> Extend the flow_offload api to return stat record per action instance.
> Use the per action stats value, if available, when updating the action
> instance counters.
> 
> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>

When I worked on this before I tried with a similar "array of action
 stats" API [1], but after some discussion it seemed cleaner to have
 a "get stats for one single action" callback [2] which then could
 be called in a loop for filter dumps but also called singly for
 action dumps (RTM_GETACTION).  I recommend this approach to your
 consideration.

[1]: https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
[2]: https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
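
For readers comparing the two shapes, the "get stats for one single
action" idea might look roughly like the sketch below. All names here
(act_stats_get_t, dump_one_action, dump_filter_actions, toy_act_stats_get)
are hypothetical and are not taken from the referenced postings; the point
is only that one hook serves both the filter dump loop and a single action
dump.

```c
#include <stddef.h>
#include <stdint.h>

struct act_stats {
	uint64_t pkts;
	uint64_t bytes;
};

/* Hypothetical driver hook: fill in stats for one action, 0 on success. */
typedef int (*act_stats_get_t)(void *priv, unsigned long act_cookie,
			       struct act_stats *out);

/* Action dump (RTM_GETACTION-style): query a single action. */
int dump_one_action(act_stats_get_t get, void *priv,
		    unsigned long act_cookie, struct act_stats *out)
{
	return get(priv, act_cookie, out);
}

/* Filter dump: call the same hook once per action attached to the filter. */
int dump_filter_actions(act_stats_get_t get, void *priv,
			const unsigned long *act_cookies, size_t n,
			struct act_stats *out)
{
	for (size_t i = 0; i < n; i++) {
		int err = dump_one_action(get, priv, act_cookies[i], &out[i]);

		if (err)
			return err;
	}
	return 0;
}

/* Toy "driver": priv is a table of counters indexed directly by cookie. */
int toy_act_stats_get(void *priv, unsigned long act_cookie,
		      struct act_stats *out)
{
	const struct act_stats *table = priv;

	*out = table[act_cookie];
	return 0;
}
```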

> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 7da3337c4356..7dc8a62796b5 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
>  	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>  			 rtnl_held);
>  
> -	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
> +	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
> +
> +	kfree(cls_flower.act_stats);
>  }

Perhaps I'm being dumb, but I don't see this being allocated
 anywhere.  Is the driver supposed to be responsible for doing so?
 That seems inelegant.

-ed
Oz Shlomo Aug. 17, 2022, 2:43 p.m. UTC | #2
Hi Edward,

On 8/16/2022 4:42 PM, Edward Cree wrote:
> On 16/08/2022 10:23, Oz Shlomo wrote:
>> The current offload api provides visibility to flow hw stats.
>> This works as long as the flow stats values apply to all the flow's
>> actions. However, this assumption breaks when an action, such as police,
>> decides to drop or jump over other actions.
>>
>> Extend the flow_offload api to return stat record per action instance.
>> Use the per action stats value, if available, when updating the action
>> instance counters.
>>
>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
> 
> When I worked on this before I tried with a similar "array of action
>   stats" API [1], but after some discussion it seemed cleaner to have
>   a "get stats for one single action" callback [2] which then could
>   be called in a loop for filter dumps but also called singly for
>   action dumps (RTM_GETACTION).  I recommend this approach to your
>   consideration.
> 
> [1]: https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
> [2]: https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
> 

The recent hw_actions infrastructure provides the platform for updating
stats per action.
However, the platform does introduce performance penalties, as it invokes
a driver api call per action (compared to the current single api
call). It also requires the driver to look up the specific action counter,
which requires more processing than the current flow cookie lookup.
Furthermore, the current single stats per filter (rather than per
action) design only breaks when using branching actions (e.g. police),
which probably applies to a small subset of the rules.

This series proposes two apis:
1. High performance api for filter dump update (ovs triggers a dump per 
rule per second) - extending the current api providing the driver an 
option to update stats per action, if required.
2. Re-use the hw_actions api for tc action list update (see patch #3)

>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index 7da3337c4356..7dc8a62796b5 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
>>   	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>>   			 rtnl_held);
>>   
>> -	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
>> +	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
>> +
>> +	kfree(cls_flower.act_stats);
>>   }
> 
> Perhaps I'm being dumb, but I don't see this being allocated
>   anywhere.  Is the driver supposed to be responsible for doing so?
>   That seems inelegant.

You are right, the intention is for the driver to allocate the array and 
for the calling method to free it.
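
A user-space sketch of that ownership split, for illustration only. The
structures are trimmed stand-ins, toy_driver_stats_cb() and
caller_last_action_pkts() are hypothetical names, and the packet counts
are made up to mimic a police action that passes 4 of 10 packets.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified user-space stand-ins for the structures in the patch. */
struct flow_stats {
	uint64_t pkts;
	uint64_t bytes;
};

struct flow_act_stats {
	unsigned int num_actions;
	struct flow_stats stats[];
};

struct flow_cls_offload {
	struct flow_stats stats;
	struct flow_act_stats *act_stats; /* stays NULL unless the driver sets it */
};

/*
 * Hypothetical driver callback. Action 0 plays the role of a police action
 * that lets 4 of 10 packets through, so later actions see fewer packets.
 */
int toy_driver_stats_cb(struct flow_cls_offload *cls, unsigned int nr_actions,
			int per_action)
{
	struct flow_act_stats *as;
	unsigned int i;

	cls->stats.pkts = 10;
	if (!per_action || nr_actions == 0)
		return 0;	/* common case: no allocation at all */

	as = calloc(1, sizeof(*as) + nr_actions * sizeof(as->stats[0]));
	if (!as)
		return -1;
	as->num_actions = nr_actions;
	for (i = 0; i < nr_actions; i++)
		as->stats[i].pkts = (i == 0) ? 10 : 4;
	cls->act_stats = as;
	return 0;
}

/* Caller side: use the result, then free whatever the driver allocated. */
uint64_t caller_last_action_pkts(unsigned int nr_actions, int per_action)
{
	struct flow_cls_offload cls = { 0 };
	uint64_t pkts;

	if (toy_driver_stats_cb(&cls, nr_actions, per_action))
		return 0;
	pkts = cls.act_stats ? cls.act_stats->stats[nr_actions - 1].pkts
			     : cls.stats.pkts;
	free(cls.act_stats);	/* free(NULL) is a no-op, as is kfree(NULL) */
	return pkts;
}
```

The caller never allocates, so a driver that has nothing per action to
report costs no alloc/free on the stats path.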

While the proposed design is indeed inelegant, it is efficient compared
to the other possible alternatives:
1. Dynamically allocated stats array - this will introduce alloc/free
calls per stats query (1/filter/second), even if per action stats are
not required.
2. Static action stats array - this has size issues, as this api is 
shared for both tc and nft. Perhaps we can use a hard coded size and 
return an error if the actual counter array size is larger.
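
Alternative 2 could be sketched roughly as below. The limit
FLOW_ACT_STATS_MAX, the struct name flow_act_stats_fixed and the helper
flow_act_stats_reserve() are all hypothetical, and -E2BIG is just one
plausible error code; none of them appear in the series.

```c
#include <errno.h>
#include <stdint.h>

#define FLOW_ACT_STATS_MAX 32	/* hypothetical hard-coded limit */

struct flow_stats {
	uint64_t pkts;
	uint64_t bytes;
};

/* Fixed-size variant: no allocation, but every user pays for the array. */
struct flow_act_stats_fixed {
	unsigned int num_actions;
	struct flow_stats stats[FLOW_ACT_STATS_MAX];
};

/*
 * Hypothetical helper: claim nr_actions slots in the static array, or
 * report an error when the rule has more actions than the array can hold.
 */
int flow_act_stats_reserve(struct flow_act_stats_fixed *as,
			   unsigned int nr_actions)
{
	if (nr_actions > FLOW_ACT_STATS_MAX)
		return -E2BIG;
	as->num_actions = nr_actions;
	return 0;
}
```

The size concern is visible in the sketch: the structure carries
FLOW_ACT_STATS_MAX records whether or not per action stats are used.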


> 
> -ed
Oz Shlomo Sept. 28, 2022, 3:19 p.m. UTC | #3
Hi Edward,

On 8/17/2022 5:43 PM, Oz Shlomo wrote:
> Hi Edward,
> 
> On 8/16/2022 4:42 PM, Edward Cree wrote:
>> On 16/08/2022 10:23, Oz Shlomo wrote:
>>> The current offload api provides visibility to flow hw stats.
>>> This works as long as the flow stats values apply to all the flow's
>>> actions. However, this assumption breaks when an action, such as police,
>>> decides to drop or jump over other actions.
>>>
>>> Extend the flow_offload api to return stat record per action instance.
>>> Use the per action stats value, if available, when updating the action
>>> instance counters.
>>>
>>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
>>
>> When I worked on this before I tried with a similar "array of action
>>   stats" API [1], but after some discussion it seemed cleaner to have
>>   a "get stats for one single action" callback [2] which then could
>>   be called in a loop for filter dumps but also called singly for
>>   action dumps (RTM_GETACTION).  I recommend this approach to your
>>   consideration.
>>
>> [1]: 
>> https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/ 
>>
>> [2]: 
>> https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/ 
>>
>>
> 
> The recent hw_actions infrastructure provides the platform for updating 
> stats per action.
> However, the platform does introduce performance penalties, as it invokes
> a driver api call per action (compared to the current single api
> call). It also requires the driver to look up the specific action counter,
> which requires more processing than the current flow cookie lookup.
> Furthermore, the current single stats per filter (rather than per
> action) design only breaks when using branching actions (e.g. police),
> which probably applies to a small subset of the rules.
> 
> This series proposes two apis:
> 1. High performance api for filter dump update (ovs triggers a dump per 
> rule per second) - extending the current api providing the driver an 
> option to update stats per action, if required.
> 2. Re-use the hw_actions api for tc action list update (see patch #3)
> 

I tried implementing the per action stats using the hw_action api.
The api proved itself well.
However, it is extremely inefficient to allocate a counter per action in
hardware. As such, the driver is required to look up the action's counter
(hashtable lookup) and also update all the other action stats hanging on
this hw counter (requiring list iteration and locks).
This introduces quite a complex design with performance overheads.
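
To make the cost concrete, here is a toy user-space model of several
actions sharing one hardware counter. Every name in it is hypothetical,
the linear search merely stands in for the hashtable lookup, and the
locking mentioned above is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_action {
	unsigned long cookie;
	uint64_t pkts;				/* software view of this action */
	struct toy_action *next_on_counter;	/* other actions on same counter */
};

struct toy_hw_counter {
	uint64_t hw_pkts;	/* value last read from hardware */
	uint64_t reported_pkts;	/* portion already pushed to the actions */
	struct toy_action *actions;
};

struct toy_map_entry {
	unsigned long cookie;
	struct toy_hw_counter *counter;
};

/* Stand-in for the hashtable lookup: action cookie to shared counter. */
struct toy_hw_counter *toy_lookup(const struct toy_map_entry *map, size_t n,
				  unsigned long cookie)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].cookie == cookie)
			return map[i].counter;
	return NULL;
}

/*
 * Per action stats query: after finding the counter, the unreported delta
 * has to be pushed to every action hanging on it, not only the one asked
 * for, otherwise the others would miss those packets.
 */
int toy_action_stats_update(const struct toy_map_entry *map, size_t n,
			    unsigned long cookie)
{
	struct toy_hw_counter *c = toy_lookup(map, n, cookie);
	uint64_t delta;

	if (!c)
		return -1;
	delta = c->hw_pkts - c->reported_pkts;
	for (struct toy_action *a = c->actions; a; a = a->next_on_counter)
		a->pkts += delta;
	c->reported_pkts = c->hw_pkts;
	return 0;
}
```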

Stats update is performance sensitive as ovs queries the filters' stats
every second.
Supporting the tc action stats api will degrade the performance of existing
use cases.
Extending the existing flow_offload api will preserve the current
functionality (single flow stat which applies to all the actions) and
performance while providing the ability to specify per action stats for
use cases involving branching actions.
In the future we could add driver support for returning per action
stats using the current hw_action api.
WDYT?

>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>> index 7da3337c4356..7dc8a62796b5 100644
>>> --- a/net/sched/cls_flower.c
>>> +++ b/net/sched/cls_flower.c
>>> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto 
>>> *tp, struct cls_fl_filter *f,
>>>       tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>>>                rtnl_held);
>>> -    tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
>>> +    tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, 
>>> cls_flower.act_stats);
>>> +
>>> +    kfree(cls_flower.act_stats);
>>>   }
>>
>> Perhaps I'm being dumb, but I don't see this being allocated
>>   anywhere.  Is the driver supposed to be responsible for doing so?
>>   That seems inelegant.
> 
> You are right, the intention is for the driver to allocate the array and 
> for the calling method to free it.
> 
> While the proposed design is indeed inelegant, it is efficient compared
> to the other possible alternatives:
> 1. Dynamically allocated stats array - this will introduce alloc/free
> calls per stats query (1/filter/second), even if per action stats are
> not required.
> 2. Static action stats array - this has size issues, as this api is 
> shared for both tc and nft. Perhaps we can use a hard coded size and 
> return an error if the actual counter array size is larger.
> 
> 

I realized that we cannot assume a 1:1 mapping between a tc action and its
corresponding offload action, as a tc pedit action can create an array of
flow offload actions.
I will fix this in v2.
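
The thread does not show what the v2 fix looks like. Purely as an
illustration of the indexing problem, one way to translate a tc action
index into an offload entry index is a prefix sum over the number of
offload entries each tc action expands to. The helper name
map_tc_to_offload() and its parameters are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: entries_per_action[i] is the number of flow offload
 * entries that tc action i expands to (e.g. more than one for pedit).
 * Fills first_idx[i] with the index of the first offload entry belonging
 * to tc action i and returns the total number of offload entries.
 */
unsigned int map_tc_to_offload(const unsigned int *entries_per_action,
			       unsigned int nr_actions,
			       unsigned int *first_idx)
{
	unsigned int next = 0;

	for (unsigned int i = 0; i < nr_actions; i++) {
		first_idx[i] = next;
		next += entries_per_action[i];
	}
	return next;
}
```

For a rule with police, a pedit that expands to three offload entries, and
mirred, the tc indexes 0, 1, 2 map to offload indexes 0, 1, 4, which is why
indexing the per action stats array directly with the tc action index is
not safe.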

>>
>> -ed

Patch

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 2a9a9e42e7fd..5e1a34a76772 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -436,6 +436,11 @@  struct flow_stats {
 	bool used_hw_stats_valid;
 };
 
+struct flow_act_stats {
+	unsigned int		num_actions;
+	struct flow_stats	stats[];
+};
+
 static inline void flow_stats_update(struct flow_stats *flow_stats,
 				     u64 bytes, u64 pkts,
 				     u64 drops, u64 lastused,
@@ -583,6 +588,7 @@  struct flow_cls_offload {
 	struct flow_rule *rule;
 	struct flow_stats stats;
 	u32 classid;
+	struct flow_act_stats *act_stats;
 };
 
 enum offload_act_command  {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 27eac9e73c61..f5e5582aef17 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -269,24 +269,30 @@  static inline void tcf_exts_put_net(struct tcf_exts *exts)
 
 static inline void
 tcf_exts_hw_stats_update(const struct tcf_exts *exts,
-			 struct flow_stats *stats)
+			 struct flow_stats *flow_stats,
+			 struct flow_act_stats *act_stats)
 {
 #ifdef CONFIG_NET_CLS_ACT
 	int i;
 
 	for (i = 0; i < exts->nr_actions; i++) {
 		struct tc_action *a = exts->actions[i];
+		struct flow_stats *stats = flow_stats;
 
 		/* if stats from hw, just skip */
-		if (tcf_action_update_hw_stats(a)) {
-			preempt_disable();
-			tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops,
-						stats->lastused, true);
-			preempt_enable();
-
-			a->used_hw_stats = stats->used_hw_stats;
-			a->used_hw_stats_valid = stats->used_hw_stats_valid;
-		}
+		if (!tcf_action_update_hw_stats(a))
+			continue;
+
+		if (act_stats)
+			stats = &act_stats->stats[i];
+
+		preempt_disable();
+		tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops,
+					stats->lastused, true);
+		preempt_enable();
+
+		a->used_hw_stats = stats->used_hw_stats;
+		a->used_hw_stats_valid = stats->used_hw_stats_valid;
 	}
 #endif
 }
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 7da3337c4356..7dc8a62796b5 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -499,7 +499,9 @@  static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
 	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
 			 rtnl_held);
 
-	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
+	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
+
+	kfree(cls_flower.act_stats);
 }
 
 static void __fl_put(struct cls_fl_filter *f)
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index b5520a9c35e6..0ba4392b93de 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -332,7 +332,7 @@  static void mall_stats_hw_filter(struct tcf_proto *tp,
 
 	tc_setup_cb_call(block, TC_SETUP_CLSMATCHALL, &cls_mall, false, true);
 
-	tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats);
+	tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats, NULL);
 }
 
 static int mall_dump(struct net *net, struct tcf_proto *tp, void *fh,