[RFC,net-next,2/3] net: flow_offload: add action stats api

Message ID	20220816092338.12613-3-ozsh@nvidia.com (mailing list archive)
State	RFC
Delegated to:	Netdev Maintainers
Headers	Return-Path: <netdev-owner@kernel.org>
Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 12.22.5.235 as permitted sender) receiver=protection.outlook.com; client-ip=12.22.5.235; helo=mail.nvidia.com; pr=C
From: Oz Shlomo <ozsh@nvidia.com>
To: <netdev@vger.kernel.org>
CC: Jiri Pirko <jiri@nvidia.com>, Jamal Hadi Salim <jhs@mojatatu.com>, "Simon Horman" <simon.horman@corigine.com>, Baowen Zheng <baowen.zheng@corigine.com>, Vlad Buslov <vladbu@nvidia.com>, Ido Schimmel <idosch@nvidia.com>, Roi Dayan <roid@nvidia.com>, Oz Shlomo <ozsh@nvidia.com>
Subject: [RFC net-next 2/3] net: flow_offload: add action stats api
Date: Tue, 16 Aug 2022 12:23:37 +0300
Message-ID: <20220816092338.12613-3-ozsh@nvidia.com>
In-Reply-To: <20220816092338.12613-1-ozsh@nvidia.com>
References: <20220816092338.12613-1-ozsh@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Precedence: bulk
Series	net: flow_offload: add support for per action hw stats
[RFC,net-next,0/3] net: flow_offload: add support for per action hw stats
[RFC,net-next,1/3] net: sched: Pass flow_stats instead of multiple stats args
[RFC,net-next,2/3] net: flow_offload: add action stats api
[RFC,net-next,3/3] net/sched: act_api: update hw stats for tc action list

Context	Check	Description
netdev/tree_selection	success	Clearly marked for net-next
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/subject_prefix	success	Link
netdev/cover_letter	success	Series has a cover letter
netdev/patch_count	success	Link
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 116882 this patch: 116882
netdev/cc_maintainers	warning	5 maintainers not CCed: davem@davemloft.net pabeni@redhat.com xiyou.wangcong@gmail.com edumazet@google.com kuba@kernel.org
netdev/build_clang	success	Errors and warnings before: 170 this patch: 170
netdev/module_param	success	Was 0 now: 0
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 125410 this patch: 125410
netdev/checkpatch	warning	WARNING: line length of 83 exceeds 80 columns WARNING: line length of 84 exceeds 80 columns
netdev/kdoc	success	Errors and warnings before: 0 this patch: 0
netdev/source_inline	success	Was 0 now: 0

Commit Message

Oz Shlomo Aug. 16, 2022, 9:23 a.m. UTC

The current offload api provides visibility to flow hw stats.
This works as long as the flow stats values apply to all the flow's
actions. However, this assumption breaks when an action, such as police,
decides to drop or jump over other actions.

Extend the flow_offload api to return stat record per action instance.
Use the per action stats value, if available, when updating the action
instance counters.

Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
---
 include/net/flow_offload.h |  6 ++++++
 include/net/pkt_cls.h      | 26 ++++++++++++++++----------
 net/sched/cls_flower.c     |  4 +++-
 net/sched/cls_matchall.c   |  2 +-
 4 files changed, 26 insertions(+), 12 deletions(-)

Comments

Edward Cree Aug. 16, 2022, 1:42 p.m. UTC | #1

On 16/08/2022 10:23, Oz Shlomo wrote:
> The current offload api provides visibility to flow hw stats.
> This works as long as the flow stats values apply to all the flow's
> actions. However, this assumption breaks when an action, such as police,
> decides to drop or jump over other actions.
> 
> Extend the flow_offload api to return stat record per action instance.
> Use the per action stats value, if available, when updating the action
> instance counters.
> 
> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>

When I worked on this before I tried with a similar "array of action
 stats" API [1], but after some discussion it seemed cleaner to have
 a "get stats for one single action" callback [2] which then could
 be called in a loop for filter dumps but also called singly for
 action dumps (RTM_GETACTION).  I recommend this approach to your
 consideration.

[1]: https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
[2]: https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
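
The "get stats for one single action" shape can be modelled in userspace C. This is only a sketch under assumptions: `action_stats_cb`, `filter_dump_stats` and the `model_*` types are hypothetical names invented for illustration, not kernel or driver interfaces. It shows how one per-action callback could serve both a filter dump (called in a loop) and a single-action dump such as RTM_GETACTION (called once).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct flow_stats (hypothetical). */
struct model_stats {
	uint64_t pkts;
	uint64_t bytes;
};

/* Hypothetical per-action callback: fill *out for one action cookie.
 * Returns 0 on success, -1 if the driver does not know the cookie. */
typedef int (*action_stats_cb)(void *priv, unsigned long cookie,
			       struct model_stats *out);

/* Filter dump: call the same callback once per action of the filter.
 * Returns the number of actions whose stats were filled in. */
static size_t filter_dump_stats(action_stats_cb cb, void *priv,
				const unsigned long *cookies, size_t nr,
				struct model_stats *out)
{
	size_t i, filled = 0;

	assert(cb != NULL);
	for (i = 0; i < nr; i++) {
		if (cb(priv, cookies[i], &out[i]) == 0)
			filled++;
	}
	return filled;
}

/* Toy "driver": counters kept in a small table, looked up by cookie. */
struct model_counter {
	unsigned long cookie;
	struct model_stats stats;
};

struct model_driver {
	const struct model_counter *counters;
	size_t nr;
};

static int model_driver_get_stats(void *priv, unsigned long cookie,
				  struct model_stats *out)
{
	const struct model_driver *drv = priv;
	size_t i;

	for (i = 0; i < drv->nr; i++) {
		if (drv->counters[i].cookie == cookie) {
			*out = drv->counters[i].stats;
			return 0;
		}
	}
	return -1;
}
```

A single-action dump would call `model_driver_get_stats()` directly with one cookie, while a filter dump goes through `filter_dump_stats()`. Note that every call pays for its own lookup in the toy driver.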

> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 7da3337c4356..7dc8a62796b5 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
>  	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>  			 rtnl_held);
>  
> -	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
> +	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
> +
> +	kfree(cls_flower.act_stats);
>  }

Perhaps I'm being dumb, but I don't see this being allocated
 anywhere.  Is the driver supposed to be responsible for doing so?
 That seems inelegant.

-ed

Oz Shlomo Aug. 17, 2022, 2:43 p.m. UTC | #2

Hi Edward,

On 8/16/2022 4:42 PM, Edward Cree wrote:
> On 16/08/2022 10:23, Oz Shlomo wrote:
>> The current offload api provides visibility to flow hw stats.
>> This works as long as the flow stats values apply to all the flow's
>> actions. However, this assumption breaks when an action, such as police,
>> decides to drop or jump over other actions.
>>
>> Extend the flow_offload api to return stat record per action instance.
>> Use the per action stats value, if available, when updating the action
>> instance counters.
>>
>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
> 
> When I worked on this before I tried with a similar "array of action
>   stats" API [1], but after some discussion it seemed cleaner to have
>   a "get stats for one single action" callback [2] which then could
>   be called in a loop for filter dumps but also called singly for
>   action dumps (RTM_GETACTION).  I recommend this approach to your
>   consideration.
> 
> [1]: https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
> [2]: https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
> 

The recent hw_actions infrastructure provides the platform for updating 
stats per action.
However, the platform does introduce performance penalties, as it invokes 
a driver api method call per action (compared to the current single api 
call). It also requires the driver to look up the specific action counter, 
which requires more processing than the current flow cookie lookup.
Furthermore, the current single stats per filter (rather than per 
action) design only breaks when using branching actions (e.g. police), 
which probably applies to a small subset of the rules.

This series proposes two apis:
1. High performance api for filter dump update (ovs triggers a dump per 
rule per second) - extending the current api to give the driver an 
option to report stats per action, if required.
2. Re-use the hw_actions api for tc action list update (see patch #3)
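
The fallback behaviour of the first api can be illustrated with a small userspace model. This is a sketch, not the kernel code: `hw_stats`, `action_counters` and `apply_hw_stats` are hypothetical names, the stats are treated as deltas that are added to each action's counters, and the bounds check against the per-action array length is an assumption of the sketch (the posted patch indexes `act_stats->stats[i]` without one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct flow_stats (hypothetical model type). */
struct hw_stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t drops;
};

/* Simplified stand-in for a tc action's accumulated counters. */
struct action_counters {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t drops;
};

/*
 * Update every action from one driver reply.
 * flow:       filter-wide stats, always present.
 * per_action: optional array of nr_per_action records; NULL means the
 *             driver reported filter-wide stats only.
 * Actions without a per-action record fall back to the filter-wide stats.
 */
static void apply_hw_stats(struct action_counters *acts, size_t nr_acts,
			   const struct hw_stats *flow,
			   const struct hw_stats *per_action,
			   size_t nr_per_action)
{
	size_t i;

	assert(flow != NULL);
	for (i = 0; i < nr_acts; i++) {
		const struct hw_stats *s = flow;

		if (per_action && i < nr_per_action)
			s = &per_action[i];

		acts[i].pkts += s->pkts;
		acts[i].bytes += s->bytes;
		acts[i].drops += s->drops;
	}
}
```

With a police action followed by a redirect, a driver could report the drops on the first record and the reduced packet count on the second, while rules without branching actions keep passing NULL and pay no extra cost.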

>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index 7da3337c4356..7dc8a62796b5 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
>>   	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>>   			 rtnl_held);
>>   
>> -	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
>> +	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
>> +
>> +	kfree(cls_flower.act_stats);
>>   }
> 
> Perhaps I'm being dumb, but I don't see this being allocated
>   anywhere.  Is the driver supposed to be responsible for doing so?
>   That seems inelegant.

You are right, the intention is for the driver to allocate the array and 
for the calling method to free it.

While the proposed design is indeed inelegant, it is efficient compared 
to the other possible alternatives:
1. Dynamically allocated stats array - this would introduce alloc/free 
calls per stats query (1 / filter / second), even if per action stats are 
not required.
2. Static action stats array - this has size issues, as this api is 
shared by both tc and nft. Perhaps we can use a hard coded size and 
return an error if the actual counter array size is larger.
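
The ownership options can be compared in a userspace sketch. Everything here is illustrative: `act_stats_alloc`, `act_stats_fixed_init` and `ACT_STATS_MAX` are hypothetical names, `calloc()`/`free()` stand in for the kernel allocator, and the cap of 8 is an arbitrary value chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct stat_rec {
	uint64_t pkts;
	uint64_t bytes;
};

/* Same shape as the proposed struct flow_act_stats. */
struct act_stats_buf {
	unsigned int num_actions;
	struct stat_rec stats[];	/* flexible array member */
};

/* Driver side of the posted contract: allocate the header plus n records
 * in one zeroed block. The caller owns the result and must free() it. */
static struct act_stats_buf *act_stats_alloc(unsigned int n)
{
	struct act_stats_buf *buf;

	buf = calloc(1, sizeof(*buf) + (size_t)n * sizeof(buf->stats[0]));
	if (buf)
		buf->num_actions = n;
	return buf;
}

/* Static alternative: caller-provided storage with a hard-coded cap. */
#define ACT_STATS_MAX 8	/* arbitrary value chosen for this sketch */

struct act_stats_fixed {
	unsigned int num_actions;
	struct stat_rec stats[ACT_STATS_MAX];
};

/* Returns 0 on success, -E2BIG when more slots are needed than the cap. */
static int act_stats_fixed_init(struct act_stats_fixed *buf, unsigned int n)
{
	assert(buf != NULL);
	if (n > ACT_STATS_MAX)
		return -E2BIG;
	buf->num_actions = n;
	return 0;
}
```

In the driver-allocates variant the caller can free unconditionally, since `free(NULL)` (like `kfree(NULL)`) is a no-op when the driver reported filter-wide stats only. Kernel code would normally size such an allocation with `struct_size()` rather than open-coded arithmetic.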


> 
> -ed

Oz Shlomo Sept. 28, 2022, 3:19 p.m. UTC | #3

Hi Edward,

On 8/17/2022 5:43 PM, Oz Shlomo wrote:
> Hi Edward,
> 
> On 8/16/2022 4:42 PM, Edward Cree wrote:
>> On 16/08/2022 10:23, Oz Shlomo wrote:
>>> The current offload api provides visibility to flow hw stats.
>>> This works as long as the flow stats values apply to all the flow's
>>> actions. However, this assumption breaks when an action, such as police,
>>> decides to drop or jump over other actions.
>>>
>>> Extend the flow_offload api to return stat record per action instance.
>>> Use the per action stats value, if available, when updating the action
>>> instance counters.
>>>
>>> Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
>>
>> When I worked on this before I tried with a similar "array of action
>>   stats" API [1], but after some discussion it seemed cleaner to have
>>   a "get stats for one single action" callback [2] which then could
>>   be called in a loop for filter dumps but also called singly for
>>   action dumps (RTM_GETACTION).  I recommend this approach to your
>>   consideration.
>>
>> [1]: https://lore.kernel.org/all/9804a392-c9fd-8d03-7900-e01848044fea@solarflare.com/
>> [2]: https://lore.kernel.org/all/a3f0a79a-7e2c-4cdc-8c97-dfebe959ab1f@solarflare.com/
>>
> 
> The recent hw_actions infrastructure provides the platform for updating 
> stats per action.
> However, the platform does introduce performance penalties as it invokes 
> a driver api method call per action (compared to the current single api 
> call). It also requires the driver to lookup the specific action counter 
> - requiring more processing compared to the current flow cookie lookup.
> Further more, the current single stats per filter (rather than per 
> action) design only breaks when using branching actions (e.g. police), 
> which probably applies to a small subset of the rules.
> 
> This series proposes two apis:
> 1. High performance api for filter dump update (ovs triggers a dump per 
> rule per second) - extending the current api providing the driver an 
> option to update stats per action, if required.
> 2. Re-use the hw_actions api for tc action list update (see patch #3)
> 

I tried implementing the per action stats using the hw_action api.
The api itself works well.
However, it is extremely inefficient to allocate a counter per action in
hardware. As such, the driver is required to look up the action's counter
(hashtable lookup) and also update all the other action stats hanging on
this hw counter (requiring list iteration and locks).
This introduces quite a complex design with performance overheads.

Stats update is performance sensitive, as ovs queries the filters' stats
every second.
Supporting the tc action stats api will degrade the performance of existing
use cases.
Extending the existing flow_offload api will preserve the current
functionality (single flow stat which applies to all the actions) and
performance while providing the ability to specify per action stats for
use cases involving branching actions.
In the future we could add driver support for returning per action
stats using the current hw_action api.
WDYT?

>>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>>> index 7da3337c4356..7dc8a62796b5 100644
>>> --- a/net/sched/cls_flower.c
>>> +++ b/net/sched/cls_flower.c
>>> @@ -499,7 +499,9 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
>>>  	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
>>>  			 rtnl_held);
>>>  
>>> -	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
>>> +	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
>>> +
>>> +	kfree(cls_flower.act_stats);
>>>  }
>>
>> Perhaps I'm being dumb, but I don't see this being allocated
>>   anywhere.  Is the driver supposed to be responsible for doing so?
>>   That seems inelegant.
> 
> You are right, the intention is for the driver to allocate the array and 
> for the calling method to free it.
> 
> While the proposed design is indeed inelegant, it is efficient compared 
> to the possible other alternatives:
> 1. Dynamically allocated stats array - this will introduce an alloc/free 
> calls per stats query (1 / filter/ second), even if per action stats is 
> not required.
> 2. Static action stats array - this has size issues, as this api is 
> shared for both tc and nft. Perhaps we can use a hard coded size and 
> return an error if the actual counter array size is larger.
> 
> 

I realized that we cannot assume a 1:1 mapping between a tc action and its
corresponding offload action, as a tc pedit action can create an array of
flow offload actions.
I will fix this in v2.
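
To illustrate the indexing problem, here is a minimal sketch of one possible way to map tc action indices onto a longer array of offload entries. It is not taken from v2, which this thread does not show; `offload_offsets` and its parameters are hypothetical names, and the per-action entry counts are assumed to be known to the caller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * offsets[i] receives the index of the first offload entry that belongs to
 * tc action i, given how many offload entries each tc action expands to.
 * Returns the total number of offload entries.
 */
static size_t offload_offsets(const unsigned int *entries_per_action,
			      size_t nr_actions, size_t *offsets)
{
	size_t i, next = 0;

	assert(entries_per_action != NULL && offsets != NULL);
	for (i = 0; i < nr_actions; i++) {
		offsets[i] = next;
		next += entries_per_action[i];
	}
	return next;
}
```

For a rule whose second action is a pedit that expands to three offload entries, the counts 1, 3, 1 give offsets 0, 1, 4, so indexing the stats array by the tc action index `i` alone would read the wrong record for the third action.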

>>
>> -ed

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 2a9a9e42e7fd..5e1a34a76772 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -436,6 +436,11 @@  struct flow_stats {
 	bool used_hw_stats_valid;
 };
 
+struct flow_act_stats {
+	unsigned int		num_actions;
+	struct flow_stats	stats[];
+};
+
 static inline void flow_stats_update(struct flow_stats *flow_stats,
 				     u64 bytes, u64 pkts,
 				     u64 drops, u64 lastused,
@@ -583,6 +588,7 @@  struct flow_cls_offload {
 	struct flow_rule *rule;
 	struct flow_stats stats;
 	u32 classid;
+	struct flow_act_stats *act_stats;
 };
 
 enum offload_act_command  {
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 27eac9e73c61..f5e5582aef17 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -269,24 +269,30 @@  static inline void tcf_exts_put_net(struct tcf_exts *exts)
 
 static inline void
 tcf_exts_hw_stats_update(const struct tcf_exts *exts,
-			 struct flow_stats *stats)
+			 struct flow_stats *flow_stats,
+			 struct flow_act_stats *act_stats)
 {
 #ifdef CONFIG_NET_CLS_ACT
 	int i;
 
 	for (i = 0; i < exts->nr_actions; i++) {
 		struct tc_action *a = exts->actions[i];
+		struct flow_stats *stats = flow_stats;
 
 		/* if stats from hw, just skip */
-		if (tcf_action_update_hw_stats(a)) {
-			preempt_disable();
-			tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops,
-						stats->lastused, true);
-			preempt_enable();
-
-			a->used_hw_stats = stats->used_hw_stats;
-			a->used_hw_stats_valid = stats->used_hw_stats_valid;
-		}
+		if (!tcf_action_update_hw_stats(a))
+			continue;
+
+		if (act_stats)
+			stats = &act_stats->stats[i];
+
+		preempt_disable();
+		tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops,
+					stats->lastused, true);
+		preempt_enable();
+
+		a->used_hw_stats = stats->used_hw_stats;
+		a->used_hw_stats_valid = stats->used_hw_stats_valid;
 	}
 #endif
 }
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 7da3337c4356..7dc8a62796b5 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -499,7 +499,9 @@  static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f,
 	tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false,
 			 rtnl_held);
 
-	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats);
+	tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.act_stats);
+
+	kfree(cls_flower.act_stats);
 }
 
 static void __fl_put(struct cls_fl_filter *f)
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c
index b5520a9c35e6..0ba4392b93de 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -332,7 +332,7 @@  static void mall_stats_hw_filter(struct tcf_proto *tp,
 
 	tc_setup_cb_call(block, TC_SETUP_CLSMATCHALL, &cls_mall, false, true);
 
-	tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats);
+	tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats, NULL);
 }
 
 static int mall_dump(struct net *net, struct tcf_proto *tp, void *fh,
