Message ID | 20230201161039.20714-6-ozsh@nvidia.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: flow_offload: add support for per action hw stats | expand |
On Wed, Feb 01, 2023 at 06:10:34PM +0200, Oz Shlomo wrote: > There are currently two mechanisms for populating hardware stats: > 1. Using flow_offload api to query the flow's statistics. > The api assumes that the same stats values apply to all > the flow's actions. > This assumption breaks when action drops or jumps over following > actions. > 2. Using hw_action api to query specific action stats via a driver > callback method. This api assures the correct action stats for > the offloaded action, however, it does not apply to the rest of the > actions in the flow's actions array. > > Extend the flow_offload stats callback to indicate that a per action > stats update is required. > Use the existing flow_offload_action api to query the action's hw stats. > In addition, currently the tc action stats utility only updates hw actions. > Reuse the existing action stats cb infrastructure to query any action > stats. > > Signed-off-by: Oz Shlomo <ozsh@nvidia.com> ... > diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h > index be21764a3b34..d4315757d1a2 100644 > --- a/include/net/pkt_cls.h > +++ b/include/net/pkt_cls.h > @@ -292,9 +292,15 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) > #define tcf_act_for_each_action(i, a, actions) \ > for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++) > > +static bool tc_act_in_hw(struct tc_action *act) > +{ > + return !!act->in_hw_count; > +} > + > static inline void > tcf_exts_hw_stats_update(const struct tcf_exts *exts, > - struct flow_stats *stats) > + struct flow_stats *stats, > + bool use_act_stats) > { > #ifdef CONFIG_NET_CLS_ACT > int i; > @@ -302,16 +308,18 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) > for (i = 0; i < exts->nr_actions; i++) { > struct tc_action *a = exts->actions[i]; > > - /* if stats from hw, just skip */ > - if (tcf_action_update_hw_stats(a)) { > - preempt_disable(); > - tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, > - stats->lastused, true); > - preempt_enable(); > - > - a->used_hw_stats = stats->used_hw_stats; > - a->used_hw_stats_valid = stats->used_hw_stats_valid; > + if (use_act_stats || tc_act_in_hw(a)) { > + tcf_action_update_hw_stats(a); Hi Oz, I am a unclear why it is ok to continue here even if tcf_action_update_hw_stats() fails. There seem to be cases other than !tc_act_in_hw() at play here, which prior to this patch, would execute the code below that is now outside this if clause. > + continue; > } > + > + preempt_disable(); > + tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, > + stats->lastused, true); > + preempt_enable(); > + > + a->used_hw_stats = stats->used_hw_stats; > + a->used_hw_stats_valid = stats->used_hw_stats_valid; > } > #endif > } > @@ -769,6 +777,7 @@ struct tc_cls_matchall_offload { > enum tc_matchall_command command; > struct flow_rule *rule; > struct flow_stats stats; > + bool use_act_stats; > unsigned long cookie; > }; > > diff --git a/net/sched/act_api.c b/net/sched/act_api.c > index 917827199102..eda58b78da13 100644 > --- a/net/sched/act_api.c > +++ b/net/sched/act_api.c > @@ -169,11 +169,6 @@ static bool tc_act_skip_sw(u32 flags) > return (flags & TCA_ACT_FLAGS_SKIP_SW) ? true : false; > } > > -static bool tc_act_in_hw(struct tc_action *act) > -{ > - return !!act->in_hw_count; > -} > - > /* SKIP_HW and SKIP_SW are mutually exclusive flags. */ > static bool tc_act_flags_valid(u32 flags) > { > @@ -308,9 +303,6 @@ int tcf_action_update_hw_stats(struct tc_action *action) > struct flow_offload_action fl_act = {}; > int err; > > - if (!tc_act_in_hw(action)) > - return -EOPNOTSUPP; > - > err = offload_action_init(&fl_act, action, FLOW_ACT_STATS, NULL); > if (err) > return err; ...
On Wed, Feb 01, 2023 at 06:10:34PM +0200, Oz Shlomo wrote: > There are currently two mechanisms for populating hardware stats: > 1. Using flow_offload api to query the flow's statistics. > The api assumes that the same stats values apply to all > the flow's actions. > This assumption breaks when action drops or jumps over following > actions. > 2. Using hw_action api to query specific action stats via a driver > callback method. This api assures the correct action stats for > the offloaded action, however, it does not apply to the rest of the > actions in the flow's actions array. > > Extend the flow_offload stats callback to indicate that a per action > stats update is required. > Use the existing flow_offload_action api to query the action's hw stats. > In addition, currently the tc action stats utility only updates hw actions. > Reuse the existing action stats cb infrastructure to query any action > stats. > > Signed-off-by: Oz Shlomo <ozsh@nvidia.com> > --- > include/net/flow_offload.h | 1 + > include/net/pkt_cls.h | 29 +++++++++++++++++++---------- > net/sched/act_api.c | 8 -------- > net/sched/cls_flower.c | 2 +- > net/sched/cls_matchall.c | 2 +- > 5 files changed, 22 insertions(+), 20 deletions(-) > > diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h > index d177bf5f0e1a..27decadd4f5f 100644 > --- a/include/net/flow_offload.h > +++ b/include/net/flow_offload.h > @@ -597,6 +597,7 @@ struct flow_cls_offload { > unsigned long cookie; > struct flow_rule *rule; > struct flow_stats stats; > + bool use_act_stats; > u32 classid; > }; Hi Oz, It's probably not important, but I thought it is worth bringing to your attention. The placement of use_act_stats above puts it on a different cacheline (on x86_64) to stats. Which does not seem to be idea as those fields are accessed together. There is a 4 byte hole immediately above above cookie, on the same cacheline as stats, which can accommodate use_act_stats. > > diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h > index be21764a3b34..d4315757d1a2 100644 > --- a/include/net/pkt_cls.h > +++ b/include/net/pkt_cls.h ... > @@ -769,6 +777,7 @@ struct tc_cls_matchall_offload { > enum tc_matchall_command command; > struct flow_rule *rule; > struct flow_stats stats; > + bool use_act_stats; > unsigned long cookie; > }; I believe the same logic applies to this change too. ... > diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c > index cb04739a13ce..885c95191ccf 100644 > --- a/net/sched/cls_flower.c > +++ b/net/sched/cls_flower.c > @@ -502,7 +502,7 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f, > tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false, > rtnl_held); > > - tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats); > + tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.use_act_stats); > } > > static void __fl_put(struct cls_fl_filter *f) > diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c > index b3883d3d4dbd..fa3bbd187eb9 100644 > --- a/net/sched/cls_matchall.c > +++ b/net/sched/cls_matchall.c > @@ -331,7 +331,7 @@ static void mall_stats_hw_filter(struct tcf_proto *tp, > > tc_setup_cb_call(block, TC_SETUP_CLSMATCHALL, &cls_mall, false, true); > > - tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats); > + tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats, cls_mall.use_act_stats); > } > > static int mall_dump(struct net *net, struct tcf_proto *tp, void *fh,
On 03/02/2023 16:33, Simon Horman wrote: > On Wed, Feb 01, 2023 at 06:10:34PM +0200, Oz Shlomo wrote: >> There are currently two mechanisms for populating hardware stats: >> 1. Using flow_offload api to query the flow's statistics. >> The api assumes that the same stats values apply to all >> the flow's actions. >> This assumption breaks when action drops or jumps over following >> actions. >> 2. Using hw_action api to query specific action stats via a driver >> callback method. This api assures the correct action stats for >> the offloaded action, however, it does not apply to the rest of the >> actions in the flow's actions array. >> >> Extend the flow_offload stats callback to indicate that a per action >> stats update is required. >> Use the existing flow_offload_action api to query the action's hw stats. >> In addition, currently the tc action stats utility only updates hw actions. >> Reuse the existing action stats cb infrastructure to query any action >> stats. >> >> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> > ... > >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h >> index be21764a3b34..d4315757d1a2 100644 >> --- a/include/net/pkt_cls.h >> +++ b/include/net/pkt_cls.h >> @@ -292,9 +292,15 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) >> #define tcf_act_for_each_action(i, a, actions) \ >> for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++) >> >> +static bool tc_act_in_hw(struct tc_action *act) >> +{ >> + return !!act->in_hw_count; >> +} >> + >> static inline void >> tcf_exts_hw_stats_update(const struct tcf_exts *exts, >> - struct flow_stats *stats) >> + struct flow_stats *stats, >> + bool use_act_stats) >> { >> #ifdef CONFIG_NET_CLS_ACT >> int i; >> @@ -302,16 +308,18 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) >> for (i = 0; i < exts->nr_actions; i++) { >> struct tc_action *a = exts->actions[i]; >> >> - /* if stats from hw, just skip */ >> - if (tcf_action_update_hw_stats(a)) { >> - preempt_disable(); >> - tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, >> - stats->lastused, true); >> - preempt_enable(); >> - >> - a->used_hw_stats = stats->used_hw_stats; >> - a->used_hw_stats_valid = stats->used_hw_stats_valid; >> + if (use_act_stats || tc_act_in_hw(a)) { >> + tcf_action_update_hw_stats(a); > Hi Oz, > > I am a unclear why it is ok to continue here even if > tcf_action_update_hw_stats() fails. There seem to be cases other than > !tc_act_in_hw() at play here, which prior to this patch, would execute the > code below that is now outside this if clause. You're right, my bad >> + continue; >> } >> + >> + preempt_disable(); >> + tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, >> + stats->lastused, true); >> + preempt_enable(); >> + >> + a->used_hw_stats = stats->used_hw_stats; >> + a->used_hw_stats_valid = stats->used_hw_stats_valid; >> } >> #endif >> } >> @@ -769,6 +777,7 @@ struct tc_cls_matchall_offload { >> enum tc_matchall_command command; >> struct flow_rule *rule; >> struct flow_stats stats; >> + bool use_act_stats; >> unsigned long cookie; >> }; >> >> diff --git a/net/sched/act_api.c b/net/sched/act_api.c >> index 917827199102..eda58b78da13 100644 >> --- a/net/sched/act_api.c >> +++ b/net/sched/act_api.c >> @@ -169,11 +169,6 @@ static bool tc_act_skip_sw(u32 flags) >> return (flags & TCA_ACT_FLAGS_SKIP_SW) ? true : false; >> } >> >> -static bool tc_act_in_hw(struct tc_action *act) >> -{ >> - return !!act->in_hw_count; >> -} >> - >> /* SKIP_HW and SKIP_SW are mutually exclusive flags. */ >> static bool tc_act_flags_valid(u32 flags) >> { >> @@ -308,9 +303,6 @@ int tcf_action_update_hw_stats(struct tc_action *action) >> struct flow_offload_action fl_act = {}; >> int err; >> >> - if (!tc_act_in_hw(action)) >> - return -EOPNOTSUPP; >> - >> err = offload_action_init(&fl_act, action, FLOW_ACT_STATS, NULL); >> if (err) >> return err; > ...
On 03/02/2023 16:51, Simon Horman wrote: > On Wed, Feb 01, 2023 at 06:10:34PM +0200, Oz Shlomo wrote: >> There are currently two mechanisms for populating hardware stats: >> 1. Using flow_offload api to query the flow's statistics. >> The api assumes that the same stats values apply to all >> the flow's actions. >> This assumption breaks when action drops or jumps over following >> actions. >> 2. Using hw_action api to query specific action stats via a driver >> callback method. This api assures the correct action stats for >> the offloaded action, however, it does not apply to the rest of the >> actions in the flow's actions array. >> >> Extend the flow_offload stats callback to indicate that a per action >> stats update is required. >> Use the existing flow_offload_action api to query the action's hw stats. >> In addition, currently the tc action stats utility only updates hw actions. >> Reuse the existing action stats cb infrastructure to query any action >> stats. >> >> Signed-off-by: Oz Shlomo <ozsh@nvidia.com> >> --- >> include/net/flow_offload.h | 1 + >> include/net/pkt_cls.h | 29 +++++++++++++++++++---------- >> net/sched/act_api.c | 8 -------- >> net/sched/cls_flower.c | 2 +- >> net/sched/cls_matchall.c | 2 +- >> 5 files changed, 22 insertions(+), 20 deletions(-) >> >> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h >> index d177bf5f0e1a..27decadd4f5f 100644 >> --- a/include/net/flow_offload.h >> +++ b/include/net/flow_offload.h >> @@ -597,6 +597,7 @@ struct flow_cls_offload { >> unsigned long cookie; >> struct flow_rule *rule; >> struct flow_stats stats; >> + bool use_act_stats; >> u32 classid; >> }; > Hi Oz, > > It's probably not important, but I thought it is worth bringing > to your attention. > > The placement of use_act_stats above puts it on a different > cacheline (on x86_64) to stats. Which does not seem to be idea > as those fields are accessed together. > > There is a 4 byte hole immediately above above cookie, > on the same cacheline as stats, which can accommodate use_act_stats. Good catch. I will move use_act_stats for better cache alignment. >> >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h >> index be21764a3b34..d4315757d1a2 100644 >> --- a/include/net/pkt_cls.h >> +++ b/include/net/pkt_cls.h > ... > >> @@ -769,6 +777,7 @@ struct tc_cls_matchall_offload { >> enum tc_matchall_command command; >> struct flow_rule *rule; >> struct flow_stats stats; >> + bool use_act_stats; >> unsigned long cookie; >> }; > I believe the same logic applies to this change too. > > ... > >> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c >> index cb04739a13ce..885c95191ccf 100644 >> --- a/net/sched/cls_flower.c >> +++ b/net/sched/cls_flower.c >> @@ -502,7 +502,7 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f, >> tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false, >> rtnl_held); >> >> - tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats); >> + tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.use_act_stats); >> } >> >> static void __fl_put(struct cls_fl_filter *f) >> diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c >> index b3883d3d4dbd..fa3bbd187eb9 100644 >> --- a/net/sched/cls_matchall.c >> +++ b/net/sched/cls_matchall.c >> @@ -331,7 +331,7 @@ static void mall_stats_hw_filter(struct tcf_proto *tp, >> >> tc_setup_cb_call(block, TC_SETUP_CLSMATCHALL, &cls_mall, false, true); >> >> - tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats); >> + tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats, cls_mall.use_act_stats); >> } >> >> static int mall_dump(struct net *net, struct tcf_proto *tp, void *fh,
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h index d177bf5f0e1a..27decadd4f5f 100644 --- a/include/net/flow_offload.h +++ b/include/net/flow_offload.h @@ -597,6 +597,7 @@ struct flow_cls_offload { unsigned long cookie; struct flow_rule *rule; struct flow_stats stats; + bool use_act_stats; u32 classid; }; diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index be21764a3b34..d4315757d1a2 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -292,9 +292,15 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) #define tcf_act_for_each_action(i, a, actions) \ for (i = 0; i < TCA_ACT_MAX_PRIO && ((a) = actions[i]); i++) +static bool tc_act_in_hw(struct tc_action *act) +{ + return !!act->in_hw_count; +} + static inline void tcf_exts_hw_stats_update(const struct tcf_exts *exts, - struct flow_stats *stats) + struct flow_stats *stats, + bool use_act_stats) { #ifdef CONFIG_NET_CLS_ACT int i; @@ -302,16 +308,18 @@ static inline void tcf_exts_put_net(struct tcf_exts *exts) for (i = 0; i < exts->nr_actions; i++) { struct tc_action *a = exts->actions[i]; - /* if stats from hw, just skip */ - if (tcf_action_update_hw_stats(a)) { - preempt_disable(); - tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, - stats->lastused, true); - preempt_enable(); - - a->used_hw_stats = stats->used_hw_stats; - a->used_hw_stats_valid = stats->used_hw_stats_valid; + if (use_act_stats || tc_act_in_hw(a)) { + tcf_action_update_hw_stats(a); + continue; } + + preempt_disable(); + tcf_action_stats_update(a, stats->bytes, stats->pkts, stats->drops, + stats->lastused, true); + preempt_enable(); + + a->used_hw_stats = stats->used_hw_stats; + a->used_hw_stats_valid = stats->used_hw_stats_valid; } #endif } @@ -769,6 +777,7 @@ struct tc_cls_matchall_offload { enum tc_matchall_command command; struct flow_rule *rule; struct flow_stats stats; + bool use_act_stats; unsigned long cookie; }; diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 917827199102..eda58b78da13 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -169,11 +169,6 @@ static bool tc_act_skip_sw(u32 flags) return (flags & TCA_ACT_FLAGS_SKIP_SW) ? true : false; } -static bool tc_act_in_hw(struct tc_action *act) -{ - return !!act->in_hw_count; -} - /* SKIP_HW and SKIP_SW are mutually exclusive flags. */ static bool tc_act_flags_valid(u32 flags) { @@ -308,9 +303,6 @@ int tcf_action_update_hw_stats(struct tc_action *action) struct flow_offload_action fl_act = {}; int err; - if (!tc_act_in_hw(action)) - return -EOPNOTSUPP; - err = offload_action_init(&fl_act, action, FLOW_ACT_STATS, NULL); if (err) return err; diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index cb04739a13ce..885c95191ccf 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -502,7 +502,7 @@ static void fl_hw_update_stats(struct tcf_proto *tp, struct cls_fl_filter *f, tc_setup_cb_call(block, TC_SETUP_CLSFLOWER, &cls_flower, false, rtnl_held); - tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats); + tcf_exts_hw_stats_update(&f->exts, &cls_flower.stats, cls_flower.use_act_stats); } static void __fl_put(struct cls_fl_filter *f) diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c index b3883d3d4dbd..fa3bbd187eb9 100644 --- a/net/sched/cls_matchall.c +++ b/net/sched/cls_matchall.c @@ -331,7 +331,7 @@ static void mall_stats_hw_filter(struct tcf_proto *tp, tc_setup_cb_call(block, TC_SETUP_CLSMATCHALL, &cls_mall, false, true); - tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats); + tcf_exts_hw_stats_update(&head->exts, &cls_mall.stats, cls_mall.use_act_stats); } static int mall_dump(struct net *net, struct tcf_proto *tp, void *fh,
There are currently two mechanisms for populating hardware stats: 1. Using flow_offload api to query the flow's statistics. The api assumes that the same stats values apply to all the flow's actions. This assumption breaks when action drops or jumps over following actions. 2. Using hw_action api to query specific action stats via a driver callback method. This api assures the correct action stats for the offloaded action, however, it does not apply to the rest of the actions in the flow's actions array. Extend the flow_offload stats callback to indicate that a per action stats update is required. Use the existing flow_offload_action api to query the action's hw stats. In addition, currently the tc action stats utility only updates hw actions. Reuse the existing action stats cb infrastructure to query any action stats. Signed-off-by: Oz Shlomo <ozsh@nvidia.com> --- include/net/flow_offload.h | 1 + include/net/pkt_cls.h | 29 +++++++++++++++++++---------- net/sched/act_api.c | 8 -------- net/sched/cls_flower.c | 2 +- net/sched/cls_matchall.c | 2 +- 5 files changed, 22 insertions(+), 20 deletions(-)