Message ID | c2ef23da1d9a4eb62f4e7b7c4540f9bafb553c15.1658420239.git.dcaratti@redhat.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net/sched: act_mirred: avoid printout in the traffic path | expand |
On Thu, 21 Jul 2022 18:19:22 +0200 Davide Caratti wrote: > if (tcf_mirred_dev_dereference(m) == dev) { > + pr_notice("tc mirred: target device %s is %s\n", > + dev->name, > + event == NETDEV_UNREGISTER ? "gone" : "down"); Should we only do this print only once per event as well? There can be a large number of actions redirecting to a single device, no point printing the warning multiple times for one down event.
You dont want to use the target device if it is operationally/admin down. But if that happens momentarily then it comes back up - what happens then? cheers, jamal On Thu, Jul 21, 2022 at 12:19 PM Davide Caratti <dcaratti@redhat.com> wrote: > > when tc-mirred outputs to a device that's not up, dmesg is cluttered with > messages like: > > tc mirred to Houston: device br-int is down > > we can't completely remove this printout: users might be relying on it to > detect setups where tc-mirred drops everything, as discussed earlier [1]. > however, we can at least reduce the amount of these messages, and improve > their content as follows: > - add a pr_notice(...) in the .init() function, to warn users of missing > IFF_UP flag on the target of a newly added tc-mirred action > - check for NETDEV_DOWN in the .notifier_call() function, and add proper > pr_notice(...) to warn users of missing/down target devices > > [1] https://lore.kernel.org/netdev/CAM_iQpUvn+ijyZtLmca3n+nZmHY9cMmPYwZMp5BTv10bLUhg3Q@mail.gmail.com/ > > Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> > CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> > Signed-off-by: Davide Caratti <dcaratti@redhat.com> > --- > net/sched/act_mirred.c | 34 +++++++++++++++++++--------------- > 1 file changed, 19 insertions(+), 15 deletions(-) > > diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c > index a1d70cf86843..4af6073e472b 100644 > --- a/net/sched/act_mirred.c > +++ b/net/sched/act_mirred.c > @@ -178,6 +178,13 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, > err = -ENODEV; > goto put_chain; > } > + if (!(ndev->flags & IFF_UP)) > + pr_notice("tc mirred: action %i %s on %s while device is down", > + m->tcf_index, > + tcf_mirred_is_act_redirect(parm->eaction) ? > + "redirects" : "mirrors", > + ndev->name); > + > mac_header_xmit = dev_is_mac_header_xmit(ndev); > odev = rcu_replace_pointer(m->tcfm_dev, ndev, > lockdep_is_held(&m->tcf_lock)); > @@ -251,16 +258,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, > m_eaction = READ_ONCE(m->tcfm_eaction); > retval = READ_ONCE(m->tcf_action); > dev = rcu_dereference_bh(m->tcfm_dev); > - if (unlikely(!dev)) { > - pr_notice_once("tc mirred: target device is gone\n"); > + if (unlikely(!dev || !(dev->flags & IFF_UP))) > goto out; > - } > - > - if (unlikely(!(dev->flags & IFF_UP))) { > - net_notice_ratelimited("tc mirred to Houston: device %s is down\n", > - dev->name); > - goto out; > - } > > /* we could easily avoid the clone only if called by ingress and clsact; > * since we can't easily detect the clsact caller, skip clone only for > @@ -397,16 +396,21 @@ static int mirred_device_event(struct notifier_block *unused, > struct tcf_mirred *m; > > ASSERT_RTNL(); > - if (event == NETDEV_UNREGISTER) { > + if (event == NETDEV_UNREGISTER || event == NETDEV_DOWN) { > spin_lock(&mirred_list_lock); > list_for_each_entry(m, &mirred_list, tcfm_list) { > spin_lock_bh(&m->tcf_lock); > if (tcf_mirred_dev_dereference(m) == dev) { > - netdev_put(dev, &m->tcfm_dev_tracker); > - /* Note : no rcu grace period necessary, as > - * net_device are already rcu protected. > - */ > - RCU_INIT_POINTER(m->tcfm_dev, NULL); > + pr_notice("tc mirred: target device %s is %s\n", > + dev->name, > + event == NETDEV_UNREGISTER ? "gone" : "down"); > + if (event == NETDEV_UNREGISTER) { > + netdev_put(dev, &m->tcfm_dev_tracker); > + /* Note : no rcu grace period necessary, as > + * net_device are already rcu protected. > + */ > + RCU_INIT_POINTER(m->tcfm_dev, NULL); > + } > } > spin_unlock_bh(&m->tcf_lock); > } > -- > 2.35.3 >
On Thu, Jul 21, 2022 at 06:19:22PM +0200, Davide Caratti wrote: > @@ -251,16 +258,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, > m_eaction = READ_ONCE(m->tcfm_eaction); > retval = READ_ONCE(m->tcf_action); > dev = rcu_dereference_bh(m->tcfm_dev); > - if (unlikely(!dev)) { > - pr_notice_once("tc mirred: target device is gone\n"); > + if (unlikely(!dev || !(dev->flags & IFF_UP))) > goto out; > - } > - > - if (unlikely(!(dev->flags & IFF_UP))) { > - net_notice_ratelimited("tc mirred to Houston: device %s is down\n", > - dev->name); > - goto out; > - } I have no objection, just want to point it out users could still figure out this drop by tracing kfree_skb(). _Maybe_ we could pass a reason to kfree_skb_reason() too but it is definitely harder. Thanks!
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index a1d70cf86843..4af6073e472b 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -178,6 +178,13 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, err = -ENODEV; goto put_chain; } + if (!(ndev->flags & IFF_UP)) + pr_notice("tc mirred: action %i %s on %s while device is down", + m->tcf_index, + tcf_mirred_is_act_redirect(parm->eaction) ? + "redirects" : "mirrors", + ndev->name); + mac_header_xmit = dev_is_mac_header_xmit(ndev); odev = rcu_replace_pointer(m->tcfm_dev, ndev, lockdep_is_held(&m->tcf_lock)); @@ -251,16 +258,8 @@ static int tcf_mirred_act(struct sk_buff *skb, const struct tc_action *a, m_eaction = READ_ONCE(m->tcfm_eaction); retval = READ_ONCE(m->tcf_action); dev = rcu_dereference_bh(m->tcfm_dev); - if (unlikely(!dev)) { - pr_notice_once("tc mirred: target device is gone\n"); + if (unlikely(!dev || !(dev->flags & IFF_UP))) goto out; - } - - if (unlikely(!(dev->flags & IFF_UP))) { - net_notice_ratelimited("tc mirred to Houston: device %s is down\n", - dev->name); - goto out; - } /* we could easily avoid the clone only if called by ingress and clsact; * since we can't easily detect the clsact caller, skip clone only for @@ -397,16 +396,21 @@ static int mirred_device_event(struct notifier_block *unused, struct tcf_mirred *m; ASSERT_RTNL(); - if (event == NETDEV_UNREGISTER) { + if (event == NETDEV_UNREGISTER || event == NETDEV_DOWN) { spin_lock(&mirred_list_lock); list_for_each_entry(m, &mirred_list, tcfm_list) { spin_lock_bh(&m->tcf_lock); if (tcf_mirred_dev_dereference(m) == dev) { - netdev_put(dev, &m->tcfm_dev_tracker); - /* Note : no rcu grace period necessary, as - * net_device are already rcu protected. - */ - RCU_INIT_POINTER(m->tcfm_dev, NULL); + pr_notice("tc mirred: target device %s is %s\n", + dev->name, + event == NETDEV_UNREGISTER ? "gone" : "down"); + if (event == NETDEV_UNREGISTER) { + netdev_put(dev, &m->tcfm_dev_tracker); + /* Note : no rcu grace period necessary, as + * net_device are already rcu protected. + */ + RCU_INIT_POINTER(m->tcfm_dev, NULL); + } } spin_unlock_bh(&m->tcf_lock); }
when tc-mirred outputs to a device that's not up, dmesg is cluttered with messages like: tc mirred to Houston: device br-int is down we can't completely remove this printout: users might be relying on it to detect setups where tc-mirred drops everything, as discussed earlier [1]. however, we can at least reduce the amount of these messages, and improve their content as follows: - add a pr_notice(...) in the .init() function, to warn users of missing IFF_UP flag on the target of a newly added tc-mirred action - check for NETDEV_DOWN in the .notifier_call() function, and add proper pr_notice(...) to warn users of missing/down target devices [1] https://lore.kernel.org/netdev/CAM_iQpUvn+ijyZtLmca3n+nZmHY9cMmPYwZMp5BTv10bLUhg3Q@mail.gmail.com/ Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> CC: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> --- net/sched/act_mirred.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-)