Message ID | 20240207205335.1465818-1-victor@mojatatu.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net/sched: act_mirred: Don't zero blockid when netns is going down | expand |
On Wed, Feb 7, 2024 at 9:54 PM Victor Nogueira <victor@mojatatu.com> wrote: > > While testing tdc with parallel tests for mirred to block we caught an > intermittent bug. The blockid was being zeroed out when a parallel net > namespace was going down and, thus, giving us an incorrect blockid value(0) > whenever we tried to dump the mirred action. Since we don't increment the > block refcount in the control path (and only use the ID), we don't need to > zero the blockid field whenever the net namespace is going down. > You mention netns being removed, but the issue is more about unregistering a net device ? Ie the following would also trigger the bug ? ip link add dev dummy type dummy ip link del dummy
On 07/02/2024 18:32, Eric Dumazet wrote: > On Wed, Feb 7, 2024 at 9:54 PM Victor Nogueira <victor@mojatatu.com> wrote: >> >> While testing tdc with parallel tests for mirred to block we caught an >> intermittent bug. The blockid was being zeroed out when a parallel net >> namespace was going down and, thus, giving us an incorrect blockid value(0) >> whenever we tried to dump the mirred action. Since we don't increment the >> block refcount in the control path (and only use the ID), we don't need to >> zero the blockid field whenever the net namespace is going down. >> > > You mention netns being removed, but the issue is more about unregistering > a net device ? > Ie the following would also trigger the bug ? > > ip link add dev dummy type dummy > ip link del dummy Indeed you are right. I focused on the net namespace part because this is what triggered the bug in our testing. I'll send a v2 with a better commit message. cheers, Victor
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index 93a96e9d8d90..6f4bb1c8ce7b 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -533,8 +533,6 @@ static int mirred_device_event(struct notifier_block *unused, * net_device are already rcu protected. */ RCU_INIT_POINTER(m->tcfm_dev, NULL); - } else if (m->tcfm_blockid) { - m->tcfm_blockid = 0; } spin_unlock_bh(&m->tcf_lock); }
While testing tdc with parallel tests for mirred to block we caught an intermittent bug. The blockid was being zeroed out when a parallel net namespace was going down and, thus, giving us an incorrect blockid value(0) whenever we tried to dump the mirred action. Since we don't increment the block refcount in the control path (and only use the ID), we don't need to zero the blockid field whenever the net namespace is going down. Fixes: 42f39036cda8 ("net/sched: act_mirred: Allow mirred to block") Signed-off-by: Victor Nogueira <victor@mojatatu.com> --- net/sched/act_mirred.c | 2 -- 1 file changed, 2 deletions(-)