Message ID | 20230111203732.51363-1-rrameshbabu@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] sch_htb: Avoid grafting on htb_destroy_class_offload when destroying htb | expand |
On Wed, Jan 11, 2023 at 12:37:33PM -0800, Rahul Rameshbabu wrote: > When destroying the htb, the caller may already have grafted a new qdisc > that is not part of the htb structure being destroyed. > htb_destroy_class_offload should not peek at the qdisc of the netdev queue. > Peek at old qdisc and graft only when deleting a leaf class in the htb, > rather than when deleting the htb itself. > > This fix resolves two use cases. > > 1. Using tc to destroy the htb. > 2. Using tc to replace the htb with another qdisc (which also leads to > the htb being destroyed). Please elaborate in the commit message what exactly was broken in these cases, i.e. premature dev_activate in both cases, and also accidental overwriting of the qdisc in case 2. > > Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") > Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Maxim Mikityanskiy <maxtram95@gmail.com> > --- > net/sched/sch_htb.c | 23 +++++++++++++---------- > 1 file changed, 13 insertions(+), 10 deletions(-) > > diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c > index 2238edece1a4..360ce8616fd2 100644 > --- a/net/sched/sch_htb.c > +++ b/net/sched/sch_htb.c > @@ -1557,14 +1557,13 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, > > WARN_ON(!q); > dev_queue = htb_offload_get_queue(cl); > - old = htb_graft_helper(dev_queue, NULL); > - if (destroying) > - /* Before HTB is destroyed, the kernel grafts noop_qdisc to > - * all queues. > + if (!destroying) { > + old = htb_graft_helper(dev_queue, NULL); > + /* Last qdisc grafted should be the same as cl->leaf.q when > + * calling htb_destroy Did you mean "when calling htb_delete"? Worth also commenting that on destroying, graft is done by qdisc_graft, and the latter also qdisc_puts the old one. Just to explain why we skip steps on destroying. > */ > - WARN_ON(!(old->flags & TCQ_F_BUILTIN)); > - else > WARN_ON(old != q); > + } > > if (cl->parent) { > _bstats_update(&cl->parent->bstats_bias, > @@ -1581,10 +1580,14 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, > }; > err = htb_offload(qdisc_dev(sch), &offload_opt); > > - if (!err || destroying) > - qdisc_put(old); > - else > - htb_graft_helper(dev_queue, old); > + /* htb_offload related errors when destroying cannot be handled */ > + WARN_ON(err && destroying); Not sure whether we want to WARN on this error... On destroying, we call htb_offload with TC_HTB_LEAF_DEL_LAST_FORCE, which makes the mlx5e driver proceed with deleting the node even if it failed to create a replacement node. Normally it cancels the deletion to keep the integrity of hardware structures, but on htb_destroy it doesn't matter, because everything is going to be torn down anyway. An error is still returned by the driver, but it's safe to ignore it, not worth a WARN at all. Another error flow, when the firmware command to delete a node fails for some reason, doesn't even lead to returning an error, because the worst that happens is a leak of hardware resources, and we can't do anything meaningful about it at that stage. So, I don't think this WARN_ON is helpful, unless you also want to change the way mlx5e returns errors. > + if (!destroying) { > + if (!err) > + qdisc_put(old); > + else > + htb_graft_helper(dev_queue, old); > + } Looks good. I also suggest removing NULL-initialization of old to make sure one will get a compiler warning about an uninitialized variable if one changes the code in the future and accidentally uses old in the destroying flow. > > if (last_child) > return err; > -- > 2.36.2 > > Previous related discussions > > [1] https://lore.kernel.org/netdev/20230110202003.25452-1-rrameshbabu@nvidia.com/ > [2] https://lore.kernel.org/netdev/20230104174744.22280-1-rrameshbabu@nvidia.com/ > [3] https://lore.kernel.org/all/CANn89iJSsFPBp5dYm3y6Jbbpuwbb9P+X3gmqk6zow0VWgx1Q-A@mail.gmail.com/
Maxim Mikityanskiy <maxtram95@gmail.com> writes: > On Wed, Jan 11, 2023 at 12:37:33PM -0800, Rahul Rameshbabu wrote: >> When destroying the htb, the caller may already have grafted a new qdisc >> that is not part of the htb structure being destroyed. >> htb_destroy_class_offload should not peek at the qdisc of the netdev queue. >> Peek at old qdisc and graft only when deleting a leaf class in the htb, >> rather than when deleting the htb itself. >> >> This fix resolves two use cases. >> >> 1. Using tc to destroy the htb. >> 2. Using tc to replace the htb with another qdisc (which also leads to >> the htb being destroyed). > > Please elaborate in the commit message what exactly was broken in these > cases, i.e. premature dev_activate in both cases, and also accidental > overwriting of the qdisc in case 2. Ack. > >> >> Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") >> Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> >> Cc: Eric Dumazet <edumazet@google.com> >> Cc: Maxim Mikityanskiy <maxtram95@gmail.com> >> --- >> net/sched/sch_htb.c | 23 +++++++++++++---------- >> 1 file changed, 13 insertions(+), 10 deletions(-) >> >> diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c >> index 2238edece1a4..360ce8616fd2 100644 >> --- a/net/sched/sch_htb.c >> +++ b/net/sched/sch_htb.c >> @@ -1557,14 +1557,13 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, >> >> WARN_ON(!q); >> dev_queue = htb_offload_get_queue(cl); >> - old = htb_graft_helper(dev_queue, NULL); >> - if (destroying) >> - /* Before HTB is destroyed, the kernel grafts noop_qdisc to >> - * all queues. >> + if (!destroying) { >> + old = htb_graft_helper(dev_queue, NULL); >> + /* Last qdisc grafted should be the same as cl->leaf.q when >> + * calling htb_destroy > > Did you mean "when calling htb_delete"? > > Worth also commenting that on destroying, graft is done by qdisc_graft, > and the latter also qdisc_puts the old one. Just to explain why we skip > steps on destroying. > Yes, I did mean htb_delete. Ack. >> */ >> - WARN_ON(!(old->flags & TCQ_F_BUILTIN)); >> - else >> WARN_ON(old != q); >> + } >> >> if (cl->parent) { >> _bstats_update(&cl->parent->bstats_bias, >> @@ -1581,10 +1580,14 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, >> }; >> err = htb_offload(qdisc_dev(sch), &offload_opt); >> >> - if (!err || destroying) >> - qdisc_put(old); >> - else >> - htb_graft_helper(dev_queue, old); >> + /* htb_offload related errors when destroying cannot be handled */ >> + WARN_ON(err && destroying); > > Not sure whether we want to WARN on this error... > > On destroying, we call htb_offload with TC_HTB_LEAF_DEL_LAST_FORCE, > which makes the mlx5e driver proceed with deleting the node even if it > failed to create a replacement node. Normally it cancels the deletion to > keep the integrity of hardware structures, but on htb_destroy it doesn't > matter, because everything is going to be torn down anyway. An error is > still returned by the driver, but it's safe to ignore it, not worth a > WARN at all. I see. This makes sense to me. > > Another error flow, when the firmware command to delete a node fails for > some reason, doesn't even lead to returning an error, because the worst > that happens is a leak of hardware resources, and we can't do anything > meaningful about it at that stage. This was what I was trying to catch with the WARN_ON, with the hope that at worst it wouldn't have any false positives. However, if there are errors due to certain operation modes like TC_HTB_LEAF_DEL_LAST_FORCE where the htb is still destroyed, this WARN_ON seems to be problematic more than helpful. Will remove in my next revision. > > So, I don't think this WARN_ON is helpful, unless you also want to > change the way mlx5e returns errors. > >> + if (!destroying) { >> + if (!err) >> + qdisc_put(old); >> + else >> + htb_graft_helper(dev_queue, old); >> + } > > Looks good. I also suggest removing NULL-initialization of old to make > sure one will get a compiler warning about an uninitialized variable if > one changes the code in the future and accidentally uses old in the > destroying flow. Ack. > >> >> if (last_child) >> return err; >> -- >> 2.36.2 >> >> Previous related discussions >> >> [1] https://lore.kernel.org/netdev/20230110202003.25452-1-rrameshbabu@nvidia.com/ >> [2] https://lore.kernel.org/netdev/20230104174744.22280-1-rrameshbabu@nvidia.com/ >> [3] https://lore.kernel.org/all/CANn89iJSsFPBp5dYm3y6Jbbpuwbb9P+X3gmqk6zow0VWgx1Q-A@mail.gmail.com/
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 2238edece1a4..360ce8616fd2 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1557,14 +1557,13 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, WARN_ON(!q); dev_queue = htb_offload_get_queue(cl); - old = htb_graft_helper(dev_queue, NULL); - if (destroying) - /* Before HTB is destroyed, the kernel grafts noop_qdisc to - * all queues. + if (!destroying) { + old = htb_graft_helper(dev_queue, NULL); + /* Last qdisc grafted should be the same as cl->leaf.q when + * calling htb_destroy */ - WARN_ON(!(old->flags & TCQ_F_BUILTIN)); - else WARN_ON(old != q); + } if (cl->parent) { _bstats_update(&cl->parent->bstats_bias, @@ -1581,10 +1580,14 @@ static int htb_destroy_class_offload(struct Qdisc *sch, struct htb_class *cl, }; err = htb_offload(qdisc_dev(sch), &offload_opt); - if (!err || destroying) - qdisc_put(old); - else - htb_graft_helper(dev_queue, old); + /* htb_offload related errors when destroying cannot be handled */ + WARN_ON(err && destroying); + if (!destroying) { + if (!err) + qdisc_put(old); + else + htb_graft_helper(dev_queue, old); + } if (last_child) return err;
When destroying the htb, the caller may already have grafted a new qdisc that is not part of the htb structure being destroyed. htb_destroy_class_offload should not peek at the qdisc of the netdev queue. Peek at old qdisc and graft only when deleting a leaf class in the htb, rather than when deleting the htb itself. This fix resolves two use cases. 1. Using tc to destroy the htb. 2. Using tc to replace the htb with another qdisc (which also leads to the htb being destroyed). Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Maxim Mikityanskiy <maxtram95@gmail.com> --- net/sched/sch_htb.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)