Message ID | 20211129154520.295823-1-atenart@kernel.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net-sysfs: update the queue counts in the unregistration path | expand |
On Mon, 29 Nov 2021 16:45:20 +0100 Antoine Tenart wrote: > When updating Rx and Tx queue kobjects, the queue count should always be > updated to match the queue kobjects count. This was not done in the net > device unregistration path and due to the Tx queue logic allowing > updates when unregistering (for qdisc cleanup) it was possible with > ethtool to manually add new queues after unregister, leading to NULL > pointer exceptions and UaFs, such as: > > BUG: KASAN: use-after-free in kobject_get+0x14/0x90 > Read of size 1 at addr ffff88801961248c by task ethtool/755 > > CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014 > Call Trace: > dump_stack_lvl+0x57/0x72 > print_address_description.constprop.0+0x1f/0x140 > kasan_report.cold+0x7f/0x11b > kobject_get+0x14/0x90 > kobject_add_internal+0x3d1/0x450 > kobject_init_and_add+0xba/0xf0 > netdev_queue_update_kobjects+0xcf/0x200 > netif_set_real_num_tx_queues+0xb4/0x310 > veth_set_channels+0x1c3/0x550 > ethnl_set_channels+0x524/0x610 > > Updating the queue counts in the unregistration path solve the above > issue, as the ethtool path updating the queue counts makes sanity checks > and a queue count of 0 should prevent any update. Would you mind pointing where in the code that happens? I can't seem to find anything looking at real_num_.x_queues outside dev.c and net-sysfs.c :S > Fixes: 5c56580b74e5 ("net: Adjust TX queue kobjects if number of queues changes during unregister") > Signed-off-by: Antoine Tenart <atenart@kernel.org> > --- > Following a previous thread[1] I had another look at this issue and now > have a better fix (this patch). In this previous thread we also > discussed preventing ethtool operations after unregister and adding a > warning in netdev_queue_update_kobjects; I'll send two patches for this > but targetting net-next.
Quoting Jakub Kicinski (2021-12-01 03:08:39) > On Mon, 29 Nov 2021 16:45:20 +0100 Antoine Tenart wrote: > > When updating Rx and Tx queue kobjects, the queue count should always be > > updated to match the queue kobjects count. This was not done in the net > > device unregistration path and due to the Tx queue logic allowing > > updates when unregistering (for qdisc cleanup) it was possible with > > ethtool to manually add new queues after unregister, leading to NULL > > pointer exceptions and UaFs, such as: > > > > BUG: KASAN: use-after-free in kobject_get+0x14/0x90 > > Read of size 1 at addr ffff88801961248c by task ethtool/755 > > > > CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778 > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014 > > Call Trace: > > dump_stack_lvl+0x57/0x72 > > print_address_description.constprop.0+0x1f/0x140 > > kasan_report.cold+0x7f/0x11b > > kobject_get+0x14/0x90 > > kobject_add_internal+0x3d1/0x450 > > kobject_init_and_add+0xba/0xf0 > > netdev_queue_update_kobjects+0xcf/0x200 > > netif_set_real_num_tx_queues+0xb4/0x310 > > veth_set_channels+0x1c3/0x550 > > ethnl_set_channels+0x524/0x610 > > > > Updating the queue counts in the unregistration path solve the above > > issue, as the ethtool path updating the queue counts makes sanity checks > > and a queue count of 0 should prevent any update. > > Would you mind pointing where in the code that happens? I can't seem > to find anything looking at real_num_.x_queues outside dev.c and > net-sysfs.c :S I read the above commit message again; it's not well explained... Sorry for that. The above trace was triggered using veths and this patch would solve this as veths do use real_num_x_queues to fill 'struct ethtool_channels' in its get_channels ops[1] which is then used to avoid making channel counts updates if it is 0[2]. In addition, keeping track of the queue counts in the unregistration path do help other drivers as it will allow adding a warning in netdev_queue_update_kobjects when adding queues after unregister (without tracking the queue counts in the unregistration path we can't detect illegal queue additions). We could also not only warn for illegal uses of netdev_queue_update_kobjects but also return an error. Another change that was discussed is to forbid ethtool ops after unregister. This is good, but is outside of the queue code so it might not solve all issues. (I do have those two patches, warn + ethtool, in my local tree and plan on targeting net-next). As you can see I'm a bit puzzled at how to fix this in the best way possible[3]. I think the combination of the three patches should be good enough, with only one sent to net as it does fix veths which IMHO is easier to trigger. WDYT? Thanks! Antoine [1] https://elixir.bootlin.com/linux/latest/source/drivers/net/veth.c#L222 [2] https://elixir.bootlin.com/linux/latest/source/net/ethtool/channels.c#L175 [3] Because the queue code does rely on external states.
On Wed, 01 Dec 2021 10:32:01 +0100 Antoine Tenart wrote: > > Would you mind pointing where in the code that happens? I can't seem > > to find anything looking at real_num_.x_queues outside dev.c and > > net-sysfs.c :S > > I read the above commit message again; it's not well explained... Sorry > for that. > > The above trace was triggered using veths and this patch would solve > this as veths do use real_num_x_queues to fill 'struct ethtool_channels' > in its get_channels ops[1] which is then used to avoid making channel > counts updates if it is 0[2]. But when we are at line 175 in [2] we already updated the values from the user space request at lines 144-151. This check validates the new config so a transition from 0 -> n should not be prevented here AFAICT. > In addition, keeping track of the queue counts in the unregistration > path do help other drivers as it will allow adding a warning in > netdev_queue_update_kobjects when adding queues after unregister > (without tracking the queue counts in the unregistration path we can't > detect illegal queue additions). We could also not only warn for illegal > uses of netdev_queue_update_kobjects but also return an error. > > Another change that was discussed is to forbid ethtool ops after > unregister. This is good, but is outside of the queue code so it might > not solve all issues. > > (I do have those two patches, warn + ethtool, in my local tree and plan > on targeting net-next). > > As you can see I'm a bit puzzled at how to fix this in the best way > possible[3]. I think the combination of the three patches should be good > enough, with only one sent to net as it does fix veths which IMHO is > easier to trigger. WDYT? > > [1] https://elixir.bootlin.com/linux/latest/source/drivers/net/veth.c#L222 > [2] https://elixir.bootlin.com/linux/latest/source/net/ethtool/channels.c#L175 > [3] Because the queue code does rely on external states. Any way of fixing this is fine. If you ask me personally I'd probably go with the ethtool fix to net and the zeroing and warn to net-next. Unless I'm misreading and this fix does work, in which case your plan is good, too.
Quoting Jakub Kicinski (2021-12-01 16:05:13) > On Wed, 01 Dec 2021 10:32:01 +0100 Antoine Tenart wrote: > > > Would you mind pointing where in the code that happens? I can't seem > > > to find anything looking at real_num_.x_queues outside dev.c and > > > net-sysfs.c :S > > > > The above trace was triggered using veths and this patch would solve > > this as veths do use real_num_x_queues to fill 'struct ethtool_channels' > > in its get_channels ops[1] which is then used to avoid making channel > > counts updates if it is 0[2]. > > But when we are at line 175 in [2] we already updated the values from > the user space request at lines 144-151. This check validates the new > config so a transition from 0 -> n should not be prevented here AFAICT. You're right. It worked in my testbed because I was not providing the number of Rx channels with Ethtool, so channels.rx_counts wasn't updated and still had the value veth provided. It "fixed" the issue for a completely unrelated reason and obviously can't be a fix for the issue described here. Thanks for having a second look at this, I completely missed it... > Any way of fixing this is fine. If you ask me personally I'd probably > go with the ethtool fix to net and the zeroing and warn to net-next. > Unless I'm misreading and this fix does work, in which case your plan > is good, too. I think you're right. Let's target net with the ethtool patch[1]. This patch (still needed to keep track of the queue counts) and the warning one can target net-next. [1] And probably another one for the old ethtool ioctl interface too. Thanks! Antoine
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 9c01c642cf9e..d7f9ee830d34 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1820,6 +1820,9 @@ static void remove_queue_kobjects(struct net_device *dev) net_rx_queue_update_kobjects(dev, real_rx, 0); netdev_queue_update_kobjects(dev, real_tx, 0); + + dev->real_num_rx_queues = 0; + dev->real_num_tx_queues = 0; #ifdef CONFIG_SYSFS kset_unregister(dev->queues_kset); #endif
When updating Rx and Tx queue kobjects, the queue count should always be updated to match the queue kobjects count. This was not done in the net device unregistration path and due to the Tx queue logic allowing updates when unregistering (for qdisc cleanup) it was possible with ethtool to manually add new queues after unregister, leading to NULL pointer exceptions and UaFs, such as: BUG: KASAN: use-after-free in kobject_get+0x14/0x90 Read of size 1 at addr ffff88801961248c by task ethtool/755 CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014 Call Trace: dump_stack_lvl+0x57/0x72 print_address_description.constprop.0+0x1f/0x140 kasan_report.cold+0x7f/0x11b kobject_get+0x14/0x90 kobject_add_internal+0x3d1/0x450 kobject_init_and_add+0xba/0xf0 netdev_queue_update_kobjects+0xcf/0x200 netif_set_real_num_tx_queues+0xb4/0x310 veth_set_channels+0x1c3/0x550 ethnl_set_channels+0x524/0x610 Updating the queue counts in the unregistration path solve the above issue, as the ethtool path updating the queue counts makes sanity checks and a queue count of 0 should prevent any update. Fixes: 5c56580b74e5 ("net: Adjust TX queue kobjects if number of queues changes during unregister") Signed-off-by: Antoine Tenart <atenart@kernel.org> --- Hello, Following a previous thread[1] I had another look at this issue and now have a better fix (this patch). In this previous thread we also discussed preventing ethtool operations after unregister and adding a warning in netdev_queue_update_kobjects; I'll send two patches for this but targetting net-next. Thanks! Antoine [1] https://lore.kernel.org/all/20211122162007.303623-1-atenart@kernel.org/T/ net/core/net-sysfs.c | 3 +++ 1 file changed, 3 insertions(+)