Message ID | 20240516133035.1050113-1-houtao@huaweicloud.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Netdev Maintainers |
Series | net/sched: unregister root_lock_key in the error path of qdisc_alloc() |
Context | Check | Description |
---|---|---|
netdev/tree_selection | success | Guessing tree name failed - patch did not apply |
Oops. Forgot to add the target git tree for the patch. It is targeted
for net- tree.

On 5/16/2024 9:30 PM, Hou Tao wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> The following slab-use-after-free problem was reported by syzbot:
>
> ==================================================================
> BUG: KASAN: slab-use-after-free in lockdep_register_key+0x253/0x3f0 kernel/locking/lockdep.c:1225
> Read of size 8 at addr ffff88805fe2c298 by task syz-executor.1/5906
>
> CPU: 1 PID: 5906 Comm: syz-executor.1 Not tainted 6.9.0-rc5-syzkaller-01473-g2506f6229bd0 #0
> ......
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
>  print_address_description mm/kasan/report.c:377 [inline]
>  print_report+0x169/0x550 mm/kasan/report.c:488
>  kasan_report+0x143/0x180 mm/kasan/report.c:601
>  lockdep_register_key+0x253/0x3f0 kernel/locking/lockdep.c:1225
>  htab_map_alloc+0x9b/0xe60 kernel/bpf/hashtab.c:506
>  map_create+0x90c/0x1200 kernel/bpf/syscall.c:1333
>  __sys_bpf+0x6d1/0x810 kernel/bpf/syscall.c:5659
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  ......
>  </TASK>
>
> Allocated by task 5593:
>  kasan_save_stack mm/kasan/common.c:47 [inline]
>  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>  poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
>  __kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:387
>  kasan_kmalloc include/linux/kasan.h:211 [inline]
>  __do_kmalloc_node mm/slub.c:3966 [inline]
>  __kmalloc_node_track_caller+0x24e/0x4e0 mm/slub.c:3986
>  kmalloc_reserve+0x111/0x2a0 net/core/skbuff.c:597
>  __alloc_skb+0x1f3/0x440 net/core/skbuff.c:666
>  alloc_skb include/linux/skbuff.h:1308 [inline]
>  alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6455
>  ......
>
> Freed by task 5593:
>  kasan_save_stack mm/kasan/common.c:47 [inline]
>  kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
>  kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
>  poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
>  __kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
>  kasan_slab_free include/linux/kasan.h:184 [inline]
>  slab_free_hook mm/slub.c:2106 [inline]
>  slab_free mm/slub.c:4280 [inline]
>  kfree+0x153/0x3a0 mm/slub.c:4390
>  skb_kfree_head net/core/skbuff.c:1033 [inline]
>  skb_free_head net/core/skbuff.c:1045 [inline]
>  ......
>
> The buggy address belongs to the object at ffff88805fe2c000
>  which belongs to the cache kmalloc-2k of size 2048
> The buggy address is located 664 bytes inside of
>  freed 2048-byte region [ffff88805fe2c000, ffff88805fe2c800)
>
> At first glance, it seems there is a problem with the bpf hash table,
> because the use-after-free is reported when invoking
> htab_map_alloc(). However, after checking the reported error more
> carefully, it appears that qdisc_alloc() is the culprit. The most
> important clue regarding why qdisc_alloc() is involved is the following:
> "The buggy address is located 664 bytes inside of freed 2048-byte
> region". lockdep_register_key() has several callers, and only the
> offset of lock_class_key in Qdisc is 664. The problem occurs as follows:
>
> (1) call qdisc_alloc()
> After calling lockdep_register_key(), qdisc_alloc() jumps to the errout1
> label because netdev_alloc_pcpu_stats() or alloc_percpu() fails (e.g.,
> due to mem-cg limitation or SIGKILL). However, it doesn't call
> lockdep_unregister_key() to unregister root_lock_key before it frees the
> allocated memory.
>
> (2) call htab_map_alloc()
> While calling lockdep_register_key(), it finds the lockdep key
> registered by the freed Qdisc and triggers the use-after-free.
>
> Fix it by invoking lockdep_unregister_key() in the error path of
> qdisc_alloc().
>
> Reported-by: syzbot+061f58eec3bde7ee8ffa@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/bpf/d28e4f02-965d-96de-ee56-f7a001b67fe7@huaweicloud.com/T/#m47c0670021ada17869bf887c73438133d879d326
> Fixes: af0cb3fa3f9e ("net/sched: fix false lockdep warning on qdisc root lock")
> Signed-off-by: Hou Tao <houtao1@huawei.com>
> ---
>  net/sched/sch_generic.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 31dfd6c7405b0..d3f6006b563cc 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -982,6 +982,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
>
>  	return sch;
>  errout1:
> +	lockdep_unregister_key(&sch->root_lock_key);
>  	kfree(sch);
>  errout:
>  	return ERR_PTR(err);
hello Hou Tao,

On Thu, May 16, 2024 at 3:33 PM Hou Tao <houtao@huaweicloud.com> wrote:
>
> Oops. Forgot to add the target git tree for the patch. It is targeted
> for net- tree.
>
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 31dfd6c7405b0..d3f6006b563cc 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -982,6 +982,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
> >
> >  	return sch;
> >  errout1:
> > +	lockdep_unregister_key(&sch->root_lock_key);
> >  	kfree(sch);
> >  errout:
> >  	return ERR_PTR(err);
>

AFAIS this line is part of the fix that was merged a couple of weeks
ago (see the 2nd hunk of [1]). That patch also protects the error
path of qdisc_create(), which proved to make kselftest fail with
similar splats. Can you check if this commit resolves that syzbot report?

thanks a lot!
Hi Davide,

On 5/16/2024 9:45 PM, Davide Caratti wrote:
> hello Hou Tao,
>
> On Thu, May 16, 2024 at 3:33 PM Hou Tao <houtao@huaweicloud.com> wrote:
>> Oops. Forgot to add the target git tree for the patch. It is targeted
>> for net- tree.
>>
>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>> index 31dfd6c7405b0..d3f6006b563cc 100644
>>> --- a/net/sched/sch_generic.c
>>> +++ b/net/sched/sch_generic.c
>>> @@ -982,6 +982,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
>>>
>>>  	return sch;
>>>  errout1:
>>> +	lockdep_unregister_key(&sch->root_lock_key);
>>>  	kfree(sch);
>>>  errout:
>>>  	return ERR_PTR(err);
> AFAIS this line is part of the fix that was merged a couple of weeks
> ago, (see the 2nd hunk of [1]). That patch also protects the error
> path of qdisc_create(), that proved to make kselftest fail with
> similar splats. Can you check if this commit resolves that syzbot?

I missed that commit and didn't check the net-next git tree before
posting the patch. Yes, I think this commit will resolve the reported
problem. Thanks.

>
> thanks a lot!
Hi Hou,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.9]
[cannot apply to net/main net-next/main linus/master next-20240517]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Hou-Tao/net-sched-unregister-root_lock_key-in-the-error-path-of-qdisc_alloc/20240516-213538
base:   v6.9
patch link:    https://lore.kernel.org/r/20240516133035.1050113-1-houtao%40huaweicloud.com
patch subject: [PATCH] net/sched: unregister root_lock_key in the error path of qdisc_alloc()
config: openrisc-defconfig (https://download.01.org/0day-ci/archive/20240517/202405171311.SyRzzQjC-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240517/202405171311.SyRzzQjC-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405171311.SyRzzQjC-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/sched/sch_generic.c: In function 'qdisc_alloc':
>> net/sched/sch_generic.c:983:36: error: 'struct Qdisc' has no member named 'root_lock_key'
     983 |         lockdep_unregister_key(&sch->root_lock_key);
         |                                    ^~

vim +983 net/sched/sch_generic.c

   924	
   925	struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
   926				  const struct Qdisc_ops *ops,
   927				  struct netlink_ext_ack *extack)
   928	{
   929		struct Qdisc *sch;
   930		unsigned int size = sizeof(*sch) + ops->priv_size;
   931		int err = -ENOBUFS;
   932		struct net_device *dev;
   933	
   934		if (!dev_queue) {
   935			NL_SET_ERR_MSG(extack, "No device queue given");
   936			err = -EINVAL;
   937			goto errout;
   938		}
   939	
   940		dev = dev_queue->dev;
   941		sch = kzalloc_node(size, GFP_KERNEL, netdev_queue_numa_node_read(dev_queue));
   942	
   943		if (!sch)
   944			goto errout;
   945		__skb_queue_head_init(&sch->gso_skb);
   946		__skb_queue_head_init(&sch->skb_bad_txq);
   947		gnet_stats_basic_sync_init(&sch->bstats);
   948		spin_lock_init(&sch->q.lock);
   949	
   950		if (ops->static_flags & TCQ_F_CPUSTATS) {
   951			sch->cpu_bstats =
   952				netdev_alloc_pcpu_stats(struct gnet_stats_basic_sync);
   953			if (!sch->cpu_bstats)
   954				goto errout1;
   955	
   956			sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
   957			if (!sch->cpu_qstats) {
   958				free_percpu(sch->cpu_bstats);
   959				goto errout1;
   960			}
   961		}
   962	
   963		spin_lock_init(&sch->busylock);
   964		lockdep_set_class(&sch->busylock,
   965				  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   966	
   967		/* seqlock has the same scope of busylock, for NOLOCK qdisc */
   968		spin_lock_init(&sch->seqlock);
   969		lockdep_set_class(&sch->seqlock,
   970				  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   971	
   972		sch->ops = ops;
   973		sch->flags = ops->static_flags;
   974		sch->enqueue = ops->enqueue;
   975		sch->dequeue = ops->dequeue;
   976		sch->dev_queue = dev_queue;
   977		sch->owner = -1;
   978		netdev_hold(dev, &sch->dev_tracker, GFP_KERNEL);
   979		refcount_set(&sch->refcnt, 1);
   980	
   981		return sch;
   982	errout1:
 > 983		lockdep_unregister_key(&sch->root_lock_key);
   984		kfree(sch);
   985	errout:
   986		return ERR_PTR(err);
   987	}
   988	
Hi Hou,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.9]
[cannot apply to net/main net-next/main linus/master next-20240517]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Hou-Tao/net-sched-unregister-root_lock_key-in-the-error-path-of-qdisc_alloc/20240516-213538
base:   v6.9
patch link:    https://lore.kernel.org/r/20240516133035.1050113-1-houtao%40huaweicloud.com
patch subject: [PATCH] net/sched: unregister root_lock_key in the error path of qdisc_alloc()
config: um-allnoconfig (https://download.01.org/0day-ci/archive/20240517/202405171402.nix3cdP7-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240517/202405171402.nix3cdP7-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405171402.nix3cdP7-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from net/sched/sch_generic.c:17:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
         |                                                   ^
   In file included from net/sched/sch_generic.c:17:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
         |                                                   ^
   In file included from net/sched/sch_generic.c:17:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from arch/um/include/asm/hardirq.h:5:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> net/sched/sch_generic.c:983:31: error: no member named 'root_lock_key' in 'struct Qdisc'
     983 |         lockdep_unregister_key(&sch->root_lock_key);
         |                                 ~~~  ^
   12 warnings and 1 error generated.

vim +983 net/sched/sch_generic.c

   924	
   925	struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
   926				  const struct Qdisc_ops *ops,
   927				  struct netlink_ext_ack *extack)
   928	{
   929		struct Qdisc *sch;
   930		unsigned int size = sizeof(*sch) + ops->priv_size;
   931		int err = -ENOBUFS;
   932		struct net_device *dev;
   933	
   934		if (!dev_queue) {
   935			NL_SET_ERR_MSG(extack, "No device queue given");
   936			err = -EINVAL;
   937			goto errout;
   938		}
   939	
   940		dev = dev_queue->dev;
   941		sch = kzalloc_node(size, GFP_KERNEL, netdev_queue_numa_node_read(dev_queue));
   942	
   943		if (!sch)
   944			goto errout;
   945		__skb_queue_head_init(&sch->gso_skb);
   946		__skb_queue_head_init(&sch->skb_bad_txq);
   947		gnet_stats_basic_sync_init(&sch->bstats);
   948		spin_lock_init(&sch->q.lock);
   949	
   950		if (ops->static_flags & TCQ_F_CPUSTATS) {
   951			sch->cpu_bstats =
   952				netdev_alloc_pcpu_stats(struct gnet_stats_basic_sync);
   953			if (!sch->cpu_bstats)
   954				goto errout1;
   955	
   956			sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
   957			if (!sch->cpu_qstats) {
   958				free_percpu(sch->cpu_bstats);
   959				goto errout1;
   960			}
   961		}
   962	
   963		spin_lock_init(&sch->busylock);
   964		lockdep_set_class(&sch->busylock,
   965				  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   966	
   967		/* seqlock has the same scope of busylock, for NOLOCK qdisc */
   968		spin_lock_init(&sch->seqlock);
   969		lockdep_set_class(&sch->seqlock,
   970				  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   971	
   972		sch->ops = ops;
   973		sch->flags = ops->static_flags;
   974		sch->enqueue = ops->enqueue;
   975		sch->dequeue = ops->dequeue;
   976		sch->dev_queue = dev_queue;
   977		sch->owner = -1;
   978		netdev_hold(dev, &sch->dev_tracker, GFP_KERNEL);
   979		refcount_set(&sch->refcnt, 1);
   980	
   981		return sch;
   982	errout1:
 > 983		lockdep_unregister_key(&sch->root_lock_key);
   984		kfree(sch);
   985	errout:
   986		return ERR_PTR(err);
   987	}
   988	
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 31dfd6c7405b0..d3f6006b563cc 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -982,6 +982,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 
 	return sch;
 errout1:
+	lockdep_unregister_key(&sch->root_lock_key);
 	kfree(sch);
 errout:
 	return ERR_PTR(err);