diff mbox series

[net,1/2] net/sched: flower: fix filter idr initialization

Message ID 20230426121415.2149732-2-vladbu@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series Fixes for miss to tc action series | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 18 this patch: 18
netdev/cc_maintainers warning 3 maintainers not CCed: pabeni@redhat.com bpf@vger.kernel.org edumazet@google.com
netdev/build_clang success Errors and warnings before: 18 this patch: 18
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 18 this patch: 18
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Vlad Buslov April 26, 2023, 12:14 p.m. UTC
The cited commit moved idr initialization too early in fl_change() which
allows concurrent users to access the filter that is still being
initialized and is in inconsistent state, which, in turn, can cause NULL
pointer dereference [0]. Since there is no obvious way to fix the ordering
without reverting the whole cited commit, alternative approach taken to
first insert NULL pointer into idr in order to allocate the handle but
still cause fl_get() to return NULL and prevent concurrent users from
seeing the filter while providing miss-to-action infrastructure with valid
handle id early in fl_change().

[  152.434728] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
[  152.436163] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
[  152.437269] CPU: 4 PID: 3877 Comm: tc Not tainted 6.3.0-rc4+ #5
[  152.438110] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[  152.439644] RIP: 0010:fl_dump_key+0x8b/0x1d10 [cls_flower]
[  152.440461] Code: 01 f2 02 f2 c7 40 08 04 f2 04 f2 c7 40 0c 04 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 00 01 00 00 48 89 c8 48 c1 e8 03 <0f> b6 04 10 84 c0 74 08 3c 03 0f 8e 98 19 00 00 8b 13 85 d2 74 57
[  152.442885] RSP: 0018:ffff88817a28f158 EFLAGS: 00010246
[  152.443851] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  152.444826] RDX: dffffc0000000000 RSI: ffffffff8500ae80 RDI: ffff88810a987900
[  152.445791] RBP: ffff888179d88240 R08: ffff888179d8845c R09: ffff888179d88240
[  152.446780] R10: ffffed102f451e48 R11: 00000000fffffff2 R12: ffff88810a987900
[  152.447741] R13: ffffffff8500ae80 R14: ffff88810a987900 R15: ffff888149b3c738
[  152.448756] FS:  00007f5eb2a34800(0000) GS:ffff88881ec00000(0000) knlGS:0000000000000000
[  152.449888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  152.450685] CR2: 000000000046ad19 CR3: 000000010b0bd006 CR4: 0000000000370ea0
[  152.451641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  152.452628] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  152.453588] Call Trace:
[  152.454032]  <TASK>
[  152.454447]  ? netlink_sendmsg+0x7a1/0xcb0
[  152.455109]  ? sock_sendmsg+0xc5/0x190
[  152.455689]  ? ____sys_sendmsg+0x535/0x6b0
[  152.456320]  ? ___sys_sendmsg+0xeb/0x170
[  152.456916]  ? do_syscall_64+0x3d/0x90
[  152.457529]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  152.458321]  ? ___sys_sendmsg+0xeb/0x170
[  152.458958]  ? __sys_sendmsg+0xb5/0x140
[  152.459564]  ? do_syscall_64+0x3d/0x90
[  152.460122]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  152.460852]  ? fl_dump_key_options.part.0+0xea0/0xea0 [cls_flower]
[  152.461710]  ? _raw_spin_lock+0x7a/0xd0
[  152.462299]  ? _raw_read_lock_irq+0x30/0x30
[  152.462924]  ? nla_put+0x15e/0x1c0
[  152.463480]  fl_dump+0x228/0x650 [cls_flower]
[  152.464112]  ? fl_tmplt_dump+0x210/0x210 [cls_flower]
[  152.464854]  ? __kmem_cache_alloc_node+0x1a7/0x330
[  152.465592]  ? nla_put+0x15e/0x1c0
[  152.466160]  tcf_fill_node+0x515/0x9a0
[  152.466766]  ? tc_setup_offload_action+0xf0/0xf0
[  152.467463]  ? __alloc_skb+0x13c/0x2a0
[  152.468067]  ? __build_skb_around+0x330/0x330
[  152.468814]  ? fl_get+0x107/0x1a0 [cls_flower]
[  152.469503]  tc_del_tfilter+0x718/0x1330
[  152.470115]  ? is_bpf_text_address+0xa/0x20
[  152.470765]  ? tc_ctl_chain+0xee0/0xee0
[  152.471335]  ? __kernel_text_address+0xe/0x30
[  152.471948]  ? unwind_get_return_address+0x56/0xa0
[  152.472639]  ? __thaw_task+0x150/0x150
[  152.473218]  ? arch_stack_walk+0x98/0xf0
[  152.473839]  ? __stack_depot_save+0x35/0x4c0
[  152.474501]  ? stack_trace_save+0x91/0xc0
[  152.475119]  ? security_capable+0x51/0x90
[  152.475741]  rtnetlink_rcv_msg+0x2c1/0x9d0
[  152.476387]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  152.477042]  ? __sys_sendmsg+0xb5/0x140
[  152.477664]  ? do_syscall_64+0x3d/0x90
[  152.478255]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  152.479010]  ? __stack_depot_save+0x35/0x4c0
[  152.479679]  ? __stack_depot_save+0x35/0x4c0
[  152.480346]  netlink_rcv_skb+0x12c/0x360
[  152.480929]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
[  152.481517]  ? do_syscall_64+0x3d/0x90
[  152.482061]  ? netlink_ack+0x1550/0x1550
[  152.482612]  ? rhashtable_walk_peek+0x170/0x170
[  152.483262]  ? kmem_cache_alloc_node+0x1af/0x390
[  152.483875]  ? _copy_from_iter+0x3d6/0xc70
[  152.484528]  netlink_unicast+0x553/0x790
[  152.485168]  ? netlink_attachskb+0x6a0/0x6a0
[  152.485848]  ? unwind_next_frame+0x11cc/0x1a10
[  152.486538]  ? arch_stack_walk+0x61/0xf0
[  152.487169]  netlink_sendmsg+0x7a1/0xcb0
[  152.487799]  ? netlink_unicast+0x790/0x790
[  152.488355]  ? iovec_from_user.part.0+0x4d/0x220
[  152.488990]  ? _raw_spin_lock+0x7a/0xd0
[  152.489598]  ? netlink_unicast+0x790/0x790
[  152.490236]  sock_sendmsg+0xc5/0x190
[  152.490796]  ____sys_sendmsg+0x535/0x6b0
[  152.491394]  ? import_iovec+0x7/0x10
[  152.491964]  ? kernel_sendmsg+0x30/0x30
[  152.492561]  ? __copy_msghdr+0x3c0/0x3c0
[  152.493160]  ? do_syscall_64+0x3d/0x90
[  152.493706]  ___sys_sendmsg+0xeb/0x170
[  152.494283]  ? may_open_dev+0xd0/0xd0
[  152.494858]  ? copy_msghdr_from_user+0x110/0x110
[  152.495541]  ? __handle_mm_fault+0x2678/0x4ad0
[  152.496205]  ? copy_page_range+0x2360/0x2360
[  152.496862]  ? __fget_light+0x57/0x520
[  152.497449]  ? mas_find+0x1c0/0x1c0
[  152.498026]  ? sockfd_lookup_light+0x1a/0x140
[  152.498703]  __sys_sendmsg+0xb5/0x140
[  152.499306]  ? __sys_sendmsg_sock+0x20/0x20
[  152.499951]  ? do_user_addr_fault+0x369/0xd80
[  152.500595]  do_syscall_64+0x3d/0x90
[  152.501185]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[  152.501917] RIP: 0033:0x7f5eb294f887
[  152.502494] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[  152.505008] RSP: 002b:00007ffd2c708f78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  152.506152] RAX: ffffffffffffffda RBX: 00000000642d9472 RCX: 00007f5eb294f887
[  152.507134] RDX: 0000000000000000 RSI: 00007ffd2c708fe0 RDI: 0000000000000003
[  152.508113] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[  152.509119] R10: 00007f5eb2808708 R11: 0000000000000246 R12: 0000000000000001
[  152.510068] R13: 0000000000000000 R14: 00007ffd2c70d1b8 R15: 0000000000485400
[  152.511031]  </TASK>
[  152.511444] Modules linked in: cls_flower sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay zram zsmalloc fuse [last unloaded: mlx5_core]
[  152.515720] ---[ end trace 0000000000000000 ]---

Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
---
 net/sched/cls_flower.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Simon Horman April 26, 2023, 2:25 p.m. UTC | #1
On Wed, Apr 26, 2023 at 02:14:14PM +0200, Vlad Buslov wrote:
> The cited commit moved idr initialization too early in fl_change() which
> allows concurrent users to access the filter that is still being
> initialized and is in inconsistent state, which, in turn, can cause NULL
> pointer dereference [0]. Since there is no obvious way to fix the ordering
> without reverting the whole cited commit, alternative approach taken to
> first insert NULL pointer into idr in order to allocate the handle but
> still cause fl_get() to return NULL and prevent concurrent users from
> seeing the filter while providing miss-to-action infrastructure with valid
> handle id early in fl_change().
> 
> [  152.434728] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
> [  152.436163] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> [  152.437269] CPU: 4 PID: 3877 Comm: tc Not tainted 6.3.0-rc4+ #5
> [  152.438110] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [  152.439644] RIP: 0010:fl_dump_key+0x8b/0x1d10 [cls_flower]
> [  152.440461] Code: 01 f2 02 f2 c7 40 08 04 f2 04 f2 c7 40 0c 04 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 00 01 00 00 48 89 c8 48 c1 e8 03 <0f> b6 04 10 84 c0 74 08 3c 03 0f 8e 98 19 00 00 8b 13 85 d2 74 57
> [  152.442885] RSP: 0018:ffff88817a28f158 EFLAGS: 00010246
> [  152.443851] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  152.444826] RDX: dffffc0000000000 RSI: ffffffff8500ae80 RDI: ffff88810a987900
> [  152.445791] RBP: ffff888179d88240 R08: ffff888179d8845c R09: ffff888179d88240
> [  152.446780] R10: ffffed102f451e48 R11: 00000000fffffff2 R12: ffff88810a987900
> [  152.447741] R13: ffffffff8500ae80 R14: ffff88810a987900 R15: ffff888149b3c738
> [  152.448756] FS:  00007f5eb2a34800(0000) GS:ffff88881ec00000(0000) knlGS:0000000000000000
> [  152.449888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  152.450685] CR2: 000000000046ad19 CR3: 000000010b0bd006 CR4: 0000000000370ea0
> [  152.451641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  152.452628] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  152.453588] Call Trace:
> [  152.454032]  <TASK>
> [  152.454447]  ? netlink_sendmsg+0x7a1/0xcb0
> [  152.455109]  ? sock_sendmsg+0xc5/0x190
> [  152.455689]  ? ____sys_sendmsg+0x535/0x6b0
> [  152.456320]  ? ___sys_sendmsg+0xeb/0x170
> [  152.456916]  ? do_syscall_64+0x3d/0x90
> [  152.457529]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.458321]  ? ___sys_sendmsg+0xeb/0x170
> [  152.458958]  ? __sys_sendmsg+0xb5/0x140
> [  152.459564]  ? do_syscall_64+0x3d/0x90
> [  152.460122]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.460852]  ? fl_dump_key_options.part.0+0xea0/0xea0 [cls_flower]
> [  152.461710]  ? _raw_spin_lock+0x7a/0xd0
> [  152.462299]  ? _raw_read_lock_irq+0x30/0x30
> [  152.462924]  ? nla_put+0x15e/0x1c0
> [  152.463480]  fl_dump+0x228/0x650 [cls_flower]
> [  152.464112]  ? fl_tmplt_dump+0x210/0x210 [cls_flower]
> [  152.464854]  ? __kmem_cache_alloc_node+0x1a7/0x330
> [  152.465592]  ? nla_put+0x15e/0x1c0
> [  152.466160]  tcf_fill_node+0x515/0x9a0
> [  152.466766]  ? tc_setup_offload_action+0xf0/0xf0
> [  152.467463]  ? __alloc_skb+0x13c/0x2a0
> [  152.468067]  ? __build_skb_around+0x330/0x330
> [  152.468814]  ? fl_get+0x107/0x1a0 [cls_flower]
> [  152.469503]  tc_del_tfilter+0x718/0x1330
> [  152.470115]  ? is_bpf_text_address+0xa/0x20
> [  152.470765]  ? tc_ctl_chain+0xee0/0xee0
> [  152.471335]  ? __kernel_text_address+0xe/0x30
> [  152.471948]  ? unwind_get_return_address+0x56/0xa0
> [  152.472639]  ? __thaw_task+0x150/0x150
> [  152.473218]  ? arch_stack_walk+0x98/0xf0
> [  152.473839]  ? __stack_depot_save+0x35/0x4c0
> [  152.474501]  ? stack_trace_save+0x91/0xc0
> [  152.475119]  ? security_capable+0x51/0x90
> [  152.475741]  rtnetlink_rcv_msg+0x2c1/0x9d0
> [  152.476387]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.477042]  ? __sys_sendmsg+0xb5/0x140
> [  152.477664]  ? do_syscall_64+0x3d/0x90
> [  152.478255]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.479010]  ? __stack_depot_save+0x35/0x4c0
> [  152.479679]  ? __stack_depot_save+0x35/0x4c0
> [  152.480346]  netlink_rcv_skb+0x12c/0x360
> [  152.480929]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.481517]  ? do_syscall_64+0x3d/0x90
> [  152.482061]  ? netlink_ack+0x1550/0x1550
> [  152.482612]  ? rhashtable_walk_peek+0x170/0x170
> [  152.483262]  ? kmem_cache_alloc_node+0x1af/0x390
> [  152.483875]  ? _copy_from_iter+0x3d6/0xc70
> [  152.484528]  netlink_unicast+0x553/0x790
> [  152.485168]  ? netlink_attachskb+0x6a0/0x6a0
> [  152.485848]  ? unwind_next_frame+0x11cc/0x1a10
> [  152.486538]  ? arch_stack_walk+0x61/0xf0
> [  152.487169]  netlink_sendmsg+0x7a1/0xcb0
> [  152.487799]  ? netlink_unicast+0x790/0x790
> [  152.488355]  ? iovec_from_user.part.0+0x4d/0x220
> [  152.488990]  ? _raw_spin_lock+0x7a/0xd0
> [  152.489598]  ? netlink_unicast+0x790/0x790
> [  152.490236]  sock_sendmsg+0xc5/0x190
> [  152.490796]  ____sys_sendmsg+0x535/0x6b0
> [  152.491394]  ? import_iovec+0x7/0x10
> [  152.491964]  ? kernel_sendmsg+0x30/0x30
> [  152.492561]  ? __copy_msghdr+0x3c0/0x3c0
> [  152.493160]  ? do_syscall_64+0x3d/0x90
> [  152.493706]  ___sys_sendmsg+0xeb/0x170
> [  152.494283]  ? may_open_dev+0xd0/0xd0
> [  152.494858]  ? copy_msghdr_from_user+0x110/0x110
> [  152.495541]  ? __handle_mm_fault+0x2678/0x4ad0
> [  152.496205]  ? copy_page_range+0x2360/0x2360
> [  152.496862]  ? __fget_light+0x57/0x520
> [  152.497449]  ? mas_find+0x1c0/0x1c0
> [  152.498026]  ? sockfd_lookup_light+0x1a/0x140
> [  152.498703]  __sys_sendmsg+0xb5/0x140
> [  152.499306]  ? __sys_sendmsg_sock+0x20/0x20
> [  152.499951]  ? do_user_addr_fault+0x369/0xd80
> [  152.500595]  do_syscall_64+0x3d/0x90
> [  152.501185]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.501917] RIP: 0033:0x7f5eb294f887
> [  152.502494] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [  152.505008] RSP: 002b:00007ffd2c708f78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  152.506152] RAX: ffffffffffffffda RBX: 00000000642d9472 RCX: 00007f5eb294f887
> [  152.507134] RDX: 0000000000000000 RSI: 00007ffd2c708fe0 RDI: 0000000000000003
> [  152.508113] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [  152.509119] R10: 00007f5eb2808708 R11: 0000000000000246 R12: 0000000000000001
> [  152.510068] R13: 0000000000000000 R14: 00007ffd2c70d1b8 R15: 0000000000485400
> [  152.511031]  </TASK>
> [  152.511444] Modules linked in: cls_flower sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay zram zsmalloc fuse [last unloaded: mlx5_core]
> [  152.515720] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Pedro Tammela April 26, 2023, 2:27 p.m. UTC | #2
On 26/04/2023 09:14, Vlad Buslov wrote:
> The cited commit moved idr initialization too early in fl_change() which
> allows concurrent users to access the filter that is still being
> initialized and is in inconsistent state, which, in turn, can cause NULL
> pointer dereference [0]. Since there is no obvious way to fix the ordering
> without reverting the whole cited commit, alternative approach taken to
> first insert NULL pointer into idr in order to allocate the handle but
> still cause fl_get() to return NULL and prevent concurrent users from
> seeing the filter while providing miss-to-action infrastructure with valid
> handle id early in fl_change().
> 
> [  152.434728] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
> [  152.436163] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> [  152.437269] CPU: 4 PID: 3877 Comm: tc Not tainted 6.3.0-rc4+ #5
> [  152.438110] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [  152.439644] RIP: 0010:fl_dump_key+0x8b/0x1d10 [cls_flower]
> [  152.440461] Code: 01 f2 02 f2 c7 40 08 04 f2 04 f2 c7 40 0c 04 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 00 01 00 00 48 89 c8 48 c1 e8 03 <0f> b6 04 10 84 c0 74 08 3c 03 0f 8e 98 19 00 00 8b 13 85 d2 74 57
> [  152.442885] RSP: 0018:ffff88817a28f158 EFLAGS: 00010246
> [  152.443851] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  152.444826] RDX: dffffc0000000000 RSI: ffffffff8500ae80 RDI: ffff88810a987900
> [  152.445791] RBP: ffff888179d88240 R08: ffff888179d8845c R09: ffff888179d88240
> [  152.446780] R10: ffffed102f451e48 R11: 00000000fffffff2 R12: ffff88810a987900
> [  152.447741] R13: ffffffff8500ae80 R14: ffff88810a987900 R15: ffff888149b3c738
> [  152.448756] FS:  00007f5eb2a34800(0000) GS:ffff88881ec00000(0000) knlGS:0000000000000000
> [  152.449888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  152.450685] CR2: 000000000046ad19 CR3: 000000010b0bd006 CR4: 0000000000370ea0
> [  152.451641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  152.452628] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  152.453588] Call Trace:
> [  152.454032]  <TASK>
> [  152.454447]  ? netlink_sendmsg+0x7a1/0xcb0
> [  152.455109]  ? sock_sendmsg+0xc5/0x190
> [  152.455689]  ? ____sys_sendmsg+0x535/0x6b0
> [  152.456320]  ? ___sys_sendmsg+0xeb/0x170
> [  152.456916]  ? do_syscall_64+0x3d/0x90
> [  152.457529]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.458321]  ? ___sys_sendmsg+0xeb/0x170
> [  152.458958]  ? __sys_sendmsg+0xb5/0x140
> [  152.459564]  ? do_syscall_64+0x3d/0x90
> [  152.460122]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.460852]  ? fl_dump_key_options.part.0+0xea0/0xea0 [cls_flower]
> [  152.461710]  ? _raw_spin_lock+0x7a/0xd0
> [  152.462299]  ? _raw_read_lock_irq+0x30/0x30
> [  152.462924]  ? nla_put+0x15e/0x1c0
> [  152.463480]  fl_dump+0x228/0x650 [cls_flower]
> [  152.464112]  ? fl_tmplt_dump+0x210/0x210 [cls_flower]
> [  152.464854]  ? __kmem_cache_alloc_node+0x1a7/0x330
> [  152.465592]  ? nla_put+0x15e/0x1c0
> [  152.466160]  tcf_fill_node+0x515/0x9a0
> [  152.466766]  ? tc_setup_offload_action+0xf0/0xf0
> [  152.467463]  ? __alloc_skb+0x13c/0x2a0
> [  152.468067]  ? __build_skb_around+0x330/0x330
> [  152.468814]  ? fl_get+0x107/0x1a0 [cls_flower]
> [  152.469503]  tc_del_tfilter+0x718/0x1330
> [  152.470115]  ? is_bpf_text_address+0xa/0x20
> [  152.470765]  ? tc_ctl_chain+0xee0/0xee0
> [  152.471335]  ? __kernel_text_address+0xe/0x30
> [  152.471948]  ? unwind_get_return_address+0x56/0xa0
> [  152.472639]  ? __thaw_task+0x150/0x150
> [  152.473218]  ? arch_stack_walk+0x98/0xf0
> [  152.473839]  ? __stack_depot_save+0x35/0x4c0
> [  152.474501]  ? stack_trace_save+0x91/0xc0
> [  152.475119]  ? security_capable+0x51/0x90
> [  152.475741]  rtnetlink_rcv_msg+0x2c1/0x9d0
> [  152.476387]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.477042]  ? __sys_sendmsg+0xb5/0x140
> [  152.477664]  ? do_syscall_64+0x3d/0x90
> [  152.478255]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.479010]  ? __stack_depot_save+0x35/0x4c0
> [  152.479679]  ? __stack_depot_save+0x35/0x4c0
> [  152.480346]  netlink_rcv_skb+0x12c/0x360
> [  152.480929]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.481517]  ? do_syscall_64+0x3d/0x90
> [  152.482061]  ? netlink_ack+0x1550/0x1550
> [  152.482612]  ? rhashtable_walk_peek+0x170/0x170
> [  152.483262]  ? kmem_cache_alloc_node+0x1af/0x390
> [  152.483875]  ? _copy_from_iter+0x3d6/0xc70
> [  152.484528]  netlink_unicast+0x553/0x790
> [  152.485168]  ? netlink_attachskb+0x6a0/0x6a0
> [  152.485848]  ? unwind_next_frame+0x11cc/0x1a10
> [  152.486538]  ? arch_stack_walk+0x61/0xf0
> [  152.487169]  netlink_sendmsg+0x7a1/0xcb0
> [  152.487799]  ? netlink_unicast+0x790/0x790
> [  152.488355]  ? iovec_from_user.part.0+0x4d/0x220
> [  152.488990]  ? _raw_spin_lock+0x7a/0xd0
> [  152.489598]  ? netlink_unicast+0x790/0x790
> [  152.490236]  sock_sendmsg+0xc5/0x190
> [  152.490796]  ____sys_sendmsg+0x535/0x6b0
> [  152.491394]  ? import_iovec+0x7/0x10
> [  152.491964]  ? kernel_sendmsg+0x30/0x30
> [  152.492561]  ? __copy_msghdr+0x3c0/0x3c0
> [  152.493160]  ? do_syscall_64+0x3d/0x90
> [  152.493706]  ___sys_sendmsg+0xeb/0x170
> [  152.494283]  ? may_open_dev+0xd0/0xd0
> [  152.494858]  ? copy_msghdr_from_user+0x110/0x110
> [  152.495541]  ? __handle_mm_fault+0x2678/0x4ad0
> [  152.496205]  ? copy_page_range+0x2360/0x2360
> [  152.496862]  ? __fget_light+0x57/0x520
> [  152.497449]  ? mas_find+0x1c0/0x1c0
> [  152.498026]  ? sockfd_lookup_light+0x1a/0x140
> [  152.498703]  __sys_sendmsg+0xb5/0x140
> [  152.499306]  ? __sys_sendmsg_sock+0x20/0x20
> [  152.499951]  ? do_user_addr_fault+0x369/0xd80
> [  152.500595]  do_syscall_64+0x3d/0x90
> [  152.501185]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.501917] RIP: 0033:0x7f5eb294f887
> [  152.502494] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [  152.505008] RSP: 002b:00007ffd2c708f78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  152.506152] RAX: ffffffffffffffda RBX: 00000000642d9472 RCX: 00007f5eb294f887
> [  152.507134] RDX: 0000000000000000 RSI: 00007ffd2c708fe0 RDI: 0000000000000003
> [  152.508113] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [  152.509119] R10: 00007f5eb2808708 R11: 0000000000000246 R12: 0000000000000001
> [  152.510068] R13: 0000000000000000 R14: 00007ffd2c70d1b8 R15: 0000000000485400
> [  152.511031]  </TASK>
> [  152.511444] Modules linked in: cls_flower sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay zram zsmalloc fuse [last unloaded: mlx5_core]
> [  152.515720] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>|

LGTM,

Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>


> ---
>   net/sched/cls_flower.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
> index 475fe222a855..1844545bef37 100644
> --- a/net/sched/cls_flower.c
> +++ b/net/sched/cls_flower.c
> @@ -2210,10 +2210,10 @@ static int fl_change(struct net *net, struct sk_buff *in_skb,
>   		spin_lock(&tp->lock);
>   		if (!handle) {
>   			handle = 1;
> -			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
> +			err = idr_alloc_u32(&head->handle_idr, NULL, &handle,
>   					    INT_MAX, GFP_ATOMIC);
>   		} else {
> -			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
> +			err = idr_alloc_u32(&head->handle_idr, NULL, &handle,
>   					    handle, GFP_ATOMIC);
>   
>   			/* Filter with specified handle was concurrently
> @@ -2378,7 +2378,7 @@ static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
>   	rcu_read_lock();
>   	idr_for_each_entry_continue_ul(&head->handle_idr, f, tmp, id) {
>   		/* don't return filters that are being deleted */
> -		if (!refcount_inc_not_zero(&f->refcnt))
> +		if (!f || !refcount_inc_not_zero(&f->refcnt))
>   			continue;
>   		rcu_read_unlock();
>
Paul Blakey April 27, 2023, 5:53 a.m. UTC | #3
On 26/04/2023 15:14, Vlad Buslov wrote:
> The cited commit moved idr initialization too early in fl_change() which
> allows concurrent users to access the filter that is still being
> initialized and is in inconsistent state, which, in turn, can cause NULL
> pointer dereference [0]. Since there is no obvious way to fix the ordering
> without reverting the whole cited commit, alternative approach taken to
> first insert NULL pointer into idr in order to allocate the handle but
> still cause fl_get() to return NULL and prevent concurrent users from
> seeing the filter while providing miss-to-action infrastructure with valid
> handle id early in fl_change().
> 
> [  152.434728] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN
> [  152.436163] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> [  152.437269] CPU: 4 PID: 3877 Comm: tc Not tainted 6.3.0-rc4+ #5
> [  152.438110] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [  152.439644] RIP: 0010:fl_dump_key+0x8b/0x1d10 [cls_flower]
> [  152.440461] Code: 01 f2 02 f2 c7 40 08 04 f2 04 f2 c7 40 0c 04 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 84 24 00 01 00 00 48 89 c8 48 c1 e8 03 <0f> b6 04 10 84 c0 74 08 3c 03 0f 8e 98 19 00 00 8b 13 85 d2 74 57
> [  152.442885] RSP: 0018:ffff88817a28f158 EFLAGS: 00010246
> [  152.443851] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  152.444826] RDX: dffffc0000000000 RSI: ffffffff8500ae80 RDI: ffff88810a987900
> [  152.445791] RBP: ffff888179d88240 R08: ffff888179d8845c R09: ffff888179d88240
> [  152.446780] R10: ffffed102f451e48 R11: 00000000fffffff2 R12: ffff88810a987900
> [  152.447741] R13: ffffffff8500ae80 R14: ffff88810a987900 R15: ffff888149b3c738
> [  152.448756] FS:  00007f5eb2a34800(0000) GS:ffff88881ec00000(0000) knlGS:0000000000000000
> [  152.449888] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  152.450685] CR2: 000000000046ad19 CR3: 000000010b0bd006 CR4: 0000000000370ea0
> [  152.451641] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  152.452628] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  152.453588] Call Trace:
> [  152.454032]  <TASK>
> [  152.454447]  ? netlink_sendmsg+0x7a1/0xcb0
> [  152.455109]  ? sock_sendmsg+0xc5/0x190
> [  152.455689]  ? ____sys_sendmsg+0x535/0x6b0
> [  152.456320]  ? ___sys_sendmsg+0xeb/0x170
> [  152.456916]  ? do_syscall_64+0x3d/0x90
> [  152.457529]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.458321]  ? ___sys_sendmsg+0xeb/0x170
> [  152.458958]  ? __sys_sendmsg+0xb5/0x140
> [  152.459564]  ? do_syscall_64+0x3d/0x90
> [  152.460122]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.460852]  ? fl_dump_key_options.part.0+0xea0/0xea0 [cls_flower]
> [  152.461710]  ? _raw_spin_lock+0x7a/0xd0
> [  152.462299]  ? _raw_read_lock_irq+0x30/0x30
> [  152.462924]  ? nla_put+0x15e/0x1c0
> [  152.463480]  fl_dump+0x228/0x650 [cls_flower]
> [  152.464112]  ? fl_tmplt_dump+0x210/0x210 [cls_flower]
> [  152.464854]  ? __kmem_cache_alloc_node+0x1a7/0x330
> [  152.465592]  ? nla_put+0x15e/0x1c0
> [  152.466160]  tcf_fill_node+0x515/0x9a0
> [  152.466766]  ? tc_setup_offload_action+0xf0/0xf0
> [  152.467463]  ? __alloc_skb+0x13c/0x2a0
> [  152.468067]  ? __build_skb_around+0x330/0x330
> [  152.468814]  ? fl_get+0x107/0x1a0 [cls_flower]
> [  152.469503]  tc_del_tfilter+0x718/0x1330
> [  152.470115]  ? is_bpf_text_address+0xa/0x20
> [  152.470765]  ? tc_ctl_chain+0xee0/0xee0
> [  152.471335]  ? __kernel_text_address+0xe/0x30
> [  152.471948]  ? unwind_get_return_address+0x56/0xa0
> [  152.472639]  ? __thaw_task+0x150/0x150
> [  152.473218]  ? arch_stack_walk+0x98/0xf0
> [  152.473839]  ? __stack_depot_save+0x35/0x4c0
> [  152.474501]  ? stack_trace_save+0x91/0xc0
> [  152.475119]  ? security_capable+0x51/0x90
> [  152.475741]  rtnetlink_rcv_msg+0x2c1/0x9d0
> [  152.476387]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.477042]  ? __sys_sendmsg+0xb5/0x140
> [  152.477664]  ? do_syscall_64+0x3d/0x90
> [  152.478255]  ? entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.479010]  ? __stack_depot_save+0x35/0x4c0
> [  152.479679]  ? __stack_depot_save+0x35/0x4c0
> [  152.480346]  netlink_rcv_skb+0x12c/0x360
> [  152.480929]  ? rtnl_calcit.isra.0+0x2b0/0x2b0
> [  152.481517]  ? do_syscall_64+0x3d/0x90
> [  152.482061]  ? netlink_ack+0x1550/0x1550
> [  152.482612]  ? rhashtable_walk_peek+0x170/0x170
> [  152.483262]  ? kmem_cache_alloc_node+0x1af/0x390
> [  152.483875]  ? _copy_from_iter+0x3d6/0xc70
> [  152.484528]  netlink_unicast+0x553/0x790
> [  152.485168]  ? netlink_attachskb+0x6a0/0x6a0
> [  152.485848]  ? unwind_next_frame+0x11cc/0x1a10
> [  152.486538]  ? arch_stack_walk+0x61/0xf0
> [  152.487169]  netlink_sendmsg+0x7a1/0xcb0
> [  152.487799]  ? netlink_unicast+0x790/0x790
> [  152.488355]  ? iovec_from_user.part.0+0x4d/0x220
> [  152.488990]  ? _raw_spin_lock+0x7a/0xd0
> [  152.489598]  ? netlink_unicast+0x790/0x790
> [  152.490236]  sock_sendmsg+0xc5/0x190
> [  152.490796]  ____sys_sendmsg+0x535/0x6b0
> [  152.491394]  ? import_iovec+0x7/0x10
> [  152.491964]  ? kernel_sendmsg+0x30/0x30
> [  152.492561]  ? __copy_msghdr+0x3c0/0x3c0
> [  152.493160]  ? do_syscall_64+0x3d/0x90
> [  152.493706]  ___sys_sendmsg+0xeb/0x170
> [  152.494283]  ? may_open_dev+0xd0/0xd0
> [  152.494858]  ? copy_msghdr_from_user+0x110/0x110
> [  152.495541]  ? __handle_mm_fault+0x2678/0x4ad0
> [  152.496205]  ? copy_page_range+0x2360/0x2360
> [  152.496862]  ? __fget_light+0x57/0x520
> [  152.497449]  ? mas_find+0x1c0/0x1c0
> [  152.498026]  ? sockfd_lookup_light+0x1a/0x140
> [  152.498703]  __sys_sendmsg+0xb5/0x140
> [  152.499306]  ? __sys_sendmsg_sock+0x20/0x20
> [  152.499951]  ? do_user_addr_fault+0x369/0xd80
> [  152.500595]  do_syscall_64+0x3d/0x90
> [  152.501185]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [  152.501917] RIP: 0033:0x7f5eb294f887
> [  152.502494] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [  152.505008] RSP: 002b:00007ffd2c708f78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [  152.506152] RAX: ffffffffffffffda RBX: 00000000642d9472 RCX: 00007f5eb294f887
> [  152.507134] RDX: 0000000000000000 RSI: 00007ffd2c708fe0 RDI: 0000000000000003
> [  152.508113] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [  152.509119] R10: 00007f5eb2808708 R11: 0000000000000246 R12: 0000000000000001
> [  152.510068] R13: 0000000000000000 R14: 00007ffd2c70d1b8 R15: 0000000000485400
> [  152.511031]  </TASK>
> [  152.511444] Modules linked in: cls_flower sch_ingress openvswitch nsh mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter overlay zram zsmalloc fuse [last unloaded: mlx5_core]
> [  152.515720] ---[ end trace 0000000000000000 ]---
> 
> Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>

Reviewed-by: Paul Blakey <paulb@nvidia.com>
diff mbox series

Patch

diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 475fe222a855..1844545bef37 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -2210,10 +2210,10 @@  static int fl_change(struct net *net, struct sk_buff *in_skb,
 		spin_lock(&tp->lock);
 		if (!handle) {
 			handle = 1;
-			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
+			err = idr_alloc_u32(&head->handle_idr, NULL, &handle,
 					    INT_MAX, GFP_ATOMIC);
 		} else {
-			err = idr_alloc_u32(&head->handle_idr, fnew, &handle,
+			err = idr_alloc_u32(&head->handle_idr, NULL, &handle,
 					    handle, GFP_ATOMIC);
 
 			/* Filter with specified handle was concurrently
@@ -2378,7 +2378,7 @@  static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
 	rcu_read_lock();
 	idr_for_each_entry_continue_ul(&head->handle_idr, f, tmp, id) {
 		/* don't return filters that are being deleted */
-		if (!refcount_inc_not_zero(&f->refcnt))
+		if (!f || !refcount_inc_not_zero(&f->refcnt))
 			continue;
 		rcu_read_unlock();