[v1,net-next,0/7] nexthop: Convert RTM_{NEW,DEL}NEXTHOP to per-netns RTNL.

Message ID: 20250318233240.53946-1-kuniyu@amazon.com

Message

Kuniyuki Iwashima March 18, 2025, 11:31 p.m. UTC
Patches 1 - 5 move some validation for RTM_NEWNEXTHOP so that it can be
done without RTNL.

Patches 6 & 7 convert RTM_NEWNEXTHOP and RTM_DELNEXTHOP to per-netns RTNL.
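
As a rough user-space illustration of the split described above (not the
kernel code), the sketch below runs message-only checks before taking any
lock and keeps the checks that need namespace state under a per-namespace
lock. The Namespace class, the message fields and the specific checks are
all hypothetical stand-ins.

```python
import threading


class Namespace:
    """Hypothetical stand-in for a network namespace with its own lock."""

    def __init__(self, devices):
        self.lock = threading.Lock()   # plays the role of the per-netns lock
        self.devices = set(devices)    # state that may change concurrently
        self.nexthops = {}             # nexthop id -> output device name


def validate_unlocked(msg):
    """Checks that depend only on the message, so no lock is needed."""
    if not isinstance(msg.get("id"), int) or msg["id"] <= 0:
        raise ValueError("invalid nexthop id")
    if "oif" not in msg:
        raise ValueError("missing output device")


def new_nexthop(ns, msg):
    # Step 1: message-only validation, done before locking.
    validate_unlocked(msg)
    # Step 2: checks that read namespace state, done under that
    # namespace's lock only.
    with ns.lock:
        if msg["oif"] not in ns.devices:
            raise LookupError("no such device")
        if msg["id"] in ns.nexthops and not msg.get("replace", False):
            raise FileExistsError("nexthop id already exists")
        ns.nexthops[msg["id"]] = msg["oif"]
```

A malformed request is rejected without ever touching the lock, and
requests aimed at different Namespace objects do not contend with each
other.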

Note that RTM_GETNEXTHOP and RTM_GETNEXTHOPBUCKET are not touched in
this series.

rtm_get_nexthop() can be easily converted to RCU, but rtm_dump_nexthop()
needs more work due to the left-to-right rbtree walk, which looks prone
to node deletion and tree rotation without a retry mechanism.
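
A user-space analogy for the concern above, offered only as a sketch: a
sorted Python list stands in for the rbtree, and a callback simulates a
deletion that lands in the middle of the walk. A walker that remembers a
position can skip an entry, while one that remembers the last key and
searches again for its successor does not. The function names are
hypothetical, and this is not how the series or the kernel handles the
dump.

```python
import bisect


def walk_by_index(ids, on_visit):
    """Walk remembering a position; a deletion mid-walk can skip entries."""
    seen = []
    i = 0
    while i < len(ids):
        current = ids[i]
        seen.append(current)
        on_visit(ids, current)   # may mutate ids, like a concurrent delete
        i += 1
    return seen


def walk_by_key(ids, on_visit):
    """Walk remembering the last key; the successor is re-searched each step."""
    seen = []
    last = None
    while True:
        pos = 0 if last is None else bisect.bisect_right(ids, last)
        if pos >= len(ids):
            break
        last = ids[pos]
        seen.append(last)
        on_visit(ids, last)
    return seen


def delete_1_while_at_2(ids, current):
    # Simulates nexthop id 1 being deleted while the walker sits on id 2.
    if current == 2 and 1 in ids:
        ids.remove(1)
```

With ids 1 to 4, the position-based walk misses id 3 after the deletion,
whereas the key-based walk still visits every remaining id.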


Kuniyuki Iwashima (7):
  nexthop: Move nlmsg_parse() in rtm_to_nh_config() to
    rtm_new_nexthop().
  nexthop: Split nh_check_attr_group().
  nexthop: Move NHA_OIF validation to rtm_to_nh_config_rtnl().
  nexthop: Check NLM_F_REPLACE and NHA_ID in rtm_new_nexthop().
  nexthop: Remove redundant group len check in nexthop_create_group().
  nexthop: Convert RTM_NEWNEXTHOP to per-netns RTNL.
  nexthop: Convert RTM_DELNEXTHOP to per-netns RTNL.

 net/ipv4/nexthop.c | 183 +++++++++++++++++++++++++++------------------
 1 file changed, 112 insertions(+), 71 deletions(-)

Comments

Paolo Abeni March 19, 2025, 7:57 a.m. UTC | #1
Hi,

On 3/19/25 12:31 AM, Kuniyuki Iwashima wrote:
> Patch 1 - 5 move some validation for RTM_NEWNEXTHOP so that it can be
> done without RTNL.
> 
> Patches 6 & 7 convert RTM_NEWNEXTHOP and RTM_DELNEXTHOP to per-netns RTNL.
> 
> Note that RTM_GETNEXTHOP and RTM_GETNEXTHOPBUCKET are not touched in
> this series.
> 
> rtm_get_nexthop() can be easily converted to RCU, but rtm_dump_nexthop()
> needs more work due to the left-to-right rbtree walk, which looks prone
> to node deletion and tree rotation without a retry mechanism.
> 
> 
> Kuniyuki Iwashima (7):
>   nexthop: Move nlmsg_parse() in rtm_to_nh_config() to
>     rtm_new_nexthop().
>   nexthop: Split nh_check_attr_group().
>   nexthop: Move NHA_OIF validation to rtm_to_nh_config_rtnl().
>   nexthop: Check NLM_F_REPLACE and NHA_ID in rtm_new_nexthop().
>   nexthop: Remove redundant group len check in nexthop_create_group().
>   nexthop: Convert RTM_NEWNEXTHOP to per-netns RTNL.
>   nexthop: Convert RTM_DELNEXTHOP to per-netns RTNL.
> 
>  net/ipv4/nexthop.c | 183 +++++++++++++++++++++++++++------------------
>  1 file changed, 112 insertions(+), 71 deletions(-)

This series is apparently causing a NULL pointer dereference in the
nexthop.sh netdevsim selftests. Unfortunately, due to a transient nipa
infra outage, a lot of stuff landed in the same batch, so I'm not 110%
sure this series is the real culprit, but it looks like a reasonable
suspect.

Kuniyuki, could you please have a look?

---
[    1.653896] BUG: kernel NULL pointer dereference, address: 0000000000000068
[    1.653963] #PF: supervisor read access in kernel mode
[    1.654003] #PF: error_code(0x0000) - not-present page
[    1.654037] PGD 7828067 P4D 7828067 PUD 782a067 PMD 0
[    1.654077] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[    1.654119] CPU: 0 UID: 0 PID: 303 Comm: ip Not tainted 6.14.0-rc6-virtme #1
[    1.654176] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    1.654219] RIP: 0010:rtm_new_nexthop+0x645/0x2260
[    1.654263] Code: 70 02 00 00 48 85 db 75 0e eb 1f 48 83 c3 10 48 8b 1b 48 85 db 74 13 3b 43 60 72 ef 76 0c 48 83 c3 08 48 8b 1b 48 85 db 75 ed <8b> 53 68 4c 8d 63 68 85 d2 0f 84 f1 02 00 00 8d 4a 01 89 d0 f0 41
[    1.654390] RSP: 0018:ffffae348037b860 EFLAGS: 00010246
[    1.654430] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
[    1.654482] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff992d066b8000
[    1.654534] RBP: ffffae348037bab0 R08: ffff992d012d2fa8 R09: ffff992d055d1780
[    1.654587] R10: ffffae348037b860 R11: ffff992d055d17c8 R12: ffffae348037bb60
[    1.654638] R13: 0000000000000000 R14: 0000000000000001 R15: ffff992d055d17c8
[    1.654692] FS:  00007f8b6fb0c800(0000) GS:ffff992d3ec00000(0000) knlGS:0000000000000000
[    1.654749] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.654791] CR2: 0000000000000068 CR3: 00000000067ae005 CR4: 0000000000772ef0
[    1.654844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.654900] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.654957] PKRU: 55555554
[    1.654974] Call Trace:
[    1.654993]  <TASK>
[    1.655015]  ? __die+0x24/0x70
[    1.655049]  ? page_fault_oops+0x15a/0x450
[    1.655080]  ? mas_topiary_replace+0x9ba/0xca0
[    1.655121]  ? exc_page_fault+0x69/0x150
[    1.655162]  ? asm_exc_page_fault+0x26/0x30
[    1.655202]  ? rtm_new_nexthop+0x645/0x2260
[    1.655239]  ? virtqueue_notify+0x1c/0x40
[    1.655269]  ? virtio_fs_enqueue_req+0x50c/0x570
[    1.655311]  ? __pfx_rtm_new_nexthop+0x10/0x10
[    1.655351]  ? rtnetlink_rcv_msg+0x361/0x410
[    1.655391]  rtnetlink_rcv_msg+0x361/0x410
[    1.655417]  ? __remove_hrtimer+0x39/0x90
[    1.655448]  ? sysvec_apic_timer_interrupt+0xf/0x90
[    1.655494]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[    1.655528]  netlink_rcv_skb+0x58/0x110
[    1.655560]  netlink_unicast+0x247/0x370
[    1.655592]  netlink_sendmsg+0x1bf/0x3e0
[    1.655624]  ____sys_sendmsg+0x2bc/0x320
[    1.655656]  ? copy_msghdr_from_user+0x6d/0xa0
[    1.655696]  ___sys_sendmsg+0x88/0xd0
[    1.655729]  __sys_sendmsg+0x6c/0xc0
[    1.655760]  do_syscall_64+0x9e/0x1a0
[    1.655793]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[    1.655834] RIP: 0033:0x7f8b6fd189a7
[    1.655864] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[    1.655986] RSP: 002b:00007ffebd919418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[    1.656043] RAX: ffffffffffffffda RBX: 00007ffebd919f80 RCX: 00007f8b6fd189a7
[    1.656099] RDX: 0000000000000000 RSI: 00007ffebd919480 RDI: 0000000000000005
[    1.656151] RBP: 00007ffebd919940 R08: 0000000006ba3910 R09: 0000000000000000
[    1.656202] R10: 00007f8b6fbd1708 R11: 0000000000000246 R12: 0000000006ba3918
[    1.656254] R13: 0000000067da36b9 R14: 0000000000498600 R15: 0000000006ba3910
[    1.656307]  </TASK>
[    1.656324] Modules linked in: netdevsim
[    1.656356] CR2: 0000000000000068
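
For a first pass over a splat like the one above, a small triage helper
can pull out the faulting symbol, the fault address and the registers
that hold zero. The sketch below is only that: the regular expressions
are assumptions about this log's layout, the sample embeds three lines
from the log with timestamps dropped, and the 4096-byte threshold is just
a rule of thumb for "NULL plus a member offset". In this log, RBX is 0
and CR2 is 0x68, which would be consistent with a read at offset 0x68
through a NULL pointer.

```python
import re

# Three lines copied from the oops above, timestamps removed.
OOPS = """\
RIP: 0010:rtm_new_nexthop+0x645/0x2260
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000000
CR2: 0000000000000068 CR3: 00000000067ae005 CR4: 0000000000772ef0
"""


def triage(text):
    """Extract the faulting symbol, fault address and zero-valued registers."""
    sym, off, _size = re.search(
        r"RIP: \w+:(\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+)", text).groups()
    fault_addr = int(re.search(r"CR2: ([0-9a-f]{16})", text).group(1), 16)
    # General-purpose registers only; \b keeps "CR2" from matching as "R2".
    regs = {name: int(val, 16)
            for name, val in re.findall(
                r"\b(R[A-Z0-9]{1,2}): ([0-9a-f]{16})", text)}
    return {
        "symbol": sym,
        "offset": int(off, 16),
        "fault_addr": fault_addr,
        "null_regs": sorted(r for r, v in regs.items() if v == 0),
        # Heuristic: an address inside the first page usually means a
        # member access through a NULL base pointer.
        "looks_like_null_member": fault_addr < 4096,
    }


report = triage(OOPS)
```

The symbol and offset can then be fed to the kernel's
scripts/faddr2line together with a vmlinux that has debug info to find
the source line.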