Message ID | 20240816114813.326645-4-razor@blackwall.org (mailing list archive) |
---|---|
State | Accepted |
Commit | f8cde9805981c50d0c029063dc7d82821806fc44 |
Delegated to: | Netdev Maintainers |
Series | bonding: fix xfrm offload bugs |

On Fri, Aug 16, 2024 at 02:48:12PM +0300, Nikolay Aleksandrov wrote:
> We shouldn't set real_dev to NULL because packets can be in transit and
> xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
> real_dev is set.
>
> Example trace:
> kernel: BUG: unable to handle page fault for address: 0000000000001030
> kernel: bond0: (slave eni0np1): making interface the new active one
> kernel: #PF: supervisor write access in kernel mode
> kernel: #PF: error_code(0x0002) - not-present page
> kernel: PGD 0 P4D 0
> kernel: Oops: 0002 [#1] PREEMPT SMP
> kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
> kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
> kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
> kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA

I saw the errors are during bond_ipsec_add_sa_all, which also sets
ipsec->xs->xso.real_dev = NULL. Should we fix it there?

Thanks
Hangbin

On 19/08/2024 05:54, Hangbin Liu wrote:
> On Fri, Aug 16, 2024 at 02:48:12PM +0300, Nikolay Aleksandrov wrote:
>> We shouldn't set real_dev to NULL because packets can be in transit and
>> xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
>> real_dev is set.
>>
>> Example trace:
>> kernel: BUG: unable to handle page fault for address: 0000000000001030
>> kernel: bond0: (slave eni0np1): making interface the new active one
>> kernel: #PF: supervisor write access in kernel mode
>> kernel: #PF: error_code(0x0002) - not-present page
>> kernel: PGD 0 P4D 0
>> kernel: Oops: 0002 [#1] PREEMPT SMP
>> kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
>> kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
>> kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
>> kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
>
> I saw the errors are during bond_ipsec_add_sa_all, which also
> set ipsec->xs->xso.real_dev = NULL. Should we fix it there?
>
> Thanks
> Hangbin

Correct, I saw it too, but I intentionally didn't remove it. I know it can
lead to a similar error, but the fix is more complicated. I don't believe it's
correct to leave real_dev set if the SA add failed, so we need to think about
a different way to sync it. To be fair, in real life it would be more difficult
to hit, because the device must be in a state where the SA add fails even
though it supports xfrm offload. The problem is that real_dev must be set
before attempting the SA add in the first place.

On 19/08/2024 10:34, Nikolay Aleksandrov wrote:
> On 19/08/2024 05:54, Hangbin Liu wrote:
>> On Fri, Aug 16, 2024 at 02:48:12PM +0300, Nikolay Aleksandrov wrote:
>>> We shouldn't set real_dev to NULL because packets can be in transit and
>>> xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
>>> real_dev is set.
>>>
>>> Example trace:
>>> kernel: BUG: unable to handle page fault for address: 0000000000001030
>>> kernel: bond0: (slave eni0np1): making interface the new active one
>>> kernel: #PF: supervisor write access in kernel mode
>>> kernel: #PF: error_code(0x0002) - not-present page
>>> kernel: PGD 0 P4D 0
>>> kernel: Oops: 0002 [#1] PREEMPT SMP
>>> kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
>>> kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
>>> kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
>>> kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
>>
>> I saw the errors are during bond_ipsec_add_sa_all, which also
>> set ipsec->xs->xso.real_dev = NULL. Should we fix it there?
>>
>> Thanks
>> Hangbin
>
> Correct, I saw it too but I didn't remove it on purpose. I know it can lead to a
> similar error, but the fix is more complicated. I don't believe it's correct to
> set real_dev if the SA add failed, so we need to think about a different way
> to sync it. To be fair in real life it would be more difficult to hit it because
> the device must be in a state where the SA add fails, although it supports
> xfrm offload. The problem is that real_dev must be set before attempting the SA
> add in the first place.
>

Just FYI, I do have an idea: an additional bit that is set on successful ops,
combined with a call_rcu to wait for a grace period on error. I'll test it this
week and send a patch if it's good.

On Mon, Aug 19, 2024 at 10:34:16AM +0300, Nikolay Aleksandrov wrote:
> On 19/08/2024 05:54, Hangbin Liu wrote:
> > On Fri, Aug 16, 2024 at 02:48:12PM +0300, Nikolay Aleksandrov wrote:
> >> We shouldn't set real_dev to NULL because packets can be in transit and
> >> xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
> >> real_dev is set.
> >>
> >> Example trace:
> >> kernel: BUG: unable to handle page fault for address: 0000000000001030
> >> kernel: bond0: (slave eni0np1): making interface the new active one
> >> kernel: #PF: supervisor write access in kernel mode
> >> kernel: #PF: error_code(0x0002) - not-present page
> >> kernel: PGD 0 P4D 0
> >> kernel: Oops: 0002 [#1] PREEMPT SMP
> >> kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
> >> kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
> >> kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
> >> kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
> >
> > I saw the errors are during bond_ipsec_add_sa_all, which also
> > set ipsec->xs->xso.real_dev = NULL. Should we fix it there?
> >
> > Thanks
> > Hangbin
>
> Correct, I saw it too but I didn't remove it on purpose. I know it can lead to a
> similar error, but the fix is more complicated. I don't believe it's correct to
> set real_dev if the SA add failed, so we need to think about a different way
> to sync it. To be fair in real life it would be more difficult to hit it because
> the device must be in a state where the SA add fails, although it supports
> xfrm offload. The problem is that real_dev must be set before attempting the SA
> add in the first place.

Got it, so this time we only fix the delete path.

Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 65ddb71eebcd..f74bacf071fc 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -582,7 +582,6 @@ static void bond_ipsec_del_sa_all(struct bonding *bond)
 		} else {
 			slave->dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs);
 		}
-		ipsec->xs->xso.real_dev = NULL;
 	}
 	spin_unlock_bh(&bond->ipsec_lock);
 	rcu_read_unlock();

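
To illustrate why the removed assignment matters, here is a minimal user-space sketch in C. It is not kernel code: `model_dev`, `model_state`, and every `model_*` function are hypothetical stand-ins for the slave device, `struct xfrm_state`, a driver's `xdo_dev_offload_ok()` callback, and the two versions of the delete path. The sketch only models the state of the pointer, not the actual concurrency between the transmit path and the bonding code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the slave device and its private counters. */
struct model_dev {
    unsigned long ok_count;
};

/* Hypothetical stand-in for struct xfrm_state; real_dev mirrors xs->xso.real_dev. */
struct model_state {
    struct model_dev *real_dev;
};

/*
 * Transmit-side check, modeled on a driver's xdo_dev_offload_ok() callback.
 * Real callbacks dereference real_dev unconditionally; the NULL test here
 * exists only so the model can report the fault instead of crashing.
 */
static bool model_offload_ok(struct model_state *xs, bool *would_fault)
{
    struct model_dev *dev = xs->real_dev;

    *would_fault = (dev == NULL);
    if (*would_fault)
        return false;

    dev->ok_count++;
    return true;
}

/* Delete path before the patch: real_dev is cleared after the SA delete. */
static void model_del_sa_before_patch(struct model_state *xs)
{
    xs->real_dev = NULL;
}

/* Delete path after the patch: real_dev is left untouched. */
static void model_del_sa_after_patch(struct model_state *xs)
{
    (void)xs;
}
```

Calling `model_offload_ok()` after `model_del_sa_after_patch()` still finds a device and increments its counter, while calling it after `model_del_sa_before_patch()` reports the condition that, in the kernel, shows up as a page fault inside the driver callback.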
We shouldn't set real_dev to NULL because packets can be in transit and
xfrm might call xdo_dev_offload_ok() in parallel. All callbacks assume
real_dev is set.

Example trace:
kernel: BUG: unable to handle page fault for address: 0000000000001030
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: #PF: supervisor write access in kernel mode
kernel: #PF: error_code(0x0002) - not-present page
kernel: PGD 0 P4D 0
kernel: Oops: 0002 [#1] PREEMPT SMP
kernel: CPU: 4 PID: 2237 Comm: ping Not tainted 6.7.7+ #12
kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
kernel: RIP: 0010:nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: Code: e0 0f 0b 48 83 7f 38 00 74 de 0f 0b 48 8b 47 08 48 8b 37 48 8b 78 40 e9 b2 e5 9a d7 66 90 0f 1f 44 00 00 48 8b 86 80 02 00 00 <83> 80 30 10 00 00 01 b8 01 00 00 00 c3 0f 1f 80 00 00 00 00 0f 1f
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: RSP: 0018:ffffabde81553b98 EFLAGS: 00010246
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel:
kernel: RAX: 0000000000000000 RBX: ffff9eb404e74900 RCX: ffff9eb403d97c60
kernel: RDX: ffffffffc090de10 RSI: ffff9eb404e74900 RDI: ffff9eb3c5de9e00
kernel: RBP: ffff9eb3c0a42000 R08: 0000000000000010 R09: 0000000000000014
kernel: R10: 7974203030303030 R11: 3030303030303030 R12: 0000000000000000
kernel: R13: ffff9eb3c5de9e00 R14: ffffabde81553cc8 R15: ffff9eb404c53000
kernel: FS: 00007f2a77a3ad00(0000) GS:ffff9eb43bd00000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 0000000000001030 CR3: 00000001122ab000 CR4: 0000000000350ef0
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: Call Trace:
kernel: <TASK>
kernel: ? __die+0x1f/0x60
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ? page_fault_oops+0x142/0x4c0
kernel: ? do_user_addr_fault+0x65/0x670
kernel: ? kvm_read_and_reset_apf_flags+0x3b/0x50
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: ? exc_page_fault+0x7b/0x180
kernel: ? asm_exc_page_fault+0x22/0x30
kernel: ? nsim_bpf_uninit+0x50/0x50 [netdevsim]
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ? nsim_ipsec_offload_ok+0xc/0x20 [netdevsim]
kernel: bond0: (slave eni0np1): making interface the new active one
kernel: bond_ipsec_offload_ok+0x7b/0x90 [bonding]
kernel: xfrm_output+0x61/0x3b0
kernel: bond0: (slave eni0np1): bond_ipsec_add_sa_all: failed to add SA
kernel: ip_push_pending_frames+0x56/0x80

Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
---
drivers/net/bonding/bond_main.c | 1 -
1 file changed, 1 deletion(-)
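
As a cross-check of the trace, the bytes starting at the `<83>` marker on the `Code:` line decode to an x86-64 `add` of an 8-bit immediate to a 32-bit memory operand at `[rax + disp32]`. With RAX equal to 0 in the register dump, the effective address is just the displacement, 0x1030, which matches CR2. That is consistent with a NULL real_dev being offset into the driver's private data, although the exact structure layout behind the 0x1030 offset is an assumption here. The sketch below redoes that arithmetic in C; the array and helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Faulting instruction bytes, copied from the "Code:" line of the trace
 * starting at the <83> marker: opcode, ModRM, disp32, imm8. */
static const uint8_t fault_insn[] = { 0x83, 0x80, 0x30, 0x10, 0x00, 0x00, 0x01 };

/* ModRM fields: mod selects the addressing form, reg the opcode extension,
 * rm the base register. */
static unsigned modrm_mod(uint8_t modrm) { return modrm >> 6; }
static unsigned modrm_reg(uint8_t modrm) { return (modrm >> 3) & 7; }
static unsigned modrm_rm(uint8_t modrm)  { return modrm & 7; }

/* Little-endian 32-bit displacement that follows the ModRM byte. */
static uint32_t insn_disp32(const uint8_t *insn)
{
    return (uint32_t)insn[2] |
           (uint32_t)insn[3] << 8 |
           (uint32_t)insn[4] << 16 |
           (uint32_t)insn[5] << 24;
}

/* Effective address of "add dword ptr [rax + disp32], imm8" for a given RAX. */
static uint64_t fault_address(uint64_t rax, const uint8_t *insn)
{
    return rax + insn_disp32(insn);
}
```

Here mod = 2 means a 32-bit displacement follows, reg = 0 selects `add` within opcode group 0x83, and rm = 0 selects RAX as the base register, so `fault_address(0, fault_insn)` evaluates to 0x1030.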