Message ID | 20241015033632.12120-1-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | [net] bpf: xdp: fallback to SKB mode if DRV flag is absent. |
On 10/15/24 5:36 AM, Hangbin Liu wrote:
> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
> implements the .ndo_bpf function. However, for drivers like bonding, which
> only support native XDP for specific modes, this may result in an
> "unsupported" response.
>
> In such cases, let's fall back to SKB mode if the user did not explicitly
> request DRV mode.
>
> Fixes: c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags specified")
> Reported-by: Liang Li <liali@redhat.com>
> Closes: https://issues.redhat.com/browse/RHEL-62339

nit: The link is not accessible to the public.

Also, this breaks BPF CI with regards to existing bonding selftest :

https://github.com/kernel-patches/bpf/actions/runs/11340153361/job/31536275257

Given this issue is related to only bonding driver, could this be fixed
there instead?

> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> ---
>  net/core/dev.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ea5fbcd133ae..e32069d81cd7 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -9579,6 +9579,7 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
>
>  	/* don't call drivers if the effective program didn't change */
>  	if (new_prog != cur_prog) {
> +reinstall:
>  		bpf_op = dev_xdp_bpf_op(dev, mode);
>  		if (!bpf_op) {
>  			NL_SET_ERR_MSG(extack, "Underlying driver does not support XDP in native mode");
> @@ -9586,8 +9587,17 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
>  		}
>
>  		err = dev_xdp_install(dev, mode, bpf_op, extack, flags, new_prog);
> -		if (err)
> +		if (err) {
> +			/* The driver returns not supported even .ndo_bpf
> +			 * implemented, fall back to SKB mode.
> +			 */
> +			if (err == -EOPNOTSUPP && mode == XDP_MODE_DRV &&
> +			    !(flags & XDP_FLAGS_DRV_MODE)) {
> +				mode = XDP_MODE_SKB;
> +				goto reinstall;
> +			}
>  			return err;
> +		}
>  	}
>
>  	if (link)
On 15/10/2024 11:17, Daniel Borkmann wrote:
> On 10/15/24 5:36 AM, Hangbin Liu wrote:
>> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
>> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
>> implements the .ndo_bpf function. However, for drivers like bonding, which
>> only support native XDP for specific modes, this may result in an
>> "unsupported" response.
>>
>> In such cases, let's fall back to SKB mode if the user did not explicitly
>> request DRV mode.
>>

So behaviour changed once, now it's changing again.. IMO it's better to explicitly
error out and let the user decide how to resolve the situation. The above commit
is 4 years old, surely everyone is used to the behaviour by now. If you insist
to do auto-fallback, then at least I'd go with Daniel's suggestion and do it
in the bonding device. Maybe it can return -EFALLBACK, or some other way to
signal the caller and change the mode, but you assume that's what the user
would want, maybe it is and maybe it's not - that is why I'd prefer the
explicit error so conscious action can be taken to resolve the situation.

That being said, I don't have a strong preference, just my few cents. :)

>> Fixes: c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags specified")
>> Reported-by: Liang Li <liali@redhat.com>
>> Closes: https://issues.redhat.com/browse/RHEL-62339
>
> nit: The link is not accessible to the public.
>
> Also, this breaks BPF CI with regards to existing bonding selftest :
>
> https://github.com/kernel-patches/bpf/actions/runs/11340153361/job/31536275257
>
> Given this issue is related to only bonding driver, could this be fixed
> there instead?
>
>> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>> ---
>>  net/core/dev.c | 12 +++++++++++-
>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index ea5fbcd133ae..e32069d81cd7 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -9579,6 +9579,7 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
>>  	/* don't call drivers if the effective program didn't change */
>>  	if (new_prog != cur_prog) {
>> +reinstall:
>>  		bpf_op = dev_xdp_bpf_op(dev, mode);
>>  		if (!bpf_op) {
>>  			NL_SET_ERR_MSG(extack, "Underlying driver does not support XDP in native mode");
>> @@ -9586,8 +9587,17 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
>>  		}
>>  		err = dev_xdp_install(dev, mode, bpf_op, extack, flags, new_prog);
>> -		if (err)
>> +		if (err) {
>> +			/* The driver returns not supported even .ndo_bpf
>> +			 * implemented, fall back to SKB mode.
>> +			 */
>> +			if (err == -EOPNOTSUPP && mode == XDP_MODE_DRV &&
>> +			    !(flags & XDP_FLAGS_DRV_MODE)) {
>> +				mode = XDP_MODE_SKB;
>> +				goto reinstall;
>> +			}
>>  			return err;
>> +		}
>>  	}
>>  	if (link)
On Tue, Oct 15, 2024 at 12:53:08PM +0300, Nikolay Aleksandrov wrote:
> On 15/10/2024 11:17, Daniel Borkmann wrote:
> > On 10/15/24 5:36 AM, Hangbin Liu wrote:
> >> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
> >> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
> >> implements the .ndo_bpf function. However, for drivers like bonding, which
> >> only support native XDP for specific modes, this may result in an
> >> "unsupported" response.
> >>
> >> In such cases, let's fall back to SKB mode if the user did not explicitly
> >> request DRV mode.
> >>
>
> So behaviour changed once, now it's changing again..

This should not be a behaviour change, it just follows the fallback rules.

> IMO it's better to explicitly
> error out and let the user decide how to resolve the situation.

The user was confused and reported a bug because the command
`ip link set bond0 xdp obj xdp_dummy.o section xdp` failed with "Operation
not supported" instead of falling back to xdpgeneric mode.

> The above commit
> is 4 years old, surely everyone is used to the behaviour by now. If you insist
> to do auto-fallback, then at least I'd go with Daniel's suggestion and do it
> in the bonding device. Maybe it can return -EFALLBACK, or some other way to
> signal the caller and change the mode, but you assume that's what the user
> would want, maybe it is and maybe it's not - that is why I'd prefer the
> explicit error so conscious action can be taken to resolve the situation.
>
> That being said, I don't have a strong preference, just my few cents. :)
>
> >> Fixes: c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags specified")
> >> Reported-by: Liang Li <liali@redhat.com>
> >> Closes: https://issues.redhat.com/browse/RHEL-62339
> >
> > nit: The link is not accessible to the public.

I made it public now.

> >
> > Also, this breaks BPF CI with regards to existing bonding selftest :
> >
> > https://github.com/kernel-patches/bpf/actions/runs/11340153361/job/31536275257

The following should fix the selftest error.

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 18d1314fa797..0c380558a25d 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -5705,7 +5705,7 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 		if (dev_xdp_prog_count(slave_dev) > 0) {
 			SLAVE_NL_ERR(dev, slave_dev, extack,
 				     "Slave has XDP program loaded, please unload before enslaving");
-			err = -EOPNOTSUPP;
+			err = -EEXIST;
 			goto err;
 		}

But it doesn't solve the problem if the slave has an XDP program loaded while
using an unsupported bond mode, which will return too early.

If no other driver has this problem, I can try to fix this on the
bonding side.

Thanks
Hangbin
On 15/10/2024 13:38, Hangbin Liu wrote:
> On Tue, Oct 15, 2024 at 12:53:08PM +0300, Nikolay Aleksandrov wrote:
>> On 15/10/2024 11:17, Daniel Borkmann wrote:
>>> On 10/15/24 5:36 AM, Hangbin Liu wrote:
>>>> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
>>>> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
>>>> implements the .ndo_bpf function. However, for drivers like bonding, which
>>>> only support native XDP for specific modes, this may result in an
>>>> "unsupported" response.
>>>>
>>>> In such cases, let's fall back to SKB mode if the user did not explicitly
>>>> request DRV mode.
>>>>
>>
>> So behaviour changed once, now it's changing again..
>
> This should not be a behaviour change, it just follows the fallback rules.
>

hm, what fallback rules? I see dev_xdp_attach() exits on many errors
with proper codes and extack messages, am I missing something, where's the
fallback?

>> IMO it's better to explicitly
>> error out and let the user decide how to resolve the situation.
>
> The user was confused and reported a bug because the command
> `ip link set bond0 xdp obj xdp_dummy.o section xdp` failed with "Operation
> not supported" instead of falling back to xdpgeneric mode.
>

Where's the nice extack msg then? :)

We can tell them what's going on, maybe they'll want to change the bonding mode
and still use this mode rather than falling back to another mode silently.
That was my point, fallback is not the only solution.

>> The above commit
>> is 4 years old, surely everyone is used to the behaviour by now. If you insist
>> to do auto-fallback, then at least I'd go with Daniel's suggestion and do it
>> in the bonding device. Maybe it can return -EFALLBACK, or some other way to
>> signal the caller and change the mode, but you assume that's what the user
>> would want, maybe it is and maybe it's not - that is why I'd prefer the
>> explicit error so conscious action can be taken to resolve the situation.
>>
>> That being said, I don't have a strong preference, just my few cents. :)
>>
>>>> Fixes: c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags specified")
>>>> Reported-by: Liang Li <liali@redhat.com>
>>>> Closes: https://issues.redhat.com/browse/RHEL-62339
>>>
>>> nit: The link is not accessible to the public.
>
> I made it public now.
>
>>>
>>> Also, this breaks BPF CI with regards to existing bonding selftest :
>>>
>>> https://github.com/kernel-patches/bpf/actions/runs/11340153361/job/31536275257
>
> The following should fix the selftest error.
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 18d1314fa797..0c380558a25d 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -5705,7 +5705,7 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog,
>  		if (dev_xdp_prog_count(slave_dev) > 0) {
>  			SLAVE_NL_ERR(dev, slave_dev, extack,
>  				     "Slave has XDP program loaded, please unload before enslaving");
> -			err = -EOPNOTSUPP;
> +			err = -EEXIST;
>  			goto err;
>  		}
>
> But it doesn't solve the problem if the slave has an XDP program loaded while
> using an unsupported bond mode, which will return too early.
>
> If no other driver has this problem, I can try to fix this on the
> bonding side.
>
> Thanks
> Hangbin
On 15/10/2024 13:46, Nikolay Aleksandrov wrote:
> On 15/10/2024 13:38, Hangbin Liu wrote:
>> On Tue, Oct 15, 2024 at 12:53:08PM +0300, Nikolay Aleksandrov wrote:
>>> On 15/10/2024 11:17, Daniel Borkmann wrote:
>>>> On 10/15/24 5:36 AM, Hangbin Liu wrote:
>>>>> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
>>>>> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
>>>>> implements the .ndo_bpf function. However, for drivers like bonding, which
>>>>> only support native XDP for specific modes, this may result in an
>>>>> "unsupported" response.
>>>>>
>>>>> In such cases, let's fall back to SKB mode if the user did not explicitly
>>>>> request DRV mode.
>>>>>
>>>
>>> So behaviour changed once, now it's changing again..
>>
>> This should not be a behaviour change, it just follows the fallback rules.
>>
>
> hm, what fallback rules? I see dev_xdp_attach() exits on many errors
> with proper codes and extack messages, am I missing something, where's the
> fallback?
>

Oh, did you mean dev_xdp_mode()'s ndo_bpf check to decide which mode to use?
So you'd like to do that for the unsupported bond modes as well; in that case
I'd go with Daniel's suggestion and keep it in the bonding driver until
something else needs it.
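
For readers following along, the mode selection referred to above can be modelled in user space roughly as follows. This is only a sketch under assumptions: the `model_*` names, the flag values, and the `has_ndo_bpf` field are hypothetical stand-ins for illustration, not the kernel's definitions. It shows that, when no mode flag is given, the mere presence of a driver hook selects native mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the XDP mode flags (values are illustrative). */
enum model_flags {
	MODEL_FLAG_SKB = 1u << 1,
	MODEL_FLAG_DRV = 1u << 2,
	MODEL_FLAG_HW  = 1u << 3,
};

enum model_mode { MODEL_MODE_SKB, MODEL_MODE_DRV, MODEL_MODE_HW };

/* Hypothetical device: only records whether the driver provides a hook. */
struct model_dev {
	bool has_ndo_bpf;
};

/* An explicit flag wins; otherwise the hook's presence decides the mode. */
static enum model_mode model_xdp_mode(const struct model_dev *dev,
				      unsigned int flags)
{
	if (flags & MODEL_FLAG_HW)
		return MODEL_MODE_HW;
	if (flags & MODEL_FLAG_DRV)
		return MODEL_MODE_DRV;
	if (flags & MODEL_FLAG_SKB)
		return MODEL_MODE_SKB;
	/* No mode requested: native mode is chosen if a hook exists,
	 * whether or not the hook would later accept the program. */
	return dev->has_ndo_bpf ? MODEL_MODE_DRV : MODEL_MODE_SKB;
}
```

In this model a bonding-like device always has the hook, so an attach without flags lands in native mode even when the current bond mode cannot support it.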
On Tue, Oct 15, 2024 at 01:46:53PM +0300, Nikolay Aleksandrov wrote:
> > This should not be a behaviour change, it just follows the fallback rules.
>
> hm, what fallback rules? I see dev_xdp_attach() exits on many errors
> with proper codes and extack messages, am I missing something, where's the
> fallback?

I mean the `man ip link` page [1], which says:

    ip link output will indicate a xdp flag for the networking device. If the
    driver does not have native XDP support, the kernel will fall back to a
    slower, driver-independent "generic" XDP variant.

> >> IMO it's better to explicitly
> >> error out and let the user decide how to resolve the situation.
> >
> > The user was confused and reported a bug because the command
> > `ip link set bond0 xdp obj xdp_dummy.o section xdp` failed with "Operation
> > not supported" instead of falling back to xdpgeneric mode.
> >
>
> Where's the nice extack msg then? :)
>
> We can tell them what's going on, maybe they'll want to change the bonding mode
> and still use this mode rather than falling back to another mode silently.
> That was my point, fallback is not the only solution.

Yes, that's also a good solution. My goal is to either inform the user why
the XDP program couldn't be loaded, or load it in SKB mode if the user
hasn't specifically requested XDPDRV mode. Otherwise, the user might be
confused about why the kernel didn't automatically fall back to SKB mode.

Thanks
Hangbin
On Tue, 15 Oct 2024 03:36:32 +0000 Hangbin Liu wrote:
> After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
> specified"), the mode is automatically set to XDP_MODE_DRV if the driver
> implements the .ndo_bpf function. However, for drivers like bonding, which
> only support native XDP for specific modes, this may result in an
> "unsupported" response.
>
> In such cases, let's fall back to SKB mode if the user did not explicitly
> request DRV mode.

Looks like the issue is reported by QA rather than a real user.
A weak -1 from me on building such unreliable heuristics into
the kernel. As BPF CI's failure points out the ops can return
EOPNOTSUPP for multiple reasons while dev_xdp_mode() only checks
if the driver *has* ndo_bpf, not if it fails.
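
The ambiguity raised here can be illustrated with a small user-space model. It is a sketch under stated assumptions, not kernel code: the `sketch_*` names and the two boolean fields are hypothetical, and the model assumes that a generic (SKB mode) install always succeeds. The point is that a retry keyed only on -EOPNOTSUPP cannot tell "this bond mode has no native XDP" apart from "a slave already has a program loaded".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum sketch_mode { SKETCH_MODE_SKB, SKETCH_MODE_DRV };

/* Hypothetical bond state; the field names are invented for illustration. */
struct sketch_bond {
	bool mode_supports_native_xdp;	/* bond mode can run native XDP */
	bool slave_has_xdp_prog;	/* a slave already has a program */
};

/* Model of a driver hook that reports two distinct failures with the
 * same error code. */
static int sketch_bond_install(const struct sketch_bond *b)
{
	if (!b->mode_supports_native_xdp)
		return -EOPNOTSUPP;
	if (b->slave_has_xdp_prog)
		return -EOPNOTSUPP;
	return 0;
}

/* Model of the proposed retry: on -EOPNOTSUPP, and only when native mode
 * was not explicitly requested, switch to SKB mode. The generic install
 * is assumed to succeed in this model. */
static int sketch_attach(const struct sketch_bond *b, bool drv_requested,
			 enum sketch_mode *used)
{
	int err = sketch_bond_install(b);

	*used = SKETCH_MODE_DRV;
	if (err == -EOPNOTSUPP && !drv_requested) {
		*used = SKETCH_MODE_SKB;
		return 0;
	}
	return err;
}
```

With this model, an attach on a bond whose slave already carries a program silently succeeds in SKB mode instead of being rejected, which is one plausible reading of why the existing bonding selftest started failing.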
diff --git a/net/core/dev.c b/net/core/dev.c
index ea5fbcd133ae..e32069d81cd7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9579,6 +9579,7 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack

 	/* don't call drivers if the effective program didn't change */
 	if (new_prog != cur_prog) {
+reinstall:
 		bpf_op = dev_xdp_bpf_op(dev, mode);
 		if (!bpf_op) {
 			NL_SET_ERR_MSG(extack, "Underlying driver does not support XDP in native mode");
@@ -9586,8 +9587,17 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
 		}

 		err = dev_xdp_install(dev, mode, bpf_op, extack, flags, new_prog);
-		if (err)
+		if (err) {
+			/* The driver returns not supported even .ndo_bpf
+			 * implemented, fall back to SKB mode.
+			 */
+			if (err == -EOPNOTSUPP && mode == XDP_MODE_DRV &&
+			    !(flags & XDP_FLAGS_DRV_MODE)) {
+				mode = XDP_MODE_SKB;
+				goto reinstall;
+			}
 			return err;
+		}
 	}

 	if (link)
After commit c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags
specified"), the mode is automatically set to XDP_MODE_DRV if the driver
implements the .ndo_bpf function. However, for drivers like bonding, which
only support native XDP for specific modes, this may result in an
"unsupported" response.

In such cases, let's fall back to SKB mode if the user did not explicitly
request DRV mode.

Fixes: c8a36f1945b2 ("bpf: xdp: Fix XDP mode when no mode flags specified")
Reported-by: Liang Li <liali@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-62339
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/core/dev.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)