Message ID | 20241016031649.880-2-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Bonding: return detailed error about XDP failures | expand |
On 16/10/2024 06:16, Hangbin Liu wrote: > Bonding only supports native XDP for specific modes, which can lead to > confusion for users regarding why XDP loads successfully at times and > fails at others. This patch enhances error handling by returning detailed > error messages, providing users with clearer insights into the specific > reasons for the failure when loading native XDP. > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > --- > drivers/net/bonding/bond_main.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index b1bffd8e9a95..f0f76b6ac8be 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, > > ASSERT_RTNL(); > > - if (!bond_xdp_check(bond)) > + if (!bond_xdp_check(bond)) { > + BOND_NL_ERR(dev, extack, > + "No native XDP support for the current bonding mode"); > return -EOPNOTSUPP; > + } > > old_prog = bond->xdp_prog; > bond->xdp_prog = prog; I guess this is based on our discussion earlier? Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
On 10/16/24 5:16 AM, Hangbin Liu wrote: > Bonding only supports native XDP for specific modes, which can lead to > confusion for users regarding why XDP loads successfully at times and > fails at others. This patch enhances error handling by returning detailed > error messages, providing users with clearer insights into the specific > reasons for the failure when loading native XDP. > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > --- > drivers/net/bonding/bond_main.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index b1bffd8e9a95..f0f76b6ac8be 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, > > ASSERT_RTNL(); > > - if (!bond_xdp_check(bond)) > + if (!bond_xdp_check(bond)) { > + BOND_NL_ERR(dev, extack, > + "No native XDP support for the current bonding mode"); > return -EOPNOTSUPP; > + } > > old_prog = bond->xdp_prog; > bond->xdp_prog = prog; LGTM, but independent of these I was more thinking whether something like this could do the trick (only compile tested). That way you also get the fallback without changing anything in the core XDP code. Thanks, Daniel diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index b1bffd8e9a95..2861b3a895ff 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -5915,6 +5915,10 @@ static const struct ethtool_ops bond_ethtool_ops = { .get_ts_info = bond_ethtool_get_ts_info, }; +static const struct device_type bond_type = { + .name = "bond", +}; + static const struct net_device_ops bond_netdev_ops = { .ndo_init = bond_init, .ndo_uninit = bond_uninit, @@ -5951,9 +5955,20 @@ static const struct net_device_ops bond_netdev_ops = { .ndo_hwtstamp_set = bond_hwtstamp_set, }; -static const struct device_type bond_type = { - .name = "bond", -}; +static struct net_device_ops bond_netdev_ops_noxdp __ro_after_init; + +static void __init bond_setup_noxdp_ops(void) +{ + memcpy(&bond_netdev_ops_noxdp, &bond_netdev_ops, + sizeof(bond_netdev_ops)); + + /* Used for bond device mode which does not support XDP + * yet, see also bond_xdp_check(). + */ + bond_netdev_ops_noxdp.ndo_bpf = NULL; + bond_netdev_ops_noxdp.ndo_xdp_xmit = NULL; + bond_netdev_ops_noxdp.ndo_xdp_get_xmit_slave = NULL; +} static void bond_destructor(struct net_device *bond_dev) { @@ -5978,7 +5993,9 @@ void bond_setup(struct net_device *bond_dev) /* Initialize the device entry points */ ether_setup(bond_dev); bond_dev->max_mtu = ETH_MAX_MTU; - bond_dev->netdev_ops = &bond_netdev_ops; + bond_dev->netdev_ops = bond_xdp_check(bond) ? + &bond_netdev_ops : + &bond_netdev_ops_noxdp; bond_dev->ethtool_ops = &bond_ethtool_ops; bond_dev->needs_free_netdev = true; @@ -6591,6 +6608,8 @@ static int __init bonding_init(void) int i; int res; + bond_setup_noxdp_ops(); + res = bond_check_params(&bonding_defaults); if (res) goto out;
On 16/10/2024 10:59, Daniel Borkmann wrote: > On 10/16/24 5:16 AM, Hangbin Liu wrote: >> Bonding only supports native XDP for specific modes, which can lead to >> confusion for users regarding why XDP loads successfully at times and >> fails at others. This patch enhances error handling by returning detailed >> error messages, providing users with clearer insights into the specific >> reasons for the failure when loading native XDP. >> >> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> >> --- >> drivers/net/bonding/bond_main.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index b1bffd8e9a95..f0f76b6ac8be 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, >> ASSERT_RTNL(); >> - if (!bond_xdp_check(bond)) >> + if (!bond_xdp_check(bond)) { >> + BOND_NL_ERR(dev, extack, >> + "No native XDP support for the current bonding mode"); >> return -EOPNOTSUPP; >> + } >> old_prog = bond->xdp_prog; >> bond->xdp_prog = prog; > > LGTM, but independent of these I was more thinking whether something like this > could do the trick (only compile tested). That way you also get the fallback > without changing anything in the core XDP code. > > Thanks, > Daniel > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index b1bffd8e9a95..2861b3a895ff 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -5915,6 +5915,10 @@ static const struct ethtool_ops bond_ethtool_ops = { > .get_ts_info = bond_ethtool_get_ts_info, > }; > > +static const struct device_type bond_type = { > + .name = "bond", > +}; > + > static const struct net_device_ops bond_netdev_ops = { > .ndo_init = bond_init, > .ndo_uninit = bond_uninit, > @@ -5951,9 +5955,20 @@ static const struct net_device_ops bond_netdev_ops = { > .ndo_hwtstamp_set = bond_hwtstamp_set, > }; > > -static const struct device_type bond_type = { > - .name = "bond", > -}; > +static struct net_device_ops bond_netdev_ops_noxdp __ro_after_init; > + > +static void __init bond_setup_noxdp_ops(void) > +{ > + memcpy(&bond_netdev_ops_noxdp, &bond_netdev_ops, > + sizeof(bond_netdev_ops)); > + > + /* Used for bond device mode which does not support XDP > + * yet, see also bond_xdp_check(). > + */ > + bond_netdev_ops_noxdp.ndo_bpf = NULL; > + bond_netdev_ops_noxdp.ndo_xdp_xmit = NULL; > + bond_netdev_ops_noxdp.ndo_xdp_get_xmit_slave = NULL; > +} > > static void bond_destructor(struct net_device *bond_dev) > { > @@ -5978,7 +5993,9 @@ void bond_setup(struct net_device *bond_dev) > /* Initialize the device entry points */ > ether_setup(bond_dev); > bond_dev->max_mtu = ETH_MAX_MTU; > - bond_dev->netdev_ops = &bond_netdev_ops; > + bond_dev->netdev_ops = bond_xdp_check(bond) ? > + &bond_netdev_ops : > + &bond_netdev_ops_noxdp; This will have to be done safely on bond mode change as well. If all slaves are released we can switch modes without destroying the device. > bond_dev->ethtool_ops = &bond_ethtool_ops; > > bond_dev->needs_free_netdev = true; > @@ -6591,6 +6608,8 @@ static int __init bonding_init(void) > int i; > int res; > > + bond_setup_noxdp_ops(); > + > res = bond_check_params(&bonding_defaults); > if (res) > goto out;
On Wed, Oct 16, 2024 at 09:59:01AM +0200, Daniel Borkmann wrote: > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > > index b1bffd8e9a95..f0f76b6ac8be 100644 > > --- a/drivers/net/bonding/bond_main.c > > +++ b/drivers/net/bonding/bond_main.c > > @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, > > ASSERT_RTNL(); > > - if (!bond_xdp_check(bond)) > > + if (!bond_xdp_check(bond)) { > > + BOND_NL_ERR(dev, extack, > > + "No native XDP support for the current bonding mode"); > > return -EOPNOTSUPP; > > + } > > old_prog = bond->xdp_prog; > > bond->xdp_prog = prog; > > LGTM, but independent of these I was more thinking whether something like this > could do the trick (only compile tested). That way you also get the fallback > without changing anything in the core XDP code. Yes, I also thought about do fallback on bonding. But Nikolay suggested just use extack msg[1], and Jakub think this is report by QE rather than a real user. So I think we can use extack first, and convert to auto fallback on bonding if a real user complains. What do you think? [1] https://lore.kernel.org/netdev/8088f2a7-3ab1-4a1e-996d-c15703da13cc@blackwall.org/ [2] https://lore.kernel.org/netdev/20241015085121.5f22e96f@kernel.org/ Thanks Hangbin
On 10/16/24 10:13 AM, Nikolay Aleksandrov wrote: > On 16/10/2024 10:59, Daniel Borkmann wrote: >> On 10/16/24 5:16 AM, Hangbin Liu wrote: >>> Bonding only supports native XDP for specific modes, which can lead to >>> confusion for users regarding why XDP loads successfully at times and >>> fails at others. This patch enhances error handling by returning detailed >>> error messages, providing users with clearer insights into the specific >>> reasons for the failure when loading native XDP. >>> >>> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> >>> --- >>> drivers/net/bonding/bond_main.c | 5 ++++- >>> 1 file changed, 4 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >>> index b1bffd8e9a95..f0f76b6ac8be 100644 >>> --- a/drivers/net/bonding/bond_main.c >>> +++ b/drivers/net/bonding/bond_main.c >>> @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, >>> ASSERT_RTNL(); >>> - if (!bond_xdp_check(bond)) >>> + if (!bond_xdp_check(bond)) { >>> + BOND_NL_ERR(dev, extack, >>> + "No native XDP support for the current bonding mode"); >>> return -EOPNOTSUPP; >>> + } >>> old_prog = bond->xdp_prog; >>> bond->xdp_prog = prog; >> >> LGTM, but independent of these I was more thinking whether something like this >> could do the trick (only compile tested). That way you also get the fallback >> without changing anything in the core XDP code. >> >> Thanks, >> Daniel >> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >> index b1bffd8e9a95..2861b3a895ff 100644 >> --- a/drivers/net/bonding/bond_main.c >> +++ b/drivers/net/bonding/bond_main.c >> @@ -5915,6 +5915,10 @@ static const struct ethtool_ops bond_ethtool_ops = { >> .get_ts_info = bond_ethtool_get_ts_info, >> }; >> >> +static const struct device_type bond_type = { >> + .name = "bond", >> +}; >> + >> static const struct net_device_ops bond_netdev_ops = { >> .ndo_init = bond_init, >> .ndo_uninit = bond_uninit, >> @@ -5951,9 +5955,20 @@ static const struct net_device_ops bond_netdev_ops = { >> .ndo_hwtstamp_set = bond_hwtstamp_set, >> }; >> >> -static const struct device_type bond_type = { >> - .name = "bond", >> -}; >> +static struct net_device_ops bond_netdev_ops_noxdp __ro_after_init; >> + >> +static void __init bond_setup_noxdp_ops(void) >> +{ >> + memcpy(&bond_netdev_ops_noxdp, &bond_netdev_ops, >> + sizeof(bond_netdev_ops)); >> + >> + /* Used for bond device mode which does not support XDP >> + * yet, see also bond_xdp_check(). >> + */ >> + bond_netdev_ops_noxdp.ndo_bpf = NULL; >> + bond_netdev_ops_noxdp.ndo_xdp_xmit = NULL; >> + bond_netdev_ops_noxdp.ndo_xdp_get_xmit_slave = NULL; >> +} >> >> static void bond_destructor(struct net_device *bond_dev) >> { >> @@ -5978,7 +5993,9 @@ void bond_setup(struct net_device *bond_dev) >> /* Initialize the device entry points */ >> ether_setup(bond_dev); >> bond_dev->max_mtu = ETH_MAX_MTU; >> - bond_dev->netdev_ops = &bond_netdev_ops; >> + bond_dev->netdev_ops = bond_xdp_check(bond) ? >> + &bond_netdev_ops : >> + &bond_netdev_ops_noxdp; > > This will have to be done safely on bond mode change as well. > If all slaves are released we can switch modes without destroying > the device. Ah fair, yeah perhaps not worth the added complexity. Tbh, if someone loads an XDP program with the fallback to generic, it feels super fragile in the first place and I wouldn't do this ever for production workloads. Meaning, fixed to native for production, generic XDP for testing where native is not available (e.g. CI), at least that's how we use it. Thanks, Daniel
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index b1bffd8e9a95..f0f76b6ac8be 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -5676,8 +5676,11 @@ static int bond_xdp_set(struct net_device *dev, struct bpf_prog *prog, ASSERT_RTNL(); - if (!bond_xdp_check(bond)) + if (!bond_xdp_check(bond)) { + BOND_NL_ERR(dev, extack, + "No native XDP support for the current bonding mode"); return -EOPNOTSUPP; + } old_prog = bond->xdp_prog; bond->xdp_prog = prog;
Bonding only supports native XDP for specific modes, which can lead to confusion for users regarding why XDP loads successfully at times and fails at others. This patch enhances error handling by returning detailed error messages, providing users with clearer insights into the specific reasons for the failure when loading native XDP. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- drivers/net/bonding/bond_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)