Message ID | 20241107133004.7469-6-shaw.leon@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | net: Improve netns handling in RTNL and ip_tunnel | expand |
On Thu, Nov 7, 2024 at 2:30 PM Xiao Liang <shaw.leon@gmail.com> wrote: > > If set to true, create device in target netns (when different from > link-netns) without netns change, and use current netns as link-netns > if not specified explicitly. > Sorry, module parameters are not going to fly. Instead, add new rtnetlink attributes ?
On Thu, Nov 7, 2024 at 9:37 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 7, 2024 at 2:30 PM Xiao Liang <shaw.leon@gmail.com> wrote: > > > > If set to true, create device in target netns (when different from > > link-netns) without netns change, and use current netns as link-netns > > if not specified explicitly. > > > > Sorry, module parameters are not going to fly. Sorry, I didn't know that. > Instead, add new rtnetlink attributes ? It is to control driver behavior at rtnl_ops registration time. I think rtnetlink attributes are too late for that, maybe? Can't think of a way other than module parameters or register separate ops. Any suggestions?
On Thu, 7 Nov 2024 22:11:24 +0800 Xiao Liang wrote: > > Instead, add new rtnetlink attributes ? > > It is to control driver behavior at rtnl_ops registration time. I > think rtnetlink > attributes are too late for that, maybe? Can't think of a way other than > module parameters or register separate ops. Any suggestions? Step back from the implementation you have a little, forget that there is a boolean in rtnl_link_ops. User makes a request to spawn an interface, surely a flag inside that request can dictate how the netns attrs are interpreted.
On Thu, Nov 7, 2024 at 11:59 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Thu, 7 Nov 2024 22:11:24 +0800 Xiao Liang wrote: > > > Instead, add new rtnetlink attributes ? > > > > It is to control driver behavior at rtnl_ops registration time. I > > think rtnetlink > > attributes are too late for that, maybe? Can't think of a way other than > > module parameters or register separate ops. Any suggestions? > > Step back from the implementation you have a little, forget that there > is a boolean in rtnl_link_ops. User makes a request to spawn an > interface, surely a flag inside that request can dictate how the netns > attrs are interpreted. IMO, this is about driver capability, not about user requests. As you've pointed out earlier, probably no one would actually want the old behavior whenever the driver supports the new one. I added the module parameter just for compatibility, because ip_tunnels was not implemented to support src_net properly. Yes it's possible to add an extra flag in user request, but I don't think it's a good approach. BTW, I didn't find what's going on with module parameters, is there any documentation? Thanks.
On Fri, 8 Nov 2024 00:53:55 +0800 Xiao Liang wrote: > > > It is to control driver behavior at rtnl_ops registration time. I > > > think rtnetlink > > > attributes are too late for that, maybe? Can't think of a way other than > > > module parameters or register separate ops. Any suggestions? > > > > Step back from the implementation you have a little, forget that there > > is a boolean in rtnl_link_ops. User makes a request to spawn an > > interface, surely a flag inside that request can dictate how the netns > > attrs are interpreted. > > IMO, this is about driver capability, not about user requests. The bit is a driver capability, that's fine. But the question was how to achieve backward compatibility. A flag in user request shifts the responsibility of ensuring all services are compatible to whoever spawns the interfaces. Which will probably be some network management daemon. > As you've pointed out earlier, probably no one would actually want > the old behavior whenever the driver supports the new one. > I added the module parameter just for compatibility, because ip_tunnels > was not implemented to support src_net properly. And I maintain that it's very unlikely anyone cares about old behavior. So maybe as a starting point we can have neither the flag nor the module param? We can add them later if someone screams. > Yes it's possible to add an extra flag in user request, but I don't > think it's a good approach. There are two maintainers with opposing intuition so more data may be needed to convince.. > BTW, I didn't find what's going on with module parameters, is there > any documentation? Not sure if there is documentation, but module params are quite painful to work with. Main reason is that they are global and not namespace aware. Plus developers usually default to making them read only, which means they practically speaking have to be configured at boot.
On Fri, Nov 8, 2024 at 12:04 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 8 Nov 2024 00:53:55 +0800 Xiao Liang wrote: > > IMO, this is about driver capability, not about user requests. > > The bit is a driver capability, that's fine. But the question was how > to achieve backward compatibility. A flag in user request shifts the > responsibility of ensuring all services are compatible to whoever > spawns the interfaces. Which will probably be some network management > daemon. OK. So I think we can change the driver capability indicator in rtnl_ops to a tristate field, say, "linkns_support". If it is - not supported, then keep the old behavior - supported (vlan, macvlan, etc.), then change to the new behavior - compat-mode (ip_tunnel), default to old behavior and can be changed via an IFLA flag. Is this reasonable? > > BTW, I didn't find what's going on with module parameters, is there > > any documentation? > > Not sure if there is documentation, but module params are quite painful > to work with. Main reason is that they are global and not namespace > aware. Plus developers usually default to making them read only, which > means they practically speaking have to be configured at boot. Understood, thanks.
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index f1f31ebfc793..6ef7e98e4620 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -107,6 +107,11 @@ static bool log_ecn_error = true; module_param(log_ecn_error, bool, 0644); MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN"); +static bool netns_atomic; +module_param(netns_atomic, bool, 0444); +MODULE_PARM_DESC(netns_atomic, + "Create tunnel in target net namespace directly and use current net namespace as link-netns by default"); + static struct rtnl_link_ops ipgre_link_ops __read_mostly; static const struct header_ops ipgre_header_ops; @@ -1393,6 +1398,7 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { + struct net *link_net = netns_atomic ? src_net : dev_net(dev); struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1404,13 +1410,14 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev, err = ipgre_netlink_parms(dev, data, tb, &p, &fwmark); if (err < 0) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink_net(link_net, dev, tb, &p, fwmark); } static int erspan_newlink(struct net *src_net, struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { + struct net *link_net = netns_atomic ? src_net : dev_net(dev); struct ip_tunnel_parm_kern p; __u32 fwmark = 0; int err; @@ -1422,7 +1429,7 @@ static int erspan_newlink(struct net *src_net, struct net_device *dev, err = erspan_netlink_parms(dev, data, tb, &p, &fwmark); if (err) return err; - return ip_tunnel_newlink(dev, tb, &p, fwmark); + return ip_tunnel_newlink_net(link_net, dev, tb, &p, fwmark); } static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[], @@ -1777,6 +1784,10 @@ static int __init ipgre_init(void) pr_info("GRE over IPv4 tunneling driver\n"); + ipgre_link_ops.netns_atomic = netns_atomic; + ipgre_tap_ops.netns_atomic = netns_atomic; + erspan_link_ops.netns_atomic = netns_atomic; + err = register_pernet_device(&ipgre_net_ops); if (err < 0) return err;
If set to true, create device in target netns (when different from link-netns) without netns change, and use current netns as link-netns if not specified explicitly. Signed-off-by: Xiao Liang <shaw.leon@gmail.com> --- net/ipv4/ip_gre.c | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)