Message ID | 20240917010734.1905-4-antonio@openvpn.net (mailing list archive) |
---|---|
State | Deferred |
Delegated to: | Netdev Maintainers |
Series | Introducing OpenVPN Data Channel Offload |
From: Antonio Quartulli <antonio@openvpn.net>
Date: Tue, 17 Sep 2024 03:07:12 +0200
> +/* we register with rtnl to let core know that ovpn is a virtual driver and
> + * therefore ifaces should be destroyed when exiting a netns
> + */
> +static struct rtnl_link_ops ovpn_link_ops = {
> +};

This looks like abusing rtnl_link_ops.

Instead of a hack to rely on default_device_exit_batch()
and rtnl_link_unregister(), this should be implemented as
struct pernet_operations.exit_batch_rtnl().

Then, the patch 2 is not needed, which is confusing for
all other rtnl_link_ops users.

If we want to avoid extra RTNL in default_device_exit_batch(),
I can post this patch after merge window.

---8<---
diff --git a/net/core/dev.c b/net/core/dev.c
index 1e740faf9e78..eacf6f5a6ace 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11916,7 +11916,8 @@ static void __net_exit default_device_exit_net(struct net *net)
 	}
 }
 
-static void __net_exit default_device_exit_batch(struct list_head *net_list)
+void __net_exit default_device_exit_batch(struct list_head *net_list,
+					  struct list_head *dev_kill_list)
 {
 	/* At exit all network devices most be removed from a network
 	 * namespace.  Do this in the reverse order of registration.
@@ -11925,9 +11926,7 @@ static void __net_exit default_device_exit_batch(struct list_head *net_list)
 	 */
 	struct net_device *dev;
 	struct net *net;
-	LIST_HEAD(dev_kill_list);
 
-	rtnl_lock();
 	list_for_each_entry(net, net_list, exit_list) {
 		default_device_exit_net(net);
 		cond_resched();
@@ -11936,19 +11935,13 @@ static void __net_exit default_device_exit_batch(struct list_head *net_list)
 	list_for_each_entry(net, net_list, exit_list) {
 		for_each_netdev_reverse(net, dev) {
 			if (dev->rtnl_link_ops && dev->rtnl_link_ops->dellink)
-				dev->rtnl_link_ops->dellink(dev, &dev_kill_list);
+				dev->rtnl_link_ops->dellink(dev, dev_kill_list);
 			else
-				unregister_netdevice_queue(dev, &dev_kill_list);
+				unregister_netdevice_queue(dev, dev_kill_list);
 		}
 	}
-	unregister_netdevice_many(&dev_kill_list);
-	rtnl_unlock();
 }
 
-static struct pernet_operations __net_initdata default_device_ops = {
-	.exit_batch = default_device_exit_batch,
-};
-
 static void __init net_dev_struct_check(void)
 {
 	/* TX read-mostly hotpath */
@@ -12140,9 +12133,6 @@ static int __init net_dev_init(void)
 	if (register_pernet_device(&loopback_net_ops))
 		goto out;
 
-	if (register_pernet_device(&default_device_ops))
-		goto out;
-
 	open_softirq(NET_TX_SOFTIRQ, net_tx_action);
 	open_softirq(NET_RX_SOFTIRQ, net_rx_action);
 
diff --git a/net/core/dev.h b/net/core/dev.h
index 5654325c5b71..d1feecab9c4a 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -99,6 +99,9 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags,
 void unregister_netdevice_many_notify(struct list_head *head,
 				      u32 portid, const struct nlmsghdr *nlh);
 
+void default_device_exit_batch(struct list_head *net_list,
+			       struct list_head *dev_kill_list);
+
 static inline void netif_set_gso_max_size(struct net_device *dev,
 					  unsigned int size)
 {
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 11e4dd4f09ed..0a9bce599d54 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -27,6 +27,8 @@
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
 
+#include "dev.h"
+
 /*
  *	Our network namespace constructor/destructor lists
  */
@@ -380,6 +382,7 @@ static __net_init int setup_net(struct net *net)
 		if (ops->exit_batch_rtnl)
 			ops->exit_batch_rtnl(&net_exit_list, &dev_kill_list);
 	}
+	default_device_exit_batch(&net_exit_list, &dev_kill_list);
 	unregister_netdevice_many(&dev_kill_list);
 	rtnl_unlock();
 
@@ -618,6 +621,7 @@ static void cleanup_net(struct work_struct *work)
 		if (ops->exit_batch_rtnl)
 			ops->exit_batch_rtnl(&net_exit_list, &dev_kill_list);
 	}
+	default_device_exit_batch(&net_exit_list, &dev_kill_list);
 	unregister_netdevice_many(&dev_kill_list);
 	rtnl_unlock();
 
@@ -1214,6 +1218,7 @@ static void free_exit_list(struct pernet_operations *ops, struct list_head *net_
 
 	rtnl_lock();
 	ops->exit_batch_rtnl(net_exit_list, &dev_kill_list);
+	default_device_exit_batch(net_exit_list, &dev_kill_list);
 	unregister_netdevice_many(&dev_kill_list);
 	rtnl_unlock();
 }
---8<---
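The batching idea behind the proposed change can be sketched in plain userspace C. This is only a model, not kernel code: the counters stand in for rtnl_lock() and unregister_netdevice_many(), and every name below (kill_list_model, driver_exit_batch, cleanup_model, the device names) is hypothetical. It shows per-driver callbacks and the default handler appending to one shared kill list, so that the lock is taken once and the list is flushed once.

```c
#include <stddef.h>

#define MAX_KILL 16

/* Stand-in for the dev_kill_list shared by all exit handlers. */
struct kill_list_model {
	const char *names[MAX_KILL];
	size_t count;
};

/* Counters standing in for rtnl_lock() and unregister_netdevice_many(). */
static int lock_acquisitions;
static int flushes;

typedef void (*exit_batch_fn)(struct kill_list_model *kill);

static void queue_dev(struct kill_list_model *kill, const char *name)
{
	if (kill->count < MAX_KILL)
		kill->names[kill->count++] = name;
}

/* Hypothetical driver callback: queue the devices the driver owns. */
static void driver_exit_batch(struct kill_list_model *kill)
{
	queue_dev(kill, "ovpn0");
}

/* Stand-in for the default handler once it no longer locks on its own. */
static void default_exit_batch(struct kill_list_model *kill)
{
	queue_dev(kill, "dummy0");
}

/* One lock section, many producers, one flush. Returns queued devices. */
static size_t cleanup_model(const exit_batch_fn *ops, size_t n_ops)
{
	struct kill_list_model kill = { .count = 0 };
	size_t i;

	lock_acquisitions++;		/* models rtnl_lock() */
	for (i = 0; i < n_ops; i++)
		ops[i](&kill);
	default_exit_batch(&kill);
	flushes++;			/* models unregister_netdevice_many() */
	/* the matching unlock would follow here */
	return kill.count;
}
```

Calling cleanup_model() with one driver callback queues two devices while incrementing each counter exactly once, which is the saving the patch above is after.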
Hi Kuniyuki and thank you for chiming in.

On 19/09/2024 07:52, Kuniyuki Iwashima wrote:
> From: Antonio Quartulli <antonio@openvpn.net>
> Date: Tue, 17 Sep 2024 03:07:12 +0200
>> +/* we register with rtnl to let core know that ovpn is a virtual driver and
>> + * therefore ifaces should be destroyed when exiting a netns
>> + */
>> +static struct rtnl_link_ops ovpn_link_ops = {
>> +};
>
> This looks like abusing rtnl_link_ops.

In some way, the inspiration came from
5b9e7e160795 ("openvswitch: introduce rtnl ops stub")

[which just reminded me that I wanted to fill the .kind field, but I
forgot to do so]

The reason for taking this approach was to avoid handling the iface
destruction upon netns exit inside the driver, when the core already has
all the code for taking care of this for us.

Originally I implemented pernet_operations.pre_exit, but Sabrina
suggested that letting the core handle the destruction was cleaner (and
I agreed).

However, after I removed the pre_exit implementation, we realized that
default_device_exit_batch/default_device_exit_net thought that an ovpn
device is a real NIC and was moving it to the global netns rather than
killing it.

One way to fix the above was to register rtnl_link_ops with netns_refund =
false (so the ops object you see in this patch is not truly "empty").

However, I then hit the bug which required patch 2 to get fixed.

Does it make sense to you?
Or do you still think this is an rtnl_link_ops abuse?

The alternative was to change
default_device_exit_batch/default_device_exit_net to read some new
netdevice flag which would tell if the interface should be killed or
moved to global upon netns exit.

Regards,
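The behaviour described above (a device without rtnl_link_ops is treated like a real NIC and pushed back to the initial netns) can be modelled in a few lines of userspace C. This is a simplified sketch of the decision made during netns exit, not the kernel function itself: the struct and enum names are hypothetical, and the netns_local field is assumed to play the role of the "unmovable device" marker used for loopback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct rtnl_link_ops. */
struct link_ops_model {
	const char *kind;
	bool netns_refund;
};

/* Simplified stand-in for struct net_device. */
struct netdev_model {
	const char *name;
	bool netns_local;			/* unmovable, e.g. loopback */
	const struct link_ops_model *ops;	/* NULL looks like a real NIC */
};

enum exit_action {
	EXIT_DESTROY_WITH_NETNS,	/* left in place, then unregistered */
	EXIT_MOVE_TO_INIT_NET,		/* handed back to the initial netns */
};

static enum exit_action netns_exit_action(const struct netdev_model *dev)
{
	/* Unmovable devices stay and go away with the namespace. */
	if (dev->netns_local)
		return EXIT_DESTROY_WITH_NETNS;

	/* Virtual devices that do not ask for a refund are left for the
	 * generic cleanup.
	 */
	if (dev->ops && !dev->ops->netns_refund)
		return EXIT_DESTROY_WITH_NETNS;

	/* Everything else is assumed to be hardware and is preserved. */
	return EXIT_MOVE_TO_INIT_NET;
}
```

With ops set to NULL the model returns EXIT_MOVE_TO_INIT_NET, which matches the problem reported above; registering ops with netns_refund = false flips the result to destruction.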
From: Antonio Quartulli <antonio@openvpn.net>
Date: Thu, 19 Sep 2024 13:57:51 +0200
> Hi Kuniyuki and thank you for chiming in.
>
> On 19/09/2024 07:52, Kuniyuki Iwashima wrote:
> > From: Antonio Quartulli <antonio@openvpn.net>
> > Date: Tue, 17 Sep 2024 03:07:12 +0200
> >> +/* we register with rtnl to let core know that ovpn is a virtual driver and
> >> + * therefore ifaces should be destroyed when exiting a netns
> >> + */
> >> +static struct rtnl_link_ops ovpn_link_ops = {
> >> +};
> >
> > This looks like abusing rtnl_link_ops.
>
> In some way, the inspiration came from
> 5b9e7e160795 ("openvswitch: introduce rtnl ops stub")
>
> [which just reminded me that I wanted to fill the .kind field, but I
> forgot to do so]
>
> The reason for taking this approach was to avoid handling the iface
> destruction upon netns exit inside the driver, when the core already has
> all the code for taking care of this for us.
>
> Originally I implemented pernet_operations.pre_exit, but Sabrina
> suggested that letting the core handle the destruction was cleaner (and
> I agreed).
>
> However, after I removed the pre_exit implementation, we realized that
> default_device_exit_batch/default_device_exit_net thought that an ovpn
> device is a real NIC and was moving it to the global netns rather than
> killing it.
>
> One way to fix the above was to register rtnl_link_ops with netns_refund =
> false (so the ops object you see in this patch is not truly "empty").
>
> However, I then hit the bug which required patch 2 to get fixed.
>
> Does it make sense to you?
> Or do you still think this is an rtnl_link_ops abuse?

The use of .kind makes sense, and the change should be in this patch.

For the patch 2 and dellink(), is the device not expected to be removed
by ip link del ?  Setting .dellink to unregister_netdevice_queue() will
support RTM_DELLINK; otherwise -EOPNOTSUPP is returned.
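The RTM_DELLINK point can be illustrated with a small userspace model. It is a sketch under stated assumptions rather than the rtnetlink code: all type and function names are hypothetical, and queue_unregister_model() merely counts where the kernel would queue the device for unregistration.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for the list of devices queued for unregistration. */
struct kill_list_model {
	size_t queued;
};

struct dev_model;

/* Simplified stand-in for struct rtnl_link_ops. */
struct ops_model {
	const char *kind;
	void (*dellink)(struct dev_model *dev, struct kill_list_model *head);
};

struct dev_model {
	const char *name;
	const struct ops_model *ops;
};

/* Plays the role of unregister_netdevice_queue() in this model. */
static void queue_unregister_model(struct dev_model *dev,
				   struct kill_list_model *head)
{
	(void)dev;
	head->queued++;
}

/* Models the handling of a delete request coming from "ip link del". */
static int dellink_request_model(struct dev_model *dev,
				 struct kill_list_model *head)
{
	const struct ops_model *ops = dev->ops;

	/* No ops, or ops without dellink: the request is refused. */
	if (!ops || !ops->dellink)
		return -EOPNOTSUPP;

	ops->dellink(dev, head);
	return 0;
}
```

An ops object with .dellink left NULL makes the request fail with -EOPNOTSUPP, while pointing .dellink at the queueing helper lets the same request succeed and queue the device.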
Hi,

On 20/09/2024 11:32, Kuniyuki Iwashima wrote:
> The use of .kind makes sense, and the change should be in this patch.

Ok, will add it here and I will also add an explicit .netns_refund = false
to highlight the fact that we need this attribute to avoid moving the
iface to the global netns.

> For the patch 2 and dellink(), is the device not expected to be removed
> by ip link del ?  Setting .dellink to unregister_netdevice_queue() will
> support RTM_DELLINK; otherwise -EOPNOTSUPP is returned.

For the time being I decided that it would make sense to add and delete
ovpn interfaces via netlink API only.

But there are already discussions about implementing the RTNL
add/dellink() too.
Therefore I think it makes sense to set dellink to
unregister_netdevice_queue() in this patch and thus avoid patch 2 altogether.

Thanks.

Regards,
Hello Antonio, Kuniyuki,

On 20.09.2024 12:46, Antonio Quartulli wrote:
> On 20/09/2024 11:32, Kuniyuki Iwashima wrote:
>> For the patch 2 and dellink(), is the device not expected to be removed
>> by ip link del ?  Setting .dellink to unregister_netdevice_queue() will
>> support RTM_DELLINK; otherwise -EOPNOTSUPP is returned.
>
> For the time being I decided that it would make sense to add and delete
> ovpn interfaces via netlink API only.
>
> But there are already discussions about implementing the RTNL
> add/dellink() too.
> Therefore I think it makes sense to set dellink to
> unregister_netdevice_queue() in this patch and thus avoid patch 2 altogether.

I should make a confession :) It was me who proposed and pushed the idea
of removing the RTNL ops. I was too concerned about the uselessness of
the addlink operation, so I did not clearly mention that dellink is a
useful operation, especially when it comes to namespace destruction. My bad.

So yeah, providing the dellink operation makes sense for namespace
destruction handling and for the user to manually clean up remaining
network interfaces after the userspace application is forcefully killed
or crashes.

--
Sergey
On 22/09/2024 22:51, Sergey Ryazanov wrote:
> I should make a confession :) It was me who proposed and pushed the idea
> of removing the RTNL ops. I was too concerned about the uselessness of
> the addlink operation, so I did not clearly mention that dellink is a
> useful operation, especially when it comes to namespace destruction. My bad.

It helped getting where we are now :)

> So yeah, providing the dellink operation makes sense for namespace
> destruction handling and for the user to manually clean up remaining
> network interfaces after the userspace application is forcefully killed
> or crashes.

For this specific case (i.e. crash) I am planning to add a netlink
notifier that detects when the process that created the interface goes
away and then kills the interface from within the kernel.

This way we have some sort of self cleanup and avoid leaving the system
in a bogus state.

(For those specific use cases where you want to create a "persistent"
interface, I think we will provide a flag. But this is for a later patch.)

Cheers,
diff --git a/MAINTAINERS b/MAINTAINERS
index 77fcd6f802a5..53b6350d95be 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17263,6 +17263,13 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs.git
 F:	Documentation/filesystems/overlayfs.rst
 F:	fs/overlayfs/
 
+OPENVPN DATA CHANNEL OFFLOAD
+M:	Antonio Quartulli <antonio@openvpn.net>
+L:	openvpn-devel@lists.sourceforge.net (moderated for non-subscribers)
+L:	netdev@vger.kernel.org
+S:	Maintained
+F:	drivers/net/ovpn/
+
 P54 WIRELESS DRIVER
 M:	Christian Lamparter <chunkeey@googlemail.com>
 L:	linux-wireless@vger.kernel.org
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 9920b3a68ed1..0055bcd2356c 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -115,6 +115,20 @@ config WIREGUARD_DEBUG
 
 	  Say N here unless you know what you're doing.
 
+config OVPN
+	tristate "OpenVPN data channel offload"
+	depends on NET && INET
+	select NET_UDP_TUNNEL
+	select DST_CACHE
+	select CRYPTO
+	select CRYPTO_AES
+	select CRYPTO_GCM
+	select CRYPTO_CHACHA20POLY1305
+	select STREAM_PARSER
+	help
+	  This module enhances the performance of the OpenVPN userspace software
+	  by offloading the data channel processing to kernelspace.
+
 config EQUALIZER
 	tristate "EQL (serial line load balancing) support"
 	help
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 13743d0e83b5..5152b3330e28 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_IPVLAN) += ipvlan/
 obj-$(CONFIG_IPVTAP) += ipvlan/
 obj-$(CONFIG_DUMMY) += dummy.o
 obj-$(CONFIG_WIREGUARD) += wireguard/
+obj-$(CONFIG_OVPN) += ovpn/
 obj-$(CONFIG_EQUALIZER) += eql.o
 obj-$(CONFIG_IFB) += ifb.o
 obj-$(CONFIG_MACSEC) += macsec.o
diff --git a/drivers/net/ovpn/Makefile b/drivers/net/ovpn/Makefile
new file mode 100644
index 000000000000..53fb197027d7
--- /dev/null
+++ b/drivers/net/ovpn/Makefile
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# ovpn -- OpenVPN data channel offload in kernel space
+#
+# Copyright (C) 2020-2024 OpenVPN, Inc.
+#
+# Author:	Antonio Quartulli <antonio@openvpn.net>
+
+obj-$(CONFIG_OVPN) := ovpn.o
+ovpn-y += main.o
+ovpn-y += io.o
diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
new file mode 100644
index 000000000000..ad3813419c33
--- /dev/null
+++ b/drivers/net/ovpn/io.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0
+/*  OpenVPN data channel offload
+ *
+ *  Copyright (C) 2019-2024 OpenVPN, Inc.
+ *
+ *  Author:	James Yonan <james@openvpn.net>
+ *		Antonio Quartulli <antonio@openvpn.net>
+ */
+
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+
+#include "io.h"
+
+/* Send user data to the network
+ */
+netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+	skb_tx_error(skb);
+	kfree_skb(skb);
+	return NET_XMIT_DROP;
+}
diff --git a/drivers/net/ovpn/io.h b/drivers/net/ovpn/io.h
new file mode 100644
index 000000000000..aa259be66441
--- /dev/null
+++ b/drivers/net/ovpn/io.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* OpenVPN data channel offload
+ *
+ *  Copyright (C) 2019-2024 OpenVPN, Inc.
+ *
+ *  Author:	James Yonan <james@openvpn.net>
+ *		Antonio Quartulli <antonio@openvpn.net>
+ */
+
+#ifndef _NET_OVPN_OVPN_H_
+#define _NET_OVPN_OVPN_H_
+
+netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev);
+
+#endif /* _NET_OVPN_OVPN_H_ */
diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
new file mode 100644
index 000000000000..8a90319e4600
--- /dev/null
+++ b/drivers/net/ovpn/main.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+/*  OpenVPN data channel offload
+ *
+ *  Copyright (C) 2020-2024 OpenVPN, Inc.
+ *
+ *  Author:	Antonio Quartulli <antonio@openvpn.net>
+ *		James Yonan <james@openvpn.net>
+ */
+
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/version.h>
+#include <net/rtnetlink.h>
+
+#include "main.h"
+#include "io.h"
+
+/* Driver info */
+#define DRV_DESCRIPTION	"OpenVPN data channel offload (ovpn)"
+#define DRV_COPYRIGHT	"(C) 2020-2024 OpenVPN, Inc."
+
+/**
+ * ovpn_dev_is_valid - check if the netdevice is of type 'ovpn'
+ * @dev: the interface to check
+ *
+ * Return: whether the netdevice is of type 'ovpn'
+ */
+bool ovpn_dev_is_valid(const struct net_device *dev)
+{
+	return dev->netdev_ops->ndo_start_xmit == ovpn_net_xmit;
+}
+
+/* we register with rtnl to let core know that ovpn is a virtual driver and
+ * therefore ifaces should be destroyed when exiting a netns
+ */
+static struct rtnl_link_ops ovpn_link_ops = {
+};
+
+static int ovpn_netdev_notifier_call(struct notifier_block *nb,
+				     unsigned long state, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+
+	if (!ovpn_dev_is_valid(dev))
+		return NOTIFY_DONE;
+
+	switch (state) {
+	case NETDEV_REGISTER:
+		/* add device to internal list for later destruction upon
+		 * unregistration
+		 */
+		break;
+	case NETDEV_UNREGISTER:
+		/* can be delivered multiple times, so check registered flag,
+		 * then destroy the interface
+		 */
+		break;
+	case NETDEV_POST_INIT:
+	case NETDEV_GOING_DOWN:
+	case NETDEV_DOWN:
+	case NETDEV_UP:
+	case NETDEV_PRE_UP:
+	default:
+		return NOTIFY_DONE;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block ovpn_netdev_notifier = {
+	.notifier_call = ovpn_netdev_notifier_call,
+};
+
+static int __init ovpn_init(void)
+{
+	int err = register_netdevice_notifier(&ovpn_netdev_notifier);
+
+	if (err) {
+		pr_err("ovpn: can't register netdevice notifier: %d\n", err);
+		return err;
+	}
+
+	err = rtnl_link_register(&ovpn_link_ops);
+	if (err) {
+		pr_err("ovpn: can't register rtnl link ops: %d\n", err);
+		goto unreg_netdev;
+	}
+
+	return 0;
+
+unreg_netdev:
+	unregister_netdevice_notifier(&ovpn_netdev_notifier);
+	return err;
+}
+
+static __exit void ovpn_cleanup(void)
+{
+	rtnl_link_unregister(&ovpn_link_ops);
+	unregister_netdevice_notifier(&ovpn_netdev_notifier);
+
+	rcu_barrier();
+}
+
+module_init(ovpn_init);
+module_exit(ovpn_cleanup);
+
+MODULE_DESCRIPTION(DRV_DESCRIPTION);
+MODULE_AUTHOR(DRV_COPYRIGHT);
+MODULE_LICENSE("GPL");
diff --git a/drivers/net/ovpn/main.h b/drivers/net/ovpn/main.h
new file mode 100644
index 000000000000..a3215316c49b
--- /dev/null
+++ b/drivers/net/ovpn/main.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*  OpenVPN data channel offload
+ *
+ *  Copyright (C) 2019-2024 OpenVPN, Inc.
+ *
+ *  Author:	James Yonan <james@openvpn.net>
+ *		Antonio Quartulli <antonio@openvpn.net>
+ */
+
+#ifndef _NET_OVPN_MAIN_H_
+#define _NET_OVPN_MAIN_H_
+
+bool ovpn_dev_is_valid(const struct net_device *dev);
+
+#endif /* _NET_OVPN_MAIN_H_ */
diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index 1a0fe8b151fb..f9f8ffddfd0c 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -43,5 +43,6 @@ struct udphdr {
 #define UDP_ENCAP_GTP1U		5 /* 3GPP TS 29.060 */
 #define UDP_ENCAP_RXRPC		6
 #define TCP_ENCAP_ESPINTCP	7 /* Yikes, this is really xfrm encap types. */
+#define UDP_ENCAP_OVPNINUDP	8 /* OpenVPN traffic */
 
 #endif /* _UAPI_LINUX_UDP_H */
OpenVPN is userspace software that has existed since around 2005 and
allows users to create secure tunnels.

So far OpenVPN has implemented all operations in userspace, which
implies several round trips between kernel and user land in order to
process packets (encapsulate/decapsulate, encrypt/decrypt, rerouting, etc.).

With `ovpn` we intend to move the fast path (data channel) entirely
into kernel space and thus improve the throughput users measure over the
tunnel.

`ovpn` is implemented as a simple virtual network device driver that
can be manipulated by means of the standard RTNL APIs. A device of kind
`ovpn` allows only IPv4/6 traffic and can be of type:
* P2P (peer-to-peer): any packet sent over the interface will be
  encapsulated and transmitted to the other side (typical OpenVPN
  client or peer-to-peer behaviour);
* P2MP (point-to-multipoint): packets sent over the interface are
  transmitted to peers based on existing routes (typical OpenVPN
  server behaviour).

After the interface has been created, OpenVPN in userspace can
configure it using a new Netlink API. Specifically, it is possible
to manage peers and their keys.

The OpenVPN control channel is multiplexed over the same transport
socket by means of OP codes. Anything that is not DATA_V2 (OpenVPN
OP code for data traffic) is sent to userspace and handled there.
This way the `ovpn` codebase is kept as compact as possible while
focusing on handling data traffic only (fast path).

Any OpenVPN control feature (like cipher negotiation, TLS handshake,
rekeying, etc.) is still fully handled by the userspace process.

When userspace establishes a new connection with a peer, it first
performs the handshake and then passes the socket to the `ovpn` kernel
module, which takes ownership. From this moment on `ovpn` will handle
data traffic for the new peer.
When control packets are received on the link, they are forwarded to
userspace through the same transport socket they were received on, as
userspace is still listening to them.
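The OP code split described above can be sketched in userspace C. This is a minimal illustration, not the code from this series: the function and macro names are hypothetical, and the sketch assumes UDP framing, where the first payload byte carries the opcode in its upper 5 bits and the key ID in its lower 3 bits, with DATA_V2 encoded as opcode 9 as in the OpenVPN wire protocol.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First byte of an OpenVPN packet: opcode (5 bits) | key ID (3 bits). */
#define OVPN_SKETCH_OPCODE_SHIFT	3
#define OVPN_SKETCH_KEY_ID_MASK		0x07
#define OVPN_SKETCH_DATA_V2		9	/* data channel, version 2 */

static inline uint8_t ovpn_sketch_opcode(uint8_t first_byte)
{
	return first_byte >> OVPN_SKETCH_OPCODE_SHIFT;
}

static inline uint8_t ovpn_sketch_key_id(uint8_t first_byte)
{
	return first_byte & OVPN_SKETCH_KEY_ID_MASK;
}

/* Return true when the packet belongs to the kernel fast path and false
 * when it should be left on the socket for the userspace process.
 */
static inline bool ovpn_sketch_is_fast_path(const uint8_t *pkt, size_t len)
{
	if (len < 1)
		return false;

	return ovpn_sketch_opcode(pkt[0]) == OVPN_SKETCH_DATA_V2;
}
```

A packet whose first byte is 0x48 (opcode 9, key ID 0) would stay in the kernel, while any control opcode falls through to userspace. Over TCP each packet is additionally preceded by a length prefix, which this sketch does not handle.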
Some events (like peer deletion) are sent to a Netlink multicast group.

Although it wasn't easy to convince the community, `ovpn` implements
only a limited number of the data-channel features supported by the
userspace program.

Each feature that made it to `ovpn` was attentively vetted to
avoid carrying too much legacy along with us (and to make a clean cut
with old and probably-not-so-useful features).

Notably, only encryption using AEAD ciphers (specifically
ChaCha20Poly1305 and AES-GCM) was implemented. Supporting any other
cipher out there was not deemed useful.

Both UDP and TCP sockets are supported.

As explained above, in case of P2MP mode, OpenVPN will use the main system
routing table to decide which packet goes to which peer. This implies
that no routing table was re-implemented in the `ovpn` kernel module.

This kernel module can be enabled by selecting the CONFIG_OVPN entry
in the networking drivers section.

NOTE: this first patch introduces the very basic framework only.
Features are then added patch by patch. Although each patch will
compile and possibly not break at runtime, the ovpn module is expected
to be fully working only after the full set has been applied.

Cc: steffen.klassert@secunet.com
Cc: antony.antony@secunet.com
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 MAINTAINERS               |   7 +++
 drivers/net/Kconfig       |  14 +++++
 drivers/net/Makefile      |   1 +
 drivers/net/ovpn/Makefile |  11 ++++
 drivers/net/ovpn/io.c     |  22 ++++++++
 drivers/net/ovpn/io.h     |  15 ++++++
 drivers/net/ovpn/main.c   | 109 ++++++++++++++++++++++++++++++++++++++
 drivers/net/ovpn/main.h   |  15 ++++++
 include/uapi/linux/udp.h  |   1 +
 9 files changed, 195 insertions(+)
 create mode 100644 drivers/net/ovpn/Makefile
 create mode 100644 drivers/net/ovpn/io.c
 create mode 100644 drivers/net/ovpn/io.h
 create mode 100644 drivers/net/ovpn/main.c
 create mode 100644 drivers/net/ovpn/main.h