[net-next,v3,07/24] ovpn: introduce the ovpn_peer object

Message ID	20240506011637.27272-8-antonio@openvpn.net (mailing list archive)
State	Changes Requested
Delegated to:	Netdev Maintainers
Headers	show Received: from mail-wm1-f53.google.com (mail-wm1-f53.google.com [209.85.128.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0FC2B41C73 for <netdev@vger.kernel.org>; Mon, 6 May 2024 01:15:39 +0000 (UTC) From: Antonio Quartulli <antonio@openvpn.net> To: netdev@vger.kernel.org Cc: Jakub Kicinski <kuba@kernel.org>, Sergey Ryazanov <ryazanov.s.a@gmail.com>, Paolo Abeni <pabeni@redhat.com>, Eric Dumazet <edumazet@google.com>, Andrew Lunn <andrew@lunn.ch>, Esben Haabendal <esben@geanix.com>, Antonio Quartulli <antonio@openvpn.net> Subject: [PATCH net-next v3 07/24] ovpn: introduce the ovpn_peer object Date: Mon, 6 May 2024 03:16:20 +0200 Message-ID: <20240506011637.27272-8-antonio@openvpn.net> In-Reply-To: <20240506011637.27272-1-antonio@openvpn.net> References: <20240506011637.27272-1-antonio@openvpn.net> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	Introducing OpenVPN Data Channel Offload \| expand [net-next,v3,00/24] Introducing OpenVPN Data Channel Offload [net-next,v3,01/24] netlink: add NLA_POLICY_MAX_LEN macro [net-next,v3,02/24] net: introduce OpenVPN Data Channel Offload (ovpn) [net-next,v3,03/24] ovpn: add basic netlink support [net-next,v3,04/24] ovpn: add basic interface creation/destruction/management routines [net-next,v3,05/24] ovpn: implement interface creation/destruction via netlink [net-next,v3,06/24] ovpn: keep carrier always on [net-next,v3,07/24] ovpn: introduce the ovpn_peer object [net-next,v3,08/24] ovpn: introduce the ovpn_socket object [net-next,v3,09/24] ovpn: implement basic TX path (UDP) [net-next,v3,10/24] ovpn: implement basic RX path (UDP) [net-next,v3,11/24] ovpn: implement packet processing [net-next,v3,12/24] ovpn: store tunnel and transport statistics [net-next,v3,13/24] ovpn: implement TCP transport [net-next,v3,14/24] ovpn: implement multi-peer support [net-next,v3,15/24] ovpn: implement peer lookup logic [net-next,v3,16/24] ovpn: implement keepalive mechanism [net-next,v3,17/24] ovpn: add support for updating local UDP endpoint [net-next,v3,18/24] ovpn: add support for peer floating [net-next,v3,19/24] ovpn: implement peer add/dump/delete via netlink [net-next,v3,20/24] ovpn: implement key add/del/swap via netlink [net-next,v3,21/24] ovpn: kill key and notify userspace in case of IV exhaustion [net-next,v3,22/24] ovpn: notify userspace when a peer is deleted [net-next,v3,23/24] ovpn: add basic ethtool support [net-next,v3,24/24] testing/selftest: add test tool and scripts for ovpn module

Context	Check	Description
netdev/series_format	fail	Series longer than 15 patches
netdev/tree_selection	success	Clearly marked for net-next, async
netdev/ynl	success	Generated files up to date; no warnings/errors; GEN HAS DIFF 2 files changed, 2613 insertions(+);
netdev/fixes_present	success	Fixes tag not required for -next series
netdev/header_inline	success	No static functions without inline keyword in header files
netdev/build_32bit	success	Errors and warnings before: 932 this patch: 932
netdev/build_tools	success	No tools touched, skip
netdev/cc_maintainers	warning	1 maintainers not CCed: openvpn-devel@lists.sourceforge.net
netdev/build_clang	success	Errors and warnings before: 938 this patch: 938
netdev/verify_signedoff	success	Signed-off-by tag matches author and committer
netdev/deprecated_api	warning	Found: 'dev_hold(' was: 0 now: 1; 'dev_put(' was: 0 now: 1
netdev/check_selftest	success	No net selftest shell script
netdev/verify_fixes	success	No Fixes tag
netdev/build_allmodconfig_warn	success	Errors and warnings before: 944 this patch: 944
netdev/checkpatch	warning	WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
netdev/build_clang_rust	success	No Rust files in patch. Skipping build
netdev/kdoc	fail	Errors and warnings before: 0 this patch: 4
netdev/source_inline	success	Was 0 now: 0
netdev/contest	success	net-next-2024-05-07--03-00 (tests: 1000)

Antonio Quartulli May 6, 2024, 1:16 a.m. UTC

An ovpn_peer object holds the whole status of a remote peer
(regardless whether it is a server or a client).

This includes status for crypto, tx/rx buffers, napi, etc.

Only support for one peer is introduced (P2P mode).
Multi peer support is introduced with a later patch.

Along with the ovpn_peer, also the ovpn_bind object is introcued
as the two are strictly related.
An ovpn_bind object wraps a sockaddr representing the local
coordinates being used to talk to a specific peer.

Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
---
 drivers/net/ovpn/Makefile     |   2 +
 drivers/net/ovpn/bind.c       |  60 ++++++
 drivers/net/ovpn/bind.h       | 130 ++++++++++++
 drivers/net/ovpn/io.c         |   8 +
 drivers/net/ovpn/main.c       |  10 +
 drivers/net/ovpn/main.h       |   2 +
 drivers/net/ovpn/ovpnstruct.h |   7 +
 drivers/net/ovpn/peer.c       | 379 ++++++++++++++++++++++++++++++++++
 drivers/net/ovpn/peer.h       | 152 ++++++++++++++
 9 files changed, 750 insertions(+)
 create mode 100644 drivers/net/ovpn/bind.c
 create mode 100644 drivers/net/ovpn/bind.h
 create mode 100644 drivers/net/ovpn/peer.c
 create mode 100644 drivers/net/ovpn/peer.h

Sabrina Dubroca May 8, 2024, 4:06 p.m. UTC | #1

2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
> An ovpn_peer object holds the whole status of a remote peer
> (regardless whether it is a server or a client).
> 
> This includes status for crypto, tx/rx buffers, napi, etc.
> 
> Only support for one peer is introduced (P2P mode).
> Multi peer support is introduced with a later patch.
> 
> Along with the ovpn_peer, also the ovpn_bind object is introcued
                                                         ^
typo: "introduced"

> as the two are strictly related.
> An ovpn_bind object wraps a sockaddr representing the local
> coordinates being used to talk to a specific peer.

> diff --git a/drivers/net/ovpn/bind.c b/drivers/net/ovpn/bind.c
> new file mode 100644
> index 000000000000..c1f842c06e32
> --- /dev/null
> +++ b/drivers/net/ovpn/bind.c
> +static void ovpn_bind_release_rcu(struct rcu_head *head)
> +{
> +	struct ovpn_bind *bind = container_of(head, struct ovpn_bind, rcu);
> +
> +	kfree(bind);
> +}
> +
> +void ovpn_bind_reset(struct ovpn_peer *peer, struct ovpn_bind *new)
> +{
> +	struct ovpn_bind *old;
> +
> +	spin_lock_bh(&peer->lock);
> +	old = rcu_replace_pointer(peer->bind, new, true);
> +	spin_unlock_bh(&peer->lock);
> +
> +	if (old)
> +		call_rcu(&old->rcu, ovpn_bind_release_rcu);

Isn't that just kfree_rcu? (note kfree_rcu doesn't need the NULL check)

> +}


> diff --git a/drivers/net/ovpn/bind.h b/drivers/net/ovpn/bind.h
> new file mode 100644
> index 000000000000..61433550a961
> --- /dev/null
> +++ b/drivers/net/ovpn/bind.h
[...]
> +static inline bool ovpn_bind_skb_src_match(const struct ovpn_bind *bind,
> +					   struct sk_buff *skb)

nit: I think skb can also be const here


> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
> index 338e99dfe886..a420bb45f25f 100644
> --- a/drivers/net/ovpn/io.c
> +++ b/drivers/net/ovpn/io.c
> @@ -13,6 +13,7 @@
>  #include "io.h"
>  #include "ovpnstruct.h"
>  #include "netlink.h"
> +#include "peer.h"
>  
>  int ovpn_struct_init(struct net_device *dev)
>  {
> @@ -25,6 +26,13 @@ int ovpn_struct_init(struct net_device *dev)
>  	if (err < 0)
>  		return err;
>  
> +	spin_lock_init(&ovpn->lock);
> +
> +	ovpn->events_wq = alloc_workqueue("ovpn-events-wq-%s", WQ_MEM_RECLAIM,
> +					  0, dev->name);

I'm not convinced this will get freed consistently if
register_netdevice fails early (before ndo_init).  After talking to
Paolo, it seems this should be moved into a new ->ndo_init instead.

> +	if (!ovpn->events_wq)
> +		return -ENOMEM;
> +
>  	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
>  	if (!dev->tstats)
>  		return -ENOMEM;
> diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
> index cc8a97a1a189..dba35ecb236b 100644
> --- a/drivers/net/ovpn/main.c
> +++ b/drivers/net/ovpn/main.c
> @@ -37,6 +39,9 @@ static void ovpn_struct_free(struct net_device *net)
>  	rtnl_unlock();
>  
>  	free_percpu(net->tstats);
> +	flush_workqueue(ovpn->events_wq);
> +	destroy_workqueue(ovpn->events_wq);

Is the flush needed? I'm not an expert on workqueues, but from a quick
look at destroy_workqueue it calls drain_workqueue, which would take
care of flushing the queue?

> +	rcu_barrier();
>  }
>  

[...]
> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
> index ee05b8a2c61d..b79d4f0474b0 100644
> --- a/drivers/net/ovpn/ovpnstruct.h
> +++ b/drivers/net/ovpn/ovpnstruct.h
> @@ -17,12 +17,19 @@
>   * @dev: the actual netdev representing the tunnel
>   * @registered: whether dev is still registered with netdev or not
>   * @mode: device operation mode (i.e. p2p, mp, ..)
> + * @lock: protect this object
> + * @event_wq: used to schedule generic events that may sleep and that need to be
> + *            performed outside of softirq context
> + * @peer: in P2P mode, this is the only remote peer
>   * @dev_list: entry for the module wide device list
>   */
>  struct ovpn_struct {
>  	struct net_device *dev;
>  	bool registered;
>  	enum ovpn_mode mode;
> +	spinlock_t lock; /* protect writing to the ovpn_struct object */

nit: the comment isn't really needed since you have kdoc saying the same thing

> +	struct workqueue_struct *events_wq;
> +	struct ovpn_peer __rcu *peer;
>  	struct list_head dev_list;
>  };
>  
> diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
> new file mode 100644
> index 000000000000..2948b7320d47
> --- /dev/null
> +++ b/drivers/net/ovpn/peer.c
[...]
> +/**
> + * ovpn_peer_free - release private members and free peer object
> + * @peer: the peer to free
> + */
> +static void ovpn_peer_free(struct ovpn_peer *peer)
> +{
> +	ovpn_bind_reset(peer, NULL);
> +
> +	WARN_ON(!__ptr_ring_empty(&peer->tx_ring));

Could you pass a destructor to ptr_ring_cleanup instead of all these WARNs?

> +	ptr_ring_cleanup(&peer->tx_ring, NULL);
> +	WARN_ON(!__ptr_ring_empty(&peer->rx_ring));
> +	ptr_ring_cleanup(&peer->rx_ring, NULL);
> +	WARN_ON(!__ptr_ring_empty(&peer->netif_rx_ring));
> +	ptr_ring_cleanup(&peer->netif_rx_ring, NULL);
> +
> +	dst_cache_destroy(&peer->dst_cache);
> +
> +	dev_put(peer->ovpn->dev);
> +
> +	kfree(peer);
> +}

[...]
> +void ovpn_peer_release(struct ovpn_peer *peer)
> +{
> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
> +}
> +
> +/**
> + * ovpn_peer_delete_work - work scheduled to release peer in process context
> + * @work: the work object
> + */
> +static void ovpn_peer_delete_work(struct work_struct *work)
> +{
> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
> +					      delete_work);
> +	ovpn_peer_release(peer);

Does call_rcu really need to run in process context?

> +}

[...]
> +/**
> + * ovpn_peer_transp_match - check if sockaddr and peer binding match
> + * @peer: the peer to get the binding from
> + * @ss: the sockaddr to match
> + *
> + * Return: true if sockaddr and binding match or false otherwise
> + */
> +static bool ovpn_peer_transp_match(struct ovpn_peer *peer,
> +				   struct sockaddr_storage *ss)
> +{
[...]
> +	case AF_INET6:
> +		sa6 = (struct sockaddr_in6 *)ss;
> +		if (memcmp(&sa6->sin6_addr, &bind->sa.in6.sin6_addr,
> +			   sizeof(struct in6_addr)))

ipv6_addr_equal?

> +			return false;
> +		if (sa6->sin6_port != bind->sa.in6.sin6_port)
> +			return false;
> +		break;

[...]
> +struct ovpn_peer *ovpn_peer_get_by_id(struct ovpn_struct *ovpn, u32 peer_id)
> +{
> +	struct ovpn_peer *peer = NULL;
> +
> +	if (ovpn->mode == OVPN_MODE_P2P)
> +		peer = ovpn_peer_get_by_id_p2p(ovpn, peer_id);
> +
> +	return peer;
> +}
> +
> +/**
> + * ovpn_peer_add_p2p - add per to related tables in a P2P instance
                              ^
typo: peer?


[...]
> +/**
> + * ovpn_peer_del_p2p - delete peer from related tables in a P2P instance
> + * @peer: the peer to delete
> + * @reason: reason why the peer was deleted (sent to userspace)
> + *
> + * Return: 0 on success or a negative error code otherwise
> + */
> +static int ovpn_peer_del_p2p(struct ovpn_peer *peer,
> +			     enum ovpn_del_peer_reason reason)
> +{
> +	struct ovpn_peer *tmp;
> +	int ret = -ENOENT;
> +
> +	spin_lock_bh(&peer->ovpn->lock);
> +	tmp = rcu_dereference(peer->ovpn->peer);
> +	if (tmp != peer)
> +		goto unlock;

How do we recover if all those objects got out of sync? Are we stuck
with a broken peer?

And if this happens during interface deletion, aren't we leaking the
peer memory here?

> +	ovpn_peer_put(tmp);
> +	tmp->delete_reason = reason;
> +	RCU_INIT_POINTER(peer->ovpn->peer, NULL);
> +	ret = 0;
> +
> +unlock:
> +	spin_unlock_bh(&peer->ovpn->lock);
> +
> +	return ret;
> +}

[...]
> diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
> new file mode 100644
> index 000000000000..659df320525c
> --- /dev/null
> +++ b/drivers/net/ovpn/peer.h
[...]
> +/**
> + * struct ovpn_peer - the main remote peer object
> + * @ovpn: main openvpn instance this peer belongs to
> + * @id: unique identifier
> + * @vpn_addrs.ipv4: IPv4 assigned to peer on the tunnel
> + * @vpn_addrs.ipv6: IPv6 assigned to peer on the tunnel
> + * @tx_ring: queue of outgoing poackets to this peer
> + * @rx_ring: queue of incoming packets from this peer
> + * @netif_rx_ring: queue of packets to be sent to the netdevice via NAPI
> + * @dst_cache: cache for dst_entry used to send to peer
> + * @bind: remote peer binding
> + * @halt: true if ovpn_peer_mark_delete was called
> + * @delete_reason: why peer was deleted (i.e. timeout, transport error, ..)
> + * @lock: protects binding to peer (bind)
> + * @refcount: reference counter
> + * @rcu: used to free peer in an RCU safe way
> + * @delete_work: deferred cleanup work, used to notify userspace
> + */
> +struct ovpn_peer {
> +	struct ovpn_struct *ovpn;
> +	u32 id;
> +	struct {
> +		struct in_addr ipv4;
> +		struct in6_addr ipv6;
> +	} vpn_addrs;
> +	struct ptr_ring tx_ring;
> +	struct ptr_ring rx_ring;
> +	struct ptr_ring netif_rx_ring;
> +	struct dst_cache dst_cache;
> +	struct ovpn_bind __rcu *bind;
> +	bool halt;
> +	enum ovpn_del_peer_reason delete_reason;
> +	spinlock_t lock; /* protects bind */

nit: the comment isn't really needed, it's redundant with kdoc.

> +	struct kref refcount;
> +	struct rcu_head rcu;
> +	struct work_struct delete_work;
> +};

Antonio Quartulli May 8, 2024, 8:31 p.m. UTC | #2

On 08/05/2024 18:06, Sabrina Dubroca wrote:
> 2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
>> An ovpn_peer object holds the whole status of a remote peer
>> (regardless whether it is a server or a client).
>>
>> This includes status for crypto, tx/rx buffers, napi, etc.
>>
>> Only support for one peer is introduced (P2P mode).
>> Multi peer support is introduced with a later patch.
>>
>> Along with the ovpn_peer, also the ovpn_bind object is introcued
>                                                           ^
> typo: "introduced"

thanks

> 
>> as the two are strictly related.
>> An ovpn_bind object wraps a sockaddr representing the local
>> coordinates being used to talk to a specific peer.
> 
>> diff --git a/drivers/net/ovpn/bind.c b/drivers/net/ovpn/bind.c
>> new file mode 100644
>> index 000000000000..c1f842c06e32
>> --- /dev/null
>> +++ b/drivers/net/ovpn/bind.c
>> +static void ovpn_bind_release_rcu(struct rcu_head *head)
>> +{
>> +	struct ovpn_bind *bind = container_of(head, struct ovpn_bind, rcu);
>> +
>> +	kfree(bind);
>> +}
>> +
>> +void ovpn_bind_reset(struct ovpn_peer *peer, struct ovpn_bind *new)
>> +{
>> +	struct ovpn_bind *old;
>> +
>> +	spin_lock_bh(&peer->lock);
>> +	old = rcu_replace_pointer(peer->bind, new, true);
>> +	spin_unlock_bh(&peer->lock);
>> +
>> +	if (old)
>> +		call_rcu(&old->rcu, ovpn_bind_release_rcu);
> 
> Isn't that just kfree_rcu? (note kfree_rcu doesn't need the NULL check)

yeah, you're right. I think ovpn_bind_release_rcu() was more complex in 
the past, but got reduced step by step...will directly use kfree_rcu().

> 
>> +}
> 
> 
>> diff --git a/drivers/net/ovpn/bind.h b/drivers/net/ovpn/bind.h
>> new file mode 100644
>> index 000000000000..61433550a961
>> --- /dev/null
>> +++ b/drivers/net/ovpn/bind.h
> [...]
>> +static inline bool ovpn_bind_skb_src_match(const struct ovpn_bind *bind,
>> +					   struct sk_buff *skb)
> 
> nit: I think skb can also be const here

right

> 
> 
>> diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c
>> index 338e99dfe886..a420bb45f25f 100644
>> --- a/drivers/net/ovpn/io.c
>> +++ b/drivers/net/ovpn/io.c
>> @@ -13,6 +13,7 @@
>>   #include "io.h"
>>   #include "ovpnstruct.h"
>>   #include "netlink.h"
>> +#include "peer.h"
>>   
>>   int ovpn_struct_init(struct net_device *dev)
>>   {
>> @@ -25,6 +26,13 @@ int ovpn_struct_init(struct net_device *dev)
>>   	if (err < 0)
>>   		return err;
>>   
>> +	spin_lock_init(&ovpn->lock);
>> +
>> +	ovpn->events_wq = alloc_workqueue("ovpn-events-wq-%s", WQ_MEM_RECLAIM,
>> +					  0, dev->name);
> 
> I'm not convinced this will get freed consistently if
> register_netdevice fails early (before ndo_init).  After talking to
> Paolo, it seems this should be moved into a new ->ndo_init instead.

oh good point. I didn't consider that register_netdevice could fail that 
early.

> 
>> +	if (!ovpn->events_wq)
>> +		return -ENOMEM;
>> +
>>   	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
>>   	if (!dev->tstats)
>>   		return -ENOMEM;
>> diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c
>> index cc8a97a1a189..dba35ecb236b 100644
>> --- a/drivers/net/ovpn/main.c
>> +++ b/drivers/net/ovpn/main.c
>> @@ -37,6 +39,9 @@ static void ovpn_struct_free(struct net_device *net)
>>   	rtnl_unlock();
>>   
>>   	free_percpu(net->tstats);
>> +	flush_workqueue(ovpn->events_wq);
>> +	destroy_workqueue(ovpn->events_wq);
> 
> Is the flush needed? I'm not an expert on workqueues, but from a quick
> look at destroy_workqueue it calls drain_workqueue, which would take
> care of flushing the queue?

you're right. drain_workqueue calls __flush_workqueue as often as needed 
to empty the queue.
Therefore I can get rid of my flush_worqueue invocation.

> 
>> +	rcu_barrier();
>>   }
>>   
> 
> [...]
>> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
>> index ee05b8a2c61d..b79d4f0474b0 100644
>> --- a/drivers/net/ovpn/ovpnstruct.h
>> +++ b/drivers/net/ovpn/ovpnstruct.h
>> @@ -17,12 +17,19 @@
>>    * @dev: the actual netdev representing the tunnel
>>    * @registered: whether dev is still registered with netdev or not
>>    * @mode: device operation mode (i.e. p2p, mp, ..)
>> + * @lock: protect this object
>> + * @event_wq: used to schedule generic events that may sleep and that need to be
>> + *            performed outside of softirq context
>> + * @peer: in P2P mode, this is the only remote peer
>>    * @dev_list: entry for the module wide device list
>>    */
>>   struct ovpn_struct {
>>   	struct net_device *dev;
>>   	bool registered;
>>   	enum ovpn_mode mode;
>> +	spinlock_t lock; /* protect writing to the ovpn_struct object */
> 
> nit: the comment isn't really needed since you have kdoc saying the same thing

True, but checkpatch.pl (or some other script?) was still throwing a 
warning, therefore I added this comment to silence it.

> 
>> +	struct workqueue_struct *events_wq;
>> +	struct ovpn_peer __rcu *peer;
>>   	struct list_head dev_list;
>>   };
>>   
>> diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
>> new file mode 100644
>> index 000000000000..2948b7320d47
>> --- /dev/null
>> +++ b/drivers/net/ovpn/peer.c
> [...]
>> +/**
>> + * ovpn_peer_free - release private members and free peer object
>> + * @peer: the peer to free
>> + */
>> +static void ovpn_peer_free(struct ovpn_peer *peer)
>> +{
>> +	ovpn_bind_reset(peer, NULL);
>> +
>> +	WARN_ON(!__ptr_ring_empty(&peer->tx_ring));
> 
> Could you pass a destructor to ptr_ring_cleanup instead of all these WARNs?

hmm but if we remove the WARNs then we lose the possibility to catch 
potential bugs, no? rings should definitely be empty at this point.

Or you think I should just not care and free any potentially remaining item?

> 
>> +	ptr_ring_cleanup(&peer->tx_ring, NULL);
>> +	WARN_ON(!__ptr_ring_empty(&peer->rx_ring));
>> +	ptr_ring_cleanup(&peer->rx_ring, NULL);
>> +	WARN_ON(!__ptr_ring_empty(&peer->netif_rx_ring));
>> +	ptr_ring_cleanup(&peer->netif_rx_ring, NULL);
>> +
>> +	dst_cache_destroy(&peer->dst_cache);
>> +
>> +	dev_put(peer->ovpn->dev);
>> +
>> +	kfree(peer);
>> +}
> 
> [...]
>> +void ovpn_peer_release(struct ovpn_peer *peer)
>> +{
>> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>> +}
>> +
>> +/**
>> + * ovpn_peer_delete_work - work scheduled to release peer in process context
>> + * @work: the work object
>> + */
>> +static void ovpn_peer_delete_work(struct work_struct *work)
>> +{
>> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>> +					      delete_work);
>> +	ovpn_peer_release(peer);
> 
> Does call_rcu really need to run in process context?

Reason for switching to process context is that we have to invoke 
ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and 
the latter requires a reference to the peer.

For this reason I thought it would be safe to have 
ovpn_nl_notify_del_peer and call_rcu invoked by the same context.

If I invoke call_rcu in ovpn_peer_release_kref, how can I be sure that 
the peer hasn't been free'd already when ovpn_nl_notify_del_peer is 
executed?


> 
>> +}
> 
> [...]
>> +/**
>> + * ovpn_peer_transp_match - check if sockaddr and peer binding match
>> + * @peer: the peer to get the binding from
>> + * @ss: the sockaddr to match
>> + *
>> + * Return: true if sockaddr and binding match or false otherwise
>> + */
>> +static bool ovpn_peer_transp_match(struct ovpn_peer *peer,
>> +				   struct sockaddr_storage *ss)
>> +{
> [...]
>> +	case AF_INET6:
>> +		sa6 = (struct sockaddr_in6 *)ss;
>> +		if (memcmp(&sa6->sin6_addr, &bind->sa.in6.sin6_addr,
>> +			   sizeof(struct in6_addr)))
> 
> ipv6_addr_equal?
> 

definitely. thanks

>> +			return false;
>> +		if (sa6->sin6_port != bind->sa.in6.sin6_port)
>> +			return false;
>> +		break;
> 
> [...]
>> +struct ovpn_peer *ovpn_peer_get_by_id(struct ovpn_struct *ovpn, u32 peer_id)
>> +{
>> +	struct ovpn_peer *peer = NULL;
>> +
>> +	if (ovpn->mode == OVPN_MODE_P2P)
>> +		peer = ovpn_peer_get_by_id_p2p(ovpn, peer_id);
>> +
>> +	return peer;
>> +}
>> +
>> +/**
>> + * ovpn_peer_add_p2p - add per to related tables in a P2P instance
>                                ^
> typo: peer?

yeah

> 
> 
> [...]
>> +/**
>> + * ovpn_peer_del_p2p - delete peer from related tables in a P2P instance
>> + * @peer: the peer to delete
>> + * @reason: reason why the peer was deleted (sent to userspace)
>> + *
>> + * Return: 0 on success or a negative error code otherwise
>> + */
>> +static int ovpn_peer_del_p2p(struct ovpn_peer *peer,
>> +			     enum ovpn_del_peer_reason reason)
>> +{
>> +	struct ovpn_peer *tmp;
>> +	int ret = -ENOENT;
>> +
>> +	spin_lock_bh(&peer->ovpn->lock);
>> +	tmp = rcu_dereference(peer->ovpn->peer);
>> +	if (tmp != peer)
>> +		goto unlock;
> 
> How do we recover if all those objects got out of sync? Are we stuck
> with a broken peer?

mhhh I don't fully get the scenario you are depicting.

In P2P mode there is only peer stored (reference is saved in ovpn->peer)

When we want to get rid of it, we invoke ovpn_peer_del_p2p().
The check we are performing here is just about being sure that we are 
removing the exact peer we requested to remove (and not some other peer 
that was still floating around for some reason).

> 
> And if this happens during interface deletion, aren't we leaking the
> peer memory here?

at interface deletion we call

ovpn_iface_destruct -> ovpn_peer_release_p2p -> 
ovpn_peer_del_p2p(ovpn->peer)

so at the last step we just ask to remove the very same peer that is 
curently stored, which should just never fail.

makes sense?

> 
>> +	ovpn_peer_put(tmp);
>> +	tmp->delete_reason = reason;
>> +	RCU_INIT_POINTER(peer->ovpn->peer, NULL);
>> +	ret = 0;
>> +
>> +unlock:
>> +	spin_unlock_bh(&peer->ovpn->lock);
>> +
>> +	return ret;
>> +}
> 
> [...]
>> diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h
>> new file mode 100644
>> index 000000000000..659df320525c
>> --- /dev/null
>> +++ b/drivers/net/ovpn/peer.h
> [...]
>> +/**
>> + * struct ovpn_peer - the main remote peer object
>> + * @ovpn: main openvpn instance this peer belongs to
>> + * @id: unique identifier
>> + * @vpn_addrs.ipv4: IPv4 assigned to peer on the tunnel
>> + * @vpn_addrs.ipv6: IPv6 assigned to peer on the tunnel
>> + * @tx_ring: queue of outgoing poackets to this peer
>> + * @rx_ring: queue of incoming packets from this peer
>> + * @netif_rx_ring: queue of packets to be sent to the netdevice via NAPI
>> + * @dst_cache: cache for dst_entry used to send to peer
>> + * @bind: remote peer binding
>> + * @halt: true if ovpn_peer_mark_delete was called
>> + * @delete_reason: why peer was deleted (i.e. timeout, transport error, ..)
>> + * @lock: protects binding to peer (bind)
>> + * @refcount: reference counter
>> + * @rcu: used to free peer in an RCU safe way
>> + * @delete_work: deferred cleanup work, used to notify userspace
>> + */
>> +struct ovpn_peer {
>> +	struct ovpn_struct *ovpn;
>> +	u32 id;
>> +	struct {
>> +		struct in_addr ipv4;
>> +		struct in6_addr ipv6;
>> +	} vpn_addrs;
>> +	struct ptr_ring tx_ring;
>> +	struct ptr_ring rx_ring;
>> +	struct ptr_ring netif_rx_ring;
>> +	struct dst_cache dst_cache;
>> +	struct ovpn_bind __rcu *bind;
>> +	bool halt;
>> +	enum ovpn_del_peer_reason delete_reason;
>> +	spinlock_t lock; /* protects bind */
> 
> nit: the comment isn't really needed, it's redundant with kdoc.

like before, also here I had a warning which I wanted to silence.

> 
>> +	struct kref refcount;
>> +	struct rcu_head rcu;
>> +	struct work_struct delete_work;
>> +};
> 

Thanks a lot!

Sabrina Dubroca May 9, 2024, 1:04 p.m. UTC | #3

2024-05-08, 22:31:51 +0200, Antonio Quartulli wrote:
> On 08/05/2024 18:06, Sabrina Dubroca wrote:
> > 2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
> > > diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
> > > index ee05b8a2c61d..b79d4f0474b0 100644
> > > --- a/drivers/net/ovpn/ovpnstruct.h
> > > +++ b/drivers/net/ovpn/ovpnstruct.h
> > > @@ -17,12 +17,19 @@
> > >    * @dev: the actual netdev representing the tunnel
> > >    * @registered: whether dev is still registered with netdev or not
> > >    * @mode: device operation mode (i.e. p2p, mp, ..)
> > > + * @lock: protect this object
> > > + * @event_wq: used to schedule generic events that may sleep and that need to be
> > > + *            performed outside of softirq context
> > > + * @peer: in P2P mode, this is the only remote peer
> > >    * @dev_list: entry for the module wide device list
> > >    */
> > >   struct ovpn_struct {
> > >   	struct net_device *dev;
> > >   	bool registered;
> > >   	enum ovpn_mode mode;
> > > +	spinlock_t lock; /* protect writing to the ovpn_struct object */
> > 
> > nit: the comment isn't really needed since you have kdoc saying the same thing
> 
> True, but checkpatch.pl (or some other script?) was still throwing a
> warning, therefore I added this comment to silence it.

Ok, then I guess the comment (and the other one below) can stay. That
sounds like a checkpatch.pl bug.

> > > +	struct workqueue_struct *events_wq;
> > > +	struct ovpn_peer __rcu *peer;
> > >   	struct list_head dev_list;
> > >   };
> > > diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
> > > new file mode 100644
> > > index 000000000000..2948b7320d47
> > > --- /dev/null
> > > +++ b/drivers/net/ovpn/peer.c
> > [...]
> > > +/**
> > > + * ovpn_peer_free - release private members and free peer object
> > > + * @peer: the peer to free
> > > + */
> > > +static void ovpn_peer_free(struct ovpn_peer *peer)
> > > +{
> > > +	ovpn_bind_reset(peer, NULL);
> > > +
> > > +	WARN_ON(!__ptr_ring_empty(&peer->tx_ring));
> > 
> > Could you pass a destructor to ptr_ring_cleanup instead of all these WARNs?
> 
> hmm but if we remove the WARNs then we lose the possibility to catch
> potential bugs, no? rings should definitely be empty at this point.

Ok, I haven't looked deep enough into how all the parts interact to
understand that. The refcount bump around the tx_ring loop in
ovpn_encrypt_work() takes care of that? Maybe worth a comment "$RING
should be empty at this point because of XYZ" (for each of the rings).

> Or you think I should just not care and free any potentially remaining item?

Whether you WARN or not, any remaining item is going to be leaked. I'd
go with WARN (or maybe DEBUG_NET_WARN_ON_ONCE) and free remaining
items. It should never happen but seems easy to deal with, so why not
handle it?

> > > +void ovpn_peer_release(struct ovpn_peer *peer)
> > > +{
> > > +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
> > > +}
> > > +
> > > +/**
> > > + * ovpn_peer_delete_work - work scheduled to release peer in process context
> > > + * @work: the work object
> > > + */
> > > +static void ovpn_peer_delete_work(struct work_struct *work)
> > > +{
> > > +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
> > > +					      delete_work);
> > > +	ovpn_peer_release(peer);
> > 
> > Does call_rcu really need to run in process context?
> 
> Reason for switching to process context is that we have to invoke
> ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and the
> latter requires a reference to the peer.

I'm confused. When you say "requires a reference to the peer", do you
mean accessing fields of the peer object? I don't see why this
requires ovpn_nl_notify_del_peer to to run from process context.

> For this reason I thought it would be safe to have ovpn_nl_notify_del_peer
> and call_rcu invoked by the same context.
> 
> If I invoke call_rcu in ovpn_peer_release_kref, how can I be sure that the
> peer hasn't been free'd already when ovpn_nl_notify_del_peer is executed?

Put the ovpn_nl_notify_del_peer call before the call_rcu, it will
access the peer and then once that's done call_rcu will do its job?


> > > +/**
> > > + * ovpn_peer_del_p2p - delete peer from related tables in a P2P instance
> > > + * @peer: the peer to delete
> > > + * @reason: reason why the peer was deleted (sent to userspace)
> > > + *
> > > + * Return: 0 on success or a negative error code otherwise
> > > + */
> > > +static int ovpn_peer_del_p2p(struct ovpn_peer *peer,
> > > +			     enum ovpn_del_peer_reason reason)
> > > +{
> > > +	struct ovpn_peer *tmp;
> > > +	int ret = -ENOENT;
> > > +
> > > +	spin_lock_bh(&peer->ovpn->lock);
> > > +	tmp = rcu_dereference(peer->ovpn->peer);
> > > +	if (tmp != peer)
> > > +		goto unlock;
> > 
> > How do we recover if all those objects got out of sync? Are we stuck
> > with a broken peer?
> 
> mhhh I don't fully get the scenario you are depicting.
> 
> In P2P mode there is only peer stored (reference is saved in ovpn->peer)
> 
> When we want to get rid of it, we invoke ovpn_peer_del_p2p().
> The check we are performing here is just about being sure that we are
> removing the exact peer we requested to remove (and not some other peer that
> was still floating around for some reason).

But it's the right peer because it's the one the caller decided to get
rid of.  How about DEBUG_NET_WARN_ON_ONCE(tmp != peer) and always
releasing the peer?

> > And if this happens during interface deletion, aren't we leaking the
> > peer memory here?
> 
> at interface deletion we call
> 
> ovpn_iface_destruct -> ovpn_peer_release_p2p ->
> ovpn_peer_del_p2p(ovpn->peer)
> 
> so at the last step we just ask to remove the very same peer that is
> curently stored, which should just never fail.

But that's not what the test checks for. If ovpn->peer->ovpn != ovpn,
the test in ovpn_peer_del_p2p will fail. That's "objects getting out
of sync" in my previous email. The peer has a bogus back reference to
its ovpn parent, but it's ovpn->peer nevertheless.

Andrew Lunn May 9, 2024, 1:24 p.m. UTC | #4

On Thu, May 09, 2024 at 03:04:36PM +0200, Sabrina Dubroca wrote:
> 2024-05-08, 22:31:51 +0200, Antonio Quartulli wrote:
> > On 08/05/2024 18:06, Sabrina Dubroca wrote:
> > > 2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
> > > > diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
> > > > index ee05b8a2c61d..b79d4f0474b0 100644
> > > > --- a/drivers/net/ovpn/ovpnstruct.h
> > > > +++ b/drivers/net/ovpn/ovpnstruct.h
> > > > @@ -17,12 +17,19 @@
> > > >    * @dev: the actual netdev representing the tunnel
> > > >    * @registered: whether dev is still registered with netdev or not
> > > >    * @mode: device operation mode (i.e. p2p, mp, ..)
> > > > + * @lock: protect this object
> > > > + * @event_wq: used to schedule generic events that may sleep and that need to be
> > > > + *            performed outside of softirq context
> > > > + * @peer: in P2P mode, this is the only remote peer
> > > >    * @dev_list: entry for the module wide device list
> > > >    */
> > > >   struct ovpn_struct {
> > > >   	struct net_device *dev;
> > > >   	bool registered;
> > > >   	enum ovpn_mode mode;
> > > > +	spinlock_t lock; /* protect writing to the ovpn_struct object */
> > > 
> > > nit: the comment isn't really needed since you have kdoc saying the same thing
> > 
> > True, but checkpatch.pl (or some other script?) was still throwing a
> > warning, therefore I added this comment to silence it.
> 
> Ok, then I guess the comment (and the other one below) can stay. That
> sounds like a checkpatch.pl bug.

I suspect it is more complex than that. checkpatch does not understand
kdoc. It just knows the rule that there should be a comment next to a
lock, hopefully indicating what the lock protects. In order to fix
this, checkpatch would need to somehow invoke the kdoc parser, and ask
it if the lock has kdoc documentation.

I suspect we are just going to have to live with this.

  Andrew

Antonio Quartulli May 9, 2024, 1:44 p.m. UTC | #5

On 09/05/2024 15:04, Sabrina Dubroca wrote:
[..]
>>>> +	struct workqueue_struct *events_wq;
>>>> +	struct ovpn_peer __rcu *peer;
>>>>    	struct list_head dev_list;
>>>>    };
>>>> diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c
>>>> new file mode 100644
>>>> index 000000000000..2948b7320d47
>>>> --- /dev/null
>>>> +++ b/drivers/net/ovpn/peer.c
>>> [...]
>>>> +/**
>>>> + * ovpn_peer_free - release private members and free peer object
>>>> + * @peer: the peer to free
>>>> + */
>>>> +static void ovpn_peer_free(struct ovpn_peer *peer)
>>>> +{
>>>> +	ovpn_bind_reset(peer, NULL);
>>>> +
>>>> +	WARN_ON(!__ptr_ring_empty(&peer->tx_ring));
>>>
>>> Could you pass a destructor to ptr_ring_cleanup instead of all these WARNs?
>>
>> hmm but if we remove the WARNs then we lose the possibility to catch
>> potential bugs, no? rings should definitely be empty at this point.
> 
> Ok, I haven't looked deep enough into how all the parts interact to
> understand that. The refcount bump around the tx_ring loop in
> ovpn_encrypt_work() takes care of that? Maybe worth a comment "$RING
> should be empty at this point because of XYZ" (for each of the rings).

Yeah, all piped skbs will be processed before exiting.
Ok, will add a comment.

> 
>> Or you think I should just not care and free any potentially remaining item?
> 
> Whether you WARN or not, any remaining item is going to be leaked. I'd
> go with WARN (or maybe DEBUG_NET_WARN_ON_ONCE) and free remaining
> items. It should never happen but seems easy to deal with, so why not
> handle it?

Sure, passing consume_skb as destructor to ptr_ring_cleanup should be 
enough.

> 
>>>> +void ovpn_peer_release(struct ovpn_peer *peer)
>>>> +{
>>>> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>>>> +}
>>>> +
>>>> +/**
>>>> + * ovpn_peer_delete_work - work scheduled to release peer in process context
>>>> + * @work: the work object
>>>> + */
>>>> +static void ovpn_peer_delete_work(struct work_struct *work)
>>>> +{
>>>> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>>>> +					      delete_work);
>>>> +	ovpn_peer_release(peer);
>>>
>>> Does call_rcu really need to run in process context?
>>
>> Reason for switching to process context is that we have to invoke
>> ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and the
>> latter requires a reference to the peer.
> 
> I'm confused. When you say "requires a reference to the peer", do you
> mean accessing fields of the peer object? I don't see why this
> requires ovpn_nl_notify_del_peer to to run from process context.

ovpn_nl_notify_del_peer sends a netlink message to userspace and I was 
under the impression that it may block/sleep, no?
For this reason I assumed it must be executed in process context.

> 
>> For this reason I thought it would be safe to have ovpn_nl_notify_del_peer
>> and call_rcu invoked by the same context.
>>
>> If I invoke call_rcu in ovpn_peer_release_kref, how can I be sure that the
>> peer hasn't been free'd already when ovpn_nl_notify_del_peer is executed?
> 
> Put the ovpn_nl_notify_del_peer call before the call_rcu, it will
> access the peer and then once that's done call_rcu will do its job?

If ovpn_nl_notify_del_peer is allowed to run out of process context, 
then I totally agree.

Will test again.

> 
> 
>>>> +/**
>>>> + * ovpn_peer_del_p2p - delete peer from related tables in a P2P instance
>>>> + * @peer: the peer to delete
>>>> + * @reason: reason why the peer was deleted (sent to userspace)
>>>> + *
>>>> + * Return: 0 on success or a negative error code otherwise
>>>> + */
>>>> +static int ovpn_peer_del_p2p(struct ovpn_peer *peer,
>>>> +			     enum ovpn_del_peer_reason reason)
>>>> +{
>>>> +	struct ovpn_peer *tmp;
>>>> +	int ret = -ENOENT;
>>>> +
>>>> +	spin_lock_bh(&peer->ovpn->lock);
>>>> +	tmp = rcu_dereference(peer->ovpn->peer);
>>>> +	if (tmp != peer)
>>>> +		goto unlock;
>>>
>>> How do we recover if all those objects got out of sync? Are we stuck
>>> with a broken peer?
>>
>> mhhh I don't fully get the scenario you are depicting.
>>
>> In P2P mode there is only peer stored (reference is saved in ovpn->peer)
>>
>> When we want to get rid of it, we invoke ovpn_peer_del_p2p().
>> The check we are performing here is just about being sure that we are
>> removing the exact peer we requested to remove (and not some other peer that
>> was still floating around for some reason).
> 
> But it's the right peer because it's the one the caller decided to get
> rid of.  How about DEBUG_NET_WARN_ON_ONCE(tmp != peer) and always
> releasing the peer?

sounds good. I should force myself to use more WARN_ON for conditions 
that are truly unexpected.

This said, I have a question regarding DEBUG_NET_WARN_ON_ONCE: it prints 
something only if CONFIG_DEBUG_NET is enabled.
Is this the case on standard desktop/server distribution? Otherwise how 
are we going to get reports from users?

> 
>>> And if this happens during interface deletion, aren't we leaking the
>>> peer memory here?
>>
>> at interface deletion we call
>>
>> ovpn_iface_destruct -> ovpn_peer_release_p2p ->
>> ovpn_peer_del_p2p(ovpn->peer)
>>
>> so at the last step we just ask to remove the very same peer that is
>> curently stored, which should just never fail.
> 
> But that's not what the test checks for. If ovpn->peer->ovpn != ovpn,
> the test in ovpn_peer_del_p2p will fail. That's "objects getting out
> of sync" in my previous email. The peer has a bogus back reference to
> its ovpn parent, but it's ovpn->peer nevertheless.
> 

Oh thanks for explaining that.

Ok, my assumption is that "ovpn->peer->ovpn != ovpn" can never be true.

Peers are created within the context of one ovpn object and are never 
exposed to other ovpns.

I hope it makes sense.

Andrew Lunn May 9, 2024, 1:55 p.m. UTC | #6

> > Whether you WARN or not, any remaining item is going to be leaked. I'd
> > go with WARN (or maybe DEBUG_NET_WARN_ON_ONCE) and free remaining
> > items. It should never happen but seems easy to deal with, so why not
> > handle it?

> This said, I have a question regarding DEBUG_NET_WARN_ON_ONCE: it prints
> something only if CONFIG_DEBUG_NET is enabled.
> Is this the case on standard desktop/server distribution? Otherwise how are
> we going to get reports from users?

A bit tangential, but:

https://lwn.net/Articles/969923/

	Andrew

Sabrina Dubroca May 9, 2024, 2:17 p.m. UTC | #7

2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
> On 09/05/2024 15:04, Sabrina Dubroca wrote:
> > > > > +void ovpn_peer_release(struct ovpn_peer *peer)
> > > > > +{
> > > > > +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
> > > > > +}
> > > > > +
> > > > > +/**
> > > > > + * ovpn_peer_delete_work - work scheduled to release peer in process context
> > > > > + * @work: the work object
> > > > > + */
> > > > > +static void ovpn_peer_delete_work(struct work_struct *work)
> > > > > +{
> > > > > +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
> > > > > +					      delete_work);
> > > > > +	ovpn_peer_release(peer);
> > > > 
> > > > Does call_rcu really need to run in process context?
> > > 
> > > Reason for switching to process context is that we have to invoke
> > > ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and the
> > > latter requires a reference to the peer.
> > 
> > I'm confused. When you say "requires a reference to the peer", do you
> > mean accessing fields of the peer object? I don't see why this
> > requires ovpn_nl_notify_del_peer to to run from process context.
> 
> ovpn_nl_notify_del_peer sends a netlink message to userspace and I was under
> the impression that it may block/sleep, no?
> For this reason I assumed it must be executed in process context.

With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
context. Firing up a workqueue just to send a 100B netlink message
seems a bit overkill.



> This said, I have a question regarding DEBUG_NET_WARN_ON_ONCE: it prints
> something only if CONFIG_DEBUG_NET is enabled.
> Is this the case on standard desktop/server distribution? Otherwise how are
> we going to get reports from users?

That's pretty much why I'm suggesting to use it. For those things that
should really never happen, I think letting developers find them
during testing (or syzbot when it gets to your driver) is enough. I'm
not convinced getting a stack trace from a user without any ability to
reproduce is that useful.

But if you or someone else really want some WARN_ONs, I can live with
that.

> > > > And if this happens during interface deletion, aren't we leaking the
> > > > peer memory here?
> > > 
> > > at interface deletion we call
> > > 
> > > ovpn_iface_destruct -> ovpn_peer_release_p2p ->
> > > ovpn_peer_del_p2p(ovpn->peer)
> > > 
> > > so at the last step we just ask to remove the very same peer that is
> > > curently stored, which should just never fail.
> > 
> > But that's not what the test checks for. If ovpn->peer->ovpn != ovpn,
> > the test in ovpn_peer_del_p2p will fail. That's "objects getting out
> > of sync" in my previous email. The peer has a bogus back reference to
> > its ovpn parent, but it's ovpn->peer nevertheless.
> > 
> 
> Oh thanks for explaining that.
> 
> Ok, my assumption is that "ovpn->peer->ovpn != ovpn" can never be true.
> 
> Peers are created within the context of one ovpn object and are never
> exposed to other ovpns.
> 
> I hope it makes sense.

Ok, so this would indicate that something has gone badly wrong. Is it
really worth checking for that (or maybe just during development)?

Antonio Quartulli May 9, 2024, 2:36 p.m. UTC | #8

On 09/05/2024 16:17, Sabrina Dubroca wrote:
> 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
>> On 09/05/2024 15:04, Sabrina Dubroca wrote:
>>>>>> +void ovpn_peer_release(struct ovpn_peer *peer)
>>>>>> +{
>>>>>> +	call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>>>>>> +}
>>>>>> +
>>>>>> +/**
>>>>>> + * ovpn_peer_delete_work - work scheduled to release peer in process context
>>>>>> + * @work: the work object
>>>>>> + */
>>>>>> +static void ovpn_peer_delete_work(struct work_struct *work)
>>>>>> +{
>>>>>> +	struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>>>>>> +					      delete_work);
>>>>>> +	ovpn_peer_release(peer);
>>>>>
>>>>> Does call_rcu really need to run in process context?
>>>>
>>>> Reason for switching to process context is that we have to invoke
>>>> ovpn_nl_notify_del_peer (that sends a netlink event to userspace) and the
>>>> latter requires a reference to the peer.
>>>
>>> I'm confused. When you say "requires a reference to the peer", do you
>>> mean accessing fields of the peer object? I don't see why this
>>> requires ovpn_nl_notify_del_peer to to run from process context.
>>
>> ovpn_nl_notify_del_peer sends a netlink message to userspace and I was under
>> the impression that it may block/sleep, no?
>> For this reason I assumed it must be executed in process context.
> 
> With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
> context. Firing up a workqueue just to send a 100B netlink message
> seems a bit overkill.

Oh ok, I thought the send could be a problem too.

Will test with GFP_ATOMIC then. Thanks for the hint.

> 
> 
> 
>> This said, I have a question regarding DEBUG_NET_WARN_ON_ONCE: it prints
>> something only if CONFIG_DEBUG_NET is enabled.
>> Is this the case on standard desktop/server distribution? Otherwise how are
>> we going to get reports from users?
> 
> That's pretty much why I'm suggesting to use it. For those things that
> should really never happen, I think letting developers find them
> during testing (or syzbot when it gets to your driver) is enough. I'm
> not convinced getting a stack trace from a user without any ability to
> reproduce is that useful.
> 
> But if you or someone else really want some WARN_ONs, I can live with
> that.

I would personally prefer to keep the WARN_ON.

Since these bogus conditions may have consequences, users will open 
report in any case.
Having some extra text that they can post for us to contextualize the 
issue may be useful.

> 
>>>>> And if this happens during interface deletion, aren't we leaking the
>>>>> peer memory here?
>>>>
>>>> at interface deletion we call
>>>>
>>>> ovpn_iface_destruct -> ovpn_peer_release_p2p ->
>>>> ovpn_peer_del_p2p(ovpn->peer)
>>>>
>>>> so at the last step we just ask to remove the very same peer that is
>>>> curently stored, which should just never fail.
>>>
>>> But that's not what the test checks for. If ovpn->peer->ovpn != ovpn,
>>> the test in ovpn_peer_del_p2p will fail. That's "objects getting out
>>> of sync" in my previous email. The peer has a bogus back reference to
>>> its ovpn parent, but it's ovpn->peer nevertheless.
>>>
>>
>> Oh thanks for explaining that.
>>
>> Ok, my assumption is that "ovpn->peer->ovpn != ovpn" can never be true.
>>
>> Peers are created within the context of one ovpn object and are never
>> exposed to other ovpns.
>>
>> I hope it makes sense.
> 
> Ok, so this would indicate that something has gone badly wrong. Is it
> really worth checking for that (or maybe just during development)?
> 

A peer is created in ovpn_nl_set_peer_doit(), where the ovpn object is 
used to first assign peer->ovpn and then to store the peer in its own 
members. This all happens in one call and the value of ovpn can't be 
switched.

Anyway, bugs hide where we are most confident that things cannot go 
wrong :) So I'll still add a WARN_ON, just in case.

Thanks

Antonio Quartulli May 9, 2024, 2:53 p.m. UTC | #9

On 09/05/2024 16:36, Antonio Quartulli wrote:
> On 09/05/2024 16:17, Sabrina Dubroca wrote:
>> 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
>>> On 09/05/2024 15:04, Sabrina Dubroca wrote:
>>>>>>> +void ovpn_peer_release(struct ovpn_peer *peer)
>>>>>>> +{
>>>>>>> +    call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * ovpn_peer_delete_work - work scheduled to release peer in 
>>>>>>> process context
>>>>>>> + * @work: the work object
>>>>>>> + */
>>>>>>> +static void ovpn_peer_delete_work(struct work_struct *work)
>>>>>>> +{
>>>>>>> +    struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>>>>>>> +                          delete_work);
>>>>>>> +    ovpn_peer_release(peer);
>>>>>>
>>>>>> Does call_rcu really need to run in process context?
>>>>>
>>>>> Reason for switching to process context is that we have to invoke
>>>>> ovpn_nl_notify_del_peer (that sends a netlink event to userspace) 
>>>>> and the
>>>>> latter requires a reference to the peer.
>>>>
>>>> I'm confused. When you say "requires a reference to the peer", do you
>>>> mean accessing fields of the peer object? I don't see why this
>>>> requires ovpn_nl_notify_del_peer to to run from process context.
>>>
>>> ovpn_nl_notify_del_peer sends a netlink message to userspace and I 
>>> was under
>>> the impression that it may block/sleep, no?
>>> For this reason I assumed it must be executed in process context.
>>
>> With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
>> context. Firing up a workqueue just to send a 100B netlink message
>> seems a bit overkill.
> 
> Oh ok, I thought the send could be a problem too.
> 
> Will test with GFP_ATOMIC then. Thanks for the hint.

I am back and unfortunately we also have (added by a later patch):

  294         napi_disable(&peer->napi);
  295         netif_napi_del(&peer->napi);

that need to be executed in process context.
So it seems I must fire up the worker anyway..

Sabrina Dubroca May 10, 2024, 10:30 a.m. UTC | #10

2024-05-09, 16:53:42 +0200, Antonio Quartulli wrote:
> 
> 
> On 09/05/2024 16:36, Antonio Quartulli wrote:
> > On 09/05/2024 16:17, Sabrina Dubroca wrote:
> > > 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
> > > > On 09/05/2024 15:04, Sabrina Dubroca wrote:
> > > > > > > > +void ovpn_peer_release(struct ovpn_peer *peer)
> > > > > > > > +{
> > > > > > > > +    call_rcu(&peer->rcu, ovpn_peer_release_rcu);
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * ovpn_peer_delete_work - work scheduled to
> > > > > > > > release peer in process context
> > > > > > > > + * @work: the work object
> > > > > > > > + */
> > > > > > > > +static void ovpn_peer_delete_work(struct work_struct *work)
> > > > > > > > +{
> > > > > > > > +    struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
> > > > > > > > +                          delete_work);
> > > > > > > > +    ovpn_peer_release(peer);
> > > > > > > 
> > > > > > > Does call_rcu really need to run in process context?
> > > > > > 
> > > > > > Reason for switching to process context is that we have to invoke
> > > > > > ovpn_nl_notify_del_peer (that sends a netlink event to
> > > > > > userspace) and the
> > > > > > latter requires a reference to the peer.
> > > > > 
> > > > > I'm confused. When you say "requires a reference to the peer", do you
> > > > > mean accessing fields of the peer object? I don't see why this
> > > > > requires ovpn_nl_notify_del_peer to to run from process context.
> > > > 
> > > > ovpn_nl_notify_del_peer sends a netlink message to userspace and
> > > > I was under
> > > > the impression that it may block/sleep, no?
> > > > For this reason I assumed it must be executed in process context.
> > > 
> > > With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
> > > context. Firing up a workqueue just to send a 100B netlink message
> > > seems a bit overkill.
> > 
> > Oh ok, I thought the send could be a problem too.
> > 
> > Will test with GFP_ATOMIC then. Thanks for the hint.
> 
> I am back and unfortunately we also have (added by a later patch):
> 
>  294         napi_disable(&peer->napi);
>  295         netif_napi_del(&peer->napi);

Do you need the napi instance to be per peer, or can it be per
netdevice? If it's per netdevice you can clean it up in
->priv_destructor.

> that need to be executed in process context.
> So it seems I must fire up the worker anyway..

I hope with can simplify all that logic. There's some complexity
that's unavoidable in this kind of driver, but maybe not as much as
you've got here.

Antonio Quartulli May 10, 2024, 12:34 p.m. UTC | #11

On 10/05/2024 12:30, Sabrina Dubroca wrote:
> 2024-05-09, 16:53:42 +0200, Antonio Quartulli wrote:
>>
>>
>> On 09/05/2024 16:36, Antonio Quartulli wrote:
>>> On 09/05/2024 16:17, Sabrina Dubroca wrote:
>>>> 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
>>>>> On 09/05/2024 15:04, Sabrina Dubroca wrote:
>>>>>>>>> +void ovpn_peer_release(struct ovpn_peer *peer)
>>>>>>>>> +{
>>>>>>>>> +    call_rcu(&peer->rcu, ovpn_peer_release_rcu);
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>> +/**
>>>>>>>>> + * ovpn_peer_delete_work - work scheduled to
>>>>>>>>> release peer in process context
>>>>>>>>> + * @work: the work object
>>>>>>>>> + */
>>>>>>>>> +static void ovpn_peer_delete_work(struct work_struct *work)
>>>>>>>>> +{
>>>>>>>>> +    struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
>>>>>>>>> +                          delete_work);
>>>>>>>>> +    ovpn_peer_release(peer);
>>>>>>>>
>>>>>>>> Does call_rcu really need to run in process context?
>>>>>>>
>>>>>>> Reason for switching to process context is that we have to invoke
>>>>>>> ovpn_nl_notify_del_peer (that sends a netlink event to
>>>>>>> userspace) and the
>>>>>>> latter requires a reference to the peer.
>>>>>>
>>>>>> I'm confused. When you say "requires a reference to the peer", do you
>>>>>> mean accessing fields of the peer object? I don't see why this
>>>>>> requires ovpn_nl_notify_del_peer to to run from process context.
>>>>>
>>>>> ovpn_nl_notify_del_peer sends a netlink message to userspace and
>>>>> I was under
>>>>> the impression that it may block/sleep, no?
>>>>> For this reason I assumed it must be executed in process context.
>>>>
>>>> With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
>>>> context. Firing up a workqueue just to send a 100B netlink message
>>>> seems a bit overkill.
>>>
>>> Oh ok, I thought the send could be a problem too.
>>>
>>> Will test with GFP_ATOMIC then. Thanks for the hint.
>>
>> I am back and unfortunately we also have (added by a later patch):
>>
>>   294         napi_disable(&peer->napi);
>>   295         netif_napi_del(&peer->napi);
> 
> Do you need the napi instance to be per peer, or can it be per
> netdevice? If it's per netdevice you can clean it up in
> ->priv_destructor.

In an ideal world, at some point I could leverage on multiple CPUs 
handling traffic from multiple peers, therefore every queue in the 
driver is per peer, NAPI included.

Does it make sense?

Now, whether this is truly feasible from the core perspective is 
something I don't know yet.

For sure, for the time being I could shrink everything to one queue.
There is one workqueue only encrypting/decrypting packets right now, 
therefore multiple NAPI queues are not truly useful at this time.


> 
>> that need to be executed in process context.
>> So it seems I must fire up the worker anyway..
> 
> I hope with can simplify all that logic. There's some complexity
> that's unavoidable in this kind of driver, but maybe not as much as
> you've got here.
> 

I am all for simplification, and rest assured that the current version 
is already much simpler than what it originally was :-)

Sabrina Dubroca May 10, 2024, 2:11 p.m. UTC | #12

2024-05-10, 14:34:16 +0200, Antonio Quartulli wrote:
> On 10/05/2024 12:30, Sabrina Dubroca wrote:
> > 2024-05-09, 16:53:42 +0200, Antonio Quartulli wrote:
> > > 
> > > 
> > > On 09/05/2024 16:36, Antonio Quartulli wrote:
> > > > On 09/05/2024 16:17, Sabrina Dubroca wrote:
> > > > > 2024-05-09, 15:44:26 +0200, Antonio Quartulli wrote:
> > > > > > On 09/05/2024 15:04, Sabrina Dubroca wrote:
> > > > > > > > > > +void ovpn_peer_release(struct ovpn_peer *peer)
> > > > > > > > > > +{
> > > > > > > > > > +    call_rcu(&peer->rcu, ovpn_peer_release_rcu);
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > +/**
> > > > > > > > > > + * ovpn_peer_delete_work - work scheduled to
> > > > > > > > > > release peer in process context
> > > > > > > > > > + * @work: the work object
> > > > > > > > > > + */
> > > > > > > > > > +static void ovpn_peer_delete_work(struct work_struct *work)
> > > > > > > > > > +{
> > > > > > > > > > +    struct ovpn_peer *peer = container_of(work, struct ovpn_peer,
> > > > > > > > > > +                          delete_work);
> > > > > > > > > > +    ovpn_peer_release(peer);
> > > > > > > > > 
> > > > > > > > > Does call_rcu really need to run in process context?
> > > > > > > > 
> > > > > > > > Reason for switching to process context is that we have to invoke
> > > > > > > > ovpn_nl_notify_del_peer (that sends a netlink event to
> > > > > > > > userspace) and the
> > > > > > > > latter requires a reference to the peer.
> > > > > > > 
> > > > > > > I'm confused. When you say "requires a reference to the peer", do you
> > > > > > > mean accessing fields of the peer object? I don't see why this
> > > > > > > requires ovpn_nl_notify_del_peer to to run from process context.
> > > > > > 
> > > > > > ovpn_nl_notify_del_peer sends a netlink message to userspace and
> > > > > > I was under
> > > > > > the impression that it may block/sleep, no?
> > > > > > For this reason I assumed it must be executed in process context.
> > > > > 
> > > > > With s/GFP_KERNEL/GFP_ATOMIC/, it should be ok to run from whatever
> > > > > context. Firing up a workqueue just to send a 100B netlink message
> > > > > seems a bit overkill.
> > > > 
> > > > Oh ok, I thought the send could be a problem too.
> > > > 
> > > > Will test with GFP_ATOMIC then. Thanks for the hint.
> > > 
> > > I am back and unfortunately we also have (added by a later patch):
> > > 
> > >   294         napi_disable(&peer->napi);
> > >   295         netif_napi_del(&peer->napi);
> > 
> > Do you need the napi instance to be per peer, or can it be per
> > netdevice? If it's per netdevice you can clean it up in
> > ->priv_destructor.
> 
> In an ideal world, at some point I could leverage on multiple CPUs handling
> traffic from multiple peers, therefore every queue in the driver is per
> peer, NAPI included.

But for that they could also be per-CPU at the device level, instead
of being created at the peer level.

> Does it make sense?
> 
> Now, whether this is truly feasible from the core perspective is something I
> don't know yet.

If those packets arrive on different CPUs/queues, probably.

> For sure, for the time being I could shrink everything to one queue.
> There is one workqueue only encrypting/decrypting packets right now,
> therefore multiple NAPI queues are not truly useful at this time.
> 
> > 
> > > that need to be executed in process context.
> > > So it seems I must fire up the worker anyway..
> > 
> > I hope with can simplify all that logic. There's some complexity
> > that's unavoidable in this kind of driver, but maybe not as much as
> > you've got here.
> > 
> 
> I am all for simplification, and rest assured that the current version is
> already much simpler than what it originally was :-)

Ouch :)

Antonio Quartulli May 10, 2024, 6:57 p.m. UTC | #13

On 09/05/2024 15:24, Andrew Lunn wrote:
> On Thu, May 09, 2024 at 03:04:36PM +0200, Sabrina Dubroca wrote:
>> 2024-05-08, 22:31:51 +0200, Antonio Quartulli wrote:
>>> On 08/05/2024 18:06, Sabrina Dubroca wrote:
>>>> 2024-05-06, 03:16:20 +0200, Antonio Quartulli wrote:
>>>>> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
>>>>> index ee05b8a2c61d..b79d4f0474b0 100644
>>>>> --- a/drivers/net/ovpn/ovpnstruct.h
>>>>> +++ b/drivers/net/ovpn/ovpnstruct.h
>>>>> @@ -17,12 +17,19 @@
>>>>>     * @dev: the actual netdev representing the tunnel
>>>>>     * @registered: whether dev is still registered with netdev or not
>>>>>     * @mode: device operation mode (i.e. p2p, mp, ..)
>>>>> + * @lock: protect this object
>>>>> + * @event_wq: used to schedule generic events that may sleep and that need to be
>>>>> + *            performed outside of softirq context
>>>>> + * @peer: in P2P mode, this is the only remote peer
>>>>>     * @dev_list: entry for the module wide device list
>>>>>     */
>>>>>    struct ovpn_struct {
>>>>>    	struct net_device *dev;
>>>>>    	bool registered;
>>>>>    	enum ovpn_mode mode;
>>>>> +	spinlock_t lock; /* protect writing to the ovpn_struct object */
>>>>
>>>> nit: the comment isn't really needed since you have kdoc saying the same thing
>>>
>>> True, but checkpatch.pl (or some other script?) was still throwing a
>>> warning, therefore I added this comment to silence it.
>>
>> Ok, then I guess the comment (and the other one below) can stay. That
>> sounds like a checkpatch.pl bug.
> 
> I suspect it is more complex than that. checkpatch does not understand
> kdoc. It just knows the rule that there should be a comment next to a
> lock, hopefully indicating what the lock protects. In order to fix
> this, checkpatch would need to somehow invoke the kdoc parser, and ask
> it if the lock has kdoc documentation.
> 
> I suspect we are just going to have to live with this.

since we are now requiring new code to always have kdoc, can't we just 
drop the checkpatch warning?

Regards,

Jakub Kicinski May 11, 2024, 12:28 a.m. UTC | #14

On Fri, 10 May 2024 20:57:33 +0200 Antonio Quartulli wrote:
> > I suspect it is more complex than that. checkpatch does not understand
> > kdoc. It just knows the rule that there should be a comment next to a
> > lock, hopefully indicating what the lock protects. In order to fix
> > this, checkpatch would need to somehow invoke the kdoc parser, and ask
> > it if the lock has kdoc documentation.
> > 
> > I suspect we are just going to have to live with this.  
> 
> since we are now requiring new code to always have kdoc, can't we just 
> drop the checkpatch warning?

I don't think we require kdoc, but I agree that the warning is rather
ineffective.

Simon Horman May 13, 2024, 10:09 a.m. UTC | #15

On Mon, May 06, 2024 at 03:16:20AM +0200, Antonio Quartulli wrote:
> An ovpn_peer object holds the whole status of a remote peer
> (regardless whether it is a server or a client).
> 
> This includes status for crypto, tx/rx buffers, napi, etc.
> 
> Only support for one peer is introduced (P2P mode).
> Multi peer support is introduced with a later patch.
> 
> Along with the ovpn_peer, also the ovpn_bind object is introcued
> as the two are strictly related.
> An ovpn_bind object wraps a sockaddr representing the local
> coordinates being used to talk to a specific peer.
> 
> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>

...

> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
> index ee05b8a2c61d..b79d4f0474b0 100644
> --- a/drivers/net/ovpn/ovpnstruct.h
> +++ b/drivers/net/ovpn/ovpnstruct.h
> @@ -17,12 +17,19 @@
>   * @dev: the actual netdev representing the tunnel
>   * @registered: whether dev is still registered with netdev or not
>   * @mode: device operation mode (i.e. p2p, mp, ..)
> + * @lock: protect this object
> + * @event_wq: used to schedule generic events that may sleep and that need to be
> + *            performed outside of softirq context

nit: events_wq

> + * @peer: in P2P mode, this is the only remote peer
>   * @dev_list: entry for the module wide device list
>   */
>  struct ovpn_struct {
>  	struct net_device *dev;
>  	bool registered;
>  	enum ovpn_mode mode;
> +	spinlock_t lock; /* protect writing to the ovpn_struct object */
> +	struct workqueue_struct *events_wq;
> +	struct ovpn_peer __rcu *peer;
>  	struct list_head dev_list;
>  };
>  

...

Antonio Quartulli May 13, 2024, 10:53 a.m. UTC | #16

On 13/05/2024 12:09, Simon Horman wrote:
> On Mon, May 06, 2024 at 03:16:20AM +0200, Antonio Quartulli wrote:
>> An ovpn_peer object holds the whole status of a remote peer
>> (regardless whether it is a server or a client).
>>
>> This includes status for crypto, tx/rx buffers, napi, etc.
>>
>> Only support for one peer is introduced (P2P mode).
>> Multi peer support is introduced with a later patch.
>>
>> Along with the ovpn_peer, also the ovpn_bind object is introcued
>> as the two are strictly related.
>> An ovpn_bind object wraps a sockaddr representing the local
>> coordinates being used to talk to a specific peer.
>>
>> Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> 
> ...
> 
>> diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
>> index ee05b8a2c61d..b79d4f0474b0 100644
>> --- a/drivers/net/ovpn/ovpnstruct.h
>> +++ b/drivers/net/ovpn/ovpnstruct.h
>> @@ -17,12 +17,19 @@
>>    * @dev: the actual netdev representing the tunnel
>>    * @registered: whether dev is still registered with netdev or not
>>    * @mode: device operation mode (i.e. p2p, mp, ..)
>> + * @lock: protect this object
>> + * @event_wq: used to schedule generic events that may sleep and that need to be
>> + *            performed outside of softirq context
> 
> nit: events_wq

Thanks for the report. I fixed this locally already.

You don't know how long I had to stare at the kdoc warning and the code 
in order to realize that I missed a 's' :-S

Regards,

> 
>> + * @peer: in P2P mode, this is the only remote peer
>>    * @dev_list: entry for the module wide device list
>>    */
>>   struct ovpn_struct {
>>   	struct net_device *dev;
>>   	bool registered;
>>   	enum ovpn_mode mode;
>> +	spinlock_t lock; /* protect writing to the ovpn_struct object */
>> +	struct workqueue_struct *events_wq;
>> +	struct ovpn_peer __rcu *peer;
>>   	struct list_head dev_list;
>>   };
>>   
> 
> ...

Simon Horman May 13, 2024, 3:04 p.m. UTC | #17

On Mon, May 13, 2024 at 12:53:09PM +0200, Antonio Quartulli wrote:
> On 13/05/2024 12:09, Simon Horman wrote:
> > On Mon, May 06, 2024 at 03:16:20AM +0200, Antonio Quartulli wrote:
> > > An ovpn_peer object holds the whole status of a remote peer
> > > (regardless whether it is a server or a client).
> > > 
> > > This includes status for crypto, tx/rx buffers, napi, etc.
> > > 
> > > Only support for one peer is introduced (P2P mode).
> > > Multi peer support is introduced with a later patch.
> > > 
> > > Along with the ovpn_peer, also the ovpn_bind object is introcued
> > > as the two are strictly related.
> > > An ovpn_bind object wraps a sockaddr representing the local
> > > coordinates being used to talk to a specific peer.
> > > 
> > > Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
> > 
> > ...
> > 
> > > diff --git a/drivers/net/ovpn/ovpnstruct.h b/drivers/net/ovpn/ovpnstruct.h
> > > index ee05b8a2c61d..b79d4f0474b0 100644
> > > --- a/drivers/net/ovpn/ovpnstruct.h
> > > +++ b/drivers/net/ovpn/ovpnstruct.h
> > > @@ -17,12 +17,19 @@
> > >    * @dev: the actual netdev representing the tunnel
> > >    * @registered: whether dev is still registered with netdev or not
> > >    * @mode: device operation mode (i.e. p2p, mp, ..)
> > > + * @lock: protect this object
> > > + * @event_wq: used to schedule generic events that may sleep and that need to be
> > > + *            performed outside of softirq context
> > 
> > nit: events_wq
> 
> Thanks for the report. I fixed this locally already.
> 
> You don't know how long I had to stare at the kdoc warning and the code in
> order to realize that I missed a 's' :-S

It took me more than one reading too :)

[net-next,v3,07/24] ovpn: introduce the ovpn_peer object

Checks

Commit Message

Comments

Patch