diff mbox series

[v1,net-next,04/15] net/tls: expose get_netdev_for_sock

Message ID 20201207210649.19194-5-borisp@mellanox.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series nvme-tcp receive offloads | expand

Checks

Context Check Description
netdev/apply fail Patch does not apply to net-next
netdev/tree_selection success Clearly marked for net-next

Commit Message

Boris Pismenny Dec. 7, 2020, 9:06 p.m. UTC
get_netdev_for_sock is a utility that is used to obtain
the net_device structure from a connected socket.

Later patches will use this for nvme-tcp DDP and DDP CRC offloads.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 include/net/sock.h   | 17 +++++++++++++++++
 net/tls/tls_device.c | 20 ++------------------
 2 files changed, 19 insertions(+), 18 deletions(-)

Comments

David Ahern Dec. 9, 2020, 1:06 a.m. UTC | #1
On 12/7/20 2:06 PM, Boris Pismenny wrote:
> get_netdev_for_sock is a utility that is used to obtain
> the net_device structure from a connected socket.
> 
> Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
> 
> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  include/net/sock.h   | 17 +++++++++++++++++
>  net/tls/tls_device.c | 20 ++------------------
>  2 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 093b51719c69..a8f7393ea433 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs);
>  
>  int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len);
>  
> +/* Assume that the socket is already connected */
> +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold)
> +{
> +	struct dst_entry *dst = sk_dst_get(sk);
> +	struct net_device *netdev = NULL;
> +
> +	if (likely(dst)) {
> +		netdev = dst->dev;

I noticed you grab this once when the offload is configured. The dst
device could change - e.g., ECMP, routing changes. I'm guessing that
does not matter much for the use case - you are really wanting to
configure queues and zc buffers for a flow with the device; the netdev
is an easy gateway to get to it.

But, data center deployments tend to have redundant access points --
either multipath for L3 or bond for L2. For the latter, this offload
setup won't work - dst->dev will be the bond, the bond does not support
the offload, so user is out of luck.
Boris Pismenny Dec. 9, 2020, 7:41 a.m. UTC | #2
On 09/12/2020 3:06, David Ahern wrote:
> On 12/7/20 2:06 PM, Boris Pismenny wrote:
>> get_netdev_for_sock is a utility that is used to obtain
>> the net_device structure from a connected socket.
>>
>> Later patches will use this for nvme-tcp DDP and DDP CRC offloads.
>>
>> Signed-off-by: Boris Pismenny <borisp@mellanox.com>
>> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
>> ---
>>  include/net/sock.h   | 17 +++++++++++++++++
>>  net/tls/tls_device.c | 20 ++------------------
>>  2 files changed, 19 insertions(+), 18 deletions(-)
>>
>> diff --git a/include/net/sock.h b/include/net/sock.h
>> index 093b51719c69..a8f7393ea433 100644
>> --- a/include/net/sock.h
>> +++ b/include/net/sock.h
>> @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs);
>>  
>>  int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len);
>>  
>> +/* Assume that the socket is already connected */
>> +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold)
>> +{
>> +	struct dst_entry *dst = sk_dst_get(sk);
>> +	struct net_device *netdev = NULL;
>> +
>> +	if (likely(dst)) {
>> +		netdev = dst->dev;
> 
> I noticed you grab this once when the offload is configured. The dst
> device could change - e.g., ECMP, routing changes. I'm guessing that
> does not matter much for the use case - you are really wanting to
> configure queues and zc buffers for a flow with the device; the netdev
> is an easy gateway to get to it.
> 
> But, data center deployments tend to have redundant access points --
> either multipath for L3 or bond for L2. For the latter, this offload
> setup won't work - dst->dev will be the bond, the bond does not support
> the offload, so user is out of luck.
> 

You are correct, and bond support is currently under review for TLS,
i.e., search for "TLS TX HW offload for Bond". The same approach that
is applied there is relevant here. More generally, this offload is
very similar in concept to TLS offload (tls_device).
David Ahern Dec. 10, 2020, 3:39 a.m. UTC | #3
On 12/9/20 12:41 AM, Boris Pismenny wrote:
> 
> You are correct, and bond support is currently under review for TLS,
> i.e., search for "TLS TX HW offload for Bond". The same approach that

ok, missed that.

> is applied there is relevant here. More generally, this offload is
> very similar in concept to TLS offload (tls_device).
> 

I disagree with the TLS comparison. As an example, AIUI the TLS offload
works with userspace sockets, and it is a protocol offload.
Boris Pismenny Dec. 11, 2020, 6:43 p.m. UTC | #4
On 10/12/2020 5:39, David Ahern wrote:
> On 12/9/20 12:41 AM, Boris Pismenny wrote:
> 
>> is applied there is relevant here. More generally, this offload is
>> very similar in concept to TLS offload (tls_device).
>>
> 
> I disagree with the TLS comparison. As an example, AIUI the TLS offload
> works with userspace sockets, and it is a protocol offload.
> 

First, tls offload can be used by kernel sockets just as userspace
sockets. Second, it is not a protocol offload per-se, if you are looking
for a carch-phrase to define it partial/autonomous offload is what I
think fits better.

IMO, the concepts are very similar, and those who implemented offload
using the tls_device APIs will find this offload fits both hardware drivers
naturally.

To compare between them, please look at the NDOs, in particular the add/del/resync.
Additionally, see the similarity between skb->ddp_crc and skb->decrypted.
Also, consider the software-fallback flows used in both.
diff mbox series

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index 093b51719c69..a8f7393ea433 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2711,4 +2711,21 @@  void sock_set_sndtimeo(struct sock *sk, s64 secs);
 
 int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len);
 
+/* Assume that the socket is already connected */
+static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold)
+{
+	struct dst_entry *dst = sk_dst_get(sk);
+	struct net_device *netdev = NULL;
+
+	if (likely(dst)) {
+		netdev = dst->dev;
+		if (hold)
+			dev_hold(netdev);
+	}
+
+	dst_release(dst);
+
+	return netdev;
+}
+
 #endif	/* _SOCK_H */
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index a3ab2d3d4e4e..8c3bc8705efb 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -106,22 +106,6 @@  static void tls_device_queue_ctx_destruction(struct tls_context *ctx)
 	spin_unlock_irqrestore(&tls_device_lock, flags);
 }
 
-/* We assume that the socket is already connected */
-static struct net_device *get_netdev_for_sock(struct sock *sk)
-{
-	struct dst_entry *dst = sk_dst_get(sk);
-	struct net_device *netdev = NULL;
-
-	if (likely(dst)) {
-		netdev = dst->dev;
-		dev_hold(netdev);
-	}
-
-	dst_release(dst);
-
-	return netdev;
-}
-
 static void destroy_record(struct tls_record_info *record)
 {
 	int i;
@@ -1104,7 +1088,7 @@  int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
 	if (skb)
 		TCP_SKB_CB(skb)->eor = 1;
 
-	netdev = get_netdev_for_sock(sk);
+	netdev = get_netdev_for_sock(sk, true);
 	if (!netdev) {
 		pr_err_ratelimited("%s: netdev not found\n", __func__);
 		rc = -EINVAL;
@@ -1180,7 +1164,7 @@  int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx)
 	if (ctx->crypto_recv.info.version != TLS_1_2_VERSION)
 		return -EOPNOTSUPP;
 
-	netdev = get_netdev_for_sock(sk);
+	netdev = get_netdev_for_sock(sk, true);
 	if (!netdev) {
 		pr_err_ratelimited("%s: netdev not found\n", __func__);
 		return -EINVAL;