Message ID | 20201207210649.19194-5-borisp@mellanox.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | nvme-tcp receive offloads | expand |
Context | Check | Description |
---|---|---|
netdev/apply | fail | Patch does not apply to net-next |
netdev/tree_selection | success | Clearly marked for net-next |
On 12/7/20 2:06 PM, Boris Pismenny wrote: > get_netdev_for_sock is a utility that is used to obtain > the net_device structure from a connected socket. > > Later patches will use this for nvme-tcp DDP and DDP CRC offloads. > > Signed-off-by: Boris Pismenny <borisp@mellanox.com> > Reviewed-by: Sagi Grimberg <sagi@grimberg.me> > --- > include/net/sock.h | 17 +++++++++++++++++ > net/tls/tls_device.c | 20 ++------------------ > 2 files changed, 19 insertions(+), 18 deletions(-) > > diff --git a/include/net/sock.h b/include/net/sock.h > index 093b51719c69..a8f7393ea433 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs); > > int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len); > > +/* Assume that the socket is already connected */ > +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold) > +{ > + struct dst_entry *dst = sk_dst_get(sk); > + struct net_device *netdev = NULL; > + > + if (likely(dst)) { > + netdev = dst->dev; I noticed you grab this once when the offload is configured. The dst device could change - e.g., ECMP, routing changes. I'm guessing that does not matter much for the use case - you are really wanting to configure queues and zc buffers for a flow with the device; the netdev is an easy gateway to get to it. But, data center deployments tend to have redundant access points -- either multipath for L3 or bond for L2. For the latter, this offload setup won't work - dst->dev will be the bond, the bond does not support the offload, so user is out of luck.
On 09/12/2020 3:06, David Ahern wrote: > On 12/7/20 2:06 PM, Boris Pismenny wrote: >> get_netdev_for_sock is a utility that is used to obtain >> the net_device structure from a connected socket. >> >> Later patches will use this for nvme-tcp DDP and DDP CRC offloads. >> >> Signed-off-by: Boris Pismenny <borisp@mellanox.com> >> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> >> --- >> include/net/sock.h | 17 +++++++++++++++++ >> net/tls/tls_device.c | 20 ++------------------ >> 2 files changed, 19 insertions(+), 18 deletions(-) >> >> diff --git a/include/net/sock.h b/include/net/sock.h >> index 093b51719c69..a8f7393ea433 100644 >> --- a/include/net/sock.h >> +++ b/include/net/sock.h >> @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs); >> >> int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len); >> >> +/* Assume that the socket is already connected */ >> +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold) >> +{ >> + struct dst_entry *dst = sk_dst_get(sk); >> + struct net_device *netdev = NULL; >> + >> + if (likely(dst)) { >> + netdev = dst->dev; > > I noticed you grab this once when the offload is configured. The dst > device could change - e.g., ECMP, routing changes. I'm guessing that > does not matter much for the use case - you are really wanting to > configure queues and zc buffers for a flow with the device; the netdev > is an easy gateway to get to it. > > But, data center deployments tend to have redundant access points -- > either multipath for L3 or bond for L2. For the latter, this offload > setup won't work - dst->dev will be the bond, the bond does not support > the offload, so user is out of luck. > You are correct, and bond support is currently under review for TLS, i.e., search for "TLS TX HW offload for Bond". The same approach that is applied there is relevant here. More generally, this offload is very similar in concept to TLS offload (tls_device).
On 12/9/20 12:41 AM, Boris Pismenny wrote: > > You are correct, and bond support is currently under review for TLS, > i.e., search for "TLS TX HW offload for Bond". The same approach that ok, missed that. > is applied there is relevant here. More generally, this offload is > very similar in concept to TLS offload (tls_device). > I disagree with the TLS comparison. As an example, AIUI the TLS offload works with userspace sockets, and it is a protocol offload.
On 10/12/2020 5:39, David Ahern wrote: > On 12/9/20 12:41 AM, Boris Pismenny wrote: > >> is applied there is relevant here. More generally, this offload is >> very similar in concept to TLS offload (tls_device). >> > > I disagree with the TLS comparison. As an example, AIUI the TLS offload > works with userspace sockets, and it is a protocol offload. > First, tls offload can be used by kernel sockets just as userspace sockets. Second, it is not a protocol offload per-se, if you are looking for a carch-phrase to define it partial/autonomous offload is what I think fits better. IMO, the concepts are very similar, and those who implemented offload using the tls_device APIs will find this offload fits both hardware drivers naturally. To compare between them, please look at the NDOs, in particular the add/del/resync. Additionally, see the similarity between skb->ddp_crc and skb->decrypted. Also, consider the software-fallback flows used in both.
diff --git a/include/net/sock.h b/include/net/sock.h index 093b51719c69..a8f7393ea433 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2711,4 +2711,21 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs); int sock_bind_add(struct sock *sk, struct sockaddr *addr, int addr_len); +/* Assume that the socket is already connected */ +static inline struct net_device *get_netdev_for_sock(struct sock *sk, bool hold) +{ + struct dst_entry *dst = sk_dst_get(sk); + struct net_device *netdev = NULL; + + if (likely(dst)) { + netdev = dst->dev; + if (hold) + dev_hold(netdev); + } + + dst_release(dst); + + return netdev; +} + #endif /* _SOCK_H */ diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c index a3ab2d3d4e4e..8c3bc8705efb 100644 --- a/net/tls/tls_device.c +++ b/net/tls/tls_device.c @@ -106,22 +106,6 @@ static void tls_device_queue_ctx_destruction(struct tls_context *ctx) spin_unlock_irqrestore(&tls_device_lock, flags); } -/* We assume that the socket is already connected */ -static struct net_device *get_netdev_for_sock(struct sock *sk) -{ - struct dst_entry *dst = sk_dst_get(sk); - struct net_device *netdev = NULL; - - if (likely(dst)) { - netdev = dst->dev; - dev_hold(netdev); - } - - dst_release(dst); - - return netdev; -} - static void destroy_record(struct tls_record_info *record) { int i; @@ -1104,7 +1088,7 @@ int tls_set_device_offload(struct sock *sk, struct tls_context *ctx) if (skb) TCP_SKB_CB(skb)->eor = 1; - netdev = get_netdev_for_sock(sk); + netdev = get_netdev_for_sock(sk, true); if (!netdev) { pr_err_ratelimited("%s: netdev not found\n", __func__); rc = -EINVAL; @@ -1180,7 +1164,7 @@ int tls_set_device_offload_rx(struct sock *sk, struct tls_context *ctx) if (ctx->crypto_recv.info.version != TLS_1_2_VERSION) return -EOPNOTSUPP; - netdev = get_netdev_for_sock(sk); + netdev = get_netdev_for_sock(sk, true); if (!netdev) { pr_err_ratelimited("%s: netdev not found\n", __func__); return -EINVAL;