
[4/4] net/tls: implement ->read_sock()

Message ID 20230620102856.56074-5-hare@suse.de (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Series net/tls: fixes for NVMe-over-TLS

Checks

Context Check Description
netdev/series_format warning Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 8 this patch: 8
netdev/cc_maintainers warning 3 maintainers not CCed: borisp@nvidia.com john.fastabend@gmail.com davem@davemloft.net
netdev/build_clang success Errors and warnings before: 8 this patch: 8
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 8 this patch: 8
netdev/checkpatch fail ERROR: space prohibited before that close square bracket ']'
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Hannes Reinecke June 20, 2023, 10:28 a.m. UTC
Implement ->read_sock() function for use with nvme-tcp.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: Boris Pismenny <boris.pismenny@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
---
 net/tls/tls.h      |  2 ++
 net/tls/tls_main.c |  2 ++
 net/tls/tls_sw.c   | 78 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 82 insertions(+)
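
For context, a ->read_sock() consumer such as nvme-tcp drives the socket through a read_actor callback while holding the socket lock. Below is a minimal caller-side sketch, illustrative only (count_actor() and drain_sock() are made-up names, not part of this series or of nvme-tcp), of the calling convention tls_sw_read_sock() has to fit into:

--
#include <linux/net.h>
#include <linux/skbuff.h>
#include <net/sock.h>
#include <net/tcp.h>

/* Trivial read_actor: just counts decrypted payload bytes. Real
 * consumers (e.g. nvme-tcp) parse their PDUs out of the skb here.
 */
static int count_actor(read_descriptor_t *desc, struct sk_buff *skb,
		       unsigned int offset, size_t len)
{
	size_t *total = desc->arg.data;

	*total += len;
	return len;	/* bytes consumed from this record */
}

static ssize_t drain_sock(struct socket *sock)
{
	struct sock *sk = sock->sk;
	read_descriptor_t rd_desc;
	size_t total = 0;
	int consumed;

	rd_desc.arg.data = &total;
	rd_desc.count = 1;

	lock_sock(sk);	/* ->read_sock() is called with the socket locked */
	consumed = sock->ops->read_sock(sk, &rd_desc, count_actor);
	release_sock(sk);

	return consumed < 0 ? consumed : total;
}
--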

Comments

Sagi Grimberg June 20, 2023, 1:21 p.m. UTC | #1
> Implement ->read_sock() function for use with nvme-tcp.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
> Cc: Boris Pismenny <boris.pismenny@gmail.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
>   net/tls/tls.h      |  2 ++
>   net/tls/tls_main.c |  2 ++
>   net/tls/tls_sw.c   | 78 ++++++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 82 insertions(+)
> 
> diff --git a/net/tls/tls.h b/net/tls/tls.h
> index d002c3af1966..ba55cd5c4913 100644
> --- a/net/tls/tls.h
> +++ b/net/tls/tls.h
> @@ -114,6 +114,8 @@ bool tls_sw_sock_is_readable(struct sock *sk);
>   ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
>   			   struct pipe_inode_info *pipe,
>   			   size_t len, unsigned int flags);
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> +		     sk_read_actor_t read_actor);
>   
>   int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
>   void tls_device_splice_eof(struct socket *sock);
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index 7b9c83dd7de2..1a062a8c6d33 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -963,10 +963,12 @@ static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
>   	ops[TLS_BASE][TLS_SW  ] = ops[TLS_BASE][TLS_BASE];
>   	ops[TLS_BASE][TLS_SW  ].splice_read	= tls_sw_splice_read;
>   	ops[TLS_BASE][TLS_SW  ].poll		= tls_sk_poll;
> +	ops[TLS_BASE][TLS_SW  ].read_sock	= tls_sw_read_sock;
>   
>   	ops[TLS_SW  ][TLS_SW  ] = ops[TLS_SW  ][TLS_BASE];
>   	ops[TLS_SW  ][TLS_SW  ].splice_read	= tls_sw_splice_read;
>   	ops[TLS_SW  ][TLS_SW  ].poll		= tls_sk_poll;
> +	ops[TLS_SW  ][TLS_SW  ].read_sock	= tls_sw_read_sock;
>   
>   #ifdef CONFIG_TLS_DEVICE
>   	ops[TLS_HW  ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 97379e34c997..e918c98bbeb2 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2231,6 +2231,84 @@ ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
>   	goto splice_read_end;
>   }
>   
> +int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
> +		     sk_read_actor_t read_actor)
> +{
> +	struct tls_context *tls_ctx = tls_get_ctx(sk);
> +	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
> +	struct strp_msg *rxm = NULL;
> +	struct tls_msg *tlm;
> +	struct sk_buff *skb;
> +	ssize_t copied = 0;
> +	int err, used;
> +
> +	err = tls_rx_reader_lock(sk, ctx, true);
> +	if (err < 0)
> +		return err;

Unlike recvmsg or splice_read, the caller of read_sock is assumed to
have the socket locked, and tls_rx_reader_lock also calls lock_sock;
how is this not a deadlock?

I'm not exactly clear on why the lock is needed here, or what the subtle
distinction is between tls_rx_reader_lock and what lock_sock provides.
Jakub Kicinski June 20, 2023, 5:08 p.m. UTC | #2
On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
> > +	err = tls_rx_reader_lock(sk, ctx, true);
> > +	if (err < 0)
> > +		return err;  
> 
> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
> how is this not a deadlock?

Yeah :|

> I'm not exactly clear why the lock is needed here or what is the subtle
> distinction between tls_rx_reader_lock and what lock_sock provides.

It's a bit of a workaround for the consistency of the data stream.
There's a bunch of state in the TLS ULP, and waiting for mem or data
releases and re-takes the socket lock. So to stop the flow of annoying
corner-case races I slapped a lock around all of the reader.

IMHO depending on the socket lock for anything non-trivial and outside
of the socket itself is a bad idea in general.

The immediate need at the time was that if you did a read() and someone
else did a peek() at the same time from a stream of A B C D, you might
read A D B C.
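
To make that scenario concrete, here is an illustrative userspace sketch (not from this series; it assumes fd is an established socket that already has TLS_RX enabled via setsockopt()) of the concurrent read()/peek() pattern that tls_rx_reader_lock() was introduced to serialize:

--
#include <pthread.h>
#include <stdio.h>
#include <sys/socket.h>

/* One task peeks at the stream while another consumes it. Without the
 * reader serialization in the TLS ULP, the consumer could see records
 * out of order (A D B C instead of A B C D).
 */
static void *peek_thread(void *arg)
{
	int fd = *(int *)arg;
	char buf[256];

	recv(fd, buf, sizeof(buf), MSG_PEEK);
	return NULL;
}

static void *read_thread(void *arg)
{
	int fd = *(int *)arg;
	char buf[256];
	ssize_t n;

	while ((n = recv(fd, buf, sizeof(buf), 0)) > 0)
		fwrite(buf, 1, n, stdout);
	return NULL;
}

void concurrent_readers(int fd)
{
	pthread_t peeker, reader;

	pthread_create(&peeker, NULL, peek_thread, &fd);
	pthread_create(&reader, NULL, read_thread, &fd);
	pthread_join(peeker, NULL);
	pthread_join(reader, NULL);
}
--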
Hannes Reinecke June 21, 2023, 6:44 a.m. UTC | #3
On 6/20/23 19:08, Jakub Kicinski wrote:
> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>> +	err = tls_rx_reader_lock(sk, ctx, true);
>>> +	if (err < 0)
>>> +		return err;
>>
>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>> how is this not a deadlock?
> 
> Yeah :|
> 
>> I'm not exactly clear why the lock is needed here or what is the subtle
>> distinction between tls_rx_reader_lock and what lock_sock provides.
> 
> It's a bit of a workaround for the consistency of the data stream.
> There's bunch of state in the TLS ULP and waiting for mem or data
> releases and re-takes the socket lock. So to stop the flow annoying
> corner case races I slapped a lock around all of the reader.
> 
> IMHO depending on the socket lock for anything non-trivial and outside
> of the socket itself is a bad idea in general.
> 
> The immediate need at the time was that if you did a read() and someone
> else did a peek() at the same time from a stream of A B C D you may read
> A D B C.

Leaving me ever so confused.

read_sock() is a generic interface; we cannot require a protocol-specific
lock before calling it.

What to do now?
Drop tls_rx_reader_lock() from read_sock() again?

Cheers,

Hannes
Sagi Grimberg June 21, 2023, 8:39 a.m. UTC | #4
>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>> +    if (err < 0)
>>>> +        return err;
>>>
>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>> how is this not a deadlock?
>>
>> Yeah :|
>>
>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>
>> It's a bit of a workaround for the consistency of the data stream.
>> There's bunch of state in the TLS ULP and waiting for mem or data
>> releases and re-takes the socket lock. So to stop the flow annoying
>> corner case races I slapped a lock around all of the reader.
>>
>> IMHO depending on the socket lock for anything non-trivial and outside
>> of the socket itself is a bad idea in general.
>>
>> The immediate need at the time was that if you did a read() and someone
>> else did a peek() at the same time from a stream of A B C D you may read
>> A D B C.
> 
> Leaving me ever so confused.
> 
> read_sock() is a generic interface; we cannot require a protocol 
> specific lock before calling it.
> 
> What to do now?
> Drop the tls_rx_read_lock from read_sock() again?

Probably just need to synchronize the readers by splitting that from
tls_rx_reader_lock:
--
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 53f944e6d8ef..53404c3fdcc6 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct tls_prot_info *prot,
 	return sk_flush_backlog(sk);
 }
 
-static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
-			      bool nonblock)
+static int tls_rx_reader_acquire(struct sock *sk, struct tls_sw_context_rx *ctx,
+				 bool nonblock)
 {
 	long timeo;
-	int err;
-
-	lock_sock(sk);
 
 	timeo = sock_rcvtimeo(sk, nonblock);
 
@@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
 			      !READ_ONCE(ctx->reader_present), &wait);
 		remove_wait_queue(&ctx->wq, &wait);
 
-		if (timeo <= 0) {
-			err = -EAGAIN;
-			goto err_unlock;
-		}
-		if (signal_pending(current)) {
-			err = sock_intr_errno(timeo);
-			goto err_unlock;
-		}
+		if (timeo <= 0)
+			return -EAGAIN;
+		if (signal_pending(current))
+			return sock_intr_errno(timeo);
 	}
 
 	WRITE_ONCE(ctx->reader_present, 1);
 
 	return 0;
+}
 
-err_unlock:
-	release_sock(sk);
+static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx *ctx,
+			      bool nonblock)
+{
+	int err;
+
+	lock_sock(sk);
+	err = tls_rx_reader_acquire(sk, ctx, nonblock);
+	if (err)
+		release_sock(sk);
 	return err;
 }
 
-static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+static void tls_rx_reader_release(struct sock *sk, struct tls_sw_context_rx *ctx)
 {
 	if (unlikely(ctx->reader_contended)) {
 		if (wq_has_sleeper(&ctx->wq))
@@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
 	}
 
 	WRITE_ONCE(ctx->reader_present, 0);
+}
+
+static void tls_rx_reader_unlock(struct sock *sk, struct tls_sw_context_rx *ctx)
+{
+	tls_rx_reader_release(sk, ctx);
 	release_sock(sk);
 }
--

Then read_sock can just acquire/release.
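
For illustration, on top of the posted patch that would boil down to something like the following (sketch only, untested), leaving lock_sock()/release_sock() entirely to the read_sock() caller:

--
-	err = tls_rx_reader_lock(sk, ctx, true);
+	err = tls_rx_reader_acquire(sk, ctx, true);
 	if (err < 0)
 		return err;
[...]
 		err = tls_rx_rec_wait(sk, NULL, true, true);
 		if (err <= 0) {
-			tls_rx_reader_unlock(sk, ctx);
+			tls_rx_reader_release(sk, ctx);
 			return err;
 		}
[...]
 		err = tls_rx_one_record(sk, NULL, &darg);
 		if (err < 0) {
 			tls_err_abort(sk, -EBADMSG);
-			tls_rx_reader_unlock(sk, ctx);
+			tls_rx_reader_release(sk, ctx);
 			return err;
 		}
[...]
 read_sock_end:
-	tls_rx_reader_unlock(sk, ctx);
+	tls_rx_reader_release(sk, ctx);
 	return copied ? : err;
--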
Hannes Reinecke June 21, 2023, 9:08 a.m. UTC | #5
On 6/21/23 10:39, Sagi Grimberg wrote:
> 
>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>>> +    if (err < 0)
>>>>> +        return err;
>>>>
>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>> how is this not a deadlock?
>>>
>>> Yeah :|
>>>
>>>> I'm not exactly clear why the lock is needed here or what is the subtle
>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>
>>> It's a bit of a workaround for the consistency of the data stream.
>>> There's bunch of state in the TLS ULP and waiting for mem or data
>>> releases and re-takes the socket lock. So to stop the flow annoying
>>> corner case races I slapped a lock around all of the reader.
>>>
>>> IMHO depending on the socket lock for anything non-trivial and outside
>>> of the socket itself is a bad idea in general.
>>>
>>> The immediate need at the time was that if you did a read() and someone
>>> else did a peek() at the same time from a stream of A B C D you may read
>>> A D B C.
>>
>> Leaving me ever so confused.
>>
>> read_sock() is a generic interface; we cannot require a protocol 
>> specific lock before calling it.
>>
>> What to do now?
>> Drop the tls_rx_read_lock from read_sock() again?
> 
> Probably just need to synchronize the readers by splitting that from
> tls_rx_reader_lock:
> -- 
> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
> index 53f944e6d8ef..53404c3fdcc6 100644
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct 
> tls_prot_info *prot,
>          return sk_flush_backlog(sk);
>   }
> 
> -static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx 
> *ctx,
> -                             bool nonblock)
> +static int tls_rx_reader_acquire(struct sock *sk, struct 
> tls_sw_context_rx *ctx,
> +                            bool nonblock)
>   {
>          long timeo;
> -       int err;
> -
> -       lock_sock(sk);
> 
>          timeo = sock_rcvtimeo(sk, nonblock);
> 
> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, 
> struct tls_sw_context_rx *ctx,
>                                !READ_ONCE(ctx->reader_present), &wait);
>                  remove_wait_queue(&ctx->wq, &wait);
> 
> -               if (timeo <= 0) {
> -                       err = -EAGAIN;
> -                       goto err_unlock;
> -               }
> -               if (signal_pending(current)) {
> -                       err = sock_intr_errno(timeo);
> -                       goto err_unlock;
> -               }
> +               if (timeo <= 0)
> +                       return -EAGAIN;
> +               if (signal_pending(current))
> +                       return sock_intr_errno(timeo);
>          }
> 
>          WRITE_ONCE(ctx->reader_present, 1);
> 
>          return 0;
> +}
> 
> -err_unlock:
> -       release_sock(sk);
> +static int tls_rx_reader_lock(struct sock *sk, struct tls_sw_context_rx 
> *ctx,
> +                             bool nonblock)
> +{
> +       int err;
> +
> +       lock_sock(sk);
> +       err = tls_rx_reader_acquire(sk, ctx, nonblock);
> +       if (err)
> +               release_sock(sk);
>          return err;
>   }
> 
> -static void tls_rx_reader_unlock(struct sock *sk, struct 
> tls_sw_context_rx *ctx)
> +static void tls_rx_reader_release(struct sock *sk, struct 
> tls_sw_context_rx *ctx)
>   {
>          if (unlikely(ctx->reader_contended)) {
>                  if (wq_has_sleeper(&ctx->wq))
> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock *sk, 
> struct tls_sw_context_rx *ctx)
>          }
> 
>          WRITE_ONCE(ctx->reader_present, 0);
> +}
> +
> +static void tls_rx_reader_unlock(struct sock *sk, struct 
> tls_sw_context_rx *ctx)
> +{
> +       tls_rx_reader_release(sk, ctx);
>          release_sock(sk);
>   }
> -- 
> 
> Then read_sock can just acquire/release.

Good suggestion.
Will be including it in the next round.

Cheers,

Hannes
Sagi Grimberg June 21, 2023, 9:49 a.m. UTC | #6
On 6/21/23 12:08, Hannes Reinecke wrote:
> On 6/21/23 10:39, Sagi Grimberg wrote:
>>
>>>> On Tue, 20 Jun 2023 16:21:22 +0300 Sagi Grimberg wrote:
>>>>>> +    err = tls_rx_reader_lock(sk, ctx, true);
>>>>>> +    if (err < 0)
>>>>>> +        return err;
>>>>>
>>>>> Unlike recvmsg or splice_read, the caller of read_sock is assumed to
>>>>> have the socket locked, and tls_rx_reader_lock also calls lock_sock,
>>>>> how is this not a deadlock?
>>>>
>>>> Yeah :|
>>>>
>>>>> I'm not exactly clear why the lock is needed here or what is the 
>>>>> subtle
>>>>> distinction between tls_rx_reader_lock and what lock_sock provides.
>>>>
>>>> It's a bit of a workaround for the consistency of the data stream.
>>>> There's bunch of state in the TLS ULP and waiting for mem or data
>>>> releases and re-takes the socket lock. So to stop the flow annoying
>>>> corner case races I slapped a lock around all of the reader.
>>>>
>>>> IMHO depending on the socket lock for anything non-trivial and outside
>>>> of the socket itself is a bad idea in general.
>>>>
>>>> The immediate need at the time was that if you did a read() and someone
>>>> else did a peek() at the same time from a stream of A B C D you may 
>>>> read
>>>> A D B C.
>>>
>>> Leaving me ever so confused.
>>>
>>> read_sock() is a generic interface; we cannot require a protocol 
>>> specific lock before calling it.
>>>
>>> What to do now?
>>> Drop the tls_rx_read_lock from read_sock() again?
>>
>> Probably just need to synchronize the readers by splitting that from
>> tls_rx_reader_lock:
>> -- 
>> diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
>> index 53f944e6d8ef..53404c3fdcc6 100644
>> --- a/net/tls/tls_sw.c
>> +++ b/net/tls/tls_sw.c
>> @@ -1845,13 +1845,10 @@ tls_read_flush_backlog(struct sock *sk, struct 
>> tls_prot_info *prot,
>>          return sk_flush_backlog(sk);
>>   }
>>
>> -static int tls_rx_reader_lock(struct sock *sk, struct 
>> tls_sw_context_rx *ctx,
>> -                             bool nonblock)
>> +static int tls_rx_reader_acquire(struct sock *sk, struct 
>> tls_sw_context_rx *ctx,
>> +                            bool nonblock)
>>   {
>>          long timeo;
>> -       int err;
>> -
>> -       lock_sock(sk);
>>
>>          timeo = sock_rcvtimeo(sk, nonblock);
>>
>> @@ -1865,26 +1862,30 @@ static int tls_rx_reader_lock(struct sock *sk, 
>> struct tls_sw_context_rx *ctx,
>>                                !READ_ONCE(ctx->reader_present), &wait);
>>                  remove_wait_queue(&ctx->wq, &wait);
>>
>> -               if (timeo <= 0) {
>> -                       err = -EAGAIN;
>> -                       goto err_unlock;
>> -               }
>> -               if (signal_pending(current)) {
>> -                       err = sock_intr_errno(timeo);
>> -                       goto err_unlock;
>> -               }
>> +               if (timeo <= 0)
>> +                       return -EAGAIN;
>> +               if (signal_pending(current))
>> +                       return sock_intr_errno(timeo);
>>          }
>>
>>          WRITE_ONCE(ctx->reader_present, 1);
>>
>>          return 0;
>> +}
>>
>> -err_unlock:
>> -       release_sock(sk);
>> +static int tls_rx_reader_lock(struct sock *sk, struct 
>> tls_sw_context_rx *ctx,
>> +                             bool nonblock)
>> +{
>> +       int err;
>> +
>> +       lock_sock(sk);
>> +       err = tls_rx_reader_acquire(sk, ctx, nonblock);
>> +       if (err)
>> +               release_sock(sk);
>>          return err;
>>   }
>>
>> -static void tls_rx_reader_unlock(struct sock *sk, struct 
>> tls_sw_context_rx *ctx)
>> +static void tls_rx_reader_release(struct sock *sk, struct 
>> tls_sw_context_rx *ctx)
>>   {
>>          if (unlikely(ctx->reader_contended)) {
>>                  if (wq_has_sleeper(&ctx->wq))
>> @@ -1896,6 +1897,11 @@ static void tls_rx_reader_unlock(struct sock 
>> *sk, struct tls_sw_context_rx *ctx)
>>          }
>>
>>          WRITE_ONCE(ctx->reader_present, 0);
>> +}
>> +
>> +static void tls_rx_reader_unlock(struct sock *sk, struct 
>> tls_sw_context_rx *ctx)
>> +{
>> +       tls_rx_reader_release(sk, ctx);
>>          release_sock(sk);
>>   }
>> -- 
>>
>> Then read_sock can just acquire/release.
> 
> Good suggestion.
> Will be including it in the next round.

Maybe more appropriate helper names would be
tls_rx_reader_enter / tls_rx_reader_exit.

Whatever Jakub prefers...
Jakub Kicinski June 21, 2023, 7:31 p.m. UTC | #7
On Wed, 21 Jun 2023 12:49:21 +0300 Sagi Grimberg wrote:
> > Good suggestion.
> > Will be including it in the next round.  
> 
> Maybe more appropriate helper names would be
> tls_rx_reader_enter / tls_rx_reader_exit.
> 
> Whatever Jakub prefers...

I was thinking along the same lines, but with __ in front of the names
of the factored-out code. Your naming as suggested in the diff is
better.

Patch

diff --git a/net/tls/tls.h b/net/tls/tls.h
index d002c3af1966..ba55cd5c4913 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -114,6 +114,8 @@  bool tls_sw_sock_is_readable(struct sock *sk);
 ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+		     sk_read_actor_t read_actor);
 
 int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
 void tls_device_splice_eof(struct socket *sock);
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 7b9c83dd7de2..1a062a8c6d33 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -963,10 +963,12 @@  static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG]
 	ops[TLS_BASE][TLS_SW  ] = ops[TLS_BASE][TLS_BASE];
 	ops[TLS_BASE][TLS_SW  ].splice_read	= tls_sw_splice_read;
 	ops[TLS_BASE][TLS_SW  ].poll		= tls_sk_poll;
+	ops[TLS_BASE][TLS_SW  ].read_sock	= tls_sw_read_sock;
 
 	ops[TLS_SW  ][TLS_SW  ] = ops[TLS_SW  ][TLS_BASE];
 	ops[TLS_SW  ][TLS_SW  ].splice_read	= tls_sw_splice_read;
 	ops[TLS_SW  ][TLS_SW  ].poll		= tls_sk_poll;
+	ops[TLS_SW  ][TLS_SW  ].read_sock	= tls_sw_read_sock;
 
 #ifdef CONFIG_TLS_DEVICE
 	ops[TLS_HW  ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 97379e34c997..e918c98bbeb2 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -2231,6 +2231,84 @@  ssize_t tls_sw_splice_read(struct socket *sock,  loff_t *ppos,
 	goto splice_read_end;
 }
 
+int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc,
+		     sk_read_actor_t read_actor)
+{
+	struct tls_context *tls_ctx = tls_get_ctx(sk);
+	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
+	struct strp_msg *rxm = NULL;
+	struct tls_msg *tlm;
+	struct sk_buff *skb;
+	ssize_t copied = 0;
+	int err, used;
+
+	err = tls_rx_reader_lock(sk, ctx, true);
+	if (err < 0)
+		return err;
+	if (!skb_queue_empty(&ctx->rx_list)) {
+		skb = __skb_dequeue(&ctx->rx_list);
+	} else {
+		struct tls_decrypt_arg darg;
+
+		err = tls_rx_rec_wait(sk, NULL, true, true);
+		if (err <= 0) {
+			tls_rx_reader_unlock(sk, ctx);
+			return err;
+		}
+
+		memset(&darg.inargs, 0, sizeof(darg.inargs));
+
+		err = tls_rx_one_record(sk, NULL, &darg);
+		if (err < 0) {
+			tls_err_abort(sk, -EBADMSG);
+			tls_rx_reader_unlock(sk, ctx);
+			return err;
+		}
+
+		tls_rx_rec_done(ctx);
+		skb = darg.skb;
+	}
+
+	do {
+		rxm = strp_msg(skb);
+		tlm = tls_msg(skb);
+
+		/* read_sock does not support reading control messages */
+		if (tlm->control != TLS_RECORD_TYPE_DATA) {
+			err = -EINVAL;
+			goto read_sock_requeue;
+		}
+
+		used = read_actor(desc, skb, rxm->offset, rxm->full_len);
+		if (used <= 0) {
+			err = used;
+			goto read_sock_end;
+		}
+
+		copied += used;
+		if (used < rxm->full_len) {
+			rxm->offset += used;
+			rxm->full_len -= used;
+			if (!desc->count)
+				goto read_sock_requeue;
+		} else {
+			consume_skb(skb);
+			if (desc->count && !skb_queue_empty(&ctx->rx_list))
+				skb = __skb_dequeue(&ctx->rx_list);
+			else
+				skb = NULL;
+		}
+	} while (skb);
+
+read_sock_end:
+	tls_rx_reader_unlock(sk, ctx);
+	return copied ? : err;
+
+read_sock_requeue:
+	__skb_queue_head(&ctx->rx_list, skb);
+	goto read_sock_end;
+}
+
 bool tls_sw_sock_is_readable(struct sock *sk)
 {
 	struct tls_context *tls_ctx = tls_get_ctx(sk);