diff mbox series

[v2] tcp: pass back data left in socket after receive

Message ID 2975a359-2422-71dc-db6b-9e4f369cae77@kernel.dk (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [v2] tcp: pass back data left in socket after receive | expand

Checks

Context Check Description
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix warning Target tree name not specified in the subject
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 6507 this patch: 6507
netdev/cc_maintainers warning 6 maintainers not CCed: jk@codeconstruct.com.au davem@davemloft.net pabeni@redhat.com yoshfuji@linux-ipv6.org dsahern@kernel.org changbin.du@intel.com
netdev/build_clang success Errors and warnings before: 1713 this patch: 1713
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 11987 this patch: 11987
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 55 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/tree_selection success Guessing tree name failed - patch did not apply, async

Commit Message

Jens Axboe April 28, 2022, 11:13 p.m. UTC
This is currently done for CMSG_INQ, add an ability to do so via struct
msghdr as well and have CMSG_INQ use that too. If the caller sets
msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.

Rearrange struct msghdr a bit so we can add this member while shrinking
it at the same time. On a 64-bit build, it was 96 bytes before this
change and 88 bytes afterwards.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---

v2: rebased on net-next, s/net/tcp in subject

 include/linux/socket.h |  6 +++++-
 net/ipv4/tcp.c         | 14 +++++++++-----
 2 files changed, 14 insertions(+), 6 deletions(-)

Comments

Eric Dumazet April 28, 2022, 11:22 p.m. UTC | #1
On Thu, Apr 28, 2022 at 4:13 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> This is currently done for CMSG_INQ, add an ability to do so via struct
> msghdr as well and have CMSG_INQ use that too. If the caller sets
> msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.
>
> Rearrange struct msghdr a bit so we can add this member while shrinking
> it at the same time. On a 64-bit build, it was 96 bytes before this
> change and 88 bytes afterwards.
>
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---


SGTM, thanks.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Neal Cardwell April 28, 2022, 11:41 p.m. UTC | #2
On Thu, Apr 28, 2022 at 7:23 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Apr 28, 2022 at 4:13 PM Jens Axboe <axboe@kernel.dk> wrote:
> >
> > This is currently done for CMSG_INQ, add an ability to do so via struct
> > msghdr as well and have CMSG_INQ use that too. If the caller sets
> > msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.
> >
> > Rearrange struct msghdr a bit so we can add this member while shrinking
> > it at the same time. On a 64-bit build, it was 96 bytes before this
> > change and 88 bytes afterwards.
> >
> > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > ---
>
>
> SGTM, thanks.
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

The patch seems to add an extra branch or two to the recvmsg() fast
path even for the common application use case that does not use any of
these INQ features.

To avoid imposing one of these new extra branches for the common case
where the INQ features are not used, what do folks think about
structuring it something like the following:

               if (msg->msg_get_inq) {
                       msg->msg_inq = tcp_inq_hint(sk);
                       if (cmsg_flags & TCP_CMSG_INQ)
                               put_cmsg(msg, SOL_TCP, TCP_CM_INQ,
                                        sizeof(msg->msg_inq),
                                        &msg->msg_inq);
                }

neal
Eric Dumazet April 28, 2022, 11:46 p.m. UTC | #3
On Thu, Apr 28, 2022 at 4:41 PM Neal Cardwell <ncardwell@google.com> wrote:
>
> On Thu, Apr 28, 2022 at 7:23 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Thu, Apr 28, 2022 at 4:13 PM Jens Axboe <axboe@kernel.dk> wrote:
> > >
> > > This is currently done for CMSG_INQ, add an ability to do so via struct
> > > msghdr as well and have CMSG_INQ use that too. If the caller sets
> > > msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.
> > >
> > > Rearrange struct msghdr a bit so we can add this member while shrinking
> > > it at the same time. On a 64-bit build, it was 96 bytes before this
> > > change and 88 bytes afterwards.
> > >
> > > Signed-off-by: Jens Axboe <axboe@kernel.dk>
> > > ---
> >
> >
> > SGTM, thanks.
> >
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
>
> The patch seems to add an extra branch or two to the recvmsg() fast
> path even for the common application use case that does not use any of
> these INQ features.
>
> To avoid imposing one of these new extra branches for the common case
> where the INQ features are not used, what do folks think about
> structuring it something like the following:
>
>                if (msg->msg_get_inq) {
>                        msg->msg_inq = tcp_inq_hint(sk);
>                        if (cmsg_flags & TCP_CMSG_INQ)
>                                put_cmsg(msg, SOL_TCP, TCP_CM_INQ,
>                                         sizeof(msg->msg_inq),
>                                         &msg->msg_inq);
>                 }

Sure thing.

Note that the prior test would not take this path anyway (unless
someone requests TS)

if ((cmsg_flags || msg->msg_get_inq) ....
Jens Axboe April 29, 2022, 12:40 a.m. UTC | #4
On 4/28/22 5:41 PM, Neal Cardwell wrote:
> On Thu, Apr 28, 2022 at 7:23 PM Eric Dumazet <edumazet@google.com> wrote:
>>
>> On Thu, Apr 28, 2022 at 4:13 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> This is currently done for CMSG_INQ, add an ability to do so via struct
>>> msghdr as well and have CMSG_INQ use that too. If the caller sets
>>> msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.
>>>
>>> Rearrange struct msghdr a bit so we can add this member while shrinking
>>> it at the same time. On a 64-bit build, it was 96 bytes before this
>>> change and 88 bytes afterwards.
>>>
>>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>>> ---
>>
>>
>> SGTM, thanks.
>>
>> Reviewed-by: Eric Dumazet <edumazet@google.com>
> 
> The patch seems to add an extra branch or two to the recvmsg() fast
> path even for the common application use case that does not use any of
> these INQ features.
> 
> To avoid imposing one of these new extra branches for the common case
> where the INQ features are not used, what do folks think about
> structuring it something like the following:
> 
>                if (msg->msg_get_inq) {
>                        msg->msg_inq = tcp_inq_hint(sk);
>                        if (cmsg_flags & TCP_CMSG_INQ)
>                                put_cmsg(msg, SOL_TCP, TCP_CM_INQ,
>                                         sizeof(msg->msg_inq),
>                                         &msg->msg_inq);
>                 }

I'm fine with that, doesn't really matter to me. You're under that
cmsg_flags branch anyway.
diff mbox series

Patch

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 6f85f5d957ef..12085c9a8544 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -50,6 +50,9 @@  struct linger {
 struct msghdr {
 	void		*msg_name;	/* ptr to socket address structure */
 	int		msg_namelen;	/* size of socket address structure */
+
+	int		msg_inq;	/* output, data left in socket */
+
 	struct iov_iter	msg_iter;	/* data */
 
 	/*
@@ -62,8 +65,9 @@  struct msghdr {
 		void __user	*msg_control_user;
 	};
 	bool		msg_control_is_user : 1;
-	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
+	bool		msg_get_inq : 1;/* return INQ after receive */
 	unsigned int	msg_flags;	/* flags on received message */
+	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	struct kiocb	*msg_iocb;	/* ptr to iocb for async requests */
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index db55af9eb37b..b6aa4df79429 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2314,8 +2314,10 @@  static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
 	if (sk->sk_state == TCP_LISTEN)
 		goto out;
 
-	if (tp->recvmsg_inq)
+	if (tp->recvmsg_inq) {
 		*cmsg_flags = TCP_CMSG_INQ;
+		msg->msg_get_inq = 1;
+	}
 	timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
 
 	/* Urgent data needs to be handled specially. */
@@ -2537,7 +2539,7 @@  static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
 int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		int *addr_len)
 {
-	int cmsg_flags = 0, ret, inq;
+	int cmsg_flags = 0, ret;
 	struct scm_timestamping_internal tss;
 
 	if (unlikely(flags & MSG_ERRQUEUE))
@@ -2552,12 +2554,14 @@  int tcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 	ret = tcp_recvmsg_locked(sk, msg, len, flags, &tss, &cmsg_flags);
 	release_sock(sk);
 
-	if (cmsg_flags && ret >= 0) {
+	if ((cmsg_flags || msg->msg_get_inq) && ret >= 0) {
 		if (cmsg_flags & TCP_CMSG_TS)
 			tcp_recv_timestamp(msg, sk, &tss);
+		if (msg->msg_get_inq)
+			msg->msg_inq = tcp_inq_hint(sk);
 		if (cmsg_flags & TCP_CMSG_INQ) {
-			inq = tcp_inq_hint(sk);
-			put_cmsg(msg, SOL_TCP, TCP_CM_INQ, sizeof(inq), &inq);
+			put_cmsg(msg, SOL_TCP, TCP_CM_INQ, sizeof(msg->msg_inq),
+				 &msg->msg_inq);
 		}
 	}
 	return ret;