diff mbox series

[mptcp-next,v2,3/5] tcp: add mptcp join demultiplex hooks

Message ID 20220217142538.7849-4-fw@strlen.de (mailing list archive)
State Superseded, archived
Delegated to: Mat Martineau
Headers show
Series mptcp: replace per-addr listener sockets | expand

Commit Message

Florian Westphal Feb. 17, 2022, 2:25 p.m. UTC
Split from the next patch to make core tcp changes more obvious:
add a dummy function that gets called after tcp socket demux came up
empty.

This will be used by mptcp to check if a tcp syn contains an mptcp
join option with a valid token (connection id).

If so, a hidden pernet mptcp listener socket is returned and packet
resumes normally.

This patch series does not cover timewait sockets so far.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/mptcp.h | 5 +++++
 net/ipv4/tcp_ipv4.c | 4 ++++
 net/ipv6/tcp_ipv6.c | 4 ++++
 3 files changed, 13 insertions(+)

Comments

Paolo Abeni Feb. 17, 2022, 9:58 p.m. UTC | #1
On Thu, 2022-02-17 at 15:25 +0100, Florian Westphal wrote:
> Split from the next patch to make core tcp changes more obvious:
> add a dummy function that gets called after tcp socket demux came up
> empty.
> 
> This will be used by mptcp to check if a tcp syn contains an mptcp
> join option with a valid token (connection id).
> 
> If so, a hidden pernet mptcp listener socket is returned and packet
> resumes normally.
> 
> This patch series does not cover timewait sockets so far.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Following-up todays mtg discussion WRT filtering incoming MPJ on
existing endpoint. 

If I read correctly, subflow_syn_recv_sock() will filter the incoming
packets vs the annonced list, but only if the incoming packet's
destination port is different from the listener port.

- incoming request targeting a bad address and/or port will reach the
syn-ack before being rejected.

- incoming request targeting port 0 will likely be always accepted.

What about moving the anno_list check in __mptcp_handle_join()? Should
address the 2 above points and keep the anno_list traversals to the
bare minimum.

WDYT?

Cheers,

Paolo
Florian Westphal Feb. 18, 2022, 7:29 a.m. UTC | #2
Paolo Abeni <pabeni@redhat.com> wrote:
> If I read correctly, subflow_syn_recv_sock() will filter the incoming
> packets vs the annonced list, but only if the incoming packet's
> destination port is different from the listener port.

Same as today, if it hit listener port then no issue...?
The listener port isn't in the announced list, so the check would fail.

> - incoming request targeting a bad address and/or port will reach the
> syn-ack before being rejected.

Yes.

> - incoming request targeting port 0 will likely be always accepted.

No, port 0 is illegal, there is an explicit check that prevents
such port from being wired up to the magic listener.

> What about moving the anno_list check in __mptcp_handle_join()? Should
> address the 2 above points and keep the anno_list traversals to the
> bare minimum.

I can have a look.
Paolo Abeni Feb. 18, 2022, 8:49 a.m. UTC | #3
On Fri, 2022-02-18 at 08:29 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > If I read correctly, subflow_syn_recv_sock() will filter the incoming
> > packets vs the annonced list, but only if the incoming packet's
> > destination port is different from the listener port.
> 
> Same as today, if it hit listener port then no issue...?
> The listener port isn't in the announced list, so the check would fail.

Side not not related to this series: It looks like that if we have 2
MPTCP listeners on different addresses and/or ports, and no 'signal'
endpoints, the kernel will accept MPJ on either sockets for any mptcp
connection, regardless of the originating listener. I'm unsure if that
is expeted/fits the RFC fully. 
@Mat: ^^^ WDYT?

> 
> > - incoming request targeting a bad address and/or port will reach the
> > syn-ack before being rejected.
> 
> Yes.
> 
> > - incoming request targeting port 0 will likely be always accepted.
> 
> No, port 0 is illegal, there is an explicit check that prevents
> such port from being wired up to the magic listener.

whoops, I missed that check, despite the huge comment. Sorry for the
noise here.

> > What about moving the anno_list check in __mptcp_handle_join()? Should
> > address the 2 above points and keep the anno_list traversals to the
> > bare minimum.
> 
> I can have a look.

Thanks! It looks like some minor changes are needed to handle !ipv6
build.

Cheers,
Paolo
diff mbox series

Patch

diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 8b1afd6f5cc4..5ee422b56902 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -197,6 +197,10 @@  static inline __be32 mptcp_reset_option(const struct sk_buff *skb)
 
 	return htonl(0u);
 }
+static inline struct sock *mptcp_handle_join(int af, struct sk_buff *skb)
+{
+	return NULL;
+}
 #else
 
 static inline void mptcp_init(void)
@@ -274,6 +278,7 @@  static inline int mptcp_subflow_init_cookie_req(struct request_sock *req,
 }
 
 static inline __be32 mptcp_reset_option(const struct sk_buff *skb)  { return htonl(0u); }
+static inline struct sock *mptcp_handle_join(int af, struct sk_buff *skb) { return NULL; }
 #endif /* CONFIG_MPTCP */
 
 #if IS_ENABLED(CONFIG_MPTCP_IPV6)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 6873f46fc8ba..6e6675a09443 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2140,6 +2140,10 @@  int tcp_v4_rcv(struct sk_buff *skb)
 	if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	sk = mptcp_handle_join(AF_INET, skb);
+	if (sk)
+		goto process;
+
 	tcp_v4_fill_cb(skb, iph, th);
 
 	if (tcp_checksum_complete(skb)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0c648bf07f39..788040db8e9e 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1782,6 +1782,10 @@  INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	if (!xfrm6_policy_check(NULL, XFRM_POLICY_IN, skb))
 		goto discard_it;
 
+	sk = mptcp_handle_join(AF_INET6, skb);
+	if (sk)
+		goto process;
+
 	tcp_v6_fill_cb(skb, hdr, th);
 
 	if (tcp_checksum_complete(skb)) {